0.2.0 • Published 4 years ago

@hackademymx/github-classroom-scraper v0.2.0

GitHub Classroom Scraper

Scraper for obtaining information about activities from a GitHub classroom.

Requirements

  • Have the latest version of Node.js installed.
  • Have two-factor authentication enabled on your GitHub account, using an authenticator app.
  • Set the environment variables GH_EMAIL and GH_PASSWORD to your GitHub username and password, respectively.
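
A script that drives the scraper can check that both variables are set before it starts. The following is a minimal sketch; `readCredentials` is a hypothetical helper written for illustration and is not part of this package.

```javascript
// Hypothetical helper (not part of the package): reads the two variables
// named in the requirements from an env-like object such as process.env.
function readCredentials(env) {
  const missing = ["GH_EMAIL", "GH_PASSWORD"].filter((name) => !env[name]);
  if (missing.length > 0) {
    // Fail early with a clear message instead of failing during login.
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  return { user: env.GH_EMAIL, password: env.GH_PASSWORD };
}

// Usage: const { user, password } = readCredentials(process.env);
```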

Install

Install this tool with npm. We suggest installing it globally if you only plan to use the CLI:

npm i -g @hackademymx/github-classroom-scraper

Or, if you prefer, install it locally:

npm i @hackademymx/github-classroom-scraper

Usage

CLI

To use the CLI tool, run:

github-classroom-scraper -u YOUR_CLASSROOM_URL -o YOUR_OTP

The available flags are:

| Option | Name | Description | Required | Default |
| --- | --- | --- | --- | --- |
| u | classroom-url | The URL of your classroom, e.g. https://classroom.github.com/classrooms/your-classroom | YES | NA |
| o | otp | The one-time password from your 2FA app | YES | NA |
| r | regular-wait | The waiting interval in ms for fetching info. Increase it for slow connections | NO | 5000 |
| h | headless | Controls whether the automated browser is hidden (true) or visible (false) | NO | true |

It writes two files:

  • resultsPerActivity.json: All the results per activity in the following format:
{
  "Activity Name": [
    {
      "userName": "student username",
      "description": "The user's activity message. e.g. 'Latest commit passed 7 commits Submitted'",
      "activityTitle": "Activity Name",
      "isSubmitted": "Parsed submitted value in description",
      "commitsMade": "Parsed commits value in description"
    }
  ]
}
  • resultsPerUser.csv: A condensed summary of activities for JS and Python exercises. It has the following columns:
| User | Activities | Submitted | Python Completed | JS Completed | Total Tried Activities |
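
The isSubmitted and commitsMade fields are described as values parsed from the description text. The sketch below shows one way such parsing could work on the two example descriptions in this README. `parseDescription` is a hypothetical function; the package's actual parsing logic and the types it writes to the JSON file are not documented here, so treat the boolean and number outputs as assumptions.

```javascript
// Hypothetical parser (an assumption, not the package's implementation).
// Expects text such as "Latest commit passed 7 commits Submitted".
function parseDescription(description) {
  // Capture the number that precedes "commit" or "commits".
  const commits = description.match(/(\d+)\s+commits?/);
  // "Not Submitted" also contains "Submitted", so rule it out explicitly.
  const notSubmitted = /\bNot\s+Submitted\b/.test(description);
  const submitted = /\bSubmitted\b/.test(description);
  return {
    isSubmitted: submitted && !notSubmitted,
    commitsMade: commits ? Number(commits[1]) : 0,
  };
}
```

Checking for "Not Submitted" first matters because a plain search for "Submitted" would match both states.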

Module

To use it as a module, import it like any other dependency:

import scraper from "@hackademymx/github-classroom-scraper";

It is an async function with the following signature:

scraper(githubClassroomUrl, user, password, otp, { regularWait, headless, navigationTimeout, defaultViewport, generateFiles }) -> Object

| Param | Description | Default |
| --- | --- | --- |
| githubClassroomUrl | The URL of your classroom, e.g. https://classroom.github.com/classrooms/your-classroom | NA |
| user | Your GitHub user | NA |
| password | Your GitHub password | NA |
| otp | The one-time password generated by your authenticator app | NA |
| regularWait | Time in ms to wait for information to load. Increase it for slow connections | 5000 |
| headless | Whether to run the browser headless. Set it to false to watch the automated browser | true |
| navigationTimeout | Time in ms the browser waits without interaction before throwing an exception | 24000 |
| defaultViewport | The viewport the browser launches with. null for maximum resolution | null |
| generateFiles | Whether to generate the result files, as the CLI does | true |

The result is a list of objects with the following structure:

[
  {
    userName: "The username of the student that solved the activity",
    description:
      "The description of the activity e.g. 'Latest commit passed  6 commits  Not Submitted'",
    activityTitle: "The title of the activity",
  },
];
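
Because the module returns a flat list, callers who want a shape similar to resultsPerActivity.json (keyed by activity name) can group the entries themselves. This is a minimal sketch; `groupByActivity` is a hypothetical helper, not something the package exports.

```javascript
// Hypothetical helper (not part of the package): groups the flat result
// list by activityTitle, one array of entries per activity.
function groupByActivity(results) {
  const grouped = new Map();
  for (const entry of results) {
    if (!grouped.has(entry.activityTitle)) {
      grouped.set(entry.activityTitle, []);
    }
    grouped.get(entry.activityTitle).push(entry);
  }
  // Convert to a plain object so it can be serialized with JSON.stringify.
  return Object.fromEntries(grouped);
}

// Usage with the module (needs network access, valid credentials and a
// fresh OTP; `url` and `otp` are placeholders you supply):
// const results = await scraper(url, process.env.GH_EMAIL,
//   process.env.GH_PASSWORD, otp, { generateFiles: false });
// console.log(groupByActivity(results));
```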

Contribution

Fork the repo, and install the dependencies:

npm install

Feel free to open an issue or pull request. Contributions welcome!

License

This project is licensed under the terms of the MIT license.

Made with 💙 and 🌮 in 🇲🇽