Job Funnel TS

An automated tool for scraping job postings into .xlsx files, inspired by Job Funnel and written in TypeScript. Currently supports LinkedIn, Monster (experimental), Glassdoor (experimental), and Indeed (experimental).

Usage

  1. Ensure that you have Node 14.x+ installed and available globally
  2. Install Job Funnel as a global package: npm install -g job-funnel
  3. Generate a new config file in a local folder: job-funnel generate-config
  4. Update the 'config.yaml' file and specify your credentials
  5. Update the 'pages' section in 'config.yaml' with a list of search page URLs. You can get a URL by running a search on a supported job website, e.g. the LinkedIn Jobs page, and copying the URL of the results page
  6. Run the crawlers: job-funnel scan. To run all of the supported crawlers, including the experimental ones, specify them explicitly with the '--sites' parameter: job-funnel scan --sites linkedin monster glassdoor indeed
  7. Export the results to a 'report.xlsx' file: job-funnel export (a scripted version of steps 6-7 is sketched below)
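
To automate the crawl-and-export cycle, you can drive the CLI from a small script. The following TypeScript sketch is not part of the package; it only shells out to the documented job-funnel commands (the file name run-funnel.ts and the choice of sites are illustrative assumptions):

// run-funnel.ts - hypothetical wrapper script, not shipped with job-funnel.
// It only invokes the CLI commands documented above.
import { execSync } from 'child_process';

// Sites to crawl; linkedin is the stable crawler, the rest are experimental.
const sites = ['linkedin', 'monster', 'glassdoor', 'indeed'];

function run(command: string): void {
  console.log(`> ${command}`);
  // Inherit stdio so the crawler's progress output stays visible.
  execSync(command, { stdio: 'inherit' });
}

// Crawl the configured pages, then export everything to report.xlsx.
run(`job-funnel scan --sites ${sites.join(' ')}`);
run('job-funnel export');

With job-funnel installed globally, such a script could be run with npx ts-node run-funnel.ts, for example on a schedule.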

Optional Steps

  1. If needed, run job-funnel wipe-db to wipe the cached job results database
  2. Run job-funnel --debug scan to watch the crawling process; it can be useful for troubleshooting. Use job-funnel --debug scan --sites linkedin monster glassdoor indeed to run all of the available crawlers in debug mode
  3. Run job-funnel without any parameters to see the help output

Config Example

crawlers:
  linkedin:
    pages:
      - https://www.linkedin.com/jobs/search/?f_E=2%2C3%2C4&f_TPR=r604800&geoId=90000070&keywords=qa%20analyst&location=New%20York%20City%20Metropolitan%20Area&f_TP=1%2C2&redirect=false&position=1&pageNum=0
      - https://www.linkedin.com/jobs/search/?distance=50&f_E=2%2C3%2C4&f_TPR=r86400&geoId=104047727&keywords=qa%20analyst&location=Jersey%20City%2C%20New%20Jersey%2C%20United%20States&f_TP=1%2C2&redirect=false&position=1&pageNum=0
      - https://www.linkedin.com/jobs/search?keywords=Qa%20tester&location=New%20York%20City%20Metropolitan%20Area&geoId=90000070&trk=public_jobs_jobs-search-bar_search-submit&redirect=false&position=1&pageNum=0
    credentials:
      username: foobar@gmail.com
      password: foobarbaz
  monster:
    pages:
      - https://www.monster.com/jobs/search/Full-Time_8?q=qa-analyst&where=07302&rad=20&tm=3&jobid=220754835
    filters:
      radius: 40 # supported values: (empty), 5, 10, 20, 30, 40, 50, 60, 75, 100, 150, 200
      job_status: Full-Time # supported values: (empty), Part-Time
      posted: 1 # supported values: (empty), -1 (any date), 0 (today), 1 (yesterday), 3 (last 3 days), 7 (last 7 days), 14 (last 14 days), 30 (last 30 days)
    credentials:
      username: foo@bar.baz
      password: foobarbaz
  glassdoor:
    pages:
      - https://www.glassdoor.com/Job/jersey-city-qa-analyst-jobs-SRCH_IL.0,11_IC1126819_KO12,22.htm?jobType=fulltime&fromAge=1&radius=50
    credentials:
      username: foo@bar.baz
      password: foobarbaz
  indeed:
    pages:
      - https://www.indeed.com/jobs?q=QA%20Analyst&l=Jersey%20City%2C%20NJ&radius=50&rbl=New%20York%2C%20NY&jlid=45f6c4ded55c00bf&jt=fulltime&vjk=a573133dd9847a53
    credentials:
      username: foo@bar.baz
      password: foobarbaz
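
The exact configuration schema is owned by job-funnel itself, but the example above implies the general shape: a top-level crawlers map keyed by site name, each entry holding pages, credentials, and (for some sites) filters. As a hedged illustration, a script like the following could sanity-check a config before crawling; it uses the js-yaml package, which is an assumption here rather than a documented dependency:

// check-config.ts - illustrative only; the real schema is defined by job-funnel.
import * as fs from 'fs';
import * as yaml from 'js-yaml'; // assumes js-yaml and @types/js-yaml are installed

interface CrawlerConfig {
  pages: string[];
  credentials: { username: string; password: string };
  filters?: Record<string, string | number>;
}

interface Config {
  crawlers: Record<string, CrawlerConfig>;
}

const config = yaml.load(fs.readFileSync('config.yaml', 'utf8')) as Config;

for (const [site, crawler] of Object.entries(config.crawlers)) {
  console.log(`${site}: ${crawler.pages.length} search page(s) configured`);
  if (!crawler.credentials?.username || !crawler.credentials?.password) {
    console.warn(`  warning: missing credentials for ${site}`);
  }
}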

Development

Installation

  1. Ensure that you have Node 14.x+ installed and available globally
  2. Clone the repository: git clone https://github.com/alehkot/job-funnel-ts.git
  3. Run npm install
  4. Run cp config.yaml.sample config.yaml to create your configuration file

Running the crawlers

  1. Update the 'config.yaml' file and specify your credentials
  2. Update the 'pages' section in 'config.yaml' with a list of search page URLs. You can get a URL by running a search on a supported website and copying the URL of the results page
  3. Use npm run start-dev -- scan to run a crawler
  4. Use npm run start-dev -- export to export the results into a 'report.xlsx' file (you can run this command any number of times; subsequent runs will simply append new results to the database table)
  5. Use npm run start-dev -- wipe-db to wipe the database
  6. Use npm run start-dev -- generate-config to generate a new config file
  7. Use the --debug global flag to disable headless Puppeteer mode and increase the artificial crawling delays, for example npm run start-dev -- --debug scan (see the illustrative sketch below)
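
The description of the --debug flag suggests what changes under the hood: Puppeteer runs with a visible browser window and the artificial delays between actions grow. The sketch below is not the project's actual implementation, only an illustration of that idea against the public Puppeteer API (the URL and delay values are made up):

// debug-mode.ts - illustrative sketch of the behaviour described above,
// not job-funnel's actual code.
import puppeteer from 'puppeteer';

async function crawl(url: string, debug: boolean): Promise<void> {
  // In debug mode, show the browser window instead of running headless.
  const browser = await puppeteer.launch({ headless: !debug });
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: 'networkidle2' });

  // Artificial delay between actions; longer in debug mode so the
  // crawling process is easier to follow visually.
  const delayMs = debug ? 5000 : 1000;
  await new Promise((resolve) => setTimeout(resolve, delayMs));

  // ... scraping logic would go here ...
  await browser.close();
}

crawl('https://www.linkedin.com/jobs/search?keywords=qa%20analyst', true)
  .catch((err) => console.error(err));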

Notes

Development is in its very early stages, so there may be considerable code changes in the future.
