Scraping alternative websites for jobs.

Alt Job scrapes a number of green/social/alternative websites and sends a digest of new job postings by email. It also generates an Excel file with the job posting information.

The scraped data includes: job title, type, salary, week_hours, date posted, apply before date and full description. Additionally, a set of keywords is automatically checked against all jobs and the matches are added as a new column. (See screenshots)
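
As a rough illustration of the keyword matching described above, the sketch below checks a list of keywords against each job's title and description and stores the matches under a new key. The function name `add_keyword_matches`, the `keyword_matches` key and the sample jobs are hypothetical; this is not Alt Job's actual implementation.

```python
import re

def add_keyword_matches(jobs, keywords):
    """Return copies of the job dicts with a 'keyword_matches' entry (hypothetical column name)."""
    # Whole-word, case-insensitive pattern for each keyword
    patterns = {kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE) for kw in keywords}
    results = []
    for job in jobs:
        text = " ".join([job.get("title", ""), job.get("description", "")])
        matches = [kw for kw, pattern in patterns.items() if pattern.search(text)]
        results.append({**job, "keyword_matches": matches})
    return results

# Made-up sample data
jobs = [
    {"title": "Community Garden Coordinator", "description": "Part-time, urban agriculture project."},
    {"title": "Python Developer", "description": "Build tools for an environmental non-profit."},
]
matched = add_keyword_matches(jobs, ["python", "agriculture", "garden"])
for job in matched:
    print(job["title"], "->", job["keyword_matches"])
```

The input dicts are left untouched, so the same scraped data can be matched against several keyword sets.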

Job postings mailing lists 🔥

  • Montréal / Québec:
    alt_job_mtl Google Group. Join to receive a daily digest of new Montréal and Province of Québec job postings.

Supported websites

Alt Job is written in an extensible way: only about 30 lines of code are required to support a new job posting site! It is focused on Canada/Québec for now; please contribute to improve the software or expand its scope 🙂
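
To give a feel for how small a site-specific scraper can be, here is a minimal sketch using only the Python standard library. Alt Job's real scrapers build on Scrapy (as the `--scrapy_log_level` option suggests) and their base classes are not shown in this document, so the class names, the `parse_listing` method, the `jobs.example.org` domain and the HTML markup below are all hypothetical.

```python
from html.parser import HTMLParser

class JobLinkParser(HTMLParser):
    """Collect title/url pairs from <a class="job"> elements (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.jobs = []
        self._current_url = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "job":
            self._current_url = attrs.get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that sits inside a job link
        if self._current_url is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_url is not None:
            title = "".join(self._text_parts).strip()
            self.jobs.append({"title": title, "url": self._current_url})
            self._current_url = None


class ExampleSiteScraper:
    """Hypothetical scraper for a made-up site; only the parsing step is site-specific."""

    domain = "jobs.example.org"

    def parse_listing(self, html):
        parser = JobLinkParser()
        parser.feed(html)
        return parser.jobs


sample_html = """
<ul>
  <li><a class="job" href="/jobs/1">Outreach Coordinator</a></li>
  <li><a class="job" href="/jobs/2">Farm Manager</a></li>
  <li><a href="/about">About us</a></li>
</ul>
"""
listing = ExampleSiteScraper().parse_listing(sample_html)
print(listing)
```

Everything that is common to all sites (fetching pages, storing jobs, sending the digest) would live outside such a class, which is what keeps a new site down to a few dozen lines.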

Supports the following websites:

The support of the following websites is on the TODO:


Install Alt Job with all its requirements:

python3 -m pip install 'alt_job[all]'

Requires Python >= 3.6


Sample config file


##### General config #####

# Logging

# Jobs data file, default is ~/jobs.json
# jobs_datafile=/home/user/Jobs/jobs-mtl.json

# Asynchronous workers, number of sites to scan at the same time
# Defaults to 5.
# workers=10

##### Mail sender #####

# Email server settings
[email protected]
[email protected]

# Email notif settings
mailto=["[email protected]"]

##### Scrapers #####

# Website domain
# URL to start the scraping, required for all scrapers


# Load full job details: If supported by the scraper,
#   this will follow each job posting link in the listing and parse the full job description.
#   Turn on to parse all job information.
# Defaults to False!


# Load all new pages: If supported by the scraper,
#   this will follow each "next page" link and parse the next listing page
#   until older (already in database) job postings are found.
# Defaults to False!


# Disabled scraper
# []
# url=

# Multiple start URLs crawl
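
For readers who want to inspect such a file programmatically, the following sketch parses a config of this shape with Python's `configparser`. The section names `[alt_job]` and `[mail_sender]`, the `jobs.example.org` scraper section and the email address are assumptions made for the example; only the `workers`, `mailto` and `url` options and the default of 5 workers come from the sample above.

```python
import configparser
import json

# Assumed layout: section names are illustrative, not taken from the sample file
SAMPLE = """
[alt_job]
workers=10

[mail_sender]
mailto=["someone@example.org"]

[jobs.example.org]
url=https://jobs.example.org/listing
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# Fall back to the documented default of 5 workers when the option is absent
workers = config.getint("alt_job", "workers", fallback=5)
# List values are written as JSON arrays, so json.loads turns them into Python lists
mailto = json.loads(config.get("mail_sender", "mailto", fallback="[]"))
# Treat every remaining section as a scraper keyed by website domain
scrapers = {
    name: dict(config[name])
    for name in config.sections()
    if name not in ("alt_job", "mail_sender")
}
print(workers, mailto, scrapers)
```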

Run it

python3 -m alt_job -c /home/user/Jobs/alt_job.conf


Some of the config options can be overridden with CLI arguments.

  -c <File path> [<File path> ...], --config_file <File path> [<File path> ...]
                        configuration file(s). Default locations will be
                        checked and loaded if file exists:
                        `~/.alt_job/alt_job.conf`, `~/alt_job.conf` or
                        `./alt_job.conf` (default: [])
  -t, --template_conf   print a template config file and exit. (default:
  -V, --version         print Alt Job version and exit. (default: False)
  -x <File path>, --xlsx_output <File path>
                        Write all NEW jobs to Excel file (default: None)
  -s <Website> [<Website> ...], --enabled_scrapers <Website> [<Website> ...]
                        List of enabled scrapers. By default it's all scrapers
                        configured in config file(s) (default: [])
  -j <File path>, --jobs_datafile <File path>
                        JSON file to store ALL jobs data. Default is
                        '~/jobs.json'. Use 'null' keyword to disable the
                        storage of the datafile, all jobs will be considered
                        as new and will be loaded (default: )
  --workers <Number>    Number of websites to scrape asynchronously (default:
  --full, --load_all_jobs
                        Load the full job description page to parse
                        additionnal data. This settings is applied to all
                        scrapers (default: False)
  --all, --load_all_new_pages
                        Load new job listing pages until older jobs are found.
                        This settings is applied to all scrapers (default:
  --quick, --no_load_all_jobs
                        Do not load the full job description page to parse
                        additionnal data (Much more faster). This settings is
                        applied to all scrapers (default: False)
  --first, --no_load_all_new_pages
                        Load only the first job listing page. This settings is
                        applied to all scrapers (default: False)
  --mailto <Email> [<Email> ...]
                        Emails to notify of new job postings (default: [])
  --log_level <String>  Alt job log level. Exemple: DEBUG (default: INFO)
  --scrapy_log_level <String>
                        Scrapy log level. Exemple: DEBUG (default: ERROR)
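
The `--jobs_datafile` behaviour described above (store all jobs in a JSON file, or treat every job as new when the value is `null`) can be sketched as follows. This is an illustration under stated assumptions, not Alt Job's code: the function name `find_new_jobs` is hypothetical, and identifying a job by its `url` field is an assumption.

```python
import json
import os
import tempfile

def find_new_jobs(scraped, datafile):
    """Return jobs whose URL is not yet stored; persist them unless datafile is 'null'."""
    if datafile == "null":
        # Storage disabled: nothing is remembered, so every job counts as new
        return list(scraped)
    known = []
    if os.path.exists(datafile):
        with open(datafile, encoding="utf-8") as handle:
            known = json.load(handle)
    known_urls = {job["url"] for job in known}
    new_jobs = [job for job in scraped if job["url"] not in known_urls]
    # Keep ALL jobs in the datafile so later runs can tell old from new
    with open(datafile, "w", encoding="utf-8") as handle:
        json.dump(known + new_jobs, handle, indent=2)
    return new_jobs

scraped = [{"url": "https://jobs.example.org/1", "title": "Farm Manager"}]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "jobs.json")
    first_run = find_new_jobs(scraped, path)     # job is new on the first run
    second_run = find_new_jobs(scraped, path)    # already stored on the second run
    no_storage = find_new_jobs(scraped, "null")  # always treated as new
print(len(first_run), len(second_run), len(no_storage))
```

Only the jobs returned by such a check would need to go into the email digest or the Excel output.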

