Uses: Scrapy, Selenium WebDriver, headless Chromium, Docker and Python 3.
The first spider aims to visit as many LinkedIn user pages as possible :-D. The objective is to gain visibility with your account, since LinkedIn notifies users when someone visits their page.
The second spider aims to collect all the users working for a given company on LinkedIn.
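At its core, the collection step comes down to extracting profile links from each visited page and following them. A minimal, stdlib-only sketch of that extraction step (illustrative only — the actual spiders use Scrapy selectors driven through Selenium, and this regex is an assumption about the page markup):

```python
import re

def extract_profile_links(html: str) -> list[str]:
    # LinkedIn profile URLs contain the "/in/" path segment; this regex
    # is a simplification of what the real Scrapy selectors would do.
    return re.findall(r'href="(https://www\.linkedin\.com/in/[^"]+)"', html)

page = '<a href="https://www.linkedin.com/in/jane-doe">Jane Doe</a>'
print(extract_profile_links(page))  # ['https://www.linkedin.com/in/jane-doe']
```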
Install docker from the official website https://www.docker.com/
Install a VNC viewer if you do not have one. On Ubuntu, go for vinagre:

sudo apt-get update
sudo apt-get install vinagre
Create a conf.py file and fill in the quotes with your LinkedIn credentials.
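The file might look like the following — EMAIL and PASSWORD are assumed variable names; check the project's own template for the exact ones it expects:

```python
# conf.py — assumed layout; the variable names (EMAIL, PASSWORD) are a
# guess, check the project's template for the real ones.
EMAIL = "your.address@example.com"
PASSWORD = "your-linkedin-password"
```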
To run only the LinkedIn random spider (not the companies spider), open your terminal, move to the project folder and type:
docker-compose up -d --build
Open vinagre and connect to the address and port localhost:5900, entering the VNC password when prompted:

vinagre localhost:5900

or use make view.
Use your terminal again and type, in the same window:
Set up your Python virtual environment (trivial but mandatory):

virtualenv -p python3.6 .venv
source .venv/bin/activate
pip install -r requirements.txt
Create the Selenium server, open the VNC window and launch the tests; type these in three different terminals in the project folder:

make dev
make view
make tests
For more details have a look at the Makefile (here it is used for shortcuts, not for building).
scrapy crawl companies -a selenium_hostname=localhost -o output.csv
scrapy crawl random -a selenium_hostname=localhost -o output.csv
scrapy crawl byname -a selenium_hostname=localhost -o output.csv
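Each crawl writes its items to output.csv, which can then be inspected with the stdlib csv module. A small sketch, using hypothetical column names (the real columns depend on each spider's Item definition):

```python
import csv
import io

# Hypothetical sample of what a crawl might emit; the actual columns in
# output.csv depend on the spider that produced it.
sample = "name,title\nJane Doe,Software Engineer\n"

rows = list(csv.DictReader(io.StringIO(sample)))
print(rows[0]["name"], "-", rows[0]["title"])  # Jane Doe - Software Engineer
```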
This code is in no way affiliated with, authorized, maintained, sponsored or endorsed by LinkedIn or any of its affiliates or subsidiaries. This is an independent and unofficial project. Use at your own risk.
This project violates LinkedIn's User Agreement, Section 8.2, and because of this LinkedIn may (and will) temporarily or permanently ban your account. We are not responsible for your account being banned.