Scraping Connections' Info on LinkedIn

Last update: Feb 11, 2022

Overview

Scrape It!

Disclaimer:

THIS CODE WAS WRITTEN IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF THE MCI.IR INTERVIEW PROCESS, AND INTERVIEWEES WERE ASKED TO PUSH THEIR CODE TO GITHUB. CONTACT ME TO HAVE THIS REPOSITORY REMOVED IF IT VIOLATES YOUR TOS.
IF ANY CONNECTION DOES NOT WANT THEIR CONTACT INFO TO APPEAR HERE, CONTACT ME AND I WILL REMOVE IT ASAP.

Functionalities:

This script automatically:

opens your LinkedIn profile
accesses your connections page
crawls the page to collect your connections' profile links
scrapes each person's information and dumps it to a SQLite DB
simultaneously logs messages at all relevant levels to Linkedin.log
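
The last two steps above (dumping to SQLite and logging) could look roughly like the sketch below. It is not the repository's actual code: the table layout, the column names, and the helpers `build_logger` and `save_profile` are hypothetical and only illustrate the idea with Python's standard `sqlite3` and `logging` modules.

```python
import logging
import sqlite3

# Hypothetical schema: the real table layout is not shown in this README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS connections (
    profile_url TEXT PRIMARY KEY,
    name        TEXT,
    headline    TEXT
)
"""


def build_logger(log_path="Linkedin.log"):
    """Return a logger that writes DEBUG and above to the given file."""
    logger = logging.getLogger("linkedin_scraper")
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def save_profile(conn, profile, logger):
    """Insert or update one scraped profile and log the outcome."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT OR REPLACE INTO connections VALUES (?, ?, ?)",
                (
                    profile["profile_url"],
                    profile.get("name"),
                    profile.get("headline"),
                ),
            )
        logger.info("Saved %s", profile["profile_url"])
        return True
    except sqlite3.Error as exc:
        logger.error("Could not save %s: %s", profile.get("profile_url"), exc)
        return False
```

A caller would open a connection with `sqlite3.connect(...)`, run `conn.execute(SCHEMA)` once, and then call `save_profile` for each scraped person, so that every write is immediately reflected in the log file.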

DataFlowDiagram

Design patterns used include (but are not limited to):

Creator
Low Coupling
High Cohesion
Indirection
Modularization
Information Expert

Log/DB files:

Further development notes:

Check out other databases that support multithreading, which would enable us to dump all information rows at once.
Change the IP address per request (the code for this is in my "Social Media Computing course" repository).
Sometimes you need to scroll down manually while the "connection" page is loading. A single line of code could do the scrolling for you.
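
The automatic scrolling mentioned in the last note could be sketched as follows. It assumes `driver` is a Selenium WebDriver and relies only on its `execute_script` method. The helper name `scroll_to_bottom` and its parameters are hypothetical and not part of this repository.

```python
import time


def scroll_to_bottom(driver, pause_seconds=1.0, max_rounds=20):
    """Scroll until the page height stops growing or max_rounds is reached.

    `driver` is assumed to be a Selenium WebDriver; only its
    `execute_script` method is used. Returns the number of scroll rounds.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    for rounds in range(1, max_rounds + 1):
        # The "one line" from the note above: jump to the bottom of the page.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give lazily loaded entries time to appear
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            return rounds  # nothing new was loaded, so we are at the bottom
        last_height = new_height
    return max_rounds
```

Calling `scroll_to_bottom(driver)` right after the connections page opens, and before collecting profile links, should remove the need to scroll by hand. The pause length is a guess and may need tuning for slow connections.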

References:

https://www.linkedin.com/pulse/how-easy-scraping-data-from-linkedin-profiles-david-craven

https://www.geeksforgeeks.org/scrape-linkedin-using-selenium-and-beautiful-soup-in-python/

https://stackoverflow.com/questions/28883769/remove-odd-indexed-elements-from-list-in-python#:~:text=Fun%20fact%3A%20to%20remove%20all,remove(x)%20.

https://stackoverflow.com/questions/34759787/fetch-all-href-link-using-selenium-in-python

https://www.tutorialspoint.com/fetch-all-href-link-using-selenium-in-python

https://stackoverflow.com/questions/64717302/deprecationwarning-executable-path-has-been-deprecated-selenium-python

https://chromedriver.chromium.org/home

https://www.youtube.com/watch?v=-ARI4Cz-awo


Owner

MohammadReza Ardestani
