Facebook Group Scraping Using Beautiful Soup & Selenium

Last update: Aug 12, 2022

Overview

Notes

The scraper should only be used for educational purposes
Kindly refrain from scraping sensitive or private information
It is highly recommended to scrape public (and not private) groups
Ask for consent from the group administrator and/or group members before running any code
I am not responsible for any misuse of the code in any shape or form

Facebook Group Scraping Using Beautiful Soup & Selenium

Extracts Facebook group posts related to a specific topic, together with their comments, and writes them to a .json file. The project was created to gather the data needed to build a chatbot for a university's website.

Input

User's Credentials
Facebook Group URL
Number of Scrolls (sets how many posts you want to collect)
Directory of the Chromedriver
Optional: Specific topic to be searched
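
As an illustration only, the inputs listed above could be grouped into one small structure and checked before the browser is started. The class name `ScraperInputs`, its field names, and the validation rules are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScraperInputs:
    """Hypothetical container for the inputs listed above."""
    email: str
    password: str
    group_url: str
    num_scrolls: int
    chromedriver_dir: str
    topic: Optional[str] = None  # optional: None means no topic was given

    def validate(self) -> None:
        """Raise ValueError if an input is obviously unusable."""
        if not self.email or not self.password:
            raise ValueError("credentials must not be empty")
        # Assumption: group URLs contain 'facebook.com/groups/'.
        if "facebook.com/groups/" not in self.group_url:
            raise ValueError("group_url does not look like a Facebook group URL")
        if self.num_scrolls < 1:
            raise ValueError("num_scrolls must be at least 1")
```

Keeping the inputs in one place like this would also make it easier to move them into a separate inputs file later.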

What the Scraper Does

Logs into Facebook using the User's Credentials
Enters the group specified by the User
Searches for the topic
Extracts all posts & their comments
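
The scrolling part of these steps might look roughly like the sketch below. It assumes `driver` is a Selenium WebDriver that has already logged in and opened the group (or its search results); the function name `scroll_and_get_source` and the pause length are hypothetical, and the project's own code may work differently.

```python
import time


def scroll_and_get_source(driver, num_scrolls, pause_seconds=2.0):
    """Scroll to the bottom of the page num_scrolls times, then return the HTML.

    `driver` is assumed to be a Selenium WebDriver that is already logged in
    and showing the group page.
    """
    for _ in range(num_scrolls):
        # Jump to the bottom so the page loads the next batch of posts.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause_seconds)  # give newly loaded posts time to render
    return driver.page_source
```

The returned HTML can then be handed to Beautiful Soup, for example `BeautifulSoup(html, "html.parser")`. The selectors for posts and comments are left out here because Facebook's markup changes often.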

Scraper Output

.json file that includes:

Each post
The comments replying to it

Format of file:

{
    "tag": "Topic 1",
    "patterns": ["Post text"],
    "responses": [
        "Comment 1",
        "Comment 2",
        "Comment 3"
    ]
}
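
A minimal sketch of how records in the format above could be built and written with Python's standard `json` module. The helper names `build_record` and `write_records` are hypothetical, and storing all records as one JSON list is an assumption rather than something the project confirms.

```python
import json


def build_record(topic, post_text, comments):
    """Return one record: the topic, the post text, and its comments."""
    return {
        "tag": topic,
        "patterns": [post_text],
        "responses": list(comments),
    }


def write_records(records, path):
    """Write the records to a .json file (assumed layout: one JSON list)."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps non-English post text readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=4)
```

For example, `write_records([build_record("Topic 1", "Post text", ["Comment 1", "Comment 2"])], "posts.json")` would produce a file containing a single record shaped like the one shown above.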

Setup Requirements

Make sure Chrome is installed
Install Chromedriver and place it in the same directory as the scraper's code file
Enter inputs required by the code
Run the code

Updates

Scrape comments found in "view more comments"
Add a file for inputs only
Add comments to the code
Add an option to scrape the general group discussions and not specific topics

Owner

Fatima Ghadieh
