Tools for collecting social media data around focal events

Overview

Social Media Focal Events

The focalevents codebase provides tools for organizing data collected around focal events on social media.

It is often difficult to organize data from multiple API queries. For example, we may collect tweets when a hashtag starts trending by using Twitter’s filter stream. Later, we may make a separate query to the search endpoint to backfill our stream with what we missed before we started it, or update it with tweets that occurred since we stopped it. We may also want to get reply threads, quote tweets, or user timelines based on the tweets we collected. All of these queries are related to a common focal event—the hashtag—but they require several separate calls to the API. It is easy for these multiple queries to result in many disjoint files, making it difficult to organize, merge, update, backfill, and preprocess them quickly and reliably.

To address these issues, focalevents can be used to organize social media focal event data collected from Twitter’s v2 API using academic credentials and PostgreSQL. It is easy to do any of the following with the tools here:

  • Query Twitter’s full archive or filter stream for focal event data
  • Backfill and update those queries with additional data
  • Collect conversation threads and quote tweets of focal event tweets
  • Retrieve full user timelines for any user tweeting during a focal event

All of these functionalities are easy, single line commands, rather than long multi-line scripts, as are typically needed to read IDs, query the API, output data, and merge it with existing data. This allows researchers to design more complex studies of social media data, and spend more time focusing on data analysis, rather than data storage and maintenance.

Installation and Documentation

The repository's code can be downloaded directly from Github, or cloned using git:

git clone https://github.com/ryanjgallagher/focalevents

See the full documentation for more information about installing, configuring, and using the focalevent tools.

A Note

The code here is written and maintained by a single person. First and foremost, it has been designed to help them manage their own data and create replicable pipelines. They are sharing it in the hope that it may help others who have similar workflows and are interested in organizing their Twitter data according to focal events using PostgreSQL.

Requests for enhancements or additions to the code will likely be declined if the author does not anticipate using them in their own research. It is highly unlikely that the code will ever be adapted to work with databases other than PostgreSQL. Further, general problems with database setup or conflicts with pre-existing database structures are beyond the scope of this project and will not be addressed.

Owner
Ryan Gallagher
Network science PhD student merging networks and NLP for computational social science
Ryan Gallagher
This is a Fava extension to display a grouped portfolio view in Fava for a set of Beancount accounts.

Fava Portfolio Summary This is a Fava extension to display a grouped portfolio view in Fava for a set of Beancount accounts. It can also calculate MWR

18 Dec 26, 2022
A repository for all ZenML projects that are specific production use-cases.

ZenFiles Original Image source: https://www.goodfon.com/wallpaper/x-files-sekretnye-materialy.html And naturally, all credits to the awesome X-Files s

ZenML 66 Jan 06, 2023
A Linux webcam plugin for BGMv2 as used in our demos.

The goal of this repository is to supplement the main Real-Time High Resolution Background Matting repo with a working demo of a videoconferencing plu

Andrey Ryabtsev 144 Dec 27, 2022
Python library to natively send files to Trash (or Recycle bin) on all platforms.

Send2Trash -- Send files to trash on all platforms Send2Trash is a small package that sends files to the Trash (or Recycle Bin) natively and on all pl

Andrew Senetar 224 Jan 04, 2023
Another Provably Rare Gem Miner 💎 (for Raritygems)

Provably Rare Gem Miner Go (for Rarity) Pull Request is strongly welcome as I don't know anything about Golang/Python/Web3. Usage Install Python 3.x i

朱里 6 Apr 22, 2022
Odoo modules related to website/webshop

Website Apps related to Odoo it's website/webshop features: webshop_public_prices: allow configuring to hide or show product prices and add to cart bu

Yenthe Van Ginneken 9 Nov 04, 2022
A full-featured, hackable tiling window manager written and configured in Python

A full-featured, hackable tiling window manager written and configured in Python Features Simple, small and extensible. It's easy to write your own la

Qtile 3.8k Dec 31, 2022
Replay Felica Exchange For Python

FelicaReplay Replay Felica Exchange Description Standalone Replay Module Usage Save FelicaRelay (=2.0) output to file, then python replay.py [FILE].

3 Jul 14, 2022
This repo is for scripts to run various clients at the merge f2f

merge-f2f This repo is for scripts to run various clients at the merge f2f. Tested with Lighthouse! Tested with Geth! General dependecies sudo apt-get

Parithosh Jayanthi 2 Apr 03, 2022
Nextstrain build targeted to Omicron

About This repository analyzes viral genomes using Nextstrain to understand how SARS-CoV-2, the virus that is responsible for the COVID-19 pandemic, e

Bedford Lab 9 May 25, 2022
It is convenient to quickly import Python packages from the network.

It is convenient to quickly import Python packages from the network.

zmaplex 1 Jan 18, 2022
To check my COVID-19 vaccine appointment, I wrote an infinite loop that sends me a Whatsapp message hourly using Twilio and Selenium. It works on my Raspberry Pi computer.

COVID-19_vaccine_appointment To check my COVID-19 vaccine appointment, I wrote an infinite loop that sends me a Whatsapp message hourly using Twilio a

Ayyuce Demirbas 24 Dec 17, 2022
an opensourced roblox group finder writen in python 100% free and virus-free

Roblox-Group-Finder an opensourced roblox group finder writen in python 100% free and virus-free note : if you don't want install python or just use w

mollomm1 1 Nov 11, 2021
万能通用对象池,可以池化任意自定义类型的对象。

pip install universal_object_pool 此包能够将一切任意类型的python对象池化,是万能池,适用范围远大于单一用途的mysql连接池 http连接池等。 框架使用对象池包,自带实现了4个对象池。可以直接开箱用这四个对象池,也可以作为例子学习对象池用法。

12 Dec 15, 2022
A C-like hardware description language (HDL) adding high level synthesis(HLS)-like automatic pipelining as a language construct/compiler feature.

██████╗ ██╗██████╗ ███████╗██╗ ██╗███╗ ██╗███████╗ ██████╗ ██╔══██╗██║██╔══██╗██╔════╝██║ ██║████╗ ██║██╔════╝██╔════╝ ██████╔╝██║██████╔╝█

Julian Kemmerer 391 Jan 01, 2023
It's like Forth but in Python

It's like Forth but written in Python. But I don't actually know for sure since I never programmed in Forth, I only heard that it's some sort of stack-based programming language. Porth is also stack-

Tsoding 619 Dec 21, 2022
Final project for ENGG 5402 Advanced Robotics in CUHK

Final project Final project Update Foundations Ubuntu virtual machine Ubuntu How to use Github to keep tracking the change of code version? Docker Set

Junjia Liu 8 Aug 01, 2022
Load dependent libraries dynamically.

dypend dypend Load dependent libraries dynamically. A few days ago, I encountered many users feedback in an open source project. The Problem is they c

Louis 5 Mar 02, 2022
Remove Sheet Protection from .xlsx files. Easily.

🔓 Excel Sheet Unlocker Remove sheet protection from .xlsx files. How to use Run Run the script/packaged executable from the command line. Universal u

Daniel 3 Nov 16, 2022
redun aims to be a more expressive and efficient workflow framework

redun yet another redundant workflow engine redun aims to be a more expressive and efficient workflow framework, built on top of the popular Python pr

insitro 372 Jan 04, 2023