Find papers by keywords and venues, then download them automatically

Last update: Dec 15, 2022

Related tags

Web Crawling paper-finder

Overview

paper finder

Find papers by keywords and venues, then download them automatically.

How to use this?

Search

CLI

python search.py -k "knowledge tracing,knowledge trace" -v "KDD,IJCAI" -o data/kt_result.csv

min_year : only include papers with year >= min_year
max_year : only include papers with year <= max_year
-k : keywords, separated by commas
-v : venues, separated by commas. If omitted, the default venues are used.
-o : output file path
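
To illustrate the comma-separated format of -k and -v, here is a minimal sketch of how such a value could be turned into a list like the ones the Python API takes. The helper name split_csv_arg is hypothetical and is not taken from search.py; the tool's actual parsing may differ.

```python
def split_csv_arg(value):
    """Split a comma-separated CLI value into a list of trimmed, non-empty items.

    Hypothetical helper for illustration; not part of search.py.
    """
    return [item.strip() for item in value.split(",") if item.strip()]


keywords = split_csv_arg("knowledge tracing,knowledge trace")
venues = split_csv_arg("KDD,IJCAI")
print(keywords)
print(venues)
```

Trimming whitespace and dropping empty items means a stray space or trailing comma in the argument would not produce an empty keyword.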

Python API

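from search import search

# Keywords to match and venues to search in.
keyword_list = ['knowledge tracing', 'knowledge trace']
venue_list = ['KDD', 'IJCAI']
# min_year and max_year are inclusive bounds on the publication year.
search(keyword_list=keyword_list, venue_list=venue_list,
       min_year=2016, max_year=2021, output='data/kt_result.csv')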

You can find the venue names there.

Download

CLI

python download.py -i data/kt_result.csv -o pdfs

-i : path to the CSV file produced by search
-o : directory in which to save the PDFs. A subfolder is created for each venue, such as pdfs/AIED.
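
The per-venue layout described for -o can be pictured with a small sketch. The function venue_pdf_path and its file-naming rule are assumptions made for illustration only; download.py may name files differently.

```python
import re
from pathlib import Path


def venue_pdf_path(save_dir, venue, title):
    """Return save_dir/<venue>/<safe title>.pdf (hypothetical naming rule)."""
    # Replace characters that are unsafe in file names with underscores.
    safe_title = re.sub(r"[^\w\- ]", "_", title).strip()
    return Path(save_dir) / venue / f"{safe_title}.pdf"


path = venue_pdf_path("pdfs", "AIED", "Knowledge Tracing: A Survey")
print(path)
```

Calling path.parent.mkdir(parents=True, exist_ok=True) before writing the file would create the venue subfolder if it does not exist yet.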

Python API

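from utils.download import download_from_df
import pandas as pd

# Load the search results, download the listed PDFs into save_dir,
# then save the returned DataFrame next to the input CSV.
csv_path = "data/kt_result.csv"
df = pd.read_csv(csv_path)
df = download_from_df(df, save_dir='pdfs')
df.to_csv(csv_path.replace('.csv', '_download_result.csv'), index=False)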
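
Note that str.replace substitutes every occurrence of '.csv' in the path, including one inside a directory name. A sketch of an alternative that only touches the file name, using pathlib, follows. The helper name result_csv_path is hypothetical and not part of this project.

```python
from pathlib import Path


def result_csv_path(csv_path, suffix="_download_result"):
    """Return csv_path with suffix inserted before the file extension."""
    p = Path(csv_path)
    # with_name only changes the final path component.
    return p.with_name(f"{p.stem}{suffix}{p.suffix}")


print(result_csv_path("data/kt_result.csv"))
```

The returned Path can be passed directly to df.to_csv, which accepts path-like objects.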

Todo

Search papers.
Download papers

Author Warning

This code is only used for academic communication. The author has no liability for copyright. DO NOT ENGAGE IN ANY ILLEGAL ACTIVITIES. Please download and read the genuine articles from the publisher.

Releases(v0.1)

v0.1(Dec 6, 2022)

Add citation lookup and code-link lookup.

v0.0.4(Jun 15, 2022)

Add examples.

Fix search API bug.

v0.0.3(Mar 2, 2022)

Owner

Jiahao Chen (TabChen)
