用python爬取江苏几大高校的就业网站,并提供3种方式通知给用户,分别是通过微信发送、命令行直接输出、windows气泡通知。

Overview

crawler_for_university

用python爬取江苏几大高校的就业网站,并提供3种方式通知给用户,分别是通过微信发送、命令行直接输出、windows气泡通知。

环境依赖

wxpy,requests,bs4等库

功能描述

该项目基于python,通过爬虫爬各高校的就业信息网,爬取招聘信息并存储,如果碰到新的信息,则输出,提供3种输出方式:

微信发送消息

微信发消息基于网页版微信实现,使用wxpy库,使用该库的同时,不能使用电脑版或pad版微信,否则会挤下线。 并非所有用户都能使用该功能,查询自己能否使用该功能,需要打开https://wx.qq.com/。检测能否扫码登录,如果可以,则能使用。

直接命令行输出

如果不能使用,可以直接命令行输出爬取后的信息。

windows下利用气泡通知

windows下提供操作中心显示通知,可以在windows的操作中心查看消息。

重要代码描述

该函数用以爬取url的信息

def get_url(url, kv):
    '''
    用以爬取网站内容的函数
    :param url:输入url
    :param kv:headers信息
    :return:返回爬取到的内容
    '''
    try:
        r = requests.get(url, headers=kv)
        r.raise_for_status()
        return r
    except:
        try:
            time.sleep(3)
            r = requests.get(url, headers=kv)
            r.raise_for_status()
            return r
        except:
            return 0

该函数输入大学简称,对网页内容进行爬取,筛选,然后发送通知。

def get_job(university):
    '''
    用来获取各大学的就业信息网的内容
    :param university:输入学校简称
    :return:无
    '''
    global url_list, send_target
    job_url = 'http://' + university + '.91job.org.cn/campus'  # 生成url
    r = get_url(url=job_url, kv={'User-Agent': 'Mozilla/5.0'})
    soup = BeautifulSoup(r.text, 'lxml')
    r_soup = soup.find_all(attrs={'class': 'infoList'})  # 解析网页找到对应的内容
    for i in r_soup:  # 遍历每个结果
        temp = i.find(attrs={'class': 'span7'}).find(name='a').get('href')  # 找到通知对应的网站
        url = job_url + temp[7:]  # 生成招聘信息对应的网站
        if url not in url_list:  # 如果这条信息之前并未存储
            with open("url_list.txt", "a+") as f:  # 打开文件,并添加招聘信息
                f.write(url + '\n')
            url_list.append(url)  # 本地list里面也添加信息
            message_title = university_list[university] + '有一条招聘消息:'  # 标题
            message_text = i.get_text() + url  # 内容
            if 1 in model_choose:  # 模式1,直接print
                print('*' * 100)
                print(message_title + message_text)
            if 2 in model_choose:  # 模式2,给微信好友发消息
                send_target.send(message_title + message_text)
            if 3 in model_choose:  # 模式3,windows气泡消息
                if flag:
                    message.show_msg(message_title, message_text, 1)
            if flag:  # 提示音
                winsound.Beep(freq, duration)
            else:
                os.system('play --no-show-progress --null --channels 1 synth %s sine %f' % (duration / 1000, freq))

使用方法

下载main文件,安装所需要的库,在命令行下面代码进行运行

python main.py
SmartScraper: 简单、自动、快捷的Python网络爬虫

SmartScraper: 简单、自动、快捷的Python网络爬虫 Note: The origin developer of SmartScraper is Alireza Mika, I only change a little code of AutoScraper. SmartScraper

DaDeng 9 Apr 16, 2022
京东茅台抢购最新优化版本,京东茅台秒杀,优化了茅台抢购进程队列

京东茅台抢购最新优化版本,京东茅台秒杀,优化了茅台抢购进程队列

MaoTai 129 Dec 14, 2022
CRI Scrape is a tool for get general info about Italian Red Cross in GAIA Platform

CRI Scrape CRI Scrape is a tool for get general info about Italian Red Cross in GAIA Platform Disclaimer This code is only for educational purpose. So

Vincenzo Cardone 0 Jul 23, 2022
A Telegram crawler to search groups and channels automatically and collect any type of data from them.

Introduction This is a crawler I wrote in Python using the APIs of Telethon months ago. This tool was not intended to be publicly available for a numb

39 Dec 28, 2022
Instagram_scrapper - This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or excel file easily.

Instagram_scrapper This project allow you to scrape the list of followers, following or both from a public Instagram account, and create a csv or exce

Lakhdar Belkharroubi 5 Oct 17, 2022
Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

Binance harvester - A Python 3 script to harvest data from the Binance socket stream and calculate popular TA indicators and produce lists of top trending coins

68 Oct 08, 2022
A Python library for automating interaction with websites.

Home page https://mechanicalsoup.readthedocs.io/ Overview A Python library for automating interaction with websites. MechanicalSoup automatically stor

4.3k Jan 07, 2023
Facebook Group Scraping Using Beautiful Soup & Selenium

Extract Facebook group posts that are related to a specific topic and write them to a .json file.

Fatima Ghadieh 14 Aug 12, 2022
IGLS - Instagram Like Scraper CLI tool

IGLS - Instagram Like Scraper It's a web scraping command line tool based on python and selenium. Description This is a trial tool for learning purpos

Shreshth Goyal 5 Oct 29, 2021
A web scraper that exports your entire WhatsApp chat history.

WhatSoup 🍲 A web scraper that exports your entire WhatsApp chat history. Table of Contents Overview Demo Prerequisites Instructions Frequen

Eddy Harrington 87 Jan 06, 2023
This program scrapes information and images for movies and TV shows.

Media-WebScraper This program scrapes information and images for movies and TV shows. Summary For more information on the program, read the WebScrape_

1 Dec 05, 2021
A repository with scraping code and soccer dataset from understat.com.

UNDERSTAT - SHOTS DATASET As many people interested in soccer analytics know, Understat is an amazing source of information. They provide Expected Goa

douglasbc 48 Jan 03, 2023
热搜榜-python爬虫+正则re+beautifulsoup+xpath

仓库简介 微博热搜榜, 参数wb 百度热搜榜, 参数bd 360热点榜, 参数360 csdn热榜接口, 下方查看 其他热搜待加入 如何使用? 注册vercel fork到你的仓库, 右上角 点击这里完成部署(一键部署) 请求参数 vercel配置好的地址+api?tit=+参数(仓库简介有参数信息

Harry 3 Jul 08, 2022
Proxy scraper. Format: IP | PORT | COUNTRY | TYPE

proxy scraper 🔎 Installation: git clone https://github.com/ebankoff/proxy_scraper Required pip libraries (pip install library name): lxml beautifulso

Eban'ko 19 Dec 07, 2022
Jobinja.ir jobs scraper.

Jobinja.ir Dataset Introduction This project is a simple web scraper that scraps pages of jobinja.ir concurrently and writes and update (if file gets

Iman Kermani 3 Apr 15, 2022
A Python Oriented tool to Scrap WhatsApp Group Link using Google Dork it Scraps Whatsapp Group Links From Google Results And Gives Working Links.

WaGpScraper A Python Oriented tool to Scrap WhatsApp Group Link using Google Dork it Scraps Whatsapp Group Links From Google Results And Gives Working

Muhammed Rizad 27 Dec 18, 2022
Google Scholar Web Scraping

Google Scholar Web Scraping This is a python script that asks for a user to input the url for a google scholar profile, and then it writes publication

Suzan M 1 Dec 12, 2021
Create crawler get some new products with maximum discount in banimode website

crawler-banimode create crawler and get some new products with maximum discount in banimode website. این پروژه کوچک جهت یادگیری و کار با ابزار سلنیوم

nourollah rezaei 2 Feb 17, 2022
This is a module that I had created along with my friend. It's a basic web scraping module

QuickInfo PYPI link : https://pypi.org/project/quickinfo/ This is the library that you've all been searching for, it's built for developers and allows

OneBit 2 Dec 13, 2021
Web-scraping - A bot using Python with BeautifulSoup that scraps IRS website by form number and returns the results as json

Web-scraping - A bot using Python with BeautifulSoup that scraps IRS website (prior form publication) by form number and returns the results as json. It provides the option to download pdfs over a ra

1 Jan 04, 2022