Python wrapper for Wikipedia

Overview

Wikipedia API

Wikipedia-API is an easy-to-use Python wrapper for Wikipedia's API. It supports extracting text, sections, links, categories, translations, etc. from Wikipedia. The documentation provides code snippets for the most common use cases.

Installation

This package requires at least Python 3.4 to install because it uses IntEnum.

pip3 install wikipedia-api

Usage

The goal of Wikipedia-API is to provide a simple and easy-to-use API for retrieving information from Wikipedia. Below are examples of common use cases.

Importing

import wikipediaapi

How To Get Single Page

Getting a single page is straightforward. You have to initialize a Wikipedia object and ask for the page by its name. Its language parameter has to be one of the supported languages.

import wikipediaapi
wiki_wiki = wikipediaapi.Wikipedia('en')

page_py = wiki_wiki.page('Python_(programming_language)')

How To Check If Wiki Page Exists

To check whether a page exists, you can use the function exists.

page_py = wiki_wiki.page('Python_(programming_language)')
print("Page - Exists: %s" % page_py.exists())
# Page - Exists: True

page_missing = wiki_wiki.page('NonExistingPageWithStrangeName')
print("Page - Exists: %s" % page_missing.exists())
# Page - Exists: False
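
A common follow-up is to guard property access behind exists(). The sketch below relies only on what this README shows (a page object with an exists() method and a summary property). The helper name summary_or_default and the FakePage stand-in are hypothetical; the stand-in is there so the example runs without network access.

```python
class FakePage:
    """Hypothetical stand-in that mimics the two members used below."""

    def __init__(self, summary, found):
        self.summary = summary
        self._found = found

    def exists(self):
        return self._found


def summary_or_default(page, default="(page not found)"):
    """Return page.summary when the page exists, otherwise a fallback string."""
    return page.summary if page.exists() else default


print(summary_or_default(FakePage("Python is a programming language.", True)))
print(summary_or_default(FakePage("", False)))
```

With the real library, the same helper should accept the object returned by wiki_wiki.page(...), since it only calls exists() and reads summary.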

How To Get Page Summary

The class WikipediaPage has a property summary, which returns the description of the wiki page.

import wikipediaapi
wiki_wiki = wikipediaapi.Wikipedia('en')
page_py = wiki_wiki.page('Python_(programming_language)')

print("Page - Title: %s" % page_py.title)
# Page - Title: Python (programming language)

print("Page - Summary: %s" % page_py.summary[0:60])
# Page - Summary: Python is a widely used high-level programming language for
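
Slicing with summary[0:60] can cut a word in half. As an alternative, here is a minimal sketch that shortens a summary at a word boundary using the standard library's textwrap.shorten. The helper name short_summary and the sample string are illustrative assumptions, not part of Wikipedia-API.

```python
import textwrap


def short_summary(summary, width=60):
    """Collapse whitespace and truncate at a word boundary to at most `width` characters."""
    return textwrap.shorten(summary, width=width, placeholder="...")


# Illustrative sample text; in practice this would be page_py.summary.
sample = ("Python is a widely used high-level programming language "
          "for general-purpose programming.")
print(short_summary(sample))
```

Note that textwrap.shorten also collapses runs of whitespace, so line breaks inside the summary are lost.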

How To Get Page URL

WikipediaPage has two properties with the URL of the page: fullurl and canonicalurl.

print(page_py.fullurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)

print(page_py.canonicalurl)
# https://en.wikipedia.org/wiki/Python_(programming_language)
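
Going the other way, from a URL back to a readable title, can be sketched with the standard library alone. This assumes the /wiki/ path prefix seen in the URLs above; the helper name title_from_url is hypothetical and not part of Wikipedia-API.

```python
from urllib.parse import unquote, urlsplit

WIKI_PREFIX = "/wiki/"  # path prefix seen in the URLs above (assumption)


def title_from_url(url):
    """Extract a human-readable title from a /wiki/ article URL."""
    path = urlsplit(url).path
    if not path.startswith(WIKI_PREFIX):
        raise ValueError("not a /wiki/ article URL: %r" % url)
    slug = path[len(WIKI_PREFIX):]
    # Percent-decode and turn underscores back into spaces.
    return unquote(slug).replace("_", " ")


print(title_from_url("https://en.wikipedia.org/wiki/Python_(programming_language)"))
# Python (programming language)
```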

How To Get Full Text

To get the full text of a Wikipedia page, use the property text, which constructs the text of the page as a concatenation of the summary and the sections with their titles and texts.

wiki_wiki = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.WIKI
)

p_wiki = wiki_wiki.page("Test 1")
print(p_wiki.text)
# Summary
# Section 1
# Text of section 1
# Section 1.1
# Text of section 1.1
# ...


wiki_html = wikipediaapi.Wikipedia(
        language='en',
        extract_format=wikipediaapi.ExtractFormat.HTML
)
p_html = wiki_html.page("Test 1")
print(p_html.text)
# <p>Summary</p>
# <h2>Section 1</h2>
# <p>Text of section 1</p>
# <h3>Section 1.1</h3>
# <p>Text of section 1.1</p>
# ...
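
To make the concatenation described above concrete, here is a minimal sketch of how a summary and a section tree could be joined depth first. It is not the library's implementation: the Section stand-in, the build_text helper, and the newline separator are assumptions chosen for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a section: a title, its text, and nested sections.
Section = namedtuple("Section", ["title", "text", "sections"])


def build_text(summary, sections):
    """Concatenate the summary with each section's title and text, depth first."""
    parts = [summary]

    def visit(section):
        parts.append(section.title)
        parts.append(section.text)
        for child in section.sections:
            visit(child)

    for top in sections:
        visit(top)
    return "\n".join(parts)


tree = [
    Section("Section 1", "Text of section 1", [
        Section("Section 1.1", "Text of section 1.1", []),
    ]),
]
print(build_text("Summary", tree))
```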

How To Get Page Sections

To get all top-level sections of a page, use the property sections. It returns a list of WikipediaPageSection objects, so you have to use recursion to get all subsections.

def print_sections(sections, level=0):
    for s in sections:
        print("%s: %s - %s" % ("*" * (level + 1), s.title, s.text[0:40]))
        print_sections(s.sections, level + 1)


print_sections(page_py.sections)
# *: History - Python was conceived in the late 1980s,
# *: Features and philosophy - Python is a multi-paradigm programming l
# *: Syntax and semantics - Python is meant to be an easily readable
# **: Indentation - Python uses whitespace indentation, rath
# **: Statements and control flow - Python's statements include (among other
# **: Expressions - Some Python expressions are similar to l

How To Get Page In Other Languages

If you want to get other translations of a given page, use the property langlinks. It is a map where the key is a language code and the value is a WikipediaPage.

def print_langlinks(page):
    langlinks = page.langlinks
    for k in sorted(langlinks.keys()):
        v = langlinks[k]
        print("%s: %s - %s: %s" % (k, v.language, v.title, v.fullurl))

print_langlinks(page_py)
# af: af - Python (programmeertaal): https://af.wikipedia.org/wiki/Python_(programmeertaal)
# als: als - Python (Programmiersprache): https://als.wikipedia.org/wiki/Python_(Programmiersprache)
# an: an - Python: https://an.wikipedia.org/wiki/Python
# ar: ar - بايثون: https://ar.wikipedia.org/wiki/%D8%A8%D8%A7%D9%8A%D8%AB%D9%88%D9%86
# as: as - পাইথন: https://as.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8

page_py_cs = page_py.langlinks['cs']
print("Page - Summary: %s" % page_py_cs.summary[0:60])
# Page - Summary: Python (anglická výslovnost [ˈpaiθtən]) je vysokoúrovňový sk
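
Indexing langlinks['cs'] directly raises KeyError when no translation exists for that code. A minimal sketch of a fallback lookup follows. It treats langlinks as a plain mapping, as described above; the helper name pick_translation and the string values in the sample dictionary are illustrative stand-ins for real WikipediaPage objects.

```python
def pick_translation(langlinks, preferred):
    """Return (code, page) for the first preferred language present, or None."""
    for code in preferred:
        if code in langlinks:
            return code, langlinks[code]
    return None


# Stand-in mapping; the real values would be WikipediaPage objects.
sample_langlinks = {
    "af": "Python (programmeertaal)",
    "als": "Python (Programmiersprache)",
    "an": "Python",
}
print(pick_translation(sample_langlinks, ["cs", "als", "af"]))
```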

How To Get Links To Other Pages

If you want to get all links to other wiki pages from a given page, use the property links. It is a map where the key is a page title and the value is a WikipediaPage.

def print_links(page):
    links = page.links
    for title in sorted(links.keys()):
        print("%s: %s" % (title, links[title]))

print_links(page_py)
# 3ds Max: 3ds Max (id: ??, ns: 0)
# ?:: ?: (id: ??, ns: 0)
# ABC (programming language): ABC (programming language) (id: ??, ns: 0)
# ALGOL 68: ALGOL 68 (id: ??, ns: 0)
# Abaqus: Abaqus (id: ??, ns: 0)
# ...
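
Because links is keyed by page title, two pages can be compared with ordinary set operations on the keys. The following sketch assumes only that mapping shape; shared_link_titles is a hypothetical helper, and the sample dictionaries use None in place of WikipediaPage values.

```python
def shared_link_titles(links_a, links_b):
    """Return the sorted titles that appear in both link maps."""
    return sorted(links_a.keys() & links_b.keys())


# Stand-in link maps; real values would be WikipediaPage objects.
links_one = {"3ds Max": None, "ABC (programming language)": None, "ALGOL 68": None}
links_two = {"ALGOL 68": None, "Abaqus": None, "ABC (programming language)": None}
print(shared_link_titles(links_one, links_two))
```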

How To Get Page Categories

If you want to get all categories to which a page belongs, use the property categories. It is a map where the key is a category title and the value is a WikipediaPage.

def print_categories(page):
    categories = page.categories
    for title in sorted(categories.keys()):
        print("%s: %s" % (title, categories[title]))


print("Categories")
print_categories(page_py)
# Category:All articles containing potentially dated statements: ...
# Category:All articles with unsourced statements: ...
# Category:Articles containing potentially dated statements from August 2016: ...
# Category:Articles containing potentially dated statements from March 2017: ...
# Category:Articles containing potentially dated statements from September 2017: ...
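
The keys shown above carry a "Category:" prefix. If only the bare names are wanted, a small helper along these lines could strip it. This is a sketch over a plain mapping; category_names is a hypothetical name and the sample values are placeholders for WikipediaPage objects.

```python
def category_names(categories, prefix="Category:"):
    """Return sorted category titles with the leading prefix removed."""
    return sorted(
        title[len(prefix):] if title.startswith(prefix) else title
        for title in categories
    )


# Stand-in mapping; real values would be WikipediaPage objects.
sample_categories = {
    "Category:All articles with unsourced statements": None,
    "Category:All articles containing potentially dated statements": None,
}
for name in category_names(sample_categories):
    print(name)
```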

How To Get All Pages From Category

To get all pages from a given category, use the property categorymembers. It returns all members of the given category. You have to implement recursion and deduplication yourself.

def print_categorymembers(categorymembers, level=0, max_level=1):
    for c in categorymembers.values():
        print("%s: %s (ns: %d)" % ("*" * (level + 1), c.title, c.ns))
        if c.ns == wikipediaapi.Namespace.CATEGORY and level < max_level:
            print_categorymembers(c.categorymembers, level=level + 1, max_level=max_level)


cat = wiki_wiki.page("Category:Physics")
print("Category members: Category:Physics")
print_categorymembers(cat.categorymembers)

# Category members: Category:Physics
# *: Statistical mechanics (ns: 0)
# *: Category:Physical quantities (ns: 14)
# **: Refractive index (ns: 0)
# **: Vapor quality (ns: 0)
# **: Electric susceptibility (ns: 0)
# **: Specific weight (ns: 0)
# **: Category:Viscosity (ns: 14)
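
The example above recurses but does not deduplicate, so a page that belongs to several subcategories is reported more than once. One possible way to add deduplication is to carry a set of already-seen titles through the recursion, which also guards against category cycles. The sketch below is an assumption-laden illustration: FakeMember is a hypothetical stand-in for WikipediaPage, collect_pages is a hypothetical helper, and the constant 14 is taken from the "ns: 14" shown for categories in the output above.

```python
CATEGORY_NS = 14  # namespace number printed for categories in the output above


class FakeMember:
    """Hypothetical stand-in exposing title, ns and categorymembers."""

    def __init__(self, title, ns=0, members=()):
        self.title = title
        self.ns = ns
        self.categorymembers = {m.title: m for m in members}


def collect_pages(categorymembers, max_level=1, level=0, seen=None):
    """Return titles of non-category members, visiting each title only once."""
    if seen is None:
        seen = set()
    pages = []
    for member in categorymembers.values():
        if member.title in seen:
            continue  # already reported or already visited
        seen.add(member.title)
        if member.ns == CATEGORY_NS:
            if level < max_level:
                pages.extend(
                    collect_pages(member.categorymembers, max_level, level + 1, seen)
                )
        else:
            pages.append(member.title)
    return pages


duplicate = FakeMember("Statistical mechanics")
quantities = FakeMember(
    "Category:Physical quantities",
    ns=CATEGORY_NS,
    members=[FakeMember("Refractive index"), duplicate],
)
physics = {m.title: m for m in [duplicate, quantities]}
print(collect_pages(physics))
```

With the real library, cat.categorymembers could be passed in place of the physics stand-in, since the helper only reads title, ns and categorymembers.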

How To See Underlying API Call

If you have problems retrieving data, you can get the URL of the underlying API call. This will help you determine whether the problem is in the library or somewhere else.

import wikipediaapi
import sys
wikipediaapi.log.setLevel(level=wikipediaapi.logging.DEBUG)

# Set handler if you use Python in interactive mode
out_hdlr = wikipediaapi.logging.StreamHandler(sys.stderr)
out_hdlr.setFormatter(wikipediaapi.logging.Formatter('%(asctime)s %(message)s'))
out_hdlr.setLevel(wikipediaapi.logging.DEBUG)
wikipediaapi.log.addHandler(out_hdlr)

wiki = wikipediaapi.Wikipedia(language='en')

page_ostrava = wiki.page('Ostrava')
print(page_ostrava.summary)
# logger prints out: Request URL: http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Ostrava&explaintext=1&exsectionformat=wiki
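
To check such a request outside the library, the logged URL can be rebuilt by hand and opened in a browser or with any HTTP client. The sketch below reproduces only the query parameters visible in the log line above; the helper name build_query_url is hypothetical, and the real library may send additional parameters.

```python
from urllib.parse import urlencode


def build_query_url(language, title):
    """Rebuild the extracts query URL using the parameters from the log line above."""
    params = {
        "action": "query",
        "prop": "extracts",
        "titles": title,
        "explaintext": 1,
        "exsectionformat": "wiki",
    }
    return "http://%s.wikipedia.org/w/api.php?%s" % (language, urlencode(params))


print(build_query_url("en", "Ostrava"))
```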

Owner
Martin Majlis