🏆 • 5050 most frequent words in 109 languages

Last update: Nov 24, 2022

Overview

🏆 Most Common Words Multilingual

5000 most frequent words in 109 languages. Uses wordfrequency.info as a source.

🔗 License

source code license
Data is released under different license(s), since it is taken from online sources. Feel free to contribute your own data!

🌐 Language	📁 File
Afrikaans (af)	.txt
Albanian (sq)	.txt
Amharic (am)	.txt
Arabic (ar)	.txt
Armenian (hy)	.txt
Azerbaijani (az)	.txt
Basque (eu)	.txt
Belarusian (be)	.txt
Bengali (bn)	.txt
Bosnian (bs)	.txt
Bulgarian (bg)	.txt
Catalan (ca)	.txt
Cebuano (ceb)	.txt
Chichewa (ny)	.txt
Chinese (simplified) (zh-CN)	.txt
Chinese (traditional) (zh-TW)	.txt
Corsican (co)	.txt
Croatian (hr)	.txt
Czech (cs)	.txt
Danish (da)	.txt
Dutch (nl)	.txt
English (en)	.txt
Esperanto (eo)	.txt
Estonian (et)	.txt
Filipino (tl)	.txt
Finnish (fi)	.txt
French (fr)	.txt
Frisian (fy)	.txt
Galician (gl)	.txt
Georgian (ka)	.txt
German (de)	.txt
Greek (el)	.txt
Gujarati (gu)	.txt
Haitian creole (ht)	.txt
Hausa (ha)	.txt
Hawaiian (haw)	.txt
Hebrew (iw)	.txt
Hindi (hi)	.txt
Hmong (hmn)	.txt
Hungarian (hu)	.txt
Icelandic (is)	.txt
Igbo (ig)	.txt
Indonesian (id)	.txt
Irish (ga)	.txt
Italian (it)	.txt
Japanese (ja)	.txt
Javanese (jw)	.txt
Kannada (kn)	.txt
Kazakh (kk)	.txt
Khmer (km)	.txt
Kinyarwanda (rw)	.txt
Korean (ko)	.txt
Kurdish (ku)	.txt
Kyrgyz (ky)	.txt
Lao (lo)	.txt
Latin (la)	.txt
Latvian (lv)	.txt
Lithuanian (lt)	.txt
Luxembourgish (lb)	.txt
Macedonian (mk)	.txt
Malagasy (mg)	.txt
Malay (ms)	.txt
Malayalam (ml)	.txt
Maltese (mt)	.txt
Maori (mi)	.txt
Marathi (mr)	.txt
Mongolian (mn)	.txt
Myanmar (my)	.txt
Nepali (ne)	.txt
Norwegian (no)	.txt
Odia (or)	.txt
Pashto (ps)	.txt
Persian (fa)	.txt
Polish (pl)	.txt
Portuguese (pt)	.txt
Punjabi (pa)	.txt
Romanian (ro)	.txt
Russian (ru)	.txt
Samoan (sm)	.txt
Scots gaelic (gd)	.txt
Serbian (sr)	.txt
Sesotho (st)	.txt
Shona (sn)	.txt
Sindhi (sd)	.txt
Sinhala (si)	.txt
Slovak (sk)	.txt
Slovenian (sl)	.txt
Somali (so)	.txt
Spanish (es)	.txt
Sundanese (su)	.txt
Swahili (sw)	.txt
Swedish (sv)	.txt
Tajik (tg)	.txt
Tamil (ta)	.txt
Tatar (tt)	.txt
Telugu (te)	.txt
Thai (th)	.txt
Turkish (tr)	.txt
Turkmen (tk)	.txt
Ukrainian (uk)	.txt
Urdu (ur)	.txt
Uyghur (ug)	.txt
Uzbek (uz)	.txt
Vietnamese (vi)	.txt
Welsh (cy)	.txt
Xhosa (xh)	.txt
Yiddish (yi)	.txt
Yoruba (yo)	.txt
Zulu (zu)	.txt
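
A minimal sketch of how one of the per-language files might be read in Python. It assumes each `.txt` file is UTF-8 encoded and holds one word per line, most frequent first; the README does not state the file layout, so check a file before relying on this. The function name `load_word_list` and the file name `en.txt` are hypothetical.

```python
from pathlib import Path


def load_word_list(path, limit=None):
    """Return the words from a text file, assuming one word per line.

    Assumption: the file is UTF-8 and ordered by descending frequency,
    so the first `limit` entries are the most common words.
    """
    words = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            # Stop once the requested number of words has been collected.
            if limit is not None and len(words) >= limit:
                break
            word = line.strip()
            if word:  # skip blank lines
                words.append(word)
    return words
```

For example, `load_word_list("en.txt", limit=100)` would return the first 100 entries of a downloaded English list, provided the file follows the assumed layout.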


Comments

build(deps): bump certifi from 2021.10.8 to 2022.12.7
Bumps certifi from 2021.10.8 to 2022.12.7.

Commits

9e9e840 2022.12.07

b81bdb2 2022.09.24

939a28f 2022.09.14

aca828a 2022.06.15.2

de0eae1 Only use importlib.resources's new files() / Traversable API on Python ≥3.11 ...

b8eb5e9 2022.06.15.1

47fb7ab Fix deprecation warning on Python 3.11 (#199)

b0b48e0 fixes #198 -- update link in license

9d514b4 2022.06.15

4151e88 Add py.typed to MANIFEST.in to package in sdist (#196)

build(deps): bump numpy from 1.21.4 to 1.22.0
Bumps numpy from 1.21.4 to 1.22.0.

Release notes

Sourced from numpy's releases.

v1.22.0

NumPy 1.22.0 Release Notes

NumPy 1.22.0 is a big release featuring the work of 153 contributors spread over 609 pull requests. There have been many improvements, highlights are:

Annotations of the main namespace are essentially complete. Upstream is a moving target, so there will likely be further improvements, but the major work is done. This is probably the most user visible enhancement in this release.

A preliminary version of the proposed Array-API is provided. This is a step in creating a standard collection of functions that can be used across application such as CuPy and JAX.

NumPy now has a DLPack backend. DLPack provides a common interchange format for array (tensor) data.

New methods for quantile, percentile, and related functions. The new methods provide a complete set of the methods commonly found in the literature.

A new configurable allocator for use by downstream projects.

These are in addition to the ongoing work to provide SIMD support for commonly used functions, improvements to F2PY, and better documentation.

The Python versions supported in this release are 3.8-3.10, Python 3.7 has been dropped. Note that 32 bit wheels are only provided for Python 3.8 and 3.9 on Windows, all other wheels are 64 bits on account of Ubuntu, Fedora, and other Linux distributions dropping 32 bit support. All 64 bit wheels are also linked with 64 bit integer OpenBLAS, which should fix the occasional problems encountered by folks using truly huge arrays.

Expired deprecations

Deprecated numeric style dtype strings have been removed

Using the strings "Bytes0", "Datetime64", "Str0", "Uint32", and "Uint64" as a dtype will now raise a TypeError.

(gh-19539)

Expired deprecations for loads, ndfromtxt, and mafromtxt in npyio

numpy.loads was deprecated in v1.15, with the recommendation that users use pickle.loads instead. ndfromtxt and mafromtxt were both deprecated in v1.17 - users should use numpy.genfromtxt instead with the appropriate value for the usemask parameter.

(gh-19615)

... (truncated)

Commits

4adc87d Merge pull request #20685 from charris/prepare-for-1.22.0-release

fd66547 REL: Prepare for the NumPy 1.22.0 release.

125304b wip

c283859 Merge pull request #20682 from charris/backport-20416

5399c03 Merge pull request #20681 from charris/backport-20954

f9c45f8 Merge pull request #20680 from charris/backport-20663

794b36f Update armccompiler.py

d93b14e Update test_public_api.py

7662c07 Update init.py

311ab52 Update armccompiler.py


Releases (0.1.0)

0.1.0 (Dec 19, 2021)

Data comes from wordfrequency.info.
Source code (tar.gz)
Source code (zip)
all.json (13.20 MB)
most-common-words.zip (2.24 MB)

Owner

🃏 Effectively learn new languages by using cool methods, such as flashcards and most common words!

