Python library for parsing resumes using natural language processing and machine learning

Last update: Jul 29, 2021

Overview

CVParser

Python library for parsing resumes using natural language processing and machine learning.

Setup

Installation on Linux and Mac OS

Clone or fork this repo; GitHub's documentation explains how.
If you have not used virtualenv before, see its documentation for a setup guide.

To create a virtualenv (named myvenv in this example), activate it, and install the dependencies, run:

$ virtualenv --python=python3 myvenv

$ source myvenv/bin/activate

(myvenv) $ pip install -r requirements.txt

Usage

from cvparser.parser import CVParser

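# Download the data the parser needs before the first parse
CVParser.download_nlk_data()

# Pass the path to a single resume file (pdf, doc, docx, png, or jpeg)
parser = CVParser(file_path="path/to/file.[pdf|doc|docx|png|jpeg]")
parser.parse()
# Print the parsed result as JSON
print(parser.json())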
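
Since the usage example lists a fixed set of file types, it may help to check a file's extension before handing it to the parser. The following is a minimal sketch, not part of the library: the helper name `is_supported_resume` and the `SUPPORTED_EXTENSIONS` set are hypothetical, and the extension list is simply copied from the placeholder path above.

```python
from pathlib import Path

# Extensions copied from the usage example above. This is an assumption:
# the library itself may accept more or fewer formats.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".png", ".jpeg"}


def is_supported_resume(file_path):
    """Return True if the file extension is one the usage example lists."""
    # Compare case-insensitively so "CV.PDF" is treated like "cv.pdf".
    return Path(file_path).suffix.lower() in SUPPORTED_EXTENSIONS
```

A caller could then construct `CVParser(file_path=path)` only when `is_supported_resume(path)` returns True and skip other files.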

Re-training the Model

cd into the train folder.
Delete the model folder and the train.json file.
Copy your new training data into the train folder. The training data must be in JSON format, which can be generated with the data annotation tool DataTurks. The file containing the training data must be named train.json.
Start re-training the model by executing the Python script manual_training.py in the train folder.
Test your new model by following the steps in the Usage section.
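
The manual steps above could also be scripted. The sketch below is one possible way to do it, under stated assumptions: the function names `prepare_training_data` and `run_training` are hypothetical, the training data is assumed to be either a single JSON document or one JSON object per line, and `manual_training.py` is assumed to need no command-line arguments.

```python
import json
import shutil
import subprocess
import sys
from pathlib import Path


def is_json_or_json_lines(text):
    """Return True if text is one JSON document or one JSON value per line."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        pass
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return False
    try:
        for line in lines:
            json.loads(line)
    except json.JSONDecodeError:
        return False
    return True


def prepare_training_data(train_dir, new_data_file):
    """Remove the old model and install new_data_file as train.json."""
    train_dir = Path(train_dir)
    new_data_file = Path(new_data_file)

    # Validate before touching anything, so a bad file leaves the old
    # model and training data in place.
    if not is_json_or_json_lines(new_data_file.read_text(encoding="utf-8")):
        raise ValueError(f"{new_data_file} does not contain valid JSON")

    # Delete the old model folder if it exists.
    shutil.rmtree(train_dir / "model", ignore_errors=True)

    # Copying overwrites any old train.json, which has the same effect as
    # deleting it first. Skip the copy if the file is already in place.
    target = train_dir / "train.json"
    if new_data_file.resolve() != target.resolve():
        shutil.copyfile(new_data_file, target)
    return target


def run_training(train_dir):
    """Run manual_training.py from inside the train folder."""
    # Assumption: the script takes no arguments and reads train.json
    # from its working directory.
    subprocess.run(
        [sys.executable, "manual_training.py"],
        cwd=str(train_dir),
        check=True,
    )
```

Calling `prepare_training_data("train", "my_annotations.json")` followed by `run_training("train")` would correspond to the delete, copy, and re-train steps; the file name `my_annotations.json` is only an illustration.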

Owner

nafiu
