A GitHub action that suggests type annotations for Python using machine learning.

Typilus: Suggest Python Type Annotations

This action makes suggestions within each pull request as suggested edits. You can then directly apply these suggestions to your code or ignore them.

[Image: sample suggestion]

What are Python type annotations?

Introduced in Python 3.5, type hints (more traditionally called type annotations) allow users to annotate their code with the expected types. These annotations are optionally checked by external tools, such as mypy and pyright, to prevent type errors; they also facilitate code comprehension and navigation. The typing module provides the core types.
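
As a small illustration of what such annotations look like, here is a minimal sketch. The function names are made up for this example and are not part of Typilus; the annotations only describe the expected types and have no effect at runtime unless a checker such as mypy is run.

```python
from typing import Dict, List, Optional


def word_lengths(words: List[str]) -> Dict[str, int]:
    """Map each word to its length."""
    return {word: len(word) for word in words}


def first_long_word(words: List[str], min_length: int = 5) -> Optional[str]:
    """Return the first word with at least min_length characters, or None."""
    for word in words:
        if len(word) >= min_length:
            return word
    return None
```

A type checker would flag a call such as word_lengths(42), because an int is not a List[str]; the interpreter itself only fails once the loop tries to iterate over the integer.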

Why use machine learning?

Given the dynamic nature of Python, type inference is challenging, especially over partial contexts. To tackle this challenge, we use a graph neural network model that predicts types by probabilistically reasoning over a program’s structure, names, and patterns. This allows us to make suggestions with only a partial context, at the cost of suggesting some false positives.

Install Action in your Repository

To use the GitHub action, create a workflow file in your repository's .github/workflows directory. For example:

name: Typilus Type Annotation Suggestions

# Controls when the action will run. Triggers the workflow on pull request
# events, but only for the master branch
on:
  pull_request:
    branches: [ master ]

jobs:
  suggest:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    steps:
    # Checks-out your repository under $GITHUB_WORKSPACE, so that typilus can access it.
    - uses: actions/checkout@v2
    - uses: typilus/typilus-action@v0.9
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        MODEL_PATH: path/to/model.pkl.gz   # Optional: provide the path of a custom model instead of the pre-trained model.
        SUGGESTION_CONFIDENCE_THRESHOLD: 0.8   # Configure this to limit the confidence of suggestions on un-annotated locations. A float in [0, 1]. Default 0.8
        DISAGREEMENT_CONFIDENCE_THRESHOLD: 0.95  # Configure this to limit the confidence of suggestions on annotated locations.  A float in [0, 1]. Default 0.95
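
One way to read the two thresholds above is as filters over the model's predictions. The following is a minimal sketch under stated assumptions, not the action's actual code: the Suggestion record, its field names, and the use of >= for the comparison are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Suggestion:
    """Hypothetical record for one predicted annotation."""
    symbol: str
    predicted_type: str
    confidence: float  # model probability in [0, 1]
    existing_annotation: Optional[str] = None  # None if un-annotated


def filter_suggestions(
    suggestions: List[Suggestion],
    suggestion_threshold: float = 0.8,
    disagreement_threshold: float = 0.95,
) -> List[Suggestion]:
    """Keep only the predictions that are confident enough to report."""
    kept = []
    for s in suggestions:
        if s.existing_annotation is None:
            # Un-annotated location: apply the suggestion threshold.
            if s.confidence >= suggestion_threshold:
                kept.append(s)
        elif s.existing_annotation != s.predicted_type:
            # Annotated location where the prediction disagrees with the
            # existing annotation: apply the stricter threshold.
            if s.confidence >= disagreement_threshold:
                kept.append(s)
    return kept
```

Under this reading, raising either threshold trades fewer suggestions for fewer false positives.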

The action uses the GITHUB_TOKEN to retrieve the diff of the pull request and to post comments on the analyzed pull request.
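
To make the role of the token concrete, here is a hedged sketch of how a pull request diff can be requested from the GitHub REST API using only the Python standard library. The helper name build_diff_request is hypothetical, and the action's own request may be built differently; the sketch only relies on the REST API returning a diff when the diff media type is requested.

```python
import urllib.request


def build_diff_request(repo: str, pull_number: int, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a pull request's diff.

    repo is "owner/name"; token is the value of GITHUB_TOKEN.
    """
    url = f"https://api.github.com/repos/{repo}/pulls/{pull_number}"
    headers = {
        # Authenticate with the workflow's token.
        "Authorization": f"token {token}",
        # Ask for the diff representation instead of the default JSON.
        "Accept": "application/vnd.github.v3.diff",
    }
    return urllib.request.Request(url, headers=headers)
```

Sending the request, for example with urllib.request.urlopen(request).read().decode(), would return the unified diff as text.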

Technical Details & Internals

This GitHub action is a reimplementation of the Graph2Class model of Allamanis et al. (PLDI 2020) using the ptgnn library. Internally, it uses a graph neural network to predict likely type annotations for Python code.

This action uses a neural network pre-trained on a corpus of open-source repositories that use Python's type annotations. At this point, we do not support online adaptation of the model to each project.

Training your own model

You may wish to train your own model and use it in this action. To do so, please follow the steps in ptgnn. Then provide a path to the model in your GitHub action configuration, through the MODEL_PATH environment variable.
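
The sketch below shows one way the environment variables from the workflow file could be read, assuming the defaults stated there (0.8 and 0.95) and treating an unset MODEL_PATH as "use the pre-trained model". The ActionConfig type and load_config function are hypothetical names for illustration, not the action's actual code.

```python
import os
from typing import Mapping, NamedTuple, Optional


class ActionConfig(NamedTuple):
    """Hypothetical container for the action's settings."""
    model_path: Optional[str]  # None means: use the pre-trained model
    suggestion_threshold: float
    disagreement_threshold: float


def _read_threshold(env: Mapping[str, str], name: str, default: float) -> float:
    """Read a float in [0, 1] from env, falling back to default."""
    value = float(env.get(name, default))
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be in [0, 1], got {value}")
    return value


def load_config(env: Mapping[str, str] = os.environ) -> ActionConfig:
    """Collect the settings from environment variables."""
    return ActionConfig(
        model_path=env.get("MODEL_PATH") or None,
        suggestion_threshold=_read_threshold(
            env, "SUGGESTION_CONFIDENCE_THRESHOLD", 0.8),
        disagreement_threshold=_read_threshold(
            env, "DISAGREEMENT_CONFIDENCE_THRESHOLD", 0.95),
    )
```

Passing the environment as a parameter keeps the function easy to test with a plain dictionary.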

Contributing

We welcome external contributions and ideas. Please look at the issues in the repository for ideas and improvements.

Comments
  • IndexError: list index out of range

    Diff GET Status Code:  200
    Traceback (most recent call last):
      File "/usr/src/entrypoint.py", line 81, in <module>
        changed_files = get_changed_files(diff_rq.text)
      File "/usr/src/changeutils.py", line 38, in get_changed_files
        assert file_diff_lines[3].startswith("---")
    IndexError: list index out of range
    

    logs_302.zip

    opened by ZdenekM 1
  • Several small fixes

    Here are a couple of things I noticed while trying Typilus inference using the GH Action:

    • gracefully handle patches that include file renames (without any content modifications) by skipping such files
    • extractor stats reporting only processed files
    opened by bzz 0
  • Create a ptgnn-based Typilus model

    Create and use the full Typilus model instead of graph2class.

    • [ ] Implement it in ptgnn
    • [ ] Use action cache to store intermediate result
    • [ ] Auto-update type space "once in a while"
    enhancement 
    opened by mallamanis 0
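
The first two reports above both concern per-file sections of a diff that lack the usual "---" and "+++" header lines, as happens for rename-only patches. The following is a minimal sketch of a more tolerant parser, assuming git-style unified diffs; the function names are hypothetical and this is not the code in changeutils.py.

```python
from typing import List


def split_file_diffs(diff_text: str) -> List[List[str]]:
    """Split a git-style diff into one list of lines per file."""
    sections: List[List[str]] = []
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            sections.append([line])
        elif sections:
            sections[-1].append(line)
    return sections


def changed_python_files(diff_text: str) -> List[str]:
    """Return the paths of Python files whose contents changed."""
    changed = []
    for section in split_file_diffs(diff_text):
        new_path = None
        for line in section:
            if line.startswith("@@"):
                break  # first hunk reached; headers are over
            if line.startswith("+++ "):
                new_path = line[4:].strip()
                break
        # No "+++" header: rename-only, mode-only, or binary change.
        # "/dev/null": the file was deleted. Skip both cases.
        if new_path is None or new_path == "/dev/null":
            continue
        if new_path.startswith("b/"):
            new_path = new_path[2:]
        if new_path.endswith(".py"):
            changed.append(new_path)
    return changed
```

Searching for the header lines instead of indexing a fixed position avoids the IndexError on short sections.
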
Releases(v0.9)