A library for pattern matching on symbolic expressions in Python.

Overview

MatchPy

MatchPy is a library for pattern matching on symbolic expressions in Python.

Work in progress

Latest version released on PyPi Latest version released via conda-forge Test coverage Build status of the master branch Documentation Status The Journal of Open Source Software Digital Object Identifier

Installation

MatchPy is available via PyPI, and for Conda via conda-forge. It can be installed with pip install matchpy or conda install -c conda-forge matchpy.

Overview

This package implements pattern matching in Python. Pattern matching is a powerful tool for symbolic computations, operating on symbolic expressions. Given a pattern and an expression (which is usually called subject), the goal of pattern matching is to find a substitution for all the variables in the pattern such that the pattern becomes the subject. As an example, consider the pattern f(x), where f is a function and x is a variable, and the subject f(a), where a is a constant symbol. Then the substitution that replaces x with a is a match. MatchPy supports associative and/or commutative function symbols, as well as sequence variables, similar to pattern matching in Mathematica.

A detailed example of how to use MatchPy can be found here.

MatchPy supports both one-to-one and many-to-one pattern matching. The latter makes use of similarities between patterns to efficiently find matches for multiple patterns at the same time.

A list of publications about MatchPy can be found below.

Expressions

Expressions are tree-like data structures, consisting of operations (functions, internal nodes) and symbols (constants, leaves):

>>> from matchpy import Operation, Symbol, Arity
>>> f = Operation.new('f', Arity.binary)
>>> a = Symbol('a')
>>> print(f(a, a))
f(a, a)

Patterns are expressions which may contain wildcards (variables):

>>> from matchpy import Pattern, Wildcard
>>> x = Wildcard.dot('x')
>>> print(Pattern(f(a, x)))
f(a, x_)

In the previous example, x is the name of the variable. However, it is also possible to use wildcards without names:

>>> w = Wildcard.dot()
>>> print(Pattern(f(w, w)))
f(_, _)

It is also possible to assign variable names to entire subexpressions:

>>> print(Pattern(f(w, a, variable_name='y')))
y: f(_, a)

Pattern Matching

Given a pattern and an expression (which is usually called subject), the idea of pattern matching is to find a substitution that maps wildcards to expressions such that the pattern becomes the subject. In MatchPy, a substitution is a dict that maps variable names to expressions.

>>> from matchpy import match
>>> y = Wildcard.dot('y')
>>> b = Symbol('b')
>>> subject = f(a, b)
>>> pattern = Pattern(f(x, y))
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{x ↦ a, y ↦ b}

Applying the substitution to the pattern results in the original expression.

>>> from matchpy import substitute
>>> print(substitute(pattern, substitution))
f(a, b)

Sequence Wildcards

Sequence wildcards are wildcards that can match a sequence of expressions instead of just a single expression:

>>> z = Wildcard.plus('z')
>>> pattern = Pattern(f(z))
>>> subject = f(a, b)
>>> substitution = next(match(subject, pattern))
>>> print(substitution)
{z ↦ (a, b)}

Associativity and Commutativity

MatchPy natively supports associative and/or commutative operations. Nested associative operators are automatically flattened, the operands in commutative operations are sorted:

>>> g = Operation.new('g', Arity.polyadic, associative=True, commutative=True)
>>> print(g(a, g(b, a)))
g(a, a, b)

Associativity and commutativity is also considered for pattern matching:

>>> pattern = Pattern(g(b, x))
>>> subject = g(a, a, b)
>>> print(next(match(subject, pattern)))
{x ↦ g(a, a)}
>>> h = Operation.new('h', Arity.polyadic)
>>> pattern = Pattern(h(b, x))
>>> subject = h(a, a, b)
>>> list(match(subject, pattern))
[]

Many-to-One Matching

When a fixed set of patterns is matched repeatedly against different subjects, matching can be sped up significantly by using many-to-one matching. The idea of many-to-one matching is to construct a so called discrimination net, a data structure similar to a decision tree or a finite automaton that exploits similarities between patterns. In MatchPy, there are two such data structures, implemented as classes: DiscriminationNet and ManyToOneMatcher. The DiscriminationNet class only supports syntactic pattern matching, that is, operations are neither associative nor commutative. Sequence variables are not supported either. The ManyToOneMatcher class supports associative and/or commutative matching with sequence variables. For syntactic pattern matching, the DiscriminationNet should be used, as it is usually faster.

>>> pattern1 = Pattern(f(a, x))
>>> pattern2 = Pattern(f(y, b))
>>> matcher = ManyToOneMatcher(pattern1, pattern2)
>>> subject = f(a, b)
>>> matches = matcher.match(subject)
>>> for matched_pattern, substitution in sorted(map(lambda m: (str(m[0]), str(m[1])), matches)):
...     print('{} matched with {}'.format(matched_pattern, substitution))
f(a, x_) matched with {x ↦ b}
f(y_, b) matched with {y ↦ a}

Roadmap

Besides the existing features, we plan on adding the following to MatchPy:

  • Support for Mathematica's Alternatives: For example f(a | b) would match either f(a) or f(b).
  • Support for Mathematica's Repeated: For example f(a..) would match f(a), f(a, a), f(a, a, a), etc.
  • Support pattern sequences (PatternSequence in Mathematica). These are mainly useful in combination with Alternatives or Repeated, e.g. f(a | (b, c)) would match either f(a) or f(b, c). f((a a)..) would match any f with an even number of a arguments.
  • All these additional pattern features need to be supported in the ManyToOneMatcher as well.
  • Better integration with existing types such as dict.
  • Code generation for both one-to-one and many-to-one matching. There is already an experimental implementation, but it still has some dependencies on MatchPy which can probably be removed.
  • Improving the documentation with more examples.
  • Better test coverage with more randomized tests.
  • Implementation of the matching algorithms in a lower-level language, for example C, both for performance and to make MatchPy's functionality available in other languages.

Contributing

If you have some issue or want to contribute, please feel free to open an issue or create a pull request. Help is always appreciated!

The Makefile has several tasks to help development:

  • To install all needed packages, you can use make init .
  • To run the tests you can use make test. The tests use pytest.
  • To generate the documentation you can use make docs .
  • To run the style checker (pylint) you can use make check .

If you have any questions or need help with setting things up, please open an issue and we will try the best to assist you.

Publications

Manuel Krebber and Henrik Barthels
Journal of Open Source Software, Volume 3(26), pp. 2, June 2018.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 7th Workshop on Python for High-Performance and Scientific Computing, November 2017.

Manuel Krebber, Henrik Barthels and Paolo Bientinesi
Proceedings of the 15th Python in Science Conference, July 2017.

Manuel Krebber
Master Thesis, RWTH Aachen University, May 2017

If you want to cite MatchPy, please reference the JOSS paper:

@article{krebber2018,
    author    = {Manuel Krebber and Henrik Barthels},
    title     = {{M}atch{P}y: {P}attern {M}atching in {P}ython},
    journal   = {Journal of Open Source Software},
    year      = 2018,
    pages     = 2,
    month     = jun,
    volume    = {3},
    number    = {26},
    doi       = "10.21105/joss.00670",
    web       = "http://joss.theoj.org/papers/10.21105/joss.00670",
}
Comments
  • `ManyToOneReplacer` slow for small expressions

    `ManyToOneReplacer` slow for small expressions

    I have added ~1700 ReplacementRules in ManyToOneReplacer. The .replace() has become very slow even for very small expressions(like 2*x).

    File: https://github.com/parsoyaarihant/sympy/blob/502fd311d64155560383a5f4f3b2710c318dc410/sympy/integrals/rubi/patterns.py

    opened by parsoyaarihant 38
  • Added internal implementation of the Hopcroft-Karp algorithm

    Added internal implementation of the Hopcroft-Karp algorithm

    Added Hopcroft-Karp algorithm in order to free MatchPy of its GPL dependency. Readapted from the C++ implementation in SymEngine.

    Translating the C++ implementation in SymEngine back to Python.

    I'm the author of the C++ implementation so no copyright notice mentioning SymEngine is necessary.

    opened by Upabjojr 23
  • `ManyToOneReplacer` for Rubi

    `ManyToOneReplacer` for Rubi

    Hi, I implemented ~80 rules using ManyToOneReplacer for Rubi integration and MatchPy was able to match the subjects very quickly. However, when I added ~550 rules, there was significant decrease in speed while matching even smaller subjects.

    It would be great if you could advice us on how to solve this issue.

    Here are the files for our project:

    Patterns: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/patterns.py Constraint: https://github.com/parsoyaarihant/sympy/blob/rubi4/sympy/rubi/constraint.py All Rubi Files: https://github.com/parsoyaarihant/sympy/tree/rubi4/sympy/rubi

    opened by parsoyaarihant 21
  • Optional arguments in machpy

    Optional arguments in machpy

    I wanted to know if matchpy supports optional arguments during match.

    Mathematica: https://reference.wolfram.com/language/ref/Optional.html

    Example:

    >>> x = Symbol('x')
    >>> a_ = Wildcard.dot('a', optional=1)
    >>> a_free = CustomConstraint(lambda a, x: str(x) not in a.symbols)
    >>> x_ = Wildcard.dot('x')
    >>> pattern = Pattern(Mul(a_, x_), a_free)
    >>> next(match(x, pattern))
    {a: 1}
    
    opened by parsoyaarihant 20
  • Fixing the code generator

    Fixing the code generator

    The code generator enforces the reading of all variables whenever a constraint has been encountered. In this PR, the constraint is check only if all variables have been read.

    Furthermore, indentation has been changes to four spaces and a related bug forcing to always use bugs has been fixed as well.

    opened by Upabjojr 15
  • Repeated definition of the same CustomConstraint

    Repeated definition of the same CustomConstraint

    In the current RUBI rules patterns in SymPy, the same constraint is defined many times. For example: CustomConstraint(lambda a, x: FreeQ(a, x)) has its definition repeated more and more, like in:

    https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L22

    and in the next rule the same constraint is redefined: https://github.com/sympy/sympy/blob/628286fb5cc2248002bb0b141a911757a2e81fa7/sympy/integrals/rubi/rules/miscellaneous_algebraic.py#L26

    Python defines a new object for every new lambda definition, even if the functions return the same identical value.

    Does this practice have some potential negative consequences, like:

    • the size of the generated decision tree?
    • the overall speed of MatchPy?
    opened by Upabjojr 13
  • using `int` as Wildcard

    using `int` as Wildcard

    Hi, I am trying to define class ConstantWild (similar to ConstantSymbol you suggested earlier).

    Here is the code:

    from matchpy import Wildcard
    
    class ConstantWild(Wildcard):
        def __init__(self, value):
            super(self.__class__, self).__init__(min_count=1, fixed_size=True, variable_name=str(value))
            self.value = value
    
        def __str__(self):
            return str(self.value)
    

    The above code have same attributes as Wildcard.dot but I don't understand why it is giving errors.

    opened by parsoyaarihant 9
  • Custom sorting key for substitute( )

    Custom sorting key for substitute( )

    Substitute now accepts custom sorting key to specify custom sorting for elements in Multiset.

    This should solve the compatibility problem with SymPy which does not allow comparison operators (>, >=, <, <=) to be evaluated to booleans if the truth of the expression cannot be easily determined.

    opened by Upabjojr 8
  • Need Help in code generation.

    Need Help in code generation.

    I could not find any documentation related to code generation. Last year @wheerd had written a code generation script for rubi in sympy. https://gist.github.com/wheerd/13ac5a0e9b560db6201d96136fa0bbcc

    But code generated was huge so, It cannot be used. This year I have removed the repeated definition of constraints. So code generation should work probably.

    However, I am unable to write the code generation script. Here is the new structure of rubi in sympy. Can someone update the gist accordingly?

    rules = [] #list to keep track which rules has been applied
    def cons_f1(m, x):
        return FreeQ(m, x)
    cons1 = CustomConstraint(cons_f1)
    
    def cons_f2(m):
        return NonzeroQ(m + S(1))
    cons2 = CustomConstraint(cons_f2)
    
    pattern1 = Pattern(Integral(x_**WC('m', S(1)), x_), cons1, cons2)
    def replacement1(m, x):
        rules.append(1)
        return Simp(x**(m + S(1))/(m + S(1)), x)
    rule1 = ReplacementRule(pattern1, replacement1)
    
    help wanted question 
    opened by ashishkg0022 7
  • Alternate way to define constraint

    Alternate way to define constraint

    I wanted to know if there is alternatative way to define constraint in matchpy. Example:

    
    def FreeQ(vars, x): # returns True of any `var` contains `x`
    	if any(i.contains(x) for i in vars):
    		return False
    	return True
    
    def NonzeroQ(e): # returns True if `e` is not zero.
    	return e != 0
    
    pattern = Pattern(Int(Pow(Add(a_, Mul(b_, x_)), m_), x_), FreeQ(list(a, b, m), x), NonzeroQ(Add(m, 1)))
    

    Thanks.

    opened by parsoyaarihant 7
  • GPL'd dependency: hopcroftkarp library

    GPL'd dependency: hopcroftkarp library

    The hopcroftkarp library upon which this project depends is GPL'd.

    I have re-implemented the Hopcroft-Karp algorithm in C++ as part of an attempt to make MatchPy usable in SymEngine (that is, SymPy's core rewritten in C++). I didn't perform extensive testing, so there may be bugs. That code can be translated into Python.

    enhancement 
    opened by Upabjojr 6
  • FreeQ equivalent in MatchPy?

    FreeQ equivalent in MatchPy?

    Mathematica has FreeQ to test whether an expression contains a symbol. This is very useful in pattern matching for equations as you can specify that, for example, in a * x + b == 0 the variables a and b should not contain the variable x.

    In SymPy we are currently using things like CustomConstraint(lambda a, x: not a.has(x)), where a.has(x) is a SymPy expression that tells you if x is contained in the expression tree of a.

    Would it make sense to add an optimized FreeQ-like tester that checks whether the variable is contained in the expression during the matching iteration of MatchPy?

    opened by Upabjojr 1
  • Optional wildcards to match the identity element of the current node

    Optional wildcards to match the identity element of the current node

    When using the optional=value argument in Wildcard, one can specify which value to get in case no matching is found. The value has to be given for every wildcard.

    In case of addition and multiplication, the usual optional values are the identity elements, zero and one respectively.

    It would be nice to have the possibility to have the optional value to return the identity element of the current node, if available.

    enhancement 
    opened by Upabjojr 7
  • substitute not working with SymPy

    substitute not working with SymPy

    Calling sorted( ) in https://github.com/HPAC/matchpy/blob/5cae3f275e3a1f725518516bf3c700dc3be03a56/matchpy/functions.py#L87 fails with SymPy, as SymPy objects are not comparable with operators (<, <=, >, >=). Operators are overloaded in SymPy to create instances of inequality objects.

    SymPy has a function called default_sort_key meant to deal with this problem:

    from sympy import symbols
    x, y, z = symbols("x y z")
    from sympy.core.compatibility import default_sort_key
    sorted([z, x, y], key=default_sort_key)
    

    The question is, is it possible to modify MatchPy to specify a custom sorting key so that SymPy can specify the way its objects should be sorted?

    Or is it an issue with Multiset?

    opened by Upabjojr 11
  • One-to-one matching: symbols with variable_name do not work in commutative functions

    One-to-one matching: symbols with variable_name do not work in commutative functions

    In one-to-one matching, symbols that have a variable_name do not work in commutative functions. They seem to work correctly in many-to-one matching. Example:

    from matchpy import Operation, Symbol, Arity, Pattern, match, ManyToOneMatcher
    
    f = Operation.new('f', Arity.binary, commutative=True)
    a_x = Symbol('a', variable_name='x')
    a = Symbol('a')
    b = Symbol('b')
    
    subject = f(a, b)
    pattern = Pattern(f(a_x, b))
    
    print(list(match(subject, pattern)))
    
    matcher = ManyToOneMatcher(pattern)
    print(list(matcher.match(subject)))
    

    This results in

    []
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    but it should result in

    [{'x': Symbol('a')}]
    [(Pattern(f(a: x, b)), {'x': Symbol('a')})]
    

    So far, we don't have any tests that verify this functionality. Once it is fixed, we should add some.

    bug 
    opened by hbarthels 0
  • Match arbitrary function

    Match arbitrary function

    Is there any way to match a function with a certain number of arguments? Suppose I've defined a binary function sum, is there a way to create a pattern f(a, b) that matches sum(x, y)?

    enhancement 
    opened by sidmani 4
  • Improve checking correctness of input

    Improve checking correctness of input

    To avoid issues such as #60, we should check that objects are created and functions are used correctly, at least in those cases where so far, incorrect input does not produce any errors.

    enhancement 
    opened by hbarthels 0
Releases(0.5.5)
Owner
High-Performance and Automatic Computing
Umeå University & RWTH Aachen
High-Performance and Automatic Computing
Coursework project for DIP class. The goal is to use vision to guide the Dashgo robot through two traffic cones in bright color.

Coursework project for DIP class. The goal is to use vision to guide the Dashgo robot through two traffic cones in bright color.

Yueqian Liu 3 Oct 24, 2022
Jack Morgan's Advent of Code Solutions

Advent-of-Code Jack Morgan's Advent of Code Solutions Usage Run . initiate.sh year day To initiate a day. This sets up a template python file, and pul

Jack Morgan 1 Dec 10, 2021
Python meta class and abstract method library with restrictions.

abcmeta Python meta class and abstract method library with restrictions. This library provides a restricted way to validate abstract methods. The Pyth

Morteza NourelahiAlamdari 8 Dec 14, 2022
Automatic certificate unpinning for Android apps

What is this? Script used to perform automatic certificate unpinning of an APK by adding a custom network security configuration that permits user-add

Antoine Neuenschwander 5 Jul 28, 2021
A Sophisticated And Beautiful Doxing Tool

Garuda V1.1 A Sophisticated And Beautiful Doxing Tool Works on Android[Termux] | Linux | Windows Don't Forget to give it a star ❗ How to use ❓ First o

The Cryptonian 67 Jan 10, 2022
Attempt at a Windows version of the plotman Chia Plot Manager system

windows plotman: an attempt to get plotman to work on windows THIS IS A BETA. Not ready for production use just yet. Almost, but not quite there yet.

59 May 11, 2022
Unofficial Python Library to communicate with SESAME 3 series products from CANDY HOUSE, Inc.

pysesame3 Unofficial Python Library to communicate with SESAME 3 series products from CANDY HOUSE, Inc. This project aims to control SESAME 3 series d

Masaki Tagawa 18 Dec 12, 2022
B-Pkg is a simple tool in python for installing all basic package in termux

Basic-Pkg 👉🏻 Basic-Pkg 👈🏻 B-Pkg is a simple tool in python for installing all basic package in termux This is my first tool, I hope you will like

Macgaiver 3 Oct 21, 2021
A GUI love Calculator which saves all the User Data in text file(sql based script will be uploaded soon). Interative GUI. Even For Admin Panel

Love-Calculator A GUI love Calculator which saves all the User Data in text file(sql based script will be uploaded soon). Interative GUI, even For Adm

Adithya Krishnan 1 Mar 22, 2022
App to get data from popular polish pages with job offers

Job board parser I written simple app to get me data from popular pages with job offers, because I wanted to knew immidietly if there is some new offe

0 Jan 04, 2022
Application to list countries in order of travel from the United States.

Application to list countries in order of travel from the United States.

Broden Wanner 1 Nov 03, 2021
Structured Exceptions for Python

XC: Structured exceptions for Python XC encourages a structured, disciplined approach to use of exceptions: it reduces the overhead of declaring excep

Bob Gautier 2 May 28, 2021
CarolinaCon CTF Online

CarolinaCon Online CTF CTF challenges from CarolinaCon Online April 23 through April 25, 2021. All challenges from the CTF will eventually be here. Co

49th Security Division 6 May 04, 2022
Script to calculate the italian fiscal code of a person.

fiscal_code Hi! This is my first public repository, so please be kind if it is not well formatted or it contains errors. I started learning Python abo

FrancescoDiMuro 1 Nov 20, 2021
Always fill your package requirements without the user having to do anything! Simple and easy!

WSL Should now work always-fill-reqs-python3 Always fill your package requirements without the user having to do anything! Simple and easy! Supported

Hashm 7 Jan 19, 2022
Stack-overflow-import - Import arbitrary code from Stack Overflow as Python modules.

StackOverflow Importer Do you ever feel like all you’re doing is copy/pasting from Stack Overflow? Let’s take it one step further. from stackoverflow

Filip Haglund 3.7k Jan 08, 2023
Get a link to the web version of a git-tracked file or directory

githyperlink Get a link to the web version of a git-tracked file or directory. Applies to GitHub and GitLab remotes (and maybe others but those are no

Tomas Fiers 2 Nov 08, 2022
Dev-meme - A repository that contains memes just for people like us

A repository that contains memes just for people like us. Coders are constantly

Padmashree Jha 4 Oct 31, 2022
Convert .1pux to .csv

1PasswordConverter Convert .1pux to .csv 1Password uses this new export format .1pux, I assume stands for 1 Password User eXport. As of right now, 1Pa

Shayne Hartford 7 Dec 16, 2022
Pixelarticons - Pixel Art Icons made simple for Flutter, powered by pixelarticons and fontify

Pixelarticons - Pixel Art Icons made simple for Flutter, powered by pixelarticons and fontify

lask 16 Dec 12, 2022