Functional tensors for probabilistic programming

Overview


Funsor

Funsor is a tensor-like library for functions and distributions.

See Functional tensors for probabilistic programming for a system description.

Installing

Funsor supports Python 3.6+. Install using pip:

pip install funsor

Install from source:

git clone [email protected]:pyro-ppl/funsor.git
cd funsor
git checkout master
pip install .

Using funsor

Funsor can be used through a number of interfaces:

  • Funsors can be used directly for probabilistic computations, using PyTorch optimizers in a standard training loop. Start with these examples: discrete_hmm, eeg_slds, kalman_filter, pcfg, sensor, slds, and vae.
  • Funsors can be used to implement custom inference algorithms within Pyro, using custom elbo implementations in standard pyro.infer.SVI training. See these examples: mixed_hmm and bart forecasting.
  • funsor.pyro provides a number of Pyro-compatible (and PyTorch-compatible) distribution classes that use funsors under the hood, as well as utilities to convert between funsors and distributions.
  • funsor.minipyro provides a limited alternate backend for the Pyro probabilistic programming language, and can perform some ELBO computations exactly.

Design

See design doc.

The goal of this library is to generalize Pyro's delayed inference algorithms from discrete to continuous variables, and to create machinery to enable partially delayed sampling compatible with universality. To achieve this goal, this library makes three orthogonal design choices:

  1. Open terms are objects. Funsors generalize the tensor interface to also cover arbitrary functions of multiple variables ("inputs"), where variables may be integers, real numbers, or real tensors. Function evaluation / substitution is the basic operation, generalizing tensor indexing. This allows probability distributions to be first-class funsors and to make use of existing tensor machinery; for example, we can generalize tensor contraction to computing analytic integrals in conjugate probabilistic models.

  2. Support for nonstandard interpretation. Funsors support user-defined interpretations, including eager, lazy, mixed eager+lazy, memoized (like opt_einsum's sharing), and approximate interpretations such as Monte Carlo approximations of integration operations (e.g. .sum() over a funsor dimension).

  3. Named dimensions. To avoid the difficulties of broadcasting and advanced indexing in positionally-indexed tensor libraries, all funsor dimensions are named, and substitution is the most basic operation. Indexing uses the .__call__() method and can be interpreted as substitution (with well-understood semantics). Funsors are viewed as algebraic expressions with one algebraic free variable per dimension. Each dimension is either covariant (an output) or contravariant (an input). (See the sketch below.)
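
To make choices (1) and (3) concrete, here is a minimal sketch of named dimensions and substitution (a sketch only; the exact API may differ slightly across funsor versions):

from collections import OrderedDict

import torch

import funsor
import funsor.ops as ops
from funsor import Bint
from funsor.tensor import Tensor

funsor.set_backend("torch")

# A tensor with two named integer inputs instead of positional dims.
x = Tensor(torch.randn(3, 4), OrderedDict(i=Bint[3], j=Bint[4]))

# Indexing is substitution via .__call__(): bind the free variable "i".
row = x(i=0)
assert set(row.inputs) == {"j"}

# Reduction over a named dimension, e.g. summing out "j".
total = x.reduce(ops.add, "j")
assert set(total.inputs) == {"i"}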

Using funsor, we can easily implement Pyro-style delayed sampling, roughly:

lazy = False  # config flag: if True, sample statements are delayed
trace_log_prob = 0.

def pyro_sample(name, dist, obs=None):
    global trace_log_prob
    assert isinstance(dist, Funsor)
    if obs is not None:
        value = obs
    elif lazy:
        # delayed sampling (like Pyro's parallel enumeration)
        value = funsor.Variable(name, dist.support)
    else:
        value = dist.sample('value')[0]['value']

    # save the sample's log_prob in the trace
    trace_log_prob += dist(value)

    return value

# ...later during inference...
loss = -trace_log_prob.reduce(logaddexp)  # collapses delayed variables

See funsor/minipyro.py for complete implementation.

Citation

If you use Funsor, please consider citing:

@article{obermeyer2019functional,
  author = {Obermeyer, Fritz and Bingham, Eli and Jankowiak, Martin and
            Phan, Du and Chen, Jonathan P},
  title = {{Functional Tensors for Probabilistic Programming}},
  journal = {arXiv preprint arXiv:1910.10775},
  year = {2019}
}
Comments
  • Profiling GaussianHMM with NumPyro backend

    I just profiled funsor.pyro.hmm.GaussianHMM and pyro.distributions.GaussianHMM and have some observations:

    • funsor is slow compared to pyro, especially for small hidden/obs dims. With batch_dim, time_dim, obs_dim, hidden_dim = 5, 6, 3, 2, pyro takes 5ms while funsor takes 35ms to evaluate log_prob.
    • The overhead in funsor seems to be constant. I increased obs_dim, hidden_dim to 30, 20 and verified that the overhead stays roughly the same.
    • torch.cat takes a large amount of time in funsor (e.g. with batch_dim, time_dim = 50, 60, this op takes 7ms of the total 70ms). Similarly, torch.pad takes a portion of the time in pyro (but less than torch.cat does in funsor). I think the reason is that we replace pad by new_zeros + cat in funsor.
    • Time for numerical calculations (except cat, pad) seems to be similar between the two versions. It seems to me that at some point funsor switches dimensions; I guess this is because funsor stores dimensions in a set. For example, I looked at the args of triangular_solve and got (this is not important, I believe):
    funsor torch.Size([5, 3, 2, 4]) torch.Size([5, 3, 2, 2])
    funsor torch.Size([5, 3, 2, 1]) torch.Size([5, 3, 2, 2])
    funsor torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    funsor torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    funsor torch.Size([1, 5, 2, 4]) torch.Size([1, 5, 2, 2])
    funsor torch.Size([1, 5, 2, 1]) torch.Size([1, 5, 2, 2])
    funsor torch.Size([5, 2, 2]) torch.Size([5, 2, 2])
    funsor torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    funsor torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    CPU times: user 287 ms, sys: 30.9 ms, total: 318 ms
    Wall time: 28.3 ms
    pyro torch.Size([5, 3, 2, 4]) torch.Size([5, 3, 2, 2])
    pyro torch.Size([5, 3, 2, 1]) torch.Size([5, 3, 2, 2])
    pyro torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 4]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 1, 2, 1]) torch.Size([5, 1, 2, 2])
    pyro torch.Size([5, 2, 2]) torch.Size([5, 2, 2])
    pyro torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    pyro torch.Size([5, 2, 1]) torch.Size([5, 2, 2])
    CPU times: user 70.4 ms, sys: 1.17 ms, total: 71.6 ms
    Wall time: 5.84 ms
    
    Profiling code
    import pyro.distributions as dist
    import torch
    from pyro.distributions.util import broadcast_shape

    from funsor.pyro.hmm import GaussianHMM
    from funsor.testing import random_mvn
    
    batch_dim, time_dim, obs_dim, hidden_dim = 5, 6, 3, 2
    
    init_shape = (batch_dim,)
    trans_mat_shape = trans_mvn_shape = obs_mat_shape = obs_mvn_shape = (batch_dim, time_dim)
    init_dist = random_mvn(init_shape, hidden_dim)
    trans_mat = torch.randn(trans_mat_shape + (hidden_dim, hidden_dim))
    trans_dist = random_mvn(trans_mvn_shape, hidden_dim)
    obs_mat = torch.randn(obs_mat_shape + (hidden_dim, obs_dim))
    obs_dist = random_mvn(obs_mvn_shape, obs_dim)
    
    actual_dist = GaussianHMM(init_dist, trans_mat, trans_dist, obs_mat, obs_dist)
    expected_dist = dist.GaussianHMM(init_dist, trans_mat, trans_dist, obs_mat, obs_dist)
    assert actual_dist.batch_shape == expected_dist.batch_shape
    assert actual_dist.event_shape == expected_dist.event_shape
    
    shape = broadcast_shape(init_shape + (1,),
                            trans_mat_shape, trans_mvn_shape,
                            obs_mat_shape, obs_mvn_shape)
    data = obs_dist.expand(shape).sample()
    assert data.shape == actual_dist.shape()
    
    %time actual_log_prob = actual_dist.log_prob(data)
    %time expected_log_prob = expected_dist.log_prob(data)
    

    With the above observations, I think that using jax.jit will resolve this "constant" overhead, so I would like to go ahead and implement GaussianHMM in NumPyro. This requires modifying the current funsor.pyro to make it work with NumPyro. All changes in this PR are just for profiling, not to merge. Hopefully, jax.jit will work well with the lazy evaluation mechanisms in funsor.
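
    To illustrate why jit should help (a generic jax sketch, not funsor code): jax.jit traces the Python once and compiles the whole computation into a single XLA call, so per-op Python dispatch overhead is paid only at trace time.

    import jax
    import jax.numpy as jnp

    # Stand-in for a log_prob made of many small ops, where Python
    # dispatch overhead dominates at small sizes.
    def log_prob(trans, obs):
        x = trans
        for _ in range(10):  # a chain of small matmuls
            x = x @ trans
        return jnp.sum(x * obs)

    fast_log_prob = jax.jit(log_prob)  # first call compiles; later calls are cheap

    trans = jnp.eye(2) * 0.9
    obs = jnp.ones((2, 2))
    print(fast_log_prob(trans, obs))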

    Tasks

    • [x] todo
    discussion 
    opened by fehiepsi 14
  • Support global backend

    Addresses #207

    Following up on the discussion in #317, this PR supports a global backend in funsor.

    To make reviewing easier, here are the important changes in this PR:

    • Separate out the jax and numpy implementations. I also took this chance to reorder the ops by name (to make it easier to compare backends or to find a specific implementation).
    • Use numpy for the default implementation of numeric ops. funsor.torch and funsor.jax are lazily imported, depending on the environment variable FUNSOR_BACKEND or a call to funsor.set_backend(...).
    • We can change the backend with the set_backend utility (usage sketch after this list). One problem: once we have changed the backend to JAX, we can't revert to the numpy backend.
    • Port Pyro's Tensordot implementation to funsor. The Pyro implementation has some x.dim() statements, which are not available on ndarray.
    • Although numpy will be the default backend, I still use PyTorch names (except for amax, amin) for these ops (as on the current master branch). I guess Pyro devs would prefer PyTorch names to NumPy names for numeric ops.
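
    A usage sketch (assuming this PR's FUNSOR_BACKEND variable and the set_backend utility):

    # Option 1: pick the backend via an environment variable, read at import time:
    #     FUNSOR_BACKEND=jax python my_script.py

    # Option 2: switch at runtime.
    import funsor
    funsor.set_backend("torch")  # numpy is the default; switching jax -> numpy is not supported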

    Some small changes:

    • replace torch._C._get_tracing_state with funsor.util.get_tracing_state
    • add funsor.util.is_nn_module
    • add ops.is_numeric_array to check whether a variable is a numeric array. We can rename it to ops.is_tensor to match torch.is_tensor if preferred.
    • add ops.detach to detach gradients. This is a no-op for the numpy backend.
    • fix an issue where ops.log emits warnings at the value 0.
    • support both a tuple and unpacked sizes for the testing utilities randn, zeros, ... (because many places in the test files use the pattern torch.zeros(1, 2, 3) instead of torch.zeros((1, 2, 3)))
    • the dispatch for recursion_reinterpret.register(torch.Tensor) is implemented in torch.py instead of interpreter.py (similarly for children, to_funsor, allclose)
    • other changes are cleanup and reordering

    Tests pass with all backends

    • [x] test_affine
    • [x] test_alpha_conversion
    • [x] test_cnf
    • [x] test_delta
    • [x] test_gaussian
    • [x] test_import
    • [x] test_integrate
    • [x] test_joint
    • [x] test_sum_product
    • [x] test_tensor
    • [x] test_terms

    Tests that only work with torch/pyro

    • test/pyro/
    • test/test_minipyro.py
    • test/test_adjoint.py: requires einsum.adjoint.require_backward
    • test/test_distributions.py: refactoring
    • test/test_einsum.py
    • test/test_memoize.py: requires some einsum stuff
    • test/test_optimizer.py: require distributions and einsum

    TODO:

    • [x] separate numpy and jax backend.
    • [x] revise DICE logic
    awaiting review 
    opened by fehiepsi 12
  • Add a single normal form for sum-product contractions

    Resolves #156

    As outlined in #156 I'm using this as a staging branch for the refactoring PRs associated with the proposed changes there:

    • [x] #157 Original PR (diff)
    • [x] #173 Refactor Affine into Contraction (https://github.com/pyro-ppl/funsor/pull/157/commits/e20978978283998341f5ed75a755cebc76fa2bab)
    • [x] #165 Refactor optimizer to use Contraction and normalize (https://github.com/pyro-ppl/funsor/pull/157/commits/a4b2ec29b10f6649d86cf9fee24d56e574b279c8)
    • [x] #169 Refactor Joint into multivariate Delta and Contraction (https://github.com/pyro-ppl/funsor/pull/157/commits/d2c79b0c8cdab99b370ca160de53dc8d62469dd9)
    • [x] Separate unfolding from normalization (https://github.com/pyro-ppl/funsor/pull/157/commits/ffdd38176567242fde59ab1f070488ab98cce72e)
    • [x] Troubleshoot performance issues (mostly caused by repeated unfolding)
    • [x] Use #212 for pattern matching
    • [x] Remove obsolete patterns (diff)
    • [x] Update MultiDelta to keep separate log_densitys for each point (https://github.com/pyro-ppl/funsor/pull/157/commits/30cbcb1c13806991cdeb70e5618799b60cf844b9)
    • [x] Fix regressions in adjoint and einsum
    • [x] Add missing patterns for minipyro
    • [x] Update bound inference tests for Number

    Original description: This PR takes a first step towards the goals in #156 by adding two things:

    1. A Contraction term representing finitary sum-product operations that takes advantage of multipledispatch.Variadic for pattern matching. This is also necessary for #72.
    2. A normalize interpretation that uses associative and distributive properties to put linear expressions into something like DNF, grouping operations into single Contraction terms where possible.

    As a demonstration, I have included patterns and tests illustrating how Joint is equivalent to interleaving eager and normalize.

    Tested:

    • I copied smoke tests and test cases verbatim from test_joint.py, test_gaussian.py, test_affine.py and test_einsum.py
    awaiting review refactor 
    opened by eb8680 10
  • DiCE factors from importance sampling?

    In Tensor.unscaled_sample, a DiCE factor is added to the sample. In Gaussian.unscaled_sample, this isn't necessary because of the parameterization of the delta function.

    So far, including DiCE factors with the samples doesn't sit quite right conceptually. Could it make sense to instead approach this from importance sampling? The default interpretation of adjoint(p) @ f for continuous p might be to use importance sampling with some default proposal q. To maintain differentiability, the importance weights would be p / q.detach(). If samples can be drawn from p, then p can be used as the proposal, and the importance weights would be p / p.detach(), which in this context are the DiCE factors.
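
    For concreteness, a minimal PyTorch sketch of the identity behind p / p.detach(): the factor evaluates to 1 in the forward pass but carries the score-function gradient.

    import torch

    logp = torch.tensor(-1.3, requires_grad=True)  # log-density of a drawn sample

    # DiCE-style factor: exp(logp - logp.detach()) == 1.0 in the forward
    # pass, but its derivative w.r.t. logp is exp(0) == 1, so the score
    # term survives in the backward pass.
    dice = (logp - logp.detach()).exp()

    f = torch.tensor(2.0)  # some downstream estimand
    (dice * f).backward()
    print(dice.item(), logp.grad.item())  # 1.0, 2.0 (= f * d logp/d logp)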

    Am I missing some benefits of the DiCE formalism?

    discussion 
    opened by mattwescott 10
  • Generalize partial sum product to Markov models

    This attempts to generalize partial_sum_product to support elimination of markov variables following discussion with @eb8680 and @fritzo at pyro-ppl/pyro#2687.

    TODO

    • [x] modified_partial_sum_product that eliminates first-order markov variables using a parallel-scan algorithm (toy sketch after this list).
    • [x] Add more tests
    • [ ] Rename modified_partial_sum_product or replace partial_sum_product by it?
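
    To make the parallel-scan idea concrete, here is a toy numpy sketch (not the actual modified_partial_sum_product implementation): eliminating a first-order markov variable amounts to contracting a chain of (T, S, S) log factors, and pairwise contraction needs only O(log T) sequential rounds because log-space matmul is associative.

    import numpy as np
    from scipy.special import logsumexp

    def log_matmul(a, b):
        # matmul in log space: contract the shared state dimension.
        return logsumexp(a[..., :, :, None] + b[..., None, :, :], axis=-2)

    def eliminate_markov(factors):
        # factors: (T, S, S) log factors log p(x_t | x_{t-1}).
        while len(factors) > 1:
            tail = None
            if len(factors) % 2:  # set aside an odd tail factor
                tail, factors = factors[-1:], factors[:-1]
            factors = log_matmul(factors[0::2], factors[1::2])
            if tail is not None:
                factors = np.concatenate([factors, tail], 0)
        return factors[0]  # total log transition over the whole chain

    T, S = 8, 3
    f = np.log(np.random.dirichlet(np.ones(S), size=(T, S)))  # (T, S, S)
    seq = f[0]
    for t in range(1, T):  # sequential reference computation
        seq = log_matmul(seq, f[t])
    assert np.allclose(eliminate_markov(f), seq)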

    Tests

    • [x] _expected_modified_partial_sum_product unrolls one markov dim at a time
    • [x] test_modified_partial_sum_product_0
    • [x] test_modified_partial_sum_product_1
    • [x] test_modified_partial_sum_product_2
    • [x] test_modified_partial_sum_product_3
    • [x] test_modified_partial_sum_product_4
    • [x] test_modified_partial_sum_product_5
    • [x] test_modified_partial_sum_product_6
    • [x] test_modified_partial_sum_product_7: xfail x and y markov vars with different ordinals
    • [x] test_modified_partial_sum_product_8
    • [x] test_modified_partial_sum_product_9
    • [x] test_modified_partial_sum_product_10
    • [x] test_modified_partial_sum_product_11
    • [x] test_modified_partial_sum_product_12: xfail w_curr in time plate but lower ordinal than y and x markov vars
    • [x] test_modified_partial_sum_product_13
    • [x] test_modified_partial_sum_product_14
    • [x] test_modified_partial_sum_product_15: xfail y depends on two time dims time and tones
    • [x] test_modified_partial_sum_product_16: x_prev->y_curr and y_prev->x_curr

    Note

    • Markov sites cannot be inside local plates (see tests 2 and 10)

    Update

    • Added unrolling of markov vars in local plates to handle tests 2 and 10

    Useful links

    • pyro-ppl/funsor#167
    • pyro-ppl/funsor#204
    • pyro-ppl/funsor#293
    • pyro-ppl/numpyro#686
    • pyro-ppl/funsor#399
    enhancement WIP 
    opened by ordabayevy 9
  • Make funsor.distributions backend agnostic

    Blocked by https://github.com/pyro-ppl/pyro-api/pull/19

    This is similar to #326 but doesn't include the funsor.numpyro stuff. I have resolved the has_rsample issue and made the whole test_distributions suite backend agnostic.

    awaiting review 
    opened by fehiepsi 9
  • refactor Gaussian to use Tensor api instead of torch.Tensor

    This PR follows Approach 2 in #207 to make Gaussian backend-agnostic. My plan is to

    • check whether info_vec is a torch.Tensor, and if so set self.backend = TorchTensor.
    • make backend-specific ops such as triangular_solve and cholesky static methods of their corresponding TorchTensor type. For example,
    x.triangular_solve(y)
    

    will be

    self.backend.triangular_solve(x, y)
    

    It seems that there is a lot of work to do, but I hope future refactoring will be easier after this. A rough sketch of the dispatch pattern this moves toward follows.
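
    A minimal standalone sketch of backend dispatch on array type (hypothetical code using functools.singledispatch, not funsor's actual ops machinery):

    from functools import singledispatch

    import numpy as np
    import torch

    @singledispatch
    def cholesky(x):
        raise NotImplementedError(f"no cholesky for {type(x)}")

    @cholesky.register(np.ndarray)
    def _(x):
        return np.linalg.cholesky(x)

    @cholesky.register(torch.Tensor)
    def _(x):
        return torch.linalg.cholesky(x)

    # Gaussian code can call cholesky(x) without caring about the backend.
    assert isinstance(cholesky(np.eye(3)), np.ndarray)
    assert isinstance(cholesky(torch.eye(3)), torch.Tensor)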

    Tasks

    • ~[] subclass Tensor to TorchTensor~ this is not necessary for now
    • [x] collect and implement all required ~static methods~ ops for TorchTensor
    • [x] use the above ~static methods~ ops in Gaussian
    • [x] do the same for other classes/functions in gaussian.py
    • [x] make sure all tests pass
    • [x] make align_tensors and materialize backend agnostic
    • [x] make numpy backend pass for gaussian smoke test

    Future PRs

    • deal with torch._C._get_tracing_state() statements
    • make sure that numpy backend will work
    awaiting review 
    opened by fehiepsi 9
  • `NotImplementedError` when using `apply_optimizer`

    The term below comes up in test_pyroapi_funsor:

    from funsor.cnf import Contraction
    from funsor.tensor import Tensor
    import torch
    import funsor.ops as ops
    from funsor import Bint, Real
    from funsor.terms import Unary, Binary, Variable, Number, lazy, to_data
    from funsor.constant import Constant
    from funsor.delta import Delta
    import funsor
    funsor.set_backend("torch")
    
    with lazy:
        term = Contraction(ops.add, ops.mul,
         frozenset({Variable('plate_inner__BOUND_78', Bint[2]), Variable('x__BOUND_77', Real), Variable('plate_outer__BOUND_79', Bint[3])}),  # noqa
         (Unary(ops.exp,
           Contraction(ops.null, ops.add,
            frozenset(),
            (Delta(
              (('x__BOUND_77',
                (Tensor(
                  torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
                  (('plate_outer__BOUND_79',
                    Bint[3],),),
                  'real'),
                 Number(0.0),),),)),
             Constant(
              (('plate_inner__BOUND_78',
                Bint[2],),),
              Tensor(
               torch.tensor([0.0, 0.0, 0.0], dtype=torch.float64),
               (('plate_outer__BOUND_79',
                 Bint[3],),),
               'real')),))),
          Unary(ops.all,
           Binary(ops.eq,
            Tensor(
             torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
             (('plate_outer__BOUND_79',
               Bint[3],),),
             'real'),
            Tensor(
             torch.tensor([-1.5227684840130555, 0.38168390241818895, -1.0276085758313882], dtype=torch.float64),  # noqa
             (('plate_outer__BOUND_79',
               Bint[3],),),
             'real'))),
          Tensor(
           torch.tensor([[-3.304130052277938, -0.9255234395261538, -1.5122103473560844], [-3.217490519312117, -1.2663745664889694, -1.2900109994655682]], dtype=torch.float64),  # noqa
           (('plate_inner__BOUND_78',
             Bint[2],),
            ('plate_outer__BOUND_79',
             Bint[3],),),
           'real'),))
    
    x = to_data(funsor.optimizer.apply_optimizer(term))
    

    Error message:

    Traceback (most recent call last):
      File "hehe.py", line 54, in <module>
        x = to_data(funsor.optimizer.apply_optimizer(term))
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 169, in apply_optimizer
        return interpreter.reinterpret(expr)
      File "/home/ordabayev/repos/funsor/funsor/interpreter.py", line 255, in reinterpret
        return recursion_reinterpret(x)
      File "/home/ordabayev/repos/funsor/funsor/interpreter.py", line 236, in recursion_reinterpret
        return _STACK[-1].interpret(type(x), *map(recursion_reinterpret, children(x)))
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196, in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155, in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 87, in optimize_contraction_variadic
        return optimize.interpret(Contraction, r, b, v, tuple(ts))
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196, in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155, in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/optimizer.py", line 145, in optimize_contract_finitary_funsor
        path_end = Contraction(
      File "/home/ordabayev/repos/funsor/funsor/terms.py", line 210, in __call__
        return interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 196, in interpret
        result = s.interpret(cls, *args)
      File "/home/ordabayev/repos/funsor/funsor/interpretations.py", line 155, in interpret
        return self.dispatch(cls, *args)(*args)
      File "/home/ordabayev/repos/funsor/funsor/cnf.py", line 323, in eager_contraction_tensor
        raise NotImplementedError("TODO")
    NotImplementedError: TODO
    
    enhancement 
    opened by ordabayevy 8
  • Fix nan gradients in TraceEnum_ELBO when evaluated eagerly

    Fixes nan gradients in TraceEnum_ELBO when evaluated eagerly (mentioned in https://github.com/pyro-ppl/funsor/issues/493).

    The problem seems to be that reflect.interpret in line 62 leads to substitutions that have adjoint as a base_interpretation, which in turn leads to some extra expressions being added to the tape. Here I fix it by changing interpretation to _old_interpretation.

    This also solves test_adjoint.py::test_sequential_sum_product_adjoint: xfail_param(MarkovProduct, reason="mysteriously doubles adjoint values?") (blocked by #545).

    awaiting review 
    opened by ordabayevy 7
  • Avoid creating distributions when inferring value domain

    Addresses #412; pair-coded with @eb8680.

    This refactors logic in Distribution._infer_value_domain() to avoid creating a temporary dummy distribution if backend distributions implement a .infer_shapes() method.

    awaiting review refactor 
    opened by fritzo 7
  • Switch to sqrt(precision) representation in Gaussian

    Resolves #567. Adapts @fehiepsi's https://github.com/pyro-ppl/pyro/pull/2019

    This switches the internal Gaussian representation to a numerically stable and space-efficient representation:

    - Gaussian(info_vec, precision, inputs)
    + Gaussian(white_vec, prec_sqrt, inputs)
    

    In the new parametrization, Gaussians represent the log-density function

    Gaussian(white_vec, prec_sqrt, OrderedDict(x=Reals[n]))
      = -1/2 || x @ prec_sqrt - white_vec ||^2
    

    These two parameters are shaped to efficiently represent low-rank data:

    assert white_vec.shape == batch_shape + (rank,)
    assert prec_sqrt.shape == batch_shape + (dim, rank)
    

    reducing space complexity from O(dim(dim+1)) to O(rank(dim+1)). In my real-world example rank=1, dim=2369, and batch_shape=(1343,), so the space reduction is 30GB → 13MB.

    Computation is cheap in this representation: addition amounts to concatenation, and plate reduction amounts to transpose and reshape (see the check below). Some ops are supported only on full-rank Gaussians, and I've added checks based on the new property .is_full_rank. This partial support is ok because the Gaussian funsors that arise in Bayesian models are all full rank due to priors (notwithstanding numerical loss of rank).
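
    A small numpy check of both claims (helper names are mine; shapes follow this PR): the new parametrization equals the old info_vec/precision form up to a constant, and adding two Gaussians is concatenation along the rank dimension.

    import numpy as np

    rng = np.random.default_rng(0)
    dim, rank = 4, 2
    prec_sqrt = rng.normal(size=(dim, rank))
    white_vec = rng.normal(size=(rank,))

    def log_density(x, prec_sqrt, white_vec):
        # Gaussian(white_vec, prec_sqrt): -1/2 || x @ prec_sqrt - white_vec ||^2
        return -0.5 * np.sum((x @ prec_sqrt - white_vec) ** 2)

    x = rng.normal(size=(dim,))

    # Old-style parameters recovered from the new ones:
    precision = prec_sqrt @ prec_sqrt.T   # (dim, dim)
    info_vec = prec_sqrt @ white_vec      # (dim,)
    old = -0.5 * x @ precision @ x + info_vec @ x
    const = -0.5 * white_vec @ white_vec  # constant absorbed by the representation
    assert np.isclose(log_density(x, prec_sqrt, white_vec), old + const)

    # Addition of two Gaussians = concatenation along the rank dimension:
    ps2 = rng.normal(size=(dim, 3))
    wv2 = rng.normal(size=(3,))
    lhs = log_density(x, np.concatenate([prec_sqrt, ps2], -1),
                      np.concatenate([white_vec, wv2], -1))
    rhs = log_density(x, prec_sqrt, white_vec) + log_density(x, ps2, wv2)
    assert np.isclose(lhs, rhs)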

    Because the Gaussian funsor is internal, the interface change should not break most user code, since most user code uses to_funsor() and to_data() with backend-specific distributions. One broken piece of user code is Pyro's AutoGaussianFunsor, which will need an update (and which will be sped up).

    As suggested by @eb8680, I've added some optional kwarg parametrizations and properties to support conversion to other Gaussian representations, e.g. g = Gaussian(mean=..., covariance=..., inputs=...) and g._mean, g._covariance. This allows more Gaussian math to live in gaussian.py.

    Tested

    • [x] added new tests
    • [x] pass existing funsor tests (linear algebra)
    • [x] pass existing funsor tests (patterns)
    • [x] pass NumPyro tests (conjugacy)
    • [x] pass Pyro tests on a branch (https://github.com/pyro-ppl/pyro/pull/2943)
    • [x] check computational cost on the pyro-cov model (results: under 12GB memory)
    awaiting review refactor 
    opened by fritzo 6
  • Feature Request: Add Support for TransformedDistribution with non-trivial event dimensions

    https://github.com/pyro-ppl/funsor/blob/1d07af18c21894dd56e2f4f877c7845430c3b729/funsor/distribution.py#L531

    Opening a feature request as per this comment by @fehiepsi

    Also note that the same issue happens when using TruncatedNormal distribution, see this comment.

    enhancement 
    opened by pankajb64 0
  • FR rules for (Gaussian + UniformPolygon).reduce(logaddexp)

    It would be great to have reduction rules for uniform likelihoods with respect to Gaussian latent variables:

    • [ ] (MyContinuousDist + Uniform).reduce(logaddexp) implemented via .cdf()
    • [ ] (Gaussian + UniformPolygon).reduce(logaddexp) implemented via quadrature (MC or QMC)

    This could be used for Pyro and NumPyro probabilistic programs with constraint-like likelihoods, something like

    def model(data):
        z = pyro.sample("z", Normal(...))
        # Observe a constraint like lb < z < ub.
        pyro.sample("x", Uniform(...), obs=z)
    

    Here we can be lazy and compute the marginal likelihood via Normal(...).cdf([lb, ub]), as sketched below.
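
    A hand-computed stand-in for that lazy computation (torch.distributions, not the proposed funsor pattern):

    import torch
    from torch.distributions import Normal

    z_prior = Normal(0.0, 1.0)
    lb, ub = -0.5, 1.0

    # Marginal likelihood of the Uniform(lb, ub) "observation":
    #   integral of Normal(z) * Uniform(z; lb, ub) dz
    #     = (cdf(ub) - cdf(lb)) / (ub - lb)
    prob = z_prior.cdf(torch.tensor(ub)) - z_prior.cdf(torch.tensor(lb))
    log_marginal = prob.log() - torch.tensor(ub - lb).log()
    print(log_marginal)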

    enhancement 
    opened by fritzo 0
  • Delta integrate pattern

    (Separating this from #593 PR)

    This proposes the following logic for the Delta Integrate pattern:

    • If reduced_var is in integrand.inputs, then apply substitution to both delta and integrand:
    delta = Delta("x",  point, log_density)
    integrand = Variable("x")
    Integrate(delta, integrand, reduced_vars="x")
      => delta(x=point).exp() * integrand(x=point)
      => log_density.exp() * point
    

    where log_density can be a Dice factor or an importance weight in general.

    • If reduced_var is not in integrand.inputs, then just reduce delta:
    delta = Delta("x",  point, log_density)
    integrand = Number(3.0)
    Integrate(delta, integrand, reduced_vars="x")
      => delta.reduce(logaddexp, "x").exp() * integrand
      => 1 * 3.0
      => 3.0
    

    where delta is normalized.

    awaiting review 
    opened by ordabayevy 0
  • Provenance funsor

    Provenance tracking implementation in funsor. Here the provenance is the set of (name, point) tuples of RV samples that appear in the history of the computation of the tracked term. Substitution results in multiplication by the density:

    >>> logp_x = Provenance(logp_value, {("x", point)})
    >>> logp_x(x=point)
    logp_value
    

    The Provenance funsor is intended to be used in https://github.com/pyro-ppl/pyro/pull/2893 as a wrapper for ProvenanceTensor.

    enhancement awaiting review 
    opened by ordabayevy 1
  • FR Thompson sampling example

    I came across this cute example of nested semiring dynamic programming in the context of Bayesian optimization. Thompson sampling first performs Bayesian regression, fitting a posterior p(θ|Xs,ys) over parameters θ given a fully-supervised dataset of (X,y) pairs, then repeatedly samples parameters θ and optimizes expected reward E[y|X,θ] over X. That is, at each step

    X_optimal <- arg max_X int_y int_θ y p(y|θ,X)  p(θ) prod_i p(yi|θ,Xi)
    # new choice                       --reward-- prior ---likelihood---
    

    For example, in Pyroed, p(θ) prod_i p(yi|θ,Xi) is approximated variationally, and E[y|θ,X] is just a tensor contraction (although there it is optimized via annealed Gibbs sampling rather than via dynamic programming; as an aside, we could add an annealed Gibbs sampling interpretation to Funsor for sum-product contractions).

    What would be a good example here, where we might leverage Funsor's dynamic programming to implement both the integral and argmax operators?
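
    As a starting point, here is a toy discretized numpy version of the nested contraction (all grids and tables are made up for illustration): the inner sum-product integrates out θ and y, and the outer max-contraction picks X.

    import numpy as np

    rng = np.random.default_rng(0)
    n_theta, n_y, n_X = 20, 30, 5

    log_prior = np.full(n_theta, -np.log(n_theta))  # log p(θ), uniform grid
    log_lik = rng.normal(size=n_theta)              # stand-in for log prod_i p(yi|θ,Xi)
    log_py = rng.normal(size=(n_X, n_theta, n_y))   # toy table for log p(y|θ,X)
    log_py -= np.logaddexp.reduce(log_py, -1, keepdims=True)  # normalize over y
    y_vals = np.linspace(0.0, 1.0, n_y)             # reward is y itself

    # Inner sum-product: E[y | X] ∝ Σ_θ Σ_y  y · p(y|θ,X) · p(θ) · L(θ)
    post = np.exp(log_prior + log_lik)              # unnormalized p(θ | data)
    expected_y = np.einsum("xty,t,y->x", np.exp(log_py), post, y_vals)

    # Outer max-contraction over the design choice X.
    X_optimal = np.argmax(expected_y)
    print(X_optimal, expected_y)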

    examples 
    opened by fritzo 0
Releases(0.4.4)
  • 0.4.4(Dec 28, 2022)

  • 0.4.3(Mar 21, 2022)

    What's Changed

    • Update black by @fritzo in https://github.com/pyro-ppl/funsor/pull/583
    • Extend partial_sum_product() to cases tractable for Gaussians by @fritzo in https://github.com/pyro-ppl/funsor/pull/584
    • update classifiers by @anukaal in https://github.com/pyro-ppl/funsor/pull/586
    • Fix provenance tensor to work with PyTorch 1.11 by @fritzo in https://github.com/pyro-ppl/funsor/pull/588
    • Update to PyTorch 1.11.0 by @fritzo in https://github.com/pyro-ppl/funsor/pull/589
    • Bump to version 0.4.3 by @fritzo in https://github.com/pyro-ppl/funsor/pull/590

    New Contributors

    • @anukaal made their first contribution in https://github.com/pyro-ppl/funsor/pull/586

    Full Changelog: https://github.com/pyro-ppl/funsor/compare/0.4.2...0.4.3

  • 0.4.2(Dec 13, 2021)

    Enhancements

    • #543 Added ProvenanceTensor for torch backend
    • #543 Added Constant funsor
    • #549 Added funsor.recipes.forward_filter_backward_rsample()
    • #553 Added a Precondition interpretation for Gaussians, plus funsor.recipes.forward_filter_backward_precondition()
    • #572 Support Funsors in pprint.pprint() and pdb

    Breaking changes

    • #568 Switch to sqrt(precision) representation for Gaussian
  • 0.4.1(Jul 2, 2021)

  • 0.4.0(Jan 24, 2021)

    • Added new sum product algorithms (contributed by @ordabayevy)
    • Added many new conjugacy patterns
    • Automated op registration and made adding new ops easier
    • #419 Support ExtendedDistribution
    • #432 Support IndependentConstraint
    • #397 Allow .reduce(op, vars_not_in_inputs)
    • #430 Add Tuple Funsor
    • #409 Change sarkka_bilmes_product interface
  • 0.3.0(Oct 15, 2020)

    • #380 Drop support for Python 3.5 and PyTorch <=1.5
    • #352 New syntax for funsor.domains: bint(n) -> Bint[n], reals(*shape) -> Reals[shape]
    • #369 New StatefulInterpretation pattern to simplify context-dependent interpretations
  • 0.2.0(Jul 27, 2020)

    • Supports multiple backends: numpy (default), torch, and jax. To change the backend, use funsor.set_backend("torch") or funsor.set_backend("jax").
  • 0.1.2(Oct 24, 2019)

  • 0.1.1(Oct 24, 2019)

  • 0.1.0(Oct 23, 2019)

Owner
Pyro - Deep Universal Probabilistic Programming