spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines

Overview

spaCy-wrap: For Wrapping fine-tuned transformers in spaCy pipelines

PyPI version python version Code style: black github actions pytest github actions docs github coverage CodeFactor

spaCy-wrap is minimal library intended for wrapping fine-tuned transformers from the Huggingface model hub in your spaCy pipeline allowing inclusion of existing models within SpaCy workflows.

As for as possible it follows a similar API as spacy-transformers.

Installation

Installing spacy-wrap is simple using pip:

pip install spacy_wrap

There is no reason to update from GitHub as the version on PyPI should always be the same as on GitHub.

Example

The following shows a simple example of how you can quickly add a fine-tuned transformer model from the Huggingface model hub. In this example we will use the sentiment model by Barbieri et al. (2020) for classifying whether a tweet is positive, negative or neutral. We will add this model to a blank English pipeline:

import spacy
import spacy_wrap

nlp = spacy.blank("en")

config = {
    "doc_extension_trf_data": "clf_trf_data",  # document extention for the forward pass
    "doc_extension_prediction": "sentiment",  # document extention for the prediction
    "labels": ["negative", "neutral", "positive"],
    "model": {
        "name": "cardiffnlp/twitter-roberta-base-sentiment",  # the model name or path of huggingface model
    },
}

transformer = nlp.add_pipe("classification_transformer", config=config)
transformer.model.initialize()

doc = nlp("spaCy is a wonderful tool")

print(doc._.clf_trf_data)
# TransformerData(wordpieces=...
print(doc._.sentiment)
# 'positive'
print(doc._.sentiment_prob)
#{'prob': array([0.004, 0.028, 0.969], dtype=float32), 'labels': ['negative', 'neutral', 'positive']}

These pipelines can also easily be applied to multiple documents using the nlp.pipe as one would expect from a spaCy component:

docs = nlp.pipe(
    [
        "I hate wrapping my own models",
        "Isn't there a tool for this?",
        "spacy-wrap is great for wrapping models",
    ]
)

for doc in docs:
    print(doc._.sentiment)
# 'negative'
# 'neutral'
# 'positive'

More Examples

It is always nice to have more than one example. Here is another one where we add the Hate speech model for Danish to a blank Danish pipeline:

import spacy
import spacy_wrap

nlp = spacy.blank("da")

config = {
    "doc_extension_trf_data": "clf_trf_data",  # document extention for the forward pass
    "doc_extension_prediction": "hate_speech",  # document extention for the prediction
    "labels": ["Not hate Speech", "Hate speech"],
    "model": {
        "name": "DaNLP/da-bert-hatespeech-detection",  # the model name or path of huggingface model
    },
}

transformer = nlp.add_pipe("classification_transformer", config=config)
transformer.model.initialize()

doc = nlp("Senile gamle idiot") # old senile idiot

doc._.clf_trf_data
# TransformerData(wordpieces=...
doc._.hate_speech
# "Hate speech"
doc._.hate_speech_prob
# {'prob': array([0.013, 0.987], dtype=float32), 'labels': ['Not hate Speech', 'Hate speech']}

📖 Documentation

Documentation
🔧 Installation Installation instructions for spacy-wrap.
📰 News and changelog New additions, changes and version history.
🎛 Documentation The reference for spacy-wrap's API.

💬 Where to ask questions

Type
🚨 FAQ FAQ
🚨 Bug Reports GitHub Issue Tracker
🎁 Feature Requests & Ideas GitHub Issue Tracker
👩‍💻 Usage Questions GitHub Discussions
🗯 General Discussion GitHub Discussions
Comments
  • Custom model for NER

    Custom model for NER

    Hello thanks you for setting up this ..

    The examples are amazing.

    Is it possible to use this wrapper with a Named Entity Recognition model?

    If that is the case, is it possible to add an example with a NER model from hugging face?

    Following the example, I am trying to add this but it is not working, I do not why may be I should go back and learn how spacy works.

    import spacy
    import spacy_wrap
    
    nlp = spacy.blank("fr")
    
    config = {
        "model": {
            "@architectures": "spacy-transformers.TransformerModel.v3",
            "name": "Jean-Baptiste/camembert-ner-with-dates",
            "tokenizer_config" : {"use_fast": False},
            "get_spans":  {"@span_getters": "spacy-transformers.doc_spans.v1"}
        }
    }
    
    transformer = nlp.add_pipe("ner", config=config)
    
    enhancement good first issue 
    opened by espoirMur 6
  • :arrow_up: Bump sphinx-notes/pages from 2 to 3

    :arrow_up: Bump sphinx-notes/pages from 2 to 3

    Bumps sphinx-notes/pages from 2 to 3.

    Release notes

    Sourced from sphinx-notes/pages's releases.

    3.0beta

    2.1

    No release notes provided.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies github_actions 
    opened by dependabot[bot] 3
  • Updated in accordance with spacy-transformers update

    Updated in accordance with spacy-transformers update

    Update following this PR allowing for alternate model loaders. This should make the code-base easier to maintain and removes existing monkey patch from the code.

    opened by KennethEnevoldsen 3
  • Poetry dependency resolution fails. PyPi 1.0.0 version requires spaCy < 3.3.0

    Poetry dependency resolution fails. PyPi 1.0.0 version requires spaCy < 3.3.0

    How to reproduce the behaviour

    pyproject.toml contains

    spacy = "^3.3.0"
    

    Add spacy-wrap with

    poetry add spacy_wrap
    

    Info about spaCy

    • spaCy version: 3.3.0
    • Platform: Windows-10-10.0.19043-SP0
    • Python version: 3.9.10
    • Pipelines: en_core_web_sm (3.3.0), en_core_web_trf (3.3.0)

    Issue

    Dependency resolution fails with poetry

    SolverProblemError
    
      Because spacy-wrap (1.0.0) depends on spacy (>=3.2.1,<3.3.0)
       and no versions of spacy-wrap match >1.0.0,<2.0.0, spacy-wrap (>=1.0.0,<2.0.0) requires spacy (>=3.2.1,<3.3.0).
      So, because spacytest depends on both spacy (^3.3.0) and spacy-wrap (^1.0.0), version solving failed.
    

    setup,cfg still has

    install_requires = 
    	spacy_transformers>=1.1.4,<1.2.0
    	spacy>=3.2.1,<3.3.0
    	thinc>=8.0.13,<8.1.0
    
    bug 
    opened by Dris101 3
  • :arrow_up: Bump ruff from 0.0.191 to 0.0.211

    :arrow_up: Bump ruff from 0.0.191 to 0.0.211

    Bumps ruff from 0.0.191 to 0.0.211.

    Release notes

    Sourced from ruff's releases.

    v0.0.211

    What's Changed

    Full Changelog: https://github.com/charliermarsh/ruff/compare/v0.0.210...v0.0.211

    v0.0.210

    What's Changed

    New Contributors

    Full Changelog: https://github.com/charliermarsh/ruff/compare/v0.0.209...v0.0.210

    v0.0.209

    What's Changed

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 2
  • :arrow_up: Bump MishaKav/pytest-coverage-comment from 1.1.25 to 1.1.26

    :arrow_up: Bump MishaKav/pytest-coverage-comment from 1.1.25 to 1.1.26

    Bumps MishaKav/pytest-coverage-comment from 1.1.25 to 1.1.26.

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies github_actions 
    opened by dependabot[bot] 2
  • :arrow_up: Update sphinxext-opengraph requirement from <0.7.4,>=0.7.3 to >=0.7.3,<0.7.6

    :arrow_up: Update sphinxext-opengraph requirement from <0.7.4,>=0.7.3 to >=0.7.3,<0.7.6

    Updates the requirements on sphinxext-opengraph to permit the latest version.

    Release notes

    Sourced from sphinxext-opengraph's releases.

    v0.7.5

    What's Changed

    Full Changelog: https://github.com/wpilibsuite/sphinxext-opengraph/compare/v0.7.4...v0.7.5

    Commits

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • :arrow_up: Update sphinx requirement from <5.4.0,>=5.3.0 to >=5.3.0,<6.2.0

    :arrow_up: Update sphinx requirement from <5.4.0,>=5.3.0 to >=5.3.0,<6.2.0

    Updates the requirements on sphinx to permit the latest version.

    Release notes

    Sourced from sphinx's releases.

    v6.1.0

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    Changelog

    Sourced from sphinx's changelog.

    Release 6.1.0 (released Jan 05, 2023)

    Dependencies

    Incompatible changes

    • #10979: gettext: Removed support for pluralisation in get_translation. This was unused and complicated other changes to sphinx.locale.

    Deprecated

    • sphinx.util functions:

      • Renamed sphinx.util.typing.stringify() to sphinx.util.typing.stringify_annotation()
      • Moved sphinx.util.xmlname_checker() to sphinx.builders.epub3._XML_NAME_PATTERN

      Moved to sphinx.util.display:

      • sphinx.util.status_iterator
      • sphinx.util.display_chunk
      • sphinx.util.SkipProgressMessage
      • sphinx.util.progress_message

      Moved to sphinx.util.http_date:

      • sphinx.util.epoch_to_rfc1123
      • sphinx.util.rfc1123_to_epoch

      Moved to sphinx.util.exceptions:

      • sphinx.util.save_traceback
      • sphinx.util.format_exception_cut_frames

    Features added

    • Cache doctrees in the build environment during the writing phase.
    • Make all writing phase tasks support parallel execution.
    • #11072: Use PEP 604 (X | Y) display conventions for typing.Optional and typing.Optional types within the Python domain and autodoc.

    ... (truncated)

    Commits

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • :arrow_up: Update pytest-cov requirement from <3.0.1,>=3.0.0 to >=3.0.0,<4.0.1

    :arrow_up: Update pytest-cov requirement from <3.0.1,>=3.0.0 to >=3.0.0,<4.0.1

    Updates the requirements on pytest-cov to permit the latest version.

    Changelog

    Sourced from pytest-cov's changelog.

    4.0.0 (2022-09-28)

    Note that this release drops support for multiprocessing.

    • --cov-fail-under no longer causes pytest --collect-only to fail Contributed by Zac Hatfield-Dodds in [#511](https://github.com/pytest-dev/pytest-cov/issues/511) <https://github.com/pytest-dev/pytest-cov/pull/511>_.

    • Dropped support for multiprocessing (mostly because issue 82408 <https://github.com/python/cpython/issues/82408>_). This feature was mostly working but very broken in certain scenarios and made the test suite very flaky and slow.

      There is builtin multiprocessing support in coverage and you can migrate to that. All you need is this in your .coveragerc::

      [run] concurrency = multiprocessing parallel = true sigterm = true

    • Fixed deprecation in setup.py by trying to import setuptools before distutils. Contributed by Ben Greiner in [#545](https://github.com/pytest-dev/pytest-cov/issues/545) <https://github.com/pytest-dev/pytest-cov/pull/545>_.

    • Removed undesirable new lines that were displayed while reporting was disabled. Contributed by Delgan in [#540](https://github.com/pytest-dev/pytest-cov/issues/540) <https://github.com/pytest-dev/pytest-cov/pull/540>_.

    • Documentation fixes. Contributed by Andre Brisco in [#543](https://github.com/pytest-dev/pytest-cov/issues/543) <https://github.com/pytest-dev/pytest-cov/pull/543>_ and Colin O'Dell in [#525](https://github.com/pytest-dev/pytest-cov/issues/525) <https://github.com/pytest-dev/pytest-cov/pull/525>_.

    • Added support for LCOV output format via --cov-report=lcov. Only works with coverage 6.3+. Contributed by Christian Fetzer in [#536](https://github.com/pytest-dev/pytest-cov/issues/536) <https://github.com/pytest-dev/pytest-cov/issues/536>_.

    • Modernized pytest hook implementation. Contributed by Bruno Oliveira in [#549](https://github.com/pytest-dev/pytest-cov/issues/549) <https://github.com/pytest-dev/pytest-cov/pull/549>_ and Ronny Pfannschmidt in [#550](https://github.com/pytest-dev/pytest-cov/issues/550) <https://github.com/pytest-dev/pytest-cov/pull/550>_.

    3.0.0 (2021-10-04)

    Note that this release drops support for Python 2.7 and Python 3.5.

    • Added support for Python 3.10 and updated various test dependencies. Contributed by Hugo van Kemenade in [#500](https://github.com/pytest-dev/pytest-cov/issues/500) <https://github.com/pytest-dev/pytest-cov/pull/500>_.
    • Switched from Travis CI to GitHub Actions. Contributed by Hugo van Kemenade in [#494](https://github.com/pytest-dev/pytest-cov/issues/494) <https://github.com/pytest-dev/pytest-cov/pull/494>_ and [#495](https://github.com/pytest-dev/pytest-cov/issues/495) <https://github.com/pytest-dev/pytest-cov/pull/495>_.
    • Add a --cov-reset CLI option. Contributed by Danilo Šegan in [#459](https://github.com/pytest-dev/pytest-cov/issues/459) <https://github.com/pytest-dev/pytest-cov/pull/459>_.
    • Improved validation of --cov-fail-under CLI option. Contributed by ... Ronny Pfannschmidt's desire for skark in [#480](https://github.com/pytest-dev/pytest-cov/issues/480) <https://github.com/pytest-dev/pytest-cov/pull/480>_.
    • Dropped Python 2.7 support.

    ... (truncated)

    Commits
    • 28db055 Bump version: 3.0.0 → 4.0.0
    • 57e9354 Really update the changelog.
    • 56b810b Update chagelog.
    • f7fced5 Add support for LCOV output
    • 1211d31 Fix flake8 error
    • b077753 Use modern approach to specify hook options
    • 00713b3 removed incorrect docs on data_file.
    • b3dda36 Improve workflow with a collecting status check. (#548)
    • 218419f Prevent undesirable new lines to be displayed when report is disabled
    • 60b73ec migrate build command from distutils to setuptools
    • Additional commits viewable in compare view

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • :arrow_up: Update sphinx requirement from <5.0.0,>=4.5.0 to >=4.5.0,<7.0.0

    :arrow_up: Update sphinx requirement from <5.0.0,>=4.5.0 to >=4.5.0,<7.0.0

    Updates the requirements on sphinx to permit the latest version.

    Release notes

    Sourced from sphinx's releases.

    v6.0.0

    Changelog: https://www.sphinx-doc.org/en/master/changes.html

    Changelog

    Sourced from sphinx's changelog.

    Release 6.0.0 (released Dec 29, 2022)

    Dependencies

    • #10468: Drop Python 3.6 support
    • #10470: Drop Python 3.7, Docutils 0.14, Docutils 0.15, Docutils 0.16, and Docutils 0.17 support. Patch by Adam Turner

    Incompatible changes

    • #7405: Removed the jQuery and underscore.js JavaScript frameworks.

      These frameworks are no longer be automatically injected into themes from Sphinx 6.0. If you develop a theme or extension that uses the jQuery, $, or $u global objects, you need to update your JavaScript to modern standards, or use the mitigation below.

      The first option is to use the sphinxcontrib.jquery_ extension, which has been developed by the Sphinx team and contributors. To use this, add sphinxcontrib.jquery to the extensions list in conf.py, or call app.setup_extension("sphinxcontrib.jquery") if you develop a Sphinx theme or extension.

      The second option is to manually ensure that the frameworks are present. To re-add jQuery and underscore.js, you will need to copy jquery.js and underscore.js from the Sphinx repository_ to your static directory, and add the following to your layout.html:

      .. code-block:: html+jinja

      {%- block scripts %} {{ super() }} {%- endblock %}

      .. _sphinxcontrib.jquery: https://github.com/sphinx-contrib/jquery/

      Patch by Adam Turner.

    • #10471, #10565: Removed deprecated APIs scheduled for removal in Sphinx 6.0. See :ref:dev-deprecated-apis for details. Patch by Adam Turner.

    • #10901: C Domain: Remove support for parsing pre-v3 style type directives and roles. Also remove associated configuration variables c_allow_pre_v3 and c_warn_on_allowed_pre_v3. Patch by Adam Turner.

    Features added

    ... (truncated)

    Commits

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies python 
    opened by dependabot[bot] 1
  • :arrow_up: Bump furo from 2022.9.29 to 2022.12.7

    :arrow_up: Bump furo from 2022.9.29 to 2022.12.7

    Bumps furo from 2022.9.29 to 2022.12.7.

    Changelog

    Sourced from furo's changelog.

    Changelog

    2022.12.07 -- Reverent Raspberry

    • ✨ Add support for Sphinx 6.
    • ✨ Improve footnote presentation with docutils 0.18+.
    • Drop support for Sphinx 4.
    • Improve documentation about what the edit button does.
    • Improve handling of empty-flexboxes for better print experience on Chrome.
    • Improve styling for inline signatures.
    • Replace the meta generator tag with a comment.
    • Tweak labels with icons to prevent users selecting icons as text on touch.

    2022.09.29 -- Quaint Quartz

    • Add ability to set arbitrary URLs for edit button.
    • Add support for aligning text in MyST-parser generated tables.

    2022.09.15 -- Pragmatic Pistachio

    • Add a minimum version constraint on pygments.
    • Add an explicit dependency on sass.
    • Change right sidebar title from "Contents" to "On this page".
    • Correctly position sidebars on small screens.
    • Correctly select only Furo's own svg in related pages nav.
    • Make numpy-style documentation headers consistent.
    • Retitle the reference section.
    • Update npm dependencies.

    2022.06.21 -- Opulent Opal

    • Fix docutils <= 0.17.x compatibility.
    • Bump to the latest Node.js LTS.

    2022.06.04.1 -- Naughty Nickel bugfix

    • Fix the URL used in the "Edit this page" for Read the Docs builds.

    2022.06.04 -- Naughty Nickel

    ... (truncated)

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies python 
    opened by dependabot[bot] 1
  • IndexError when emojis in input

    IndexError when emojis in input

    How to reproduce the behaviour

    import spacy
    import spacy_wrap
    nlp = spacy.blank("en")
    
    # specify model from the hub
    config = {"model": {"name": "dslim/bert-base-NER"}}
    # add it to the pipe
    nlp.add_pipe("token_classification_transformer", config=config)
    
    doc = nlp("My name is Wolfgang 🚀 and I live in Berlin.")
    

    Your Environment

    • spacy-wrap Version Used: spacy_wrap-1.2.0-py2.py3-none-any.whl
    • Operating System: MacOS, Apple M1, Ventura 13.0.1
    • Python Version Used: 3.10.6
    • spaCy Version Used: 3.4.3
    • Environment Information: poetry
    • Torch version: 1.13.0-cp310 torch = {url = "https://files.pythonhosted.org/packages/79/b3/eaea3fc35d0466b9dae1e3f9db08467939347b3aaa53c0fd81953032db33/torch-1.13.0-cp310-none-macosx_11_0_arm64.whl"}

    (Had to set torch manually because spacy-wrap fails to install on Mac with the default torch version. Specifically, the dependency nvidia-cublas-cu11 returns a RuntimeError with "Unable to find installation candidates for nvidia-cublas-cu11 (11.10.3.66)". No distributions available for Mac, see https://pypi.org/project/nvidia-cublas-cu11/11.10.3.66/#files )

    Above example yields an IndexError in TokenClassificationTransformer.convert_to_token_predictions(data, aggregation_strategy, labels)

          305 logits = data.model_output.logits[0]
          306 for align in data.align:
          307     # aggregate the logits for each token
    --> 308     agg_token_logits = agg(logits[align.data[:, 0]])
          309     token_probabilities_ = {
          310         "prob": softmax(agg_token_logits).round(decimals=3),
          311         "label": labels,
          312     }
          313     token_probabilities.append(token_probabilities_)
    
    IndexError: index 0 is out of bounds for axis 1 with size 0
    

    Issue is the same with other models. Tried saattrupdan/nbailab-base-ner-scandi.

    Tests with transformers pipeline works with same models.

    from transformers import pipeline
    pipe = pipeline(model="dslim/bert-base-NER")
    res = pipe("My name is Wolfgang 🚀 and I live in Berlin.")
    [e["word"] for e in res]
    # ['Wolfgang', '🚀', 'Berlin']
    
    bug 
    opened by nthomsencph 6
Releases(v1.3.0)
Owner
Kenneth Enevoldsen
Interdisciplinary PhD Student on representation learning in Clinical NLP and Genetics at Aarhus University and Interacting Minds Centre
Kenneth Enevoldsen
Segmenter - Transformer for Semantic Segmentation

Segmenter - Transformer for Semantic Segmentation

592 Dec 27, 2022
A CSRankings-like index for speech researchers

Speech Rankings This project mimics CSRankings to generate an ordered list of researchers in speech/spoken language processing along with their possib

Mutian He 19 Nov 26, 2022
PyTorch implementation of convolutional neural networks-based text-to-speech synthesis models

Deepvoice3_pytorch PyTorch implementation of convolutional networks-based text-to-speech synthesis models: arXiv:1710.07654: Deep Voice 3: Scaling Tex

Ryuichi Yamamoto 1.8k Dec 30, 2022
In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a model using HugginFace transformers framework.

Transformers are all you need In this workshop we will be exploring NLP state of the art transformers, with SOTA models like T5 and BERT, then build a

Aymen Berriche 8 Apr 13, 2022
NumPy String-Indexed is a NumPy extension that allows arrays to be indexed using descriptive string labels

NumPy String-Indexed NumPy String-Indexed is a NumPy extension that allows arrays to be indexed using descriptive string labels, rather than conventio

Aitan Grossman 1 Jan 08, 2022
This script just scrapes the most recent Nepali news from Kathmandu Post and notifies the user about current events at regular intervals.It sends out the most recent news at random!

Nepali-news-notifier This script just scrapes the most recent Nepali news from Kathmandu Post and notifies the user about current events at regular in

Sachit Yadav 1 Feb 11, 2022
Addon for adding subtitle files to blender VSE as Text sequences. Using pysub2 python module.

Import Subtitles for Blender VSE Addon for adding subtitle files to blender VSE as Text sequences. Using pysub2 python module. Supported formats by py

4 Feb 27, 2022
The Classical Language Toolkit

Notice: This Git branch (dev) contains the CLTK's upcoming major release (v. 1.0.0). See https://github.com/cltk/cltk/tree/master and https://docs.clt

Classical Language Toolkit 754 Jan 09, 2023
This github repo is for Neurips 2021 paper, NORESQA A Framework for Speech Quality Assessment using Non-Matching References.

NORESQA: Speech Quality Assessment using Non-Matching References This is a Pytorch implementation for using NORESQA. It contains minimal code to predi

Meta Research 36 Dec 08, 2022
Implementation of legal QA system based on SentenceKoBART

LegalQA using SentenceKoBART Implementation of legal QA system based on SentenceKoBART How to train SentenceKoBART Based on Neural Search Engine Jina

Heewon Jeon(gogamza) 75 Dec 27, 2022
L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources.

L3Cube-MahaCorpus L3Cube-MahaCorpus a Marathi monolingual data set scraped from different internet sources. We expand the existing Marathi monolingual

21 Dec 17, 2022
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

MTFAA-Net Unofficial PyTorch implementation of Baidu's MTFAA-Net: "Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speec

Shimin Zhang 87 Dec 19, 2022
📔️ Generate a text-based journal from a template file.

JGen 📔️ Generate a text-based journal from a template file. Contents Getting Started Example Overview Usage Details Reserved Keywords Gotchas Getting

Harrison Broadbent 21 Sep 25, 2022
Long text token classification using LongFormer

Long text token classification using LongFormer

abhishek thakur 161 Aug 07, 2022
PRAnCER is a web platform that enables the rapid annotation of medical terms within clinical notes.

PRAnCER (Platform enabling Rapid Annotation for Clinical Entity Recognition) is a web platform that enables the rapid annotation of medical terms within clinical notes. A user can highlight spans of

Sontag Lab 39 Nov 14, 2022
Speach Recognitions

easy_meeting Добро пожаловать в интерфейс сервиса автопротоколирования совещаний Easy Meeting. Website - http://cf5c-62-192-251-83.ngrok.io/ Принципиа

Maksim 3 Feb 18, 2022
A Fast Command Analyser based on Dict and Pydantic

Alconna Alconna 隶属于ArcletProject, 在Cesloi内有内置 Alconna 是 Cesloi-CommandAnalysis 的高级版,支持解析消息链 一般情况下请当作简易的消息链解析器/命令解析器 文档 暂时的文档 Example from arclet.alcon

19 Jan 03, 2023
**NSFW** A chatbot based on GPT2-chitchat

DangBot -- 好怪哦,再来一句 卡群怪话bot,powered by GPT2 for Chinese chitchat Training Example: python train.py --lr 5e-2 --epochs 30 --max_len 300 --batch_size 8

Tommy Yang 11 Jul 21, 2022
Large-scale pretraining for dialogue

A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT) This repository contains the source code and trained model for a large-

Microsoft 1.8k Jan 07, 2023
Skipgram Negative Sampling in PyTorch

PyTorch SGNS Word2Vec's SkipGramNegativeSampling in Python. Yet another but quite general negative sampling loss implemented in PyTorch. It can be use

Jamie J. Seol 287 Dec 14, 2022