Lightweight library for accessing data and configuration

accsr

This lightweight library contains utilities for managing, loading, uploading, opening and generally wrangling data and configurations. It has been battle-tested in multiple projects at appliedAI.

Please open new issues for bugs, feature requests and extensions. See more details about the structure and workflow in the developer's readme.

Overview

Source code documentation and usage examples are here. We also provide notebooks with examples in TODO.

Installation

Install the latest release with

pip install accsr

To live on the edge, install the latest develop version with

pip install --pre accsr
Comments
  • Dryrun-based pull/push and tqdm

    Added tqdm progress bar to RemoteStorage.push/pull methods

    • First determines the total number of bytes to push/pull
    • Updates the number of transferred bytes after every file

    resolve #4 #5 #9

    opened by fariedabuzaid 4
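
The two steps listed in the pull request above (total the bytes first, then update after every file) can be sketched roughly as follows. This is not accsr's implementation: `transfer_with_progress`, `transfer_one` and `on_progress` are hypothetical names invented for illustration, and the file sizes are made-up inputs.

```python
from typing import Callable, Dict


def transfer_with_progress(
    sizes: Dict[str, int],
    transfer_one: Callable[[str], None],
    on_progress: Callable[[int, int], None],
) -> int:
    """Transfer every file and report cumulative bytes after each one."""
    # Step 1: the total is known before anything is transferred.
    total = sum(sizes.values())
    done = 0
    for name, size in sizes.items():
        transfer_one(name)  # hypothetical per-file push or pull
        done += size
        # Step 2: report progress once per file, not per chunk.
        on_progress(done, total)
    return done


# Example run with stand-ins for the transfer and the progress display.
events = []
transferred = []
transfer_with_progress(
    {"a.csv": 100, "b.csv": 50},
    transfer_one=transferred.append,
    on_progress=lambda done, total: events.append((done, total)),
)
print(events)  # [(100, 150), (150, 150)]
```

With tqdm, the callback could instead feed a bar created with `tqdm(total=total, unit="B", unit_scale=True)` by passing each file's size to its `update` method.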
  • Fix test failure because of failed name resolution

    This PR fixes the test failure related to failed name resolution on CI. It makes the tests run inside a container because this is apparently needed to enable host name resolution for the minio service from within the tests.

    You can refer to these links for more information:

    • https://docs.github.com/en/actions/using-containerized-services/about-service-containers#mapping-docker-host-and-service-container-ports
    • https://docs.github.com/en/actions/using-containerized-services/creating-redis-service-containers#running-jobs-in-containers
    opened by AnesBenmerzoug 2
  • Add Simulation Mode

    Add a new flag to the RemoteStorage push/pull operations. If it is True, the function should determine and return the operations that need to be performed, without actually performing them.

    enhancement 
    opened by fariedabuzaid 0
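
One way to read the simulation-mode request above is a flag that returns the planned operations and skips the side effects. The sketch below is only an assumption about how that could look: `plan_push`, `push` and the `dryrun` flag are hypothetical, and plain dictionaries mapping file names to content hashes stand in for local and remote storage.

```python
from typing import Dict, List, Tuple

Operation = Tuple[str, str]  # (action, file name)


def plan_push(local: Dict[str, str], remote: Dict[str, str]) -> List[Operation]:
    """List the operations needed to make the remote match the local files."""
    ops: List[Operation] = []
    for name, digest in sorted(local.items()):
        if name not in remote:
            ops.append(("upload", name))
        elif remote[name] != digest:
            ops.append(("overwrite", name))
    return ops


def push(local: Dict[str, str], remote: Dict[str, str], dryrun: bool = False) -> List[Operation]:
    """Return the planned operations; apply them only if dryrun is False."""
    ops = plan_push(local, remote)
    if not dryrun:
        for _, name in ops:
            remote[name] = local[name]  # stand-in for the real upload
    return ops


local_files = {"a.csv": "h1", "b.csv": "h2", "c.csv": "h3"}
remote_files = {"a.csv": "h1", "b.csv": "old"}
planned = push(local_files, remote_files, dryrun=True)
print(planned)  # [('overwrite', 'b.csv'), ('upload', 'c.csv')]
```

Because planning is a separate function, the dry run and the real run share the same logic and cannot drift apart.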
  • Transactional safety for push and pull in remote storage

    Currently, pushing and pulling of directories does not check whether the entire operation can be performed successfully (e.g. if modified files already exist and overwrite_existing=False). This leads to a partial execution before an error is thrown and thus to an unpredictable state.

    We should check if the entire operation can be performed before pushing/pulling anything.

    Also, to be more familiar to git users, overwrite_existing should be renamed to force. This is a breaking change, so the minor version should be bumped.

    enhancement 
    opened by MischaPanch 0
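
The check-before-transfer idea from the issue above could look roughly like this. It is a minimal sketch under stated assumptions: `push_all_or_nothing` is a hypothetical function, dictionaries of file name to content hash stand in for the storages, and `force` is the renamed flag proposed in the issue, not an existing accsr parameter.

```python
from typing import Dict


def push_all_or_nothing(
    local: Dict[str, str], remote: Dict[str, str], force: bool = False
) -> None:
    """Validate the whole operation first, then transfer."""
    # Collect every conflict before touching the remote.
    conflicts = sorted(
        name for name, digest in local.items()
        if name in remote and remote[name] != digest
    )
    if conflicts and not force:
        # Nothing has been transferred yet, so the remote state is unchanged.
        raise FileExistsError(f"refusing to overwrite: {conflicts}")
    remote.update(local)  # stand-in for the real uploads


remote_state = {"a.csv": "old"}
try:
    push_all_or_nothing({"a.csv": "new", "b.csv": "h2"}, remote_state)
except FileExistsError as error:
    print(error)
print(remote_state)  # {'a.csv': 'old'}
```

This only guards against conflicts that are detectable up front. A failure in the middle of the actual transfer (for example a dropped connection) would still need separate handling.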
  • chore: release version 0.3.5-dev0

    @MischaPanch can you release the current dev branch? I found a bug in the old version which seems to be fixed now. Would be great to get the fix installed. Not urgent though, I can work with the dev branch for now :)

    opened by slettner 0
  • Improve docs by extending notebooks

    We have essentially no documentation on how to use accsr. The interplay of the storage and config modules should be demonstrated in notebooks. See tests/conftest.py for an example of how a storage service is instantiated during local testing and in CI.

    documentation 
    opened by MischaPanch 0
  • More convenient path selections in push/pull

    We should make it easier to push/pull a bunch of paths based on patterns. For that we should add

    • [ ] Permit passing glob-patterns to push/pull
    • [ ] Add the possibility to pass a regex as except_matches kwarg to permit simple exclusion of files. The current regex kwarg should be renamed to if_matches.

    This would permit things like

    storage.push("data/**/*.jpg", except_matches=r".*test.*") 
    

    We could additionally allow passing an except_condition: Callable[[str], bool] = None (or do you think if_condition is more natural?), in which case the above can be rewritten

    storage.push("data/**/*.jpg", except_condition=lambda n: "test" in n) 
    

    The condition could even be made more general, mapping the metadata object to a bool (thereby allowing, e.g., filtering by size), at the cost of a more complicated interface for callables. @fariedabuzaid @AnesBenmerzoug what do you think?

    enhancement 
    opened by MischaPanch 0
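
The selection semantics proposed in the issue above can be tried out locally before any storage API changes. The sketch below assumes a hypothetical helper `select_paths` that works on a local directory with `pathlib`; the `except_matches` and `except_condition` keywords mirror the proposal and are not part of accsr's current interface.

```python
import re
import tempfile
from pathlib import Path
from typing import Callable, List, Optional


def select_paths(
    root: Path,
    pattern: str,
    except_matches: Optional[str] = None,
    except_condition: Optional[Callable[[str], bool]] = None,
) -> List[str]:
    """Return paths under root matching the glob, minus the exclusions."""
    selected = []
    for path in root.glob(pattern):  # "**" matches zero or more directories
        name = path.relative_to(root).as_posix()
        if except_matches is not None and re.match(except_matches, name):
            continue
        if except_condition is not None and except_condition(name):
            continue
        selected.append(name)
    return sorted(selected)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in ("data/cat.jpg", "data/train/dog.jpg", "data/test/owl.jpg", "data/notes.txt"):
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    by_regex = select_paths(root, "data/**/*.jpg", except_matches=r".*test.*")
    by_callable = select_paths(root, "data/**/*.jpg", except_condition=lambda n: "test" in n)

print(by_regex)  # ['data/cat.jpg', 'data/train/dog.jpg']
```

Both exclusion styles give the same result here. The callable form is the one that would extend naturally to a metadata object, as discussed in the issue.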
  • CI: make caching work within containers

    @MischaPanch This fixes the tests but it currently breaks caching.

    WARNING: The directory '/github/home/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
    

    I think running jobs inside containers is the way to go and we should invest some time to make caching work with it.

    Originally posted by @AnesBenmerzoug in https://github.com/appliedAI-Initiative/accsr/issues/1#issuecomment-976536009

    Build/CI 
    opened by MischaPanch 0
Releases(v0.3.4)
Owner
appliedAI Initiative
The appliedAI Initiative aims to lift Germany and Europe to the AI age by accelerating the adoption of AI technology