A supercharged SQLite library for Python

Overview
supersqlite

SuperSQLite: a supercharged SQLite library for Python

pipeline status   Build Status   Build status
PyPI version   license   Python version

A feature-packed Python package and for utilizing SQLite in Python by Plasticity. It is intended to be a drop-in replacement to Python's built-in SQLite API, but without any limitations. It offers unique features like remote streaming over HTTP and bundling of extensions like JSON, R-Trees (geospatial indexing), and Full Text Search. SuperSQLite is also packaged with pre-compiled native binaries for SQLite and all of its extensions for nearly every platform as to avoid any C/C++ compiler errors during install.

Table of Contents

Installation

You can install this package with pip:

pip install supersqlite # Python 2.7
pip3 install supersqlite # Python 3

Motivation

SQLite, is a fast, popular embedded database, used by large enterprises. It is the most widely-deployed database and has billions of deployments. It has a built-in binding in Python.

The Python bindings, however, often are compiled against an out-of-date copy of SQLite or may be compiled with limitations set to low levels. Moreover, it is difficult to load extremely useful extensions like JSON1 that adds JSON functionality to SQLite or FTS5 that adds full-text search functionality to SQLite since they must be compiled with a C/C++ compiler on each platform before being loaded.

SuperSQLite aims to solve these problems by packaging a newer version of SQLite natively pre-compiled for every platform along with natively pre-compiled SQLite extensions. SuperSQLite also adds useful unique new features like remote streaming over HTTP to read from a centralized SQLite database.

Moreover, by default, SQLite does not enable some optimizations that can result in speedups. SuperSQLite compiles SQLite with various optimizations and allows you to select your workload at runtime to further automatically configure the connection to be optimized for your workload.

When to use SuperSQLite?

SQLite is extremely reliable and durable for large amounts of data (up to 140TB). It is considered one of the most well-engineered and well-tested software solutions today, with 711x more test code than implementation code.

SQLite is faster than nearly every other database at read-heavy use cases (especially compared to databases that may use a client-server model with network latency like MySQL, PostgreSQL, MongoDB, DynamoDB, etc.). You can also instantiate SQLite completely in-memory to remove disk latency, if your data will fit within RAM. For key/value use cases, you can get comparable or better read/write performance to key/value databases like LevelDB with the LSM1 extension.

When you have a write-heavy workload with multiple servers that need to write concurrently to a shared database (backend to a website), you would probably want to choose something that has a client-server model instead like PostgreSQL, although SQLite can handle processing write requests fast enough that it is sufficient for most concurrent write loads. In fact, Expensify uses SQLite for their entire backend. If you need the database to be automatically replicated or automatically sharded across machines or other distributed features, you probably want to use something else.

See Appropriate Uses For SQLite for more information and Well-Known Users of SQLite for example use cases.

Using the Library

Instead of 'import sqlite3', use:

from supersqlite import sqlite3

This retains compatibility with the sqlite3 package, while adding the various enhancements.

Connecting

Given the above import, connect to a sqlite database file using:

conn = sqlite3.connect('foo.db')

Querying

Remote Streaming over HTTP

Workload Optimizations

Extensions

JSON1

FTS3, FTS4, FTS5

LSM1

R*Tree

Other

Custom

Export SQLite Resources

Optimizations

Other Documentation

SuperSQLite extends the apsw Python SQLite wrapper and adds on to its functionality. You can find the full documentation for that library here, which in turn attempts to implement PEP 249 (DB API). The connection object, cursor object, etc. are all apsw.Connection, apsw.Cursor. Note, however, that some monkey-patching has been done to make the library more in-line and compatible as a drop-in replacement for Python's built-in sqlite3 module.

Other documentation is not available at this time. See the source file directly (it is well commented) if you need more information about a method's arguments or want to see all supported features.

Other Programming Languages

Currently, this library only supports Python. There are no plans to port it to any other languages, but since SQLite has a native C implementation and has bindings in most languages, you can use the export functions to load SuperSQLite's SQLite extensions in the SQLite bindings of other programming languages or link SuperSQLite's version of SQLite to a native binary.

Contributing

The main repository for this project can be found on GitLab. The GitHub repository is only a mirror. Pull requests for more tests, better error-checking, bug fixes, performance improvements, or documentation or adding additional utilties / functionalities are welcome on GitLab.

You can contact us at [email protected].

Roadmap

  • Out of the box, "fast-write" configuration option that makes the connection optimized for fast-writing.
  • Out of the box, "fast-read" configuration option that makes the conenction optimized for fast-reading.
  • Optimize streaming cache behavior

Other Notable Projects

  • pysqlite - The built-in sqlite3 module in Python.
  • apsw - Powers the main API of SuperSQLite, aims to port all of SQLite's API functionality (like VFSes) to Python, not just the query APIs.
  • Magnitude - Another project by Plasticity that uses SuperSQLite's unique features for machine learning embedding models.

LICENSE and Attribution

This repository is licensed under the license found here.

The SQLite "feather" icon is taken from the SQLite project which is released as public domain.

This project is not affiliated with the official SQLite project.

Comments
  • Missing __enter__() with cursor

    Missing __enter__() with cursor

    The syntax : with conn.cursor() as cursor: ... work with standard Python SQLite and others database drivers (postgres, ...) But the method __enter__() is not defined in supersqlite driver.

    opened by pprados 0
  • AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'

    AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension'

    Is supersqlite enable loading sqlite extensions? What can cause this error:

    from supersqlite import sqlite3 as sqlite33 conn33=sqlite33.connect("mydbfile.db") conn33.enable_load_extension(True) Traceback (most recent call last): File "", line 1, in AttributeError: 'sqlite3.Connection' object has no attribute 'enable_load_extension' thank you.

    opened by dbricker-intel 0
  • Publish supersqlite on conda-forge

    Publish supersqlite on conda-forge

    It would be very convenient to have this library available on conda-forge so it could be installed with the Conda package manager, which is ideal for packages with binary dependencies. Is that a possibility?

    opened by JWCook 0
  • No module named 'supersqlite.third_party.internal.apsw'

    No module named 'supersqlite.third_party.internal.apsw'

    Hello, I try to install supersqlite from pip and conda. I can use

    docker run -it ubuntu
    # then
    apt-get update ; \
    apt-get install -y python-apsw python3 python3-pip ; \
    pip3 install supersqlite ; python3 -c "import supersqlite"
    

    It's correct.

    Now, I try with conda

    docker run -it conda/miniconda3
    # Then
    conda update conda -y ; \
    conda init bash ;
    exec bash
    # and
    conda install -c conda-forge apsw -y ; \
    pip3 install supersqlite ; \
    python3 -c "import supersqlite"
    

    I receive and error: ModuleNotFoundError: No module named 'supersqlite.third_party.internal.apsw'

    Collecting supersqlite
      Downloading supersqlite-0.0.78.tar.gz (25.8 MB)
         |████████████████████████████████| 25.8 MB 3.9 MB/s 
    Building wheels for collected packages: supersqlite
      Building wheel for supersqlite (setup.py) ... done
      Created wheel for supersqlite: filename=supersqlite-0.0.78-cp39-cp39-linux_x86_64.whl size=71094064 sha256=64d23d0848fba148a4506b0905135d23df4d7099660bbba683584b71623d38a2
      Stored in directory: /root/.cache/pip/wheels/c2/83/40/cffebda33928fae730f81985e5d75078d257db3586bd419905
    Successfully built supersqlite
    Installing collected packages: supersqlite
    Successfully installed supersqlite-0.0.78
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/usr/local/lib/python3.9/site-packages/supersqlite/__init__.py", line 47, in <module>
        import supersqlite.third_party.internal.apsw as apsw
    ModuleNotFoundError: No module named 'supersqlite.third_party.internal.apsw'
    

    You can reproduce this bug with this docker file

    # SuperSqliteDockerfile file
    FROM continuumio/anaconda3
    
    RUN conda create --name testSupersqlite python=3.7 ; \
        conda activate testSupersqlite ; \
        pip3 install supersqlite
    
    ENTRYPOINT python -c "import supersqlite"
    

    and

    docker run --rm -it $(docker build -q -f SuperSqliteDockerfile .)
    

    How can I resolve this problem?

    Thanks

    opened by pprados 1
  • Fix typo that causes install to break

    Fix typo that causes install to break

    pip install of supersqlite is broken due to a typo in a requirement name: lsb-db should actually be lsm-db.

    This also resolves https://github.com/plasticityai/supersqlite/issues/5

    opened by DDevine 2
  • [BUG] Outdated/typo in requirements.txt

    [BUG] Outdated/typo in requirements.txt

    Description

    It seems current requirements.txt is either outdated or contains a typo in https://github.com/plasticityai/supersqlite/blob/d74da749c6fa5df021df3968b854b9a59f829e17/requirements.txt#L1 When trying to pip install lsb-db==0.6.4 I get following error

    Could not find a version that satisfies the requirement lsb-db==0.6.4 (from versions: none)
    

    I guess this could be a typo for lsm-db?

    Expected behavior

    Installing this package via pip won't fail

    System

    Ubuntu 18.04LTS / Python 3.6 / pip 19.1.1

    cc @AjayP13

    opened by johnygomez 0
Releases(0.0.78)
Owner
Plasticity
The official GitHub account of Plasticity
Plasticity
Pony Object Relational Mapper

Downloads Pony Object-Relational Mapper Pony is an advanced object-relational mapper. The most interesting feature of Pony is its ability to write que

3.1k Jan 04, 2023
Logica is a logic programming language that compiles to StandardSQL and runs on Google BigQuery.

Logica: language of Big Data Logica is an open source declarative logic programming language for data manipulation. Logica is a successor to Yedalog,

Evgeny Skvortsov 1.5k Dec 30, 2022
MySQL database connector for Python (with Python 3 support)

mysqlclient This project is a fork of MySQLdb1. This project adds Python 3 support and fixed many bugs. PyPI: https://pypi.org/project/mysqlclient/ Gi

PyMySQL 2.2k Dec 25, 2022
a small, expressive orm -- supports postgresql, mysql and sqlite

peewee Peewee is a simple and small ORM. It has few (but expressive) concepts, making it easy to learn and intuitive to use. a small, expressive ORM p

Charles Leifer 9.7k Dec 30, 2022
Async ORM based on PyPika

PyPika-ORM - ORM for PyPika SQL Query Builder The package gives you ORM for PyPika with asycio support for a range of databases (SQLite, PostgreSQL, M

Kirill Klenov 7 Jun 04, 2022
Easy-to-use data handling for SQL data stores with support for implicit table creation, bulk loading, and transactions.

dataset: databases for lazy people In short, dataset makes reading and writing data in databases as simple as reading and writing JSON files. Read the

Friedrich Lindenberg 4.2k Jan 02, 2023
SQL for Humans™

Records: SQL for Humans™ Records is a very simple, but powerful, library for making raw SQL queries to most relational databases. Just write SQL. No b

Kenneth Reitz 6.9k Jan 07, 2023
GINO Is Not ORM - a Python asyncio ORM on SQLAlchemy core.

GINO - GINO Is Not ORM - is a lightweight asynchronous ORM built on top of SQLAlchemy core for Python asyncio. GINO 1.0 supports only PostgreSQL with

GINO Community 2.5k Dec 29, 2022
GINO Is Not ORM - a Python asyncio ORM on SQLAlchemy core.

GINO - GINO Is Not ORM - is a lightweight asynchronous ORM built on top of SQLAlchemy core for Python asyncio. GINO 1.0 supports only PostgreSQL with

GINO Community 2.5k Dec 27, 2022
Python client for InfluxDB

InfluxDB-Python InfluxDB-Python is a client for interacting with InfluxDB. Development of this library is maintained by: Github ID URL @aviau (https:/

InfluxData 1.6k Dec 24, 2022
A Python wheel containing PostgreSQL

postgresql-wheel A Python wheel for Linux containing a complete, self-contained, locally installable PostgreSQL database server. All servers run as th

Michel Pelletier 71 Nov 09, 2022
Asynchronous, fast, pythonic DynamoDB Client

AsyncIO DynamoDB Asynchronous pythonic DynamoDB client; 2x faster than aiobotocore/boto3/botocore. Quick start With httpx Install this library pip ins

HENNGE 48 Dec 18, 2022
Script em python para carregar os arquivos de cnpj dos dados públicos da Receita Federal em MYSQL.

cnpj-mysql Script em python para carregar os arquivos de cnpj dos dados públicos da Receita Federal em MYSQL. Dados públicos de cnpj no site da Receit

17 Dec 25, 2022
Anomaly detection on SQL data warehouses and databases

With CueObserve, you can run anomaly detection on data in your SQL data warehouses and databases. Getting Started Install via Docker docker run -p 300

Cuebook 171 Dec 18, 2022
Makes it easier to write raw SQL in Python.

CoolSQL Makes it easier to write raw SQL in Python. Usage Quick Start from coolsql import Field name = Field("name") age = Field("age") condition =

Aber 7 Aug 21, 2022
SpyQL - SQL with Python in the middle

SpyQL SQL with Python in the middle Concept SpyQL is a query language that combines: the simplicity and structure of SQL with the power and readabilit

Daniel Moura 853 Dec 30, 2022
Some scripts for microsoft SQL server in old version.

MSSQL_Stuff Some scripts for microsoft SQL server which is in old version. Table of content Overview Usage References Overview These script works when

小离 5 Dec 29, 2022
aiosql - Simple SQL in Python

aiosql - Simple SQL in Python SQL is code. Write it, version control it, comment it, and run it using files. Writing your SQL code in Python programs

Will Vaughn 1.1k Jan 08, 2023
Pure-python PostgreSQL driver

pg-purepy pg-purepy is a pure-Python PostgreSQL wrapper based on the anyio library. A lot of this library was inspired by the pg8000 library. Credits

Lura Skye 11 May 23, 2022
This repository is for active development of the Azure SDK for Python.

Azure SDK for Python This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public

Microsoft Azure 3.4k Jan 02, 2023