Airflow Operator for running Soda SQL scans

Overview

Soda SQL Airflow Operator

Airflow Operator for running Soda SQL scans

Example Usage

# src/soda/scans/my_scan.yml
# Note that the Airflow rendered templates are accessible (e.g. {{ params.client_id }})
table_name: tmp_{{ params.client_id }}_{{ ds_nodash }}
sql_metrics:
  - sql: |
      SELECT
        SUM(value1) AS staged_value1,
        SUM(value2) AS staged_value2
      FROM tmp_{{ params.client_id }}_{{ ds_nodash }}
  - sql: |
      SELECT
        SUM(value1) AS final_value1,
        SUM(value2) AS final_value2
      FROM final_table
      WHERE
        date = '{{ ds }}'
        AND client_id = {{ params.client_id }}
tests:
  - staged_value1 == final_value1
  - staged_value2 > final_value2
# my_airflow_dag.py
from pathlib import Path
from soda_util import build_soda_warehouse, convert_templated_yml_to_dict

SODA_PATH = Path(os.getenv("PYTHON_PATH", "/code/src")) / "/soda/scans/"  # Matches where my_scan.yml is saved

validate_staged_data = SodaSqlOperator(
    task_id="validate_staged_data",
    warehouse=build_soda_warehouse("warehouse_name", "database_name"),  # Could also pass a file path to a yml file
    scan=convert_templated_yml_to_dict(SODA_PATH, "my_scan.yml"),
    params={"client_id": 12345},  # Params are rendered by Airflow and accessible in the yaml file
)

Notes

  • Unlike Soda itself, a builder pattern is not used to define the warehouse and scan argument. Rather, the warehouse and scan parameters are instance checked and the relevant Soda methods are set. This provides a much simpler API, where we can just pass in the args to the Operator
  • As we are passing over all rendering of Jinga templates to Airflow, the native Soda templates are not accessible. So always use Airflow templates
  • Soft failures (i.e. the Airflow task doesn't fail, it just alerts) have been implemented, but alerting of soft failures has not. So soft failures will essentially just mean the Airflow task passes. Alerting to be implemented
Owner
Todd de Quincey
Data Engineer, Chartered Accountant and all round nice guy (or so I like to think). I believe that quality, simplicity and focus are the keys to success
Todd de Quincey
A website to collect vintage 4 tracks cassette recorders.

Vintage 4tk cassette recorders A website to collect vintage 4 tracks cassette recorders. Local development setup Copy and customize Django settings (e

1 May 01, 2022
A inspector to be able to view and edit Qt style sheet while an application is running

Qt Style Sheet Inspector An inspector widget to view and modify the style sheet of a Qt app at runtime. Usage In order to use the inspector widget on

ESSS 46 Dec 10, 2022
A check numbers python module

Made with Python3 (C) @FayasNoushad Copyright permission under MIT License License - https://github.com/FayasNoushad/Numbers/blob/main/LICENSE Deplo

Fayas Noushad 3 Nov 28, 2021
WriteAIr is a website which allows users to stream their writing.

WriteAIr is a website which allows users to stream their writing. It uses HSV masking to detect a pen which the user writes with. Plus, users can select a wide range of options through hand gestures!

Atharva Patil 1 Nov 01, 2021
An alternative site to emplea.do due to inconsistent service of the app.

feline a agile and fast alternative to emplea.do License: MIT Settings Moved to settings. Basic Commands Setting Up Your Users To create a normal user

Codetiger 8 Nov 10, 2021
A python trivium implemention

A python trivium implemention

tnt2402 1 Nov 12, 2021
The repository for AnyMacro: a Fusion360 Add-In

AnyMacro AnyMacro is an Autodesk® Fusion 360™ add-in for chaining multiple commands in a row to form Macros. Macros are created from a set of commands

1 Jan 07, 2022
Mixtaper - Web app to make mixtapes

Mixtaper A web app which allows you to input songs in the form of youtube links

suryansh 1 Feb 14, 2022
Python binding to rust zw-fast-quantile

zw_fast_quantile_py zw-fast-quantile python binding Installation pip install zw_fast_quantile_py Usage import zw_fast_quantile_py

Paul Meng 1 Dec 30, 2021
Open source book about making Python packages.

Python packages Tomas Beuzen & Tiffany Timbers Python packages are a core element of the Python programming language and are how you create organized,

Python Packages 169 Jan 06, 2023
Reload all Blender add-on modules

Reload-Addon This add-on creates a list of the modules that the add-on selected in the drop-down menu contains and reloads them with the keyboard shor

2 Dec 02, 2021
Packages of Example Data for The Effect

causaldata This repository will contain R, Stata, and Python packages, all called causaldata, which contain data sets that can be used to implement th

103 Dec 24, 2022
Certipy is a Python tool to enumerate and abuse misconfigurations in Active Directory Certificate Services (AD CS).

Certipy Certipy is a Python tool to enumerate and abuse misconfigurations in Active Directory Certificate Services (AD CS). Based on the C# variant Ce

ollypwn 1.3k Jan 01, 2023
VAST - Visualise Abstract Syntax Trees for Python

VAST VAST - Visualise Abstract Syntax Trees for Python. VAST generates ASTs for a given Python script and builds visualisations of them. Install Insta

Jesse Phillips 2 Feb 18, 2022
ToDoListAndroid - To-do list application created using Kivymd

ToDoListAndroid To-do list application created using Kivymd. Version 1.0.0 (1/Jan/2022). Planned to do next: -Add setting (theme selector, etc) -Add f

AghnatHs 1 Jan 01, 2022
Install packages with pip as if you were in the past!

A PyPI time machine Do you wish you could just install packages with pip as if you were at some fixed date in the past? If so, the PyPI time machine i

Thomas Robitaille 51 Jan 09, 2023
Python samples for Google Cloud Platform products.

Google Cloud Platform Python Samples Python samples for Google Cloud Platform products. Setup Install pip and virtualenv if you do not already have th

Google Cloud Platform 6k Jan 03, 2023
Solves Maths24 problems for you!

maths24-solver Solves Maths24 problems for you! Enjoy this open scource project! You can edit modify and share! My wishes is for you to use this proje

6 Nov 07, 2021
An implementation of an interpreter for the Brainfuck esoteric language in Python

Brainfuck Interpreter in Python An implementation of an interpreter for the Brainfuck esoteric language in Python. 🧠 The Brainfuck Language Created i

Carlos Santos 0 Feb 01, 2022
KeyLogger cliente-servidor em Python para estudos

KeyLogger Esse projeto é apenas para estudos, não nos responsabilisamos por qualquer uso indevido ou prejudiciais do mesmo. Sobre O objetivo do projet

1 Dec 17, 2021