Python & Julia ports of code from excellent R books

Overview

X4DS

This repo is a collection of Python & Julia ports of the code in the following excellent R books:

                            Python Stack          Julia Stack
Language Version            v3.9                  v1.7
Data Processing             Pandas                DataFrames
Visualization               Matplotlib, Seaborn   Makie, AlgebraOfGraphics
Machine Learning            Scikit-Learn          MLJ
Probabilistic Programming   PyMC                  Turing

Code Styles

    2.1. Basics

    • prefer enumerate() over range(len())
    xs = range(3)
    
    # good
    for ind, x in enumerate(xs):
        print(f'{ind}: {x}')
    
    # bad
    for i in range(len(xs)):
        print(f'{i}: {xs[i]}')
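The same idea extends to 1-based labels and to walking two sequences together. The following is a minimal sketch; `names` and `scores` are hypothetical sample data, not part of the repo.

```python
# Hypothetical sample data for illustration only
names = ["a", "b", "c"]
scores = [0.1, 0.2, 0.3]

# enumerate() accepts a start offset, which avoids writing i + 1 by hand
labels = [f"{i}: {name}" for i, name in enumerate(names, start=1)]

# zip() iterates several sequences in lockstep without any index arithmetic
pairs = [f"{name}={score}" for name, score in zip(names, scores)]

print(labels)
print(pairs)
```

Both forms keep the loop body free of `xs[i]` lookups, which is the main point of the rule above.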

    2.2. Matplotlib

    including seaborn

    • prefer Axes methods over the implicit pyplot (state-based) interface
    • use constrained_layout=True when drawing subplots
    # good
    _, axes = plt.subplots(1, 2, constrained_layout=True)
    axes[0].plot(x1, y1)
    axes[1].hist(x2)
    
    # bad
    plt.subplot(121)
    plt.plot(x1, y1)
    plt.subplot(122)
    plt.hist(x2)
    • prefer axes.flatten() over plt.subplot() in cases where subplots' data is iterable
    • prefer zip() or enumerate() over range() for iterable objects
    # good
    _, axes = plt.subplots(2, 2, figsize=[12, 8], constrained_layout=True)
    
    for ax, x, y in zip(axes.flatten(), xs, ys):
        ax.plot(x, y)
    
    # bad
    for i in range(4):
        ax = plt.subplot(2, 2, i + 1)
        ax.plot(xs[i], ys[i])
    • prefer set() method over set_*() method
    # good
    ax.set(xlabel='x', ylabel='y')
    
    # bad
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    • prefer sns.despine() over ax.spines[*].set_visible()
    # good
    sns.despine(left=True, bottom=True)  # top and right are removed by default
    
    # bad
    ax.spines["top"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)
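Several of the Matplotlib rules above can be combined in one short script. This is only a sketch under stated assumptions: the data in `xs`, `ys`, and `titles` is made up, and the Agg backend is selected so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: headless run)
import matplotlib.pyplot as plt

# Hypothetical data: one curve per subplot
xs = [[0, 1, 2, 3]] * 3
ys = [[0, 1, 4, 9], [0, 1, 8, 27], [0, 1, 2, 3]]
titles = ["square", "cube", "identity"]

# Axes objects plus constrained_layout instead of plt.subplot() calls
fig, axes = plt.subplots(1, 3, figsize=(9, 3), constrained_layout=True)

# zip() over the flattened axes and the data instead of range(len(...))
for ax, x, y, title in zip(axes.flatten(), xs, ys, titles):
    ax.plot(x, y)
    # one set() call instead of separate set_xlabel/set_ylabel/set_title
    ax.set(xlabel="x", ylabel="y", title=title)
```

If Seaborn is installed, a single `sns.despine(fig=fig)` call after the loop would apply the spine rule to every subplot.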

    2.3. Pandas

    • prefer df['col'] over df.col
    # good
    movies['duration']
    
    # bad
    movies.duration
    • prefer df.query over df[] or df.loc[] in simple-selection
    # good
    movies.query('duration >= 200')
    
    # bad
    movies[movies['duration'] >= 200]
    movies.loc[movies['duration'] >= 200, :]
    • prefer df.loc and df.iloc over df[] in multiple-selection
    # good
    movies.loc[movies['duration'] >= 200, 'genre']
    movies.iloc[0:2, :]
    
    # bad
    movies[movies['duration'] >= 200].genre
    movies[0:2]
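One likely reason for preferring `df['col']` is that attribute access breaks when a column name collides with a DataFrame method. The sketch below illustrates this along with the selection rules; the `movies` table here is a hypothetical stand-in with made-up rows, and the `count` column is added only to show the collision.

```python
import pandas as pd

# Hypothetical stand-in for the `movies` table used above
movies = pd.DataFrame({
    "title": ["A", "B", "C"],
    "duration": [95, 210, 240],
    "genre": ["Drama", "Crime", "Action"],
    "count": [1, 2, 3],  # name collides with the DataFrame.count method
})

# Attribute access resolves to the method, not the column
print(callable(movies.count))

# Bracket access always returns the column
print(movies["count"].tolist())

# Simple selection (rows only): query()
long_movies = movies.query("duration >= 200")

# Multiple selection (rows and a column): loc / iloc
long_genres = movies.loc[movies["duration"] >= 200, "genre"]
first_two = movies.iloc[0:2, :]
```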

    LaTeX Styles

    Multiple lines

    Avoid \begin{array}...\end{array} where a dedicated environment exists:

    • equations: \begin{aligned}...\end{aligned}
    $$
    \begin{aligned}
    y_1 &= x^2 + 2x \\
    y_2 &= x^3 + x
    \end{aligned}
    $$
    • equations with conditions: \begin{cases}...\end{cases}
    $$
    y =
    \begin{cases}
    x^2 + 2x & x > 0 \\
    x^3 + x & x ≤ 0
    \end{cases}
    $$
    • matrices: \begin{matrix}...\end{matrix} and its variants (here vmatrix)
    $$
    \begin{vmatrix}
      a + a' & b + b' \\ c & d
    \end{vmatrix} = \begin{vmatrix}
      a & b \\ c & d
    \end{vmatrix} + \begin{vmatrix}
      a' & b' \\ c & d
    \end{vmatrix}
    $$

    Brackets

    • prefer \Bigg...\Bigg over \left...\right
    $$
    A\Bigg[v_1\ v_2\ \cdots\ v_r\Bigg]
    $$
    • prefer \underset{}{} over \underset{}
    $$
    \underset{θ}{\mathrm{argmax}}\ p(x_i|θ)
    $$

    Expressions

    • prefer ^{\top} over ^T for transpose

    $$
    𝐀^{\top}
    $$
    • prefer \to over \rightarrow for limit

    $$
    \lim_{n\to \infty}
    $$
    • prefer \underset{}{} over \limits_

    $$
    \underset{w}{\mathrm{argmin}}\ (wx + b)
    $$

    Fonts

    • prefer \mathrm over \mathop or \operatorname
    $$
    θ_{\mathrm{MLE}}=\underset{θ}{\mathrm{argmax}}\ ∑_{i = 1}^{N}\log p(x_i|θ)
    $$

    ISLR

    References

    Owner: Gitony