Baseline is a cross-platform library and command-line utility that creates file-oriented baselines of your systems.

Overview

Baselining, on steroids!

Baseline is a cross-platform library and command-line utility that creates file-oriented baselines of your systems.

The project aims to offer an open-source alternative to the famous NSRL or HashSets and allows you to generate baselines from your own systems. Plus, it is cross-platform, so you can use it the same way whether on a Windows or a GNU/Linux system.

Currently available extractors:

  • fs: Extracts filesystem-related metadata.
  • hash: Computes several hashes from the entry's data (e.g. MD5, SHA-1, ssdeep).
  • pe: Extracts detailed information from Portable Executable (PE) files.

Table of contents

Installation

Installing from PyPI

Baseline is currently not available on PyPI. The main reason is that the name is currently taken by another project.

Installing from source

Since Baseline uses Poetry as its packaging toolkit of choice, so installing it from source is as simple as:

git clone httpe://github.com/sk4la/baseline.git
cd baseline

python3 -m pip install poetry
python3 -m poetry install

Precompiled binaries

Precompiled binaries are available in the Releases section.

Docker

Baseline is also available as a Docker image.

To pull the latest image from Docker Hub:

docker pull sk4la/baseline

See the official Docker documentation for details on how to install and use it.

Usage

The help menu for the baseline command-line utility:

Usage: baseline 
   
     COMMAND [ARGS]...

  Command-line utility that creates file-oriented baselines.

Options:
  --ensure-administrator          Ensure that the current user has administrative privileges (i.e.
                                  is `root` or equivalent on GNU/Linux systems).
  --log-file 
    
                    Set the log file path (e.g. 'baseline.log').  [default:
                                  /home/sk4la/baseline/20211031110323.25756fb1b706.log]
  --logging-configuration 
     
        Set the logging configuration using a custom file. The file must
                                  adhere to the official specification. See
                                  https://docs.python.org/3/library/logging.config.html#logging-
                                  config-dictschema for more details.
  --monochrome                    Disable console output coloring. This can be useful when piping
                                  the output to a log file.
  -v, --verbose                   Increase the logging verbosity. Supports up to 4 occurrences of
                                  the same option (e.g. -vvvv).  [0<=x<=4]
  --version                       Show the version and exit.
  --help                          Show this message and exit.

Commands:
  new     Creates a new filesystem-based baseline.
  schema  Show the JSON representation of the actual schema.

     
    
   

The help menu for the baseline new subcommand:

Usage: baseline new 
    
    
     ...

  Creates a new filesystem-based baseline.

Options:
  --comment 
     
                       Add an arbitrary comment to the generated output file.
  --exclude-directory 
      
       ...
                                  Exclude a specific directory from the baseline. Can be specified
                                  multiple times (e.g. `--exclude-directory /dev --exclude-
                                  directory /proc`).
  --exclude-extractor [hash|pe|fs]
                                  Exclude extractors. Can be specified multiple times (e.g.
                                  `--exclude-extractor hash --exclude-extractor pe`).
  --max-size 
       
         Set the maximum file size (in bytes) to inspect. [default: 5000000; x>=1] -o, --output-file 
        
          Set the output file path (e.g. 'baseline.ndjson'). [default: /workspaces/baseline/20211102200452.58bd60a3b16a.ndjson] --output-file-encoding [utf-8|utf-16le] Set the output file encoding. Only applies when writing to an actual file. [default: utf-8] -f, --output-format [ndjson] Set the output format. [default: ndjson] --partition-size 
         
           Set the partition size (i.e. number of entries per process). [default: 200; x>=1] --processes 
          
            Set the number of parallel processes. [default: 2; x>=1] --recursive / --non-recursive Whether to walk the filesystem recursively. When set, the program will only inspect the files and directories specifies on the first level of any included path. For example, if '/mnt/image' is specified as an included path, then only the directory '/mnt/image' itself and its direct children will be inspected. [default: recursive] --remap 
           
            ... Artificially remap included paths (e.g. '/mnt/image:/'). Can be specified multiple times (e.g. `--remap /mnt/image:/ --remap /dev/null:/dev/void`). --report / --no-report Whether to show a final report at the end. [default: report] --skip-compression Whether to skip on-the-fly compression of the resulting file. --skip-directories Whether to skip directories. --skip-empty Whether to skip empty entries. --help Show this message and exit. 
           
          
         
        
       
      
     
    
   

The help menu for the baseline schema subcommand:

Usage: baseline schema 
   
    

  Show the JSON representation of the actual schema.

Options:
  --compact                       Render compact JSON instead of the default idented version.
  --output-file 
    
                 Set the output file path (e.g. 'schema.json').
  --output-file-encoding [utf-8|utf-16le]
                                  Set the output file encoding. Only applies when writing to an
                                  actual file.  [default: utf-8]
  --help                          Show this message and exit.

    
   

Creating a baseline of a live system

Creating a baseline of a live system is as simple as:

baseline new

When using Baseline from a removable device, you may want to exclude its path (for example /mnt/usb) from the generated baseline:

baseline --ensure-administrator new --exclude-directory /mnt/usb

See the Usage section for a complete list of options and arguments.

Creating a baseline from a mounted image

When creating a baseline of a mounted image, you may want the baseline to represent the files as if they were read from the actual system, not the mounted image.

For example, if your image is currently mounted on /mnt/IMG-001, you can then execute the following command to remap all entries read from this path to /:

baseline new --remap /mnt/IMG-001:/ /mnt/IMG-001

You can think of this as a chroot jail.

Displaying the schema

Baseline uses a fixed schema for rendering the information. This schema is enforced using the Pydantic package and produces a heavily-typed output that can later be ingested as-is.

To print the standardized JSON schema:

baseline schema

To dump a compact version of the JSON schema to schema.min.json:

baseline schema --compact --output-file schema.min.json

The JSON schema produced by Pydantic is compatible with the specifications from JSON Schema Core, JSON Schema Validation and OpenAPI Data Types. See the official Pydantic documentation for more details.

Advanced usage

Building binaries

Baseline currently supports the following packaging systems:

Although precompiled binaries are available in the Releases section, you should always build your own binaries.

To produce a binary using PyInstaller:

make pyinstaller-linux

To produce a binary using Nuitka:

make nuitka-linux

As Nuitka is a Python compiler by itself and does not rely on the standard CPython interpreter, you should be aware that there may be bugs and/or issues unrelated to Baseline itself.

Building the Docker image

To build the official Docker image:

make docker

Additional instructions can be added to the Dockerfile in order to customize the image.

The official Docker image is available at https://hub.docker.com/r/sk4la/baseline. You can use the FROM docker.io/sk4la/baseline:latest instruction in your own Dockerfile to derive your own image.

API

Using Baseline from Python is possible using the Baseline class:

from baseline.core import Baseline

with Baseline() as baseline:
    for record in baseline.compute(*[
        "/mnt/IMG-001",
        "/mnt/IMG-002",
    ]):
        print(record.json(exclude_none=True))

See the actual code for a more thorough example.

The Baseline class emits logging messages to the baseline logger, to which you can subscribe to if you wish. The command-line utility displays these messages to the console by default.

Contribute

Baseline is a work in progress, everyone is welcome to contribute! ๐Ÿ‘

Writing a new extractor

In order for new extractors to be able to enrich the generated records, the global schema first needs to be updated. To do this, you must create a sublass of Pydantic's BaseModel in schema.py that references the fields that will eventually be filled by the extractor. This class will then be referenced in the schema's root Record class.

In this example, we want to extract the first 50 lines of any *.txt file. Here we arbitrarily decide that the extracted text will be stored in the content attribute and that the extractor's key will be text:

class Text(pydantic.BaseModel):
    content: str

class Record(pydantic.BaseModel):
    ...
    text: typing.Optional[Text]

We can then start to write the actual code. All extractors must inherit from the base Extractor class:

None: with self.entry.open() as stream: setattr( record, self.KEY, schema.Text( content=stream.read(50), ), ) ">
from baseline.models import Extractor
from baseline.schema import Text

class Text(Extractor):
    """Extracts the first line of any `.txt` file."""

    EXTENSION_FILTERS = (
      r"\.txt$",
    )
    KEY = "text"

    def run(self: object, record: schema.Record) -> None:
        with self.entry.open() as stream:
            setattr(
                record,
                self.KEY,
                schema.Text(
                    content=stream.read(50),
                ),
            )

The extractor's KEY class variable must correspond to the one that was specified in the schema's root Record class (text in this example).

Support

In case you encounter a problem or want to suggest a new feature, please submit a ticket.

License

Baseline is licensed under the GNU General Public License (GPL) version 3.

You might also like...
A Python module and command line utility for working with web archive data using the WACZ format specification

py-wacz The py-wacz repository contains a Python module and command line utility for working with web archive data using the WACZ format specification

A Python module and command-line utility for converting .ANS format ANSI art to HTML

ansipants A Python module and command-line utility for converting .ANS format ANSI art to HTML. Installation pip install ansipants Command-line usage

A command line utility to export Google Keep notes to markdown.

Keep-Exporter A command line utility to export Google Keep notes to markdown files with metadata stored as a frontmatter header. Supports exporting: S

A command line utility for tracking a stock market portfolio. Primarily featuring high resolution braille graphs.
A command line utility for tracking a stock market portfolio. Primarily featuring high resolution braille graphs.

A command line stock market / portfolio tracker originally insipred by Ericm's Stonks program, featuring unicode for incredibly high detailed graphs even in a terminal.

๐Ÿ“ฆ A command line utility to put text in a box.
๐Ÿ“ฆ A command line utility to put text in a box.

boxie A command line utility to put text in a box. Installation pip install boxie If you are on Linux you may need to use sudo to access this globally

Tiny command-line utility for mapping broken keys to other positions.

brokenkey Tiny command-line utility for mapping broken keys to other positions. Installation Clone this repository using git: git clone https://github

This is a CLI utility that allows you to view RedFlagDeals.com on the command line.
This is a CLI utility that allows you to view RedFlagDeals.com on the command line.

RFD Description Motivation Installation Usage View Hot Deals View and Sort Hot Deals Search Advanced View Posts Shell Completion bash zsh Description

img-proof (IPA) provides a command line utility to test images in the Public Cloud

overview img-proof (IPA) provides a command line utility to test images in the Public Cloud (AWS, Azure, GCE, etc.). With img-proof you can now test c

A Python command-line utility for validating that the outputs of a given Declarative Form Azure Portal UI JSON template map to the input parameters of a given ARM Deployment Template JSON template

A Python command-line utility for validating that the outputs of a given Declarative Form Azure Portal UI JSON template map to the input parameters of a given ARM Deployment Template JSON template

Comments
  • executable variable is not defined in extractors/executables

    executable variable is not defined in extractors/executables

    self.executable is not defined in baseline/extractors/executables.py, so in the __del__ method, it crashes.

    $ baseline new /etc      
    Exception ignored in: <function PortableExecutable.__del__ at 0x7fd1142c98b0>
    Traceback (most recent call last):
      File "/home/git/baseline/env/lib/python3.9/site-packages/baseline/extractors/executables.py", line 80, in __del__
        if self.executable:
    AttributeError: 'PortableExecutable' object has no attribute 'executable'
    All done! ๐Ÿ’ช
    
    Entries Processed : 1592
    Output Format     : ndjson
    Output File       : /home/git/baseline/20211106191044.heaven.ndjson.xz
    Size              : 1.2 MB
    Total Time        : 0:00:01.993753
    
    opened by jurelou 0
Releases(0.1.0)
Owner
Nelson
Random bits of code
Nelson
Custom 64 bit shellcode encoder that evades detection and removes some common badchars (\x00\x0a\x0d\x20)

x64-shellcode-encoder Custom 64 bit shellcode encoder that evades detection and removes some common badchars (\x00\x0a\x0d\x20) Usage Using a generato

Cole Houston 2 Jan 26, 2022
Stephen's Obsessive Note-Storage Engine.

Latest Release ยท PyPi Package ยท Issues ยท Changelog ยท License # Get Sonse and tell it where your notes are... $ pip install sonse $ export SONSE="$HOME

Stephen Malone 23 Jun 10, 2022
Task-manager-CLI with Priority Modification

Task-manager-CLI with Priority Modification The functions for the app have been written in task.py file. 1. Install Node.js This project requires Node

1 Jan 21, 2022
Standalone script written in Python 3 for generating Reverse Shell one liner snippets and handles the communication between target and client using custom Netcat binaries

Standalone script written in Python 3 for generating Reverse Shell one liner snippets and handles the communication between target and client using custom Netcat binaries. It automates the boring stu

Yash Bhardwaj 3 Sep 27, 2022
Set of scripts & tools for converting between numbers and major system encoded words.

major-system-converter Set of scripts & tools for converting between numbers and major system encoded words. Uses phonetics instead of letters to conv

4 Aug 09, 2022
Wordle-solver - A tool that helps people who struggle with vocabulary to enjoy the famous game of WORDLE

Wordle-Solver Wordle-Solver helps people who struggle with vocabulary to enjoy t

Jason Chao 104 Dec 31, 2022
An easy-to-bundle GTK terminal emulator.

EasyTerm An easy-to-bundle GTK terminal emulator. This project is meant to be used as a dependency for other

Bottles 4 May 15, 2022
Apple Silicon 'top' CLI

asitop pip install asitop What A nvtop/htop style/inspired command line tool for Apple Silicon (aka M1) Macs. Note that it requires sudo to run due to

Timothy Liu 1.2k Dec 31, 2022
lfb (light file browser) is a terminal file browser

lfb (light file browser) is a terminal file browser. The whole program is a mess as of now. In the feature I will remove the need for external dependencies, tidy up the code, make an actual readme, a

2 Apr 09, 2022
Color preview command-line tool written in python

Color preview command-line tool written in python

Arnau 1 Dec 27, 2021
Hack-All is a simple CLI tool that helps ethical-hackers to make a reverse connection without knowing the target device in use is it computer or phone

Hack-All is a simple CLI tool that helps ethical-hackers to make a reverse connection without knowing the target device in use is it computer

LightYagami17 5 Nov 22, 2022
Generate folder trees directly from the terminal.

Dir Tree Artist ๐ŸŽจ ๐ŸŒฒ Intro Easily view folder structure, with parameters to sieve out what you want. Choose to exclude files from being viewed (.git,

Glenda T 0 May 17, 2022
Easily turn single threaded command line applications into a fast, multi-threaded application with CIDR and glob support.

Easily turn single threaded command line applications into a fast, multi-threaded application with CIDR and glob support.

Michael Skelton 1k Jan 07, 2023
CLI/GUI Math commands based on python 3

PyMath Commands Syntax Installation Commands: pymath add: usage: pymath add 12.5 12.5 sub: usage: pymath sub 25 12.5 div: usage: pymath div 144 12 mul

eggsnham07 0 Nov 22, 2021
argofloats: Simple CLI for ArgoVis and Argofloats

argofloats: Simple CLI for ArgoVis and Argofloats Argo is an international program that collects information from inside the ocean using a fleet of ro

Samapriya Roy 2 Feb 13, 2022
Modern line-oriented terminal emulator without support for TUIs.

Modern line-oriented terminal emulator without support for TUIs.

10 Jun 12, 2022
GoogleFormSpammer - A simple CLI script to spam Google Forms used by Crypto Wallet scammers to collect stolen data

GoogleFormSpammer - A simple CLI script to spam Google Forms used by Crypto Wallet scammers to collect stolen data

14 Dec 17, 2022
A VIM-inspired filemanager for the console

ranger 1.9.3 ranger is a console file manager with VI key bindings. It provides a minimalistic and nice curses interface with a view on the directory

12.6k Dec 30, 2022
Openstack bucket retention cli

Openstack bucket retention cli

Fatih Sarhan 3 Apr 03, 2022
A command line tool to hide and reveal information inside images (works for both PNGs and JPGs)

ImgReRite A command line tool to hide and reveal information inside images (work

Jigyasu 10 Jul 27, 2022