Self Organising Map (SOM) for clustering of atomistic samples through unsupervised learning.

Overview

Self Organising Map for Clustering of Atomistic Samples - V2

Description

Self Organising Map (also known as Kohonen Network) implemented in Python for clustering of atomistic samples through unsupervised learning. The program allows the user to select wich per-atom quantities to use for training and application of the network, this quantities must be specified in the LAMMPS input file that is being analysed. The algorithm also requires the user to introduce some of the networks parameters:

  • f: Fraction of the input data to be used when training the network, must be between 0 and 1.
  • SIGMA: Maximum value of the sigma function, present in the neighbourhood function.
  • ETA: Maximum value of the eta funtion, which acts as the learning rate of the network.
  • N: Number of output neurons of the SOM, this is the number of groups the algorithm will use when classifying the atoms in the sample.
  • Whether to use batched or serial learning for the training process.
  • B: Batch size, in case the training is performed with batched learning.

The input file must be inside the same folder as the main.py file. Furthermore, the input file passed to the algorithm must have the LAMMPS dump format, or at least have a line with the following format:

ITEM: ATOMS id x y z feature_1 feature_2 ...

To run the software, simply execute the following command in a terminal (from the folder that contains the files and with a python environment activated):

python3 main.py

Check the software report in the general repository for more information: https://github.com/rambo1309/SOM_for_Atomistic_Samples_GeneralRepo

Dependencies:

This software is written in Python 3.8.8 and uses the following external libraries:

  • NumPy 1.20.1
  • Pandas 1.2.4

(Both packages come with the basic installation of Anaconda)

What's new in V2:

Its important to clarify that V2 of the software isn't designed to replace V1, but to be used when multiple files need to be analysed sequentially with a network that has been trained using a specific training file. It is recommended for the user to first use V1 to explore the results given by different parameters and features of the sample, and then to use V2 to get consistent results for a series of samples. Another reason why V1 will be continually updated is its command-line interactive interface, which allows the users to implement the algorithm without ever having to open and edit a python file.

The most fundamental change with respect to V.1 is the way of communicating with the program. While V.1 uses an interactive command-line interface, V.2 requests an input_params.py file that contains a dictionary specifying the parameters and sample files for the algorithm.

Check the report file in the repository for a complete description of the changes made in the software.

Updates:

Currently working on giving the user the option to change the learning rate funtion, eta, with a few alternatives such as a power-law and an exponential decrease. Another important issue still to be addressed is the training time of the SOM.

Owner
Franco Aquistapace
Undergraduate Physics student at FCEN, UNCuyo
Franco Aquistapace
icepickle is to allow a safe way to serialize and deserialize linear scikit-learn models

icepickle It's a cooler way to store simple linear models. The goal of icepickle is to allow a safe way to serialize and deserialize linear scikit-lea

vincent d warmerdam 24 Dec 09, 2022
Examples and code for the Practical Machine Learning workshop series

Practical Machine Learning Workshop Series Practical Machine Learning for Quantitative Finance Post conference workshop at the WBS Spring Conference D

CompatibL 21 Jun 25, 2022
ClearML - Auto-Magical Suite of tools to streamline your ML workflow. Experiment Manager, MLOps and Data-Management

ClearML - Auto-Magical Suite of tools to streamline your ML workflow Experiment Manager, MLOps and Data-Management ClearML Formerly known as Allegro T

ClearML 4k Jan 09, 2023
Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020)

Karate Club is an unsupervised machine learning extension library for NetworkX. Please look at the Documentation, relevant Paper, Promo Video, and Ext

Benedek Rozemberczki 1.8k Jan 03, 2023
Probabilistic programming framework that facilitates objective model selection for time-varying parameter models.

Time series analysis today is an important cornerstone of quantitative science in many disciplines, including natural and life sciences as well as eco

Christoph Mark 129 Dec 24, 2022
Pandas Machine Learning and Quant Finance Library Collection

Pandas Machine Learning and Quant Finance Library Collection

148 Dec 07, 2022
🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code

Knock Knock A small library to get a notification when your training is complete or when it crashes during the process with two additional lines of co

Hugging Face 2.5k Jan 07, 2023
Sleep stages are classified with the help of ML. We have used 4 different ML algorithms (SVM, KNN, RF, NN) to demonstrate them

Sleep stages are classified with the help of ML. We have used 4 different ML algorithms (SVM, KNN, RF, NN) to demonstrate them.

Anirudh Edpuganti 3 Apr 03, 2022
Library of Stan Models for Survival Analysis

survivalstan: Survival Models in Stan author: Jacki Novik Overview Library of Stan Models for Survival Analysis Features: Variety of standard survival

Hammer Lab 122 Jan 06, 2023
MLOps pipeline project using Amazon SageMaker Pipelines

This project shows steps to build an end to end MLOps architecture that covers data prep, model training, realtime and batch inference, build model registry, track lineage of artifacts and model drif

AWS Samples 3 Sep 16, 2022
An AutoML survey focusing on practical systems.

This project is a community effort in constructing and maintaining an up-to-date beginner-friendly introduction to AutoML, focusing on practical systems. AutoML is a big field, and continues to grow

AutoGOAL 16 Aug 14, 2022
Iterative stochastic gradient descent (SGD) linear regressor with regularization

SGD-Linear-Regressor Iterative stochastic gradient descent (SGD) linear regressor with regularization Dataset: Kaggle “Graduate Admission 2” https://w

Zechen Ma 1 Oct 29, 2021
Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification

Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification Introduction. This package includes the pyth

5 Dec 06, 2022
Pyomo is an object-oriented algebraic modeling language in Python for structured optimization problems.

Pyomo is a Python-based open-source software package that supports a diverse set of optimization capabilities for formulating and analyzing optimization models. Pyomo can be used to define symbolic p

Pyomo 1.4k Dec 28, 2022
Distributed Computing for AI Made Simple

Project Home Blog Documents Paper Media Coverage Join Fiber users email list Uber Open Source 997 Dec 30, 2022

Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets

Interactive Web App with Streamlit and Scikit-learn that applies different Classification algorithms to popular datasets Datasets Used: Iris dataset,

Samrat Mitra 2 Nov 18, 2021
Official code for HH-VAEM

HH-VAEM This repository contains the official Pytorch implementation of the Hierarchical Hamiltonian VAE for Mixed-type Data (HH-VAEM) model and the s

Ignacio Peis 8 Nov 30, 2022
The unified machine learning framework, enabling framework-agnostic functions, layers and libraries.

The unified machine learning framework, enabling framework-agnostic functions, layers and libraries. Contents Overview In a Nutshell Where Next? Overv

Ivy 8.2k Dec 31, 2022
MICOM is a Python package for metabolic modeling of microbial communities

Welcome MICOM is a Python package for metabolic modeling of microbial communities currently developed in the Gibbons Lab at the Institute for Systems

57 Dec 21, 2022
A unified framework for machine learning with time series

Welcome to sktime A unified framework for machine learning with time series We provide specialized time series algorithms and scikit-learn compatible

The Alan Turing Institute 6k Jan 06, 2023