iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms.

Overview


Ax

Description

iAWE is a wonderful dataset for those of us who work on Non-Intrusive Load Monitoring (NILM) algorithms. You can find its main page and description via this link. If you are familiar with NILM-TK API, you probably know that you can work with iAWE hdf5 data file in NILM-TK. However I faced some problems that convinced me to Not use NILM-TK and iAWE hdf5 datafile. Instead, I decided to use the iAWE appliance consumption CSV files and preprocess them myself. So if you have problems with NILM-TK API and iAWE hdf5 data file too, this piece of code may help you to prepare 11 appliance consumption data for your NILM algorithm.

Installation

  • First, download the iAWE dataset using this link (also available on iAWE page!).
  • Download the electricity.tar.gz file.


Ax

  • Download the repo and all its folders.
  • Unzip the electricity.tar.gz and copy all 12 CSV file (plus the labels file into the electricity folder of the downloaded repo.
  • Now everythng is ready for you to start the data preprocessing using the main.py file. But before running the code let me show you what kind of problems we had with the original iAWE hdf5 file.

What problems did we solve?

Well, to be honest NILM-TK documentation is not very clear! If you try to use the hdf5 datafile of the datasets that works with NILM-TK, soon you will admit it. Sometimes you find the the similiar questions on stack overflow but when you try them, they simply don't work due to some updates in NILM-TK (undocumented maybe!?). So, having full control on the data was my main incentive to redo the data preprocessing by my self. You see 12 CSV files in your downloaded files. They belong to:

  • main meter (1)
  • main meter (2)
  • fridge
  • air conditioner (1)
  • air conditioner (2)
  • washing machine
  • laptop
  • iron
  • kitchen outlets
  • television
  • water filter
  • water motor The publisher of iAWE dataset has recommended to ignore the water motor CSV file as it is not accurate (so did we!). Each CSV file consists of timestamp, W, VAR, VA, f, V, PF and A columns. timestamp can be read and converted to read time and date by Python libraries. The publisher of dataset have collected time stamps to reduce the size of final data files which means there is no sampling when the appliances are not consuming power. On the other hand the start time of different appliances measurement is not the same so the length, start and end of most csv files are different. When you plot it in NILM-TK it is fine becuase it reads the timestamps and ignores the NA time steps. However when you want to feed this data into your algorithm it will be a problem which needs data preprocessing. To better understand the problem when using the raw data in iAWE dataset, I've plotted W (active power) of the air conditioner which is CSV file number 4.


AC

As you see, when youplot it in Python the NA timestamp will be plotted as a direct line between last available data and the next available one. It is neither human readable (to some extents!) nor NILM algorithm readable. In fact what your NILM algorithm will be fed with is the series of these values because your algorithm has nothing to do with timestamps! See this is what NILM algorithm sees as the AC power consumption:


AC WO

Now to make it both human readable and NILM algorithm readable, I did as below: (I've commented the code so you can see what is happening in every part of the code)

  • Loaded all CSV files in a dictionary of Dataframes with CSV file orders
  • Measured the lowes and highest timestamp in order to know the length of the measurement period (they have different lengthes!)
  • Created a big dataframe of zeros with from lowest timestamp to the highest one as its index
  • Used the update method on dataframes to transfer the values of dataframes to the big dataframes of zeros (Now all of them have the same length)
  • Putting all dfs into a dictionary of dataframes
  • Casting all the dataframes into the efficient period of sampling (Because now we know which part of sampling is useless)
  • Removing NAN values
  • Dropping unwanted columns
  • Filling NA values with last available value in dataframes
  • Saving all the dataframes as CSV files in the prepared data folder
  • Done!


AC WO

Conclusion

Basically, what we have here after running this code is 11 CSV files of W, VAR, VA, f, V, PF and A for 11 different meters. Prepared CSV file are all of the same length without NAN or NA values which are ready to be fed to any NILM algorithm. Despite the fact that I've done these changes to iAWE dataset, I'm sure the publishers of this dataset have much better solution via NILM-TK to have such an output. However due to lack of documentation or changes in their code I prefered to do this data preprocessing myself. Hope you enjoy it!

Owner
Mozaffar Etezadifar
NILM and RL researcher @ Polytechnique Montreal
Mozaffar Etezadifar
A priority of preferences for teacher assignment problem

Genetic-Algorithm-for-Assignment-Problem A priority of preferences for teacher assignment problem Keywords k-partition; clustering; education 4.0 Abst

hades 2 Oct 31, 2022
FPE - Format Preserving Encryption with FF3 in Python

ff3 - Format Preserving Encryption in Python An implementation of the NIST approved FF3 and FF3-1 Format Preserving Encryption (FPE) algorithms in Pyt

Privacy Logistics 42 Dec 16, 2022
This repository is not maintained

This repository is no longer maintained, but is being kept around for educational purposes. If you want a more complete algorithms repo check out: htt

Nic Young 2.8k Dec 30, 2022
Sorting Algorithm Visualiser using pygame

SortingVisualiser Sorting Algorithm Visualiser using pygame Features Visualisation of some traditional sorting algorithms like quicksort and bubblesor

4 Sep 05, 2021
The test data, code and detailed description of the AW t-SNE algorithm

AW-t-SNE The test data, code and result of the AW t-SNE algorithm Structure of the folder Datasets: This folder contains two datasets, the MNIST datas

1 Mar 09, 2022
Implementation for Evolution of Strategies for Cooperation

Moraliser Implementation for Evolution of Strategies for Cooperation Dependencies You will need a python3 (= 3.8) environment to run the code. Before

1 Dec 21, 2021
Algorithm and Structured Programming course project for the first semester of the Internet Systems course at IFPB

Algorithm and Structured Programming course project for the first semester of the Internet Systems course at IFPB

Gabriel Macaúbas 3 May 21, 2022
A litle algorithm that i made for transform a picture in a spreadsheet.

PicsToSheets How it works? It is an algorithm designed to transform an image into a spreadsheet file. this converts image pixels to color cells of she

Guilherme de Oliveira 1 Nov 12, 2021
Implemented page rank program

Page Rank Implemented page rank program based on fact that a website is more important if it is linked to by other important websites using recursive

Vaibhaw 6 Aug 24, 2022
A minimal implementation of the IQRM interference flagging algorithm for radio pulsar and transient searches

A minimal implementation of the IQRM interference flagging algorithm for radio pulsar and transient searches. This module only provides the algorithm that infers a channel mask from some spectral sta

Vincent Morello 6 Nov 29, 2022
A GUI visualization of QuickSort algorithm

QQuickSort A simple GUI visualization of QuickSort algorithm. It only uses PySide6, it does not have any other external dependency. How to run Install

Jaime R. 2 Dec 24, 2021
This is the code repository for 40 Algorithms Every Programmer Should Know , published by Packt.

40 Algorithms Every Programmer Should Know, published by Packt

Packt 721 Jan 02, 2023
Algorithmic trading backtest and optimization examples using order book imbalances. (bitcoin, cryptocurrency, bitmex)

Algorithmic trading backtest and optimization examples using order book imbalances. (bitcoin, cryptocurrency, bitmex)

172 Dec 21, 2022
frePPLe - open source supply chain planning

frePPLe Open source supply chain planning FrePPLe is an easy-to-use and easy-to-implement open source advanced planning and scheduling tool for manufa

frePPLe 385 Jan 06, 2023
Algorithms and data structures for educational, demonstrational and experimental purposes.

Algorithms and Data Structures (ands) Introduction This project was created for personal use mostly while studying for an exam (starting in the month

50 Dec 06, 2022
Pathfinding algorithm based on A*

Pathfinding V1 What is pathfindingV1 ? This program is my very first path finding program, using python and turtle for graphic rendering. How is it wo

Yan'D 6 May 26, 2022
🧬 Performant Evolutionary Algorithms For Python with Ray support

🧬 Performant Evolutionary Algorithms For Python with Ray support

Nathan 49 Oct 20, 2022
Using A * search algorithm and GBFS search algorithm to solve the Romanian problem

Romanian-problem-using-Astar-and-GBFS Using A * search algorithm and GBFS search algorithm to solve the Romanian problem Romanian problem: The agent i

Mahdi Hassanzadeh 6 Nov 22, 2022
BCI datasets and algorithms

Brainda Welcome! First and foremost, Welcome! Thank you for visiting the Brainda repository which was initially released at this repo and reorganized

52 Jan 04, 2023
This repository provides some codes to demonstrate several variants of Markov-Chain-Monte-Carlo (MCMC) Algorithms.

Demo-of-MCMC These files are based on the class materials of AEROSP 567 taught by Prof. Alex Gorodetsky at University of Michigan. Author: Hung-Hsiang

Sean 1 Feb 05, 2022