ADIOS - I-NERGY Open Dataset
This asset is the open dataset used for the experiment and to develop the proof of concept. It is an open dataset of SCADA data for Energy Management Systems.
The data covers 30 days of events as logged by an Energy Management System (EMS) in the United States of America (USA). It is provided in .csv format, and each event is represented by a single row describing its characteristics (features). The features are:
- EventId: numerical identifier
- EventTimeStamp: date and time of the event
- SCADA_Category: type of event
- TOC: column to be ignored
- AOR: area of responsibility
- Priority_Code: numerical values from 1 to 8, where 1 is the highest priority and 8 the lowest
- Substation: name of the substation where the event occurred (anonymized)
- DeviceType: type of the device where the event occurred
- Device: name of the device where the event occurred
- Event_message: raw text providing information about the event
There are about 5 million events in the provided .csv file.
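The feature list above can be turned into a small loader. The following is a minimal sketch using only the Python standard library. It assumes that the files are comma-delimited and that the header row uses exactly the feature names listed above; neither is confirmed by this description. The function name `load_events` and the sample rows are hypothetical placeholders, not values from the dataset. The timestamp is left as a string because its format is not stated here.

```python
import csv
import io

# Column names as listed in the feature description.
COLUMNS = [
    "EventId", "EventTimeStamp", "SCADA_Category", "TOC", "AOR",
    "Priority_Code", "Substation", "DeviceType", "Device", "Event_message",
]


def load_events(source):
    """Yield one dict per event, dropping TOC and casting Priority_Code to int.

    Assumption: comma-delimited input whose header row matches COLUMNS.
    """
    reader = csv.DictReader(source)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    for row in reader:
        row.pop("TOC", None)  # the description marks this column as ignorable
        priority = int(row["Priority_Code"])
        if not 1 <= priority <= 8:
            raise ValueError(f"Priority_Code out of range: {priority}")
        row["Priority_Code"] = priority
        yield row


# Made-up placeholder rows, only to show the call; not taken from the dataset.
SAMPLE = io.StringIO(
    "EventId,EventTimeStamp,SCADA_Category,TOC,AOR,Priority_Code,"
    "Substation,DeviceType,Device,Event_message\n"
    "1,<timestamp>,<category>,x,<aor>,3,SUB_A,<type>,DEV_1,placeholder message\n"
    "2,<timestamp>,<category>,x,<aor>,1,SUB_B,<type>,DEV_2,another placeholder\n"
)

events = list(load_events(SAMPLE))
# 1 is the highest priority, so an ascending sort puts the most urgent first.
events.sort(key=lambda e: e["Priority_Code"])
print([e["EventId"] for e in events])
```

For a real file, the same function could be called as `with open("10K.csv", newline="") as f: events = list(load_events(f))`. Because `load_events` is a generator, the larger files can also be processed row by row instead of being loaded into memory at once.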
The asset contains two URLs: Open Dataset samples and parser (importing) templates. The Open Dataset samples are:
- 10K.csv: 10,000 rows of EMS data, randomly extracted as a sample for testing purposes
- 500k.csv: 500,000 rows of EMS data, randomly extracted as a sample for testing purposes
- 50k.csv: 50,000 rows of EMS data, randomly extracted as a sample for testing purposes
- tagged_1K.csv: 1,000 rows of EMS data tagged by control system operators
- tagged_3K.csv: 3,000 rows of EMS data tagged by control system operators
- tagged_complete.csv: the full dataset, with alarms tagged by control system operators
The tagging is structured in four main labels:
- INFORMATION: sensor reading or status
- NORMAL: the system is working as it should
- WARNING: the system is reaching the operational limits
- CRITICAL: there is a fault in the system
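The four labels can be modelled as an enumeration so that unexpected values are caught early, for example before ML training. This is a minimal sketch under stated assumptions: the name of the column that holds the tag in the tagged files is not given in this description, so `"Label"` below is a hypothetical placeholder, as are the function name `label_counts` and the sample rows.

```python
from collections import Counter
from enum import Enum


class Label(Enum):
    """The four tagging labels described in this asset."""
    INFORMATION = "INFORMATION"  # sensor reading or status
    NORMAL = "NORMAL"            # the system is working as it should
    WARNING = "WARNING"          # the system is reaching the operational limits
    CRITICAL = "CRITICAL"        # there is a fault in the system


def label_counts(rows, label_column="Label"):
    """Count rows per label; raises ValueError on an unknown label string.

    Assumption: `label_column` is a hypothetical column name, adjust it to
    whatever the tagged files actually use.
    """
    counts = Counter(Label(row[label_column].strip().upper()) for row in rows)
    # Report every label, including those that never occur.
    return {label: counts.get(label, 0) for label in Label}


# Made-up rows, only to show the call; not taken from the dataset.
rows = [
    {"Label": "WARNING"},
    {"Label": "normal"},
    {"Label": "WARNING"},
    {"Label": "CRITICAL"},
]
counts = label_counts(rows)
for label, n in counts.items():
    print(label.name, n)
```

Checking the label distribution this way is a quick sanity test before training, since a heavily skewed distribution may call for resampling or class weights.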
The parser templates (importing templates) are:
- tagged_template.json: imports the tagged dataset for ML training
- full_template.json: imports the full dataset