
AI4CZC model building - Session 1

To provide accurate measures of Cross Zonal Capacity (CZC), a key parameter is the net position (the difference between generation and load in a zone). Here we describe our supervised model, which predicts the net position for the next 48 hours and achieves an MAE of 75.8 and an R2 of 0.7309.

Categories

Developed by
Business Category
Energy
Technical Category
Machine learning

Context

In high-voltage electricity networks, the Cross Zonal Capacity (CZC) is the transmission capacity between two bidding zones, that is, the maximum amount of energy that may be transferred between them. Its management is a critical aspect of the European transmission system operators' (TSOs') efforts to harmonize electricity balancing services, ensure security of supply, and foster efficiency in the energy market. However, the lack of precise CZC estimates leads to suboptimal allocation and utilization of the available cross-zonal capacity.

In this context, we worked on the AI4CZC project to provide a tool (the AI4CZC web platform) that estimates CZC 48 hours ahead. To estimate CZC accurately, we predict the net position, which is the difference between generation and load in a given geographical area.

All training was done using Inceptive MLE, a part of the Inceptive Igloo private platform for building and deploying machine learning models.

Dataset

For this, we built a dataset by retrieving data from the ENTSO-E Transparency Platform using the ENTSOEDataretrieval library. The data cover January 2019 to June 2023.

The dataset included these features:

  • Time stamp
  • Actual, day ahead and week ahead loads of Montenegro, Serbia, Italy CS, Bosnia and Kosovo
  • Outages of transmission units between Montenegro and Serbia, Albania, Bosnia, Italy and Kosovo
  • For each outage, whether it was scheduled or not
  • Generation outages in Montenegro, Serbia, Kosovo, Bosnia and Italy CS
  • Total nominal generation power under outage for Serbia, Montenegro, Italy, Bosnia and Kosovo
  • Generation of Serbia, Bosnia, Italy Center-South, Montenegro and Kosovo by production type
  • Physical flows between Montenegro and Albania, Serbia, Bosnia, Italy CS and Kosovo (both sides)
  • Week ahead transfer capacity estimation between Montenegro and Albania, Serbia, Kosovo, Bosnia, Italy CS 
  • Montenegro net position

Methodology

We split the dataset chronologically: the first 70% of the data forms the training set and the remaining 30% forms the test set. The test set, unseen by the model during training, is used to estimate the model's performance (see the results section).
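
As an illustration of the chronological split described above, here is a minimal sketch in plain Python. The function name `chronological_split` and the toy data are assumptions made for this example; they are not taken from the project code.

```python
def chronological_split(rows, train_fraction=0.7):
    """Split time-ordered rows into (train, test) without shuffling.

    The rows are assumed to be sorted by timestamp already, so the test
    set contains only data that is later than every training row.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy hourly series: (hour index, net position) pairs, sorted by time.
rows = [(hour, float(hour % 24)) for hour in range(100)]
train, test = chronological_split(rows)
print(len(train), len(test))  # 70 30
```

Keeping the temporal order (rather than sampling at random) means the test performance reflects predictions on a period the model has never seen.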

Many data engineering approaches were tried. We describe here the one that gave the best results.

For the temporal representation of the data, at each time T we take the past 12 entries (equivalent to 12 hours) as input and predict the next 48 hours (48 outputs). With this method, the current and past values of the net position are also used to predict its future values.
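
The windowing described above could be sketched as follows. This is an illustrative example only: the function name `make_windows` and the convention that the net position is the first value of each hourly row are assumptions of the sketch, not details from the project.

```python
def make_windows(series, n_past=12, n_future=48):
    """Build (inputs, targets) pairs from an hourly series.

    series: list of per-hour feature rows. In this sketch the net
    position is assumed to be the first entry of each row.
    """
    inputs, targets = [], []
    last_start = len(series) - n_past - n_future
    for start in range(last_start + 1):
        past = series[start:start + n_past]
        future = series[start + n_past:start + n_past + n_future]
        # Flatten the 12 past rows into a single feature vector.
        inputs.append([value for row in past for value in row])
        # Target: the net position for each of the next 48 hours.
        targets.append([row[0] for row in future])
    return inputs, targets


# Toy data: 100 hours with 3 features per hour.
series = [[float(t), float(t) * 2.0, 1.0] for t in range(100)]
X, y = make_windows(series)
print(len(X), len(X[0]), len(y[0]))  # 41 36 48
```

With about 220 raw features per hour, the same flattening gives the roughly 2600 inputs per sample mentioned in the text.
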
As the number of features is quite large, about 2600 (~220 * 12), we perform a feature selection before model training. We select the 100 features most correlated with the mean of the net position over the next 48h. These features were:

  • Montenegro net position (All 12 entries)
  • Montenegro generation (All 12 entries)
  • Montenegro hydro reservoir generation (All 12 entries)
  • Montenegro brown coal generation (All 12 entries)
  • Italy Center-South hydro run-of-river generation (All 12 entries)
  • Bosnia hydro reservoir generation (All 12 entries)
  • Serbia generation (All 12 entries)
  • Serbia week-ahead load forecast (Max value) (All 12 entries)
  • Montenegro outage nominal capacity (All 12 entries)
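
The correlation-based selection described above can be sketched as below. Several points are assumptions of this example rather than facts from the project: the use of Pearson correlation, ranking by absolute value, and the helper names `pearson` and `select_top_features`. The source also does not say which split the correlations were computed on; computing them on the training set only is the usual way to avoid leaking test information.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((v - mean_y) ** 2 for v in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_features(X, y, k=100):
    """Return the indices of the k columns of X most correlated (in
    absolute value, an assumption) with the mean of each target row."""
    target_mean = [sum(row) / len(row) for row in y]
    scores = []
    for col in range(len(X[0])):
        column = [row[col] for row in X]
        scores.append((abs(pearson(column, target_mean)), col))
    scores.sort(reverse=True)
    return sorted(col for _, col in scores[:k])


# Toy data: column 0 is informative, column 1 is constant,
# column 2 is a deterministic pseudo-noise pattern.
X = [[float(t), 5.0, float((t * 7) % 11)] for t in range(50)]
y = [[2.0 * t + h for h in range(48)] for t in range(50)]
print(select_top_features(X, y, k=2))  # [0, 2]
```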

Models

We tested several models, including random forests, linear regressions, and many neural network architectures. Many parameters were tested for each model.

The best model was a neural network with this sequential architecture :

  1. Dense layer with 100 neurons, identity activation function, and Gaussian noise on the output
  2. Dense layer with 4 neurons, sigmoid activation function
  3. Dense layer with 100 neurons, x*sigmoid(x) activation function (also known as swish or SiLU)
  4. Output layer, trained with a loss based on mean squared error
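
To make the layer sequence above concrete, here is a minimal forward-pass sketch in plain Python. It is not the Inceptive MLE implementation. The following are assumptions of this example: 100 selected input features, a linear output layer with 48 units (one per predicted hour), the weight initialization, and the noise standard deviation, none of which are specified in the text.

```python
import math
import random


def dense(inputs, weights, biases):
    """Affine layer: one output per (weight row, bias) pair."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x):
    """x * sigmoid(x), the activation of the third layer."""
    return x * sigmoid(x)


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an arbitrary choice)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_model(n_features=100, n_outputs=48, seed=0):
    """Layer sizes 100 -> 4 -> 100 -> output, as listed above."""
    rng = random.Random(seed)
    sizes = [n_features, 100, 4, 100, n_outputs]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def forward(model, x, noise_std=0.0, rng=None):
    """Forward pass; set noise_std > 0 to mimic training-time noise."""
    (w1, b1), (w2, b2), (w3, b3), (w4, b4) = model
    h = dense(x, w1, b1)  # layer 1: identity activation
    if noise_std > 0.0:
        rng = rng or random
        h = [v + rng.gauss(0.0, noise_std) for v in h]  # Gaussian noise
    h = [sigmoid(v) for v in dense(h, w2, b2)]  # layer 2: 4 neurons
    h = [swish(v) for v in dense(h, w3, b3)]    # layer 3: 100 neurons
    return dense(h, w4, b4)                     # linear output


def mse(pred, target):
    """Mean squared error, the basis of the training loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


model = build_model()
out = forward(model, [0.5] * 100)
print(len(out))  # 48
```

The 4-neuron layer acts as a narrow bottleneck between the two 100-neuron layers. Training (backpropagation and the optimizer) is left out of this sketch.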

Results and conclusion

Our best model results: 

  • MAE: 75.8
  • R2: 0.7309
  • sMAPE: 78.28%
  • RMSE: 100.49

 

We could not compute the classical MAPE metric, as the target takes negative, positive, and zero values (MAPE divides by the target, so it is undefined at zero).
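
For reference, the four reported metrics can be computed as in the sketch below. The sMAPE definition used here (absolute error divided by the mean of the absolute target and absolute prediction, with a zero term when both are zero) is an assumption: several sMAPE variants exist and the text does not say which one was used. The toy values are invented for the example.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (one common variant, range 0 to 200)."""
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        # Both values zero: perfect prediction, so the term is zero.
        terms.append(0.0 if denom == 0.0 else abs(t - p) / denom)
    return 100.0 * sum(terms) / len(terms)


# Toy target with negative, zero and positive values.
y_true = [-100.0, 0.0, 50.0, 200.0]
y_pred = [-80.0, 10.0, 50.0, 150.0]
print(mae(y_true, y_pred))              # 20.0
print(round(rmse(y_true, y_pred), 3))   # 27.386
print(round(r2(y_true, y_pred), 3))     # 0.936
print(round(smape(y_true, y_pred), 2))  # 62.7
```

In this toy example the zero-valued target alone contributes a 200% term to sMAPE, which shows how targets at or near zero inflate the metric even when the absolute error is small.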

These results seem promising, as only electricity data was used. But they also show that the future net position depends not only on current electricity parameters but also on other factors. Some paths for improvement are:

  • Find the other factors that affect the net position and include them in the data.
  • Better feature selection. There are many features, especially when the time horizon is taken into account, but not much historical data. We do not believe that adding more data would help, as the infrastructure and the load have changed. But we think that a smarter feature selection may improve the results.

To follow these paths, a second session was carried out, including 3 years of meteorological data and a handcrafted feature selection.

This work is part of a project that has received funding from the European Union’s Horizon 2020 research and innovation program under I-NERGY grant agreement No 101016508.
