s-X-AIPI self-X Data Exploration in R
self-X autonomic supervised and unsupervised feature selection
The "selfX" R package implements several functions that perform data exploration activities, specifically feature selection, in a more autonomic way. It is released as open-source software and offered to the R community for use and improvement.
The s-X-AIPI project has received funding from the European Union's Horizon Europe research and innovation programme under grant agreement Nº 101058715 (s-X-AIPI).
Feature selection in data mining reduces the dimensionality of the data before the modelling algorithms are applied. It enhances the quality and efficiency of clustering and predictive models by letting them focus on the most relevant and significant features for the given problem. As a result, the quality of the results improves, their visualization and interpretation become easier, and the problems associated with high-dimensional data sets are alleviated.
With the aim of making feature selection more autonomic, an ad-hoc procedure was defined as part of the Asphalt Use Case within the s-X-AIPI project and subsequently adapted for general use. The approach can be summarized as follows:
- Initially, well-known techniques (PCA/FA for unsupervised and RELIEF for supervised approaches) are used to select the most important features under different criteria.
- Subsequently, an ordered weighted averaging (OWA) operator is applied to select the three most relevant features.
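
The two steps above can be illustrated with a minimal base-R sketch of the unsupervised case. It is not the selfX package API: the function names `owa`, `rescale01` and `select_top_features`, the three PCA-based criteria, the choice of two retained components, and the OWA weights are all assumptions made for illustration. The supervised RELIEF step is left out because it needs an additional package.

```r
# Hypothetical helper: ordered weighted averaging (OWA).
# The scores are sorted in decreasing order and then combined with
# position-based weights that must sum to 1.
owa <- function(scores, weights) {
  stopifnot(length(scores) == length(weights),
            abs(sum(weights) - 1) < 1e-8)
  sum(sort(scores, decreasing = TRUE) * weights)
}

# Hypothetical helper: rescale a vector to [0, 1] so criteria are comparable.
rescale01 <- function(x) {
  rng <- range(x)
  if (rng[2] == rng[1]) return(rep(0, length(x)))
  (x - rng[1]) / (rng[2] - rng[1])
}

# Illustrative unsupervised selection: score every feature under three
# PCA-based criteria, aggregate the scores with OWA, keep the top n_top.
select_top_features <- function(data, weights = c(0.5, 0.3, 0.2), n_top = 3) {
  pca <- prcomp(data, center = TRUE, scale. = TRUE)
  eig <- pca$sdev^2          # eigenvalues of the correlation matrix
  rot <- pca$rotation        # features in rows, components in columns

  # Assumed criteria (one column each):
  #   pc1, pc2    : absolute loading on the first / second component
  #   communality : variance of the feature explained by the first two
  #                 components
  criteria <- cbind(
    pc1         = abs(rot[, 1]),
    pc2         = abs(rot[, 2]),
    communality = as.vector(rot[, 1:2]^2 %*% eig[1:2])
  )
  criteria <- apply(criteria, 2, rescale01)
  rownames(criteria) <- colnames(data)

  # One OWA score per feature, computed across its three criteria.
  scores <- apply(criteria, 1, owa, weights = weights)
  names(sort(scores, decreasing = TRUE))[seq_len(n_top)]
}

top3 <- select_top_features(iris[, 1:4])
print(top3)
```

With weights such as `c(0.5, 0.3, 0.2)` the OWA score leans towards the criteria under which a feature scores best, whereas `c(1, 0, 0)` reduces to the maximum and equal weights reduce to the arithmetic mean. The weights therefore control how optimistic the aggregation is.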