Federated Learning
The application in this folder builds an AI model using Federated Learning. Unlike the other applications, which train on a centralized dataset, this version trains the model on multiple datasets hosted at different institutions without transferring them, which favours the privacy of the data collected by each institution. In this case, each institution trains a RandomForest model on the dataset available locally. Each trained model is then transferred to a central institution, which merges all the models into a single one.
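As an illustration of the merging step, the sketch below treats each institution's forest as a list of trees and builds the central model by pooling all trees and predicting by majority vote. This is only one plausible merging strategy, not necessarily the one used by this application. The names make_stump, merge_forests and predict_majority are hypothetical, and the single-split stumps stand in for real trained trees.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Assumption: a "tree" is any callable mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], int]


def make_stump(feature: int, threshold: float) -> Tree:
    """Hypothetical stand-in for a trained tree: a single-split decision stump."""
    def predict(x: Sequence[float]) -> int:
        return 1 if x[feature] > threshold else 0
    return predict


def merge_forests(local_forests: List[List[Tree]]) -> List[Tree]:
    """Pool the trees of every institution into one forest (no raw data is moved)."""
    merged: List[Tree] = []
    for forest in local_forests:
        merged.extend(forest)
    return merged


def predict_majority(forest: List[Tree], x: Sequence[float]) -> int:
    """Return the label that receives the most votes across all trees."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Two institutions with locally "trained" forests of different sizes.
forest_a = [make_stump(0, 0.5), make_stump(1, 0.2)]
forest_b = [make_stump(0, 0.4)]

global_forest = merge_forests([forest_a, forest_b])
```

With these toy forests, predict_majority(global_forest, [0.9, 0.1]) collects the votes 1, 0 and 1 from the three stumps and returns class 1. Only the trees travel to the central institution; the training samples stay where they were collected.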
Federated Learning is implemented here as a COMPSs application deployed on several agents running on different hosts. Each agent holds a part of the dataset and uses it to train a local model. One of the agents acts as a server and delegates the execution of each training task to the agent where the corresponding data is located. The training task can be implemented with different models, for example the Random Forest implementation of dislib. The server then computes a prediction for each model, averages the parameters, and updates the centralised model.
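The parameter-averaging step can be sketched as follows for models whose parameters are plain numeric vectors. This is a minimal sketch in pure Python, not the COMPSs or dislib code of this application. The function name federated_average is hypothetical, and weighting each agent by its number of training samples is an assumption (an unweighted mean would be the other obvious choice).

```python
from typing import List, Sequence


def federated_average(
    local_params: List[Sequence[float]],
    sample_counts: List[int],
) -> List[float]:
    """Average per-agent parameter vectors, weighted by local sample count.

    Assumption: every agent reports a vector of the same length together
    with the number of samples it trained on.
    """
    if not local_params or len(local_params) != len(sample_counts):
        raise ValueError("need one sample count per parameter vector")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(local_params[0])
    averaged = [0.0] * dim
    for params, n in zip(local_params, sample_counts):
        if len(params) != dim:
            raise ValueError("all parameter vectors must have the same length")
        for i, value in enumerate(params):
            # Each agent contributes proportionally to its share of the data.
            averaged[i] += value * n / total
    return averaged


# Two agents: the second one trained on three times as many samples.
global_params = federated_average([[1.0, 2.0], [3.0, 4.0]], [100, 300])
```

In this example the second agent has three times the weight of the first, so the centralised parameters end up closer to its values. The server only needs the parameter vectors and the sample counts, never the samples themselves.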