AI-SPRINT supports the joint training of machine learning (ML) models across entities in the computing continuum without requiring them to exchange data directly. Within this framework, the Secure Generative Data Exchange (SGDE) has been developed to train, collect, and share privacy-aware data generators. SGDE tackles the data collection problem by training data generators directly on edge devices, where the data is collected. The generators are trained under strict privacy-preserving protocols so that they cannot reconstruct data that would compromise user privacy, even in the presence of a malicious agent. SGDE also defines a protocol for sharing these generators: AI developers who need data for a specific task can access a repository of generators and use them to build a large synthetic dataset. Such datasets can then be utilised to train ML models without direct access to sensitive real-world data.
SGDE addresses data acquisition by training data generators directly on edge devices, where the data is collected, under strict privacy-preserving protocols. The primary objective is to create and utilise synthetic data in place of users' authentic data, so that malicious parties cannot obtain private information from it. SGDE also establishes a mechanism for sharing the generators, which gives AI developers access to a collection of generators from which they can compile a large synthetic dataset for a specific task. With SGDE, users can therefore improve their ML models beyond what is achievable with locally sourced data alone.
Within the AI-SPRINT ecosystem, SGDE extends the set of methods available to AI developers for obtaining suitable data to train ML models. It is the first tool capable of training, collecting, and redistributing data generators in a privacy-conscious manner, with the generators trained on data gathered on users' edge devices. SGDE addresses the same fundamental problem as standard federated learning, but its approach differs significantly. Rather than devising a new method for privacy-preserving training of ML models, SGDE sidesteps the privacy concerns by training the models on synthetic data instead. This allows the data to be utilised effectively while safeguarding user privacy.
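The overall workflow (train a generator on each edge device, share the generators through a repository, sample a synthetic dataset, train a model on it) can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration and not the SGDE implementation or API: the names `GaussianGenerator`, `train_generator_on_device`, and `build_synthetic_dataset` are hypothetical, a one-dimensional Gaussian stands in for a real generative model, and no privacy-preserving training protocol is implemented here.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass
class GaussianGenerator:
    """Toy stand-in for a data generator (hypothetical, not SGDE's model)."""
    label: int
    mean: float
    stdev: float

    def sample(self, n, rng):
        # Draw n synthetic (feature, label) pairs from the fitted Gaussian.
        return [(rng.gauss(self.mean, self.stdev), self.label) for _ in range(n)]


def train_generator_on_device(local_values, label):
    """Runs on the edge device; only the fitted generator is shared.

    NOTE: a real system would apply a privacy-preserving training
    protocol at this step; this sketch does not.
    """
    return GaussianGenerator(
        label=label,
        mean=statistics.fmean(local_values),
        stdev=statistics.stdev(local_values),
    )


def build_synthetic_dataset(repository, samples_per_generator, seed=0):
    """Developer side: sample from every generator in the repository."""
    rng = random.Random(seed)
    dataset = []
    for generator in repository:
        dataset.extend(generator.sample(samples_per_generator, rng))
    return dataset


# Two hypothetical edge devices, each holding private data for one class.
data_rng = random.Random(42)
device_a = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
device_b = [data_rng.gauss(5.0, 1.0) for _ in range(200)]

# The repository holds generators only, never the raw device data.
repository = [
    train_generator_on_device(device_a, label=0),
    train_generator_on_device(device_b, label=1),
]

# The developer builds a synthetic dataset and trains a trivial
# threshold classifier on it, without touching the real data.
synthetic = build_synthetic_dataset(repository, samples_per_generator=500)
mean0 = statistics.fmean([x for x, y in synthetic if y == 0])
mean1 = statistics.fmean([x for x, y in synthetic if y == 1])
threshold = (mean0 + mean1) / 2

# For this illustration only, check the classifier against the real data.
correct = sum(x < threshold for x in device_a) + sum(x >= threshold for x in device_b)
accuracy = correct / (len(device_a) + len(device_b))
```

The point of the sketch is the separation of roles: the model is fitted to `synthetic` only, which mirrors how SGDE differs from federated learning by moving the training onto synthetic data instead of changing how the model itself is trained.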