AI4Citizen

The School-Work Alternation (SWA) programme fostered by the European Commission calls for reinforcing the partnership between educational institutions and the job market (Figure 2.1.1). The AI4Citizen pilot, a novel software pilot developed within the H2020 AI4EU project, has been conceived to support a real-world SWA programme. AI4Citizen provides AI tools to help automate the current process implemented by the studied SWA programme, along with further AI tools that enable a novel process supporting the allocation of teams of students to internships, thereby facilitating team-based learning and the acquisition of teamwork skills. Interestingly, from an applied AI perspective, the AI4Citizen pilot results from pipelining a variety of AI technologies. Based on our extensive empirical analysis, we observe that the AI4Citizen pilot is ready for deployment in a real-world setting.

Main Contributors
Athina Georgara (IIIA-CSIC), Raman Kazhamiakin (FBK), Ornella Mich (FBK), Jean-Christophe Pazzaglia (SAP), Juan Antonio Rodríguez-Aguilar (IIIA-CSIC)

Categories

Business Category

Public Services

Technical Category

AI ethics, Machine learning, Natural language processing, Optimisation

General Presentation

The School-Work Alternation (SWA) programme^{^[1]} fostered by the European Commission calls for reinforcing the partnership between educational institutions and the job market (Figure 2.1.1). The AI4Citizen pilot, a novel software pilot developed within the H2020 AI4EU project, has been conceived to support a real-world SWA programme. AI4Citizen provides AI tools to help automate the current process implemented by the studied SWA programme, along with further AI tools that enable a novel process supporting the allocation of teams of students to internships, thereby facilitating team-based learning and the acquisition of teamwork skills. Interestingly, from an applied AI perspective, the AI4Citizen pilot results from pipelining a variety of AI technologies. Based on our extensive empirical analysis, we observe that the AI4Citizen pilot is ready for deployment in a real-world setting.

Figure 2.1.1: The School Work Alternation Program

Innovation is not only about delivering the best algorithm but also about using information technology to make people's lives easier. During the project we followed a Design Thinking-led development process, in line with SAP best practices, to elicit user requirements and to gather user feedback.

Following our Design Thinking (DT) work, the scenario to be implemented is as follows:

A notice published on the school billboard informs all students that it is time to think about their SWA, and that different social media channels (e.g. Facebook, Twitter) can be used to search for and enrol in an internship. These channels are connected to the internship offers proposed by companies, universities and research institutes, but also, in a secure way, to the SWA-FBK system, which means that they can access all the known information about each student in the school (e.g. competences, previous internship experiences) as well as past years' statistics and testimonials. During her quest, Ludovica connects to the chatbot using her SWA-FBK credentials and starts a discussion session with the general chatbot. The general chatbot discerns where Ludovica's interests lie, first using the information already stored in the SWA-FBK system, and then asking Ludovica for further information (e.g. when she would like to do the internship; where she would prefer to do it, close to the school or close to her home; whether she would like to do it with some friends). The chatbot answers the most common questions related to the process and redirects the question about the family's friend to Arnoldo, who receives only a fraction of the questions and can thus answer in due time. He also manages to dedicate one hour to all students to talk about their wishes and expectations; the discussion with Ludovica is really fruitful and shows her real desire to obtain her first choice. The chatbot then proposes several internship programmes that lie in its field and gathers Ludovica's interests and preferences over them. These preferences are passed to the team formation algorithm. After a couple of hours, Arnoldo receives the assignment computed by the algorithm, which shows the score of the overall assignment and the fit of each individual team. Remembering the discussion with Ludovica, he asks the algorithm to take her first choice into account; however, the results show that this constraint decreases the global assignment score and the fit of five teams, including Ludovica's. Arnoldo therefore decides to follow the first recommendation and, eventually, to explain to Ludovica the rationale behind her assignment to her second choice.

Interestingly, from an applied AI perspective, the AI4Citizen pilot results from pipelining a variety of AI technologies: (1) NLP algorithms for extracting the competences and skills offered by students in their curricula and requested by companies in their internship offers, and for subsequently matching them; (2) a chatbot to assist students in selecting internships; (3) a novel algorithm to group students into teams and allocate them to the internships on offer.

In more detail, the development of the AI4Citizen pilot builds upon the following contributions:

An NLP-based tool to match students with internships (FBK). The essential problem here is to make the two parties speak the same language when characterizing the skills and competences required by companies and those that candidates have acquired through the activities and experiences of their career and/or studies. Indeed, experiences are frequently described in free text, which makes it difficult to match requests and offers automatically. To address this issue, we have defined and developed a tool that bridges the gap between required and provided competences expressed as free-text descriptions by leveraging ESCO^{^[2]}, a multilingual classification of European Skills, Competences, Qualifications and Occupations. It contains a taxonomy of 13,485 competences and 2,942 occupations, connected by relations.
A chatbot to assist students (SAP). To develop the AI4Citizen chatbot, we used SAP Conversational AI^{^[3]}, a collaborative end-to-end platform for creating chatbots. Our chatbot understands 28 different types of student intentions, related to gathering information about the process but also to expressing and reviewing preferences. The NLP engine recognizes the intents and triggers the suitable skill to interact with the end users. These skills interact with a dedicated component that communicates with the database, minimizing the need to exchange personal information outside the FBK perimeter.
A novel heuristic algorithm to match teams of students with internships (IIIA-CSIC). Our proposed algorithm first finds an efficient and feasible allocation by exploiting the distribution of students' capabilities over the internship requirements, and then iteratively improves this initial allocation by swapping members among working teams. As we show, the initial allocation is computed in time polynomial in the number of internships and the number of students; the subsequent improvement phase makes the overall procedure an anytime heuristic algorithm.
An empirical evaluation of the chatbot usability (SAP, FBK). We evaluated the chatbot with 55 students from 3 different classes. The main mission was to select three wishes and to better understand the internship context. The first class was affected by a defect that prevented 10% of the students from completing their task; this defect was later fixed. All students who completed the mission (i.e. 90%) did so in less than 30 minutes and were reasonably satisfied with the experience; satisfaction improved thanks to the fixes but also to the enrichment of the vocabulary resulting from earlier interactions. Still, the main usability criticism that emerged was that the chatbot provided too much guidance, whereas the students would have preferred a free interaction. While this should be taken into consideration for the future, further experiments should assess its impact on task completion and on the time needed to complete the global mission.

Figure 2.1.2: Heuristic vs Expert vs Random. Percentage out of 29 tournaments.

An empirical evaluation of the team allocation algorithm (IIIA-CSIC). We compare our heuristic algorithm against CPLEX^{^[4]}, a state-of-the-art linear programming solver, using synthetic data, and show that it outperforms CPLEX in terms of solving time. Furthermore, we use real world data from students that must be allocated to internships. Our heuristic algorithm solves the problem while CPLEX cannot generate the constraints in a reasonable amount of time. This evaluation illustrates that our heuristic algorithm solves large problem instances that CPLEX cannot handle.

An expert-driven validation of the team allocation algorithm (IIIA-CSIC, FBK). A group of experts in education confirmed that the allocations produced by our heuristic algorithm are better than those manually produced by experienced teachers (Figure 2.1.2).
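
The two-phase idea behind the team allocation contribution (a constructive initial allocation followed by swap-based improvement that can be stopped at any time) can be illustrated with a minimal sketch. This is not the Edu2Com/TAIP algorithm itself: the fit function (share of required competences covered by the team), the greedy construction rule, the objective (sum of team fits) and the toy data are all assumptions made for illustration.

```python
import itertools

# Toy data (hypothetical): competences per student and per internship.
SKILLS = {
    "ana": {"python", "sql"},
    "ben": {"design"},
    "cleo": {"python"},
    "dan": {"sql", "design"},
}
INTERNSHIPS = {
    "data_lab": {"python", "sql"},
    "studio": {"design", "python"},
}

def team_fit(team, required, skills):
    """Share of an internship's required competences covered by the team as a whole."""
    covered = set().union(*(skills[s] for s in team))
    return len(covered & required) / len(required)

def total_fit(alloc, internships, skills):
    """Assumed objective: sum of team fits over all internships."""
    return sum(team_fit(team, internships[name], skills) for name, team in alloc.items())

def initial_allocation(internships, skills, team_size):
    """Greedy construction: each internship repeatedly takes the free student
    who raises its coverage the most (ties broken alphabetically)."""
    free = set(skills)
    alloc = {}
    for name, required in internships.items():
        team = []
        while len(team) < team_size and free:
            best = max(sorted(free), key=lambda s: team_fit(team + [s], required, skills))
            team.append(best)
            free.remove(best)
        alloc[name] = team
    return alloc

def improve(alloc, internships, skills, max_rounds=100):
    """Anytime phase: keep a swap of two members from different teams only if
    it strictly raises the objective; stopping early still leaves a valid allocation."""
    best = total_fit(alloc, internships, skills)
    for _ in range(max_rounds):
        improved = False
        for a, b in itertools.combinations(alloc, 2):
            for i in range(len(alloc[a])):
                for j in range(len(alloc[b])):
                    alloc[a][i], alloc[b][j] = alloc[b][j], alloc[a][i]
                    score = total_fit(alloc, internships, skills)
                    if score > best + 1e-12:
                        best, improved = score, True
                    else:
                        # Undo a swap that does not help.
                        alloc[a][i], alloc[b][j] = alloc[b][j], alloc[a][i]
        if not improved:
            break
    return alloc, best
```

Because every intermediate state assigns each student to exactly one team, the improvement loop can be interrupted after any round and still return a feasible allocation, which is the property that makes such a scheme "anytime".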

Cooperation between the partners

All partners were involved in the ideation and design of the pilot. The collaboration of the different partners leveraged their areas of expertise:

SAP oversaw the development of the pilot and provided the chatbot component and business modeling capabilities;
FBK provided its SWA know-how and its field experience as the operator of the Trento region SWA system. This included access to real, albeit obfuscated, data, the development of new components that use data clustering methods to facilitate the dialog with the students, and access to students and teachers for the evaluation of the pilot;
CSIC focused on the modeling of the team formation problem, which is NP-hard. A specific effort was also dedicated to designing an algorithm for computing the optimal distribution of teams based not only on companies’ requirements, but also on pedagogical aspects and students’ preferences. Furthermore, CSIC has conducted a systematic evaluation to show that the team formation algorithm outperforms state-of-the-art approaches^{^[5]}.

A weekly meeting was set up between the partners to share knowledge, plan the activities, align development, etc.

Exploitation of the AI4Europe platform

The architecture of AI4Citizen aims at pipelining the different AI software modules and algorithms identified as necessary to produce a new service (competence and skill extraction, chatbot, and team formation), as well as at integrating them with the existing IT systems at FBK that currently handle the information about students (their professional studies and activities) and job offers. Such orchestration and integration result in a prototype implementation of our end-to-end scenario.

Figure 2.1.3: Information Flow and Partner Responsibilities

In a nutshell, Figure 2.1.3 describes the logical architecture of the AI4Citizen pilot together with the partner in charge of each element. The architecture comprises the following elements:

A module for interfacing with, and integrating data from, the external, pre-existing IT system that contains and manages the information regarding students' profiles, their activities and experiences, etc. Depending on the specific implementation of the external system, this module may require storing (part of) the information internally, performing periodic synchronization, etc.
A module for interfacing with, and integrating data from, the external IT system that contains and manages information about job/internship offers. In this case too, the integration depends on the specific implementation.
Data management services to facilitate the "normalization" of the data in terms of vocabularies and taxonomies for the purpose of offer/candidate matching. More specifically, the two key services here are: (i) skill matching, to associate an entity with the skills and competences in the ESCO ontology; and (ii) a multi-dimensional classification of internships to guide the selection process. The latter service characterizes offers in different useful ways (e.g. activity domain, geographical distribution, context such as private vs. public hosting entities).
A team formation service to match teams of students to internships taking into account competences, preferences, and availability in a holistic, cross-organizational, manner.
An internship browser that brings this information together, exposing different APIs for searching, matching, and selecting offers, as well as for storing preferences and matching teams to available offers. The component exposes the necessary APIs for multi-channel information access and preference management.
A chatbot-based Natural Language Processing (NLP) and User Interaction (UI) service and its associated chatbot logic component that make the AI4Citizen pilot available to the involved actors, in particular to students and teachers. While the role of the UI is to make AI4Citizen directly available on the web or a cell phone, the chatbot logic drives the selection process and helps collect students’ preferences.
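
To make the skill matching service more concrete, the following minimal sketch maps free-text descriptions onto taxonomy concepts and then compares a student with an offer through the concepts they share. It is only an illustration under strong assumptions: the concept identifiers and labels are invented placeholders (not actual ESCO entries), and plain word overlap stands in for the NLP pipeline used in the pilot, which is not specified here.

```python
import re

# Placeholder taxonomy: identifiers and labels are hypothetical, NOT real ESCO entries.
TAXONOMY = {
    "skill/001": "develop software prototypes",
    "skill/002": "manage databases",
    "skill/003": "design user interfaces",
}

def tokens(text):
    """Lower-cased alphabetic words of a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def match_skills(description, taxonomy=TAXONOMY, threshold=0.5):
    """Return (concept_id, score) pairs, where score is the share of the
    concept label's words that occur in the free-text description."""
    desc = tokens(description)
    scored = []
    for concept_id, label in taxonomy.items():
        label_tokens = tokens(label)
        score = len(desc & label_tokens) / len(label_tokens)
        if score >= threshold:
            scored.append((concept_id, round(score, 2)))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

def shared_concepts(student_text, offer_text):
    """Concepts found both in a student's experience and in an internship offer."""
    student = {cid for cid, _ in match_skills(student_text)}
    offer = {cid for cid, _ in match_skills(offer_text)}
    return student & offer
```

Once both sides are expressed as sets of concept identifiers, requests and offers become directly comparable even when the original descriptions use different wording.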

Figure 2.1.4: AI4Citizen assets

As reported in D6.1 and shown in Figure 2.1.4, we published our resources in the AI4EU catalogs^{^[6]}. We recently integrated the two main reusable components, Competence & Skills extraction and Team allocation, into the AI4EU Experiments platform. As the components were developed in parallel with the Experiments platform, we did not leverage the platform at development time. The existing contractual framework and the GDPR constraints between FBK and the Autonomous Province of Trento also forbid using it at deployment time, but do not preclude performing experiments with other available components.

Our implementation follows the principles of cloud-native applications and uses a container-based component model, where each separate microservice is deployed in a Docker container on top of the cloud infrastructure. We deployed all components on the FBK cloud, provided by MS Azure IaaS, with the exception of the chatbot NLP and UI component, which runs on the SAP cloud and interacts with the SAP logic through a REST API.

Ethical Assessment of the pilot

From the inception of the AI4EU project, we considered that the AI4Citizen pilot had a special duty to address privacy and responsible AI thoughtfully. These topics are indeed critical for the acceptance of AI-based services run by governments and government agencies, and for reducing the fear of dystopian scenarios. The SWA case study itself is particularly demanding, since the system needs to interact with minors, who are considered vulnerable persons^{^[7]} under the General Data Protection Regulation (EU) 2016/679. Supporting the recommendation of internships is also critical and, albeit different, closely related to the high-risk scenarios identified in the recently proposed European Artificial Intelligence Act^{^[8]} with respect to vocational training and employment. During the design phase, in view of the GDPR and in order to comply with the existing contract between FBK and the region of Trento, we designed our solution to minimize the personal information exchanged between the components of the system. For example, the match between students and internships is performed by a component hosted within the FBK network; similarly, the matching algorithm only manipulates a set of preferences, without access to the students' identities or other personally identifiable information (PII). The information exchanged by the components is clearly described by a set of public APIs, avoiding direct access to the underlying database. Finally, communication takes place over secure channels with client authentication.
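
The data-minimization principle described above can be sketched as follows. The record layout, the field names and the use of a keyed hash (HMAC-SHA256) to derive pseudonyms are assumptions chosen for illustration; the actual mechanism used within the FBK perimeter is not documented here.

```python
import hashlib
import hmac

def pseudonym(student_id: str, secret: bytes) -> str:
    """Derive a stable pseudonym from a student identifier with a keyed hash.
    Only a holder of the secret can recompute the mapping from real identifiers."""
    digest = hmac.new(secret, student_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def minimise(record: dict, secret: bytes) -> dict:
    """Keep only what an allocation component needs: a pseudonym and the
    ranked preferences. Names, e-mail addresses, etc. are dropped."""
    return {
        "pseudonym": pseudonym(record["student_id"], secret),
        "preferences": list(record["preferences"]),
    }

# Hypothetical record as it might exist inside the trusted perimeter.
record = {
    "student_id": "S-042",
    "name": "Ludovica",
    "email": "ludovica@example.org",
    "preferences": ["data_lab", "studio"],
}
payload = minimise(record, secret=b"demo-secret")
```

In such a design the secret never leaves the trusted network, so a component outside it sees pseudonyms and preferences only, while the operator can still map results back to real students.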

With respect to the usage of AI, we analyzed the intent of the automation and its impact on the two types of users of the solution, students and professors. For students, our solution gathers their preferences by recommending a set of proposals based on the tagging done by the school and on an automated mapping to the job ontology. The fit between the individual's skills and the internship's skills is provided as information, and nothing prevents a student from choosing an internship that exhibits a bad fit. Since we lack students' feedback on former internships, we unfortunately cannot currently provide further recommendations based on it. The actual assignment of students to internships is ultimately made by the professor: our system, the edu2com component, only provides a recommendation that relies on the preferences expressed by the different students and the skills acquired during the curriculum, mimicking the professor's behaviour but exploring the different combinations in an exhaustive manner. Still, to help the professor take an informed decision, we provide metrics for the overall solution fit, the group fit, and the individual fit. In the latest version of the system, we also introduced the possibility for the professor to challenge certain choices by asking targeted questions (cf. next section).
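
The three levels of fit reported to the professor, and the way they support a "what if" question such as forcing a particular student into a particular internship, can be illustrated with the sketch below. The definitions are assumptions made for the example (individual fit: share of required skills a student has; group fit: share covered by the team as a whole; overall fit: mean of the group fits) and are not the metrics implemented in the edu2com component; the data are invented.

```python
# Toy data (hypothetical): skills per student and per internship.
SKILLS = {
    "ana": {"python", "sql"},
    "ben": {"design"},
    "cleo": {"python"},
    "dan": {"sql", "design"},
}
INTERNSHIPS = {
    "data_lab": {"python", "sql"},
    "studio": {"design", "python"},
}

def individual_fit(student_skills, required):
    """Share of the internship's required skills that one student has."""
    return len(student_skills & required) / len(required)

def group_fit(team, required, skills):
    """Share of the required skills covered by at least one team member."""
    covered = set().union(*(skills[s] for s in team))
    return len(covered & required) / len(required)

def fit_report(alloc, internships, skills):
    """Overall, per-team and per-student fit for a given allocation."""
    groups = {name: group_fit(team, internships[name], skills)
              for name, team in alloc.items()}
    individuals = {student: individual_fit(skills[student], internships[name])
                   for name, team in alloc.items() for student in team}
    overall = sum(groups.values()) / len(groups)
    return {"overall": overall, "group": groups, "individual": individuals}

# Recommended allocation versus one where the professor forces "ben" into "studio".
recommended = fit_report({"data_lab": ["ana", "ben"], "studio": ["cleo", "dan"]},
                         INTERNSHIPS, SKILLS)
forced = fit_report({"data_lab": ["ana", "cleo"], "studio": ["ben", "dan"]},
                    INTERNSHIPS, SKILLS)
```

Comparing the two reports side by side shows the trade-off explicitly: in this toy case the forced assignment improves one student's individual fit while lowering the fit of the affected team and the overall score, which is the kind of evidence the professor can use when explaining a decision.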

To cross-check the soundness of our global approach, we evaluated our solution, both each individual component and the end-to-end solution, against the preliminary questionnaire provided by the AI4EU ethical experts, which is largely inspired by the ALTAI guidelines^{^[9]}. We did not detect any violation of these principles, and all components were below the high-risk threshold except for question 10. With respect to question 10, “Open-source code: Is the development participatory and multidisciplinary? What kind of access to the code and development is there?”, we contest the assumption that non-open-source code should be ranked in the low category, since no study has linked the quality of software to its licensing terms. We perceive that this can be damaging from an industrial perspective.

This collaboration contributed to the evolution of the assessment questionnaire that was later published in ^{^[10]}.

Business perspective for SWA-like scenarios

Figure 2.1.5: Data Driven Innovation analysis of AI4Citizen

Although it was not within the initial scope of the pilot, and in order to better understand the potential business value of AI4Citizen-like scenarios, we decided to use the Data Driven Innovation (DDI) Framework^{^[11]} to help us develop a consistent strategy and explore all the dimensions of our use case, with a view to potentially implementing it outside Trento. The DDI Canvas guides the systematic exploration of all relevant dimensions on the supply and demand sides of a data-driven innovation. Using such a methodology in research projects is unusual, but it was key to 1) better understanding the data required to realize a full-fledged version of our proof of concept, 2) identifying the network of actors required to achieve this vision, and 3) finding some avenues to make such software sustainable. This DDI use case was presented during EBDVF 2020^{^[12]}.

With respect to data, we identified that the main gap is currently the lack of students' feedback on former internships, which would enhance the advice provided by the system; similarly, very little data is collected to improve the process itself and to capitalise on previous students' questions. We strongly recommend that institutes collect such information systematically and share these data, after anonymisation, across institutes; this is the only way to offer better support to students. With respect to the ecosystem, while the institutes and the ministry of education are the principal actors, we believe that students may benefit from opening up to business social networks (e.g., LinkedIn, Twitter, Glassdoor), reflecting current practices in the business world and in line with the rationale behind the SWA. Although this can only apply to 18-year-old students, it would enable them to start building a professional presence and a business network, to share insights on their experience, and eventually to obtain acknowledgements from their mentors or from people in the company. With the proper connectors, this can also be a way to generate information in natural language that can be fed back into the system. Finally, the cost of deploying, maintaining and enriching such systems may benefit from a modern approach that leverages not only the institutions but also other actors. While one can argue that this goes against the ethics of the education system, an overly conservative approach, combined with the resources needed to develop smart software, may pave the way for private companies to offer orientation services accessible only to privileged students.

Conclusion

The COVID-19 pandemic had a direct and strong impact on our original experimentation plan. Actual internships within the Italian ASL scheme were essentially stopped from January 2020 until the end of the project, access to students for experimentation became quite complex, and the professors had to handle emergency duties and deal with the extra workload of setting up distance learning and managing preventive measures. We had no choice but to rely on lab and small-scale experiments; the situation also prevented us from gathering students' feedback on their SWA experience, which should have paved the way for a recommendation system to better help students choose their internships (cf. the gap identified by the DDI analysis). Still, the pilot managed to deliver an end-to-end solution to support the process, with components that will be used within the Autonomous Province of Trento. The team formation component is also being tested within other projects. The task activities have directly led to several publications and submissions^{^[13]}.

[1] In Italy, Alternanza Scuola-Lavoro (ASL or SWA) was established by Law n. 107 of 2015, https://www.gazzettaufficiale.it/eli/id/2015/07/15/15G00122/sg; more details can be found on the Italian Education Ministry (MIUR) website, http://www.alternanza.miur.gov.it/cos-e-alternanza.html

[2] ESCO, European skills, competences, qualifications and occupations, https://ec.europa.eu/esco/portal (2010)

[3] SAP Conversational AI https://cai.tools.sap/

[4] IBM, IBM ILOG CPLEX Optimization Studio 12.10, https://www.ibm.com/us-en/marketplace/ibm-ilog-cplex (2019)

[5] Athina Georgara, Carles Sierra, Juan A. Rodríguez-Aguilar. TAIP: an anytime algorithm for allocating student teams to internship programs. In The 11th International Workshop on Optimization and Learning in Multiagent Systems.

[6] https://www.ai4europe.eu/research/ai-catalog

[7] GDPR: who are considered as vulnerable persons? https://onderzoektips.ugent.be/en/tips/00001782/

[8] Proposal for a regulation laying down harmonised rules on artificial intelligence (artificial intelligence act) (May 2021). URL https://digital-strategy.ec.europa.eu/en/library/proposal-regulation-laying-down-harmonised-rules-artificial

[9] High-Level Expert Group on Artificial Intelligence, Assessment list for trustworthy artificial intelligence (July 2020). https://digital-strategy.ec.europa.eu/en/library/assessment-list-trustworthy-artificial-intelligence-altai-self-assessment

[10] V. Dignum, J. C. Nieves, A. Theodorou, A. A. Tubella, An abbreviated assessment list to support the responsible development and use of AI (April 2021). https://webapps.cs.umu.se/uminf/index.cgi?year=2021&number=3

[11] Prof. Sonja Zillner, Data Driven Innovation Framework, https://ddi-canvas.com/

[12] We are proud to present the DDI Framework at the EBDVF, the yearly flagship event of the European Data & AI community. Together with SAP and Atos, we went through two AI4EU pilots and real-life cases, where we were able to demonstrate the layout and application of the DDI Canvas in our workshop session with over 100 attendees. https://ddi-canvas.com/introducing-the-ddi-framework-at-the-european-big-data-value-forum/

[13] Athina Georgara, Raman Kazhamiakin, Ornella Mich, Alessio Palmero Aprosio, Jean-Christophe Pazzaglia, Juan A. Rodríguez-Aguilar, Carles Sierra. An anytime heuristic algorithm for allocating many teams to many tasks. Submitted to the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022). Ranking CORE: A*.

Athina Georgara, Juan A. Rodríguez-Aguilar, Carles Sierra. Towards a competence-based approach to allocate teams to tasks. In Proceedings of the 20th international conference on autonomous agents and multiagent systems (AAMAS 2021), pp. 1504-1506, London, UK, May 3-7, 2021. Ranking CORE: A*.

Athina Georgara, Juan A. Rodríguez-Aguilar, Carles Sierra. Edu2Com: an anytime algorithm to form student teams in companies. In IJCAI 2020 Workshop on AI for Social Good. Published at: https://crcs.seas.harvard.edu/files/crcs/files/ai4sg_2020_paper_18.pdf

AI4EU in practice: the industrial pilots perspective – EBDVF 2021 session

Athina Georgara, Juan A. Rodríguez-Aguilar, Carles Sierra. TAIP: an anytime algorithm for allocating student teams to internship programs. In 11th International Workshop on Optimization and Learning in Multiagent Systems, Auckland, New Zealand, May 8th 2020. Published as arXiv:2005.09331.