OSCAR
Open Source Serverless Computing for Data-Processing Applications
OSCAR is an open-source platform that supports the serverless computing model for data-processing applications. It can be automatically deployed on multiple Clouds to create highly parallel, event-driven, file-processing serverless applications. These run in customised runtime environments provided by Docker containers on an elastic Kubernetes cluster that grows and shrinks in number of nodes according to the workload. OSCAR clusters can also be deployed on minified Kubernetes distributions such as K3s to support event-driven workflows along the computing continuum.
Users get an elastic cluster to which they upload files (into the MinIO object storage). Each upload triggers the execution of a container, created from a Docker image that encapsulates the application and its dependencies, to process that file. Elasticity is provided automatically by increasing the number of nodes of the underlying Kubernetes cluster, and the output is delivered back to the user, so job scheduling and workflow enactment do not have to be carried out manually. Synchronous REST-based invocations are supported via Knative. OSCAR provides a web-based Graphical User Interface (GUI), a Command-Line Interface (CLI) and a fully featured REST API to accommodate users with diverse sets of skills.
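As an illustration, a minimal service definition in OSCAR's Functions Definition Language might look like the following sketch. Names such as my-cluster, plant-classifier and the bucket paths are placeholders, and the exact fields should be checked against the OSCAR documentation for your deployment:

```yaml
functions:
  oscar:
  - my-cluster:                  # target OSCAR cluster (placeholder name)
      name: plant-classifier     # service name (hypothetical)
      image: grycap/imagemagick  # Docker image with the app and its dependencies
      script: script.sh          # shell-script entry point run for each event
      memory: 1Gi
      cpu: '1.0'
      input:                     # uploading a file here triggers an invocation
      - storage_provider: minio.default
        path: plant-classifier/input
      output:                    # processed files are stored here
      - storage_provider: minio.default
        path: plant-classifier/output
```

A definition like this can be applied to a running cluster with the OSCAR CLI or through the web GUI; from then on, every file uploaded to the input bucket spawns a container that runs the entry-point script against it.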
Within AI-SPRINT, OSCAR provides several benefits. First, it supports long-running, computationally intensive applications, including those requiring GPU support, so it is not restricted to processing bursts of short-lived HTTP requests. Second, it provides a shell-script entry point to run applications within Docker containers, so the user application does not have to be ported to a programming language supported by the underlying FaaS framework; this facilitates adoption by scientific users. Third, it can be deployed on a minified Kubernetes distribution to run on low-powered devices such as Raspberry Pis. Fourth, it supports a Functions Definition Language to define data-driven workflows along the computing continuum. Fifth, it integrates with production scientific services from the European Open Science Cloud (EOSC): the Infrastructure Manager for infrastructure deployment, the EGI Federated Cloud for resource provisioning, and EGI DataHub (Onedata) for mid-term object storage. Sixth, it integrates with the AI-SPRINT monitoring system to provide usage metrics from the runtime environment. Finally, it integrates with SCAR to support the execution of applications along the computing continuum, including public FaaS services such as AWS Lambda.
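The shell-script entry point mentioned above can be sketched as follows. The environment variable names follow OSCAR's documented conventions (INPUT_FILE_PATH points at the file that triggered the invocation; whatever is written to TMP_OUTPUT_DIR is uploaded to the output bucket), and the fallback values are hypothetical local paths used only to try the script outside a cluster:

```shell
#!/bin/sh
# OSCAR entry-point sketch. Inside a container, OSCAR injects
# INPUT_FILE_PATH and TMP_OUTPUT_DIR; the fallbacks below are only
# for running the script locally (hypothetical paths).
: "${INPUT_FILE_PATH:=/tmp/sample-input.txt}"
: "${TMP_OUTPUT_DIR:=/tmp/oscar-out}"
mkdir -p "$TMP_OUTPUT_DIR"
[ -f "$INPUT_FILE_PATH" ] || echo "sample data" > "$INPUT_FILE_PATH"

FILE_NAME=$(basename "$INPUT_FILE_PATH")
echo "processing $FILE_NAME"

# Stand-in for the real application: checksum the input file and
# leave the result in the output directory for upload.
sha256sum "$INPUT_FILE_PATH" > "$TMP_OUTPUT_DIR/$FILE_NAME.sha256"
```

Because the contract is just "read the input path, write to the output directory", any language or binary packaged in the Docker image can be invoked from this script without modification.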