Zero-Shot Visual Concept Recognition
Recognizing predefined visual concepts in an image or a video

The concept recognition analysis service detects one or more concepts that describe a visual scene or an image.
Input: An image file or a video file. For a video file, you can specify which frames to process.
Output: For the image, or for each processed video frame, a list of detected concepts is returned, sorted by the confidence score associated with each concept.
Model:
The concept recognition analysis service uses OpenAI's multi-modal CLIP model in a zero-shot classification manner. This allows the user to modify the concept bank, or create a new one, to be detected in query images or videos without retraining the model. Each concept in the bank is defined by a natural-language sentence or paragraph of up to 1024 characters (e.g., “A visual showing prediction of rainy weather”) and an associated label that is returned in the result (e.g., “weather forecast – rainy”). A default concept bank is provided that covers many useful visual concepts found in commonly broadcast content (e.g., news, sports, weather).
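
To make the zero-shot mechanism concrete, the following is a minimal sketch of classifying one image against a concept bank, using the open-source CLIP weights through the Hugging Face transformers library. The model name, file name, and concept-bank entries are illustrative assumptions, not the service's actual defaults or internal implementation.

# A minimal sketch of zero-shot concept recognition with CLIP, using the
# open-source weights via the Hugging Face transformers library. The model
# name, file name, and concept-bank entries below are illustrative.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Each concept pairs a natural-language description (what CLIP scores)
# with the label returned in the result.
concept_bank = [
    {"description": "A visual showing prediction of rainy weather",
     "label": "weather forecast – rainy"},
    {"description": "A news anchor presenting from a television studio",
     "label": "news – studio anchor"},
    {"description": "Players on a field during a sports match",
     "label": "sports – field game"},
]

image = Image.open("frame.jpg")  # an image, or one extracted video frame
inputs = processor(
    text=[c["description"] for c in concept_bank],
    images=image,
    return_tensors="pt",
    padding=True,
)

with torch.no_grad():
    outputs = model(**inputs)

# CLIP returns image-text similarity logits; a softmax over the concept
# bank turns them into per-concept confidence scores.
scores = outputs.logits_per_image.softmax(dim=-1).squeeze(0)

# Rank concepts by confidence, as the service does for each processed frame.
ranked = sorted(
    zip((c["label"] for c in concept_bank), scores.tolist()),
    key=lambda item: item[1],
    reverse=True,
)
for label, score in ranked:
    print(f"{label}: {score:.3f}")

Because classification reduces to comparing image and text embeddings, swapping in a new concept bank only changes the text inputs; no model weights are touched.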
Result details:
The concept recognition analysis service returns a ranked list of recognized concepts with their confidence scores. The number of recognized concepts may vary from one image to another.
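
For illustration only, the ranked result for one processed frame might be represented as the following Python structure; the field names ("label", "confidence") and the values are hypothetical, not the service's actual response schema.

ranked_concepts = [
    {"label": "sports – field game", "confidence": 0.81},      # most confident
    {"label": "news – studio anchor", "confidence": 0.12},
    {"label": "weather forecast – rainy", "confidence": 0.07},  # least confident
]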
References:
- Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. “Learning Transferable Visual Models from Natural Language Supervision.” In Proceedings of the 38th International Conference on Machine Learning (ICML), 2021.