Entity Recognizer

Deep-learning-based extraction of named entities from text documents.

Docker container

AI4EU Experiments Platform

AI-Cafe: Using deep-learning approaches for extracting entities

Fraunhofer IAIS

Developed by

Fraunhofer-Gesellschaft

License

Other

Intellectual property of Fraunhofer IAIS (closed source)

Main Characteristic

The entity recognizer is a deep learning-based solution that takes a text document as input and returns a list of instances of pre-defined entities (Person, Location, Organization, Miscellaneous).
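As an illustration of how token-level predictions become a list of entity instances, the following minimal sketch groups tagged tokens into entities. It assumes the BIO tagging scheme and the label names PER, ORG, LOC and MISC as commonly used with CoNLL data; the function name `bio_to_entities` and the example sentence are hypothetical, and the actual output format of the container is not documented here.

```python
from typing import List, Optional, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Assumption: tags follow the BIO scheme ("B-PER", "I-PER", "O", ...).
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far, if any.
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # A "B-" tag, or an "I-" tag that does not continue the
            # current entity, opens a new entity.
            flush()
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-"):
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O": the token is outside any entity.
            flush()
            current_tokens = []
            current_type = None
    flush()
    return entities


# Hypothetical example sentence with hand-written tags.
tokens = ["Angela", "Merkel", "visited", "Fraunhofer", "IAIS", "in", "Sankt", "Augustin", "."]
tags = ["B-PER", "I-PER", "O", "B-ORG", "I-ORG", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_entities(tokens, tags))
```

The sketch tolerates an "I-" tag without a preceding "B-" tag by treating it as the start of a new entity, which is one common way to handle inconsistent predictions.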

It uses bidirectional LSTM networks to generate word representations that capture the contextual dependencies between words in a sentence. A CRF layer on top of the network improves tagging accuracy. The models were built with Flair, a PyTorch-based NLP framework.
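To show why a CRF layer can help, the sketch below decodes the best tag sequence with the textbook Viterbi algorithm, combining per-token scores (which the BiLSTM would produce in the real model) with tag-to-tag transition scores. This is not Flair's implementation; the function name `viterbi_decode` and all scores are made up for illustration.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag]       score of `tag` at position t
    transitions[prev][cur]  score of moving from tag `prev` to tag `cur`
    """
    if not emissions:
        return []

    # best[t][tag]: best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    backpointers: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximises path score + transition.
            prev_best = max(tags, key=lambda prev: best[-1][prev] + transitions[prev][cur])
            scores[cur] = best[-1][prev_best] + transitions[prev_best][cur] + emissions[t][cur]
            pointers[cur] = prev_best
        best.append(scores)
        backpointers.append(pointers)

    # Follow the back-pointers from the best final tag to the start.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


tags = ["O", "B-PER", "I-PER"]

# Illustrative transition scores: everything is neutral except the
# invalid move from "O" directly to "I-PER", which is heavily penalised.
transitions = {prev: {cur: 0.0 for cur in tags} for prev in tags}
transitions["O"]["I-PER"] = -10.0

# Illustrative per-token scores for a two-token sentence.
emissions = [
    {"O": 2.0, "B-PER": 1.5, "I-PER": 0.0},
    {"O": 0.5, "B-PER": 0.0, "I-PER": 2.0},
]

greedy = [max(tags, key=lambda tag: scores[tag]) for scores in emissions]
print(greedy)                                          # per-token argmax
print(viterbi_decode(emissions, transitions, tags))    # sequence-level decoding
```

Choosing the best tag for each token independently yields the inconsistent sequence O, I-PER in this example, whereas sequence-level decoding with the transition penalty yields B-PER, I-PER.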

This tool includes a multilingual NER model supporting English, German and Dutch.

Technical Categories

Natural language processing

Last updated

30.05.2023 - 10:14

Detailed Description

The container is packaged as the following asset on the AI4EU Experiments Platform: https://aiexp.ai4europe.eu/#/marketSolutions?solutionId=e3794e16-0225-4bf1-a99c-b99638a22232&revisionId=f7447500-0c8d-4ca7-be7e-24ce3fefd144

Additional information:

The multilingual entity recognizer was trained on the aggregated CoNLL-2002/2003 data for German, English and Dutch.
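Aggregating such corpora can be sketched as follows. CoNLL shared-task files are column-based, with one token per line and a blank line between sentences; the sketch assumes that the token is in the first column and the entity tag in the last, since the number of columns differs between corpora. The function name `parse_conll` and the sample sentences are hypothetical and are not taken from the datasets or from the original training pipeline.

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]  # list of (token, entity_tag) pairs


def parse_conll(text: str) -> List[Sentence]:
    """Parse CoNLL-style column text into sentences of (token, tag) pairs.

    Assumption: first column = token, last column = entity tag.
    """
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and document markers end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Made-up samples that only mimic the file layout.
english_sample = """-DOCSTART- -X- -X- O

Berlin NNP B-NP B-LOC
is VBZ B-VP O
large JJ B-ADJP O
"""

dutch_sample = """Amsterdam N B-LOC
is V O
groot Adj O
"""

# Aggregation is a simple concatenation of the per-language sentence lists.
corpus = parse_conll(english_sample) + parse_conll(dutch_sample)
print(len(corpus), corpus[0])
```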

In addition, two German-only entity recognizers were trained on the two largest available German datasets: CoNLL-2003 and GermEval 2014.

References: Gugu, Andel. Evaluation and re-usable implementation of DL-based approaches for Entity Recognition. Master's thesis, 2021.

Documents

Thesis: Evaluation and re-usable implementation of DL-based approaches for Entity Recognition

Trustworthy AI

The NER software is (1) lawful: it respects all applicable laws and regulations (e.g. the software licenses of the open-source components it uses) and, in particular, is GDPR-compliant; (2) ethical: it pursues the goal of making information in documents easily accessible in digital form to the documents' owner; and (3) technically robust: it is deployed in a ready-to-use Docker container, which makes processing documents as simple as possible.

GDPR Requirements

The NER software allows the user to extract named entities from text documents. The software itself is GDPR-compliant: documents are processed within a Docker container, and all data remains on the user's local computer. However, users must ensure that they have the authority to store and process a document, for example when it contains personal data or other sensitive, GDPR-relevant information.

Related Projects

AI4Media