JedAI
The Force Behind Entity Resolution. Perform State Of The Art Entity Resolution With The Java Generic Data Integration Toolkit.
JedAI constitutes an open source, high scalability toolkit that offers out-of-the-box solutions for any data integration task, e.g., Record Linkage, Entity Resolution and Link Discovery. At its core lies a set of domain-independent, state-of-the-art techniques that apply to both RDF and relational data. These techniques rely on an approximate, schema-agnostic functionality based on (meta-)blocking for high scalability.
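To make the idea of schema-agnostic blocking more concrete, here is a minimal sketch of token blocking in Python. It is an illustration of the general technique only, not JedAI's own code: the function names `token_blocking` and `candidate_pairs` and the sample entities are hypothetical. Every token that appears in any attribute value becomes a block key, so attribute names are never consulted, and only entities that share a block are compared.

```python
import re
from collections import defaultdict
from itertools import combinations


def token_blocking(entities):
    """Group entity ids by every token found in any attribute value.

    Attribute names are ignored on purpose: this is what makes the
    blocking schema-agnostic.
    """
    blocks = defaultdict(set)
    for entity_id, attributes in entities.items():
        for value in attributes.values():
            for token in re.findall(r"\w+", str(value).lower()):
                blocks[token].add(entity_id)
    # A block with a single entity yields no comparisons, so drop it.
    return {token: ids for token, ids in blocks.items() if len(ids) > 1}


def candidate_pairs(blocks):
    """Collect the distinct entity pairs that co-occur in at least one block."""
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical sample data: three profiles with entirely different schemas.
entities = {
    "e1": {"name": "John Smith", "city": "Athens"},
    "e2": {"fullName": "Smith, John", "location": "Athens, Greece"},
    "e3": {"title": "Mary Jones", "town": "Paris"},
}

blocks = token_blocking(entities)
pairs = candidate_pairs(blocks)
print(sorted(blocks))  # tokens shared by at least two entities
print(pairs)           # the only pairs that would be compared
```

In this sample only `e1` and `e2` share tokens, so a single comparison is suggested instead of the three that an exhaustive pairwise approach would need. Meta-blocking goes one step further by pruning unlikely pairs from the blocks; that step is not shown here.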
JedAI can be used in three different ways:
- As an open source library that implements numerous state-of-the-art methods for all steps of the end-to-end ER workflow presented in the figure below.
- As a desktop application with an intuitive Graphical User Interface that can be used by both expert and lay users.
- As a workbench that compares the relative performance of different (configurations of) ER workflows.
This repository contains the code (in Java 8) of JedAI's open source library. The code of JedAI's desktop application and workbench is available in this repository.
Several datasets already converted into the serialized data type of JedAI can be found here.
You can find a short presentation of JedAI Toolkit here.
How to add JedAI as a dependency to your project
Visit https://search.maven.org/artifact/org.scify/jedai-core
How to run JedAI as a Docker image
After installing Docker on your machine, type the following commands:
docker pull gmandi/jedai-webapp
docker run -p 8080:8080 gmandi/jedai-webapp
Then open your browser and go to localhost:8080. JedAI should be running in your browser!
How to use JedAI with Python
You can combine JedAI with Python through PyJNIus (https://github.com/kivy/pyjnius).
Preparation Steps:
- Install Python 3 and PyJNIus (https://github.com/kivy/pyjnius).
- Install OpenJDK 8 and OpenJFX for Java 8, and configure that JDK as the default Java installation.
- Create a directory or a jar file with jedai-core and its dependencies. One approach is to use the maven-assembly-plugin (https://maven.apache.org/plugins/maven-assembly-plugin/usage.html), which packages everything into a single jar file: jedai-core-3.0-jar-with-dependencies.jar
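Once the preparation steps are done, the Python side could look roughly like the sketch below. It assumes the jar from the last step sits in the working directory (or in the directory passed as `jar_dir`); the helper names `build_classpath` and `load_java_class` are hypothetical and not part of JedAI or PyJNIus. The only PyJNIus calls used are `jnius_config.add_classpath`, `jnius_config.vm_running` and `jnius.autoclass`.

```python
import os

# Assumption: this is the jar produced by the maven-assembly-plugin step.
JAR_NAME = "jedai-core-3.0-jar-with-dependencies.jar"


def build_classpath(jar_dir="."):
    """Return the path of the JedAI jar inside jar_dir."""
    return os.path.join(jar_dir, JAR_NAME)


def load_java_class(class_name, jar_dir="."):
    """Put the JedAI jar on the JVM classpath and return a proxy for class_name."""
    jar = build_classpath(jar_dir)
    if not os.path.isfile(jar):
        # Fail early with a clear message instead of a JVM class-loading error.
        raise FileNotFoundError(jar)

    # The classpath must be configured before `jnius` is imported,
    # because importing `jnius` starts the JVM. Hence the deferred imports.
    import jnius_config
    if not jnius_config.vm_running:
        jnius_config.add_classpath(jar)

    from jnius import autoclass
    return autoclass(class_name)
```

With the jar in place, `load_java_class("java.util.ArrayList")` should return a proxy for a standard Java class, which is a quick way to confirm that the JVM starts. JedAI classes would be loaded the same way, by their fully qualified names.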
Consortium
JedAI is a collaboration project involving the following partners: