Document Images Classification

Developed by

License

GNU General Public License (GPL) v3

Main Characteristic

This is a computer vision tool that automates the classification of document images. Each image is currently assigned to one of seven classes. The solution uses a convolutional neural network pre-trained on the ImageNet dataset.

Technical Categories

Computer vision, Machine learning

Keywords

Deep Learning, Classification

Last updated

25.05.2021 - 15:19

Detailed Description

Document classification has traditionally followed two main approaches, used separately or together: 1) analysing the textual content to extract text-based features; and 2) designing a set of visual features to be extracted from the image of a document. These features were then used to train a machine learning model to classify the documents under analysis. Recent advances in deep learning have made it possible to learn both the features and the classifier directly from the data, improving the final performance over classical handcrafted solutions.

The tool assigns each document image to one of seven classes using a convolutional neural network pre-trained on the ImageNet dataset. The architecture is ResNet50, a very deep convolutional neural network that uses residual connections to make such deep networks easier to train.

Transfer learning is then applied to carry knowledge learnt in the domain of visual objects over to a new domain (here, document images). This is done by fine-tuning the pre-trained network on a dataset from the new domain, which is usually smaller. Fine-tuning requires replacing the existing classification layer with a new one so that the network fits the new task. The network is then trained on images from the new domain, so that the existing weights are slightly modified and yield a new set of features that better discriminates between the classes of the new task.
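The fine-tuning procedure described above can be sketched with the Keras API of TensorFlow. This is a minimal illustration under stated assumptions, not the tool's actual training code: the function names, the 224x224 input size, the two-stage schedule (new head first, then the whole network), the Adam optimizer, and the learning rates are all choices made here for illustration. Only ResNet50, ImageNet pre-training, the replaced classification layer, and the seven classes come from the description.

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 7  # the description states seven document classes


def preprocess(images):
    """Apply the ResNet50 ImageNet preprocessing to a batch of RGB images (0-255)."""
    return tf.keras.applications.resnet50.preprocess_input(
        np.asarray(images, dtype="float32"))


def build_document_classifier(num_classes=NUM_CLASSES, weights="imagenet",
                              input_shape=(224, 224, 3)):
    """Return (model, backbone): ResNet50 without its ImageNet head, plus a new head."""
    # include_top=False drops the original 1000-class ImageNet classification layer.
    backbone = tf.keras.applications.ResNet50(
        include_top=False, weights=weights,
        input_shape=input_shape, pooling="avg")
    backbone.trainable = False  # assumed stage 1: train only the new layer

    inputs = tf.keras.Input(shape=input_shape)
    # training=False keeps the batch-normalisation layers in inference mode.
    features = backbone(inputs, training=False)
    # New classification layer for the document classes.
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(features)
    model = tf.keras.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model, backbone


def unfreeze_for_fine_tuning(model, backbone, learning_rate=1e-5):
    """Assumed stage 2: unfreeze the backbone and recompile with a small learning rate,
    so the pre-trained weights are only slightly modified."""
    backbone.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In use, one would call `model.fit` on preprocessed document images with integer labels in the range 0 to 6 after each stage. The dataset itself is not part of this sketch and would have to be supplied by the user.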

Format: Protobuf (TensorFlow)

Trustworthy AI

No impact.

GDPR Requirements

No impact. Document Images Classification is not directly affected by the GDPR. However, since the tool is expected to handle media resources, the data provider should take care that the provided dataset complies with the GDPR.