Number of elements: 97,000,000
Overall size of the dataset: 1,000,000,000,000
Install & Run: see the "download instructions" document.
Additional information: All the deep features were extracted using the Caffe framework. In particular, we took the activations of the neurons in the fc6 layer of the Hybrid-CNN, whose model and weights are publicly available in the Caffe Model Zoo. The Hybrid-CNN was trained on 1,183 categories (205 scene categories from the Places Database and 978 object categories from the training data of ILSVRC2012 ImageNet) with ~3.6 million images. The architecture is the same as the Caffe reference network. More information can be found on the Places-CNN model webpage at MIT.
The features are reported in zipped text files. Each file, except the last one, contains the information about 1 million images. The images have been randomly ordered. Thus, any subset can be considered statistically representative of the whole dataset. Each line is related to one image and contains the following information separated by the tab character:
- 4096 float values
Please note that the features are the activations of the fc6 layer of the convolutional neural network before the ReLU, without any further processing (e.g. L2 normalization). You should avoid unzipping the archives to disk. We suggest reading the text files while decompressing them on the fly.
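Reading a file while decompressing it on the fly could look roughly like the sketch below. It rests on two assumptions that are not stated above: that the archives are gzip-compressed (another format would need the matching reader), and that the 4096 feature values are the last fields of each tab-separated line, with any leading fields kept as uninterpreted strings. The names parse_feature_line and iter_features are hypothetical helpers, not part of the dataset tools.

```python
import gzip

def parse_feature_line(line, dim=4096):
    """Split one tab-separated line into (leading fields, feature vector)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < dim:
        raise ValueError(f"expected at least {dim} fields, got {len(fields)}")
    # Assumption: the feature values are the last `dim` fields on the line.
    meta = fields[:-dim]
    features = [float(x) for x in fields[-dim:]]
    return meta, features

def iter_features(source, dim=4096):
    """Yield parsed lines from a compressed text file without unpacking it to disk."""
    # Assumption: gzip compression. `source` may be a path or a binary file object.
    with gzip.open(source, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                yield parse_feature_line(line, dim)
```

Because iter_features is a generator, only one line is held in memory at a time, so a file can be scanned or sampled without ever materializing the uncompressed text.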
Three different features can be obtained from the dataset:
- Raw features (this is the dataset that will be downloaded): 4096 floats corresponding to raw HybridNet fc6 activations
- ReLU-L2Norm features (apply ReLU and then L2 normalization to the raw features above): 4096 floats obtained after ReLU and L2 normalization; ReLU (Rectified Linear Unit) sets negative components to zero
- Binary features (set a bit to 0 if the raw value is <= 0, and to 1 if it is > 0): 4096 bits, that is, the size of 64 64-bit floats
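The two derived feature types can be computed from a raw vector along the lines of the sketch below. It is plain Python for illustration only. The function names are hypothetical, and the bit order used when packing the binary features (first component in the most significant bit) is an assumption, since the layout is not specified here.

```python
import math

def relu_l2norm(raw):
    """Apply ReLU, then scale the result to unit Euclidean length."""
    rectified = [x if x > 0.0 else 0.0 for x in raw]
    norm = math.sqrt(sum(x * x for x in rectified))
    # A vector with no positive component has zero norm; return it unscaled.
    return rectified if norm == 0.0 else [x / norm for x in rectified]

def binarize(raw):
    """Pack one bit per component (1 if raw value > 0, else 0) into bytes."""
    bits = 0
    for x in raw:
        bits = (bits << 1) | (1 if x > 0.0 else 0)
    # Assumption: the first component lands in the most significant bit.
    return bits.to_bytes((len(raw) + 7) // 8, "big")
```

For a 4096-dimensional vector, binarize returns 512 bytes, which matches the 4096 bits stated above.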
The dataset was split into subsets of incremental size:
- Steps of 1 million
- 97 total subsets
- Ordering of objects is random
Ground truths: Exact Similarity Search. Ground truths were built for the 3 types of features above:
- Raw features: Ground truth generated using Euclidean distance on L2-normalized raw vectors
- ReLU-L2Norm features: Ground truth generated using Euclidean distance on the vectors
- Binary features: Ground truth generated using Hamming distance
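The distance functions named in the list above can be sketched as follows. This is an illustrative stdlib-only version, not the code used to build the ground truths. The function names are hypothetical, and binary features are assumed to be stored as equal-length bytes objects.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0.0 else [x / norm for x in vec]

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def raw_distance(a, b):
    """Raw features: Euclidean distance on the L2-normalized raw vectors."""
    return euclidean(l2_normalize(a), l2_normalize(b))

def hamming(a, b):
    """Binary features: number of differing bits between two byte strings."""
    if len(a) != len(b):
        raise ValueError("binary features must have the same length")
    diff = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(diff).count("1")
```

For the ReLU-L2Norm features, euclidean can be applied directly, since those vectors are already normalized.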
The first 1,000 objects of the entire dataset are used as queries. There are 10,001 results per query at each intermediate step, from the first to the 97th subset, to allow scalability tests.
- Results include the query
- Obtained by performing an exhaustive sequential scan
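An exhaustive sequential scan of this kind can be sketched as below: every object in the subset is compared with the query and the k closest are kept. This is a minimal illustration under stated assumptions, not the procedure actually used. knn_scan is a hypothetical name, the dataset is assumed to be an in-memory sequence, and ties are broken by object index.

```python
import heapq
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_scan(query, dataset, k, distance=euclidean):
    """Return the k nearest (distance, index) pairs by scanning every object."""
    scored = ((distance(query, obj), idx) for idx, obj in enumerate(dataset))
    # nsmallest keeps only k candidates at a time while consuming the scan.
    return heapq.nsmallest(k, scored)
```

Because the queries are taken from the dataset itself, each query shows up in its own result list at distance 0, consistent with the note that results include the query.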
More information at www.deepfeatures.org