This grocery food dataset was collected within the project MANGO (Mobile Augmented Reality for Nutrition Guidance and Food Awareness). It contains 1719 videos covering 23 classes, which are subdivided into 98 subclasses. Details about the classes and the mobile phones used can be found below in Tables 1-3.
Due to the nature of the dataset and its comprehensive annotation, it is well suited for training and testing food recognition algorithms on raw grocery items (fruits, vegetables) and for hierarchical classification.
|6||currants||7||chanterelles||8||champignons & mushrooms||9||apples||10||tomatoes|
The 23 classes have been further split into 35 visually distinct classes; for example, "peppers" has been split into four classes (yellow, green, red and mixed). See Table 2 for a list of classes.
|5||blueberries||6||chanterelles||7||champignons||8||tomatoes on the vine|
|21||red currants||22||black currants||23||white currants||24||brown mushrooms|
|25||red apples||26||green apples||27||mixed tomatoes||28||beef tomatoes|
|29||Kumato tomatoes||30||iceberg salad||31||Lollo Rosso salad||32||yellow peppers|
|33||green peppers||34||red peppers||35||mixed peppers|
The videos were recorded in two SPAR grocery stores in HD (1920x1080 and 1280x720) using five mobile phones (Samsung Galaxy S2, Samsung Galaxy S3, Motorola Moto G, HTC One, LG Nexus 4). Table 3 lists the models with resolution and abbreviation used within the dataset.
|Samsung Galaxy S2||1920x1080||gxs2|
|Samsung Galaxy S3||1920x1080||gxs3|
|LG Nexus 4||1920x1080||nex4|
|Motorola Moto G||1280x720||motg|
The videos are named following a simple convention. When the name is split using "-" as the delimiter, the first number is the main class, one of the 23 food classes listed above. The second number is the subcategory (one of 98 in total), and the third is a sequential number counting recordings of the same food item made with the same mobile phone. These three numbers are followed by the abbreviation of the mobile phone as listed in Table 3. Some videos carry an additional tag, "i1" or "i2", which marks an intuitive recording, where the subjects were not instructed beforehand.
The second number is intended to be used for hierarchical computer vision methods, where a class is further split into subclasses.
Example: 2-7-2-htco-i1.mp4 denotes the 7th subclass of apricots and the 2nd recording of this class with the mobile phone HTC One.
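The naming convention can be decoded programmatically. The following is a minimal Python sketch, not part of the dataset tooling: the names `parse_video_name`, `VideoName` and `PHONES` are hypothetical, the ".mp4" extension and the abbreviation "htco" are taken from the example file name above, and the pattern assumes that every file follows the convention exactly.

```python
import re
from typing import NamedTuple, Optional

# Phone abbreviations. "htco" for the HTC One is an assumption based on
# the example file name; the other entries come from Table 3.
PHONES = {
    "gxs2": "Samsung Galaxy S2",
    "gxs3": "Samsung Galaxy S3",
    "nex4": "LG Nexus 4",
    "motg": "Motorola Moto G",
    "htco": "HTC One",
}


class VideoName(NamedTuple):
    main_class: int               # first number: one of the 23 food classes
    subclass: int                 # second number: subcategory
    recording: int                # third number: sequential recording number
    phone: str                    # phone abbreviation, e.g. "htco"
    intuitive_tag: Optional[str]  # "i1" or "i2" if present, else None


# Assumed layout: <class>-<subclass>-<recording>-<phone>[-i1|-i2].mp4
_PATTERN = re.compile(r"^(\d+)-(\d+)-(\d+)-([a-z0-9]+)(?:-(i[12]))?\.mp4$")


def parse_video_name(filename: str) -> VideoName:
    """Split a video file name into its labelled components."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"file name does not follow the convention: {filename!r}")
    main_class, subclass, recording, phone, tag = match.groups()
    return VideoName(int(main_class), int(subclass), int(recording), phone, tag)


if __name__ == "__main__":
    info = parse_video_name("2-7-2-htco-i1.mp4")
    print(info, PHONES.get(info.phone, "unknown phone"))
```

Keeping the main class and the subclass as separate fields makes it straightforward to build the two label levels needed for hierarchical classification.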
By downloading the database you agree to the following restrictions:
If you agree with the terms of the license agreement, contact Dušan Malić (dusan.malicnoSpam@tugraz.at) to obtain download instructions.
Please send the email from your official account so we can verify your affiliation and include your
This work was supported by the Austrian Research Promotion Agency (FFG) under the project Mobile Augmented Reality for Nutrition Guidance and Food Awareness (836488).