The goal for prototype development at UiB has been to
explore methods and techniques for multi-modal information retrieval (image, text & audio),
using multi-modal search criteria (image and GPS) submitted via a mobile phone.
The chosen application was support for tourists visiting the city of Bergen.
Three main prototypes have been developed for this application:
- The BergenBy database contains nearly 600 images of
over 50 landmark objects (buildings, statues, monuments and ships) within the Bergen city area.
Each object has been annotated with keywords, a descriptive text and an audio version of the text.
The image collection includes single- and multiple-object photographs,
historic photographs, and machine- and human-created drawings.
Search images are kept, manually annotated and added to the primary DB by the project's DB administrator.
The images have been taken or constructed by members of the caim/UiB team and are thus
copyrighted to this project.
The database structure and content are described in T. Møller's 2009 report
Bergen By - a Multi-Modal Image Database, and in E. Parmann's 2010 report
A Metadata Editor for BergenBy - a Multi-Modal Image Database.
- The MMIR - Mobile Multimedia Image Retrieval - prototype
runs on a NOKIA N5800 mobile phone, which features common camera functionality, GPS and
a touch-sensitive (sketch pad) screen that supports free-hand drawing.
The final MMIR prototype is documented in E. Parmann,
MMIR4 - Mobile Multimedia Image Retrieval (ver.4), 2010.
The implementation of image component selection and of the drawing feature is documented in
MMIR3 - Mobile Multimedia Image Retrieval (ver.3), 2009.
MMIR provides support for:
- Information search using:
a new camera image with GPS coordinates;
an image drawn on the phone's screen/sketch pad;
a selected area of the camera or drawn image; or
a (selection of a) search image from the phone's memory.
- Results are presented as sets of 6 captioned images, fitted to the N5800 phone display screen.
Text and/or audio information about the content of an image can be displayed by
Result sets can be stored on the phone for viewing and information retrieval
at a later time.
- The VISI prototype was initially developed in 2005 at UiB
as a content-based image retrieval (CBIR) system.
During the caim project, VISI has been extended in multiple iterations
to support the development of MMIR.
In its final version, VISI is a full multi-modal, context-aware
image-based information retrieval system.
- The initial prototype,
VISI/Maritime, supports image retrieval using uploaded images
or user-constructed drawings.
The Maritime database used contains 400+ images, predominantly of marine animals.
This version of VISI was used as the basis for a PhD thesis project,
Drawing visual query images, by Lars-Jacob Hove (2010).
The VISI3/BergenBy prototype extends basic CBIR search with
text (TBIR), GPS data and the combinations CBIR-TBIR and CBIR-GPS.
VISI3 and MMIR use the same BergenBy image collection.
This prototype is documented in C. Carlson,
VISI3 - Context Aware Image Retrieval, 2009.
- Information search using:
an existing DB or uploaded image;
an image drawn on a sketch pad;
an image + GPS location and/or keyword characteristic; or
an image from a preceding result set.
- The result presents each image with its caption.
Selection of an image displays text information about the object(s) it depicts,
as well as other images of this/these objects.
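As an illustration of how a CBIR-GPS combination of the kind described above could rank results, the following is a minimal sketch in Python. It is not taken from the VISI3 report: the weighted-sum scoring, the 0.5 km radius, the 0.4 GPS weight and all function and field names are assumptions made for this example. The visual similarity values are assumed to come from a separate CBIR step and to lie between 0 and 1.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def combined_score(visual_similarity, distance_km, radius_km=0.5, gps_weight=0.4):
    """Blend a visual similarity in [0, 1] with a GPS proximity term in [0, 1].

    The linear fall-off and the default weights are assumptions for illustration.
    """
    proximity = max(0.0, 1.0 - distance_km / radius_km)
    return (1.0 - gps_weight) * visual_similarity + gps_weight * proximity


def rank(candidates, query_lat, query_lon):
    """Return candidate captions ordered best first.

    Each candidate is a dict with the hypothetical keys
    'caption', 'similarity', 'lat' and 'lon'.
    """
    scored = []
    for c in candidates:
        d = haversine_km(query_lat, query_lon, c["lat"], c["lon"])
        scored.append((combined_score(c["similarity"], d), c["caption"]))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [caption for _, caption in scored]
```

With this scoring, a visually weaker match taken at the query position can outrank a visually stronger match several kilometres away, which is the intended effect of adding GPS as a search criterion.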
The VISI4/BergenBy prototype extends VISI3 functionality with support for context-aware feedback.
Results are now presented in pages of 8 captioned images, on which the user can mark the relevant ones.
On submission of the relevant images, keywords are presented to the user for further refinement of the result set.
VISI4 also includes an improved result ranking function for CBIR+TBIR searches.
This prototype is documented in Ø. Døskeland,
VISI4 - Supporting Relevance Feedback in Context Aware Image Retrieval, 2010.
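The feedback step described for VISI4 (mark relevant images, receive keywords, refine the result set) can be sketched as follows. This is a hypothetical illustration, not the method from the VISI4 report: choosing keywords by how many marked images carry them, the tie-breaking rule, and the names `suggest_keywords`, `refine` and the `keywords` field are all assumptions.

```python
from collections import Counter


def suggest_keywords(relevant_images, query_keywords=(), limit=5):
    """Suggest the keywords shared by the most marked images.

    Keywords already used in the query are skipped (an assumption of this sketch).
    """
    already_used = {k.lower() for k in query_keywords}
    counts = Counter()
    for image in relevant_images:
        # Count each keyword once per image so heavily annotated images do not dominate.
        for kw in {k.lower() for k in image["keywords"]}:
            if kw not in already_used:
                counts[kw] += 1
    # Order by descending count, then alphabetically, so the output is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [kw for kw, _ in ranked[:limit]]


def refine(result_set, chosen_keywords):
    """Keep only the results annotated with every keyword the user chose."""
    wanted = {k.lower() for k in chosen_keywords}
    return [img for img in result_set if wanted <= {k.lower() for k in img["keywords"]}]
```

A caller would pass the images the user marked on a result page to `suggest_keywords`, show the returned keywords, and then apply `refine` with whichever keywords the user selects.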
The Telenor prototypes are described in
Sigmund Akselsen, Bente Evjemo and Anders Schürmann. (2008)
CAIM project meeting, Tromsø 25.09.2008
Telenor R&I, Products and Markets
- M2S - Tourist information in multiple channels (images of info guide ads)
- TIFF - Tromsø International Film Festival event assistant
- Smart Binoculars - Points of Interest info in images
(camera position, direction and depth of field)
- Visual search client for iPhone
(general content based image recognition)
- TVG - Tromsø Visual Guide
(images of sculptures and buildings, position, ads)
Modified: 30.01.2012 © University of Bergen