


UiB 2006-2010:
    The goal of prototype development at UiB has been to explore methods and techniques for multi-modal information retrieval (image, text and audio), using multi-modal search criteria (image and GPS) submitted via a mobile phone. The chosen application provides support for tourists visiting the city of Bergen. Three main prototypes have been developed for this application:
    1. The BergenBy database contains nearly 600 images of over 50 landmark objects (buildings, statues, monuments and ships) within the Bergen city area. Each object has been annotated with keywords, a descriptive text and an audio version of the text. The image collection includes single- and multiple-object photographs, historic photographs, and machine- and human-created drawings. Search images are kept, manually annotated and added to the primary DB by the project's DB administrator. The images have been taken or constructed by members of the CAIM/UiB team and are thus copyrighted to this project.
    2. The database structure and content are described in T. Møller's 2009 report Bergen By - a Multi-Modal Image Database and in E. Parmann's 2010 report A Metadata Editor for BergenBy - a Multi-Modal Image Database.

    3. The MMIR - Mobile Multimedia Image Retrieval - prototype runs on a NOKIA N5800 mobile phone, which features common camera functionality, GPS and a touch-sensitive screen (sketch pad) that supports free-hand drawing. MMIR provides support for:
      • Information search using: a new camera image with GPS coordinates; an image drawn on the phone's screen/sketch pad; a selected area of the camera or drawn image; or a (selection of a) search image from the phone's memory.
      • Results are presented as sets of 6 captioned images, fitted to the N5800 phone display screen. Text and/or audio information about the content of an image can be displayed by selecting it. Result sets can be stored on the phone for viewing and information retrieval at a later time.
      The final MMIR prototype is documented in E. Parmann, MMIR4 - Mobile Multimedia Image Retrieval (ver. 4), 2010. Implementation of image component selection and the drawing feature is documented in M. Hellevang, MMIR3 - Mobile Multimedia Image Retrieval (ver. 3), 2009.
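
    One way to exploit the GPS coordinates attached to a camera image is to prune the candidate landmark set by distance before any image matching runs. The following is a minimal sketch of such a pre-filter; the function names and the lat/lon fields on the database objects are illustrative assumptions, not the actual MMIR implementation:

    ```python
    import math

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance in kilometres between two GPS positions."""
        r = 6371.0  # mean Earth radius in km
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp = math.radians(lat2 - lat1)
        dl = math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def filter_by_gps(objects, lat, lon, radius_km=0.5):
        """Keep only landmark objects within radius_km of the query position."""
        return [o for o in objects
                if haversine_km(lat, lon, o["lat"], o["lon"]) <= radius_km]
    ```

    With roughly 50 landmark objects, a linear scan like this is cheap; the payoff is that the expensive image-feature comparison then only runs against nearby candidates.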

    4. The VISI prototype was initially developed in 2005 at UiB as a content-based image retrieval (CBIR) system. During the CAIM project, VISI was extended in multiple iterations to support the development of MMIR. In its final version, VISI is a fully multi-modal, context-aware image-based information retrieval system.
    • The initial prototype, VISI/Maritime, supports image retrieval using uploaded images or user-constructed drawings. The Maritime database contains 400+ images, predominantly of marine animals. This version of VISI was used as a basis for a PhD thesis project, Drawing visual query images, by Lars-Jacob Hove (2010).
    • The VISI3/BergenBy prototype extends basic CBIR search with text (TBIR), GPS data and the combinations CBIR-TBIR and CBIR-GPS. It supports:
      • Information search using: an existing DB or uploaded image; an image drawn on a sketch pad; an image plus a GPS location and/or keyword characteristic; or an image from a preceding result set.
      • The result presents each image with its caption. Selecting an image displays text information about the object(s) it depicts, as well as other images of this/these objects.
      VISI3 and MMIR use the same BergenBy image collection. This prototype is documented in C. Carlson, VISI3 - Context Aware Image Retrieval, 2009.
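
    The CBIR-TBIR and CBIR-GPS combinations above imply fusing per-modality scores into a single ranking value. A common late-fusion approach is a weighted sum; the sketch below illustrates the idea under assumed names, weights and score ranges, and is not the actual VISI3 ranking function:

    ```python
    def combined_score(img_sim, txt_sim, gps_dist_km,
                       w_img=0.5, w_txt=0.3, w_gps=0.2, gps_scale_km=1.0):
        """Fuse modality scores into one ranking value in [0, 1].

        img_sim, txt_sim: similarities in [0, 1] from the CBIR and TBIR
        components. gps_dist_km: distance from the query position, mapped
        so that nearby objects score close to 1 and distant ones near 0.
        (Weights and scale are illustrative assumptions.)
        """
        gps_sim = 1.0 / (1.0 + gps_dist_km / gps_scale_km)
        return w_img * img_sim + w_txt * txt_sim + w_gps * gps_sim
    ```

    Candidates are then sorted by this score in descending order; dropping a modality from a query amounts to setting its weight to zero and renormalising the remaining weights.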

    • The VISI4/BergenBy prototype extends VISI3 functionality with support for context-aware relevance feedback. Results are now presented in pages of 8 captioned images, on which the user can mark relevant images. On submission of the relevant images, keywords are presented to the user for further refinement of the result set. VISI4 also includes an improved result-ranking function for CBIR+TBIR searches.
      This prototype is documented in Ø. Døskeland, VISI4 - Supporting Relevance Feedback in Context Aware Image Retrieval, 2010.
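
    Since every BergenBy object is annotated with keywords, refinement keywords after feedback can be derived from the annotations of the images the user marked as relevant. A minimal sketch of that idea (the data layout and function name are assumptions, not the VISI4 implementation):

    ```python
    from collections import Counter

    def suggest_keywords(marked_images, top_n=5):
        """Rank keywords by how often they annotate the user-marked images.

        marked_images: list of dicts, each with a "keywords" list taken from
        the annotations of an image the user marked as relevant.
        """
        counts = Counter(kw for img in marked_images for kw in img["keywords"])
        return [kw for kw, _ in counts.most_common(top_n)]
    ```

    Keywords shared by many marked images rise to the top, so the user is offered the terms most likely to narrow the result set toward what they actually found relevant.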

    Telenor 2007-2010:
      The Telenor prototypes are described in Sigmund Akselsen, Bente Evjemo and Anders Schürmann, CAIM Prototypes, CAIM project meeting, Tromsø, 25.09.2008, Telenor R&I, Products and Markets.

    • M2S - Tourist information in multiple channels (images of info guide ads)

    • TIFF - Tromsø International Film Festival event assistant (barcodes).

    • Smart Binoculars - Points of Interest info in images (camera position, direction and depth of field)

    • Visual search client for iPhone (general content based image recognition)

    • TVG - Tromsø Visual Guide (images of sculptures and buildings, position, ads)


Modified: 30.01.2012 © University of Bergen - by: Lars-Jacob Hove