SAMT 2006 tutorials present both postgraduate students and a generally interested audience with the latest research and progress in the field of multimedia and the Semantic Web, giving attendees the opportunity to gain deeper insight into the challenges of multimedia semantics and into the rapidly emerging applications that rely on multimedia understanding.
- , CWI, The Netherlands
- , National Technical University of Athens, Greece
Tutorial attendees must pay the SAMT 2006 tutorial registration fee, as well as the conference registration fee.
The semantic gap is referred to frequently in papers on image retrieval and multimedia information handling. However, whilst many authors have been happy to make reference to it, few have attempted to characterise the gap in any detail. This tutorial attempts to rectify this situation by characterising the semantic gap in image retrieval rather more specifically than hitherto. It summarises current attempts to begin to bridge the gap through developments in content-based techniques, the application of Semantic Web and knowledge technologies, and recent progress in automatic image annotation. The tutorial consists of presentations and demonstrations based partly on research in recent European and UK projects, and particularly on a project investigating the semantic gap, funded by the Arts and Humanities Research Council in the UK and involving the four presenters.
The tutorial aims to provide valuable insights for those involved in research and development on image or multimedia retrieval who wish to understand and address the concerns of real end-users and to exploit recent research results in the field. In particular, it offers practical insight into the problems of bridging the communication gap between the computer science/vision research community and the image management/practitioner community.
- , DFKI Saarbrucken, Germany
In the field of image and video processing, one often speaks of a "semantic gap" when it comes to annotating or indexing image and video material with high-level semantics solely on the basis of low-level features detected by automated image/video analysis. There is a need to add and merge semantics from the analysis of the other available modalities, such as speech and text.
The European Network of Excellence K-Space, which started in 2006, tackles the integration of semantics generated from the analysis of various modalities and media in a principled manner, under the umbrella of Semantic Web technologies and resources. Along the lines of this project's agenda, the tutorial presents to the (semantic) multimedia community to what extent Human Language Technology can contribute to this challenge.
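The idea of merging semantics from several modalities can be illustrated with a simple late-fusion sketch: each modality (visual analysis, speech transcription, associated text) produces its own confidence that a semantic label applies, and the scores are combined. The labels, scores, and weights below are illustrative assumptions, not part of the K-Space project itself.

```python
# Late-fusion sketch (illustrative): combine per-modality confidence
# scores for one semantic label into a single combined score via a
# weighted average. All numbers here are made-up example values.

def late_fusion(scores: dict, weights: dict) -> float:
    """Weighted average of modality confidences for one label."""
    total_weight = sum(weights[m] for m in scores)
    return sum(scores[m] * weights[m] for m in scores) / total_weight

# Hypothetical confidences that the label "interview" applies to a shot,
# as produced by three independent analysis components.
modality_scores = {"visual": 0.4, "speech": 0.9, "text": 0.7}
modality_weights = {"visual": 1.0, "speech": 2.0, "text": 1.0}

combined = late_fusion(modality_scores, modality_weights)
print(combined)  # combined confidence, roughly 0.725
```

Here the speech modality is weighted more heavily, pulling the combined score above the purely visual evidence; in practice such weights would be learned from annotated data rather than set by hand.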
- & Cees Snoek, University of Amsterdam, The Netherlands
Many solutions to image and video indexing can only be applied in narrow domains using specific concept detectors, e.g., "sunset" or "face". The use of multimodal indexing, advances in machine learning, and the availability of large, annotated information sources, e.g., the TRECVID benchmark, have paved the way to increasing lexicon size by orders of magnitude (around 100 concepts now, 1,000 within a few years). This brings video indexing within reach of research in ontology engineering, which creates and maintains large structured sets of shared concepts, typically 10,000 or more.
This tutorial lays the foundation for these exciting new horizons. It covers basic video analysis techniques, video indexing, connections to ontologies, and interactive access to the data.
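The lexicon-based indexing idea above can be sketched in a few lines: each video shot is represented by a low-level feature vector, and a "lexicon" maps concept names to detectors that score each shot. The detectors and feature values below are toy stand-ins chosen for illustration; real systems train detectors on annotated data such as TRECVID.

```python
# Illustrative sketch of lexicon-based video indexing. Each shot is a
# feature vector; the lexicon maps concept names to detector functions
# returning a confidence in [0, 1]. All names/values are assumptions.

from typing import Callable, Dict, List, Tuple

FeatureVector = List[float]
Detector = Callable[[FeatureVector], float]

def make_threshold_detector(dim: int, threshold: float) -> Detector:
    """Toy stand-in for a trained detector: reacts to one feature dimension."""
    def detector(features: FeatureVector) -> float:
        value = features[dim]
        return 1.0 if value > threshold else value / threshold
    return detector

# A tiny two-concept lexicon; real lexicons hold hundreds of detectors.
lexicon: Dict[str, Detector] = {
    "sunset": make_threshold_detector(dim=0, threshold=0.8),  # e.g. warm hues
    "face": make_threshold_detector(dim=1, threshold=0.5),    # e.g. skin tones
}

def index_shot(features: FeatureVector) -> Dict[str, float]:
    """Annotate one shot with a confidence score per lexicon concept."""
    return {concept: det(features) for concept, det in lexicon.items()}

def rank_shots(shots: Dict[str, FeatureVector],
               concept: str) -> List[Tuple[str, float]]:
    """Rank all shots by detector confidence for a query concept."""
    scored = [(sid, lexicon[concept](f)) for sid, f in shots.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

shots = {
    "shot_01": [0.9, 0.2],
    "shot_02": [0.3, 0.7],
    "shot_03": [0.6, 0.6],
}
print(rank_shots(shots, "sunset"))
# shot_01 ranks first: its first feature exceeds the "sunset" threshold
```

Scaling this scheme to thousands of concepts is exactly where ontology engineering comes in: the flat `lexicon` dictionary would be replaced by a structured, shared concept hierarchy.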