Image and Video Analysis

Tools

ViRaL is a content-based image search engine. It not only retrieves visually similar images, but also identifies where a photo was taken, suggests tags, and recognizes landmarks and points of interest. Its dataset consists of more than 2 million Flickr images from 40 cities around the world. The query image may be uploaded, fetched from a given URL, or chosen from the dataset. Try it online: http://viral.image.ntua.gr/

Hough Pyramid Matching (HPM) is a flexible spatial matching model that allows non-rigid motion and multiple matching surfaces or objects. It is fast enough to be used for geometric re-ranking in large-scale image retrieval. Binaries for experimental comparison with the approach are provided for Linux, along with documentation.
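
To illustrate the general idea of voting in transformation space, here is a minimal Python sketch. It is not the released HPM code: it assumes each tentative correspondence is reduced to a 2D translation only, and the function name `pyramid_scores` and the parameters `levels` and `finest_bin` are hypothetical. Each correspondence is scored by how many other correspondences fall into the same bin, with agreement found at coarser levels weighted less.

```python
from collections import defaultdict


def pyramid_scores(translations, levels=3, finest_bin=8.0):
    """Score correspondences by agreement in a pyramid of translation bins.

    translations: list of (dx, dy) offsets, one per tentative correspondence
                  (a simplifying assumption; a full model would also use
                  scale and rotation).
    Returns one score per correspondence.
    """
    scores = [0.0] * len(translations)
    seen = [0] * len(translations)  # neighbours already counted at finer levels

    for level in range(levels):
        size = finest_bin * (2 ** level)  # bin side doubles at each level
        bins = defaultdict(list)
        for i, (dx, dy) in enumerate(translations):
            bins[(int(dx // size), int(dy // size))].append(i)

        for members in bins.values():
            others = len(members) - 1
            for i in members:
                # Only neighbours that are new at this level contribute,
                # discounted by 2 ** level.
                scores[i] += (others - seen[i]) / (2 ** level)
                seen[i] = others
    return scores


if __name__ == "__main__":
    offsets = [(10, 10), (11, 12), (12, 9), (200, -50)]
    print(pyramid_scores(offsets))
```

Because scores are computed per correspondence rather than for one global transformation, several groups of mutually consistent correspondences (for example, two objects that moved differently) can each receive high scores, and an isolated correspondence receives none.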

The medial feature detector (MFD) is a generic detector of regions of arbitrary scale and shape in still images. The strongest regions are mostly blob-like and well enclosed by boundaries. It has been tested successfully in image matching and retrieval applications, with state-of-the-art performance and savings in computational and space requirements. Binary code is provided for Linux and Windows, along with documentation and examples.

The goal of this tool is to demonstrate the integration of several low- to high-level analysis algorithms toward semantic indexing of images. Different modules created by several research groups have been included and results are presented graphically in a unified way.

Annotator is an image annotation tool that supports semi-automatic annotation, based on the Viola and Jones object detection algorithm as implemented in OpenCV. Users can create, edit and delete annotations in any image with minimal effort. Annotations are stored in text files in OpenCV format.
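
As an illustration of reading such files, the Python sketch below assumes that "OpenCV format" refers to the description-file layout used by OpenCV's cascade training tools, where each line holds an image path, an object count, and then `x y width height` for every object. That mapping is an assumption, and the helper names `parse_annotation_line` and `format_annotation_line` are hypothetical.

```python
def parse_annotation_line(line):
    """Parse 'path N x1 y1 w1 h1 ... xN yN wN hN' into (path, boxes).

    Assumes the image path contains no whitespace.
    """
    parts = line.split()
    if len(parts) < 2:
        raise ValueError("expected at least a path and an object count")
    path, count = parts[0], int(parts[1])
    values = [int(v) for v in parts[2:]]
    if len(values) != 4 * count:
        raise ValueError("object count does not match number of coordinates")
    # Group the flat coordinate list into (x, y, width, height) tuples.
    boxes = [tuple(values[i:i + 4]) for i in range(0, len(values), 4)]
    return path, boxes


def format_annotation_line(path, boxes):
    """Inverse of parse_annotation_line."""
    fields = [path, str(len(boxes))]
    for box in boxes:
        fields.extend(str(v) for v in box)
    return " ".join(fields)


if __name__ == "__main__":
    path, boxes = parse_annotation_line("img/img1.jpg 2 100 200 50 50 50 30 25 25")
    print(path, boxes)
    print(format_annotation_line(path, boxes))
```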

The Visual Descriptor Applications facilitate the automated extraction (VDE) and matching (VDM) of MPEG-7 Visual Descriptors from images. All 8 descriptors supported by VDE can be extracted from whole images or from parts of them: depending on whether a binary mask file, a segmentation mask or a set of bounding box coordinates is supplied, the extraction mechanism computes the descriptors either for specific image regions or for the entire image. The output can be produced either in XML or in plain text. VDM matches the same 8 descriptors, taking as input the XML files generated by VDE.
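
The region selection described above can be sketched in Python as follows. This is not VDE itself and does not compute an MPEG-7 descriptor: a normalized intensity histogram stands in for a descriptor, the image is a plain list of rows of grayscale values, and the names `select_pixels` and `intensity_histogram` are hypothetical. The sketch only shows how a mask or a bounding box, when present, narrows the pixels a descriptor is computed from.

```python
def select_pixels(image, mask=None, bbox=None):
    """Return the pixel values a descriptor should be computed on.

    image: list of rows of grayscale values in [0, 255]
    mask:  optional list of rows of 0/1 flags, same shape as image
    bbox:  optional (x, y, width, height) in pixel coordinates
    With neither mask nor bbox, the entire image is used.
    """
    if mask is not None:
        return [image[r][c]
                for r in range(len(image))
                for c in range(len(image[r]))
                if mask[r][c]]
    if bbox is not None:
        x, y, w, h = bbox
        return [image[r][c]
                for r in range(y, y + h)
                for c in range(x, x + w)]
    return [v for row in image for v in row]


def intensity_histogram(pixels, bins=4, max_value=256):
    """Stand-in descriptor: normalized histogram of pixel intensities."""
    hist = [0.0] * bins
    for v in pixels:
        hist[v * bins // max_value] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


if __name__ == "__main__":
    image = [[0, 64], [128, 255]]
    print(intensity_histogram(select_pixels(image)))
    print(intensity_histogram(select_pixels(image, bbox=(0, 0, 2, 1))))
    print(intensity_histogram(select_pixels(image, mask=[[0, 0], [1, 1]])))
```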