Multimedia Databases (Basi di Dati multimediali)
The course presents approaches to managing data as information (images, text, media...).
9 credits: in-depth study activity (theoretical / experimental).
- Theoretical: individual in-depth study of topics covered in the course
- Experimental: experimental in-depth study of one course topic
Students may choose either the theoretical or the experimental option. Slides are in English, notes are in English.
Programme
- Introduction to multimedia and the technological challenges it poses.
- Models for representing multimedia data.
- Indexing of multimedia (multidimensional) data.
- Clustering as an alternative technique to indexing.
- Querying multimedia databases and handling imprecision.
- Relevance feedback (an algorithm that learns, from the user's reaction, corrective factors used to improve a query, whether the request is matched fully or partially).
- The Web as a pool of heterogeneous, multimedia data.
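As an illustration of the relevance feedback item above, here is a minimal sketch of one classic formulation, the Rocchio update. The notes do not say which algorithm the course uses, so the choice of Rocchio, the function name and the weights alpha, beta and gamma are all assumptions made for the example.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move a query vector toward results the user liked and away from the rest.

    alpha, beta and gamma are illustrative weights, not values from the course.
    """
    def centroid(vectors):
        # Mean vector of a group of results; all zeros if the group is empty.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    rel = centroid(relevant)
    non = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n for q, r, n in zip(query, rel, non)]


# Hypothetical 2-dimensional feature space.
query = [1.0, 0.0]
updated = rocchio(
    query,
    relevant=[[0.0, 1.0], [0.0, 1.0]],   # results the user marked as good
    non_relevant=[[1.0, 0.0]],           # results the user rejected
)
```

The corrective factors here are the two centroids: the updated query gains weight on the second feature, which the user's relevant results share, and loses some weight on the first.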
1. Introduction to multimedia
The definition of strategies / algorithms / data structures that allow optimal management of multimedia data, and the definition of the data themselves.
What is media?
A means of communicating information in a form that is as compact and immediate as possible. Examples are text (structured/unstructured), images, video, audio. Communication is obtained through the coexistence of multiple ways of representing media (e.g. video as image + audio).
Multimedia / Hypermedia
- A hypermedia document is one in which multiple media are present and accessible through an interactive interface (e.g. an online newspaper that employs hyperlinks).
- A multimedia document is also formed by multiple media, but is not necessarily accessible interactively (no structured navigation).
Heterogeneity (semantic / physical resource / interactive) is a key concept for multimedia / hypermedia management (see slides for sample applications).
Semantic Heterogeneity
The types of information (media) differ from each other along different dimensions, such as spatial, temporal, hierarchical.
Space is conceptually different from time, and hierarchies can be defined upon both. Furthermore, hierarchical representations which do not involve space/time might be employed.
- Modeling is a formal representation of an object which classifies its characteristic traits.
- Specifying is defining a request (query).
- Indexing is the process of assigning indexes to data for fast / consistent retrieval.
- Retrieval is obtaining the result of a query.
- Visualization methods allow the result to be displayed efficiently.
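The steps listed above can be sketched end to end on toy data. This is only an illustration: the file names, the 3-bin "color histogram" features, the bucket index keyed by dominant bin and the Euclidean ranking are all assumptions made for the example, not techniques prescribed by the course.

```python
import math

# Modeling: each object is reduced to a feature vector
# (hypothetical 3-bin color histograms: red, green, blue).
catalog = {
    "sunset.jpg": (0.7, 0.2, 0.1),
    "forest.jpg": (0.1, 0.8, 0.1),
    "ocean.jpg": (0.1, 0.2, 0.7),
    "meadow.jpg": (0.2, 0.7, 0.1),
}


def dominant_bin(features):
    """Position of the largest histogram bin."""
    return max(range(len(features)), key=lambda i: features[i])


# Indexing: group object names by dominant bin so that a query
# only has to inspect one bucket instead of the whole catalog.
index = {}
for name, features in catalog.items():
    index.setdefault(dominant_bin(features), []).append(name)


def retrieve(query_features, k=2):
    """Retrieval: rank the candidates in the matching bucket by distance."""
    candidates = index.get(dominant_bin(query_features), [])
    ranked = sorted(
        candidates, key=lambda name: math.dist(catalog[name], query_features)
    )
    return ranked[:k]


# Specifying: the query is itself a feature vector (query by example).
results = retrieve((0.12, 0.78, 0.10))

# Visualization: here just a ranked listing.
for rank, name in enumerate(results, start=1):
    print(rank, name)
```

The bucket index trades accuracy for speed: an object whose dominant bin differs from the query's is never examined, even if it is close overall.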
Media is usually context- and user-dependent, so subjectivity has to be kept in mind. Not everyone associates the same data with the same traits / characteristics, since media is subject to human interpretation. Context refers to the situation in which the user encounters the information, which might change their perspective on it. These issues derive from the missing definition of some terms (beautiful, old, high, etc.), whose meaning might change according to the person who is describing the information.
Availability at various quality levels is part of heterogeneity too.
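One way to make such user-dependent terms concrete is to model a term like "old" as a graded predicate whose thresholds vary per user. The sketch below assumes this fuzzy-style modeling; the function name and the thresholds are purely illustrative.

```python
def make_old_predicate(young_until, old_from):
    """Build a graded 'old' predicate for one user.

    Returns 0.0 up to young_until, 1.0 from old_from on, and a linear
    ramp in between. Both thresholds are hypothetical per-user settings.
    """
    def degree(age):
        if age <= young_until:
            return 0.0
        if age >= old_from:
            return 1.0
        return (age - young_until) / (old_from - young_until)

    return degree


# Two users who disagree on what "old" means.
teen_view = make_old_predicate(young_until=25, old_from=45)
retiree_view = make_old_predicate(young_until=50, old_from=80)

age = 40
print(teen_view(age), retiree_view(age))
```

The same object (age 40) matches the query "old" to a high degree for one user and not at all for the other, which is the subjectivity described above.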
Physical Heterogeneity
- Volume, which determines storage, delivery and processing requirements.
- A quality/cost tradeoff has to be determined.
This is important since it impacts processing times, analysis and organization of data. Operations on different data must be optimized for the objective of the process. Typical conflicting objectives are robustness (of transfer) vs degradation (graceful degradation). Graceful degradation aims to smooth the transition between low-quality, partially downloaded data and fully downloaded high-quality data.
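A minimal sketch of graceful degradation, assuming a coarse-to-fine transmission order on a one-dimensional signal: the receiver can display a rough version after very few samples and refines it as more arrive. The ordering scheme and the hold-last-value reconstruction are assumptions chosen for brevity, not a specific codec.

```python
def progressive_order(n):
    """Sample indices ordered coarse to fine: largest stride first."""
    stride = 1
    while stride < n:
        stride *= 2
    order, seen = [], set()
    while stride >= 1:
        for i in range(0, n, stride):
            if i not in seen:
                seen.add(i)
                order.append(i)
        stride //= 2
    return order


def reconstruct(received, n):
    """Fill missing samples by holding the last received value to the left.

    Index 0 is always sent first, so every position has a value to hold.
    """
    out, last = [], 0
    for i in range(n):
        if i in received:
            last = received[i]
        out.append(last)
    return out


signal = [10, 12, 14, 16, 18, 20, 22, 24]
order = progressive_order(len(signal))

# Partial download: only the first two samples have arrived.
partial = {i: signal[i] for i in order[:2]}
preview = reconstruct(partial, len(signal))

# Full download: every sample has arrived.
full = reconstruct({i: signal[i] for i in order}, len(signal))
```

With two samples the preview is a blocky approximation, and each further batch halves the block size until the full signal is recovered.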
Interactivity
- 100 ms interaction deadline (involves resource allocation and prefetching/caching to optimize download times)
- Subjectivity / personalization of content.
- Interaction structure along with spatial, hierarchical, temporal structures (meaning: how do we represent data?)
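The prefetching/caching idea mentioned for the interaction deadline can be sketched as a small least-recently-used cache that also loads the next item in advance. The class name, the capacity and the assumption of sequential access (page k is followed by page k + 1) are hypothetical; a real system would prefetch in the background rather than synchronously as done here.

```python
from collections import OrderedDict


class PrefetchingCache:
    """LRU cache that also loads the item expected to be requested next."""

    def __init__(self, fetch, capacity=4):
        self.fetch = fetch            # slow loader, e.g. a network request
        self.capacity = capacity
        self.items = OrderedDict()    # least recently used entry comes first

    def _load(self, key):
        if key in self.items:
            self.items.move_to_end(key)      # mark as recently used
            return
        self.items[key] = self.fetch(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used

    def get(self, key):
        hit = key in self.items
        self._load(key)
        value = self.items[key]
        self._load(key + 1)   # prefetch, assuming sequential access
        return value, hit


# Hypothetical loader standing in for a slow download.
cache = PrefetchingCache(lambda page: f"page-{page}")
first = cache.get(0)    # miss: page 0 had to be fetched on demand
second = cache.get(1)   # hit: page 1 was prefetched during the first call
```

Only the first request pays the full fetch cost; later sequential requests are served from the cache, which is what helps keep responses within the deadline.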