UniTO/anno2/YearI/SecondSem/BDM/lesson1-26022018.md
Francesco Mecca 5e286062f8 MCAD 2019
2018-11-22 13:09:11 +01:00

# Multimedia Databases
Goal: propose approaches to managing data as information (images, text, media...).
9 credits: includes an in-depth study activity (theoretical / experimental).
* Theoretical: individual in-depth study of topics covered in the course
* Experimental: experimental in-depth study of one topic of the course
Students can choose between the theoretical and the experimental option. Slides in English, notes in English.
## Syllabus
* Introduction to multimedia and the technological challenges it entails.
* Models for representing multimedia data.
* Indexing of multimedia (multidimensional) data.
* Clustering as an alternative technique to indexing.
* Querying multimedia databases and handling imprecision.
* Relevance feedback (an algorithm that learns, from the user's reaction, corrective factors used to improve a query, allowing for a total or **partial** match of the request).
* The Web as a pool of heterogeneous, multimedia data.
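
To make the relevance feedback item more concrete, here is a minimal sketch of one well-known scheme, the Rocchio update. The notes do not say which algorithm the course uses, so the choice of Rocchio, the function name `rocchio` and the weights `alpha`, `beta`, `gamma` are all assumptions made for illustration. Queries and results are assumed to be plain feature vectors.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query toward results the user liked and away from the others.

    All vectors are lists of floats with the same length. The weights are
    illustrative defaults, not values taken from the course.
    """
    def centroid(vectors):
        # Mean vector of a (possibly empty) list of vectors.
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    liked = centroid(relevant)
    disliked = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, liked, disliked)]


# The user marks two results as relevant and one as not relevant.
query = [1.0, 0.0]
updated = rocchio(query,
                  relevant=[[0.0, 1.0], [0.0, 1.0]],
                  non_relevant=[[1.0, 0.0]])
print(updated)
```

The updated query gains weight on the second feature, which only the relevant results had, so later rounds can return objects that match the original request only partially.
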
## 1. Introduction to multimedia
Defining strategies / algorithms / data structures that allow optimal management of multimedia data, and defining what multimedia data are.
### What is media?
A means of communicating information in a **compact / immediate** form (as compact as possible). Examples are *text (structured/unstructured), images, video, audio.*
Communication is achieved through the coexistence of multiple ways of representing media (video as image + audio).
### Multimedia / Hypermedia
* A hypermedia document is one in which multiple media are present, accessible through an interactive interface (e.g. an online newspaper which employs hyperlinks).
* A multimedia document is formed by multiple media as well BUT not necessarily accessible interactively (**no structured navigation**).
**Heterogeneity** (semantic/physical resource/interactive) is a keyword for multimedia / hypermedia management. *(see slides for sample applications)*.
#### Semantic Heterogeneity
Types of information (media) differ from each other along different **dimensions**, such as **spatial**, **temporal**, **hierarchical**.
Space is conceptually different from time, and hierarchies can be **defined upon them**. Furthermore, hierarchical representations which do not involve space/time might be employed. Key terms:
* **Modeling** is a formal representation of an object which classifies its characteristic traits.
* **Specifying** is defining a request (query).
* **Indexing** is the process of assigning indexes to data for fast / consistent retrieval.
* **Retrieval** is the result of a query.
* **Visualization methods** allow the result to be displayed efficiently.
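
The terms above can be sketched as one small pipeline. This is only an illustration under stated assumptions: the `MediaObject` class, the two-dimensional feature vectors, the grid cell width `CELL` and the file names are all hypothetical. A uniform grid stands in for a real multidimensional index, so retrieval is approximate: only the query's cell and its neighbouring cells are inspected.

```python
import math
from dataclasses import dataclass
from itertools import product

CELL = 0.5  # grid cell width, an arbitrary illustrative choice


@dataclass(frozen=True)
class MediaObject:
    # Modeling: an object is represented by a vector of characteristic traits.
    name: str
    features: tuple


def cell_of(features):
    # Map a feature vector to the grid cell that contains it.
    return tuple(math.floor(f / CELL) for f in features)


def build_index(objects):
    # Indexing: group objects by grid cell so a query inspects few of them.
    index = {}
    for obj in objects:
        index.setdefault(cell_of(obj.features), []).append(obj)
    return index


def retrieve(index, query, k=2):
    # Retrieval: collect candidates from the query's cell and its neighbours,
    # then rank them by Euclidean distance to the query vector.
    home = cell_of(query)
    candidates = []
    for offset in product((-1, 0, 1), repeat=len(home)):
        key = tuple(h + o for h, o in zip(home, offset))
        candidates.extend(index.get(key, []))
    candidates.sort(key=lambda obj: math.dist(obj.features, query))
    return candidates[:k]


catalog = [
    MediaObject("sunset.jpg", (0.9, 0.2)),
    MediaObject("forest.jpg", (0.1, 0.8)),
    MediaObject("beach.jpg", (0.8, 0.3)),
]
index = build_index(catalog)

# Specifying: the request is itself a feature vector ("find similar objects").
results = retrieve(index, query=(0.88, 0.22))

# Visualization: here simply a ranked list of names.
print([obj.name for obj in results])
```
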
Media is usually **context and user dependent**, so **subjectivity** has to be kept in mind. Not everyone associates the same data with the same traits / characteristics, because media is subject to human interpretation. Context refers to the situation in which the user encounters the information, which might change their perspective on it. These issues derive from the (missing) definition of some terms (beautiful, old, high, etc.), whose meaning might change according to the person describing the information.
Availability at various **quality levels** is part of heterogeneity too.
#### Physical Heterogeneity
* **Volume**, which determines **storage, delivery and processing** requirements.
* **Quality/Cost tradeoff** has to be determined.
This is important since it impacts processing times, analysis and organization of data. Operations on different data must be optimized for the objective of the process. Typical conflicting objectives are **robustness (of transfer) vs degradation (graceful degradation)**. Graceful degradation aims to smooth the transition between low quality, *partially downloaded* data and fully downloaded, high quality data.
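
One way to picture graceful degradation is progressive refinement: the data is split into layers, a coarse layer is sent first, and each later layer adds detail. The sketch below is an assumption for illustration, not a scheme taken from the course. The layering by subsampling and the helper names `split_into_layers` and `reconstruct` are hypothetical.

```python
def split_into_layers(samples, levels=3):
    """Layer 0 keeps every 2**(levels-1)-th sample; each later layer adds
    the samples halfway between those already sent."""
    layers = []
    sent = set()
    for level in range(levels):
        step = 2 ** (levels - 1 - level)
        layer = {i: samples[i]
                 for i in range(0, len(samples), step)
                 if i not in sent}
        sent.update(layer)
        layers.append(layer)
    return layers


def reconstruct(layers_received, length):
    """Rebuild a signal from whatever layers have arrived so far, repeating
    the last known sample where detail is still missing."""
    known = {}
    for layer in layers_received:
        known.update(layer)
    output = []
    last = 0  # index 0 is always in layer 0, so this default is overwritten
    for i in range(length):
        if i in known:
            last = known[i]
        output.append(last)
    return output


samples = [10, 12, 14, 16, 18, 20, 22, 24]
layers = split_into_layers(samples)

coarse = reconstruct(layers[:1], len(samples))   # only the first layer arrived
medium = reconstruct(layers[:2], len(samples))   # two layers arrived
full = reconstruct(layers, len(samples))         # download complete
print(coarse, medium, full, sep="\n")
```

A usable approximation exists after the first layer, and quality improves step by step rather than jumping from nothing to the full signal.
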
#### Interactivity
* 100ms interaction deadline (involves resource allocation, prefetching/caching to optimize download times)
* Subjectivity / personalization of content.
* Interaction structure along with spatial, hierarchical, temporal structures (meaning: how do we represent data?)
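
The prefetching/caching idea from the first point can be sketched as a small cache that, whenever an item is requested, also loads the items the user is likely to open next. This is a minimal sketch under assumptions: the `PrefetchingCache` class, the `links` navigation map and the sample pages are hypothetical. Prefetching runs synchronously here to keep the example short, whereas a real system would do it in the background to stay within the interaction deadline.

```python
from collections import OrderedDict


class PrefetchingCache:
    def __init__(self, fetch, links, capacity=4):
        self.fetch = fetch          # hypothetical slow loader: key -> content
        self.links = links          # hypothetical map: key -> likely next keys
        self.capacity = capacity    # must be at least 1
        self.store = OrderedDict()  # least recently used entry comes first
        self.misses = 0             # requests the user actually had to wait for

    def _load(self, key):
        if key in self.store:
            self.store.move_to_end(key)         # mark as recently used
        else:
            self.store[key] = self.fetch(key)
            if len(self.store) > self.capacity:
                self.store.popitem(last=False)  # evict least recently used
        return self.store[key]

    def get(self, key):
        if key not in self.store:
            self.misses += 1
        content = self._load(key)
        for next_key in self.links.get(key, []):
            self._load(next_key)                # prefetch likely next items
        return content


pages = {"home": "front page", "sports": "sports news", "tech": "tech news"}
links = {"home": ["sports", "tech"]}
cache = PrefetchingCache(fetch=pages.__getitem__, links=links)

cache.get("home")     # miss: the user waits, linked pages are prefetched
cache.get("sports")   # hit: already in the cache
print(cache.misses)
```

The `links` map is where the interaction structure comes in: knowing how documents are connected tells the system what is worth prefetching.
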