|Thesis type||PhD thesis|
|Author||DANNAOUI, ABDUL RAHMAN|
|Title||Integrazione dell'informazione per sorgenti dati biologici|
|Title in English||Information Integration for biological data sources|
|Scientific disciplinary sector||ING-INF/06 - Electronic and Information Bioengineering|
|Degree programme||Doctoral School in INFORMATION AND COMMUNICATION TECHNOLOGIES (ICT)|
|Defence session start date||2013-03-11|
|Availability||Accessible via web (all thesis files are accessible)|
Data integration is the process of combining data residing in different sources in order to offer the end user a unified view of all the available information. Data provenance is the process of identifying where data came from, how it was derived, and how it was updated over time. This PhD research activity on information integration for biological data sources was funded by the SITEIA project and focused on improving and extending the CEREALAB database. The CEREALAB database is a web-based tool built to help cereal breeders choose molecular markers associated with economically important phenotypic traits; it contains phenotypic and genotypic data obtained by integrating available open source databases with the data produced by the CEREALAB project. Information integration in the CEREALAB database is performed with the MOMIS (Mediator Environment for Multiple Information Sources) system, developed by the DBGroup of the University of Modena and Reggio Emilia.
Because the CEREALAB database is widely used, several extensions and improvements were introduced; they fall into two categories. The first category concerns the database and its tools: the database content was extended to offer breeders significant new data; a new breeder-friendly Graphical User Interface (GUI) was developed to improve and simplify access to the database; new functionalities and additional tools were built to make the available information as accessible as possible; and a new data entry module was implemented.
The second category concerns data provenance, which was introduced and partially implemented in the context of the CEREALAB database in order to meet end-user needs. Data provenance is an open research problem; it is particularly needed in data integration systems, where information coming from different sources, potentially uncertain or even mutually inconsistent, is integrated. In this context, being able to trace the lineage of specific data can help identify unexpected or questionable results.
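As an illustration of why lineage matters in an integrated view, the following minimal Python sketch merges records from two sources and records, for each attribute, which sources agree on the value and which disagree. It is not the MOMIS or CEREALAB implementation; the names `IntegratedValue`, `integrate`, `public_db`, `project_db` and the sample marker data are all hypothetical, and the "first source wins" rule is an assumption made only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class IntegratedValue:
    """One attribute of the unified view, together with its lineage."""
    value: str
    sources: list = field(default_factory=list)    # sources that supplied `value`
    conflicts: dict = field(default_factory=dict)  # source name -> differing value


def integrate(records_by_source):
    """Merge {source: {key: {attribute: value}}} into one view, keeping lineage.

    Assumption: the first source that supplies an attribute sets its value;
    later sources are recorded as agreeing or as conflicting.
    """
    unified = {}
    for source, records in records_by_source.items():
        for key, attributes in records.items():
            row = unified.setdefault(key, {})
            for name, value in attributes.items():
                if name not in row:
                    row[name] = IntegratedValue(value, [source])
                elif row[name].value == value:
                    row[name].sources.append(source)
                else:
                    row[name].conflicts[source] = value
    return unified


# Hypothetical data: two sources describing the same marker.
sources = {
    "public_db": {"marker_1": {"trait": "disease resistance", "chromosome": "2H"}},
    "project_db": {"marker_1": {"trait": "disease resistance", "chromosome": "3H"}},
}
view = integrate(sources)

# A questionable value can be traced back to the sources that disagree on it.
for name, item in view["marker_1"].items():
    print(name, item.value, item.sources, item.conflicts)
```

Keeping the conflicting values next to the chosen one, instead of discarding them, is what lets a user trace an unexpected result back to the source that produced it.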