This thesis proposal is taken from Nationella Exjobb-poolen.
Integration of data sources in bioinformatics: Use of knowledge bases for query rewriting
The thesis is part of a larger project that builds a system enabling transparent access to multiple heterogeneous biological data sources (*). The user of the system does not need to know about the integrated data sources. She formulates a query in a uniform query language, using terms of the mediated schema that uniformly describes the content of the underlying data sources. The system then performs query processing: it reformulates the user query, expressed over the mediated schema, into queries over the relevant data sources, creates a query plan that specifies how those queries should be executed, executes the plan, and returns the retrieved results to the user.
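The processing steps described above (reformulation, planning, execution) can be illustrated with a minimal Java sketch. It is not the project's actual design: the class `MediatorSketch`, the `PlanStep` type, the source names `sourceA` and `sourceB`, the source-specific query strings and the stub results are all hypothetical, and the data sources are replaced by in-memory stubs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class MediatorSketch {

    /** One step of a query plan: which source to ask, and the source-specific query. */
    static final class PlanStep {
        final String source;
        final String sourceQuery;

        PlanStep(String source, String sourceQuery) {
            this.source = source;
            this.sourceQuery = sourceQuery;
        }
    }

    // Hypothetical mappings from mediated-schema terms to source-specific queries.
    static final Map<String, List<PlanStep>> MAPPINGS = Map.of(
            "protein", List.of(
                    new PlanStep("sourceA", "SELECT name FROM proteins"),
                    new PlanStep("sourceB", "type=protein")));

    // Stub sources standing in for real databases, text files, or applications.
    // Each stub ignores the query text and returns fixed placeholder entries.
    static final Map<String, Function<String, List<String>>> STUB_SOURCES = Map.of(
            "sourceA", query -> List.of("A:entry1", "A:entry2"),
            "sourceB", query -> List.of("B:entry1"));

    // Reformulation and planning: look up which sources can answer the term
    // and return the steps in the order in which they should be executed.
    static List<PlanStep> plan(String mediatedTerm) {
        return MAPPINGS.getOrDefault(mediatedTerm, List.of());
    }

    // Execution: run every step against its source and merge the results.
    static List<String> execute(List<PlanStep> steps,
                                Map<String, Function<String, List<String>>> sources) {
        List<String> results = new ArrayList<>();
        for (PlanStep step : steps) {
            Function<String, List<String>> source = sources.get(step.source);
            if (source == null) {
                throw new IllegalStateException("No such source: " + step.source);
            }
            results.addAll(source.apply(step.sourceQuery));
        }
        return results;
    }

    public static void main(String[] args) {
        // The user only names a mediated-schema term; the sources stay hidden.
        System.out.println(execute(plan("protein"), STUB_SOURCES));
    }
}
```

In this sketch the user-facing call mentions only the mediated term `protein`; which sources are contacted, and with which queries, is decided entirely by the mappings.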
The focus of the thesis is how domain knowledge (knowledge specific to the application area) can be used to expand user queries so that they find a larger set of relevant results, and to rewrite the user queries into queries over the data sources. For example, it is known that enzymes are proteins. Based on this knowledge, a user query searching for proteins will be extended to also search for enzymes. To this end, the student will model the domain knowledge and describe the content of the data sources in OWL, use a reasoning system to infer the relevant knowledge, analyze the user queries and extend them where possible, and rewrite them into queries over multiple data sources. To select the reasoning system, a few systems will need to be studied and compared. Java will be used as the implementation language.
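The enzyme example can be made concrete with a small Java sketch of query-term expansion. It rests on assumptions that are not part of the proposal: the subclass relation is written by hand in a map rather than inferred from an OWL model by a reasoner, and the class `QueryExpansionSketch`, the method `expand` and the extra term `kinase` (a kind of enzyme) are purely illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class QueryExpansionSketch {

    // Hypothetical, hand-written subclass relation: term -> direct subterms.
    // In the thesis, this knowledge would come from a reasoner over an OWL model.
    static final Map<String, List<String>> DIRECT_SUBTERMS = Map.of(
            "protein", List.of("enzyme"),
            "enzyme", List.of("kinase"));

    // Returns the term itself plus all of its direct and indirect subterms,
    // found by a breadth-first walk over the subclass relation.
    static Set<String> expand(String term, Map<String, List<String>> directSubterms) {
        Set<String> expanded = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(term);
        while (!pending.isEmpty()) {
            String current = pending.remove();
            // Set.add returns false for terms already seen, which guards against cycles.
            if (expanded.add(current)) {
                pending.addAll(directSubterms.getOrDefault(current, List.of()));
            }
        }
        return expanded;
    }

    public static void main(String[] args) {
        // A query for proteins is widened to cover enzymes and, transitively, kinases.
        System.out.println(expand("protein", DIRECT_SUBTERMS)); // prints [protein, enzyme, kinase]
    }
}
```

Expansion only moves downward in the hierarchy: a query for `enzyme` is not widened to `protein`, since not every protein is an enzyme.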
The student is expected to have a background in, or an interest in learning, knowledge representation languages (Description Logics and OWL). Basic knowledge of biology would be helpful but is not necessary.
(*) "Data sources" refers to different kinds of sources of data, e.g. databases, text files storing semistructured information, and applications.
BioTrifu. http://www.ida.liu.se/~patla/research/ceniit.html
OWL. Web Ontology Language. http://www.w3.org/2004/OWL