Blott, Stephen, Camous, Fabrice, Gurrin, Cathal ORCID: 0000-0003-4395-7702 and Jones, Gareth J.F. ORCID: 0000-0003-2923-8365 (2005) On the use of clustering and the MeSH controlled vocabulary to improve MEDLINE abstract search. In: the Second CORIA (Conference en Recherche d'Informations et Applications), March 2005, Grenoble, France.
Abstract
Databases of genomic documents contain substantial amounts of structured information in addition to the text of titles and abstracts. Unstructured information retrieval techniques fail to take advantage of this structured information. This paper describes a technique to improve upon traditional retrieval methods by clustering the retrieval result set into two distinct clusters using the additional structural information. Our hypothesis is that the relevant documents are to be found in the tighter of the two clusters, as suggested by van Rijsbergen's cluster hypothesis. We present an experimental evaluation of these ideas based on the relevance judgments of the 2004 TREC workshop Genomics track and the CLUTO clustering software package.
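The idea summarised in the abstract (split a result set into two clusters, then keep the tighter one) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses the CLUTO package and MeSH-based structural information, whereas the sketch below assumes each retrieved document is already represented as a plain numeric feature vector, uses a simple two-means procedure with cosine similarity, and measures tightness as the average similarity of members to their cluster centroid. All function names and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def two_way_cluster(vectors, iterations=20):
    """Assign each vector to one of two clusters (labels 0 and 1).

    Assumption: deterministic seeding with the first vector and the
    vector least similar to it, followed by standard k-means updates.
    """
    seed_a = vectors[0]
    seed_b = min(vectors, key=lambda v: cosine(seed_a, v))
    centres = [seed_a, seed_b]
    labels = None
    for _ in range(iterations):
        new_labels = [
            0 if cosine(v, centres[0]) >= cosine(v, centres[1]) else 1
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centres[k] = centroid(members)
    return labels


def tightness(vectors):
    """Average cosine similarity of the members to their centroid."""
    centre = centroid(vectors)
    return sum(cosine(v, centre) for v in vectors) / len(vectors)


def tighter_cluster(doc_ids, vectors):
    """Return the ids of the documents in the tighter of the two clusters."""
    labels = two_way_cluster(vectors)
    clusters = {0: [], 1: []}
    for doc_id, vec, lab in zip(doc_ids, vectors, labels):
        clusters[lab].append((doc_id, vec))
    non_empty = [members for members in clusters.values() if members]
    best = max(non_empty, key=lambda ms: tightness([vec for _, vec in ms]))
    return [doc_id for doc_id, _ in best]


# Toy result set: three near-identical vectors and two looser ones.
ids = ["d1", "d2", "d3", "d4", "d5"]
vecs = [[1.0, 0.0], [0.98, 0.05], [0.95, 0.1], [0.0, 1.0], [0.4, 1.0]]
selected = tighter_cluster(ids, vecs)
```

Under the cluster hypothesis as stated in the abstract, the documents in `selected` would be the candidates expected to be relevant; how well that holds depends on the document representation and similarity measure, which are placeholders here.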
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Genomic information retrieval; clustering; ontology; tree similarity measure |
Subjects: | Computer Science > Information retrieval |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16205 |
Deposited On: | 09 Jun 2011 08:24 by Shane Harper. Last Modified 25 Oct 2018 13:15 |
Documents
Full text available as: PDF (233kB)