Ganguly, Debasis ORCID: 0000-0003-0050-7138, Leveling, Johannes ORCID: 0000-0003-0603-4191 and Jones, Gareth J.F. ORCID: 0000-0003-2923-8365 (2010) Exploring sentence level query expansion in language modeling based information retrieval. In: the 8th International Conference on Natural Language Processing ICON 2010, 8-11 Dec. 2010, Kharagpur, India.
Abstract
We introduce two novel methods for query expansion in information retrieval (IR). The basis of these methods is to add the most similar sentences extracted from pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query; the second adds a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major findings of this study are: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher mean average precision (MAP) than extracting terms from full documents (up to 5.9% improvement in MAP for English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP for different parameter settings; iv) query expansion based on sentences can improve performance even for topics with low initial retrieval precision, where standard BRF fails.
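The two expansion strategies described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how sentence similarity is computed or how the number of sentences decreases, so the cosine similarity over term counts, the naive regex tokenizer and sentence splitter, the linear schedule `max(1, m - rank)`, and all function names here are assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Naive word tokenizer (assumption); a real system would use
    # language-specific tokenization and stemming.
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors (assumed measure).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(doc):
    # Naive splitter on sentence-final punctuation (assumption).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def top_sentences(query, doc, k):
    # Return up to k sentences of doc most similar to the query,
    # skipping sentences that share no terms with it.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(s))), s) for s in split_sentences(doc)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for score, s in scored[:k] if score > 0]


def expand_query(query, ranked_docs, m, variable=False):
    # ranked_docs: texts of the top-ranked (pseudo-relevant) documents,
    # best first. Fixed mode takes m sentences from every document;
    # variable mode takes fewer sentences from lower-ranked documents
    # (hypothetical linear schedule).
    expansion = []
    for rank, doc in enumerate(ranked_docs):
        k = max(1, m - rank) if variable else m
        expansion.extend(top_sentences(query, doc, k))
    return " ".join([query] + expansion)
```

Calling `expand_query(query, docs, m)` corresponds to the fixed-number method, and `expand_query(query, docs, m, variable=True)` to the decreasing-number method, under the assumptions stated above. The expanded query would then be submitted to the retrieval system for a second retrieval pass.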
Metadata
Item Type: | Conference or Workshop Item (Paper) |
---|---|
Event Type: | Conference |
Refereed: | Yes |
Uncontrolled Keywords: | Blind Relevance Feedback; BRF; query expansion |
Subjects: | Computer Science > Information retrieval |
DCU Faculties and Centres: | Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL); Research Initiatives and Centres > National Centre for Language Technology (NCLT); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Use License: | This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License. |
ID Code: | 16038 |
Deposited On: | 17 Jun 2011 13:20 by Shane Harper. Last Modified: 25 Oct 2018 10:44 |
Documents
Full text available as: PDF (194kB)