DORAS | DCU Research Repository

Explore open access research and scholarly works from DCU

United we fall, divided we stand: A study of query segmentation and PRF for patent prior art search

Ganguly, Debasis (ORCID: 0000-0003-0050-7138), Leveling, Johannes (ORCID: 0000-0003-0603-4191) and Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365) (2011) United we fall, divided we stand: A study of query segmentation and PRF for patent prior art search. In: 4th International Workshop on Patent Information Retrieval (PaIR'11) at CIKM 2011, 24 Oct 2011, Glasgow, Scotland.

Abstract
Previous research in patent search has shown that reducing queries by extracting a few key terms is ineffective, primarily because of the vocabulary mismatch between patent applications used as queries and existing patent documents. This finding has led to the use of full patent applications as queries in patent prior art search. In addition, standard information retrieval (IR) techniques such as query expansion (QE) do not work effectively with patent queries, principally because of the presence of noise terms in the massive queries. In this study, we take a new approach to QE for patent search. Text segmentation is used to decompose a patent query into self-coherent sub-topic blocks. Each of these much shorter sub-topic blocks, which is representative of a specific aspect or facet of the invention, is then used as a query to retrieve documents. Documents retrieved using the different resulting sub-queries or query streams are interleaved to construct a final ranked list. This technique can exploit the potential benefit of QE, since the segmented queries are generally more focused and less ambiguous than the full patent query. Experiments on the CLEF-2010 IP prior-art search task show that the proposed method outperforms retrieval using a single full patent application text as the query, and also demonstrate the potential benefits of QE to alleviate the vocabulary mismatch problem in patent search.
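The interleaving step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which interleaving scheme the paper uses, so round-robin merging with duplicate removal is an assumption here, and the function name `interleave_round_robin` and the document IDs are hypothetical. Each input list stands for the ranked results of one sub-query obtained from a segment of the patent application.

```python
from itertools import zip_longest


def interleave_round_robin(ranked_lists, limit=None):
    """Merge ranked lists by visiting rank 1 of every list, then rank 2, and so on.

    Assumption: round-robin order; the abstract only says results are
    "interleaved". A document already emitted is skipped, not replaced.
    Document IDs are assumed never to be None.
    """
    merged, seen = [], set()
    # zip_longest pads shorter lists with None so longer lists are fully used.
    for tier in zip_longest(*ranked_lists):
        for doc_id in tier:
            if doc_id is None or doc_id in seen:
                continue
            seen.add(doc_id)
            merged.append(doc_id)
            if limit is not None and len(merged) >= limit:
                return merged
    return merged


# Hypothetical results for three sub-queries of one patent query.
sub_query_results = [
    ["d1", "d2", "d3"],
    ["d2", "d4"],
    ["d5"],
]
final_ranking = interleave_round_robin(sub_query_results)
```

Passing `limit` truncates the merged list, for example to the cut-off required by an evaluation task. Other merging strategies, such as score-based fusion, would fit the same interface.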
Metadata
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Refereed: Yes
Uncontrolled Keywords: Query segmentation; query expansion; pseudo relevance feedback; patent prior art search
Subjects: Computer Science > Information retrieval
DCU Faculties and Centres: Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16515
Deposited On: 27 Oct 2011 09:58 by Shane Harper. Last Modified 25 Oct 2018 10:26
Documents

Full text available as:

United_we_fall,_Divided_we_stand_A_Study_of_Query_Segmentation_and_PRF_for_Patent_Prior_Art_Search.pdf (PDF, 226kB)