Relevance Measures for XML Information Retrieval

Olli Luoma, Relevance Measures for XML Information Retrieval. International Journal of Web and Grid Services 3(2), 170–193, 2007.

Abstract:

In recent years, a lot of work has been carried out to develop efficient methods for storing and querying XML data. Most of the proposals have approached the subject from the database point of view, i.e., they have primarily aimed at providing exact matching capabilities. The problem can, however, also be addressed as an information-retrieval problem, which obviously introduces some challenges, such as the need for relevance ranking. The vast majority of the previous proposals have based the ranking primarily on content and, furthermore, if structural properties were taken into account, only containment relationships have been considered. In this paper, we focus on ranking the results based on their structural properties and aim at supporting a wide range of structural operations, such as operations based on preceding/following relationships. Our method is based on a fuzzy interpretation of the XPath query language which is also discussed in this paper. Finally, we discuss a relational implementation of our model and present the results of our experiments.

BibTeX entry:

@ARTICLE{jLuoma07a,
  title = {Relevance Measures for XML Information Retrieval},
  author = {Luoma, Olli},
  journal = {International Journal of Web and Grid Services},
  volume = {3},
  number = {2},
  pages = {170--193},
  year = {2007},
  keywords = {information retrieval, semistructured documents, XML, relevance ranking},
}

Belongs to TUCS Research Unit(s): Algorithmics and Computational Intelligence Group (ACI)

Publication Forum rating of this publication: level 1
