Journal article
A comparison of two-poisson, inverse document frequency and discrimination value models of document representation
Information processing & management, Vol.26(2), pp.269-278
1990
DOI: 10.1016/0306-4573(90)90030-6
Abstract
In this paper we present a comparison of the Two-Poisson (2P), Inverse Document Frequency (IDF) and Discrimination Value (DV) models of document representation. The first objective of the study was to understand the nature of the relationship between the term property underlying the 2P model and the other statistical properties of index terms that we know about: discrimination value and inverse document frequency. The second objective was to compare the properties with respect to the feature of ultimate interest, i.e., retrieval effectiveness. The study showed that 2P and IDF properties work in parallel, while 2P and DV have a negative relationship. An explanation for this negative correlation was given by viewing the distribution of inter-document dissimilarities. In the retrieval experiment most of the 2P strategies tested achieved the same performance level as the DV and IDF strategies. The important conclusion made from this study is that despite the fact that the 2P model is extremely selective in its choice of indexing vocabulary for a database, it still performs with the same effectiveness as the more traditional models. The overall contribution of this work is in the area of understanding features that influence the indexing potential of terms.
Details
- Title
- A comparison of two-poisson, inverse document frequency and discrimination value models of document representation
- Creators
- Padmini Srinivasan - University of Iowa, Iowa City, IA 52242, U.S.A.
- Resource Type
- Journal article
- Publication Details
- Information processing & management, Vol.26(2), pp.269-278
- Publisher
- Elsevier Ltd
- DOI
- 10.1016/0306-4573(90)90030-6
- ISSN
- 0306-4573
- eISSN
- 1873-5371
- Language
- English
- Date published
- 1990
- Academic Unit
- Nursing; Computer Science; Business Analytics
- Record Identifier
- 9984003189002771