| Does “authority” mean quality? Predicting expert quality ratings of Web documents |
| Source
|
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Athens, Greece
Pages: 296-303
Year of Publication: 2000
ISBN: 1-58113-226-3
|
|
Authors
|
Brian Amento
AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, NJ, and Department of Computer Science, Virginia Tech
|
Loren Terveen
AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, NJ
|
Will Hill
AT&T Shannon Laboratories, 180 Park Avenue, Florham Park, NJ
|
ABSTRACT
For many topics, the World Wide Web contains hundreds or thousands of relevant documents of widely varying quality. Users face a daunting challenge in identifying a small subset of documents worthy of their attention.
Link analysis algorithms have received much interest recently, in large part for their potential to identify high-quality items. We report here on an experimental evaluation of this potential.
We evaluated a number of link-based and content-based algorithms using a dataset of Web documents rated for quality by human topic experts. Link-based metrics did a good job of picking out high-quality items: precision at 5 is about 0.75, and precision at 10 is about 0.55, in a dataset where 0.32 of all documents were of high quality. Surprisingly, a simple content-based metric performed nearly as well: ranking documents by the total number of pages on their containing site.
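The precision figures quoted in the abstract can be read as follows: rank the documents by some metric, take the top k, and report the fraction of those that the experts rated high quality. The sketch below illustrates that calculation only. The document ids, scores, and expert labels are hypothetical toy values, not data from the paper, and the function name `precision_at_k` is an assumption made for illustration. Dividing by k even when fewer than k documents are available is one common convention and is assumed here.

```python
def precision_at_k(ranked_ids, high_quality_ids, k):
    """Fraction of the top-k ranked documents that are in the high-quality set."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in high_quality_ids)
    return hits / k


# Hypothetical toy data: a score per document (e.g. an in-link count or a
# site page count) and the set of documents experts rated as high quality.
scores = {"a": 40, "b": 35, "c": 30, "d": 22, "e": 15,
          "f": 9, "g": 4, "h": 2, "i": 1, "j": 0}
high_quality = {"a", "b", "d", "g"}

# Rank documents from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)

p5 = precision_at_k(ranked, high_quality, 5)
p10 = precision_at_k(ranked, high_quality, 10)

# Baseline: the proportion of high-quality documents in the whole set,
# which is the expected precision of a random ranking.
baseline = len(high_quality) / len(scores)

print(f"P@5={p5:.2f}  P@10={p10:.2f}  baseline={baseline:.2f}")
```

Comparing precision at k against the baseline proportion is what makes the abstract's numbers meaningful: 0.75 at k = 5 is well above the 0.32 share of high-quality documents in the dataset.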
CITED BY 46
|
Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, Zheng Chen, Wei-Ying Ma, A study of relevance propagation for web search, Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, August 15-19, 2005, Salvador, Brazil
|
Rong Tang, Kwong Bor Ng, Tomek Strzalkowski, Paul B. Kantor, Automatically predicting information quality in news documents, Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology: companion volume of the Proceedings of HLT-NAACL 2003--short papers, p.97-99, May 27-June 01, 2003, Edmonton, Canada
|
Filippo Menczer, Gautam Pant, Padmini Srinivasan, Miguel E. Ruiz, Evaluating topic-driven web crawlers, Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval, p.241-249, September 2001, New Orleans, Louisiana, United States
|
Hung-Yu Kao, Ming-Syan Chen, Shian-Hua Lin, Jan-Ming Ho, Entropy-based link analysis for mining web informative structures, Proceedings of the eleventh international conference on Information and knowledge management, November 04-09, 2002, McLean, Virginia, USA
|
Diane Kelly, Xiao-jun Yuan, Nicholas J. Belkin, Vanessa Murdock, W. Bruce Croft, Features of documents relevant to task- and fact- oriented questions, Proceedings of the eleventh international conference on Information and knowledge management, November 04-09, 2002, McLean, Virginia, USA
|
Meiqun Hu, Ee-Peng Lim, Aixin Sun, Hady Wirawan Lauw, Ba-Quy Vuong, On improving wikipedia search using article quality, Proceedings of the 9th annual ACM international workshop on Web information and data management, November 09-09, 2007, Lisbon, Portugal
|
Brian Amento, Loren Terveen, Will Hill, Deborah Hix, TopicShop: enhanced support for evaluating and organizing collections of Web sites, Proceedings of the 13th annual ACM symposium on User interface software and technology, p.201-209, November 06-08, 2000, San Diego, California, United States
|