Paper List for WebIR&IE 2005
-
Web data extraction
- Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Raghavan, Clement T. Yu:
Fully automatic wrapper generation for search engines. 66-75, WWW 2005
- Yanhong Zhai, Bing Liu: Web data extraction based on partial tree
alignment. 76-85, WWW 2005
- Andrew Hogue, David Karger: Thresher: automating the unwrapping of
semantic content from the World Wide Web. 86-95, WWW 2005
- Oren Etzioni, Michael J. Cafarella, Doug Downey, Stanley Kok, Ana-Maria
Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates:
Web-scale information extraction in KnowItAll: (preliminary results).
100-110, WWW 2004
-
Search interface integration
- Bin He, Kevin Chen-Chuan Chang, Jiawei Han: Discovering complex
matchings across web query interfaces: a correlation mining approach.
148-157, KDD 2004
- Hai He, Weiyi Meng, Clement T. Yu, Zonghuan Wu: WISE-Integrator: An
Automatic Integrator of Web Search Interfaces for E-Commerce. 357-368,
VLDB 2003
-
Web service
- Vikas Agarwal, Koustuv Dasgupta, Neeran Karnik, Arun Kumar, Ashish
Kundu, Sumit Mittal, Biplav Srivastava: A service creation environment based
on end to end composition of Web services. 128-137, WWW 2005
- Sami Bhiri, Olivier Perrin, Claude Godart: Ensuring required failure
atomicity of composite Web services. 138-147, WWW 2005
- Dirk Beyer, Arindam Chakrabarti, Thomas A. Henzinger: Web service
interfaces. 148-159, WWW 2005
- Monika Solanki, Antonio Cau, Hussein Zedan: Augmenting semantic web
service descriptions with compositional specification. 544-552, WWW 2004
- Abhijit A. Patil, Swapna A. Oundhakar, Amit P. Sheth, Kunal Verma:
METEOR-S web service annotation framework. 553-562, WWW 2004
- Peter Mika, Daniel Oberle, Aldo Gangemi, Marta Sabou: Foundations for
service ontologies: aligning OWL-S to DOLCE. 563-572, WWW 2004
-
Web crawling
- Sandeep Pandey, Krithi Ramamritham, Soumen Chakrabarti: Monitoring the
dynamic web to respond to continuous queries. 659-668, WWW 2003
- Dennis Fetterly, Mark Manasse, Marc Najork, Janet L. Wiener: A
large-scale study of the evolution of web pages. 669-678, WWW 2003
- Andrei Z. Broder, Marc Najork, Janet L. Wiener: Efficient URL caching
for world wide web crawling. 679-68, WWW 2003
-
Indexing and Querying
- Edleno Silva de Moura, Celia F. dos Santos, Daniel R. Fernandes,
Altigran Soares da Silva, Pavel Calado, Mario A. Nascimento: Improving Web
search efficiency via a locality based static pruning method. 235-244, WWW
2005
- Aris Anagnostopoulos, Andrei Z. Broder, David Carmel: Sampling
search-engine results. 245-256, WWW 2005
- Xiaohui Long, Torsten Suel: Three-level caching for efficient query
processing in large Web search engines. 257-266, WWW 2005
-
Ranking
- Paolo Boldi, Massimo Santini, Sebastiano Vigna: PageRank as a function
of the damping factor. 557-566, WWW 2005
- Zaiqing Nie, Yuanzhi Zhang, Ji-Rong Wen, Wei-Ying Ma: Object-level
ranking: bringing order to Web objects. 567-574, WWW 2005
- Frank McSherry: A uniform approach to accelerated PageRank computation.
575-582, WWW 2005
- Sepandar D. Kamvar, Taher H. Haveliwala, Christopher D. Manning, Gene H.
Golub: Extrapolation methods for accelerating PageRank computations.
261-270, WWW 2003
- Glen Jeh, Jennifer Widom: Scaling personalized web search. 271-279, WWW
2003
- Serge Abiteboul, Mihai Preda, Gregory Cobena: Adaptive on-line page
importance computation. 280-290, WWW 2003
- John A. Tomlin: A new paradigm for ranking pages on the world wide web.
350-355, WWW 2003
- Ah Chung Tsoi, Gianni Morini, Franco Scarselli, Markus Hagenbuchner,
Marco Maggini: Adaptive ranking of web pages. 356-365, WWW 2003
- Ronald Fagin, Ravi Kumar, Kevin S. McCurley, Jasmine Novak, D.
Sivakumar, John A. Tomlin, David P. Williamson: Searching the workplace
web. 366-375, WWW 2003
-
Question Answering
- Dell Zhang, Wee Sun Lee: Question classification using support vector
machines. 26-32, SIGIR 2003
- Hui Yang, Tat-Seng Chua, Shuguang Wang, Chun-Keat Koh: Structured use of
external knowledge for event-based open domain question answering. 33-40,
SIGIR 2003
- Stefanie Tellex, Boris Katz, Jimmy J. Lin, Aaron Fernandes, Gregory
Marton: Quantitative evaluation of passage retrieval algorithms for question
answering. 41-47, SIGIR 2003
-
Semantic Search
- Michael J. Cafarella, Oren Etzioni: A search engine for natural language
applications. 442-452, WWW 2005
- Lei Zhang, Yong Yu, Jian Zhou, Chenxi Lin, Yin Yang: An enhanced model
for searching in semantic portals. 453-462, WWW 2005
- Ron Bekkerman, Andrew McCallum: Disambiguating Web appearances of people
in a social network. 463-470, WWW 2005
-
XML retrieval
- Makoto Onizuka, Fong Yee Chan, Ryusuke Michigami, Takashi Honishi:
Incremental maintenance for materialized XPath/XSLT views. 671-681, WWW 2005
- Achille Fokoue, Kristoffer Hogsbro Rose, Jerome Simeon, Lionel Villard:
Compiling XSLT 2.0 into XQuery 1.0. 682-691, WWW 2005
- Toshiro Takase, Hisashi Miyashita, Toyotaro Suzumura, Michiaki
Tatsubori: An adaptive, fast, and safe XML parser based on byte sequences
memorization. 692-701, WWW 2005
- Gabriella Kazai, Mounia Lalmas, Arjen P. de Vries: The overlap problem
in content-oriented XML retrieval evaluation. 72-79, SIGIR 2004
- Jaap Kamps, Maarten de Rijke, Borkur Sigurbjornsson: Length
normalization in XML retrieval. 80-87, SIGIR 2004
- Shaorong Liu, Qinghua Zou, Wesley W. Chu: Configurable indexing and
ranking for XML information retrieval. 88-95, SIGIR 2004
- David Carmel, Yoelle S. Maarek, Matan Mandelbrod, Yosi Mass, Aya Soffer:
Searching XML documents via XML fragments. 151-158, SIGIR 2003
-
Information Retrieval Model
- Hugo Zaragoza, Djoerd Hiemstra, Michael E. Tipping: Bayesian extension
to the language model for ad hoc information retrieval. 4-9, SIGIR 2003
- ChengXiang Zhai, William W. Cohen, John D. Lafferty: Beyond independent
relevance: methods and evaluation metrics for subtopic retrieval. 10-17,
SIGIR 2003
- Jaime Teevan, David R. Karger: Empirical development of an exponential
probabilistic model for text retrieval: using textual analysis to build a
better model. 18-25, SIGIR 2003
- Kareem Darwish, Douglas W. Oard: Probabilistic structured query methods.
338-344, SIGIR 2003
- Thomas Hofmann: Collaborative filtering via Gaussian probabilistic
latent semantic analysis. 259-266, SIGIR 2003
-
Web usage analysis
- Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi: Duplicate detection in
click streams. 12-21, WWW 2005
- Cai-Nicolas Ziegler, Sean M. McNee, Joseph A. Konstan, Georg Lausen:
Improving recommendation lists through topic diversification. 22-32, WWW 2005
-
Focused crawling
- Jian-Tao Sun, Hua-Jun Zeng, Huan Liu, Yuchang Lu, Zheng Chen: CubeSVD: a
novel approach to personalized Web search. 382-390, WWW 2005
- Uichin Lee, Zhenyu Liu, Junghoo Cho: Automatic identification of user
goals in Web search. 391-400, WWW 2005
- Sandeep Pandey, Christopher Olston: User-centric Web crawling. 401-411,
WWW 2005
- Ismail Sengor Altingovde, Ozgur Ulusoy: Exploiting Interclass Rules for
Focused Crawling. 66-73, IEEE Intelligent Systems, 2004
-
Clustering web search results
- Krishna Kummamuru, Rohit Lotlikar, Shourya Roy, Karan Singal, Raghu
Krishnapuram: A hierarchical monothetic document clustering algorithm for
summarization and browsing search results. 658-665, WWW 2004
- Reiner Kraft, Jason Y. Zien: Mining anchor text for query refinement.
666-674, WWW 2004
- Kazunari Sugiyama, Kenji Hatano, Masatoshi Yoshikawa: Adaptive web
search based on user profile constructed without any effort from users.
675-684, WWW 2004
-
Text classification
- Dunja Mladenic, Janez Brank, Marko Grobelnik, Natasa Milic-Frayling:
Feature selection using linear classifier weights: interaction with
classification models. 234-241, SIGIR 2004
- Dmitry Davidov, Evgeniy Gabrilovich, Shaul Markovitch: Parameterized
generation of labeled datasets for text categorization based on a
hierarchical directory. 250-257, SIGIR 2004
- Yiming Yang, Jian Zhang, Bryan Kisiel: A scalability analysis of
classifiers in text categorization. 96-103, SIGIR 2003
- Dmitry V. Khmelev, William John Teahan: A repetition based measure for
verification of text collections and for text categorization. 104-110, SIGIR
2003
- Paul N. Bennett: Using asymmetric distributions to improve text
classifier probability estimates. 111-118, SIGIR 2003
- Sheng Gao, Wen Wu, Chin-Hui Lee, Tat-Seng Chua: A maximal
figure-of-merit learning approach to text categorization. 174-181, SIGIR
2003
- Lijuan Cai, Thomas Hofmann: Text categorization by boosting
automatically extracted concepts. 182-189, SIGIR 2003
- Jian Zhang, Yiming Yang: Robustness of regularized linear classification
methods in text categorization. 190-197, SIGIR 2003
- Dou Shen, Zheng Chen, Qiang Yang, Hua-Jun Zeng, Benyu Zhang, Yuchang Lu,
Wei-Ying Ma: Web-page classification through summarization. 242-249, SIGIR
2004