|EXAM & GRADING|
|Paper Presentation (Dec. 8 - Dec. 23)||30%|
|Term Project (Dec. 29 - Jan. 12)||30%|
|Final (Jan. 13, 2000)||40%|
- Introduction to Information Retrieval and Extraction -- [slide]
- Conventional Information Retrieval Systems (Salton, Chapter 8) -- [slide]
  - Database Management and Information Retrieval
  - Text Retrieval Using Inverted Indexing Methods
  - Extensions of the Inverted Index Operations
- Automatic Indexing (Salton, Chapter 9) -- [slide]
  - Indexing Environment
  - Indexing Aims
  - Single-Term Indexing Theories
  - Term Relationships in Indexing
  - Term-Phrase Formulation
  - Thesaurus-Group Generation
- Information-Retrieval Models (Baeza-Yates, 1999, Chapter 2) -- [slide]
  - Boolean Retrieval Model
  - The Vector Space Model
  - Probabilistic Retrieval Model
- Retrieval Performance Evaluation (Baeza-Yates, 1999, Chapter 3) -- [slide]
- Test Collection Comparison -- [slide]
- File Structures (Frakes & Baeza-Yates, 1992, Chapters 3-5)
- Term Operations and Document Processing (Frakes & Baeza-Yates, 1992, Chapters 7-9) -- [slide]
  - Lexical Analysis and Stoplists
  - Stemming Algorithms
  - Thesaurus Construction
- Clustering Algorithms (Frakes & Baeza-Yates, 1992, Chapter 16) -- [slides]
- Query Operations (Baeza-Yates and Ribeiro-Neto, 1999, Chapters 4 and 5) and Relevance Feedback (Frakes & Baeza-Yates, 1992, Chapter 11) -- [slide]
- Searching on the Web (Baeza-Yates and Ribeiro-Neto, 1999, Chapter 13) -- [slide]
- Information Extraction
Acknowledgement: Special thanks to Prof. Chen, Hsin-Hsi for providing the course material.
Students are asked to read IR-related papers and prepare a 20-30 minute presentation of one paper for the class. The presentation should cover:
- The origin of the paper
- Where you found it
- Information about the authors or the research group
- Problem definition
- The proposed solution
- Your opinion of the paper
Presentation Date: Dec. 8 - Dec. 23, 1999
Please e-mail me the details of the paper you want to present as soon as you have decided, including the paper title, authors, year, origin, and URL.
On the day of your presentation, please also hand in one copy of the paper and one copy of your slides.
In this project, you don't really have to implement a Web search engine. Instead, you are asked to write a proposal for an IR system. Describe in as much detail as possible where your IR system is applied, how it is designed, what the features of your design are, and so on. In particular, your proposal should include the following sections:
1. Describe where your IR system is applied.
  - Your IR system can be applied, for example, to the World Wide Web, a news center, or a bibliography collection.
2. Describe in detail how your IR system works.
  - You can design your IR system from scratch, using an inverted file to index your corpus and some model to rank documents, or you can design a multi-engine search engine, assuming that some IR systems already provide ranked documents for further processing. Be sure to describe at least one feature clearly enough that it distinguishes your IR system from others.
3. Outline the features of your design.
  - Depending on your design, describe how your text or queries are processed, or what techniques are applied to your initial query results.
4. What do you expect from your design?
  - With your special design, how much improvement do you expect to obtain, or what benefits can the user gain from your design?
5. What experiments do you need to validate your claims in section 4?
  - Include figures or tables as well.
6. What problems might you encounter?
  - Challenge your own design so that you can adjust your system. Anticipate as many problems as possible.
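For orientation only, here is a minimal Python sketch of the first design option mentioned above: an inverted file over a small corpus plus a simple model to rank documents. Everything in it is an illustrative assumption rather than a required design: the function names (`tokenize`, `build_inverted_file`, `rank`), the toy documents, the whitespace tokenizer, and the choice of a plain tf * idf sum as the ranking model. A real proposal would replace these with its own text processing and ranking choices.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (no stoplist or stemming)."""
    return text.lower().split()


def build_inverted_file(docs):
    """Map each term to a postings dict {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def rank(query, index, num_docs):
    """Score each document by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in the corpus
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical three-document corpus, used only for illustration.
docs = {
    "d1": "inverted file indexing for text retrieval",
    "d2": "vector space model for ranking documents",
    "d3": "boolean retrieval with an inverted file",
}
index = build_inverted_file(docs)
results = rank("inverted file ranking", index, len(docs))
print([doc_id for doc_id, _ in results])  # ['d2', 'd1', 'd3']
```

In this toy run, d2 ranks first because "ranking" occurs in only one document and so carries the largest idf weight. A term that occurs in every document gets an idf of zero under this weighting, which is one example of the kind of design consequence a proposal should discuss.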
Note: The proposal should be printed in a 12pt font with single line spacing, and should not exceed 15 pages. Please also prepare slides for a 15-minute presentation of your work.
Proposal Due Date: Dec. 29, 1999
Presentation Date: Dec. 29, 1999 - Jan. 12, 2000