Information Retrieval
Course Title: Information Retrieval
Course No: CSIT.425.1
Nature of the Course: Theory + Lab
Semester: 8
Full Marks: 60 + 20 + 20
Pass Marks: 24 + 10 + 10
Credit Hours: 3
Course Description
Course Objectives
Course Contents
1. Introduction
6 hrs
1.1. IR Fundamentals
- Introduction
- History of IR
- Components of IR
- Issues
- Open-source search engine frameworks
1.2. IR and the Web
- The impact of the web on IR
- The role of artificial intelligence (AI) in IR
- IR Versus Web Search
- Components of a Search engine
- Characterizing the web
2. Information Retrieval
12 hrs
2.1. Retrieval Models and Indexing
- Boolean and vector-space retrieval models
- Term weighting - TF-IDF weighting
- Cosine similarity
- Preprocessing
- Inverted indices
- Efficient processing with sparse vectors
2.2. Advanced Retrieval Models
- Language Model based IR
- Probabilistic IR
- Latent Semantic Indexing
- Relevance feedback
- Pseudo-relevance feedback and query expansion
3.1. Web Search Overview
- Web search overview
- Web structure
- The user
- Paid placement
- Search engine optimization/spam
- Web size measurement
3.2. Crawling and Indexing
- Web Search Architectures
- Crawling
- Meta-crawlers
- Focused Crawling
- Web indexes
- Near-duplicate detection
- Index Compression
- XML retrieval
4. Web Search
10 hrs
4.1. Link Analysis and Ranking
- Link Analysis
- Hubs and authorities
- PageRank and HITS algorithms
- Searching and Ranking
- Relevance Scoring and ranking for Web
- Similarity
- Hadoop & MapReduce
- Evaluation
4.2. Personalization and Advanced Features
- Personalized search
- Collaborative filtering and content-based recommendation of documents and products
- Handling the invisible Web
- Snippet generation
- Summarization
- Question Answering
- Cross-Lingual Retrieval
5.1. Text Classification
- Information filtering; organization and relevance feedback
- Text Mining
- Text classification and clustering
- Naive Bayes categorization algorithm
- Decision trees categorization algorithm
- Nearest neighbor categorization algorithm
5.2. Clustering Algorithms
- Agglomerative clustering
- K-means
- Expectation maximization (EM)
Laboratory Works
- 1. IR Algorithms
- 2. Web Search and Mining
Text Books
- 1. C. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
Reference Books
- 1. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The Concepts and Technology behind Search, 2nd Edition, ACM Press Books, 2011.
- 2. Bruce Croft, Donald Metzler, and Trevor Strohman, Search Engines: Information Retrieval in Practice, 1st Edition, Addison-Wesley, 2009.
- 3. Mark Levene, An Introduction to Search Engines and Web Navigation, 2nd Edition, Wiley, 2010.