2025

Information Retrieval for Text Bases

Name: Information Retrieval for Text Bases
Code: INF13259M
6 ECTS
Duration: 15 weeks/156 hours
Scientific Area: Informatics

Teaching languages: Portuguese
Languages of tutoring support: Portuguese

Learning Goals

To be able to identify the main problems in text information retrieval systems and their possible solutions.
To be able to analyse information retrieval systems with regard to knowledge representation, search and indexing algorithms, information extraction, document clustering, document classification, and cooperativity.
To be able to evaluate existing information retrieval systems.
To have theoretical and practical knowledge of text indexing; Boolean, vector, and probabilistic models; result ranking; and evaluation.

Contents

1. Introduction; main concepts and problems
2. Boolean, vector, and probabilistic models
3. Indexing, lemmatization, stop-words
4. Ontologies
5. Query languages
6. Evaluation
7. Searching the web
8. Semantic web
9. Text classification
10. Text clustering
11. Information extraction
12. Question answering systems

Teaching Methods

Theoretical classes that introduce the concepts, with guided exercise solving and time for students' questions.
Practical laboratory classes in which problems accompanying the theoretical material are set, with questions answered as students work through them. Exercises of gradually increasing difficulty cover the topics taught, so that students can practise the material.

Assessment

Continuous assessment - consisting of 2 components:
* individual work on a specific topic in article format (30%)
* practical group work (70%)

Final assessment - consisting of 2 components:
* article (30%)
* report on the practical group work (70%)

The final grade is the weighted average of the two components. The student passes if the final grade is 10 or higher.