2025

Information Extraction and Ontologies

Name: Information Extraction and Ontologies
Code: INF13258M
6 ECTS
Duration: 15 weeks/156 hours
Scientific Area: Informatics

Teaching languages: Portuguese
Languages of tutoring support: Portuguese

Learning Goals

The main objective is to provide the skills needed to analyse, compare and build computer systems that can process large collections of documents, extract relevant information, populate ontologies (knowledge bases) and answer questions about the extracted information.

As additional objectives, students should apply advanced skills in Natural Language Processing (lexical, syntactic, semantic and pragmatic analysis) and in machine learning.

Contents

1. Basic concepts: document collections, information extraction, text mining, ontologies, question answering systems.
2. Evaluation measures. Standard measures (precision, recall, F-measure) and evaluation conferences: QA@CLEF, TREC QA.
3. Symbolic NLP approaches: lexicon, syntax, semantics, pragmatics, ontologies.
4. Non-symbolic approaches: information extraction through machine learning techniques such as SVMs and neural networks/deep learning.
5. Hybrid approaches.
6. Case studies: automatic ontology population, semantic tagging (semantic role labeling), automatic summarization, question answering systems.
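
As an illustration of the standard measures named in item 2, the following minimal sketch computes precision, recall and F-measure for a set of extracted items against a gold standard. The function name `precision_recall_f`, the `beta` parameter and the example city names are assumptions made for this illustration only; they are not part of the course materials.

```python
def precision_recall_f(extracted, gold, beta=1.0):
    """Return (precision, recall, F-beta) for extracted items vs. a gold standard.

    Hypothetical helper for illustration; items are compared by exact match.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)

    # Precision: fraction of extracted items that are correct.
    precision = true_positives / len(extracted) if extracted else 0.0
    # Recall: fraction of gold items that were extracted.
    recall = true_positives / len(gold) if gold else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # F-beta: weighted harmonic mean; beta=1 weighs precision and recall equally.
    b2 = beta ** 2
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_measure


# Example with made-up data: 2 of 3 extracted items are correct, 2 of 4 gold items found.
p, r, f = precision_recall_f(
    extracted=["Lisbon", "Porto", "Madrid"],
    gold=["Lisbon", "Porto", "Faro", "Braga"],
)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Exact set matching is the simplest choice; evaluation campaigns often use more lenient matching criteria, which this sketch does not attempt to reproduce.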

Teaching Methods

The teaching methodology combines several techniques:
1. Oral presentation of the basic concepts and methodologies
2. Selection of scientific papers on recent and/or ongoing work
3. Presentation and discussion of selected papers
4. Elaboration of practical work
5. Use of the Moodle e-learning platform

The evaluation is based on the following components:
1. Implementation of a software project
2. Writing a monograph / research paper
3. Oral presentation of the work