2025

Categorical Data Analysis

Name: Categorical Data Analysis
Code: MAT13608M
6 ECTS
Duration: 15 weeks/156 hours
Scientific Area: Mathematics

Teaching languages: Portuguese
Languages of tutoring support: Portuguese
Attendance regime: In person

Sustainable Development Goals

Learning Goals

The learning outcomes are:
• To know how to analyze the association and correlation involving categorical variables;
• To know the principles of a generalized linear model in order to identify, adjust and interpret a model of this type;
• To know and apply the basic principles of modeling with this type of model;
• To know how to critically interpret the results obtained from the statistical software.

The skills developed are:
• Ability to construct and analyze different generalized linear models critically and autonomously, and to apply these methodologies in professional practice;
• To acquire the basic principles of statistical modeling and to know the main modeling phases of a generalized linear model;
• Ability to interpret problems involving longitudinal data;
• Ability to research and understand related literature in order to apply to other models for categorical data;
• Ability to use R for categorical data analysis.

Contents

• Contingency tables and measures of association and correlation for categorical variables.
• Generalized linear models: characterization, link functions, statistical modelling, assumptions, residual analysis, validation and inference.
• Discrete models: logit, probit, log-log, ordinal, multinomial, Poisson, Negative Binomial, Inverse-Gaussian and Gamma.
• Generalized additive models (GAM).
• Generalized Estimating Equations (GEE) and Generalized Linear Mixed Models (GLMM).
• Introduction to zero-inflated models (ZIF).

Teaching Methods

Theoretical-practical lessons combine the concepts with their application to real data from different areas, so that students see the relevance of the material. The sessions include modeling and data analysis with statistical software, with students actively participating in solving and/or discussing the problems. In addition, students are encouraged to solve practical exercises on their own in order to develop autonomy.
The focus is on modeling, critical interpretation, and data analysis based on the output of the software used.

Assessment

In continuous assessment, students take two midterm tests (each worth 25% of the final grade) and complete two assignments (each worth 25% of the final grade). A minimum grade of 8.0 out of 20 is required in each assessment component. The final grade is the arithmetic mean of the two midterm tests and the two assignments.
The final assessment regime consists of a written exam in the regular period and a written exam in the appeal period.
The student is approved when the final grade is at least 10 out of 20.
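
As an illustration only, the continuous assessment rule above can be sketched in Python. The function name and its parameters are hypothetical. The syllabus does not say what happens when a component falls below the 8.0 minimum, nor whether grades are rounded, so the sketch assumes that a component below the minimum means the student is not approved through continuous assessment, and it applies no rounding.

```python
def continuous_assessment_grade(tests, coursework, minimum=8.0, pass_mark=10.0):
    """Return (final_grade, approved) on the 0-20 scale.

    Assumption (not stated in the syllabus): any component below
    `minimum` means the student is not approved via continuous assessment.
    """
    if len(tests) != 2 or len(coursework) != 2:
        raise ValueError("expected two midterm tests and two pieces of coursework")

    components = list(tests) + list(coursework)
    # Four components at 25% each is the plain arithmetic mean.
    final_grade = sum(components) / len(components)
    meets_minimum = all(grade >= minimum for grade in components)
    approved = meets_minimum and final_grade >= pass_mark
    return final_grade, approved


# Hypothetical grades, for illustration only.
print(continuous_assessment_grade([12, 9], [14, 11]))
print(continuous_assessment_grade([7.5, 16], [15, 14]))
```

In the second call the mean is above 10, but one test is below 8.0, so under the stated assumption the student is not approved.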

The use of AI tools is permitted in this course unit only for the completion of assignments, serving as technical, analytical, and learning support, provided that students understand, validate, and assume full responsibility for the produced results. The fabrication of sources, data, or results constitutes a serious breach of academic integrity.

The instructor reserves the right to summon the student for an oral defense of the completed assessments, in which the student must be able to explain specific parts of the assignment or test, write different code that performs the same task or a variation of it, analyze another dataset, or apply another technique studied in the course unit. The final grade is determined by the performance in this oral defense.

Teaching Staff