| Title: | PREDICTION OF RESULTS OF FOOTBALL MATCHES IN THE BRAZILIAN CHAMPIONSHIP SÉRIE A USING MACHINE LEARNING: A COMPARATIVE ANALYSIS OF MODELS |
| Author(s): | RODRIGO LORENTE KAUER |
| Contributor(s): | ALBERTO BARBOSA RAPOSO - Advisor; CESAR AUGUSTO SIERRA FRANCO - Co-advisor |
| Cataloged: | 27/MAR/2026 | Language(s): | PORTUGUESE - BRAZIL |
| Type: | TEXT | Subtype: | SENIOR PROJECT |
| Notes: | All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio. |
| Reference(s): | [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/TFCs/consultas/conteudo.php?strSecao=resultado&nrSeq=75870@1 [en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/TFCs/consultas/conteudo.php?strSecao=resultado&nrSeq=75870@2 |
| DOI: | https://doi.org/10.17771/PUCRio.acad.75870 |
| Abstract: |
|
Predicting the outcomes of football matches is a challenging problem due
to the sport's dynamic and multifactorial nature. In the Brazilian context, the
Campeonato Brasileiro Série A exhibits specific characteristics, such as a high
level of competitive balance, regional influences, and performance variability
throughout the season, that make predictive modeling particularly difficult.
This work proposes the development and evaluation of a machine learning
pipeline aimed at predicting match outcomes in Série A. To conduct the study,
a comprehensive dataset was built from historical sources, including match
statistics, recent team performance indicators, head-to-head records, regional
factors, and metrics related to the sports betting market. A systematic feature
engineering process was applied to capture temporal and contextual patterns
relevant to Brazilian football. Several supervised classification models were
evaluated, including Logistic Regression, Naive Bayes, K-Nearest Neighbors,
Support Vector Machines, Random Forest, Gradient Boosting, AdaBoost,
Multilayer Perceptron, and XGBoost. The models were compared using accuracy,
precision, recall, and F1-score, with emphasis on the weighted and macro
averages, which are better suited to potentially imbalanced problems.
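
The emphasis on weighted and macro averages can be illustrated with a minimal sketch. It is not the author's code: the three outcome labels (`H` for home win, `D` for draw, `A` for away win), the toy data, and the helper names `per_class_f1`, `macro_f1`, and `weighted_f1` are all assumptions made for illustration. The sketch shows how a classifier that always predicts a home win can reach reasonable accuracy while its macro-averaged F1 exposes the classes it never predicts.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label present in y_true."""
    support = Counter(y_true)
    scores = {}
    for label in support:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true samples per class."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(n for _, n in scores.values())
    return sum(f1 * n for f1, n in scores.values()) / total


# Hypothetical toy data: 6 home wins, 2 draws, 2 away wins.
y_true = ["H"] * 6 + ["D"] * 2 + ["A"] * 2
# A degenerate classifier that always predicts a home win.
y_pred = ["H"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} "
      f"macro_f1={macro_f1(y_true, y_pred):.2f} "
      f"weighted_f1={weighted_f1(y_true, y_pred):.2f}")
# prints: accuracy=0.60 macro_f1=0.25 weighted_f1=0.45
```

In this toy case the macro average drops to 0.25 because the draw and away-win classes each contribute an F1 of zero, while the weighted average (0.45) stays closer to accuracy because it follows the class frequencies.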
|