SENIOR PROJECTS @PUC-Rio
Title: IDENTIFICATION OF RELATED DATASETS IN THE CONTEXT OF MISSING OR PARTIAL METADATA
Author(s): SERGIO BERNARDELLI NETTO
Contributor(s): MARCOS VIANNA VILLAS - Advisor
Cataloged: 25/MAR/2026 Language(s): PORTUGUESE - BRAZIL
Type: TEXT Subtype: SENIOR PROJECT
Notes: All data contained in the documents are the sole responsibility of their authors. The data used in the document descriptions conform to the systems of the PUC-Rio administration.
Reference(s): [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/TFCs/consultas/conteudo.php?strSecao=resultado&nrSeq=75814@1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/TFCs/consultas/conteudo.php?strSecao=resultado&nrSeq=75814@2
DOI: https://doi.org/10.17771/PUCRio.acad.75814
Abstract:
This project proposes a tool for analyzing, identifying, and determining potential combinations of datasets, focusing on the discovery of primary and foreign keys between two datasets (enabling relational joins), or similarity between datasets (enabling union, intersection, and difference operations). The approach leverages data mining and machine learning techniques to automate the correlation process between tables. By doing so, the project aims to enable new forms of data analysis that were previously unattainable due to the lack of explicit relationships between datasets. The results are expected to enhance data integration and uncover insights in contexts where datasets appeared unrelated.
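To make the two kinds of relationship named in the abstract concrete, the following is a minimal sketch and not the author's method: the abstract refers to data mining and machine learning techniques, whereas this example substitutes two simple heuristics (value uniqueness for candidate primary keys, value containment for candidate foreign keys) plus Jaccard overlap as a stand-in for dataset similarity. All function names, the 0.9 threshold, and the sample tables are hypothetical and chosen only for illustration.

```python
def is_candidate_key(values):
    """A column can act as a primary key only if it is non-empty,
    has no missing values, and has no repeated values."""
    return len(values) > 0 and None not in values and len(set(values)) == len(values)


def containment(child, parent):
    """Fraction of distinct non-missing child values that also occur in parent."""
    child_set = {v for v in child if v is not None}
    if not child_set:
        return 0.0
    return len(child_set & set(parent)) / len(child_set)


def jaccard(a, b):
    """Jaccard overlap of two collections, e.g. two lists of column names."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def candidate_foreign_keys(left, right, threshold=0.9):
    """Each table is a dict mapping column name -> list of values.
    Returns (left column, right column, score) for every left column whose
    values are mostly contained in a right column that could be a key.
    The threshold is an illustrative assumption, not a recommended value."""
    matches = []
    for pk_name, pk_values in right.items():
        if not is_candidate_key(pk_values):
            continue
        for fk_name, fk_values in left.items():
            score = containment(fk_values, pk_values)
            if score >= threshold:
                matches.append((fk_name, pk_name, score))
    return matches


# Hypothetical sample tables with no declared relationship between them.
orders = {"order_id": [1, 2, 3, 4], "customer": [10, 11, 10, 12]}
customers = {"id": [10, 11, 12, 13], "city": ["Rio", "Rio", "Niterói", "Rio"]}

print(candidate_foreign_keys(orders, customers))
print(jaccard(list(orders), list(customers)))
```

In this sketch a high containment score suggests that a relational join is possible, while a high Jaccard overlap between the column names of two tables would suggest that union, intersection, or difference operations are meaningful. Real data would also need type normalization and tolerance for dirty values, which this example omits.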