ETDs @PUC-Rio
Title: STRATEGIES TO UNDERSTAND THE CONNECTIVITY OF ENTITY PAIRS IN KNOWLEDGE BASES
Author: JAVIER GUILLOT JIMENEZ
Contributor(s): MARCO ANTONIO CASANOVA - Advisor
Cataloged: 04/NOV/2021 Language(s): ENGLISH - UNITED STATES
Type: TEXT Subtype: THESIS
Notes: All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.
Reference(s): [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=55649&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=55649&idi=2
DOI: https://doi.org/10.17771/PUCRio.acad.55649
Abstract:
The entity relatedness problem is the problem of exploring a knowledge base, represented as an RDF graph, to discover and understand how two entities are connected. It can be addressed with a path search strategy that combines an entity similarity measure, an entity degree limit, and an expansion limit, which together reduce the path search space, with a path ranking measure that orders the relevant paths between a given pair of entities in the RDF graph. This thesis first introduces a framework, called CoEPinKB, together with an implementation, to experiment with path search strategies. The hot spots of the framework are the entity similarity measure, the entity degree limit, the expansion limit, the path ranking measure, and the knowledge base. The thesis then presents a performance evaluation of nine path search strategies, using a benchmark from two entertainment domains, over the OpenLink Virtuoso SPARQL protocol endpoint of DBpedia. Next, it introduces DCoEPinKB, a distributed version of the framework based on Apache Spark that supports the empirical evaluation of path search strategies, and presents an evaluation of six path search strategies over two entertainment domains using real data collected from DBpedia. The results provide insights into the performance of the path search strategies and suggest that the framework implementation, instantiated with the best-performing pair of measures, can be used, for example, to expand the results of search engines over knowledge bases to include related entities.
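To make the roles of the four ingredients named in the abstract concrete, the following is a minimal Python sketch of a path search that applies a degree limit, an expansion limit guided by a similarity measure, and a final ranking step. It is an illustration only, not the CoEPinKB implementation: the toy adjacency-list graph, the function names `jaccard`, `find_paths`, and `rank_paths`, the `max_length` parameter, the use of Jaccard similarity over neighbor sets, and the length-then-degree ranking are all assumptions made for this example.

```python
from collections import deque


def jaccard(graph, a, b):
    """Toy similarity (assumed): Jaccard index of the two neighbor sets."""
    na, nb = set(graph.get(a, ())), set(graph.get(b, ()))
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def find_paths(graph, source, target, degree_limit, expansion_limit, max_length):
    """Breadth-first enumeration of simple paths, pruned by the two limits."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_length:  # path already has max_length edges
            continue
        # Only visit entities not already on this path (simple paths).
        neighbors = [n for n in graph.get(node, ()) if n not in path]
        # Degree limit: drop hub entities, but never drop the target itself.
        neighbors = [
            n for n in neighbors
            if n == target or len(graph.get(n, ())) <= degree_limit
        ]
        # Expansion limit: keep the target first, then the neighbors most
        # similar to the target (ties broken by name for determinism).
        neighbors.sort(key=lambda n: (n != target, -jaccard(graph, n, target), n))
        for n in neighbors[:expansion_limit]:
            queue.append(path + [n])
    return paths


def rank_paths(graph, paths):
    """Toy ranking (assumed): shorter paths first, then lower total degree
    of the intermediate entities."""
    def key(path):
        return (len(path), sum(len(graph.get(n, ())) for n in path[1:-1]))
    return sorted(paths, key=key)


# Hypothetical undirected graph; "Hub" stands in for a high-degree entity.
graph = {
    "A": ["B", "C", "Hub"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "Hub"],
    "Hub": ["A", "D", "X1", "X2", "X3"],
    "X1": ["Hub"],
    "X2": ["Hub"],
    "X3": ["Hub"],
}

strict = find_paths(graph, "A", "D", degree_limit=3, expansion_limit=2, max_length=3)
loose = rank_paths(
    graph,
    find_paths(graph, "A", "D", degree_limit=5, expansion_limit=3, max_length=3),
)
```

With the stricter limits, the path through `Hub` is pruned before it is ever expanded. With the looser limits it is found but ranked last. In a real setting the graph would come from SPARQL queries against a knowledge base such as DBpedia rather than from an in-memory dictionary.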