Logo PUC-Rio Logo Maxwell
ETDs @PUC-Rio
Estatística
Título: EXTRACTING RELIABLE INFORMATION FROM LARGE COLLECTIONS OF LEGAL DECISIONS
Autor: FERNANDO ALBERTO CORREIA DOS SANTOS JUNIOR
Colaborador(es): HELIO CORTES VIEIRA LOPES - Orientador
Catalogação: 09/JUN/2022 Língua(s): ENGLISH - UNITED STATES
Tipo: TEXT Subtipo: THESIS
Notas: [pt] Todos os dados constantes dos documentos são de inteira responsabilidade de seus autores. Os dados utilizados nas descrições dos documentos estão em conformidade com os sistemas da administração da PUC-Rio.
[en] All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.
Referência(s): [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=59463&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=59463&idi=2
DOI: https://doi.org/10.17771/PUCRio.acad.59463
Resumo:
As a natural consequence of the Brazilian Judicial System’s digitization, a large and increasing number of legal documents have become available on the Internet, especially judicial decisions. As an illustration, in 2020, 25 million decisions were produced by the Brazilian Judiciary. Meanwhile, the Brazilian Supreme Court (STF), the highest judicial body in Brazil, alone has produced 99.5 thousand decisions. In line with those numbers, we face a growing demand for studies focused on extracting and exploring the legal knowledge hidden in those large collections of legal documents. However, unlike typical textual content (e.g., book, news, and blog post), the legal text constitutes a particular case of highly conventionalized language. Little attention is paid to information extraction in specialized domains such as legal texts. From a temporal perspective, the Judiciary itself is a constantly evolving institution, which molds itself to cope with the demands of society. Therefore, our goal is to propose a reliable process for legal information extraction from large collections of legal documents, based on the STF scenario and the monocratic decisions published by it between 2000 and 2018. To do so, we intend to explore the combination of different Natural Language Processing (NLP) and Information Extraction (IE) techniques on legal domain. From NLP, we explore automated named entity recognition strategies in the legal domain. From IE, we explore dynamic topic modeling with tensor decomposition as a tool to investigate the legal reasoning changes embedded in those decisions over time through textual evolution and the presence of the legal named entities. For reliability, we explore the interpretability of the methods employed. Also, we add visual resources to facilitate interpretation by a domain specialist. As a final result, we expect to propose a reliable and cost-effective process to support further studies in the legal domain and, also, to propose new strategies for information extraction on a large collection of documents.
Descrição: Arquivo:   
COMPLETE PDF