Title: | EXTRACTING RELIABLE INFORMATION FROM LARGE COLLECTIONS OF LEGAL DECISIONS |
Author: |
FERNANDO ALBERTO CORREIA DOS SANTOS JUNIOR |
Contributor(s): |
HELIO CORTES VIEIRA LOPES - Advisor |
Cataloged: | 09/JUN/2022 | Language(s): | ENGLISH - UNITED STATES |
Type: | TEXT | Subtype: | THESIS |
Notes: |
All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio. |
Reference(s): |
[pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=59463&idi=1 [en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=59463&idi=2 |
DOI: | https://doi.org/10.17771/PUCRio.acad.59463 |
Abstract: |
As a natural consequence of the digitization of the Brazilian Judicial System, a large and increasing number of legal documents, especially judicial decisions, have become available on the Internet. For example, in 2020,
the Brazilian Judiciary produced 25 million decisions.
The Brazilian Supreme Court (STF), the highest judicial body in Brazil,
alone produced 99.5 thousand decisions. In line with those numbers, we
face a growing demand for studies focused on extracting and exploring the
legal knowledge hidden in those large collections of legal documents. However, unlike typical textual content (e.g., books, news, and blog posts),
legal text constitutes a particular case of highly conventionalized language.
Little attention is paid to information extraction in specialized domains such
as legal texts. From a temporal perspective, the Judiciary itself is a constantly evolving institution, which molds itself to cope with the demands of
society. Therefore, our goal is to propose a reliable process for legal information extraction from large collections of legal documents, based on the STF
scenario and the monocratic decisions published by it between 2000 and
2018. To do so, we explore the combination of different Natural
Language Processing (NLP) and Information Extraction (IE) techniques in
the legal domain. From NLP, we explore automated named entity recognition
strategies in the legal domain. From IE, we explore dynamic topic modeling with tensor decomposition as a tool to investigate the legal reasoning
changes embedded in those decisions over time through textual evolution
and the presence of the legal named entities. For reliability, we explore the
interpretability of the methods employed. Also, we add visual resources to
facilitate interpretation by a domain specialist. As a final result, we expect
to propose a reliable and cost-effective process to support further studies
in the legal domain, as well as new strategies for information
extraction from large collections of documents.
|