ETDs @PUC-Rio
Title: EVALUATING LLM IN-CONTEXT FEW-SHOT LEARNING ON LEGAL ENTITY ANNOTATION TASK
Author: VENICIUS GARCIA REGO
Contributor(s): HELIO CORTES VIEIRA LOPES - Advisor
FERNANDO ALBERTO CORREIA DOS SANTOS JUNIOR - Co-advisor
Cataloging: 24/MAR/2025 Language(s): ENGLISH - UNITED STATES
Type: TEXT Subtype: THESIS
Notes: All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.
Reference(s): [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=69716&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=69716&idi=2
DOI: https://doi.org/10.17771/PUCRio.acad.69716
Abstract:
A considerable number of legal documents are now available on the Internet. Even so, knowledge extraction tasks in the legal domain, such as Named Entity Recognition (NER), remain challenging, even more so when the documents are not in English. One reason is the small amount of annotated corpora available, combined with the burden and cost of developing a new one. Legal annotation is itself challenging because of limits on both time and human resources. Large Language Models (LLMs) have attracted attention for their ability to reason using only in-context information about a task. Recent studies report significant results from their use in document annotation tasks; in some cases, the model is comparable to human annotators. In this work, we therefore evaluate the in-context few-shot learning capability of LLMs on a legal NER task, assessing their use in an annotation process alongside humans. Our study is based on data gathered during an annotation task previously conducted to produce a NER corpus of legal decisions written in Portuguese, published by the Brazilian Supreme Federal Court (STF) and annotated by law students. Our experiments showed that the LLM can produce highly accurate annotations without any gradient update. It may therefore assist annotators in the annotation process, reducing time and effort and making the annotation task more efficient.