Title: EVALUATING LLM IN-CONTEXT FEW-SHOT LEARNING ON LEGAL ENTITY ANNOTATION TASK
Author: VENICIUS GARCIA REGO
Contributor(s): HELIO CORTES VIEIRA LOPES - Advisor; FERNANDO ALBERTO CORREIA DOS SANTOS JUNIOR - Co-advisor
Cataloged: 24/MAR/2025
Language(s): ENGLISH - UNITED STATES
Type: TEXT
Subtype: THESIS
Notes: All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.
Reference(s):
[pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=69716&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=69716&idi=2
DOI: https://doi.org/10.17771/PUCRio.acad.69716
Abstract:
A considerable number of legal documents are available on the Internet
nowadays. Even so, knowledge extraction activities, such as Named Entity
Recognition (NER), in the legal domain are still challenging, even more so
when the documents are not in English. One of the reasons is the small number of annotated
corpora available, combined with the burden and cost of developing a new
one. The legal annotation task is itself challenging due to limitations on both
time and human resources. The emergence of Large Language Models (LLMs)
has attracted attention due to their capability of reasoning using only
in-context information about the tasks. Recent studies present significant results
regarding their usage in document annotation tasks; in some cases, the model
is comparable to human annotators. Thus, in this work, we evaluate the
in-context few-shot learning capability of LLMs on a legal NER task, assessing their usage in
an annotation process alongside human annotators. To do so, our study is based on
the data gathered during an annotation task previously conducted to produce
a corpus of legal decisions written in Portuguese, published by the Brazilian
Supreme Federal Court (STF), dedicated to NER, and annotated by law
students. Our experiments showed that the LLM can produce highly accurate
annotations without any gradient update. Thus, it can assist annotators in
the annotation process, reducing the amount of time and effort and making
the annotation task more efficient.
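
To make the idea of in-context few-shot learning for NER concrete, the sketch below shows one possible way to assemble a prompt from a handful of annotated sentences and to parse a model's reply back into entity pairs. It is an illustration only, not the procedure used in the thesis: the label set, the prompt wording, the `span -> LABEL` output format, the example sentences, and all function names are assumptions made for this example, and no particular LLM API is called.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One annotated sentence: text plus (span, label) pairs."""
    text: str
    entities: list[tuple[str, str]]


# Hypothetical label set, chosen for illustration (not taken from the thesis).
LABELS = ["PERSON", "ORGANIZATION", "LEGISLATION"]


def format_entities(entities):
    """Render entity pairs as 'span -> LABEL; ...', or 'none' when empty."""
    return "; ".join(f"{span} -> {label}" for span, label in entities) or "none"


def build_few_shot_prompt(examples, query, labels=LABELS):
    """Build a prompt with an instruction, k demonstrations, and the query."""
    lines = [
        "Annotate the named entities in each sentence. "
        f"Allowed labels: {', '.join(labels)}.",
        "",
    ]
    for ex in examples:
        lines.append(f"Sentence: {ex.text}")
        lines.append(f"Entities: {format_entities(ex.entities)}")
        lines.append("")
    # The query sentence is left open so the model completes the entity list.
    lines.append(f"Sentence: {query}")
    lines.append("Entities:")
    return "\n".join(lines)


def parse_entities(completion, labels=LABELS):
    """Parse a 'span -> LABEL; ...' reply, dropping malformed or unknown items."""
    result = []
    for item in completion.split(";"):
        if "->" not in item:
            continue
        span, label = (part.strip() for part in item.rsplit("->", 1))
        if span and label in labels:
            result.append((span, label))
    return result
```

The demonstrations are supplied only in the prompt, so the model's weights are never updated, which is what "without any gradient update" refers to. In an annotation workflow, the parsed pairs could be offered to human annotators as pre-annotations to accept or correct.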