ETDs

Title:

A DATA ANNOTATION APPROACH USING LARGE LANGUAGE MODELS

Author:

CARLOS VINICIOS MARTINS ROCHA

Contributor(s):

HELIO CORTES VIEIRA LOPES - Advisor
JONATAS DOS SANTOS GROSMAN - Co-advisor

Cataloged:

17/OCT/2024

Language(s):

ENGLISH - UNITED STATES

Type:

TEXT

Subtype:

THESIS

Notes:

[en] All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.

Reference(s):

[pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=68379&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=68379&idi=2

DOI:

https://doi.org/10.17771/PUCRio.acad.68379

Abstract:

Documents are essential to economic and academic systems; however, exploring them can be complex and time-consuming. One way to overcome this problem is to use Visual Question Answering (VQA) models to extract information from documents through natural language prompts. VQA, like many other modeling tasks, requires annotated data for training and validation. However, creating these datasets is challenging because of the high cost of the annotation process. To address this challenge, we propose a four-step process that combines computer vision models and Large Language Models (LLMs) for VQA data annotation in financial reports. The proposed method starts by recognizing the textual structure of documents using Document Layout Analysis and Table Structure Extraction models. It then uses two distinct LLMs to generate and evaluate question-answer pairs, automating the construction and selection of the best pairs for the final dataset. To evaluate the proposed method, we generate a dataset and use it to train and evaluate specialized VQA models.
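
The pipeline outlined in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the thesis implementation: the abstract does not name the models, prompts, or selection rule, so `annotate_page`, `QAPair`, the four callables, and the `min_score` threshold are all hypothetical, and the stub functions stand in for the real layout, table, and LLM components.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class QAPair:
    question: str
    answer: str
    score: float = 0.0


def annotate_page(
    page: object,
    detect_layout: Callable[[object], List[str]],    # hypothetical layout model
    extract_tables: Callable[[object], List[str]],   # hypothetical table model
    generate_pairs: Callable[[str], List[QAPair]],   # hypothetical generator LLM
    score_pair: Callable[[str, QAPair], float],      # hypothetical evaluator LLM
    min_score: float = 0.5,                          # assumed selection rule
) -> List[QAPair]:
    """Run the four assumed stages on one page and return the kept pairs."""
    # Stages 1 and 2: recover text regions and table structure as plain text.
    context = "\n".join(detect_layout(page) + extract_tables(page))
    # Stage 3: one model proposes candidate question-answer pairs.
    candidates = generate_pairs(context)
    # Stage 4: a second model scores each candidate against the context.
    for pair in candidates:
        pair.score = score_pair(context, pair)
    kept = [p for p in candidates if p.score >= min_score]
    return sorted(kept, key=lambda p: p.score, reverse=True)


# Stub components so the sketch runs without any model; all are placeholders.
def stub_layout(page: object) -> List[str]:
    return ["Revenue grew in 2023."]


def stub_tables(page: object) -> List[str]:
    return ["Year | Revenue", "2023 | 10"]


def stub_generator(context: str) -> List[QAPair]:
    return [
        QAPair("What was the revenue in 2023?", "10"),
        QAPair("What was the revenue in 2022?", "7"),
    ]


def stub_evaluator(context: str, pair: QAPair) -> float:
    # Toy check: an answer counts as supported only if it appears in the context.
    return 1.0 if pair.answer in context else 0.0


if __name__ == "__main__":
    for pair in annotate_page(None, stub_layout, stub_tables,
                              stub_generator, stub_evaluator):
        print(pair)
```

Passing the four components in as callables keeps the sketch independent of any particular model or API; in practice each stub would be replaced by a call to the corresponding model.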

Description:			File:
COMPLETE			PDF