Title: A CLUSTER-BASED METHOD FOR ACTION SEGMENTATION USING SPATIO-TEMPORAL AND POSITIONAL ENCODED EMBEDDINGS
Author: GUILHERME DE AZEVEDO P MARQUES
Contributor(s): SERGIO COLCHER - Advisor
Cataloged: 20/APR/2023
Language(s): ENGLISH - UNITED STATES
Type: TEXT
Subtype: THESIS
Notes: All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio.
Reference(s):
[pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=62315&idi=1
[en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=62315&idi=2
DOI: https://doi.org/10.17771/PUCRio.acad.62315
Abstract:
The rise of video content as the main medium for communication
creates massive volumes of video data every second. The ability
to understand these huge quantities of data automatically has become
increasingly important, so better video understanding methods are
needed. A crucial task in overall video understanding is the recognition
and localisation in time of different actions. To address this problem,
action segmentation must be achieved. Action segmentation consists of
temporally segmenting a video by labeling each frame with a specific
action. In this work, we propose a novel action segmentation method that
requires no prior video analysis and no annotated data. Our method involves
extracting spatio-temporal features from videos using a pre-trained deep
network. The features are then transformed using a positional encoder, and finally a
clustering algorithm is applied, where each cluster presumably corresponds
to a different single, distinguishable action. In experiments, we show
that our method produces competitive results on the Breakfast and Inria
Instructional Videos dataset benchmarks.
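
The three stages named above (feature extraction, positional encoding, clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: the abstract does not say which pre-trained network, positional encoder, or clustering algorithm is used, so the sketch substitutes synthetic per-frame features for the network output, a standard sinusoidal encoding concatenated to the features, and a plain k-means written in NumPy. The names `sinusoidal_encoding`, `kmeans`, `segment_frames`, `pos_dim`, and `pos_weight` are hypothetical.

```python
import numpy as np


def sinusoidal_encoding(num_frames, dim):
    """Sinusoidal positional encoding, one row per frame index (dim must be even)."""
    positions = np.arange(num_frames)[:, None]
    freqs = np.exp(-np.log(10000.0) * (np.arange(0, dim, 2) / dim))
    angles = positions * freqs[None, :]
    enc = np.zeros((num_frames, dim))
    enc[:, 0::2] = np.sin(angles)  # even columns
    enc[:, 1::2] = np.cos(angles)  # odd columns
    return enc


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one integer cluster label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def segment_frames(features, num_actions, pos_dim=16, pos_weight=1.0):
    """Label each frame with a cluster id (assumed to stand for one action)."""
    # Standardize features so they are comparable in scale to the encoding.
    feats = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)
    pos = sinusoidal_encoding(len(features), pos_dim)
    # Assumption: positional information is concatenated, not added.
    combined = np.concatenate([feats, pos_weight * pos], axis=1)
    return kmeans(combined, num_actions)


# Synthetic stand-in for per-frame features from a pre-trained network:
# three consecutive "actions" of 40 frames each, 8 dimensions per frame.
rng = np.random.default_rng(42)
features = np.concatenate(
    [rng.normal(loc=mean, scale=0.1, size=(40, 8)) for mean in (0.0, 3.0, 6.0)]
)
labels = segment_frames(features, num_actions=3)
print(labels)
```

The resulting `labels` array has one entry per frame, which matches the frame-wise labeling that action segmentation calls for. Cluster ids are arbitrary, so evaluating against ground-truth actions would need a separate cluster-to-action matching step that is not shown here.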