Title: | DEEP REINFORCEMENT LEARNING FOR QUADROTOR TRAJECTORY CONTROL IN VIRTUAL ENVIRONMENTS |
Author: | GUILHERME SIQUEIRA EDUARDO |
Collaborator(s): | WOUTER CAARLS - Advisor |
Cataloged: | 12/AUG/2021 | Language(s): | ENGLISH - UNITED STATES |
Type: | TEXT | Subtype: | THESIS |
Notes: | All data contained in the documents are the sole responsibility of the authors. The data used in the descriptions of the documents are in conformity with the systems of the administration of PUC-Rio. |
Reference(s): | [pt] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=54178&idi=1 [en] https://www.maxwell.vrac.puc-rio.br/projetosEspeciais/ETDs/consultas/conteudo.php?strSecao=resultado&nrSeq=54178&idi=2 |
DOI: | https://doi.org/10.17771/PUCRio.acad.54178 |
Abstract: |
With recent advances in computational power, the use of novel, complex
control models has become viable for controlling quadrotors. One such method
is Deep Reinforcement Learning (DRL), which can devise a control policy
that better addresses non-linearities in the quadrotor model than traditional
control methods. An important non-linearity present in payload-carrying air
vehicles is the set of inherent time-varying properties, such as size and mass,
caused by the addition and removal of cargo. The general, domain-agnostic
approach of the DRL controller also allows it to handle visual navigation,
in which position estimation data is unreliable. In this work, we employ a
Soft Actor-Critic algorithm to design controllers for a quadrotor to carry out
tasks reproducing the mentioned challenges in a virtual environment. First,
we develop two waypoint guidance controllers: a low-level controller that acts
directly on motor commands and a high-level controller that interacts in
cascade with a velocity PID controller. The controllers are then evaluated
on the proposed payload pickup-and-drop task, thereby introducing a time-varying
variable. The resulting controllers outperform a traditional
positional PID controller with optimized gains in the proposed course, while
remaining agnostic to a set of simulation parameters. Finally, we employ the
same DRL algorithm to develop a controller that can leverage visual data to
complete a racing course in simulation. With this controller, the quadrotor is
able to localize gates using an RGB-D camera and devise a trajectory that
drives it to traverse as many gates in the racing course as possible.
|
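The cascade described in the abstract, in which a high-level controller issues velocity setpoints that an inner velocity PID loop turns into actuator commands, can be sketched as follows. This is a minimal illustration only: the gains, the stand-in proportional "policy", and the toy 1-D point-mass dynamics are assumptions for demonstration, not the thesis implementation (where the high-level controller is a learned SAC policy acting on a simulated quadrotor).

```python
# Illustrative cascade control sketch (hypothetical gains and dynamics).
# Outer loop: a stand-in policy maps position error to a velocity setpoint.
# Inner loop: a PID controller tracks that velocity setpoint.

class VelocityPID:
    """Simple PID loop acting on velocity error."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, v_setpoint, v_measured):
        error = v_setpoint - v_measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def high_level_policy(position, target):
    """Stand-in for the learned policy: clipped proportional velocity command."""
    return max(min(target - position, 1.0), -1.0)


def simulate(steps=2000, dt=0.01, target=2.0):
    """Drive a 1-D point mass to `target` with the two-loop cascade."""
    pid = VelocityPID(kp=2.0, ki=0.5, kd=0.0, dt=dt)
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        v_sp = high_level_policy(pos, target)  # outer loop: velocity setpoint
        accel = pid.step(v_sp, vel)            # inner loop: acceleration command
        vel += accel * dt                      # toy point-mass dynamics
        pos += vel * dt
    return pos
```

The design point the cascade illustrates is separation of concerns: the outer controller only reasons about where to go, while the inner PID loop absorbs the fast actuation dynamics, which is what lets the thesis swap a learned policy in at the outer level.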