Control of an unmanned aerial vehicle (UAV) using deep reinforcement learning (DRL) approach

Bibliographic details
Year of defense: 2021
Main author: Alves, Adson Nogueira [UNESP]
Advisor: Simões, Alexandre da Silva [UNESP]
Defense committee: Not informed by the institution
Document type: Master's thesis
Access type: Open access
Language: English
Defending institution: Universidade Estadual Paulista (Unesp)
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords: Artificial intelligence; Neural networks (computing); Embedded computer systems; Drone aircraft; Robot vision
Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), finance code 001
Access link: http://hdl.handle.net/11449/214035
Abstract: Unmanned Aerial Vehicles (UAVs) have received increasing attention in recent years, mainly due to their breadth of application in complex and costly activities such as surveillance, agriculture, and entertainment. This market and academic interest has highlighted new challenges that the platform must confront. Among them is the complexity of navigating unknown environments, given the randomness of the position and movement dynamics of the agents in the environment. New learning techniques have therefore been proposed for these and other tasks in recent years. In particular, model-free algorithms based on autonomous exploration and learning have stood out in this domain. This is the case of Reinforcement Learning (RL), which seeks appropriate behavior for the robot through trial and error, mapping input states directly to actuator commands; any pre-defined control structure thus becomes unnecessary. The present work investigates UAV navigation using a state-of-the-art off-policy deep reinforcement learning method, the Soft Actor-Critic (SAC). Our approach combines visual information from the environment with readings from multiple embedded sensors, using an Autoencoder (AE) to reduce the dimensionality of the visual data collected in the environment. The work was developed in the CoppeliaSim simulator, which has a high degree of fidelity to the real world, through PyRep, a framework for robot learning research. In this scenario, we investigated the representation of the aircraft's state and the resulting navigation in environments with and without obstacles, both fixed and mobile. The results showed that the learned policy was able to perform low-level control of the UAV in all analyzed scenarios, and the learned policies generalized well. However, as the complexity of the environment increased, policies learned in less complex environments could be reused but required further training.
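For reference, these are the standard Soft Actor-Critic objectives (Haarnoja et al., 2018) that the abstract names; the thesis' exact hyperparameters and networks are not given in the abstract. Here D is the replay buffer, which is what makes SAC off-policy (transitions collected by earlier versions of the policy are reused for training), alpha weights the entropy bonus, and the barred parameters belong to slowly-updated target critics.

```latex
% Soft Bellman target for the twin critics; a' is re-sampled from the current
% policy at the next state, and its log-probability acts as an entropy bonus.
y = r + \gamma \Big( \min_{i=1,2} Q_{\bar\theta_i}(s', a')
      - \alpha \log \pi_\phi(a' \mid s') \Big),
      \qquad a' \sim \pi_\phi(\cdot \mid s')

% Critic loss: mean-squared error against the soft target over replay batches.
J_Q(\theta_i) = \mathbb{E}_{(s,a,r,s') \sim \mathcal{D}}
      \Big[ \tfrac{1}{2} \big( Q_{\theta_i}(s,a) - y \big)^2 \Big]

% Actor loss: maximize the entropy-regularized value of re-sampled actions.
J_\pi(\phi) = \mathbb{E}_{s \sim \mathcal{D},\; a \sim \pi_\phi}
      \Big[ \alpha \log \pi_\phi(a \mid s) - \min_{i=1,2} Q_{\theta_i}(s,a) \Big]
```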
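The abstract does not specify the autoencoder architecture, so the following is only a minimal sketch, in Python with PyTorch (an assumption), of how an AE can compress camera frames into a low-dimensional latent vector for the RL state; the 64x64 input resolution, layer sizes, and 32-dimensional latent are illustrative, not taken from the thesis.

```python
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Compresses 3x64x64 camera frames to a latent vector and back."""

    def __init__(self, latent_dim: int = 32):
        super().__init__()
        # Encoder: 3x64x64 image -> latent_dim vector.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 4, stride=2, padding=1),   # -> 16x32x32
            nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1),  # -> 32x16x16
            nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1),  # -> 64x8x8
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 8 * 8, latent_dim),
        )
        # Decoder: latent_dim vector -> reconstructed 3x64x64 image.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64 * 8 * 8),
            nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1),  # -> 32x16x16
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # -> 16x32x32
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),   # -> 3x64x64
            nn.Sigmoid(),
        )

    def encode(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

# Train on frames collected in the simulator by minimizing reconstruction error.
ae = ConvAutoencoder()
frames = torch.rand(8, 3, 64, 64)  # placeholder batch of camera images
loss = nn.functional.mse_loss(ae(frames), frames)
```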
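In the same spirit, here is a hedged sketch of how the RL state could be assembled from CoppeliaSim through PyRep, combining the AE latent with embedded-sensor readings, as the abstract describes. The scene file 'uav_scene.ttt', the object name 'Quadricopter', and the sensor name 'onboard_camera' are hypothetical placeholders, and the exact observation vector used in the thesis is not given in the abstract.

```python
import numpy as np
import torch
from pyrep import PyRep
from pyrep.objects.shape import Shape
from pyrep.objects.vision_sensor import VisionSensor

pr = PyRep()
pr.launch('uav_scene.ttt', headless=True)  # hypothetical scene file
pr.start()

uav = Shape('Quadricopter')              # object name depends on the scene
camera = VisionSensor('onboard_camera')  # hypothetical sensor name

def get_state(ae) -> np.ndarray:
    """Build the observation: AE latent of the camera frame + sensor readings.

    Assumes the vision sensor is configured at the AE's input resolution
    (64x64 in the sketch above) and that `ae` is a trained ConvAutoencoder.
    """
    frame = camera.capture_rgb()  # HxWx3 float array in [0, 1]
    img = torch.from_numpy(frame).float().permute(2, 0, 1).unsqueeze(0)
    with torch.no_grad():
        latent = ae.encode(img).squeeze(0).numpy()
    # Embedded-sensor readings: pose and velocities of the aircraft.
    pos = uav.get_position()
    orient = uav.get_orientation()
    lin_vel, ang_vel = uav.get_velocity()
    return np.concatenate([latent, pos, orient, lin_vel, ang_vel])

# pr.step() advances the simulation one timestep between actions;
# pr.stop(); pr.shutdown() clean up when the episode loop ends.
```

The resulting vector would be fed to the SAC actor and critics, whose outputs map directly to the UAV's actuator commands, consistent with the low-level control described in the abstract.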