Uma Comparação De Arquiteturas Baseadas Em U-net Na Estimativa De Profundidade Monocular
| Defense year: | 2024 |
|---|---|
| Main author: | Silva, Antônio Carlos Durães da |
| Advisor: | Gazolli, Kelly Assis de Souza |
| Defense committee: | Campana, Vitor Faiçal; Komati, Karin Satie |
| Document type: | Master's thesis (Dissertação) |
| Access type: | Open access |
| Language: | por |
| Defense institution: | Serra |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | U-Net; UNet++; Profundidade monocular; Visão computacional; Redes transformers; Rede neural convolucional |
| Access link: | https://repositorio.ifes.edu.br/handle/123456789/5718 |
Abstract: | Monocular depth estimation is a fundamental task in computer vision, with applications in areas such as augmented reality, autonomous navigation, and medical procedures. This work investigates the reuse of architectures originally developed for semantic segmentation, such as U-Net and its variants, in the task of depth estimation. Combinations of U-Net and UNet++ architectures with different encoders were implemented and evaluated, covering convolutional networks, Transformer networks, and hybrid architectures, including VGG-19 (BN), Xception, Inception-ResNet-v2, Mixed Transformer (B2), CoaT, CoAtNet, and TransUnet. The experiments were conducted on the NYU Depth V2 dataset, using multiple input sizes to investigate the impact of resolution on the results. The evaluated metrics include RMSE, rel, log10, and the accuracy thresholds δ_1, δ_2, and δ_3. The results reveal that the combination of U-Net with the CoaT-Lite (M) encoder outperforms all other evaluated approaches and network combinations. The implementation is available at: https://github.com/duraes-antonio/seg_depth. |
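
The abstract names its evaluation metrics without defining them. The sketch below shows how they are conventionally computed in the monocular depth literature, and it is an assumption that the thesis uses these same definitions: RMSE as the root of the mean squared error, rel as the mean absolute error relative to the ground truth, log10 as the mean absolute difference of base-10 logarithms, and δ_i as the fraction of pixels whose ratio max(d/d*, d*/d) is below 1.25^i. The function name `depth_metrics` and the flat-list inputs are hypothetical and are not taken from the seg_depth repository.

```python
import math


def depth_metrics(pred, target):
    """Compute common depth metrics over paired, strictly positive depth values.

    `pred` and `target` are flat sequences of predicted and ground-truth depths.
    The definitions follow the usual convention in the literature (an assumption).
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and equally long")

    n = len(pred)
    sq_err = 0.0   # running sum of squared errors (for RMSE)
    abs_rel = 0.0  # running sum of |d - d*| / d* (for rel)
    log_err = 0.0  # running sum of |log10 d - log10 d*| (for log10)
    hits = [0, 0, 0]  # pixels passing the 1.25, 1.25^2, 1.25^3 thresholds

    for p, t in zip(pred, target):
        if p <= 0 or t <= 0:
            raise ValueError("depth values must be strictly positive")
        sq_err += (p - t) ** 2
        abs_rel += abs(p - t) / t
        log_err += abs(math.log10(p) - math.log10(t))
        ratio = max(p / t, t / p)
        for i in range(3):
            if ratio < 1.25 ** (i + 1):
                hits[i] += 1

    return {
        "rmse": math.sqrt(sq_err / n),
        "rel": abs_rel / n,
        "log10": log_err / n,
        "delta1": hits[0] / n,
        "delta2": hits[1] / n,
        "delta3": hits[2] / n,
    }
```

Lower is better for RMSE, rel, and log10, while higher is better for the δ thresholds. In practice the inputs would be the valid pixels of a predicted depth map and its ground truth, flattened into two sequences of equal length.
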
| id |
IFES-2_42db7de1aeee2c24741469f9aac9fedd |
|---|---|
| oai_identifier_str |
oai:repositorio.ifes.edu.br:123456789/5718 |
| network_acronym_str |
IFES-2 |
| network_name_str |
Repositório Institucional do IFES |
| repository_id_str |
|
| spelling |
Silva, Antônio Carlos Durães da; Instituto Federal do Espírito Santo; Campana, Vitor Faiçal; Komati, Karin Satie; Gazolli, Kelly Assis de Souza. Keywords: Monocular depth estimation. U-Net. UNet++. Transformers. Campus Serra; Computação Aplicada; Inteligência Artificial. http://lattes.cnpq.br/0343732414150447; https://orcid.org/0000-0001-5551-3258; http://lattes.cnpq.br/9860697624155451; https://orcid.org/0000-0001-5677-4724. TEXT: dissertação.pdf.txt (Extracted text, text/plain, 123374). THUMBNAIL: dissertação.pdf.jpg (Generated Thumbnail, image/jpeg, 2167). ORIGINAL: dissertação.pdf (Dissertação - Mestrado em Computação Aplicada - IFES Serra, application/pdf, 23277576). LICENSE: license.txt (text/plain; charset=utf-8, 934). 2025-08-27T18:26:44.221Z; https://repositorio.ifes.edu.br/server/oai/request; repositorio@ifes.edu.br |
| dc.title.pt_BR.fl_str_mv |
Uma Comparação De Arquiteturas Baseadas Em U-net Na Estimativa De Profundidade Monocular |
| title |
Uma Comparação De Arquiteturas Baseadas Em U-net Na Estimativa De Profundidade Monocular |
| author |
Silva, Antônio Carlos Durães da |
| author_facet |
Silva, Antônio Carlos Durães da |
| author_role |
author |
| dc.contributor.institution.pt_BR.fl_str_mv |
Instituto Federal do Espírito Santo |
| dc.contributor.member.none.fl_str_mv |
Campana, Vitor Faiçal; Komati, Karin Satie |
| dc.contributor.author.fl_str_mv |
Silva, Antônio Carlos Durães da |
| dc.contributor.advisor1.fl_str_mv |
Gazolli, Kelly Assis de Souza |
| contributor_str_mv |
Gazolli, Kelly Assis de Souza |
| dc.subject.por.fl_str_mv |
U-Net; UNet++; Profundidade monocular; Visão computacional; Redes transformers; Rede neural convolucional |
| topic |
U-Net; UNet++; Profundidade monocular; Visão computacional; Redes transformers; Rede neural convolucional |
| description |
Monocular depth estimation is a fundamental task in computer vision, with applications in areas such as augmented reality, autonomous navigation, and medical procedures. This work investigates the reuse of architectures originally developed for semantic segmentation, such as U-Net and its variants, in the task of depth estimation. Combinations of U-Net and UNet++ architectures with different encoders were implemented and evaluated, covering convolutional networks, Transformer networks, and hybrid architectures, including VGG-19 (BN), Xception, Inception-ResNet-v2, Mixed Transformer (B2), CoaT, CoAtNet, and TransUnet. The experiments were conducted on the NYU Depth V2 dataset, using multiple input sizes to investigate the impact of resolution on the results. The evaluated metrics include RMSE, rel, log10, and the accuracy thresholds δ_1, δ_2, and δ_3. The results reveal that the combination of U-Net with the CoaT-Lite (M) encoder outperforms all other evaluated approaches and network combinations. The implementation is available at: https://github.com/duraes-antonio/seg_depth. |
| publishDate |
2024 |
| dc.date.issued.fl_str_mv |
2024-11-19 |
| dc.date.accessioned.fl_str_mv |
2025-02-14T22:08:45Z |
| dc.date.available.fl_str_mv |
2025-02-14T22:08:45Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.citation.fl_str_mv |
Silva, Antonio Carlos Duraes da. Uma Comparação De Arquiteturas Baseadas Em U-Net Na Estimativa De Profundidade Monocular. 2024. 69 f. Dissertação (Mestrado em Computação Aplicada) - Instituto Federal do Espírito Santo, Campus Serra, Serra, 2024. |
| dc.identifier.uri.fl_str_mv |
https://repositorio.ifes.edu.br/handle/123456789/5718 |
| dc.identifier.capes.pt_BR.fl_str_mv |
30004012075P4 |
| identifier_str_mv |
Silva, Antonio Carlos Duraes da. Uma Comparação De Arquiteturas Baseadas Em U-Net Na Estimativa De Profundidade Monocular. 2024. 69 f. Dissertação (Mestrado em Computação Aplicada) - Instituto Federal do Espírito Santo, Campus Serra, Serra, 2024. 30004012075P4 |
| url |
https://repositorio.ifes.edu.br/handle/123456789/5718 |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
65 f. |
| dc.publisher.none.fl_str_mv |
Serra |
| publisher.none.fl_str_mv |
Serra |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional do IFES; instname:Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo (IFES); instacron:IFES |
| instname_str |
Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo (IFES) |
| instacron_str |
IFES |
| institution |
IFES |
| reponame_str |
Repositório Institucional do IFES |
| collection |
Repositório Institucional do IFES |
| bitstream.url.fl_str_mv |
https://repositorio.ifes.edu.br/bitstreams/2330426b-0027-4971-84fb-7d0f4ab23dba/download https://repositorio.ifes.edu.br/bitstreams/ec7a26bf-0b84-47cd-bd1c-77eef8afcc90/download https://repositorio.ifes.edu.br/bitstreams/22a6b194-a340-4206-b517-3b78679b2f90/download https://repositorio.ifes.edu.br/bitstreams/4c225af8-8c3d-461e-9855-e03fe6b8cb1c/download |
| bitstream.checksum.fl_str_mv |
9b20ddc8a5cee2b76c2c7507270fd2c5 93f2c6720a5b32d73850a875f8088184 5f08d2bbf9bc5cb6b3ad6258bd3e4a8f ac7cb971050ed632be934da23d966924 |
| bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
| repository.name.fl_str_mv |
Repositório Institucional do IFES - Instituto Federal de Educação, Ciência e Tecnologia do Espírito Santo (IFES) |
| repository.mail.fl_str_mv |
repositorio@ifes.edu.br |
| _version_ |
1864451011147464704 |