Deep learning-based computer vision techniques for automated identification of Ichneumonoidea and other Hymenoptera insects
| Year of defense: | 2026 |
|---|---|
| Lead author: | Pinheiro, João Manoel Herrera |
| Advisor: | |
| Examination committee: | |
| Document type: | Dissertation (master's thesis) |
| Access type: | Open access |
| Language: | eng |
| Defending institution: | Biblioteca Digital de Teses e Dissertações da USP |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://www.teses.usp.br/teses/disponiveis/18/18162/tde-09022026-143242/ |
| Abstract: | Precise taxonomic identification is a prerequisite for effective biodiversity monitoring, ecological research, and biological control strategies. The order Hymenoptera, encompassing over 150,000 described species, includes the hyper-diverse superfamily Ichneumonoidea, which comprises the families Ichneumonidae and Braconidae. The taxonomic impediment, characterized by the vast number of species within these groups that remain undescribed, has direct impacts on ecology, biodiversity conservation, and practical applications such as biological control. To address this challenge, we introduce a curated, high-resolution dataset designed to advance automated taxonomic identification. The dataset comprises 1,739 specimen images focusing primarily on Ichneumonidae and Braconidae, supplemented by representative samples from nine additional hymenopteran families (Andrenidae, Apidae, Bethylidae, Chrysididae, Colletidae, Halictidae, Megachilidae, Pompilidae, and Vespidae) to ensure robust out-group differentiation. Annotation precision was maximized through a semi-automated workflow using the Computer Vision Annotation Tool (CVAT) integrated with the Segment Anything Model (SAM). This methodology enabled the generation of high-fidelity segmentation masks and bounding boxes for diagnostic morphological structures, specifically facilitating family-level identification. To establish a performance benchmark, we evaluated multiple deep learning architectures. For image-level classification, the YOLOv12 model performed best, achieving an accuracy exceeding 93%. Subsequently, an object detection pipeline was implemented for the automated detection of the insect body and wings, attaining a mean Average Precision (mAP) of over 90%. Furthermore, Explainable AI (XAI) techniques were employed to interpret the model's inference mechanism. These visual analyses confirmed that the network attends to biologically significant features, such as specific wing venation patterns, rather than confounding background artifacts, thereby validating the taxonomic reliability of the automated predictions. This dataset and its associated benchmarks represent a critical resource for the advancement of computational entomology. |
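
The abstract reports a mean Average Precision (mAP) for detecting the insect body and wings but does not say which mAP variant or IoU threshold was used. As a rough illustration of what such a figure measures, the following minimal sketch computes intersection-over-union (IoU) for axis-aligned boxes and an all-point-interpolated average precision for one class. The box format `(x1, y1, x2, y2)`, the 0.5 threshold, and the function names are assumptions made for this example; this is not the evaluation code of the dissertation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    preds: List[Tuple[float, Box]], truths: List[Box], thr: float = 0.5
) -> float:
    """AP for one class: area under the precision-recall curve.

    `preds` holds (confidence, box) pairs; `thr` is an assumed IoU threshold.
    """
    if not truths:
        return 0.0
    matched = set()
    hits = []
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_j, best_iou = -1, 0.0
        for j, truth in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_iou >= thr:
            matched.add(best_j)  # each ground-truth box is matched at most once
            hits.append(True)
        else:
            hits.append(False)
    precisions, recalls, tp = [], [], 0
    for k, hit in enumerate(hits, start=1):
        tp += int(hit)
        precisions.append(tp / k)
        recalls.append(tp / len(truths))
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """mAP as the plain mean of per-class AP values (e.g. body and wings)."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0
```

For instance, with two ground-truth boxes and three predictions of which the second is a false positive, `average_precision` integrates precision over the two recall steps; averaging the per-class results for the body and wing classes with `mean_average_precision` would then give a single mAP value comparable in kind, though not necessarily in protocol, to the one quoted above.
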
| id |
USP_da9c70fc49e59eae3eb119d35ef79de7 |
|---|---|
| oai_identifier_str |
oai:teses.usp.br:tde-09022026-143242 |
| network_acronym_str |
USP |
| network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
| repository_id_str |
|
| dc.title.none.fl_str_mv |
Deep learning-based computer vision techniques for automated identification of Ichneumonoidea and other Hymenoptera insects / Técnicas de visão computacional baseadas em aprendizado profundo para identificação automatizada de Ichneumonoidea e outros insetos da ordem Hymenoptera |
| title |
Deep learning-based computer vision techniques for automated identification of Ichneumonoidea and other Hymenoptera insects |
| author |
Pinheiro, João Manoel Herrera |
| author_facet |
Pinheiro, João Manoel Herrera |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Becker, Marcelo |
| dc.contributor.author.fl_str_mv |
Pinheiro, João Manoel Herrera |
| dc.subject.por.fl_str_mv |
aprendizado de máquina; arthropod; artrópode; biodiversidade; biodiversity; convolutional neural network; machine learning; rede neural convolucional; taxonomia; taxonomy |
| topic |
aprendizado de máquina; arthropod; artrópode; biodiversidade; biodiversity; convolutional neural network; machine learning; rede neural convolucional; taxonomia; taxonomy |
| publishDate |
2026 |
| dc.date.none.fl_str_mv |
2026-01-15 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/18/18162/tde-09022026-143242/ |
| url |
https://www.teses.usp.br/teses/disponiveis/18/18162/tde-09022026-143242/ |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
|
| dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
Release the content for public access. |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.coverage.none.fl_str_mv |
|
| dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
| instname_str |
Universidade de São Paulo (USP) |
| instacron_str |
USP |
| institution |
USP |
| reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
| collection |
Biblioteca Digital de Teses e Dissertações da USP |
| repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
| repository.mail.fl_str_mv |
virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br |
| _version_ |
1857669976814518272 |