Segmentation of oral lesions through convolutional neural networks

Bibliographic details
Year of defense: 2025
Main author: Souza, Eduardo Santos Carlos de
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
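Keywords in Portuguese: Câncer de boca; Inteligência artificial; Lesões orais potencialmente malignas; Redes neurais convolucionais; Segmentação semântica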
Access link: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/
Abstract: Artificial intelligence has been widely used in medicine in recent years, especially in medical imaging, with the goal of building models that quickly and precisely identify conditions or relevant characteristics in images and thereby aid medical professionals in their practice. Oral cancer and oral potentially malignant disorders are a group of conditions that have received relatively little attention from the scientific community, which is especially concerning since oral cancer is among the most common and deadly forms of cancer. Most existing work on these conditions focuses on classifying the lesions in the images, which is not sufficient for medical practice, as practitioners also need to differentiate healthy tissue from affected areas in order to properly diagnose and subsequently treat the disease. This work addresses this gap by investigating artificial intelligence models for the semantic segmentation of such images. Besides building the models themselves, it explores transfer learning as a way to mitigate the lack of annotated data needed to build them. In particular, the large ImageNet classification dataset was compared with smaller semantic segmentation datasets, namely COCO and ISIC 2018, in terms of the transfer learning performance each provides. Finally, to provide a reference for comparison, the pairwise difference in performance between the developed models and human performance was calculated. Two leading models based on the Attention U-Net were produced: one with a ConvNeXt backbone, of high computational cost, which obtained a Dice Score of 0.715; and one with a MobileNet backbone, of low computational cost, which obtained a Dice Score of 0.692. Regarding transfer learning, it was concluded that transfer learning from the ISIC 2018 dataset yields worse performance than no transfer learning, whereas the ImageNet and COCO datasets yield significant gains in performance. These latter two datasets produced highly similar results, and the hypothesis tests conducted could not establish that either was superior to the other. This opens the possibility of using the COCO dataset for quicker and less resource-intensive transfer learning. Finally, the paired comparison between the models and human performance showed statistically that humans outperform the models, indicating the need for further research and development on this task.
id USP_4bc01e6ebf5ffa5a68547e0c80e6c8c4
oai_identifier_str oai:teses.usp.br:tde-29072025-102150
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv Segmentation of oral lesions through convolutional neural networks
Segmentação de lesões orais através de redes neurais convolucionais
title Segmentation of oral lesions through convolutional neural networks
author Souza, Eduardo Santos Carlos de
author_facet Souza, Eduardo Santos Carlos de
author_role author
dc.contributor.none.fl_str_mv Carvalho, André Carlos Ponce de Leon Ferreira de
dc.contributor.author.fl_str_mv Souza, Eduardo Santos Carlos de
dc.subject.por.fl_str_mv Artificial intelligence
Câncer de boca
Convolutional neural networks
Inteligência artificial
Lesões orais potencialmente malignas
Oral cancer
Oral potentially malignant disorders
Redes neurais convolucionais
Segmentação semântica
Semantic segmentation
publishDate 2025
dc.date.none.fl_str_mv 2025-04-29
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/
url https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1844786351064481792