Segmentation of oral lesions through convolutional neural networks

Souza, Eduardo Santos Carlos de

Bibliographic details
Year of defense:	2025
Main author:	Souza, Eduardo Santos Carlos de
Advisor:	Not provided by the institution
Defense committee:	Not provided by the institution
Document type:	Master's thesis
Access type:	Open access
Language:	eng
Defending institution:	Biblioteca Digital de Teses e Dissertações da USP
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords:	Artificial intelligence; Câncer de boca; Convolutional neural networks; Inteligência artificial; Lesões orais potencialmente malignas; Oral cancer; Oral potentially malignant disorders; Redes neurais convolucionais; Segmentação semântica; Semantic segmentation
Access link:	https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/
Abstract:	Artificial intelligence has been widely used in medicine in recent years, especially for medical imaging, with the goal of creating models that quickly and precisely identify conditions or relevant characteristics in images in order to aid medical professionals in their practice. Oral cancer and oral potentially malignant disorders have received relatively little attention from the scientific community, which is especially concerning because oral cancer is among the most common and deadly forms of cancer. Most existing work on these conditions focuses on classifying the lesions in the images, which is not sufficient for medical practice: practitioners also need to differentiate healthy tissue from the affected areas in order to properly diagnose and subsequently treat the disease. This work addresses that gap by investigating artificial intelligence models for the semantic segmentation of such images. In addition to building the models themselves, it explores transfer learning as a way to mitigate the lack of annotated data needed to build them. In particular, the large ImageNet classification dataset was compared with smaller semantic segmentation datasets, namely COCO and ISIC 2018, in terms of the transfer learning performance each provides. Finally, to provide a reference for comparison, the pairwise difference in performance between the developed models and human performance was calculated. Two leading models based on the Attention U-Net were produced: one using a ConvNeXt backbone, of high computational cost, which obtained a Dice Score of 0.715; and one using a MobileNet backbone, of low computational cost, which obtained a Dice Score of 0.692.
Regarding transfer learning, the study concluded that transfer from the ISIC 2018 dataset performs worse than no transfer learning at all, whereas the ImageNet and COCO datasets provide significant gains in performance. These latter two datasets produced highly similar results, and the hypothesis testing conducted was unable to determine that either was superior to the other. This creates the possibility of using the COCO dataset for quicker and less resource-intensive transfer learning. Finally, the paired comparison between the models and human performance showed, with statistical significance, that humans outperform the models, indicating the need for further research and development for this task.
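
For readers unfamiliar with the metric, the Dice Score reported above measures the overlap between a predicted lesion mask and a reference annotation: twice the number of pixels both masks mark as lesion, divided by the total number of lesion pixels in the two masks. The following is a minimal sketch of that standard definition only. The function name, the toy 3x3 masks, and the handling of two empty masks are illustrative assumptions; this record does not state how the thesis aggregated scores across images or thresholded model outputs.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists of 0/1."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as lesion in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total lesion pixels in the prediction plus total in the reference.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / total


# Hypothetical toy masks (1 = lesion pixel, 0 = healthy tissue).
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
annotated = [[0, 1, 0],
             [0, 1, 1],
             [0, 0, 0]]
score = dice_score(predicted, annotated)
print(round(score, 3))
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no lesion pixels, so the reported values of 0.715 and 0.692 indicate substantial but imperfect overlap with the reference annotations.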

Item metadata

id	USP_4bc01e6ebf5ffa5a68547e0c80e6c8c4
oai_identifier_str	oai:teses.usp.br:tde-29072025-102150
network_acronym_str	USP
network_name_str	Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv	Segmentation of oral lesions through convolutional neural networks / Segmentação de lesões orais através de redes neurais convolucionais
title	Segmentation of oral lesions through convolutional neural networks
title_short	Segmentation of oral lesions through convolutional neural networks
title_full	Segmentation of oral lesions through convolutional neural networks
title_fullStr	Segmentation of oral lesions through convolutional neural networks
title_full_unstemmed	Segmentation of oral lesions through convolutional neural networks
title_sort	Segmentation of oral lesions through convolutional neural networks
author	Souza, Eduardo Santos Carlos de
author_facet	Souza, Eduardo Santos Carlos de
author_role	author
dc.contributor.none.fl_str_mv	Carvalho, André Carlos Ponce de Leon Ferreira de
dc.contributor.author.fl_str_mv	Souza, Eduardo Santos Carlos de
dc.subject.por.fl_str_mv	Artificial intelligence; Câncer de boca; Convolutional neural networks; Inteligência artificial; Lesões orais potencialmente malignas; Oral cancer; Oral potentially malignant disorders; Redes neurais convolucionais; Segmentação semântica; Semantic segmentation
topic	Artificial intelligence; Câncer de boca; Convolutional neural networks; Inteligência artificial; Lesões orais potencialmente malignas; Oral cancer; Oral potentially malignant disorders; Redes neurais convolucionais; Segmentação semântica; Semantic segmentation
publishDate	2025
dc.date.none.fl_str_mv	2025-04-29
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/
url	https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv	Release the content for public access. info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Release the content for public access.
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP
instname_str	Universidade de São Paulo (USP)
instacron_str	USP
institution	USP
reponame_str	Biblioteca Digital de Teses e Dissertações da USP
collection	Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv	virginia@if.usp.br || atendimento@aguia.usp.br
_version_	1844786351064481792
