Segmentation of oral lesions through convolutional neural networks
| Year of defense: | 2025 |
|---|---|
| Main author: | Souza, Eduardo Santos Carlos de |
| Advisor: | Carvalho, André Carlos Ponce de Leon Ferreira de |
| Defense committee: | |
| Document type: | Master's thesis |
| Access type: | Open access |
| Language: | eng |
| Defending institution: | Biblioteca Digital de Teses e Dissertações da USP |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
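| Keywords in Portuguese: | Câncer de boca; Inteligência artificial; Lesões orais potencialmente malignas; Redes neurais convolucionais; Segmentação semântica |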
| Access link: | https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/ |
| Abstract: | Artificial intelligence has been widely used in medicine in recent years, especially in medical imaging, with the goal of creating models that quickly and precisely identify conditions or relevant characteristics in images to aid medical professionals in their practice. Oral cancer and oral potentially malignant disorders are a group of conditions that has received relatively little attention from the scientific community, which is especially concerning since oral cancer is among the most common and deadly forms of cancer. Most works in the literature on these conditions focus on classifying the lesions in the images, which is not sufficient for medical practice, as practitioners also need to differentiate healthy tissue from the affected areas in order to properly diagnose and subsequently treat the disease. This work addresses this gap by investigating artificial intelligence models for the semantic segmentation of such images. Aside from building the models themselves, this work also explores transfer learning to mitigate the lack of annotated data needed to build such models. In particular, the large ImageNet classification dataset was compared with smaller semantic segmentation datasets, namely COCO and ISIC 2018, regarding the transfer learning performance each provides. Finally, to produce a reference for comparison, the pairwise difference in performance between the developed models and human performance was calculated. Two leading models based on the Attention U-Net were produced: a model with a ConvNeXt backbone, of high computational cost, which obtained a Dice Score of 0.715; and a model with a MobileNet backbone, of low computational cost, which obtained a Dice Score of 0.692. Regarding transfer learning, it was concluded that the ISIC 2018 dataset provides worse performance than no transfer learning, whereas the ImageNet and COCO datasets provide significant gains in performance. These latter two datasets produced highly similar results, and the hypothesis testing conducted was unable to determine that either was superior to the other. This creates the possibility of using the COCO dataset for quicker and less resource-intensive transfer learning. Finally, the paired comparison between the models and human performance showed statistically that humans outperform the models, indicating the need for further research and development for this task. |
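
The abstract reports results as Dice Scores (0.715 and 0.692). For readers unfamiliar with the metric, the following is a minimal sketch of the usual definition, 2|A∩B| / (|A| + |B|), for two binary masks. It is an illustration only: this record does not state how the thesis computes or averages the score, and the function name `dice_score`, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions.

```python
from typing import Sequence


def dice_score(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Dice coefficient between two binary masks given as nested 0/1 sequences."""
    intersection = 0  # pixels marked as lesion in both masks
    pred_area = 0     # pixels marked as lesion in the predicted mask
    target_area = 0   # pixels marked as lesion in the reference mask
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p:
                pred_area += 1
            if t:
                target_area += 1
    if pred_area + target_area == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / (pred_area + target_area)


# Hypothetical 3x3 masks: the prediction misses one annotated pixel.
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
annotated = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]

# 3 overlapping pixels, areas 3 and 4: 2*3 / (3+4) = 6/7
print(round(dice_score(predicted, annotated), 3))
```

A score of 1.0 means the predicted and annotated regions coincide exactly, and 0.0 means they do not overlap at all.
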
| id | USP_4bc01e6ebf5ffa5a68547e0c80e6c8c4 |
|---|---|
| oai_identifier_str | oai:teses.usp.br:tde-29072025-102150 |
| network_acronym_str | USP |
| network_name_str | Biblioteca Digital de Teses e Dissertações da USP |
| repository_id_str | |
| dc.title.none.fl_str_mv | Segmentation of oral lesions through convolutional neural networks; Segmentação de lesões orais através de redes neurais convolucionais |
| title | Segmentation of oral lesions through convolutional neural networks |
| spellingShingle | Segmentation of oral lesions through convolutional neural networks; Souza, Eduardo Santos Carlos de; Artificial intelligence; Câncer de boca; Convolutional neural networks; Inteligência artificial; Lesões orais potencialmente malignas; Oral cancer; Oral potentially malignant disorders; Redes neurais convolucionais; Segmentação semântica; Semantic segmentation |
| title_short | Segmentation of oral lesions through convolutional neural networks |
| title_full | Segmentation of oral lesions through convolutional neural networks |
| title_fullStr | Segmentation of oral lesions through convolutional neural networks |
| title_full_unstemmed | Segmentation of oral lesions through convolutional neural networks |
| title_sort | Segmentation of oral lesions through convolutional neural networks |
| author | Souza, Eduardo Santos Carlos de |
| author_facet | Souza, Eduardo Santos Carlos de |
| author_role | author |
| dc.contributor.none.fl_str_mv | Carvalho, André Carlos Ponce de Leon Ferreira de |
| dc.contributor.author.fl_str_mv | Souza, Eduardo Santos Carlos de |
| dc.subject.por.fl_str_mv | Artificial intelligence; Câncer de boca; Convolutional neural networks; Inteligência artificial; Lesões orais potencialmente malignas; Oral cancer; Oral potentially malignant disorders; Redes neurais convolucionais; Segmentação semântica; Semantic segmentation |
| topic | Artificial intelligence; Câncer de boca; Convolutional neural networks; Inteligência artificial; Lesões orais potencialmente malignas; Oral cancer; Oral potentially malignant disorders; Redes neurais convolucionais; Segmentação semântica; Semantic segmentation |
| publishDate | 2025 |
| dc.date.none.fl_str_mv | 2025-04-29 |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
| format | masterThesis |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/ |
| url | https://www.teses.usp.br/teses/disponiveis/55/55134/tde-29072025-102150/ |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | |
| dc.rights.driver.fl_str_mv | Release the content for public access. info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv | Release the content for public access. |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.coverage.none.fl_str_mv | |
| dc.publisher.none.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP |
| publisher.none.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP |
| dc.source.none.fl_str_mv | reponame:Biblioteca Digital de Teses e Dissertações da USP; instname:Universidade de São Paulo (USP); instacron:USP |
| instname_str | Universidade de São Paulo (USP) |
| instacron_str | USP |
| institution | USP |
| reponame_str | Biblioteca Digital de Teses e Dissertações da USP |
| collection | Biblioteca Digital de Teses e Dissertações da USP |
| repository.name.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
| repository.mail.fl_str_mv | virginia@if.usp.br; atendimento@aguia.usp.br |
| _version_ | 1844786351064481792 |