Hyperspectral image analysis of oral squamous cell carcinoma using machine learning techniques

Peres, Daniella Lúmara Pereira Mendes de Oliveira

Bibliographic details
Year of defense:	2025
Main author:	Peres, Daniella Lúmara Pereira Mendes de Oliveira
Advisor:	Not provided by the institution
Examination committee:	Not provided by the institution
Document type:	Master's thesis
Access type:	Open access
Language:	eng
Defending institution:	Biblioteca Digital de Teses e Dissertações da USP
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords:	FTIR spectroscopy; hyperspectral imaging; machine learning; oral cancer; vibrational spectroscopy
Access link:	https://www.teses.usp.br/teses/disponiveis/85/85134/tde-11122025-123821/
Abstract:	Oral squamous cell carcinoma (OSCC) remains one of the most aggressive malignancies of the head and neck region, with prognosis heavily dependent on early detection. Hyperspectral imaging (HSI) combined with Fourier Transform Infrared (FTIR) spectroscopy can capture detailed biochemical information from tissue samples. In this study, we investigated the performance of four machine learning (ML) models, namely Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Random Forest (RF), and Feedforward Neural Networks (FNNs), for the classification of FTIR hyperspectral images of OSCC and healthy oral tissue. Human tissue microarray samples, comprising 48 OSCC and 48 control specimens, were preprocessed using spectral trimming, smoothing, Extended Multiplicative Signal Correction (EMSC), and Standard Normal Variate (SNV) normalization. Spectra were unfolded for pixel-level analysis, and classification performance was evaluated through 10-fold cross-validation (CV) using metrics such as accuracy, F1-score, and the area under the ROC curve (AUC). LDA achieved robust results at both pixel and image levels, with an AUC of 0.9465 and 91.7% image-level accuracy. PLS-DA demonstrated strong pixel-level classification (AUC = 0.8686) but showed decreased performance at the image level. Random Forest outperformed the other models in pixel-level analysis (AUC = 0.9864) and maintained satisfactory image-level performance. FNNs achieved a balanced accuracy of 80% and highlighted spectral regions related to protein secondary structures as key discriminators. These findings confirm the potential of FTIR-HSI coupled with ML as a powerful tool for the early diagnosis of OSCC, with LDA and RF models offering particularly favorable performance in both interpretability and predictive capability.
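
The abstract mentions SNV normalization, unfolding of the hyperspectral images into pixel spectra, and results reported at both pixel and image level. The following is a minimal Python sketch of those three ideas on made-up numbers, not the author's implementation. The function names (`snv`, `unfold`, `image_label`) and the toy data are hypothetical, and the majority-vote rule for turning pixel predictions into an image label is an assumption, since the abstract does not say how image-level decisions were obtained.

```python
from collections import Counter
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center one spectrum and scale it by its own
    standard deviation. Assumes the spectrum is not constant (sigma > 0)."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(value - mu) / sigma for value in spectrum]


def unfold(cube):
    """Flatten a rows x cols x bands hypercube into a flat list of pixel spectra."""
    return [pixel for row in cube for pixel in row]


def image_label(pixel_labels):
    """Assumed aggregation rule: the image takes the most common pixel label."""
    return Counter(pixel_labels).most_common(1)[0][0]


# Toy 2 x 2 image with 3 spectral bands per pixel (illustrative values only).
cube = [
    [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]],
    [[0.5, 1.0, 1.5], [3.0, 2.0, 1.0]],
]

# Pixel-level representation: one SNV-normalized spectrum per pixel.
pixels = [snv(spectrum) for spectrum in unfold(cube)]

# Hypothetical pixel predictions from some classifier, aggregated per image.
predicted = ["OSCC", "OSCC", "control", "OSCC"]
print(len(pixels), image_label(predicted))
```

In a sketch like this, the first three toy pixels differ only by a multiplicative factor, so SNV maps them to the same normalized spectrum, which is the scaling effect the normalization is meant to remove. When pixels from one specimen are split across cross-validation folds, pixel-level scores can be optimistic, so grouping folds by specimen would be a reasonable design choice; whether the thesis did so is not stated in the abstract.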

Item metadata

id	USP_1c8119c0bc7370b854cfa4668804fc39
oai_identifier_str	oai:teses.usp.br:tde-11122025-123821
network_acronym_str	USP
network_name_str	Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv	Hyperspectral image analysis of oral squamous cell carcinoma using machine learning techniques / Análise de imagens hiperespectrais do carcinoma espinocelular oral utilizando técnicas de aprendizado de máquina
title	Hyperspectral image analysis of oral squamous cell carcinoma using machine learning techniques
author	Peres, Daniella Lúmara Pereira Mendes de Oliveira
author_facet	Peres, Daniella Lúmara Pereira Mendes de Oliveira
author_role	author
dc.contributor.none.fl_str_mv	Zezell, Denise Maria
dc.contributor.author.fl_str_mv	Peres, Daniella Lúmara Pereira Mendes de Oliveira
dc.subject.por.fl_str_mv	aprendizado de máquina; câncer oral; espectroscopia FTIR; espectroscopia vibracional; FTIR spectroscopy; hyperspectral imaging; imagem hiperespectral; machine learning; oral cancer; vibrational spectroscopy
topic	aprendizado de máquina; câncer oral; espectroscopia FTIR; espectroscopia vibracional; FTIR spectroscopy; hyperspectral imaging; imagem hiperespectral; machine learning; oral cancer; vibrational spectroscopy
publishDate	2025
dc.date.none.fl_str_mv	2025-05-08
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://www.teses.usp.br/teses/disponiveis/85/85134/tde-11122025-123821/
url	https://www.teses.usp.br/teses/disponiveis/85/85134/tde-11122025-123821/
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv	Release the content for public access. info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Release the content for public access.
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP
instname_str	Universidade de São Paulo (USP)
instacron_str	USP
institution	USP
reponame_str	Biblioteca Digital de Teses e Dissertações da USP
collection	Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv	virginia@if.usp.br || atendimento@aguia.usp.br
_version_	1857669982947639296
