Hyperspectral image analysis of oral squamous cell carcinoma using machine learning techniques
| Year of defense: | 2025 |
|---|---|
| Main author: | Peres, Daniella Lúmara Pereira Mendes de Oliveira |
| Advisor: | |
| Examination committee: | |
| Document type: | Master's dissertation |
| Access type: | Open access |
| Language: | eng |
| Defending institution: | Biblioteca Digital de Teses e Dissertações da USP |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://www.teses.usp.br/teses/disponiveis/85/85134/tde-11122025-123821/ |
| Abstract: | Oral squamous cell carcinoma (OSCC) remains one of the most aggressive malignancies of the head and neck region, with prognosis heavily dependent on early detection. Hyperspectral imaging (HSI) combined with Fourier Transform Infrared (FTIR) spectroscopy can capture detailed biochemical information from tissue samples. In this study, we investigated the performance of four machine learning (ML) models - Linear Discriminant Analysis (LDA), Partial Least Squares Discriminant Analysis (PLS-DA), Random Forest (RF), and Feedforward Neural Networks (FNNs) - for the classification of FTIR hyperspectral images of OSCC and healthy oral tissue. Human tissue microarray samples, comprising 48 OSCC and 48 control specimens, were preprocessed using spectral trimming, smoothing, Extended Multiplicative Signal Correction (EMSC), and Standard Normal Variate (SNV) normalization. Spectra were unfolded for pixel-level analysis, and classification performance was evaluated through 10-fold cross-validation (CV) using metrics such as accuracy, F1-score, and the area under the ROC curve (AUC). LDA achieved robust results at both pixel and image levels, with an AUC of 0.9465 and 91.7% image-level accuracy. PLS-DA demonstrated strong pixel-level classification (AUC = 0.8686) but showed decreased performance at the image level. Random Forest outperformed the other models in pixel-level analysis (AUC = 0.9864) and maintained satisfactory image-level performance. FNNs achieved balanced accuracy (80%) and highlighted spectral regions related to protein secondary structures as key discriminators. These findings confirm the potential of FTIR-HSI coupled with ML as a powerful tool for the early diagnosis of OSCC, with LDA and RF models offering particularly favorable performance in both interpretability and predictive capability. |
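
The abstract names SNV normalization and the unfolding of spectra for pixel-level analysis but does not give implementation details. The following is a minimal sketch of those two steps on toy data, not the dissertation's code. The function names (`unfold`, `snv`, `image_label`), the toy values, and the majority-vote rule used to turn pixel predictions into an image-level label are all assumptions made for illustration; the record does not state how image-level results were derived.

```python
from statistics import mean, stdev


def unfold(cube):
    """Flatten a rows x cols x bands hypercube into a flat list of pixel spectra."""
    return [spectrum for row in cube for spectrum in row]


def snv(spectrum):
    """Standard Normal Variate: centre and scale one spectrum by its own
    mean and standard deviation (sample standard deviation used here).
    A perfectly flat spectrum would have zero deviation and is not handled."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)
    return [(value - mu) / sigma for value in spectrum]


def image_label(pixel_labels):
    """Assumed aggregation rule: strict majority vote over the pixel
    predictions (0 = control, 1 = OSCC) of one image; ties give 0."""
    positives = sum(pixel_labels)
    return 1 if positives * 2 > len(pixel_labels) else 0


# Toy 2 x 2 image with 3 spectral bands per pixel (made-up values).
cube = [
    [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]],
    [[0.5, 1.5, 2.5], [3.0, 3.5, 4.0]],
]

# Each pixel spectrum is normalized independently of its neighbours.
pixels = [snv(spectrum) for spectrum in unfold(cube)]
```

In this toy example the four spectra differ only by an offset and a scale factor, so SNV maps them all to the same shape. This is the kind of baseline and intensity variation the normalization is meant to suppress before a classifier sees the pixels.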
| id |
USP_1c8119c0bc7370b854cfa4668804fc39 |
|---|---|
| oai_identifier_str |
oai:teses.usp.br:tde-11122025-123821 |
| network_acronym_str |
USP |
| network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
| repository_id_str |
|
| dc.title.none.fl_str_mv |
Hyperspectral image analysis of oral squamous cell carcinoma using machine learning techniques / Análise de imagens hiperespectrais do carcinoma espinocelular oral utilizando técnicas de aprendizado de máquina |
| author |
Peres, Daniella Lúmara Pereira Mendes de Oliveira |
| author_facet |
Peres, Daniella Lúmara Pereira Mendes de Oliveira |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Zezell, Denise Maria |
| dc.contributor.author.fl_str_mv |
Peres, Daniella Lúmara Pereira Mendes de Oliveira |
| dc.subject.por.fl_str_mv |
aprendizado de máquina; câncer oral; espectroscopia FTIR; espectroscopia vibracional; FTIR spectroscopy; hyperspectral imaging; imagem hiperespectral; machine learning; oral cancer; vibrational spectroscopy |
| publishDate |
2025 |
| dc.date.none.fl_str_mv |
2025-05-08 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/85/85134/tde-11122025-123821/ |
| url |
https://www.teses.usp.br/teses/disponiveis/85/85134/tde-11122025-123821/ |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
|
| dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
Release the content for public access. |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.coverage.none.fl_str_mv |
|
| dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
| instname_str |
Universidade de São Paulo (USP) |
| instacron_str |
USP |
| institution |
USP |
| reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
| collection |
Biblioteca Digital de Teses e Dissertações da USP |
| repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
| repository.mail.fl_str_mv |
virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br |
| _version_ |
1857669982947639296 |