Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring

Bibliographic details
Year of defense: 2025
Main author: Watanabe, Luciana Erika Yaginuma
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digitais de Teses e Dissertações da USP
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/21/21134/tde-11062025-162009/
Abstract: In the face of global ecosystem collapse and the urgent need to respond effectively through conservation actions, statistical modeling can be a powerful tool because it provides valuable baseline and predictive information. This study aims to enhance baseline assessments and monitoring programs by proposing analytical frameworks suited to different community targets and machine-learning modeling approaches. The database from the Santos Basin benthic system was used as a case study. All data were obtained from 100 sampling stations (11 transects with eight points each, plus an additional 12 stations over the São Paulo Plateau) during two campaigns, in June-November 2019 and February-June 2021. Taxonomic composition data and ecological indicators of the microbiota (bacteria and archaea), meiofauna, and macrofauna were obtained at each station, along with 132 environmental variables from the bottom water and sediment. Over four chapters, different machine-learning techniques were applied to model different benthic targets. In Chapter 2, meiofauna univariate descriptors were modeled with random forest regression, using environmental variables as predictors. Based on the spatial distribution of the predictions, six benthic zones were delineated in the Basin. In Chapter 3, multivariate nematode genera data were aggregated into associations using a self-organizing map (SOM) and hierarchical clustering (HC) analysis. Those associations were then modeled using observed and simulated environmental predictors. Six nematode associations were obtained, and their spatial distribution analytically confirmed the organization of the Basin as a mosaic of six benthic zones. In addition, the association model achieved high accuracy on simulated data for eight significant environmental variables, suggesting that the spatial distribution of the associations can be monitored using only those variables.
In Chapter 4, temporal variability was added to the meiofauna and macrofauna univariate data, and the model results highlighted the influence of temporal variation in oceanographic processes on those benthic communities, mainly on the continental shelf. The structured modeling framework lost 3% accuracy relative to a non-structured model but has the advantage of supporting generalization. Finally, in Chapter 5, more spatial complexity was added by integrating multivariate composition data from the microbiota, nematodes, and macrofauna through a multilayer SOM. Eight major benthic communities were identified, and they were highly correlated with each benthic group separately. The model reached 100% accuracy, and nine environmental variables were identified as significant. All modeling approaches presented in the chapters provide valuable baseline information on the benthic system, including reference compositions and relative densities, and lists of the most important environmental variables for prediction. The frameworks allowed the accuracy of the different model structures to be evaluated, supporting monitoring decisions. Based on the proposed structured model, it is also possible to make predictions for future scenarios. This is a major advantage of machine-learning approaches for adaptive monitoring programs, as models can be continually improved and simulations updated as more data are acquired. In conclusion, the study has high potential to support management and monitoring programs and, ultimately, the conservation of marine systems.
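As an illustration of the two-stage approach described for Chapter 3 (a SOM followed by hierarchical clustering of its units), the following is a minimal Python sketch, not the thesis implementation. The grid size, the learning-rate and neighborhood schedules, the Euclidean distance, the average linkage, and the synthetic two-variable "stations" are all assumptions made for this example; none of them comes from the thesis.

```python
import math
import random


def best_unit(units, x):
    """Return the grid position of the unit whose codebook vector is closest to x."""
    return min(units, key=lambda pos: math.dist(units[pos], x))


def train_som(samples, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): codebook vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    units = {(r, c): [rng.random() for _ in range(dim)]
             for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            bmu = best_unit(units, x)
            for pos, w in units.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return units


def cluster_units(units, k):
    """Average-linkage agglomerative clustering of the SOM units into k groups."""
    clusters = [[pos] for pos in units]

    def linkage(a, b):
        total = sum(math.dist(units[p], units[q]) for p in a for q in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return {pos: label for label, members in enumerate(clusters) for pos in members}


# Synthetic stand-in for station-by-variable data (hypothetical values).
data_rng = random.Random(1)
stations = ([[data_rng.gauss(0.2, 0.03), data_rng.gauss(0.2, 0.03)] for _ in range(20)]
            + [[data_rng.gauss(0.8, 0.03), data_rng.gauss(0.8, 0.03)] for _ in range(20)])

som = train_som(stations, rows=3, cols=3)
unit_label = cluster_units(som, k=2)
# Each station inherits the cluster label of its best-matching SOM unit.
associations = [unit_label[best_unit(som, x)] for x in stations]
print(associations)
```

Clustering the SOM units rather than the raw samples is what makes the second stage cheap: the hierarchical step runs on a handful of codebook vectors, and every station then takes the label of its best-matching unit.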
id USP_e6f4b9199b7589ef095ac585f1e26cb1
oai_identifier_str oai:teses.usp.br:tde-11062025-162009
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring
Modelagem por Aprendizagem de Máquina de Sistemas Bentônicos: Avaliação de Base e Monitoramento
author Watanabe, Luciana Erika Yaginuma
author_facet Watanabe, Luciana Erika Yaginuma
author_role author
dc.contributor.none.fl_str_mv Corbisier, Thais Navajas
dc.contributor.author.fl_str_mv Watanabe, Luciana Erika Yaginuma
dc.subject.por.fl_str_mv Bacia de Santos
Ecossistema marinho
Macrofauna
Macrofauna
Marine ecosystem
Meiofauna
Meiofauna
Microbiota
Microbiota
Nematoda
Nematoda
Santos Basin
publishDate 2025
dc.date.none.fl_str_mv 2025-01-28
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/21/21134/tde-11062025-162009/
url https://www.teses.usp.br/teses/disponiveis/21/21134/tde-11062025-162009/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1848370482507677696