Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring
| Year of defense: | 2025 |
|---|---|
| Lead author: | Watanabe, Luciana Erika Yaginuma |
| Advisor: | |
| Defense committee: | |
| Document type: | Thesis |
| Access type: | Open access |
| Language: | eng |
| Defending institution: | Biblioteca Digital de Teses e Dissertações da USP |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | |
| Access link: | https://www.teses.usp.br/teses/disponiveis/21/21134/tde-11062025-162009/ |
Abstract: | In the face of global ecosystem collapse and the urgent need to respond effectively through conservation actions, statistical modeling can be a powerful tool because it provides valuable baseline and predictive information. This study aims to enhance baseline assessments and monitoring programs by proposing analytical frameworks suited to different community targets and machine-learning modeling approaches. The database from the Santos Basin benthic system was used as a case study. All data were obtained from 100 sampling stations (11 transects with eight points each, plus 12 additional stations over the São Paulo Plateau) during two campaigns, in June-November 2019 and February-June 2021. Taxonomic composition data and ecological indicators of the microbiota (bacteria and archaea), meiofauna, and macrofauna were obtained at each station, along with 132 environmental variables from the bottom water and sediment. Over four chapters, different machine-learning techniques were applied to model different benthic targets. In Chapter 2, univariate meiofauna descriptors were modeled with random forest regression, using environmental variables as predictors. Based on the spatial distribution of the predictions, six benthic zones were delineated in the Basin. In Chapter 3, multivariate nematode genera data were aggregated into associations using a self-organizing map (SOM) and hierarchical clustering (HC) analysis. Those associations were then modeled using observed and simulated environmental predictors. Six nematode associations were obtained, and their spatial distribution confirmed that the Basin is organized as a mosaic of six benthic zones. In addition, the association model achieved high accuracy on simulated data for eight significant environmental variables, suggesting that the spatial distribution of the associations can be monitored using only those variables.
In Chapter 4, temporal variability was added to the univariate meiofauna and macrofauna data, and the model results highlighted the influence of temporal variation in oceanographic processes on those benthic communities, mainly on the continental shelf. The structured modeling framework lost 3% accuracy compared to an unstructured model, with the advantage that the structured model can make generalizations. Finally, in Chapter 5, more spatial complexity was added by integrating multivariate composition data from the microbiota, nematodes, and macrofauna through a multilayer SOM. Eight major benthic communities were identified, and they were highly correlated with each benthic group separately. The model reached 100% accuracy, and nine environmental variables were identified as significant. All modeling approaches presented in the chapters provide valuable baseline information on the benthic system, including reference values for composition and relative densities and lists of the most important environmental variables for prediction. The frameworks allowed the accuracy of the different model structures to be evaluated, supporting monitoring decisions. The proposed structured model also makes it possible to generate predictions for future scenarios. This is a major advantage of machine-learning approaches for adaptive monitoring programs, as models can be continually improved and re-simulated as more data are acquired. In conclusion, the study has high potential to support management and monitoring programs and, ultimately, the conservation of marine systems. |
| id |
USP_e6f4b9199b7589ef095ac585f1e26cb1 |
|---|---|
| oai_identifier_str |
oai:teses.usp.br:tde-11062025-162009 |
| network_acronym_str |
USP |
| network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
| repository_id_str |
|
| dc.title.none.fl_str_mv |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring; Modelagem por Aprendizagem de Máquina de Sistemas Bentônicos: Avaliação de Base e Monitoramento |
| title |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring |
| spellingShingle |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring Watanabe, Luciana Erika Yaginuma Bacia de Santos Ecossistema marinho Macrofauna Macrofauna Marine ecosystem Meiofauna Meiofauna Microbiota Microbiota Nematoda Nematoda Santos Basin |
| title_short |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring |
| title_full |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring |
| title_fullStr |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring |
| title_full_unstemmed |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring |
| title_sort |
Machine-Learning Modeling of Benthic Systems: Baseline Assessment and Monitoring |
| author |
Watanabe, Luciana Erika Yaginuma |
| author_facet |
Watanabe, Luciana Erika Yaginuma |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Corbisier, Thais Navajas |
| dc.contributor.author.fl_str_mv |
Watanabe, Luciana Erika Yaginuma |
| dc.subject.por.fl_str_mv |
Bacia de Santos; Ecossistema marinho; Macrofauna; Marine ecosystem; Meiofauna; Microbiota; Nematoda; Santos Basin |
| topic |
Bacia de Santos; Ecossistema marinho; Macrofauna; Marine ecosystem; Meiofauna; Microbiota; Nematoda; Santos Basin |
| publishDate |
2025 |
| dc.date.none.fl_str_mv |
2025-01-28 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
| format |
doctoralThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/21/21134/tde-11062025-162009/ |
| url |
https://www.teses.usp.br/teses/disponiveis/21/21134/tde-11062025-162009/ |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
|
| dc.rights.driver.fl_str_mv |
Release the content for public access. info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
Release the content for public access. |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.coverage.none.fl_str_mv |
|
| dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP |
| instname_str |
Universidade de São Paulo (USP) |
| instacron_str |
USP |
| institution |
USP |
| reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
| collection |
Biblioteca Digital de Teses e Dissertações da USP |
| repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
| repository.mail.fl_str_mv |
virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br |
| _version_ |
1865492324530782208 |