A synergistic approach to sugarcane yield forecasting using machine learning, remote sensing, and process-based modeling

Grubert, Daniel Alves da Veiga

Bibliographic details
Year of defense:	2023
Main author:	Grubert, Daniel Alves da Veiga
Advisor:	Not provided by the institution
Examination committee:	Not provided by the institution
Document type:	Thesis
Access type:	Open access
Language:	eng
Defense institution:	Biblioteca Digital de Teses e Dissertações da USP
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords:	Agricultural yield prediction; Algoritmos de aprendizado de máquina; APSIM-Sugar; Hybrid modeling; Machine learning algorithms; Modelagem híbrida; Modelagem preditiva; Predição de produtividade agrícola; Predictive modeling; Simulação de cana-de-açúcar; Sugarcane simulation
Access link:	https://www.teses.usp.br/teses/disponiveis/11/11152/tde-03112023-103445/
Abstract:	Accurate and precise crop yield forecasts are essential for farmers and decision-makers. This study assesses a hybrid approach that combines remote sensing data, process-based crop modeling, and machine learning algorithms to improve sugarcane yield predictions. The approach draws on several data sources: soil and plant variables simulated by the APSIM model (a process-based crop model), meteorological data, and vegetation indices. These data were used as inputs to machine learning models that forecast end-of-season sugarcane yield. Sixteen regression models were evaluated for forecasting sugarcane yield at the municipal level in the state of São Paulo, Brazil, over the period 2010-2020. The hybrid approach built on the K-Neighbors Regressor algorithm showed the best statistical performance, with the lowest Mean Absolute Error (MAE) of 3.26 t ha⁻¹ and a Mean Absolute Percentage Error (MAPE) of 4.54%. Sugarcane yield predictions were most accurate 1-2 months before harvest. The study also determined which variables had the greatest influence on yield prediction by excluding subsets of variables from the prediction model. Adding variables simulated by the process-based model (APSIM) as inputs to the machine learning models could reduce the Root Mean Square Error (RMSE) of yield prediction by 7.7% to 26.9%, while vegetation indices had the least impact on predictions. Meteorological data had a greater impact on yield prediction when supplied to the process-based model than when used directly in the machine learning algorithms. This result suggests that the variables simulated by APSIM offer a more comprehensive biophysical description of the interaction between soil, plant, and atmosphere.
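
The evaluation described in the abstract above (a nearest-neighbor regressor scored with MAE, MAPE, and RMSE, then re-scored with one group of input variables left out) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the feature-group names and column layout are hypothetical, and the k-nearest-neighbors regressor is a plain standard-library re-implementation, not the thesis's code or data.

```python
import math
import random

# Hypothetical column layout: which feature columns belong to which input group.
FEATURE_GROUPS = {"apsim": [0, 1], "weather": [2, 3], "vegetation_index": [4]}


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def knn_predict(train_X, train_y, x, k=5):
    """Average the targets of the k training rows closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda row: math.dist(row[0], x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def make_synthetic(n, rng):
    """Generate n rows of invented features and an invented yield-like target."""
    X, y = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(5)]
        # Arbitrary relationship, chosen only so the example has something to fit.
        target = 60 + 20 * row[0] + 10 * row[1] + 5 * row[2] + 2 * row[4] + rng.gauss(0, 1)
        X.append(row)
        y.append(target)
    return X, y


def evaluate(train_X, train_y, test_X, test_y, cols, k=5):
    """Fit-free k-NN evaluation using only the feature columns listed in cols."""
    train = [[row[c] for c in cols] for row in train_X]
    test = [[row[c] for c in cols] for row in test_X]
    pred = [knn_predict(train, train_y, x, k) for x in test]
    return {"MAE": mae(test_y, pred), "MAPE": mape(test_y, pred), "RMSE": rmse(test_y, pred)}


def run_ablation(seed=0, n_train=200, n_test=50, k=5):
    """Score the full feature set, then each set with one feature group removed."""
    rng = random.Random(seed)
    train_X, train_y = make_synthetic(n_train, rng)
    test_X, test_y = make_synthetic(n_test, rng)
    all_cols = sorted(c for cols in FEATURE_GROUPS.values() for c in cols)
    results = {"all": evaluate(train_X, train_y, test_X, test_y, all_cols, k)}
    for name, cols in FEATURE_GROUPS.items():
        kept = [c for c in all_cols if c not in cols]
        results["without_" + name] = evaluate(train_X, train_y, test_X, test_y, kept, k)
    return results


if __name__ == "__main__":
    for label, scores in run_ablation().items():
        print(label, {name: round(value, 2) for name, value in scores.items()})
```

Comparing the RMSE of each "without_..." run against the "all" run gives the kind of relative change the abstract reports for excluded variable groups. The numbers this sketch prints come from synthetic data and say nothing about the thesis results.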

Item metadata

id	USP_1234ddc6f73cc8ad7c7a9eaacf63858d
oai_identifier_str	oai:teses.usp.br:tde-03112023-103445
network_acronym_str	USP
network_name_str	Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv	A synergistic approach to sugarcane yield forecasting using machine learning, remote sensing, and process-based modeling; Uma abordagem sinérgica para a previsão de produtividade da cana-de-açúcar usando aprendizado de máquina, sensoriamento remoto e modelagem baseada em processos
title	A synergistic approach to sugarcane yield forecasting using machine learning, remote sensing, and process-based modeling
author	Grubert, Daniel Alves da Veiga
author_facet	Grubert, Daniel Alves da Veiga
author_role	author
dc.contributor.none.fl_str_mv	Pilau, Felipe Gustavo
dc.contributor.author.fl_str_mv	Grubert, Daniel Alves da Veiga
dc.subject.por.fl_str_mv	Agricultural yield prediction; Algoritmos de aprendizado de máquina; APSIM-Sugar; APSIM-Sugar; Hybrid modeling; Machine learning algorithms; Modelagem híbrida; Modelagem preditiva; Predição de produtividade agrícola; Predictive modeling; Simulação de cana-de-açúcar; Sugarcane simulation
topic	Agricultural yield prediction; Algoritmos de aprendizado de máquina; APSIM-Sugar; APSIM-Sugar; Hybrid modeling; Machine learning algorithms; Modelagem híbrida; Modelagem preditiva; Predição de produtividade agrícola; Predictive modeling; Simulação de cana-de-açúcar; Sugarcane simulation
publishDate	2023
dc.date.none.fl_str_mv	2023-08-28
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://www.teses.usp.br/teses/disponiveis/11/11152/tde-03112023-103445/
url	https://www.teses.usp.br/teses/disponiveis/11/11152/tde-03112023-103445/
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv	Release the content for public access. info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Release the content for public access.
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP
instname_str	Universidade de São Paulo (USP)
instacron_str	USP
institution	USP
reponame_str	Biblioteca Digital de Teses e Dissertações da USP
collection	Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv	virginia@if.usp.br || atendimento@aguia.usp.br || virginia@if.usp.br
_version_	1865490936765612032
