Imbalanced Regression Pipeline Recommendation

Bibliographic details
Year of defense: 2024
Main author: AVELINO, Juscimara Gomes
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Embargoed access
Language: eng
Defending institution: Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese: Regressão desbalanceada; Estratégias de reamostragem; Meta-aprendizado
Access link: https://repositorio.ufpe.br/handle/123456789/58487
Abstract: Imbalanced problems are common in various real-world scenarios and present significant challenges, especially for regression tasks due to the rarity of certain continuous target values. While these issues have been extensively explored in classification tasks, they also affect regression, complicating model performance. This work presents an extensive experimental study involving various balancing strategies and learning models, introduces a taxonomy for imbalanced regression approaches based on regression models, learning process modification, and evaluation metrics, and highlights new insights into the advantages of different strategies. From this study, it became evident that the choice of resampling method depends on the problem, learning models, and metrics, making it difficult to select an appropriate resampling strategy and learning model. As a result, it is necessary to test the majority of existing combinations. Based on these findings, this work proposes the Meta-learning for Imbalanced Regression (Meta-IR) framework to address these challenges. Meta-IR recommends optimal pipelines consisting of resampling strategies and learning models for imbalanced regression tasks. Two formulations are proposed: Independent, which separately recommends learning algorithms and resampling strategies, and Chained, which models their interdependencies sequentially. The Chained approach demonstrated superior performance, suggesting a significant relationship between learning algorithms and resampling strategies. Compared with AutoML models and baseline configurations, Meta-IR outperformed all of them, offering a more effective solution for imbalanced regression and indicating directions for future research.
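Illustrative sketch (not taken from the thesis): the difference between the Independent and Chained formulations can be pictured as a difference in what the second recommender is allowed to see. The toy code below assumes a 1-nearest-neighbour meta-learner, two hypothetical meta-features per dataset, placeholder labels ("RF", "SVR", "SMOGN", "RU", "NONE"), and a chain order of learner first, then resampler. None of these choices is confirmed by the record; they only show how information flows.

```python
from math import dist

# Hypothetical meta-dataset: one row of meta-features per previously seen dataset,
# with the learner and resampling strategy assumed to have worked best on it.
META_X = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.8, 0.3]]
BEST_LEARNER = ["RF", "RF", "SVR", "SVR"]
BEST_RESAMPLER = ["SMOGN", "SMOGN", "RU", "NONE"]

# Numeric encoding of the learner label, used only as an extra feature in the chain.
LEARNER_ID = {"RF": 0.0, "SVR": 1.0}


def nearest_label(train_x, train_y, query):
    """Return the label of the training row closest to `query` (1-NN)."""
    closest = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[closest]


def recommend_independent(query):
    """Predict learner and resampler separately from the same meta-features."""
    learner = nearest_label(META_X, BEST_LEARNER, query)
    resampler = nearest_label(META_X, BEST_RESAMPLER, query)
    return resampler, learner


def recommend_chained(query):
    """Predict the learner first, then feed that choice to the resampler model."""
    learner = nearest_label(META_X, BEST_LEARNER, query)
    # Second stage: meta-features are augmented with the (known or predicted) learner.
    augmented_x = [row + [LEARNER_ID[l]] for row, l in zip(META_X, BEST_LEARNER)]
    augmented_query = list(query) + [LEARNER_ID[learner]]
    resampler = nearest_label(augmented_x, BEST_RESAMPLER, augmented_query)
    return resampler, learner


if __name__ == "__main__":
    print(recommend_independent([0.12, 0.9]))
    print(recommend_chained([0.9, 0.2]))
```

With a plain 1-nearest-neighbour rule both functions should return the same pair, because the nearest row already carries a consistent learner and resampler. The extra input in the chained version only starts to matter for meta-learners that generalize across rows, which is where a dependency between learner and resampler could be exploited.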
id UFPE_18ab553dd5c16850bec68607a362d724
oai_identifier_str oai:repositorio.ufpe.br:123456789/58487
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str
dc.title.none.fl_str_mv Imbalanced Regression Pipeline Recommendation
title Imbalanced Regression Pipeline Recommendation
spellingShingle Imbalanced Regression Pipeline Recommendation
AVELINO, Juscimara Gomes
Regressão desbalanceada
Estratégias de reamostragem
Meta-aprendizado
title_short Imbalanced Regression Pipeline Recommendation
title_full Imbalanced Regression Pipeline Recommendation
title_fullStr Imbalanced Regression Pipeline Recommendation
title_full_unstemmed Imbalanced Regression Pipeline Recommendation
title_sort Imbalanced Regression Pipeline Recommendation
author AVELINO, Juscimara Gomes
author_facet AVELINO, Juscimara Gomes
author_role author
dc.contributor.none.fl_str_mv CAVALCANTI, George Darmiton da Cunha
CRUZ, Rafael Menelau Oliveira e
http://lattes.cnpq.br/5854014635627691
http://lattes.cnpq.br/8577312109146354
dc.contributor.author.fl_str_mv AVELINO, Juscimara Gomes
dc.subject.por.fl_str_mv Regressão desbalanceada
Estratégias de reamostragem
Meta-aprendizado
topic Regressão desbalanceada
Estratégias de reamostragem
Meta-aprendizado
description Imbalanced problems are common in various real-world scenarios and present significant challenges, especially for regression tasks due to the rarity of certain continuous target values. While these issues have been extensively explored in classification tasks, they also affect regression, complicating model performance. This work presents an extensive experimental study involving various balancing strategies and learning models, introduces a taxonomy for imbalanced regression approaches based on regression models, learning process modification, and evaluation metrics, and highlights new insights into the advantages of different strategies. From this study, it became evident that the choice of resampling method depends on the problem, learning models, and metrics, making it difficult to select an appropriate resampling strategy and learning model. As a result, it is necessary to test the majority of existing combinations. Based on these findings, this work proposes the Meta-learning for Imbalanced Regression (Meta-IR) framework to address these challenges. Meta-IR recommends optimal pipelines consisting of resampling strategies and learning models for imbalanced regression tasks. Two formulations are proposed: Independent, which separately recommends learning algorithms and resampling strategies, and Chained, which models their interdependencies sequentially. The Chained approach demonstrated superior performance, suggesting a significant relationship between learning algorithms and resampling strategies. Compared with AutoML models and baseline configurations, Meta-IR outperformed all, offering a more effective solution for imbalanced regression and indicating directions for future research.
publishDate 2024
dc.date.none.fl_str_mv 2024-11-05T15:27:15Z
2024-11-05T15:27:15Z
2024-08-20
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv AVELINO, Juscimara Gomes. Imbalanced Regression Pipeline Recommendation. 2024. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2024.
https://repositorio.ufpe.br/handle/123456789/58487
identifier_str_mv AVELINO, Juscimara Gomes. Imbalanced Regression Pipeline Recommendation. 2024. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2024.
url https://repositorio.ufpe.br/handle/123456789/58487
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/embargoedAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv embargoedAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1856041941765980160