Optimum-path forest in support of collaborative filtering

Bibliographic details
Year of defense: 2023
Main author: Martins, Guilherme Brandão
Advisor: Papa, João Paulo
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de São Carlos
Câmpus São Carlos
Graduate program: Programa de Pós-Graduação em Ciência da Computação - PPGCC
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese: Floresta de caminhos ótimos; Filtragem colaborativa; Esparsidade
Keywords in English: Optimum-path forest; Collaborative filtering; Sparsity
CNPq knowledge area: CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
Access link: https://repositorio.ufscar.br/handle/20.500.14289/19885
Abstract: Machine learning algorithms are being applied to various computational challenges, among which Recommender Systems (RS) present a range of techniques and approaches to effectively manage large volumes of data and provide personalized and relevant content to users. Such systems must be able to handle data-related issues such as sparsity, scalability, and the cold start problem, and Collaborative Filtering (CF) has traditionally been the primary strategy for addressing those challenges. One way to tackle those problems and improve recommendation results is by leveraging auxiliary information sources to compensate for the lack of CF data, such as user-item interactions. However, different interpretations of the mentioned problems should be explored. The current work contributes to the field of machine learning by proposing approaches to address the mentioned challenges. This thesis presents a collection of works developed by the author throughout the research period, which have been published or submitted up to the present, encompassing: (i) a systematic literature review that analyzes and discusses recent deep learning approaches employed for CF under sparsity-related conditions, while also identifying the challenges and limitations within the field; (ii) a Matrix Factorization (MF)-based approach that leverages CF-related sparsity for the purpose of classifier fusion; (iii) an alternative unsupervised Optimum-Path Forest (OPF) designed to perform efficiently on large-scale datasets by employing a k-approximate-nearest-neighbors graph as its adjacency relation; and (iv) an OPF clustering model built upon the shared-neighborhood concept to alleviate sparsity and high-dimensionality issues during CF-based recommendation. The experimental results achieved through such works corroborate the hypotheses of the present thesis.
id SCAR_835b262485d8fde8e602a7b47d1f75f7
oai_identifier_str oai:repositorio.ufscar.br:20.500.14289/19885
network_acronym_str SCAR
network_name_str Repositório Institucional da UFSCAR
repository_id_str
spelling Funding: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); CAPES: Funding code 001
Files: tese-guilhermebrandaomartins.pdf (ORIGINAL, doctoral thesis - Guilherme Brandão Martins, application/pdf, size 6590869); tese-guilhermebrandaomartins.pdf.txt (TEXT, extracted text, text/plain, size 100478); tese-guilhermebrandaomartins.pdf.jpg (THUMBNAIL, generated thumbnail, image/jpeg, size 3769); license_rdf (CC-LICENSE, application/rdf+xml; charset=utf-8, size 810)
Last modified: 2025-02-06 02:18:08.514 (2025-02-06T05:18:08)
OAI endpoint: https://repositorio.ufscar.br/oai/request (opendoar:4322)
dc.title.eng.fl_str_mv Optimum-path forest in support of collaborative filtering
dc.title.alternative.por.fl_str_mv Floresta de caminhos ótimos no auxílio a filtragem colaborativa
title Optimum-path forest in support of collaborative filtering
author Martins, Guilherme Brandão
author_facet Martins, Guilherme Brandão
author_role author
dc.contributor.authorlattes.por.fl_str_mv http://lattes.cnpq.br/8300636274454060
dc.contributor.authororcid.por.fl_str_mv https://orcid.org/0000-0003-2842-7850
dc.contributor.author.fl_str_mv Martins, Guilherme Brandão
dc.contributor.advisor1.fl_str_mv Papa, João Paulo
dc.contributor.advisor1Lattes.fl_str_mv http://lattes.cnpq.br/9039182932747194
contributor_str_mv Papa, João Paulo
dc.subject.por.fl_str_mv Floresta de caminhos ótimos
Filtragem colaborativa
Esparsidade
topic Floresta de caminhos ótimos
Filtragem colaborativa
Esparsidade
Optimum-path forest
Collaborative filtering
Sparsity
CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
dc.subject.eng.fl_str_mv Optimum-path forest
Collaborative filtering
Sparsity
dc.subject.cnpq.fl_str_mv CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO
description Machine learning algorithms are being applied to various computational challenges, among which Recommender Systems (RS) present a range of techniques and approaches to effectively manage large volumes of data and provide personalized and relevant content to users. Such systems must be able to handle data-related issues such as sparsity, scalability, and the cold start problem, and Collaborative Filtering (CF) has traditionally been the primary strategy for addressing those challenges. One way to tackle those problems and improve recommendation results is by leveraging auxiliary information sources to compensate for the lack of CF data, such as user-item interactions. However, different interpretations of the mentioned problems should be explored. The current work contributes to the field of machine learning by proposing approaches to address the mentioned challenges. This thesis presents a collection of works developed by the author throughout the research period, which have been published or submitted up to the present, encompassing: (i) a systematic literature review that analyzes and discusses recent deep learning approaches employed for CF under sparsity-related conditions, while also identifying the challenges and limitations within the field; (ii) a Matrix Factorization (MF)-based approach that leverages CF-related sparsity for the purpose of classifier fusion; (iii) an alternative unsupervised Optimum-Path Forest (OPF) designed to perform efficiently on large-scale datasets by employing a k-approximate-nearest-neighbors graph as its adjacency relation; and (iv) an OPF clustering model built upon the shared-neighborhood concept to alleviate sparsity and high-dimensionality issues during CF-based recommendation. The experimental results achieved through such works corroborate the hypotheses of the present thesis.
publishDate 2023
dc.date.issued.fl_str_mv 2023-12-07
dc.date.accessioned.fl_str_mv 2024-07-11T12:05:09Z
dc.date.available.fl_str_mv 2024-07-11T12:05:09Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv MARTINS, Guilherme Brandão. Optimum-path forest in support of collaborative filtering. 2023. Tese (Doutorado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2023. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/19885.
dc.identifier.uri.fl_str_mv https://repositorio.ufscar.br/handle/20.500.14289/19885
identifier_str_mv MARTINS, Guilherme Brandão. Optimum-path forest in support of collaborative filtering. 2023. Tese (Doutorado em Ciência da Computação) – Universidade Federal de São Carlos, São Carlos, 2023. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/19885.
url https://repositorio.ufscar.br/handle/20.500.14289/19885
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.publisher.program.fl_str_mv Programa de Pós-Graduação em Ciência da Computação - PPGCC
dc.publisher.initials.fl_str_mv UFSCar
publisher.none.fl_str_mv Universidade Federal de São Carlos
Câmpus São Carlos
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFSCAR
instname:Universidade Federal de São Carlos (UFSCAR)
instacron:UFSCAR
instname_str Universidade Federal de São Carlos (UFSCAR)
instacron_str UFSCAR
institution UFSCAR
reponame_str Repositório Institucional da UFSCAR
collection Repositório Institucional da UFSCAR
bitstream.url.fl_str_mv https://repositorio.ufscar.br/bitstreams/57887fc8-eaa2-480a-be2a-cb4f937c7942/download
https://repositorio.ufscar.br/bitstreams/44c2949a-81aa-4b61-93c7-4e6b063af5fc/download
https://repositorio.ufscar.br/bitstreams/006e34e9-54cf-4441-a831-87e522d20a73/download
https://repositorio.ufscar.br/bitstreams/6fd26744-3485-4f47-a653-9f1d973a2e0c/download
bitstream.checksum.fl_str_mv 7ea4dc418c570971d46db6f629036a01
f7de6ce3a71f24bbe8ea5785010603fb
0fa0dca93e25181cf32a0db2a3721193
f337d95da1fce0a22c77480e5e9a7aec
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional da UFSCAR - Universidade Federal de São Carlos (UFSCAR)
repository.mail.fl_str_mv repositorio.sibi@ufscar.br
_version_ 1851688811488083968