
Graph pattern mining: consolidating models, systems, and abstractions

Bibliographic details
Year of defense: 2023
Main author: Vinícius Vitor dos Santos Dias
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Minas Gerais
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://hdl.handle.net/1843/51806
Abstract: Graph Pattern Mining (GPM) refers to a class of problems involving the processing of subgraphs extracted from larger graphs. Applications of GPM algorithms include querying subgraphs with given properties of interest, identifying motif structures in biological networks, and characterizing social media, among others. GPM algorithms are challenging to develop because they inherently involve subroutines built on non-trivial graph theory concepts and methods such as isomorphism. General-purpose GPM systems have emerged as a solution to improve the user experience with such algorithms. However, these systems fail to provide a consistent model that is both simple to understand and able to express alternative algorithms for the same problem via different subgraph enumeration paradigms, which limits their integration with modern data analytics pipelines. Furthermore, because GPM systems are so heterogeneous in terms of supported paradigms and computing architecture, existing experimental evaluations cannot distinguish whether performance differences are best explained by algorithmic strategies or by implementation details. In this work we propose a primitive-based model for GPM, a proof-of-concept distributed implementation of that model, and an extensive experimental analysis of popular algorithmic paradigms used in GPM systems. We demonstrate empirically the effectiveness of our model by showing competitive performance against state-of-the-art systems without sacrificing the expressiveness of algorithms or the composability of operators. Our experimental results also show that no single paradigm is best for every application scenario, and we believe that our findings may guide practitioners towards more optimized GPM systems in the future.
id UFMG_149cea87d3ef258ba5ffba7192a96eaf
oai_identifier_str oai:repositorio.ufmg.br:1843/51806
network_acronym_str UFMG
network_name_str Repositório Institucional da UFMG
repository_id_str
spelling CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico; CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior; OAI endpoint: https://repositorio.ufmg.br/oai
dc.title.none.fl_str_mv Graph pattern mining: consolidating models, systems, and abstractions
title Graph pattern mining: consolidating models, systems, and abstractions
spellingShingle Graph pattern mining: consolidating models, systems, and abstractions
Vinícius Vitor dos Santos Dias
Computação – Teses
Mineração de padrões em grafos – Teses
Sistemas distribuídos – Teses
Mineração de padrões em grafos
Sistemas distribuídos
title_short Graph pattern mining: consolidating models, systems, and abstractions
title_full Graph pattern mining: consolidating models, systems, and abstractions
title_fullStr Graph pattern mining: consolidating models, systems, and abstractions
title_full_unstemmed Graph pattern mining: consolidating models, systems, and abstractions
title_sort Graph pattern mining: consolidating models, systems, and abstractions
author Vinícius Vitor dos Santos Dias
author_facet Vinícius Vitor dos Santos Dias
author_role author
dc.contributor.author.fl_str_mv Vinícius Vitor dos Santos Dias
dc.subject.por.fl_str_mv Computação – Teses
Mineração de padrões em grafos – Teses
Sistemas distribuídos – Teses
Mineração de padrões em grafos
Sistemas distribuídos
topic Computação – Teses
Mineração de padrões em grafos – Teses
Sistemas distribuídos – Teses
Mineração de padrões em grafos
Sistemas distribuídos
publishDate 2023
dc.date.none.fl_str_mv 2023-04-11T17:20:09Z
2023-04-11T17:20:09Z
2023-03-24
2025-09-08T22:53:55Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/1843/51806
url https://hdl.handle.net/1843/51806
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Minas Gerais
publisher.none.fl_str_mv Universidade Federal de Minas Gerais
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFMG
instname:Universidade Federal de Minas Gerais (UFMG)
instacron:UFMG
instname_str Universidade Federal de Minas Gerais (UFMG)
instacron_str UFMG
institution UFMG
reponame_str Repositório Institucional da UFMG
collection Repositório Institucional da UFMG
repository.name.fl_str_mv Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG)
repository.mail.fl_str_mv repositorio@ufmg.br
_version_ 1856414114296889344