New perspectives on analyzing data from biological collections based on social network analytics

Bibliographic details
Year of defense: 2018
Main author: Siracusa, Pedro Correia de
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Defense institution: Laboratório Nacional de Computação Científica
Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA)
Brasil
LNCC
Programa de Pós-Graduação em Modelagem Computacional
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese: Biodiversidade; Processamento de dados
Link de acesso: https://tede.lncc.br/handle/tede/279
Abstract: Biological collections have historically been regarded as fundamental sources of scientific information on biodiversity, supporting a wide range of scientific and management initiatives in natural resources conservation. Because they are typically composed of discrete records of specimens (most of which are derived from non-random, opportunistic sampling), biological collection datasets are commonly associated with a variety of biases, which must be characterized and mitigated before the data can be used. In this dissertation, we are particularly motivated by taxonomic and collector biases, which can be understood as the effect that the recording preferences of key collectors have on the overall taxonomic composition of the biological collections they contribute to. In this context, we propose two network models as first steps towards a network-based conceptual framework for understanding the formation of biological collections as a result of the composition of collectors' interests and activities. Both models extend the well-established framework of social network analytics, benefiting from its set of metrics and algorithms for characterizing topological features of networks. Species-Collector Networks (SCNs) model the interests of collectors in particular species, and are structured by linking each collector to every species they have recorded in biological collection datasets. From complementary perspectives, SCNs allow one to investigate which collectors share a common interest in sets of species and, conversely, which species are usually recorded by similar sets of collectors. Collector CoWorking Networks (CWNs) are a special type of collaboration network, structured from the collaboration ties formed between collectors who record specimens together in the field. A tie is created between a pair of collectors whenever both are listed as collectors in the same record.
Building upon these network models, we also present a case study in which we use them to explore the community of collectors and the taxonomic composition of the University of Brasília herbarium. We describe general topological features of the networks and point out some of the most relevant collectors in the collection, as well as their taxonomic groups of interest. We also investigate the collaborative behavior of collectors while recording specimens. Finally, we discuss future perspectives for incorporating temporal and geographical dimensions into the models, and we indicate research directions that could benefit from our approach, based on social network analytics, to modeling and analyzing biological collections.
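The two constructions described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the record layout, the collector and species names, the function names `build_scn` and `build_cwn`, and the choice to weight each edge by the number of supporting records are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical occurrence records: (collectors listed on the record, species).
records = [
    (("Silva", "Souza"), "Mimosa pudica"),
    (("Silva",), "Qualea grandiflora"),
    (("Souza", "Lima", "Silva"), "Qualea grandiflora"),
]

def build_scn(records):
    """Species-Collector Network: bipartite edges (collector, species).

    Assumption: the edge weight counts the records linking the pair.
    """
    scn = defaultdict(int)
    for collectors, species in records:
        for collector in set(collectors):
            scn[(collector, species)] += 1
    return dict(scn)

def build_cwn(records):
    """Collector CoWorking Network: one tie per pair of collectors
    listed together on the same record.

    Assumption: the edge weight counts the shared records.
    """
    cwn = defaultdict(int)
    for collectors, _species in records:
        # Sorting gives each undirected pair a single canonical key.
        for a, b in combinations(sorted(set(collectors)), 2):
            cwn[(a, b)] += 1
    return dict(cwn)

scn = build_scn(records)
cwn = build_cwn(records)
```

Plain dictionaries keyed by node pairs keep the sketch free of dependencies; the same edge lists could be loaded into a graph library to compute the topological metrics the abstract mentions.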
id LNCC_442d2cb46edeab32780fab9c8516ea60
oai_identifier_str oai:tede-server.lncc.br:tede/279
network_acronym_str LNCC
network_name_str Biblioteca Digital de Teses e Dissertações do LNCC
repository_id_str
spelling New perspectives on analyzing data from biological collections based on social network analytics; Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq); 2023-06-02T12:30:50Z; Biblioteca Digital de Teses e Dissertações; https://tede.lncc.br/; PUB; https://tede.lncc.br/oai/request; opendoar:2023-06-02T12:30:50
dc.title.none.fl_str_mv New perspectives on analyzing data from biological collections based on social network analytics
author Siracusa, Pedro Correia de
author_facet Siracusa, Pedro Correia de
author_role author
dc.contributor.none.fl_str_mv Ziviani, Artur
Ziviani, Artur
Gadelha Junior, Luiz Manoel Rocha
http://lattes.cnpq.br/9851093795076823
Ziviani, Artur
Porto, Fábio André Machado
http://lattes.cnpq.br/6418711808050575
Saraiva, Antonio Mauro
Dalcin, Eduardo Couto
dc.contributor.author.fl_str_mv Siracusa, Pedro Correia de
dc.subject.por.fl_str_mv Biodiversidade
Processamento de dados
biodiversity
Data processing
CNPQ::CIENCIAS BIOLOGICAS::ECOLOGIA
CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO::SISTEMAS DE INFORMACAO
publishDate 2018
dc.date.none.fl_str_mv 2018-08-07T14:56:56Z
2018-06-26
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv SIRACUSA, P. C. New perspectives on analyzing data from biological collections based on social network analytics, 2018, 112 leaves, Dissertation (Master's), Programa de Pós-Graduação em Modelagem Computacional, Laboratório Nacional de Computação Científica, Petrópolis.
https://tede.lncc.br/handle/tede/279
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Laboratório Nacional de Computação Científica
Coordenação de Pós-Graduação e Aperfeiçoamento (COPGA)
Brasil
LNCC
Programa de Pós-Graduação em Modelagem Computacional
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações do LNCC
instname:Laboratório Nacional de Computação Científica (LNCC)
instacron:LNCC
instname_str Laboratório Nacional de Computação Científica (LNCC)
instacron_str LNCC
institution LNCC
reponame_str Biblioteca Digital de Teses e Dissertações do LNCC
collection Biblioteca Digital de Teses e Dissertações do LNCC
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações do LNCC - Laboratório Nacional de Computação Científica (LNCC)
repository.mail.fl_str_mv library@lncc.br
_version_ 1832738027710971904