Population genomics of viruses and its analytical tools

Bibliographic details
Year of defense: 2020
Main author: Santos, Matheus de Morais
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Uberlândia
Brazil
Programa de Pós-graduação em Agronomia
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufu.br/handle/123456789/30358
http://doi.org/10.14393/ufu.di.2020.228
Abstract: Comparative genomics makes it possible to study virus populations with a high degree of accuracy by using the largest possible number of molecular markers. Knowledge of the evolutionary processes that affect these populations is of ecological and epidemiological importance, since they show high degrees of genetic diversity. Begomoviruses (genus Begomovirus, family Geminiviridae) have genomes composed of one single-stranded DNA molecule (monopartite begomoviruses) or two (bipartite begomoviruses, with DNA-A and DNA-B components), infect dicotyledonous plants, and evolve as quickly as viruses with RNA genomes. The genetic structure of the global metapopulation of begomoviruses was determined in a previous study based on DNA-A sequences and was found to consist of large, genetically distinct and cohesive subpopulations, due to geographical barriers, host range and genetic barriers to recombination. However, the structure of the global metapopulation based on DNA-B sequences has not been determined. There are enough sequences in public databases to carry out such a study. Nevertheless, comparing three or more biological sequences requires the construction of multiple sequence alignments (MSAs). All computer programs used for MSA construction rely on heuristic strategies and therefore differ in their degrees of accuracy. Accuracy is a critical criterion for choosing an MSA program. Notably, few studies have assessed the accuracy of these programs on data sets composed of viral genomes. Thus, this work (i) determined the genetic structure of the global metapopulation of begomoviruses based on DNA-B sequences, comparing the recombination patterns of its subpopulations; and (ii) evaluated the accuracy of the main MSA programs on viral genomes and the practical implications for genomics studies.
To achieve the first objective, full-length DNA-B sequences were obtained from GenBank, subdivided into eight subpopulations by discriminant analysis of principal components, and analyzed with seven recombination detection methods. The inferred subpopulations were genetically distinct and showed two different recombination patterns. The second objective was achieved by estimating the accuracy of 13 MSA programs/settings on six data sets composed of full-length genomes of species belonging to five virus genera and one genus of a subviral agent. The accuracy of the programs varied with the data set. Notably, MSAs generated by the most accurate programs (as determined in this study) and those generated by programs widely used in genomic studies in virology yielded incongruent phylogenetic trees, suggesting that the evolutionary histories frequently presented in the literature may not be the most likely ones.
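The abstract does not say which accuracy measure was used to compare the MSA programs. As an illustration only, the sketch below computes a sum-of-pairs score, one common way to compare a test alignment against a trusted reference alignment of the same sequences. The function names (`aligned_pairs`, `sum_of_pairs_score`) and the toy sequences are hypothetical and are not taken from the dissertation.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs placed in the same column.

    `alignment` is a list of equal-length gapped strings ('-' marks a gap).
    Each pair is ((seq_i, residue_index_i), (seq_j, residue_index_j)).
    """
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all rows of an alignment must have the same length")
    counters = [0] * len(alignment)  # next ungapped residue index per sequence
    pairs = set()
    for col in range(len(alignment[0]) if alignment else 0):
        present = []
        for seq_id, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq_id, counters[seq_id]))
                counters[seq_id] += 1
        # Every two residues sharing this column count as one aligned pair.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_score(test, reference):
    """Fraction of reference residue pairs that the test alignment recovers."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        raise ValueError("reference alignment contains no aligned residue pairs")
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy data (assumed, for illustration): the same three sequences aligned two ways.
reference = ["ACG-T", "AC-GT", "ACGGT"]
test = ["ACG-T", "ACG-T", "ACGGT"]
print(sum_of_pairs_score(test, reference))
```

A score of 1.0 means the test alignment reproduces every residue pairing of the reference. Lower values mean some pairings were lost, which is the kind of difference that can lead to different downstream trees.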
id UFU_84a7e17940b0ce69d0c8311e4b2d5fdf
oai_identifier_str oai:repositorio.ufu.br:123456789/30358
network_acronym_str UFU
network_name_str Repositório Institucional da UFU
repository_id_str
spelling Population genomics of viruses and its analytical tools | Genômica de populações de vírus e suas ferramentas analíticas | CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior | Dissertation (Master's) | 2022-02-21 | 2022-10-17T19:08:34Z | http://repositorio.ufu.br/oai/request
dc.title.none.fl_str_mv Population genomics of viruses and its analytical tools
Genômica de populações de vírus e suas ferramentas analíticas
title Population genomics of viruses and its analytical tools
author Santos, Matheus de Morais
author_facet Santos, Matheus de Morais
author_role author
dc.contributor.none.fl_str_mv Lima, Alison Talis Martins
http://buscatextual.cnpq.br/buscatextual/visualizacv.do?id=K4732506T1
Coelho, Lísias
Sassaki, Flávio Tetsuo
Nagata, Alice Kazuko Inoue
dc.contributor.author.fl_str_mv Santos, Matheus de Morais
dc.subject.por.fl_str_mv Molecular evolution
Begomoviruses
Bioinformatics
CNPQ::CIENCIAS AGRARIAS::AGRONOMIA
Genomics
Viruses
publishDate 2020
dc.date.none.fl_str_mv 2020-11-12T17:18:05Z
2020-11-12T17:18:05Z
2020-02-21
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv SANTOS, Matheus de Morais. Population genomics of viruses and its analytical tools. 2020. 217 leaves. Dissertation (Master's in Agronomy) – Universidade Federal de Uberlândia, Uberlândia. DOI http://doi.org/10.14393/ufu.di.2020.228.
https://repositorio.ufu.br/handle/123456789/30358
http://doi.org/10.14393/ufu.di.2020.228
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Uberlândia
Brazil
Programa de Pós-graduação em Agronomia
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFU
instname:Universidade Federal de Uberlândia (UFU)
instacron:UFU
instname_str Universidade Federal de Uberlândia (UFU)
instacron_str UFU
institution UFU
reponame_str Repositório Institucional da UFU
collection Repositório Institucional da UFU
repository.name.fl_str_mv Repositório Institucional da UFU - Universidade Federal de Uberlândia (UFU)
repository.mail.fl_str_mv diinf@dirbi.ufu.br
_version_ 1827843511727161344