Development and application of statistical genetic methods to genomic prediction in Coffea canephora

Bibliographic details
Year of defense: 2017
Main author: Luís Felipe Ventorim Ferrão
Advisor: Antonio Augusto Franco Garcia
Defense committee: Luciana Lasry Benchimol, Carlos Augusto Colombo, Gabriel Rodrigues Alves Margarido, Luiz Filipe Protasio Pereira
Document type: Thesis
Access type: Open access
Language: eng
Degree-granting institution: Universidade de São Paulo
Graduate program: Agronomia (Genética e Melhoramento de Plantas)
Department: Not provided by the institution
Country: BR
Access link: https://doi.org/10.11606/T.11.2017.tde-17082017-143756
Abstract: Genomic selection (GS) works by simultaneously selecting hundreds or thousands of markers covering the genome, so that the majority of quantitative trait loci (QTLs) are in linkage disequilibrium (LD) with those markers. Thus, markers associated with QTLs, regardless of the significance of their effects, are used to explain the genetic variation of a trait. Simulation and empirical results have shown that genomic prediction is accurate enough to support success in breeding programs when compared with traditional phenotypic analysis. To this end, an important step is the use of statistical genetic models able to predict phenotypic performance for important traits. Although some crops have benefited from this approach, studies in the genus Coffea are still in their infancy. Until now, there have been no studies of how predictive models perform across populations and environments, or even of their performance for different complex traits. Therefore, the main objective of this research is to investigate important aspects of statistical modeling in order to reach a more comprehensive understanding of what makes a prediction model robust and, as a consequence, to apply it in practical breeding programs. Real data from two experimental populations of Coffea canephora, evaluated at two Brazilian locations, and SNPs identified by Genotyping-by-Sequencing (GBS) were used to investigate the genotype-phenotype relationship. In terms of statistical modeling, two classes of models were considered: i) mixed models, which use a genomic relationship matrix to define the (co)variance between relatives (the GBLUP model); and ii) multilocus association models, in which thousands of markers are modeled simultaneously and the marker effects are summed to compute the genetic merit of individuals. The two approaches were addressed in separate chapters.
The chapter entitled "A mixed model to multiplicative harvest-location trial applied to genomic prediction in Coffea canephora" addressed an expansion of the traditional GBLUP to accommodate interaction effects (Genotype × Location and Genotype × Harvest). To this end, we tested appropriate (co)variance structures for modeling the heterogeneity and correlation of genetic and residual effects. The proposed model, called MET.GBLUP, showed the best goodness of fit and higher predictive ability when compared with other methods. The following chapter, entitled "Comparison of statistical methods and reliability of genomic prediction in Coffea canephora population", addressed the use of different modeling assumptions in multilocus association models. The usual assumption that marker effects are drawn from a normal distribution was relaxed in order to look for a possible dependency between predictive performance and trait, conditional on the genetic architecture. Although the competing models are conceptually different, only a minimal difference in predictive accuracy was observed in the comparative analysis. In terms of computational demand, Bayesian models required longer analysis times. The results discussed in both chapters support the potential of genomic selection to reshape traditional breeding programs. In practice, compared with traditional phenotypic evaluation, it is expected to accelerate the breeding cycle in recurrent selection programs, maintain genetic diversity and increase the genetic gain per unit of time.
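The two model classes named in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the data are simulated, the genomic relationship matrix follows the commonly used VanRaden scaling, the variance ratio `lam` is fixed by hand instead of being estimated, and the mean is removed by simple centering. All function and variable names are hypothetical. Under these assumptions, with normally distributed marker effects, GBLUP and the sum of marker effects give the same genetic merit.

```python
import numpy as np


def genomic_relationship(M):
    """Genomic relationship matrix from an (individuals x markers) 0/1/2 matrix.

    Assumption: VanRaden-style scaling, with allele frequencies estimated
    from the sample itself.
    """
    p = M.mean(axis=0) / 2.0           # estimated allele frequency per marker
    Z = M - 2.0 * p                    # centered marker matrix
    k = 2.0 * np.sum(p * (1.0 - p))    # scaling constant
    return Z @ Z.T / k, Z, k


def gblup(G, y, lam):
    """Genetic merit from the relationship matrix.

    lam is the residual-to-genetic variance ratio (assumed known here).
    """
    n = len(y)
    return G @ np.linalg.solve(G + lam * np.eye(n), y - y.mean())


def marker_effects(Z, k, y, lam):
    """Ridge-type marker effects (normal prior on every marker effect)."""
    n = len(y)
    return Z.T @ np.linalg.solve(Z @ Z.T + lam * k * np.eye(n), y - y.mean())


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
n, m = 50, 200
freqs = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freqs, size=(n, m)).astype(float)
y = M @ rng.normal(0.0, 0.1, size=m) + rng.normal(0.0, 1.0, size=n)

G, Z, k = genomic_relationship(M)
lam = 1.0                                        # hypothetical variance ratio
g_gblup = gblup(G, y, lam)                       # class i: relationship-based
g_markers = Z @ marker_effects(Z, k, y, lam)     # class ii: summed marker effects

print(np.allclose(g_gblup, g_markers))
```

The models compared in the second chapter relax the normal assumption on marker effects, which breaks this exact equivalence; the sketch covers only the normal case.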
id USP_5a0949e02479a6e77d7b2e790418197f
oai_identifier_str oai:teses.usp.br:tde-17082017-143756
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
Keywords: Coffee; Genomic selection; Linear models; Molecular markers
dc.title.en.fl_str_mv Development and application of statistical genetic methods to genomic prediction in Coffea canephora
dc.title.alternative.pt.fl_str_mv Desenvolvimento e aplicação de métodos genético-estatísticos para predição genômica em Coffea canephora
dc.contributor.advisor1.fl_str_mv Antonio Augusto Franco Garcia
dc.contributor.referee1.fl_str_mv Luciana Lasry Benchimol
dc.contributor.referee2.fl_str_mv Carlos Augusto Colombo
dc.contributor.referee3.fl_str_mv Gabriel Rodrigues Alves Margarido
dc.contributor.referee4.fl_str_mv Luiz Filipe Protasio Pereira
dc.contributor.author.fl_str_mv Luís Felipe Ventorim Ferrão
publishDate 2017
dc.date.issued.fl_str_mv 2017-04-07
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://doi.org/10.11606/T.11.2017.tde-17082017-143756
url https://doi.org/10.11606/T.11.2017.tde-17082017-143756
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade de São Paulo
dc.publisher.program.fl_str_mv Agronomia (Genética e Melhoramento de Plantas)
dc.publisher.initials.fl_str_mv USP
dc.publisher.country.fl_str_mv BR
publisher.none.fl_str_mv Universidade de São Paulo
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1786376902790873088