Detection and functional assessment of structural variants using whole genome re-sequencing data in Nellore cattle

Bibliographic details
Year of defense: 2023
Main author: Marin-Gazon, Natália Andrea [UNESP]
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Degree-granting institution: Universidade Estadual Paulista (Unesp)
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
DNA
Access link: https://hdl.handle.net/11449/252970
Abstract: Ongoing advances in genome sequencing technologies have enabled the discovery of many structural variants (SVs) in livestock genomes. SVs associated with complex traits are promising targets for animal breeding because of their effects on gene expression. The aims of this study were: (i) to detect structural variants in whole-genome re-sequencing data from 151 representative Nellore bulls by combining calling algorithms; (ii) to discover non-redundant and highly frequent structural variant regions (SVRs) in the analyzed bulls; (iii) to search for positional candidate genes and quantitative trait loci (QTL) overlapping the most frequent SVRs in the population; and (iv) to assess the functional impact of positional candidate genes overlapping SVRs through enriched gene ontology (GO) terms related to biological process (BP), molecular function (MF), and cellular component (CC), and through biochemical pathways. Whole-genome re-sequencing (WGS) data from 151 representative Nellore bulls were used for genome-wide structural variant calling and for detection of SVRs common to the analyzed bulls. The gene content and nearby QTLs of the SVRs were retrieved from publicly available genomic databases, and functional enrichment analysis of positional candidate genes overlapping the most frequent SVRs was conducted using PANTHER. A total of 215,031 high-confidence SVs were obtained, most of them copy number variants (CNVs; 183,032 deletions, DEL, and 14,013 duplications, DUP), along with 17,986 inversions (INV). Total structural variation covers, on average, 4.81% of the individual autosomal genome length. A total of 3,752 non-redundant SVRs present in more than 5% of bulls were obtained; more than 97% were copy number variant regions (CNVRs) and 3% were inversion regions (INVRs). Together, the SVRs cover 13.13% of the total autosomal genome length (11.4% in CNVRs and 1.7% in INVRs).
Among all SVRs, 532 were shared by more than 50% of bulls and overlapped a total of 130 QTL of six types (exterior, health, meat and carcass, milk, production, and reproduction), which relate to a total of 50 economically important traits. Most SVRs overlapped QTLs related to residual feed intake, structural soundness, multiple birth, clinical mastitis, and milk energy yield. Regarding gene content, 204 SVRs overlapped a total of 1,164 positional candidate genes, which were significantly overrepresented in GO terms related to BP, MF, and CC and in one biochemical pathway. Among the significantly enriched genes, we highlight members of the olfactory receptor (OR) gene family, which play essential roles in mechanisms of adaptation to the environment. These genes were found mainly in regions of inversions and mixed events. We similarly highlight genes from the defensin family (DEFB), which play important roles in the innate immune system of multicellular organisms and are known to have arisen through duplication events in mammalian genomes. Other important genes found in this study include members of the secretory phospholipase A2 family, adhesion G protein-coupled receptors, and zinc finger binding proteins. Most of the genes found in this study have been described as potential candidates for feed efficiency indicator traits, which reflects the biochemical mechanisms in which they are involved and which have led to improved fitness. The results of this study provide important knowledge about the mechanisms driving genomic changes in Nellore cattle that contributed to adaptation to the environment.
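The abstract describes collapsing per-animal SV calls into non-redundant regions and keeping those present in more than 5% of bulls, but it does not state the overlap criterion or the tools used. The following is a minimal sketch of one simple way such a step could be done, assuming that any positional overlap on the same chromosome merges two calls and that coordinates are half-open. The function name `build_svrs`, the tuple layout, and the sample identifiers are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def build_svrs(calls, n_samples, min_freq=0.05):
    """Merge overlapping SV calls into regions and filter by carrier frequency.

    calls: iterable of (sample_id, chrom, start, end), half-open coordinates
           (hypothetical layout, assumed for this sketch).
    Returns a list of (chrom, start, end, frequency) for regions whose
    carrier frequency is strictly greater than min_freq.
    """
    by_chrom = defaultdict(list)
    for sample, chrom, start, end in calls:
        by_chrom[chrom].append((start, end, sample))

    regions = []
    for chrom in sorted(by_chrom):
        cur_start = cur_end = None
        carriers = set()
        for start, end, sample in sorted(by_chrom[chrom]):
            if cur_start is None or start >= cur_end:
                # No overlap with the open region: close it and start a new one.
                if cur_start is not None:
                    regions.append((chrom, cur_start, cur_end, carriers))
                cur_start, cur_end, carriers = start, end, {sample}
            else:
                # Overlap: extend the region and record the carrier.
                cur_end = max(cur_end, end)
                carriers.add(sample)
        if cur_start is not None:
            regions.append((chrom, cur_start, cur_end, carriers))

    return [
        (chrom, start, end, len(carriers) / n_samples)
        for chrom, start, end, carriers in regions
        if len(carriers) / n_samples > min_freq
    ]


# Toy usage with made-up calls from four animals.
toy_calls = [
    ("bull1", "chr1", 100, 200),
    ("bull2", "chr1", 150, 300),
    ("bull3", "chr1", 1000, 1100),
    ("bull1", "chr2", 10, 20),
]
frequent = build_svrs(toy_calls, n_samples=4, min_freq=0.3)
```

In the toy data only the merged region chr1:100-300 is carried by two of four animals, so it is the only region that passes the 0.3 threshold. Real pipelines often require reciprocal overlap and handle deletions, duplications, and inversions separately; this sketch does neither.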
id UNSP_c67db909d6a1b89322abead795ecb95e
oai_identifier_str oai:repositorio.unesp.br:11449/252970
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str
spelling Funding agencies: Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Grant numbers: FAPESP #2009/16118-5; FAPESP #2017/10630-2; FAPESP #2018/20026-8
Portuguese abstract (translated; sentences that differ from the English abstract): These genes were found mainly within regions with inversion and mixed events, which may explain their prevalence in genomes throughout mammalian evolution. The results of this study provide important knowledge about the mechanisms driving changes in the genome of Nellore cattle, the causes of adaptation processes, and the characterization of the consequences of structural variation for genetic diversity.
OAI request URL: http://repositorio.unesp.br/oai/request
OpenDOAR identifier: opendoar:2946
Record timestamp: 2025-10-22T09:14:49Z
dc.title.none.fl_str_mv Detection and functional assessment of structural variants using whole genome re-sequencing data in Nellore cattle
Detecção e avaliação funcional de variantes estruturais usando dados de re-sequenciamento de genoma completo em gado Nelore.
dc.contributor.none.fl_str_mv Albuquerque, Lucia Galvão de
Vargas, Giovana
dc.contributor.author.fl_str_mv Marin-Gazon, Natália Andrea [UNESP]
dc.subject.por.fl_str_mv DNA
Copy number variation
mobile elements
next generation sequencing
sequence gains and losses
Zebus
publishDate 2023
dc.date.none.fl_str_mv 2023-10-31
2024-01-23T16:35:40Z
2024-01-23T16:35:40Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv MARIN-GAZON, N. A. Detection and functional assessment of structural variants using whole genome re-sequencing data in Nellore cattle. 2023 - 110f - Tese (Doutorado em Genética e Melhoramento Animal). Universidade Estadual Paulista, Jaboticabal. 2024.
https://hdl.handle.net/11449/252970
33004102002P0
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
application/pdf
dc.publisher.none.fl_str_mv Universidade Estadual Paulista (Unesp)
dc.source.none.fl_str_mv reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv repositoriounesp@unesp.br
_version_ 1854954827933548544