Function prediction of transcription start site associated RNAs (TSSaRNAs) in Halobacterium salinarum NRC-1

Bibliographic details
Year of defense: 2019
Main author: Yagoub Ali Ibrahim Adam
Advisor: Ricardo Zorzetto Nicoliello Vencio
Examination committee: Silvana Giuliatti, Helder Takashi Imoto Nakaya, Alexandre Rossi Paschoal, Wilson Araújo da Silva Junior
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade de São Paulo
Graduate program: Bioinformática
Department: Not provided by the institution
Country: BR
Access link: https://doi.org/10.11606/T.95.2019.tde-02042019-201857
Abstract: Transcription start site associated non-coding RNAs (TSSaRNAs) have been predicted across the three domains of life. However, there have been no reliable annotation efforts to identify their biological functions or their underlying molecular machinery. This project therefore asks what potential functions TSSaRNAs have in the cell. To answer this question, we aimed to accurately identify TSSaRNAs in the model organism Halobacterium salinarum NRC-1 (an archaeal microorganism) grown under standard conditions. We then investigated the structural stability of TSSaRNAs in terms of thermodynamic energies and attempted to annotate them functionally using the Rfam classification of non-coding RNAs. Using a statistical approach, we developed an algorithm to predict TSSaRNAs from next-generation RNA sequencing (RNA-Seq) data. For structural annotation, we modeled secondary structures by minimizing the thermodynamic free energy; the minimum-free-energy structures are assumed to be the biophysically stable ones. We simulated tertiary structures under the constraints of the secondary structures using the Rosetta-Common RNA tool. To investigate higher-order structures, we studied hybridization between TSSaRNAs and their cognate genes as part of an RNA-based regulation system. In addition, based on our hypothesis that TSSaRNAs may bind proteins to trigger their function, we investigated the interaction between TSSaRNAs and the Lsm protein, which is known as a chaperone that mediates RNA function and is involved in RNA processing.
Our functional annotation pipeline aimed to classify TSSaRNAs into their corresponding Rfam families in one of two ways: by querying TSSaRNA sequences against the covariance models of Rfam families, or by querying Rfam sequences against covariance models built from the consensus secondary structures of TSSaRNAs. The prediction algorithm identified 224 TSSaRNAs expressed on the same strand as the mRNAs and 58 TSSaRNAs expressed antisense to the mRNAs. The identified TSSaRNAs had a median length of 25 nucleotides. Regarding structural annotation, most TSSaRNAs possessed thermodynamically stable secondary structures, and their tertiary structures were capable of forming more complex structures by binding other biomolecules. Regarding higher-order structures, most TSSaRNAs (92.2%) were capable of hybridizing to their cognate genes, and 55 TSSaRNAs showed putative interactions with the Lsm protein. Furthermore, the computational docking experiments indicated that the TSSaRNA-Lsm complexes had favorable binding energies, with a median of -542900 kcal mol⁻¹. Regarding functional annotation, 42.05% of TSSaRNAs were classified as potential cis-acting regulators, such as cis-regulatory elements and sRNAs, but there were also potential trans-acting regulators of distant molecules, such as CRISPR and antisense RNAs. Moreover, the results indicated that TSSaRNAs could carry out more complex functions, such as a catalytic function (for example, riboswitch) or a role in defense against viruses (for example, CRISPR). In conclusion, the results of this study suggest that TSSaRNAs have several potential functions, opening the way for experimental validation.
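The abstract describes modeling secondary structures by minimizing thermodynamic free energy. As a rough illustration of the underlying dynamic-programming idea only, the sketch below uses the much simpler Nussinov scheme, which maximizes the number of nested base pairs instead of minimizing a nearest-neighbor free energy. It is not the method used in the thesis; the function name `max_base_pairs`, the `min_loop` parameter, and the example sequence are assumptions made for this example.

```python
# Simplified stand-in for free-energy minimization: Nussinov base-pair
# maximization. All names here are hypothetical, not from the thesis.

# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Return the maximum number of nested base pairs in an RNA sequence.

    min_loop is the minimum number of unpaired bases enclosed by any pair
    (hairpin loops shorter than this are not allowed).
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j] (inclusive).
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases inside
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    inner = dp[k + 1][j - 1]
                    best = max(best, left + 1 + inner)
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    # Toy 10-nt hairpin, not a real TSSaRNA sequence.
    print(max_base_pairs("GGGAAAUCCC"))
```

A thermodynamic folding tool replaces the "+1 per pair" score with experimentally derived stacking and loop energies, so its predicted structures can differ from the pair-count optimum shown here.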
id USP_1024abc24687935a2aa7790b0ba7f4d8
oai_identifier_str oai:teses.usp.br:tde-02042019-201857
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
keywords Functional annotation; Halobacterium salinarum NRC-1; Higher-order ncRNAs structures; ncRNAs functions; ncRNAs prediction; Non-coding RNA; Rfam; RNA based regulation; RNA docking; RNA interactions; RNA structures; Structural annotation; TSSaRNAs; TSSaRNAs-LSm interactions
dc.title.en.fl_str_mv Function prediction of transcription start site associated RNAs (TSSaRNAs) in Halobacterium salinarum NRC-1
dc.title.alternative.pt.fl_str_mv Predição de função para TSSaRNAs (transcritos associados a sitios de início de transcrição) em Halobacterium salinarum NRC-1
author Yagoub Ali Ibrahim Adam
author_facet Yagoub Ali Ibrahim Adam
author_role author
dc.contributor.advisor1.fl_str_mv Ricardo Zorzetto Nicoliello Vencio
dc.contributor.referee1.fl_str_mv Silvana Giuliatti
dc.contributor.referee2.fl_str_mv Helder Takashi Imoto Nakaya
dc.contributor.referee3.fl_str_mv Alexandre Rossi Paschoal
dc.contributor.referee4.fl_str_mv Wilson Araújo da Silva Junior
dc.contributor.author.fl_str_mv Yagoub Ali Ibrahim Adam
contributor_str_mv Ricardo Zorzetto Nicoliello Vencio
Silvana Giuliatti
Helder Takashi Imoto Nakaya
Alexandre Rossi Paschoal
Wilson Araújo da Silva Junior
publishDate 2019
dc.date.issued.fl_str_mv 2019-02-07
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://doi.org/10.11606/T.95.2019.tde-02042019-201857
url https://doi.org/10.11606/T.95.2019.tde-02042019-201857
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.publisher.none.fl_str_mv Universidade de São Paulo
dc.publisher.program.fl_str_mv Bioinformática
dc.publisher.initials.fl_str_mv USP
dc.publisher.country.fl_str_mv BR
publisher.none.fl_str_mv Universidade de São Paulo
dc.source.none.fl_str_mv reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br || atendimento@aguia.usp.br
_version_ 1786376493105938432