Large language models as tools for Bayesian causal discovery
| Year of defense: | 2026 |
|---|---|
| Main author: | |
| Advisor: | |
| Examination committee: | |
| Document type: | Master's thesis |
| Access type: | Open access |
| Language: | eng |
| Defending institution: |
Biblioteca Digital de Teses e Dissertações da USP
|
| Graduate program: |
Not provided by the institution
|
| Department: |
Not provided by the institution
|
| Country: |
Not provided by the institution
|
| Keywords in Portuguese: | |
| Access link: | https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/ |
| Abstract: | Causal discovery (CD) is the task of automatically inferring causal structures, typically from observational data. Recently, there has been much interest in using domain knowledge from large language models (LLMs) in causal discovery. However, existing LLM-based approaches output only a single directed acyclic graph (DAG) with no measure of uncertainty, which can be unreliable. This dissertation therefore explores Bayesian structure learning (BSL) algorithms, which produce distributions over graphs as natural representations of uncertainty, and uses LLMs to enhance their inference power. The dissertation has three main goals. The first is to discuss and review the basic algorithms of CD as well as LLM-based approaches; a short section also explores different methods of assessing the reliability of an LLM's output. The second is to investigate the use of LLMs alongside BSL methods for causal discovery. In particular, we propose to use the domain knowledge of LLMs as the prior distribution over graphs. Our experiments show that LLM-informed priors can improve the performance of BSL methods. Lastly, the text describes another approach, which focuses on ancestral graphs (AGs) and therefore does not require the causal sufficiency assumption. This approach uses the LLM as an expert-in-the-loop, combining information from the data with outside knowledge during model processing. Our experiments show that the method is competitive with others that address latent confounding on both synthetic and real-world datasets, and that our design for incorporating feedback from a (simulated) human expert or an LLM improves inference quality. |
| id |
USP_4a88e0863134e592a19fb89f53e3aa74 |
|---|---|
| oai_identifier_str |
oai:teses.usp.br:tde-18032026-165543 |
| network_acronym_str |
USP |
| network_name_str |
Biblioteca Digital de Teses e Dissertações da USP |
| repository_id_str |
|
| spelling |
Large language models as tools for Bayesian causal discovery; Grandes modelos de linguagem como ferramentas para descoberta causal; Bayesian causal discovery; Causal discovery; Causalidade; Causality; Descoberta causal; Descoberta causal Bayesiana; Grandes modelos de linguagem; Large language models; Biblioteca Digital de Teses e Dissertações da USP; Mauá, Denis Deratani; Silva, Flavio Soares Correa da; Wang, Benjie; Videira, Bruna Bazaluk Machado; 2026-02-23; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/; reponame:Biblioteca Digital de Teses e Dissertações da USP; instname:Universidade de São Paulo (USP); instacron:USP; Release the content for public access.; info:eu-repo/semantics/openAccess; eng; 2026-03-20T09:01:02Z; oai:teses.usp.br:tde-18032026-165543; Biblioteca Digital de Teses e Dissertações; http://www.teses.usp.br/; PUB; http://www.teses.usp.br/cgi-bin/mtd2br.pl; virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br; opendoar:2721; 2026-03-20T09:01:02; Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP); false |
| dc.title.none.fl_str_mv |
Large language models as tools for Bayesian causal discovery; Grandes modelos de linguagem como ferramentas para descoberta causal |
| title |
Large language models as tools for Bayesian causal discovery |
| spellingShingle |
Large language models as tools for Bayesian causal discovery; Videira, Bruna Bazaluk Machado; Bayesian causal discovery; Causal discovery; Causalidade; Causality; Descoberta causal; Descoberta causal Bayesiana; Grandes modelos de linguagem; Large language models |
| title_short |
Large language models as tools for Bayesian causal discovery |
| title_full |
Large language models as tools for Bayesian causal discovery |
| title_fullStr |
Large language models as tools for Bayesian causal discovery |
| title_full_unstemmed |
Large language models as tools for Bayesian causal discovery |
| title_sort |
Large language models as tools for Bayesian causal discovery |
| author |
Videira, Bruna Bazaluk Machado |
| author_facet |
Videira, Bruna Bazaluk Machado |
| author_role |
author |
| dc.contributor.none.fl_str_mv |
Mauá, Denis Deratani; Silva, Flavio Soares Correa da; Wang, Benjie |
| dc.contributor.author.fl_str_mv |
Videira, Bruna Bazaluk Machado |
| dc.subject.por.fl_str_mv |
Bayesian causal discovery; Causal discovery; Causalidade; Causality; Descoberta causal; Descoberta causal Bayesiana; Grandes modelos de linguagem; Large language models |
| topic |
Bayesian causal discovery; Causal discovery; Causalidade; Causality; Descoberta causal; Descoberta causal Bayesiana; Grandes modelos de linguagem; Large language models |
| description |
Causal discovery (CD) is the task of automatically inferring causal structures, typically from observational data. Recently, there has been much interest in using domain knowledge from large language models (LLMs) in causal discovery. However, existing LLM-based approaches output only a single directed acyclic graph (DAG) with no measure of uncertainty, which can be unreliable. This dissertation therefore explores Bayesian structure learning (BSL) algorithms, which produce distributions over graphs as natural representations of uncertainty, and uses LLMs to enhance their inference power. The dissertation has three main goals. The first is to discuss and review the basic algorithms of CD as well as LLM-based approaches; a short section also explores different methods of assessing the reliability of an LLM's output. The second is to investigate the use of LLMs alongside BSL methods for causal discovery. In particular, we propose to use the domain knowledge of LLMs as the prior distribution over graphs. Our experiments show that LLM-informed priors can improve the performance of BSL methods. Lastly, the text describes another approach, which focuses on ancestral graphs (AGs) and therefore does not require the causal sufficiency assumption. This approach uses the LLM as an expert-in-the-loop, combining information from the data with outside knowledge during model processing. Our experiments show that the method is competitive with others that address latent confounding on both synthetic and real-world datasets, and that our design for incorporating feedback from a (simulated) human expert or an LLM improves inference quality. |
| publishDate |
2026 |
| dc.date.none.fl_str_mv |
2026-02-23 |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/ |
| url |
https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/ |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
|
| dc.rights.driver.fl_str_mv |
Release the content for public access.; info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv |
Release the content for public access. |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.coverage.none.fl_str_mv |
|
| dc.publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| publisher.none.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP |
| dc.source.none.fl_str_mv |
reponame:Biblioteca Digital de Teses e Dissertações da USP; instname:Universidade de São Paulo (USP); instacron:USP |
| instname_str |
Universidade de São Paulo (USP) |
| instacron_str |
USP |
| institution |
USP |
| reponame_str |
Biblioteca Digital de Teses e Dissertações da USP |
| collection |
Biblioteca Digital de Teses e Dissertações da USP |
| repository.name.fl_str_mv |
Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
| repository.mail.fl_str_mv |
virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br |
| _version_ |
1865492444051668992 |