Large language models as tools for Bayesian causal discovery

Bibliographic details
Year of defense: 2026
Main author: Videira, Bruna Bazaluk Machado
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: eng
Defending institution: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/
Abstract: Causal discovery (CD) is the task of automatically inferring causal structures, typically from observational data. Recently, there has been much interest in using domain knowledge from large language models (LLMs) in causal discovery. However, existing LLM-based approaches output only a single directed acyclic graph (DAG) with no measure of uncertainty, which can make the result unreliable. This dissertation therefore explores Bayesian structure learning (BSL) algorithms, which produce distributions over graphs, a natural representation of uncertainty, and uses LLMs to enhance their inference power. The dissertation has three main goals. The first is to review the basic CD algorithms, with particular attention to LLM-based approaches; a short section also explores different methods for assessing the reliability of LLM output. The second is to investigate the use of LLMs alongside BSL methods for causal discovery. In particular, we propose using the domain knowledge of LLMs as the prior distribution over graphs. Our experiments show that LLM-informed priors can improve the performance of BSL methods. The third is to describe another approach, which focuses on ancestral graphs (AGs) and does not require the causal sufficiency assumption. This method uses the LLM as an expert-in-the-loop, combining information from the data with outside knowledge during model processing. Our experiments show that the method is competitive with others that address latent confounding on both synthetic and real-world datasets, and that our design for incorporating feedback from a (simulated) human expert or an LLM improves inference quality.
id USP_4a88e0863134e592a19fb89f53e3aa74
oai_identifier_str oai:teses.usp.br:tde-18032026-165543
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv Large language models as tools for Bayesian causal discovery
Grandes modelos de linguagem como ferramentas para descoberta causal
title Large language models as tools for Bayesian causal discovery
spellingShingle Large language models as tools for Bayesian causal discovery
Videira, Bruna Bazaluk Machado
Bayesian causal discovery
Causal discovery
Causalidade
Causality
Descoberta causal
Descoberta causal Bayesiana
Grandes modelos de linguagem
Large language models
title_short Large language models as tools for Bayesian causal discovery
title_full Large language models as tools for Bayesian causal discovery
title_fullStr Large language models as tools for Bayesian causal discovery
title_full_unstemmed Large language models as tools for Bayesian causal discovery
title_sort Large language models as tools for Bayesian causal discovery
author Videira, Bruna Bazaluk Machado
author_facet Videira, Bruna Bazaluk Machado
author_role author
dc.contributor.none.fl_str_mv Mauá, Denis Deratani
Silva, Flavio Soares Correa da
Wang, Benjie
dc.contributor.author.fl_str_mv Videira, Bruna Bazaluk Machado
dc.subject.por.fl_str_mv Bayesian causal discovery
Causal discovery
Causalidade
Causality
Descoberta causal
Descoberta causal Bayesiana
Grandes modelos de linguagem
Large language models
topic Bayesian causal discovery
Causal discovery
Causalidade
Causality
Descoberta causal
Descoberta causal Bayesiana
Grandes modelos de linguagem
Large language models
description Causal discovery (CD) is the task of automatically inferring causal structures, typically from observational data. Recently, there has been much interest in using domain knowledge from large language models (LLMs) in causal discovery. However, existing LLM-based approaches output only a single directed acyclic graph (DAG) with no measure of uncertainty, which can make the result unreliable. This dissertation therefore explores Bayesian structure learning (BSL) algorithms, which produce distributions over graphs, a natural representation of uncertainty, and uses LLMs to enhance their inference power. The dissertation has three main goals. The first is to review the basic CD algorithms, with particular attention to LLM-based approaches; a short section also explores different methods for assessing the reliability of LLM output. The second is to investigate the use of LLMs alongside BSL methods for causal discovery. In particular, we propose using the domain knowledge of LLMs as the prior distribution over graphs. Our experiments show that LLM-informed priors can improve the performance of BSL methods. The third is to describe another approach, which focuses on ancestral graphs (AGs) and does not require the causal sufficiency assumption. This method uses the LLM as an expert-in-the-loop, combining information from the data with outside knowledge during model processing. Our experiments show that the method is competitive with others that address latent confounding on both synthetic and real-world datasets, and that our design for incorporating feedback from a (simulated) human expert or an LLM improves inference quality.
publishDate 2026
dc.date.none.fl_str_mv 2026-02-23
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/
url https://www.teses.usp.br/teses/disponiveis/45/45134/tde-18032026-165543/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1865492444051668992