Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech

Turi, Leandro Furlam

Bibliographic details
Year of defense:	2024
Main author:	Turi, Leandro Furlam
Advisor:	Not provided by the institution
Defense committee:	Not provided by the institution
Document type:	Master's thesis
Access type:	Open access
Language:	Portuguese (por)
Defending institution:	Universidade Federal do Espírito Santo; BR; Mestrado em Informática; Centro Tecnológico; UFES; Programa de Pós-Graduação em Informática
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords (Portuguese):	GPT-2; Ajuste fino (fine-tuning); Telegram; Ciência da Computação (Computer Science)
Access link:	http://repositorio.ufes.br/handle/10/18304
Abstract:	We examined the effects of integrating data containing divergent information, particularly concerning anti-vaccination narratives, in training a GPT-2 language model by fine-tuning it using content from anti-vaccination groups and channels on Telegram. Our objective was to analyze the model’s ability to generate coherent and rationalized texts compared to a model pre-trained on OpenAI’s WebText dataset. The results demonstrate that fine-tuning a GPT-2 model with biased data leads the model to perpetuate these biases in its responses, albeit with a certain degree of rationalization, highlighting the importance of using reliable and high-quality data in the training of natural language processing models and underscoring the implications for information dissemination through these models. We also explored the impact of data poisoning by incorporating anti-vaccination messages combined with general group messages in different proportions, aiming to understand how exposure to biased data can influence text generation and the introduction of harmful biases. The experiments highlight the change in frequency and intensity of anti-vaccination content generated by the model and elucidate the broader implications for reliability and ethics in using language models in sensitive applications. This study provides social scientists with a tool to explore and understand the complexities and challenges associated with misinformation in public health through the use of language models, particularly in the context of vaccine misinformation.
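
The abstract mentions combining anti-vaccination messages with general group messages "in different proportions" but does not say how the mixtures were built. The following is only a minimal sketch of one plausible way to assemble such a mixture; the function name `build_mixed_corpus`, its parameters, and the placeholder messages are hypothetical and are not taken from the thesis.

```python
import random


def build_mixed_corpus(general, biased, biased_fraction, size, seed=0):
    """Sample `size` messages so that about `biased_fraction` come from `biased`.

    Hypothetical helper for illustration only; the thesis's actual
    data-mixing procedure is not described in this record.
    """
    if not 0.0 <= biased_fraction <= 1.0:
        raise ValueError("biased_fraction must be between 0 and 1")
    n_biased = round(size * biased_fraction)
    n_general = size - n_biased
    if n_biased > len(biased) or n_general > len(general):
        raise ValueError("not enough messages for the requested mixture")
    rng = random.Random(seed)  # fixed seed keeps the mixture reproducible
    corpus = rng.sample(biased, n_biased) + rng.sample(general, n_general)
    rng.shuffle(corpus)  # interleave the two sources
    return corpus


# Placeholder data standing in for the two message pools.
general_msgs = [f"general message {i}" for i in range(100)]
biased_msgs = [f"biased message {i}" for i in range(100)]

# One corpus per mixing proportion (the proportions here are arbitrary).
corpora = {
    fraction: build_mixed_corpus(general_msgs, biased_msgs, fraction, size=40)
    for fraction in (0.0, 0.25, 0.5, 1.0)
}
```

Each resulting corpus could then be used as a separate fine-tuning set, so that generations from the resulting models can be compared across proportions.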

Item metadata

id	UFES_fb83fc3436bab86effca560b0f61c77d
oai_identifier_str	oai:repositorio.ufes.br:10/18304
network_acronym_str	UFES
network_name_str	Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
repository_id_str
spelling	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech; GPT-2; Ajuste fino; Telegram; Ciência da Computação; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES); Universidade Federal do Espírito Santo; BR; Mestrado em Informática; Centro Tecnológico; UFES; Programa de Pós-Graduação em Informática; Badue, Claudine; Souza, Alberto Ferreira de; https://orcid.org/0000-0003-1561-8447; Pacheco, Andre Georghton Cardoso; Almeida Junior, Jurandy Gomes de; Turi, Leandro Furlam; 2025-01-31T22:35:33Z; 2025-01-31T22:35:33Z; 2024-12-02; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; Text; application/pdf; http://repositorio.ufes.br/handle/10/18304; por; info:eu-repo/semantics/openAccess; reponame:Repositório Institucional da Universidade Federal do Espírito Santo (riUfes); instname:Universidade Federal do Espírito Santo (UFES); instacron:UFES; 2025-01-31T19:48:09Z; oai:repositorio.ufes.br:10/18304; Repositório Institucional; PUB; http://repositorio.ufes.br/oai/request; riufes@ufes.br; opendoar:2108; 2025-01-31T19:48:09; Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) - Universidade Federal do Espírito Santo (UFES); false
dc.title.none.fl_str_mv	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
title	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
spellingShingle	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech; Turi, Leandro Furlam; GPT-2; Ajuste fino; Telegram; Ciência da Computação
title_short	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
title_full	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
title_fullStr	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
title_full_unstemmed	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
title_sort	Analysis of bias in GPT language models through fine-tuning with anti-vaccination speech
author	Turi, Leandro Furlam
author_facet	Turi, Leandro Furlam
author_role	author
dc.contributor.none.fl_str_mv	Badue, Claudine; Souza, Alberto Ferreira de; https://orcid.org/0000-0003-1561-8447; Pacheco, Andre Georghton Cardoso; Almeida Junior, Jurandy Gomes de
dc.contributor.author.fl_str_mv	Turi, Leandro Furlam
dc.subject.por.fl_str_mv	GPT-2; Ajuste fino; Telegram; Ciência da Computação
topic	GPT-2; Ajuste fino; Telegram; Ciência da Computação
description	We examined the effects of integrating data containing divergent information, particularly concerning anti-vaccination narratives, in training a GPT-2 language model by fine-tuning it using content from anti-vaccination groups and channels on Telegram. Our objective was to analyze the model’s ability to generate coherent and rationalized texts compared to a model pre-trained on OpenAI’s WebText dataset. The results demonstrate that fine-tuning a GPT-2 model with biased data leads the model to perpetuate these biases in its responses, albeit with a certain degree of rationalization, highlighting the importance of using reliable and high-quality data in the training of natural language processing models and underscoring the implications for information dissemination through these models. We also explored the impact of data poisoning by incorporating anti-vaccination messages combined with general group messages in different proportions, aiming to understand how exposure to biased data can influence text generation and the introduction of harmful biases. The experiments highlight the change in frequency and intensity of anti-vaccination content generated by the model and elucidate the broader implications for reliability and ethics in using language models in sensitive applications. This study provides social scientists with a tool to explore and understand the complexities and challenges associated with misinformation in public health through the use of language models, particularly in the context of vaccine misinformation.
publishDate	2024
dc.date.none.fl_str_mv	2024-12-02; 2025-01-31T22:35:33Z; 2025-01-31T22:35:33Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://repositorio.ufes.br/handle/10/18304
url	http://repositorio.ufes.br/handle/10/18304
dc.language.iso.fl_str_mv	por
language	por
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	Text; application/pdf
dc.publisher.none.fl_str_mv	Universidade Federal do Espírito Santo; BR; Mestrado em Informática; Centro Tecnológico; UFES; Programa de Pós-Graduação em Informática
publisher.none.fl_str_mv	Universidade Federal do Espírito Santo; BR; Mestrado em Informática; Centro Tecnológico; UFES; Programa de Pós-Graduação em Informática
dc.source.none.fl_str_mv	reponame:Repositório Institucional da Universidade Federal do Espírito Santo (riUfes); instname:Universidade Federal do Espírito Santo (UFES); instacron:UFES
instname_str	Universidade Federal do Espírito Santo (UFES)
instacron_str	UFES
institution	UFES
reponame_str	Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
collection	Repositório Institucional da Universidade Federal do Espírito Santo (riUfes)
repository.name.fl_str_mv	Repositório Institucional da Universidade Federal do Espírito Santo (riUfes) - Universidade Federal do Espírito Santo (UFES)
repository.mail.fl_str_mv	riufes@ufes.br
_version_	1834479099219804160
