Advancing deep learning models for robustness and interpretability in image recognition

Bibliographic details
Year of defense: 2023
Principal author: SANTOS, Flávio Arthur Oliveira
Advisors: ZANCHETTIN, Cleber; NOVAIS, Paulo Jorge Freitas de Oliveira
Defense committee: Not informed by the institution
Document type: Doctoral thesis
Access type: Open access
Language: English (eng)
Defending institution: Universidade Federal de Pernambuco (UFPE), Brazil
Graduate program: Programa de Pós-Graduação em Ciência da Computação
Department: Not informed by the institution
Country: Brazil
Keywords (Portuguese): Deep learning; Robustez (robustness); Ataques adversários (adversarial attacks); Interpretabilidade (interpretability)
Access link: https://repositorio.ufpe.br/handle/123456789/57293
Abstract: Deep Learning architectures are among the most promising machine learning models today. They are used in various domains, including drug discovery, speech recognition, object recognition, question answering, machine translation, and image description. Surprisingly, some studies even report superhuman performance, that is, a level of performance superior to that of human experts in certain tasks. Although these models exhibit high precision and coverage, the literature shows that they also have several limitations: (1) they are vulnerable to adversarial attacks, (2) they have difficulty inferring data outside the training distribution, (3) they provide correct inferences based on spurious information, and (4) their inferences are difficult for a domain expert to interpret. These limitations make it challenging to adopt these models in high-risk applications, such as autonomous cars or medical diagnostics. Overcoming them requires robustness, reliability, and interpretability. This thesis conducts a comprehensive exploration of techniques and tools to improve the robustness and interpretability of Deep Learning models in the domain of image processing. The contributions cover four key areas: (1) the development of the Active Image Data Augmentation (ADA) method to improve model robustness, (2) the proposition of the Adversarial Right for Right Reasons (ARRR) loss function to ensure that models are "right for the right reasons" and adversarially robust, (3) the introduction of the Right for Right Reasons Data Augmentation (RRDA) method, which enriches the context of the information represented in the training data to encourage the model to focus on signal characteristics, and (4) the presentation of a new method for interpreting the behavior of models during inference. We also present a tool for manipulating visual features and assessing the robustness of models trained under different usage conditions.
The analyses demonstrate that the ADA method improves the robustness of models without compromising traditional performance metrics. The ARRR method is robust to image color bias in problems based on the structural information of the images. In addition, the RRDA method significantly improves the model's robustness to background shifts in the image, outperforming other traditional RRR methods. Finally, the proposed model analysis tool reveals counterintuitive interdependence among features and exposes weaknesses in the models' inference decisions. These contributions represent significant advances in Deep Learning applied to image processing, providing valuable insights and innovative solutions to challenges associated with the reliability and interpretation of these complex models.
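The abstract's notion of varying the context of training images so the model attends to signal (foreground) rather than spurious background features can be sketched as a background-shift augmentation. The snippet below is a minimal, generic illustration of that idea, not the thesis's actual RRDA implementation; the function name `background_shift_augment` and the assumption that a binary foreground mask is available per image are illustrative.

```python
import numpy as np

def background_shift_augment(image, fg_mask, backgrounds, rng):
    """Paste the labeled foreground onto a randomly chosen background.

    Generic sketch of background-shift augmentation, not the thesis's
    implementation. Assumes `fg_mask` is a binary {0, 1} array marking
    the signal (foreground) pixels of `image`.
    """
    bg = backgrounds[rng.integers(len(backgrounds))]    # pick a replacement context
    # Broadcast the mask over the channel axis for H x W x C images.
    mask = fg_mask[..., None] if image.ndim == 3 else fg_mask
    return mask * image + (1 - mask) * bg               # keep foreground, swap background

# Example: a 4x4 image whose 2x2 center is foreground.
img = np.full((4, 4), 5.0)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1
bgs = [np.zeros((4, 4)), np.ones((4, 4))]
aug = background_shift_augment(img, mask, bgs, np.random.default_rng(0))
```

Training on such augmented copies makes the background statistically uninformative, which is one standard way to discourage a classifier from exploiting spurious context.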
Record metadata
Record id: UFPE_1e6e618c31422b64baade162bbc9f33d
OAI identifier: oai:repositorio.ufpe.br:123456789/57293
Network: UFPE (Repositório Institucional da UFPE)
Funding: CAPES
Advisors: ZANCHETTIN, Cleber; NOVAIS, Paulo Jorge Freitas de Oliveira
Advisor Lattes CVs: http://lattes.cnpq.br/4086648712225670; http://lattes.cnpq.br/1244195230407619
Date of defense: 2023-12-06
Deposited in repository: 2024-08-12
Type and format: doctoral thesis, published version, application/pdf
Citation: SANTOS, Flávio Arthur Oliveira. Advancing deep learning models for robustness and interpretability in image recognition. 2023. Tese (Doutorado em Ciência da Computação) – Universidade Federal de Pernambuco, Recife, 2023.
License: Attribution-NonCommercial-NoDerivs 3.0 Brazil (http://creativecommons.org/licenses/by-nc-nd/3.0/br/); open access
Repository: Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
Repository contact: attena@ufpe.br