Face image inpainting based on Generative Adversarial Networks
| Year of defense: | 2023 |
|---|---|
| Main author: | Ivamoto, Victor Soares |
| Advisor: | Lima, Clodoaldo Aparecido de Moraes |
| Defense committee: | |
| Document type: | Master's thesis |
| Access type: | Open access |
| Language: | eng |
| Defending institution: | Biblioteca Digital de Teses e Dissertações da USP |
| Graduate program: | Not informed by the institution |
| Department: | Not informed by the institution |
| Country: | Not informed by the institution |
| Keywords in Portuguese: | Biometria; Oclusão facial; Processamento de imagem; Rede Adversária Generativa; Restauração facial; Visão computacional |
| Link de acesso: | https://www.teses.usp.br/teses/disponiveis/100/100131/tde-06022024-184049/ |
| Abstract: | Face image inpainting is a challenging problem in computer vision with several practical uses, and it is employed in many image preprocessing applications. In the past few years, deep learning has made great breakthroughs in the field of image inpainting. The impressive results achieved by Generative Adversarial Networks in image processing have drawn increasing attention from the scientific community to facial inpainting in recent years. Recent architectural developments include two-stage networks that use a coarse-to-fine approach, as well as landmarks, semantic segmentation maps, and edge maps that guide the inpainting process. Moreover, improved convolutions enlarge the receptive field and filter the values passed to the next layer, and attention layers create relationships between local and distant information. The objective of this project is to evaluate and compare the efficacy of the baseline models identified in the literature on face datasets with three occlusion types. To this end, a comparative study was performed among the baseline models to identify the advantages and disadvantages of each model on three types of facial occlusion. A literature review gathered the baseline models, face datasets, occlusions, and evaluation metrics. The baseline models were DF1, DF2, EdgeConnect, GLCI, GMCNN, PConv, and SIIGM. The datasets were CelebA and CelebA-HQ. The occlusions were a square mask centered in the image, an irregularly shaped mask, and a facial mask (MTF). The evaluation metrics were PSNR, SSIM, and LPIPS. The comparative study consisted of two experiments: one training the baseline models from scratch and the other using pretrained models. Both experiments followed the same testing procedure. GMCNN achieved the best overall results with the square mask, DF2 was the best model with the irregular mask, and both models were the best with the MTF mask. PConv performed poorly in all experiments, except for the irregular mask with the pretrained model. |
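As a concrete illustration of the evaluation setup described in the abstract, the sketch below builds a square occlusion mask centered in the image and scores an inpainted result with the three reported metrics (PSNR, SSIM, LPIPS). It is not code from the thesis: the image size, hole size, helper names, and the use of scikit-image and the `lpips` PyPI package are assumptions made for illustration.

```python
# A minimal sketch (not code from the thesis): centered square occlusion plus
# PSNR/SSIM/LPIPS scoring. Sizes, helper names, and library choices are assumptions.
import numpy as np
import torch
import lpips  # PyPI package "lpips" (learned perceptual image patch similarity)
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def center_square_mask(height: int, width: int, hole_fraction: float = 0.5) -> np.ndarray:
    """Binary mask with a square hole (1 = missing pixel) centered in the image."""
    mask = np.zeros((height, width), dtype=np.uint8)
    side = int(min(height, width) * hole_fraction)
    top, left = (height - side) // 2, (width - side) // 2
    mask[top:top + side, left:left + side] = 1
    return mask


def evaluate(original: np.ndarray, inpainted: np.ndarray, lpips_model) -> tuple:
    """Return (PSNR, SSIM, LPIPS) for two uint8 RGB images of the same shape."""
    psnr = peak_signal_noise_ratio(original, inpainted, data_range=255)
    ssim = structural_similarity(original, inpainted, channel_axis=-1, data_range=255)
    # LPIPS expects NCHW float tensors scaled to [-1, 1].
    to_tensor = lambda img: torch.from_numpy(img).permute(2, 0, 1)[None].float() / 127.5 - 1.0
    with torch.no_grad():
        lp = lpips_model(to_tensor(original), to_tensor(inpainted)).item()
    return psnr, ssim, lp


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for a ground-truth face and a model's reconstruction.
    original = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
    mask = center_square_mask(256, 256)
    occluded = original.copy()
    occluded[mask == 1] = 0          # input an inpainting model would receive
    inpainted = occluded             # placeholder for a model's output
    psnr, ssim, lp = evaluate(original, inpainted, lpips.LPIPS(net="alex"))
    print(f"PSNR={psnr:.2f} dB  SSIM={ssim:.4f}  LPIPS={lp:.4f}")
```

In the comparative study, such per-image scores would then be aggregated over the CelebA or CelebA-HQ test images for each mask type and baseline model.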
| id | USP_6f2f4676af9f728d26c4b1f0e66e623b |
|---|---|
| oai_identifier_str | oai:teses.usp.br:tde-06022024-184049 |
| network_acronym_str | USP |
| network_name_str | Biblioteca Digital de Teses e Dissertações da USP |
| repository_id_str | |
| dc.title.none.fl_str_mv | Face image inpainting based on Generative Adversarial Networks; Restauração de imagem facial baseada em Redes Adversárias Generativas |
| title | Face image inpainting based on Generative Adversarial Networks |
| spellingShingle | Face image inpainting based on Generative Adversarial Networks; Ivamoto, Victor Soares; Biometria; Biometry; Computer vision; Facial inpainting; Facial occlusion; Generative Adversarial Network; Image processing; Oclusão facial; Processamento de imagem; Rede Adversária Generativa; Restauração facial; Visão computacional |
| title_short | Face image inpainting based on Generative Adversarial Networks |
| title_full | Face image inpainting based on Generative Adversarial Networks |
| title_fullStr | Face image inpainting based on Generative Adversarial Networks |
| title_full_unstemmed | Face image inpainting based on Generative Adversarial Networks |
| title_sort | Face image inpainting based on Generative Adversarial Networks |
| author | Ivamoto, Victor Soares |
| author_facet | Ivamoto, Victor Soares |
| author_role | author |
| dc.contributor.none.fl_str_mv | Lima, Clodoaldo Aparecido de Moraes |
| dc.contributor.author.fl_str_mv | Ivamoto, Victor Soares |
| dc.subject.por.fl_str_mv | Biometria; Biometry; Computer vision; Facial inpainting; Facial occlusion; Generative Adversarial Network; Image processing; Oclusão facial; Processamento de imagem; Rede Adversária Generativa; Restauração facial; Visão computacional |
| topic | Biometria; Biometry; Computer vision; Facial inpainting; Facial occlusion; Generative Adversarial Network; Image processing; Oclusão facial; Processamento de imagem; Rede Adversária Generativa; Restauração facial; Visão computacional |
| description | Face image inpainting is a challenging problem in computer vision with several practical uses, and it is employed in many image preprocessing applications. In the past few years, deep learning has made great breakthroughs in the field of image inpainting. The impressive results achieved by Generative Adversarial Networks in image processing have drawn increasing attention from the scientific community to facial inpainting in recent years. Recent architectural developments include two-stage networks that use a coarse-to-fine approach, as well as landmarks, semantic segmentation maps, and edge maps that guide the inpainting process. Moreover, improved convolutions enlarge the receptive field and filter the values passed to the next layer, and attention layers create relationships between local and distant information. The objective of this project is to evaluate and compare the efficacy of the baseline models identified in the literature on face datasets with three occlusion types. To this end, a comparative study was performed among the baseline models to identify the advantages and disadvantages of each model on three types of facial occlusion. A literature review gathered the baseline models, face datasets, occlusions, and evaluation metrics. The baseline models were DF1, DF2, EdgeConnect, GLCI, GMCNN, PConv, and SIIGM. The datasets were CelebA and CelebA-HQ. The occlusions were a square mask centered in the image, an irregularly shaped mask, and a facial mask (MTF). The evaluation metrics were PSNR, SSIM, and LPIPS. The comparative study consisted of two experiments: one training the baseline models from scratch and the other using pretrained models. Both experiments followed the same testing procedure. GMCNN achieved the best overall results with the square mask, DF2 was the best model with the irregular mask, and both models were the best with the MTF mask. PConv performed poorly in all experiments, except for the irregular mask with the pretrained model. |
| publishDate | 2023 |
| dc.date.none.fl_str_mv | 2023-11-23 |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
| format | masterThesis |
| status_str | publishedVersion |
| dc.identifier.uri.fl_str_mv | https://www.teses.usp.br/teses/disponiveis/100/100131/tde-06022024-184049/ |
| url | https://www.teses.usp.br/teses/disponiveis/100/100131/tde-06022024-184049/ |
| dc.language.iso.fl_str_mv | eng |
| language | eng |
| dc.relation.none.fl_str_mv | |
| dc.rights.driver.fl_str_mv | Liberar o conteúdo para acesso público. info:eu-repo/semantics/openAccess |
| rights_invalid_str_mv | Liberar o conteúdo para acesso público. |
| eu_rights_str_mv | openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.coverage.none.fl_str_mv | |
| dc.publisher.none.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP |
| publisher.none.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP |
| dc.source.none.fl_str_mv | reponame:Biblioteca Digital de Teses e Dissertações da USP; instname:Universidade de São Paulo (USP); instacron:USP |
| instname_str | Universidade de São Paulo (USP) |
| instacron_str | USP |
| institution | USP |
| reponame_str | Biblioteca Digital de Teses e Dissertações da USP |
| collection | Biblioteca Digital de Teses e Dissertações da USP |
| repository.name.fl_str_mv | Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP) |
| repository.mail.fl_str_mv | virginia@if.usp.br || atendimento@aguia.usp.br |
| _version_ | 1818598502180061184 |