An analysis of sample synthesis for deep learning based object detection

Leonardo Blanger

Bibliographic details
Year of defense:	2020
Main author:	Leonardo Blanger
Advisor:	Nina Sumiko Tomita Hirata
Examination committee:	David Menotti Gomes, Roberto de Alencar Lotufo
Document type:	Master's dissertation
Access type:	Open access
Language:	eng
Degree-granting institution:	Universidade de São Paulo
Graduate program:	Ciência da Computação
Department:	Not provided by the institution
Country:	BR
Access link:	https://doi.org/10.11606/D.45.2020.tde-10112020-203810
Abstract:	This work investigates the use of artificially synthesized images to reduce the dependency of modern Deep Learning based Object Detection techniques on expensive supervision. In particular, we propose using a large number of synthesized detection samples to pretrain Object Detection architectures before finetuning them on real detection data. As the major contribution of this project, we experimentally demonstrate that this pretraining works as a powerful initialization strategy, allowing the models to achieve competitive results using only a fraction of the original real labeled data. Additionally, to synthesize these samples, we propose a synthesis pipeline capable of generating an infinite stream of artificial images paired with bounding box annotations. We demonstrate that such a pipeline can be designed using only existing GAN techniques. Moreover, all stages in our synthesis pipeline can be fully trained using only classification images. We are therefore able to take advantage of larger and cheaper classification datasets to improve results on the harder and more supervision-hungry Object Detection problem. We demonstrate the effectiveness of this pretraining initialization strategy, combined with the proposed synthesis pipeline, by performing detection on four real-world objects: QR Codes, Faces, Birds and Cars.

Item metadata

id	USP_5294f540bc55d5e0961e970fbdd9da60
oai_identifier_str	oai:teses.usp.br:tde-10112020-203810
network_acronym_str	USP
network_name_str	Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
keywords	Deep learning; Generative models; Object detection; Sample synthesis
dc.title.en.fl_str_mv	An analysis of sample synthesis for deep learning based object detection
dc.title.alternative.pt.fl_str_mv	Uma análise de síntese de exemplos para detecção de objetos baseada em deep learning
title	An analysis of sample synthesis for deep learning based object detection
spellingShingle	An analysis of sample synthesis for deep learning based object detection Leonardo Blanger
title_short	An analysis of sample synthesis for deep learning based object detection
title_full	An analysis of sample synthesis for deep learning based object detection
title_fullStr	An analysis of sample synthesis for deep learning based object detection
title_full_unstemmed	An analysis of sample synthesis for deep learning based object detection
title_sort	An analysis of sample synthesis for deep learning based object detection
author	Leonardo Blanger
author_facet	Leonardo Blanger
author_role	author
dc.contributor.advisor1.fl_str_mv	Nina Sumiko Tomita Hirata
dc.contributor.referee1.fl_str_mv	David Menotti Gomes
dc.contributor.referee2.fl_str_mv	Roberto de Alencar Lotufo
dc.contributor.author.fl_str_mv	Leonardo Blanger
contributor_str_mv	Nina Sumiko Tomita Hirata David Menotti Gomes Roberto de Alencar Lotufo
publishDate	2020
dc.date.issued.fl_str_mv	2020-10-16
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/masterThesis
format	masterThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://doi.org/10.11606/D.45.2020.tde-10112020-203810
url	https://doi.org/10.11606/D.45.2020.tde-10112020-203810
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	Universidade de São Paulo
dc.publisher.program.fl_str_mv	Ciência da Computação
dc.publisher.initials.fl_str_mv	USP
dc.publisher.country.fl_str_mv	BR
publisher.none.fl_str_mv	Universidade de São Paulo
dc.source.none.fl_str_mv	reponame:Biblioteca Digital de Teses e Dissertações da USP instname:Universidade de São Paulo (USP) instacron:USP
instname_str	Universidade de São Paulo (USP)
instacron_str	USP
institution	USP
reponame_str	Biblioteca Digital de Teses e Dissertações da USP
collection	Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv	Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv	virginia@if.usp.br || atendimento@aguia.usp.br || virginia@if.usp.br
_version_	1786376728661196800
