A contribution to semantic description of images and videos: an application of soft biometrics

Perlin, Hugo Alberto

Bibliographic details
Year of defense:	2015
Main author:	Perlin, Hugo Alberto
Advisor:	Not provided by the institution
Examining committee:	Not provided by the institution
Document type:	Thesis
Access type:	Open access
Language:	eng
Defending institution:	Universidade Tecnológica Federal do Paraná, Curitiba, Brasil, Programa de Pós-Graduação em Engenharia Elétrica e Informática Industrial, UTFPR
Graduate program:	Not provided by the institution
Department:	Not provided by the institution
Country:	Not provided by the institution
Keywords:	Computer vision; Machine learning; Image processing; Picture interpretation; Programming languages (Electronic computers) - Semantics; Electrical engineering; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO::PROCESSAMENTO GRAFICO (GRAPHICS)
Access link:	http://repositorio.utfpr.edu.br/jspui/handle/1/1808
Abstract:	Humans are highly capable of extracting information from visual data acquired by sight. Through a learning process that starts at birth and continues throughout life, image interpretation becomes almost instinctive. At a glance, one can easily describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes, and textures and associating them with high-level meanings. In this way, a semantic description of the scene is produced. An example is the human capacity to recognize and describe other people's physical and behavioral characteristics, or biometrics. Soft biometrics also represent inherent characteristics of the human body and behavior, but do not allow unique identification of a person. The field of computer vision aims to develop methods capable of performing visual interpretation with performance similar to that of humans. This thesis proposes computer vision methods that allow high-level information to be extracted from images in the form of soft biometrics. The problem is approached in two ways: unsupervised and supervised learning. The first approach seeks to group images through automatic learning of feature extraction, combining convolution techniques, evolutionary computing, and clustering. The images used in this approach contain faces and people. The second approach employs convolutional neural networks, which can operate on raw images and learn both the feature extraction and classification processes. Here, images are classified according to gender and clothing, divided into the upper and lower parts of the human body. The first approach, tested on different image datasets, obtained an accuracy of approximately 80% for faces versus non-faces and 70% for people versus non-people. The second, tested on images and videos, obtained an accuracy of about 70% for gender, 80% for upper-body clothing, and 90% for lower-body clothing. The results of these case studies show that the proposed methods are promising, enabling automatic annotation of images with high-level information. This opens possibilities for applications in diverse areas such as content-based image and video search and automatic video surveillance, reducing the human effort spent on manual annotation and monitoring.

Item metadata

id	UTFPR-12_d5e914a8961f603047e19efef6bd5e58
oai_identifier_str	oai:repositorio.utfpr.edu.br:1/1808
network_acronym_str	UTFPR-12
network_name_str	Repositório Institucional da UTFPR (da Universidade Tecnológica Federal do Paraná (RIUT))
repository_id_str
spelling	A contribution to semantic description of images and videos: an application of soft biometrics; Uma contribuição para descrição semântica de imagens e vídeos: uma aplicação de biometrias fracas; Fundação Araucária; Universidade Tecnológica Federal do Paraná; Curitiba; Brasil; Programa de Pós-Graduação em Engenharia Elétrica e Informática Industrial; UTFPR; Lopes, Heitor Silvério; http://lattes.cnpq.br/4045818083957064; Koerich, Alessandro Lameiras; Britto, Alceu de Souza; Chidambaram, Chidambaram; Perlin, Hugo Alberto; 2015-12-08; 2016-10-25T17:44:58Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/doctoralThesis; application/pdf; http://repositorio.utfpr.edu.br/jspui/handle/1/1808; eng; info:eu-repo/semantics/openAccess; oai:repositorio.utfpr.edu.br:1/1808
dc.title.none.fl_str_mv	A contribution to semantic description of images and videos: an application of soft biometrics; Uma contribuição para descrição semântica de imagens e vídeos: uma aplicação de biometrias fracas
title	A contribution to semantic description of images and videos: an application of soft biometrics
author	Perlin, Hugo Alberto
author_facet	Perlin, Hugo Alberto
author_role	author
dc.contributor.none.fl_str_mv	Lopes, Heitor Silvério; http://lattes.cnpq.br/4045818083957064; Lopes, Heitor Silvério; Koerich, Alessandro Lameiras; Britto, Alceu de Souza; Chidambaram, Chidambaram
dc.contributor.author.fl_str_mv	Perlin, Hugo Alberto
dc.subject.por.fl_str_mv	Visão por computador; Aprendizado do computador; Processamento de imagens; Interpretação de imagens; Linguagem de programação (Computadores) - Semântica; Engenharia elétrica; Computer vision; Machine learning; Image processing; Picture interpretation; Programming languages (Electronic computers) - Semantics; Electric engineering; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO::PROCESSAMENTO GRAFICO (GRAPHICS)
topic	Visão por computador; Aprendizado do computador; Processamento de imagens; Interpretação de imagens; Linguagem de programação (Computadores) - Semântica; Engenharia elétrica; Computer vision; Machine learning; Image processing; Picture interpretation; Programming languages (Electronic computers) - Semantics; Electric engineering; CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::METODOLOGIA E TECNICAS DA COMPUTACAO::PROCESSAMENTO GRAFICO (GRAPHICS)
publishDate	2015
dc.date.none.fl_str_mv	2015-12-08 2016-10-25T17:44:58Z 2016-10-25T17:44:58Z
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/doctoralThesis
format	doctoralThesis
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	PERLIN, Hugo Alberto. A contribution to semantic description of images and videos: an application of soft biometrics. 2015. 110 f. Tese (Doutorado em Engenharia Elétrica e Informática Industrial) - Universidade Tecnológica Federal do Paraná, Curitiba, 2015. http://repositorio.utfpr.edu.br/jspui/handle/1/1808
identifier_str_mv	PERLIN, Hugo Alberto. A contribution to semantic description of images and videos: an application of soft biometrics. 2015. 110 f. Tese (Doutorado em Engenharia Elétrica e Informática Industrial) - Universidade Tecnológica Federal do Paraná, Curitiba, 2015.
url	http://repositorio.utfpr.edu.br/jspui/handle/1/1808
dc.language.iso.fl_str_mv	eng
language	eng
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Universidade Tecnológica Federal do Paraná Curitiba Brasil Programa de Pós-Graduação em Engenharia Elétrica e Informática Industrial UTFPR
publisher.none.fl_str_mv	Universidade Tecnológica Federal do Paraná Curitiba Brasil Programa de Pós-Graduação em Engenharia Elétrica e Informática Industrial UTFPR
dc.source.none.fl_str_mv	reponame:Repositório Institucional da UTFPR (da Universidade Tecnológica Federal do Paraná (RIUT)) instname:Universidade Tecnológica Federal do Paraná (UTFPR) instacron:UTFPR
instname_str	Universidade Tecnológica Federal do Paraná (UTFPR)
instacron_str	UTFPR
institution	UTFPR
reponame_str	Repositório Institucional da UTFPR (da Universidade Tecnológica Federal do Paraná (RIUT))
collection	Repositório Institucional da UTFPR (da Universidade Tecnológica Federal do Paraná (RIUT))
repository.name.fl_str_mv	Repositório Institucional da UTFPR (da Universidade Tecnológica Federal do Paraná (RIUT)) - Universidade Tecnológica Federal do Paraná (UTFPR)
repository.mail.fl_str_mv	riut@utfpr.edu.br || sibi@utfpr.edu.br
_version_	1850498314128064512
