W-operator learning using linear models for both gray-level and binary inputs

Detalhes bibliográficos
Ano de defesa: 2017
Autor(a) principal: Montagner, Igor dos Santos
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Tese
Tipo de acesso: Acesso aberto
Idioma: eng
Instituição de defesa: Biblioteca Digitais de Teses e Dissertações da USP
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://www.teses.usp.br/teses/disponiveis/45/45134/tde-21082017-111455/
Resumo: Image Processing techniques can be used to solve a broad range of problems, such as medical imaging, document processing and object segmentation. Image operators are usually built by combining basic image operators and tuning their parameters. This requires both experience in Image Processing and trial-and-error to get the best combination of parameters. An alternative approach to design image operators is to estimate them from pairs of training images containing examples of the expected input and their processed versions. By restricting the learned operators to those that are translation invariant and locally defined ($W$-operators) we can apply Machine Learning techniques to estimate image transformations. The shape that defines which neighbors are used is called a window. $W$-operators trained with large windows usually overfit due to the lack sufficient of training data. This issue is even more present when training operators with gray-level inputs. Although approaches such as the two-level design, which combines multiple operators trained on smaller windows, partly mitigates these problems, they also require more complicated parameter determination to achieve good results. In this work we present techniques that increase the window sizes we can use and decrease the number of manually defined parameters in $W$-operator learning. The first one, KA, is based on Support Vector Machines and employs kernel approximations to estimate image transformations. We also present adequate kernels for processing binary and gray-level images. The second technique, NILC, automatically finds small subsets of operators that can be successfully combined using the two-level approach. Both methods achieve competitive results with methods from the literature in two different application domains. The first one is a binary document processing problem common in Optical Music Recognition, while the second is a segmentation problem in gray-level images. The same techniques were applied without modification in both domains.
id USP_6e540fc503c963b95a25e9a4eeedb4cd
oai_identifier_str oai:teses.usp.br:tde-21082017-111455
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
spelling W-operator learning using linear models for both gray-level and binary inputsAprendizado de w-operadores usando modelos lineares para imagens binárias e em níveis de cinzaAprendizado de máquinaImage processingLinear classification methodsMachine learningMáquinas de suporte vetorialProcessamento de imagensProjeto automático de W-operadoresSupport vector machinesW-operator learningImage Processing techniques can be used to solve a broad range of problems, such as medical imaging, document processing and object segmentation. Image operators are usually built by combining basic image operators and tuning their parameters. This requires both experience in Image Processing and trial-and-error to get the best combination of parameters. An alternative approach to design image operators is to estimate them from pairs of training images containing examples of the expected input and their processed versions. By restricting the learned operators to those that are translation invariant and locally defined ($W$-operators) we can apply Machine Learning techniques to estimate image transformations. The shape that defines which neighbors are used is called a window. $W$-operators trained with large windows usually overfit due to the lack sufficient of training data. This issue is even more present when training operators with gray-level inputs. Although approaches such as the two-level design, which combines multiple operators trained on smaller windows, partly mitigates these problems, they also require more complicated parameter determination to achieve good results. In this work we present techniques that increase the window sizes we can use and decrease the number of manually defined parameters in $W$-operator learning. The first one, KA, is based on Support Vector Machines and employs kernel approximations to estimate image transformations. We also present adequate kernels for processing binary and gray-level images. The second technique, NILC, automatically finds small subsets of operators that can be successfully combined using the two-level approach. Both methods achieve competitive results with methods from the literature in two different application domains. The first one is a binary document processing problem common in Optical Music Recognition, while the second is a segmentation problem in gray-level images. The same techniques were applied without modification in both domains.Processamento de imagens pode ser usado para resolver problemas em diversas áreas, como imagens médicas, processamento de documentos e segmentação de objetos. Operadores de imagens normalmente são construídos combinando diversos operadores elementares e ajustando seus parâmetros. Uma abordagem alternativa é a estimação de operadores de imagens a partir de pares de exemplos contendo uma imagem de entrada e o resultado esperado. Restringindo os operadores considerados para o que são invariantes à translação e localmente definidos ($W$-operadores), podemos aplicar técnicas de Aprendizagem de Máquina para estimá-los. O formato que define quais vizinhos são usadas é chamado de janela. $W$-operadores treinados com janelas grandes frequentemente tem problemas de generalização, pois necessitam de grandes conjuntos de treinamento. Este problema é ainda mais grave ao treinar operadores em níveis de cinza. Apesar de técnicas como o projeto dois níveis, que combina a saída de diversos operadores treinados com janelas menores, mitigar em parte estes problemas, uma determinação de parâmetros complexa é necessária. Neste trabalho apresentamos duas técnicas que permitem o treinamento de operadores usando janelas grandes. A primeira, KA, é baseada em Máquinas de Suporte Vetorial (SVM) e utiliza técnicas de aproximação de kernels para realizar o treinamento de $W$-operadores. Uma escolha adequada de kernels permite o treinamento de operadores níveis de cinza e binários. A segunda técnica, NILC, permite a criação automática de combinações de operadores de imagens. Este método utiliza uma técnica de otimização específica para casos em que o número de características é muito grande. Ambos métodos obtiveram resultados competitivos com algoritmos da literatura em dois domínio de aplicação diferentes. O primeiro, Staff Removal, é um processamento de documentos binários frequente em sistemas de reconhecimento ótico de partituras. O segundo é um problema de segmentação de vasos sanguíneos em imagens em níveis de cinza.Biblioteca Digitais de Teses e Dissertações da USPHirata Junior, RobertoHirata, Nina Sumiko TomitaMontagner, Igor dos Santos2017-06-12info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisapplication/pdfhttp://www.teses.usp.br/teses/disponiveis/45/45134/tde-21082017-111455/reponame:Biblioteca Digital de Teses e Dissertações da USPinstname:Universidade de São Paulo (USP)instacron:USPLiberar o conteúdo para acesso público.info:eu-repo/semantics/openAccesseng2018-07-17T16:38:18Zoai:teses.usp.br:tde-21082017-111455Biblioteca Digital de Teses e Dissertaçõeshttp://www.teses.usp.br/PUBhttp://www.teses.usp.br/cgi-bin/mtd2br.plvirginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.bropendoar:27212018-07-17T16:38:18Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)false
dc.title.none.fl_str_mv W-operator learning using linear models for both gray-level and binary inputs
Aprendizado de w-operadores usando modelos lineares para imagens binárias e em níveis de cinza
title W-operator learning using linear models for both gray-level and binary inputs
spellingShingle W-operator learning using linear models for both gray-level and binary inputs
Montagner, Igor dos Santos
Aprendizado de máquina
Image processing
Linear classification methods
Machine learning
Máquinas de suporte vetorial
Processamento de imagens
Projeto automático de W-operadores
Support vector machines
W-operator learning
title_short W-operator learning using linear models for both gray-level and binary inputs
title_full W-operator learning using linear models for both gray-level and binary inputs
title_fullStr W-operator learning using linear models for both gray-level and binary inputs
title_full_unstemmed W-operator learning using linear models for both gray-level and binary inputs
title_sort W-operator learning using linear models for both gray-level and binary inputs
author Montagner, Igor dos Santos
author_facet Montagner, Igor dos Santos
author_role author
dc.contributor.none.fl_str_mv Hirata Junior, Roberto
Hirata, Nina Sumiko Tomita
dc.contributor.author.fl_str_mv Montagner, Igor dos Santos
dc.subject.por.fl_str_mv Aprendizado de máquina
Image processing
Linear classification methods
Machine learning
Máquinas de suporte vetorial
Processamento de imagens
Projeto automático de W-operadores
Support vector machines
W-operator learning
topic Aprendizado de máquina
Image processing
Linear classification methods
Machine learning
Máquinas de suporte vetorial
Processamento de imagens
Projeto automático de W-operadores
Support vector machines
W-operator learning
description Image Processing techniques can be used to solve a broad range of problems, such as medical imaging, document processing and object segmentation. Image operators are usually built by combining basic image operators and tuning their parameters. This requires both experience in Image Processing and trial-and-error to get the best combination of parameters. An alternative approach to design image operators is to estimate them from pairs of training images containing examples of the expected input and their processed versions. By restricting the learned operators to those that are translation invariant and locally defined ($W$-operators) we can apply Machine Learning techniques to estimate image transformations. The shape that defines which neighbors are used is called a window. $W$-operators trained with large windows usually overfit due to the lack sufficient of training data. This issue is even more present when training operators with gray-level inputs. Although approaches such as the two-level design, which combines multiple operators trained on smaller windows, partly mitigates these problems, they also require more complicated parameter determination to achieve good results. In this work we present techniques that increase the window sizes we can use and decrease the number of manually defined parameters in $W$-operator learning. The first one, KA, is based on Support Vector Machines and employs kernel approximations to estimate image transformations. We also present adequate kernels for processing binary and gray-level images. The second technique, NILC, automatically finds small subsets of operators that can be successfully combined using the two-level approach. Both methods achieve competitive results with methods from the literature in two different application domains. The first one is a binary document processing problem common in Optical Music Recognition, while the second is a segmentation problem in gray-level images. The same techniques were applied without modification in both domains.
publishDate 2017
dc.date.none.fl_str_mv 2017-06-12
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://www.teses.usp.br/teses/disponiveis/45/45134/tde-21082017-111455/
url http://www.teses.usp.br/teses/disponiveis/45/45134/tde-21082017-111455/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Liberar o conteúdo para acesso público.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Liberar o conteúdo para acesso público.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digitais de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815258365670981632