Statistical analysis applied to data classification and image filtering

Bibliographic details
Year of defense: 2016
Main author: ALMEIDA, Marcos Antonio Martins de
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Engenharia Eletrica
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/25506
Abstract: Statistical analysis is a widely applicable tool in several areas of scientific knowledge. This thesis uses statistical analysis in two different applications: data classification and image processing targeted at document image binarization. In the first case, the thesis analyzes several aspects of the consistency of the classification of senior researchers in computer science by the Brazilian research council, CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico. The second application addresses filtering out the back-to-front interference that appears whenever a document is written or typed on both sides of translucent paper. On this topic, the most important algorithms found in the literature are assessed, taking into account a large number of parameters, such as the strength of the back-to-front interference, the diffusion of the ink in the paper, and the texture and hue of the paper due to aging. A new binarization algorithm is proposed that is capable of removing the back-to-front noise in a wide range of documents. Additionally, this thesis proposes a new concept of “intelligent” binarization for complex documents, which, besides text, encompass several graphical elements such as figures, photos, and diagrams.
id UFPE_e2c646bffce10f77a864968711e8e941
oai_identifier_str oai:repositorio.ufpe.br:123456789/25506
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str
dc.title.none.fl_str_mv Statistical analysis applied to data classification and image filtering
author ALMEIDA, Marcos Antonio Martins de
author_facet ALMEIDA, Marcos Antonio Martins de
author_role author
dc.contributor.none.fl_str_mv LINS, Rafael Dueire
http://lattes.cnpq.br/2140863905290751
http://lattes.cnpq.br/7601016626256808
dc.contributor.author.fl_str_mv ALMEIDA, Marcos Antonio Martins de
dc.subject.por.fl_str_mv Electrical Engineering
Data processing
Data classification
Image filtering
topic Electrical Engineering
Data processing
Data classification
Image filtering
publishDate 2016
dc.date.none.fl_str_mv 2016-12-21
2018-08-09T20:49:01Z
2018-08-09T20:49:01Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/25506
url https://repositorio.ufpe.br/handle/123456789/25506
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Engenharia Eletrica
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1856041904736567296