Efficient adaptive multiresolution representation of music signals

Bibliographic details
Year of defense: 2020
Main author: Figueiredo, Nicolas Silverio
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Institution of defense: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/45/45134/tde-17022021-201043/
Abstract: The inherent trade-off between time and frequency resolution in conventional transforms (such as the Discrete Fourier Transform) can hinder the representation of music signals, since these transforms cannot simultaneously locate percussive events precisely in time and melodic events precisely in frequency. Adaptive representations address this limitation by varying the analysis window size used in subregions of the time-frequency plane (TFP), and can serve as input representations for algorithms for automatic music transcription, source separation and musical expressiveness analysis. The main objective of this work is the development of an efficient adaptive transform that serves as a counterpoint to traditional algorithms based on combining precomputed representations with different resolutions. The proposed Iteratively Refined Multi-resolution Spectrogram (IRMS) performs successive refinements on top of an initial spectrogram with low frequency resolution, localized in the areas of the TFP that contain musical information such as notes, harmonics and expressive elements. Its development builds on the investigation of musical information estimators and of sub-band processing techniques that allow the efficient computation of high-resolution representations within isolated subregions of the TFP. To investigate sub-band processing algorithms for this task, a GUI application was built for the detailed high-resolution visualization of specific areas of a spectrogram. A comparative experiment between different musical information estimators was conducted, with good results for the Shannon and Rényi entropies. This work also presents technical details on the integration between the detection of musically relevant subregions and their refinement via sub-band processing, which defines the final implementation of the IRMS. To evaluate the final solution, a comparative experiment based on computational cost was conducted across different time-frequency representations. The IRMS achieved execution times orders of magnitude lower than those of the other adaptive representations evaluated, and in some configurations its computational cost was competitive with that of the STFT and CQT, thus validating the proposal of an efficient alternative for adaptive representations.
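The abstract names the Shannon and Rényi entropies as musical information estimators but gives no formulas or code in this record. The following is only a minimal sketch of the textbook definitions, assuming the estimator is applied to the non-negative energies of one spectrogram tile after normalizing them to sum to 1. The function names, the choice of `alpha = 3.0` and the two toy tiles are illustrative assumptions, not taken from the thesis, and the record does not say how the IRMS turns these values into a refinement decision.

```python
import math


def shannon_entropy(energies):
    """Shannon entropy, in bits, of energies normalized to sum to 1."""
    total = sum(energies)
    if total <= 0:
        return 0.0  # an empty or silent tile carries no distribution
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)


def renyi_entropy(energies, alpha=3.0):
    """Rényi entropy of order alpha, in bits (alpha is an assumed value)."""
    if alpha == 1:
        return shannon_entropy(energies)  # limiting case alpha -> 1
    total = sum(energies)
    if total <= 0:
        return 0.0
    probs = [e / total for e in energies if e > 0]
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)


# Hypothetical tiles: energy spread evenly over 8 bins vs. packed into one bin.
flat = [1.0] * 8
peaky = [0.0] * 7 + [1.0]

if __name__ == "__main__":
    for name, tile in (("flat", flat), ("peaky", peaky)):
        print(name, shannon_entropy(tile), renyi_entropy(tile))
```

For the flat tile both entropies equal 3 bits (log2 of 8 bins), and for the peaky tile both equal 0, so low values indicate energy concentrated in few bins.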
id USP_ecef76456945c91786c3a1699d9bdcc3
oai_identifier_str oai:teses.usp.br:tde-17022021-201043
network_acronym_str USP
network_name_str Biblioteca Digital de Teses e Dissertações da USP
repository_id_str
dc.title.none.fl_str_mv Efficient adaptive multiresolution representation of music signals
Representação eficiente adaptativa multiresolução de sinais musicais
title Efficient adaptive multiresolution representation of music signals
author Figueiredo, Nicolas Silverio
author_facet Figueiredo, Nicolas Silverio
author_role author
dc.contributor.none.fl_str_mv Queiroz, Marcelo Gomes de
dc.contributor.author.fl_str_mv Figueiredo, Nicolas Silverio
dc.subject.por.fl_str_mv Adaptive representation
Automatic music transcription
Computação sonora e musical
Multiresolution representation
Representação adaptativa
Representação multi-resolução
Sound and music computing
Transcrição automática de música
publishDate 2020
dc.date.none.fl_str_mv 2020-12-14
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://www.teses.usp.br/teses/disponiveis/45/45134/tde-17022021-201043/
url https://www.teses.usp.br/teses/disponiveis/45/45134/tde-17022021-201043/
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv
dc.rights.driver.fl_str_mv Release the content for public access.
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Release the content for public access.
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.coverage.none.fl_str_mv
dc.publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
publisher.none.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP
dc.source.none.fl_str_mv
reponame:Biblioteca Digital de Teses e Dissertações da USP
instname:Universidade de São Paulo (USP)
instacron:USP
instname_str Universidade de São Paulo (USP)
instacron_str USP
institution USP
reponame_str Biblioteca Digital de Teses e Dissertações da USP
collection Biblioteca Digital de Teses e Dissertações da USP
repository.name.fl_str_mv Biblioteca Digital de Teses e Dissertações da USP - Universidade de São Paulo (USP)
repository.mail.fl_str_mv virginia@if.usp.br|| atendimento@aguia.usp.br||virginia@if.usp.br
_version_ 1815258235192475648