Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance
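Year of defense: | 2021 |
---|---|
Main author: | Tairone Nunes Magalhães |
Advisor: | Mauricio Alves Loureiro |
Examination committee: | Jose Augusto Mannis, Hugo Bastos de Paula, Flávio Luiz Schiavoni, Sérgio Freire Garcia, Thiago de Almeida Magalhães Campolina, Davi Alves Mota |
Document type: | Thesis |
Access type: | Open access |
Language: | eng |
Defense institution: | Universidade Federal de Minas Gerais |
Graduate program: | Programa de Pós-Graduação em Música |
Department: | Not provided by the institution |
Country: | Brazil |
Keywords in Portuguese: | Performance musical; Música e tecnologia; Música para clarinete; Processamento de som por computador; Editor de audio digital; Aprendizado do computador |
Access link: | http://hdl.handle.net/1843/44588 |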
Abstract: | The earliest empirical studies on music performance date back to the end of the nineteenth century, when the first mechanical devices capable of recording the actions of pianists on the instrument (key presses) were invented. Since then, many technologies that open up new possibilities for collecting data from musical performances have been invented or developed, including techniques for extracting information directly from audio recordings. These techniques, which have been driven by the fast-paced technological development in computer-related fields over the last decades, are the subject matter of this thesis. We introduce a new software library called Iracema, which contains techniques for extracting patterns of manipulation of timing, energy, and spectral content from monophonic audio recordings. The clarinet is the instrument chosen for the baseline experiments and models, but most of the presented techniques should also work for other monophonic instruments. One of the most critical steps in studying musical performances is detecting note onsets, because our perception of timing is strongly tied to them. We pay special attention to this topic, proposing an interactive web interface for the precise manual annotation of note onsets and conducting an experiment to assess the typical measurement error involved in this kind of task for clarinet recordings. We also propose an annotated dataset of solo clarinet recordings containing approximately 23 minutes of audio and a total of 3551 note onsets. Using this dataset, we train a convolutional neural network to obtain a model for automatic note onset detection specifically on clarinet recordings and compare its results to those of other onset detection models. Finally, we discuss a case study using recordings of a clarinet excerpt by a few different clarinetists to demonstrate the use of the proposed library. |
id |
UFMG_f61778010d0400db621fe19c171dfe54 |
---|---|
oai_identifier_str |
oai:repositorio.ufmg.br:1843/44588 |
network_acronym_str |
UFMG |
network_name_str |
Repositório Institucional da UFMG |
repository_id_str |
|
spelling |
Tairone Nunes Magalhães (author, http://lattes.cnpq.br/9554212200728779); Mauricio Alves Loureiro (advisor, http://lattes.cnpq.br/9480268986413015); Jose Augusto Mannis; Hugo Bastos de Paula; Flávio Luiz Schiavoni; Sérgio Freire Garcia; Thiago de Almeida Magalhães Campolina; Davi Alves Mota
Funding: CAPES - Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Files: Tairone Nunes Magalhaes - Tese Final.pdf (application/pdf); license_rdf (application/rdf+xml; charset=utf-8); license.txt (text/plain; charset=utf-8)
Repository OAI URL: https://repositorio.ufmg.br/oai |
dc.title.pt_BR.fl_str_mv |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
dc.title.alternative.pt_BR.fl_str_mv |
Iracema: da detecção de onsets de notas ao desenvolvimento de uma biblioteca de análise de conteúdo de áudio para o estudo empírico da performance musical |
title |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
spellingShingle |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance; Tairone Nunes Magalhães; Empirical musicology; Note onset detection; Music performance; Music information retrieval; Machine learning; Performance musical; Música e tecnologia; Música para clarinete; Processamento de som por computador; Editor de audio digital; Aprendizado do computador |
title_short |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
title_full |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
title_fullStr |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
title_full_unstemmed |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
title_sort |
Iracema: from note onset detection challenges towards an audio content analysis library for the empirical study of music performance |
author |
Tairone Nunes Magalhães |
author_facet |
Tairone Nunes Magalhães |
author_role |
author |
dc.contributor.advisor1.fl_str_mv |
Mauricio Alves Loureiro |
dc.contributor.advisor1Lattes.fl_str_mv |
http://lattes.cnpq.br/9480268986413015 |
dc.contributor.referee1.fl_str_mv |
Jose Augusto Mannis |
dc.contributor.referee2.fl_str_mv |
Hugo Bastos de Paula |
dc.contributor.referee3.fl_str_mv |
Flávio Luiz Schiavoni |
dc.contributor.referee4.fl_str_mv |
Sérgio Freire Garcia |
dc.contributor.referee5.fl_str_mv |
Thiago de Almeida Magalhães Campolina; Davi Alves Mota |
dc.contributor.authorLattes.fl_str_mv |
http://lattes.cnpq.br/9554212200728779 |
dc.contributor.author.fl_str_mv |
Tairone Nunes Magalhães |
contributor_str_mv |
Mauricio Alves Loureiro; Jose Augusto Mannis; Hugo Bastos de Paula; Flávio Luiz Schiavoni; Sérgio Freire Garcia; Thiago de Almeida Magalhães Campolina; Davi Alves Mota |
dc.subject.por.fl_str_mv |
Empirical musicology; Note onset detection; Music performance; Music information retrieval; Machine learning |
topic |
Empirical musicology; Note onset detection; Music performance; Music information retrieval; Machine learning; Performance musical; Música e tecnologia; Música para clarinete; Processamento de som por computador; Editor de audio digital; Aprendizado do computador |
dc.subject.other.pt_BR.fl_str_mv |
Performance musical; Música e tecnologia; Música para clarinete; Processamento de som por computador; Editor de audio digital; Aprendizado do computador |
description |
The earliest empirical studies on music performance date back to the end of the nineteenth century, when the first mechanical devices capable of recording the actions of pianists on the instrument (key presses) were invented. Since then, many technologies that open up new possibilities for collecting data from musical performances have been invented or developed, including techniques for extracting information directly from audio recordings. These techniques, which have been driven by the fast-paced technological development in computer-related fields over the last decades, are the subject matter of this thesis. We introduce a new software library called Iracema, which contains techniques for extracting patterns of manipulation of timing, energy, and spectral content from monophonic audio recordings. The clarinet is the instrument chosen for the baseline experiments and models, but most of the presented techniques should also work for other monophonic instruments. One of the most critical steps in studying musical performances is detecting note onsets, because our perception of timing is strongly tied to them. We pay special attention to this topic, proposing an interactive web interface for the precise manual annotation of note onsets and conducting an experiment to assess the typical measurement error involved in this kind of task for clarinet recordings. We also propose an annotated dataset of solo clarinet recordings containing approximately 23 minutes of audio and a total of 3551 note onsets. Using this dataset, we train a convolutional neural network to obtain a model for automatic note onset detection specifically on clarinet recordings and compare its results to those of other onset detection models. Finally, we discuss a case study using recordings of a clarinet excerpt by a few different clarinetists to demonstrate the use of the proposed library. |
publishDate |
2021 |
dc.date.issued.fl_str_mv |
2021-04-28 |
dc.date.accessioned.fl_str_mv |
2022-08-25T16:41:38Z |
dc.date.available.fl_str_mv |
2022-08-25T16:41:38Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/1843/44588 |
url |
http://hdl.handle.net/1843/44588 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/pt/ info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
http://creativecommons.org/licenses/by-nc-nd/3.0/pt/ |
eu_rights_str_mv |
openAccess |
dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.publisher.program.fl_str_mv |
Programa de Pós-Graduação em Música |
dc.publisher.initials.fl_str_mv |
UFMG |
dc.publisher.country.fl_str_mv |
Brasil |
publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG; instname:Universidade Federal de Minas Gerais (UFMG); instacron:UFMG |
instname_str |
Universidade Federal de Minas Gerais (UFMG) |
instacron_str |
UFMG |
institution |
UFMG |
reponame_str |
Repositório Institucional da UFMG |
collection |
Repositório Institucional da UFMG |
bitstream.url.fl_str_mv |
https://repositorio.ufmg.br/bitstream/1843/44588/3/Tairone%20Nunes%20Magalhaes%20-%20Tese%20Final.pdf https://repositorio.ufmg.br/bitstream/1843/44588/4/license_rdf https://repositorio.ufmg.br/bitstream/1843/44588/5/license.txt |
bitstream.checksum.fl_str_mv |
ae0a2869973f34fd27a79592ffd260ca cfd6801dba008cb6adbd9838b81582ab cda590c95a0b51b4d15f60c9642ca272 |
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
repository.mail.fl_str_mv |
|
_version_ |
1793890948055302144 |