Integração semântica de dados tabulares em CSV: proposta de arcabouço comparativo de ferramentas
| Year of defense: | 2021 |
|---|---|
| Lead author: | Rafael Rocha |
| Advisor: | |
| Defense committee: | |
| Document type: | Master's thesis |
| Access type: | Open access |
| Language: | por |
| Defending institution: | Universidade Federal de Minas Gerais |
| Graduate program: | Not provided by the institution |
| Department: | Not provided by the institution |
| Country: | Not provided by the institution |
| Keywords in Portuguese: | Ciência da informação; Integração semântica; Dados tabulares; Ferramentas para integração semântica; Arcabouço comparativo; Web semântica; Dados conectados |
| Access link: | https://hdl.handle.net/1843/36618 |
| Abstract: | The semantic web represents knowledge in a form that is both human-readable and machine-readable. Linked data semantically associate concepts from different sources, and the reuse of data and vocabularies enriches information systems on the web, especially those that organize scientific research data. Such systems handle tabular data in two-dimensional grids (CSV files, Comma-Separated Values) whose metadata sits in the file header (the first line of the file). In general, CSV metadata is insufficient to enable semantic integration and interoperability, that is, the ability of systems to communicate as transparently as possible with other systems, whether similar or not. To contribute to research, CSV files must be semantically integrated and stored in data repositories. The meaning of tabular data must be made explicit so that the concepts involved are neither lost nor distorted. Semantic data integration is carried out by tools that automate the process in order to systematize and streamline the work and minimize human error. These tools differ in their characteristics and implementations, and the features available in each one affect its ability to integrate data and generate linked data for the semantic web. A data integration project can fail if the tool chosen to generate linked data lacks the features the project requires. Several comparison frameworks have been proposed to evaluate tools that generate linked data, but none of them uses a scale of values that simplifies the evaluation and the summary of the results. This research proposes a comparison framework for tools that semantically integrate tabular data in CSV. The framework's features are drawn from the scientific literature and scored with points arranged on a positive number line. Methodologically, a CSV file is run through the semantic integration process and the tools are then evaluated against the comparison framework. With the tools placed on a positive number line, it is possible to see which of them have the most suitable features for a given integration project, or which have the best-rated features. The results are useful for anyone who needs to evaluate tools for semantic data integration projects, especially in scientific research, which benefits greatly from conceptually linked data. |
| id | UFMG_4e3334e03f24bea81adc7aa38e594a52 |
|---|---|
| oai_identifier_str | oai:repositorio.ufmg.br:1843/36618 |
| network_acronym_str | UFMG |
| network_name_str | Repositório Institucional da UFMG |
| repository_id_str | |
| author | Rafael Rocha |
| author_role | author |
| dc.subject.por.fl_str_mv | Ciência da informação; Integração semântica; Dados tabulares; Ferramentas para integração semântica; Arcabouço comparativo; Web semântica; Dados conectados |
| publishDate | 2021 |
| dc.date.none.fl_str_mv | 2021-06-30T12:43:03Z 2021-06-30T12:43:03Z 2021-02-22 2025-09-08T23:35:51Z |
| dc.type.status.fl_str_mv | info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv | info:eu-repo/semantics/masterThesis |
| dc.rights.driver.fl_str_mv | http://creativecommons.org/licenses/by-nd/3.0/pt/ info:eu-repo/semantics/openAccess |
| dc.format.none.fl_str_mv | application/pdf |
| dc.publisher.none.fl_str_mv | Universidade Federal de Minas Gerais |
| dc.source.none.fl_str_mv | reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
| repository.mail.fl_str_mv | repositorio@ufmg.br |