Um método automático para estimativa da qualidade de enciclopédias colaborativas on-line: um estudo de caso sobre a wikipédia
| Year of defense: | 2009 |
|---|---|
| Main author: | Daniel Hasan Dalip |
| Advisor: | |
| Defense committee: | |
| Document type: | Master's thesis (Dissertação) |
| Access type: | Open access |
| Language: | Portuguese (por) |
| Defending institution: | Universidade Federal de Minas Gerais |
| Graduate program: | Not reported by the institution |
| Department: | Not reported by the institution |
| Country: | Not reported by the institution |
| Keywords in Portuguese: | Bibliotecas digitais; Recuperação de informação |
| Access link: | https://hdl.handle.net/1843/SLSS-7WJN62 |
Abstract: | The old dream of a universal repository containing all human knowledge and culture is becoming possible through the Internet and the Web. Moreover, this is happening with the direct, collaborative participation of people. Wikipedia is a great example: it is an enormous repository of information, free to access and edit, created collaboratively by its community. However, this large amount of information, made available democratically and virtually without any control, raises questions about its relative quality. In this work we explore a significant number of quality indicators, some of them proposed by us and used here for the first time, and study their capability to assess the quality of Wikipedia articles. Furthermore, we explore machine learning techniques to combine these quality indicators into a single assessment judgment. Through experiments, we show that the most important quality indicators are also the easiest to extract from an open digital library, namely textual features related to length, structure, and style. We were also able to determine which indicators did not contribute significantly to the quality assessment. These were, coincidentally, the most complex features, such as those based on link analysis. Finally, we compare our combination method with state-of-the-art solutions and show significant improvements in terms of effective quality prediction. |
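
As a rough illustration of what "textual features related to length, structure and style" might look like in practice, the sketch below computes a handful of such indicators from raw wiki markup. It is not the method used in the dissertation: the function name `text_features`, the chosen features, and the regular expressions are all assumptions made for this example, and the abstract does not specify which indicators or which learning algorithm were actually used.

```python
import re


def text_features(wikitext: str) -> dict:
    """Compute a few illustrative length/structure/style indicators.

    Hypothetical helper: the feature set is an assumption for this sketch,
    not the list of indicators evaluated in the dissertation.
    """
    words = re.findall(r"\w+", wikitext)
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", wikitext) if s.strip()]
    # Section headings in wiki markup look like "== Title ==".
    headings = re.findall(r"^==+[^=].*?==+\s*$", wikitext, flags=re.MULTILINE)
    # Internal links in wiki markup look like "[[Target]]".
    links = re.findall(r"\[\[[^\]]+\]\]", wikitext)
    return {
        "char_count": len(wikitext),            # length
        "word_count": len(words),               # length
        "section_count": len(headings),         # structure
        "internal_link_count": len(links),      # structure
        "avg_sentence_length": (                # style
            len(words) / len(sentences) if sentences else 0.0
        ),
    }


sample = "== History ==\nWikipedia began in 2001. It links to [[Nupedia]].\n"
features = text_features(sample)
print(features)
```

A feature dictionary like this could then be turned into a numeric vector and passed to any supervised learner trained on articles with known quality labels, which is the general idea of combining indicators into a single judgment.
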
| id |
UFMG_e3c291b72b0e3eee6efea3088654024e |
|---|---|
| oai_identifier_str |
oai:repositorio.ufmg.br:1843/SLSS-7WJN62 |
| network_acronym_str |
UFMG |
| network_name_str |
Repositório Institucional da UFMG |
| repository_id_str |
|
| dc.title.none.fl_str_mv |
Um método automático para estimativa da qualidade de enciclopédias colaborativas on-line: um estudo de caso sobre a wikipédia |
| author |
Daniel Hasan Dalip |
| author_facet |
Daniel Hasan Dalip |
| author_role |
author |
| dc.contributor.author.fl_str_mv |
Daniel Hasan Dalip |
| dc.subject.por.fl_str_mv |
Bibliotecas digitais; Recuperação de informação |
| topic |
Bibliotecas digitais; Recuperação de informação |
| publishDate |
2009 |
| dc.date.none.fl_str_mv |
2009-04-03 2019-08-11T06:44:45Z 2019-08-11T06:44:45Z 2025-09-09T01:01:22Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
| format |
masterThesis |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/1843/SLSS-7WJN62 |
| url |
https://hdl.handle.net/1843/SLSS-7WJN62 |
| dc.language.iso.fl_str_mv |
por |
| language |
por |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.format.none.fl_str_mv |
application/pdf |
| dc.publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
| publisher.none.fl_str_mv |
Universidade Federal de Minas Gerais |
| dc.source.none.fl_str_mv |
reponame:Repositório Institucional da UFMG instname:Universidade Federal de Minas Gerais (UFMG) instacron:UFMG |
| instname_str |
Universidade Federal de Minas Gerais (UFMG) |
| instacron_str |
UFMG |
| institution |
UFMG |
| reponame_str |
Repositório Institucional da UFMG |
| collection |
Repositório Institucional da UFMG |
| repository.name.fl_str_mv |
Repositório Institucional da UFMG - Universidade Federal de Minas Gerais (UFMG) |
| repository.mail.fl_str_mv |
repositorio@ufmg.br |
| _version_ |
1856414079980142592 |