Show simple item record

dc.contributor.author: Barbon, Rafael Silva
dc.contributor.author: Akabane, Ademar Takeo
dc.date.accessioned: 2024-03-18T14:50:50Z
dc.date.available: 2024-03-18T14:50:50Z
dc.date.issued: 2022-10-26
dc.identifier.uri: http://repositorio.sis.puc-campinas.edu.br/xmlui/handle/123456789/17187
dc.description.abstract: The Internet of Things is a paradigm that interconnects smart devices through the internet to provide ubiquitous services to users. This paradigm, together with Web 2.0 platforms, generates vast amounts of textual data, so a significant challenge in this context is performing text classification automatically. State-of-the-art results have recently been obtained with language models trained from scratch on corpora built from online news. Notable examples are BERT (Bidirectional Encoder Representations from Transformers) and DistilBERT, a smaller pre-trained general-purpose language representation model. In this context, through a case study, we perform the text classification task with these two models in two languages (English and Brazilian Portuguese) on different datasets. The results show that DistilBERT's training time for English and Brazilian Portuguese was about 45% shorter than that of its larger counterpart; the model was also 40% smaller and preserved about 96% of the larger model's language comprehension skills on balanced datasets.
dc.description.sponsorship: No funding received
dc.language.iso: eng
dc.publisher: Sensors
dc.rights: Open access
dc.subject: big data
dc.subject: pre-trained model
dc.subject: BERT
dc.subject: DistilBERT
dc.subject: BERTimbau
dc.subject: DistilBERTimbau
dc.subject: transformer-based machine learning
dc.title: Towards Transfer Learning Techniques—BERT, DistilBERT, BERTimbau, and DistilBERTimbau for Automatic Text Classification from Different Languages: A Case Study
dc.title.alternative: Rumo a técnicas de aprendizagem por transferência - BERT, DistilBERT, BERTimbau e DistilBERTimbau para classificação automática de texto de diferentes idiomas: um estudo de caso
dc.type: Article
dc.contributor.institution: Pontifícia Universidade Católica de Campinas (PUC-Campinas)
dc.identifier.lattes: 9713891218812963
dc.identifier.lattes: 6781874728187325
puc.center: Not applicable
puc.graduateProgram: Sistemas de Infraestrutura Urbana
puc.embargo: Online
puc.undergraduateProgram: Not applicable

