Classification of Turkish tweets by document vectors and investigation of the effects of parameter changes on classification success

BİLGİN, Metin

Metin BİLGİN ¹

¹Bursa Uludağ University, Department of Computer Engineering, BURSA

Sigma J Eng Nat Sci 2020; 38(3): 1581-1592

Abstract

Natural language processing is a field of artificial intelligence that has been gaining popularity in recent years. Given the growing volume of data in today's world, inferring sentiment from texts on a given topic and classifying documents are of great importance. Understanding and interpreting written text is an ability that belongs to humans. However, it is possible to draw inferences from texts or to classify them using natural language processing, a sub-branch of machine learning and artificial intelligence. In this study, Turkish tweets were classified, and the effect of parameter changes on classification success was investigated, using the two methods of the algorithm referred to in the literature as document vectors. The study found that the DBoW (Distributed Bag of Words) method yielded higher accuracy values than the DM (Distributed Memory) method, and that the DBoW-NS (Negative Sampling) architecture yielded higher accuracy values than the other architectures.

Keywords: Text classification, natural language processing, document vectors, doc2vec, sentiment analysis, deep learning.
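
The comparison described in the abstract (DM versus DBoW, with negative sampling as one output-layer choice, under varying parameters) can be sketched as a grid of training configurations. This is only an illustrative sketch under stated assumptions: the keyword names `dm`, `hs`, `negative`, `vector_size` and `epochs` follow gensim's `Doc2Vec` constructor, which the abstract does not name as the library used; hierarchical softmax (HS) is assumed to be the alternative to NS; and the specific values for vector size, epochs and number of negative samples are hypothetical, not taken from the study.

```python
from itertools import product

# Assumption: keyword names follow gensim's Doc2Vec constructor.
# The abstract does not say which library or values were used.
ARCHITECTURES = {"DM": 1, "DBoW": 0}  # value passed as `dm`

# Assumption: HS (hierarchical softmax) is the alternative to NS.
OUTPUT_LAYERS = {
    "HS": {"hs": 1, "negative": 0},   # hierarchical softmax only
    "NS": {"hs": 0, "negative": 5},   # negative sampling, 5 noise words (illustrative)
}


def build_configs(vector_sizes=(100,), epoch_counts=(20,)):
    """Return a dict mapping a readable name to Doc2Vec-style keyword arguments."""
    configs = {}
    for (arch, dm), (out, out_kwargs), size, epochs in product(
        ARCHITECTURES.items(), OUTPUT_LAYERS.items(), vector_sizes, epoch_counts
    ):
        name = f"{arch}-{out}-d{size}-e{epochs}"
        configs[name] = {"dm": dm, "vector_size": size, "epochs": epochs, **out_kwargs}
    return configs


# Hypothetical parameter values, chosen only to show the shape of the grid.
configs = build_configs(vector_sizes=(100, 200), epoch_counts=(20, 40))
for name in sorted(configs):
    print(name, configs[name])
```

If gensim is installed, each dictionary could be passed on as `Doc2Vec(**cfg)` and the resulting document vectors fed to a classifier, so that the accuracy of every configuration is measured on the same tweet set.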