Clustering business news for strategic insight: An unsupervised learning approach

DAHAKE, Gaurav V.; ALI, M. S.

doi:10.14744/sigma.2026.1967

Gaurav V. DAHAKE¹, M. S. ALI¹

¹Department of Computer Science and Engineering, Prof. Ram Meghe College of Engineering and Management, Amravati, 01128, India

Sigma J Eng Nat Sci 2026; 44(1): 55-73 DOI: 10.14744/sigma.2026.1967

Abstract

Business news serves as an important tool that helps business leaders and marketers make decisions and build strategic plans. The large volume of news articles spread across different platforms makes it difficult to find important information efficiently. This research presents an approach that uses unsupervised learning to cluster business news articles and thereby reduce information overload. The system combines K-Means and Agglomerative clustering with Bag-of-Words, TF-IDF, and BERT embeddings to group articles with similar content. Clustering performance is evaluated with the Elbow method, Silhouette Score, Calinski-Harabasz Index, and Davies-Bouldin Index to determine which combination works best. BERT embeddings combined with K-Means outperform the TF-IDF representation, reaching an accuracy of 84.46%. Pre-processing consisted of tokenization, stop word removal, and stemming to reduce noise and discard uninformative terms. The results indicate that BERT yields better clusters because it captures deeper semantic meaning in the text. The study also discusses the scalability and practical deployment of the system, and proposes future work on real-time adaptability and on ethical issues arising from data bias. The proposed solution improves users' access to relevant business news and so supports better decisions.

Keywords: BERT; Business News; Clustering Evaluation; Information Overload; K-Means; Text Clustering; Unsupervised Learning
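
As a rough illustration of the TF-IDF and K-Means part of the pipeline described in the abstract, the sketch below clusters a handful of headlines and scores the result with the Silhouette coefficient. It is a minimal, dependency-free sketch and not the authors' implementation: the headlines, the stop word list, and all function names are hypothetical, stemming and BERT embeddings are omitted, and a deterministic farthest-first initialisation is assumed so the run is repeatable. In practice a library such as scikit-learn would normally replace the hand-written parts.

```python
import math
from collections import Counter

# Hypothetical toy headlines; the paper's corpus is not reproduced here.
DOCS = [
    "bank raises interest rates as inflation climbs",
    "central bank holds interest rates amid inflation fears",
    "inflation data pushes bank to lift interest rates",
    "chipmaker unveils new smartphone processor",
    "smartphone maker launches new processor chip",
    "new smartphone processor boosts chipmaker sales",
]
# Hypothetical, deliberately tiny stop word list.
STOP_WORDS = {"as", "to", "amid", "the", "a", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return L2-normalised TF-IDF vectors (smoothed IDF) as dense lists."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokens for w in doc})
    n = len(docs)
    df = Counter(w for doc in tokens for w in set(doc))
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for doc in tokens:
        tf = Counter(doc)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(v * v for v in vec))
        vectors.append([v / norm for v in vec])
    return vectors


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=20):
    """Lloyd's algorithm with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        same = [
            dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not same:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(same) / len(same)
        b = min(
            sum(dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


vectors = tfidf_vectors(DOCS)
labels = kmeans(vectors, k=2)
score = silhouette(vectors, labels)
print(labels, round(score, 3))
```

On this toy input the two topics share no vocabulary, so the finance and technology headlines fall into separate clusters and the Silhouette score is positive. Real news text is far noisier, which is where the choice of representation compared in the abstract (Bag-of-Words, TF-IDF, BERT) becomes decisive.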