Predicting Facebook-Users' Personality based on Status and Linguistic Features via Flexible Regression Analysis Techniques

Published in SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing, pages 339-345, 2018

Recommended citation: Prantik Howlader, Kuntal Kumar Pal, Alfredo Cuzzocrea, and S. D. Madhu Kumar. 2018. Predicting Facebook-users' personality based on status and linguistic features via flexible regression analysis techniques. In Proceedings of the 33rd Annual ACM Symposium on Applied Computing (SAC '18). ACM, New York, NY, USA, 339-345. DOI:

[PDF] [Code]


A social media user's psychological constructs are visible in their posts and other activities, but predicting them is a challenging task. This paper explores the use of Linear Regression (LR) and Support Vector Regression (SVR) for predicting Big Five personality scores, which provide a quantitative measure of users' personality traits. We compare the regression models on two feature sets: topics extracted from Facebook users' statuses, and the same topics combined with features extracted using the Linguistic Inquiry and Word Count (LIWC) tool. We also investigate how the number of topics found by Latent Dirichlet Allocation (LDA) affects the performance of the regression models. We found that SVR with polynomial and Radial Basis Function kernels provides better results in predicting the Big Five personality traits. We also found that the mean squared error of LR increases with the number of topics, but that this increase is smaller when the additional LIWC features are included in the regression.
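The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the toy statuses, the random trait scores, the `liwc_like` matrix standing in for real LIWC output, and the helper names `topic_features` and `compare_models` are all assumptions made for the example. It fits LDA with different numbers of topics, then reports the mean squared error of LR and of SVR with polynomial and RBF kernels, with and without the extra linguistic features.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR


def topic_features(statuses, n_topics, seed=0):
    """Return one LDA topic-proportion vector per status (hypothetical helper)."""
    counts = CountVectorizer().fit_transform(statuses)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
    # For brevity the topics are fit on all statuses before the train/test split.
    return lda.fit_transform(counts)


def compare_models(features, scores, seed=0):
    """Fit LR and two SVR variants; return the held-out MSE of each (hypothetical helper)."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, scores, test_size=0.25, random_state=seed
    )
    models = {
        "LR": LinearRegression(),
        "SVR-poly": SVR(kernel="poly"),
        "SVR-rbf": SVR(kernel="rbf"),
    }
    return {
        name: mean_squared_error(y_test, model.fit(X_train, y_train).predict(X_test))
        for name, model in models.items()
    }


# Synthetic stand-in data (assumption): real inputs would be Facebook statuses,
# questionnaire-based Big Five scores, and features exported from the LIWC tool.
rng = np.random.default_rng(0)
vocab = ["party", "friends", "work", "tired", "music", "family",
         "study", "happy", "rain", "movie", "travel", "coffee"]
statuses = [" ".join(rng.choice(vocab, size=8)) for _ in range(60)]
trait_scores = rng.uniform(1.0, 5.0, size=60)  # placeholder scores for one trait
liwc_like = rng.random((60, 4))                # placeholder for LIWC features

results = {}
for n_topics in (2, 5):
    topics = topic_features(statuses, n_topics)
    results[(n_topics, "topics")] = compare_models(topics, trait_scores)
    results[(n_topics, "topics+liwc")] = compare_models(
        np.hstack([topics, liwc_like]), trait_scores
    )

for setting, mse_by_model in results.items():
    print(setting, mse_by_model)
```

Because the data here is random, the printed errors say nothing about the paper's findings; the sketch only shows how the two feature sets and the topic-count sweep fit together.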