https://doi.org/10.20339/PhS.5-17.049
Litvinova Tatiana A.,
PhD in Philology, researcher,
Head of Lab of Corpus Sociolinguistics and Authorship Profiling
(RusProfiling Lab)
Voronezh State Pedagogical University
e-mail: centr_rus_yaz@mail.ru
Seredin Pavel V.,
Dr.Sci. in Physics and Mathematics,
Associate professor of Department of Solid State and Nanostructures
Voronezh State University
e-mail: paul@phys.vsu.ru
Litvinova Olga A.,
Assistant lecturer, English Language Department
Voronezh State Pedagogical University
e-mail: olga_litvinova_teacher@mail.ru
Zagorovskaya Olga V.,
Dr.Sci. in Philology, Professor,
Head of the Department of Russian Language, Modern Russian and Foreign Literature
Voronezh State Pedagogical University
e-mail: olzagor@yandex.ru
Suicide is among the three most common causes of death among young people aged 15 to 24, yet methods for diagnosing an individual's propensity for suicide have not been developed so far. One promising line of research in this area is the quantitative analysis of speech. Researchers abroad use automatic text processing (natural language processing) and machine learning to study texts written by people who died by suicide (mainly suicide notes) and to build models that classify a text as written or not written by such a person. However, developing techniques to identify individuals who are prone to suicide requires analyzing not only suicide notes, which are usually short, but also other texts by the same authors. The aim of the present paper is to construct a mathematical model that classifies a text as written or not written by a person who died by suicide on the basis of the numerical values of its linguistic parameters. The resulting model achieved 67.5% accuracy.
Keywords: text classification methods, statistical methods in linguistics, mathematical model, predictors of suicide, corpus, computational linguistics.
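The general idea of classifying a text from numerical linguistic parameters can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the abstract does not name the parameters or the type of model, so the three features, the word lists, the choice of logistic regression, and the toy feature vectors below are all hypothetical and are not the authors' method or data.

```python
import math
import re

# Hypothetical word lists; the abstract does not specify the actual parameters.
FIRST_PERSON = {"i", "me", "my", "mine", "myself"}
NEGATIONS = {"no", "not", "never", "nothing"}


def linguistic_parameters(text):
    """Map a text to a vector of numerical linguistic parameters (illustrative only)."""
    words = re.findall(r"[a-z']+", text.lower())
    n = max(len(words), 1)  # avoid division by zero on empty input
    return [
        sum(w in FIRST_PERSON for w in words) / n,   # share of first-person pronouns
        sum(w in NEGATIONS for w in words) / n,      # share of negation words
        sum(len(w) for w in words) / n / 10.0,       # mean word length, scaled by 1/10
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic regression with plain per-sample gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log-loss with respect to the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (target class) or 0 (control class) for one feature vector."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


def accuracy(w, b, X, y):
    """Share of feature vectors whose predicted label matches the true label."""
    return sum(predict(w, b, xi) == yi for xi, yi in zip(X, y)) / len(y)


# Invented feature vectors, used only to show the mechanics; not real data.
X_TOY = [
    [0.30, 0.15, 0.40], [0.05, 0.02, 0.50],
    [0.25, 0.10, 0.42], [0.08, 0.03, 0.48],
    [0.28, 0.12, 0.38], [0.04, 0.01, 0.52],
]
Y_TOY = [1, 0, 1, 0, 1, 0]

w, b = train_logistic(X_TOY, Y_TOY)
print("training accuracy on toy data:", accuracy(w, b, X_TOY, Y_TOY))
```

In a real study the accuracy would be estimated on texts held out from training, for example by cross-validation, rather than on the training vectors as in this toy run.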