Literary heritage of the 19th–20th centuries: classification of raster images for intellectual analysis and thematic modeling of the corpus of handwritten texts

E.N. Penskaya, L.V. Khachaturian
UDC 82(0.032):004

DOI 10.20339/PhS.5-23.160     

 

Penskaya Elena N.,

Doctor of Philology, Professor

National Research University “Higher School of Economics”;

Head of the Group of the Center for Interdisciplinary Research

Moscow Institute of Physics and Technology

ORCID: 0000-0003-2469-584X

e-mail: e.penskaya@gmail.com

Khachaturian Lyubov V.,

Candidate of Culturology, Associate Professor

National Research University “Higher School of Economics”;

Senior Scientific Researcher of the Center for Interdisciplinary Research

Moscow Institute of Physics and Technology

ORCID: 0000-0002-2689-5186

e-mail: rgali2010@yandex.ru

 

The article examines current trends in working with digital forms of the handwritten heritage of Russian literature from the second half of the 19th century to the mid-20th century. The formation of virtual archives is analyzed as a gradual accumulation of research “big data”: an unrecognized array of raster documents containing tens of thousands of digital forms of archival documents. The authors propose new approaches to classifying raster images of handwritten documents for use in intelligent analysis systems, experimental methods for visualizing archival documents, and previously unused search engine capabilities. Particular attention is paid to the architectonics of the manuscript: the transition from the graphic elements of a raster image to semantic ones, which makes it possible to apply data mining techniques to an unrecognized data array.

Keywords: manuscript heritage, digital form, bitmap image, new methods, manuscript architectonics, big data, data mining.

 

References

1. Moretti F. Dal’nee chtenie / per. s angl. A. Vdovina, O. Sobchuka, A. Sheli; nauch. red. per. I. Kushnareva. Moscow: Izd-vo Instituta Gaidara, 2016. 352 s.

2. Cohen M. The sentimental education of the novel. Princeton: Princeton University Press, 1999. 219 p.

3. Sobranie P.N. i S.P. Luknitskikh // RO IRLI RAN. F. 754. Op. 1.

4. Sobranie P.N. i S.P. Luknitskikh // Avtograf. XX vek: portal. URL: http://gumilev.literature-archive.ru/ru/digital-archive/stihotvoreniya-i... (01.12.2022).

5. Lipkina A.L., Mestetskiy L.M. Klassifikatsiia bukv v izobrazheniiakh na osnove mediannogo predstavleniia // Geometric Modeling. Computer Graphics in Education. 2018. URL: https://www.graphicon.ru/html/2018/papers/362-368.pdf (01.12.2022).

6. Lipkina A., Mestetskiy L. Grapheme approach to recognizing letters based on medial representation // Proceedings of the 14th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. 2019. Vol. 4. No. 1. P. 351–358.

7. Borodkin L.I. Digital history: Primenenie tsifrovykh media v sokhranenii kul’turnogo naslediia? // Metodologicheskie problemy istoricheskoi informatiki: informatsionnyi biulleten’. 2012. T. 1. No. 1. S. 14–21.

8. Borodkin L.I. Virtual’naia rekonstruktsiia istoricheskogo gorodskogo landshafta: problemy mezhdistsiplinarnogo sinteza i ikh reshenie // Istoriko-kul’turnoe nasledie v tsifrovom izmerenii: materialy Mezhdunar. nauch. konf., g. Perm, 20–22 oktiabria 2021 g. Perm, 2021. 210 s. URL: http://www.psu.ru/files/docs/science/books/sborniki/istoriko-kulturnoe-n... (01.12.2022).

9. Yumasheva Yu.Yu. Metodicheskie rekomendatsii po elektronnomu kopirovaniiu arkhivnykh dokumentov i upravleniiu poluchennym informatsionnym massivom. Moscow: VNIIDAD, 2012. 125 s. URL: https://archives.gov.ru/documents/rekomend_el-copy-archival-documents.shtml (01.12.2022).

10. Bogomolov N.A., Gaiduk V.L. Valerii Briusov: dnevnik 1890 g. / predisl. N.A. Bogomolova, podgot. teksta i primech. V.L. Gaiduk i N.A. Bogomolova // Studia Literarum. 2020. T. 5. No. 3. S. 328–357.

11. Ob”edinennyi elektronnyi arkhiv Ivana Bunina. URL: http://www.bunin-rgali.ru (01.12.2022).

12. Virtual’nyi arkhiv Anny Akhmatovoi. URL: http://www.akhmatova-rgali.ru (01.12.2022).

13. Ob”edinennyi arkhiv Viach. Ivanova. URL: http://www.ivanov-rgali.ru/ (01.12.2022).

14. Avtograf. XX vek: elektronnyi arkhiv russkoi literatury. URL: http://literature-archive.ru (01.12.2022).

15. Ob”edinennyi tsifrovoi arkhiv rukopisei F.M. Dostoevskogo. URL: https://dostoevskyarchive.pushdom.ru/about (01.12.2022).

16. Lavrov A. “U nas vse — tselina: kuda ni kopni, vse vpervye” // Arzamas. 2022. 3 avgusta. URL: https://arzamas.academy/mag/1108-lavrov (01.12.2022).

17. Lavrov A.V. Teksty i kommentarii: iz materialov k istorii russkoi literatury pervoi treti XX veka. St. Petersburg: Pushkinskii Dom, 2018. 528 s.

18. Guseinov V.N. “Literaturnyi arkhiv kak kul’turnaia praktika i sotsial’nyi opyt”: materialy Mezhdunar. nauch.-prakt. konf. // Vestnik Moskovskogo universiteta. Ser. 9: Filologiia. 2023 (v pechati).

19. ISAD (G): osnovnoi mezhdunarodnyi standart arkhivnogo opisaniia / per. s angl.; gl. red. per. E.D. Zhabko. 2-e izd. St. Petersburg: Prezidentskaia biblioteka imeni B.N. El’tsina, 2011. 247 s.

20. Pilshchikov I.A. Sem’ besed o filologii i Digital Humanities: interv’iu i diskussii (2015–2021). Moscow: MGU, 2022. 192 s.

21. Venediktova T.D. Khitroumnyi puteshestvennik // Novoe literaturnoe obozrenie. 2018. No. 2. S. 82–88.

22. Leonov V.P. Dal’nee chtenie kak strategiia tochnogo bibliografovedeniia // Nauchnye i tekhnicheskie biblioteki. 2019. No. 10. S. 56–67.