Markup of scientific and technical texts in the context of corpus development

Iu.I. Butenko, G.O. Lukyanova
UDC 81'4

https://doi.org/10.20339/PhS.1-22.014

 

Butenko Iulia I.,

Candidate of Technical Sciences, Associate Professor of the

Romance-Germanic Languages Department

Bauman Moscow State Technical University

e-mail: iubutenko@bmstu.ru

Lukyanova Galina O.,

Candidate of Philology, Head of the Foreign Languages Department

Peoples’ Friendship University of Russia

e-mail: go_lukyanova.rudn@mail.ru

 

The article examines the specifics of marking up scientific and technical texts when building a corpus of highly specialized texts. The types of scientific and technical texts that serve as sources for the corpus are listed. These texts are analyzed in terms of the markup of textual elements at different levels. The need to introduce interlevel types of markup is substantiated, and the importance of structural markup in creating such a corpus is emphasized. The structural elements of scientific and technical texts to be included in the corpus are listed. The current state of automatic term extraction from scientific and technical texts is analyzed. It is shown that the greatest difficulty lies in marking multicomponent terminological units in the corpus. We identify literary terms, which may include various letters, symbols, numbers, or their combinations, as objects that require the development of additional processing tools. References are analyzed as a factor influencing the classification and rubrication of scientific and technical texts. The need to study the types of references and the ways of marking them automatically in the corpus is substantiated, as is the need for separate marking of examples in scientific and technical texts.

Keywords: scientific and technical text, corpus, markup, hierarchically structured text, multi-component term.

 
