Automation of the marked-up data sets formation for machine learning based on simulation modeling

Andrei A. Smirnov
Doctor of Military Sciences, Military Telecommunications Academy named after S. M. Budyonniy (VAS), Assistant Professor, 3, Tikhoretsky pr., Saint Petersburg, 194064, Russia, SPIN code: 8559-4689, AuthorID: 850972

Alexander M. Kudriavtsev
Doctor of Military Sciences, Professor, VAS, Professor, 3, Tikhoretsky pr., Saint Petersburg, 194064, Russia, SPIN code: 4031-3294, AuthorID: 847484

Received September 19, 2023

The paper considers the problem of generating marked-up (labeled) data sets for training artificial intelligence systems that are being prepared for autonomous operation under anticipated new conditions, where forming feature descriptions of objects and situations with a known target variable is impossible or very difficult. As a general approach to the problem, it is proposed to build and use a software test bed that generates training and control samples, checks the effectiveness of various machine learning methods on them, and forms sets of informative features of objects and phenomena together with pairings of a feature vector implementation and a classification (clustering, regression) method. The results of systematizing scientific approaches to defining the types of machine learning are presented, along with the main machine learning methods used in data mining. Using the problem of assessing the dynamics of change in the radio-electronic environment as an example, the paper shows which properties of real feature descriptions of objects and phenomena significantly affect the quality of the solution. To form training samples that adequately reflect these properties, it is proposed to use simulation modeling systems that implement an agent-based approach to model construction. An example problem statement and the construction of such a model are presented; the model produces marked-up data sets that correspond to the simulated environmental conditions.
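
The idea of obtaining marked-up samples from an agent-based simulation can be illustrated with a minimal sketch. It is not the authors' AnyLogic model: it is a plain Python stand-in, and the emitter classes, their nominal parameters, the noise levels, and the names `EmitterAgent`, `simulate`, and `split` are all assumptions made for illustration. The point it shows is that every simulated observation carries a known target label, because the label is the type of the agent that produced it.

```python
import random
from dataclasses import dataclass

# Hypothetical emitter classes with illustrative nominal parameters
# (assumed values, not taken from the paper).
EMITTER_TYPES = {
    "radar": {"freq_mhz": 3000.0, "power_dbm": 60.0},
    "radio": {"freq_mhz": 150.0, "power_dbm": 40.0},
}


@dataclass
class EmitterAgent:
    kind: str             # ground-truth label, known because we created the agent
    active: bool = False  # whether the emitter is transmitting on this tick

    def step(self, rng: random.Random) -> None:
        # Toggle the on/off state with a fixed (assumed) probability per tick.
        if rng.random() < 0.3:
            self.active = not self.active

    def observe(self, rng: random.Random) -> tuple[float, float]:
        # Noisy measurement of the nominal parameters, as a sensor might record it.
        nominal = EMITTER_TYPES[self.kind]
        return (
            rng.gauss(nominal["freq_mhz"], 5.0),
            rng.gauss(nominal["power_dbm"], 3.0),
        )


def simulate(n_agents: int = 10, n_ticks: int = 50, seed: int = 0) -> list:
    """Run the simulation and return rows of (freq_mhz, power_dbm, label)."""
    rng = random.Random(seed)
    kinds = sorted(EMITTER_TYPES)
    agents = [EmitterAgent(kind=rng.choice(kinds)) for _ in range(n_agents)]
    rows = []
    for _ in range(n_ticks):
        for agent in agents:
            agent.step(rng)
            if agent.active:
                freq, power = agent.observe(rng)
                rows.append((freq, power, agent.kind))
    return rows


def split(rows: list, control_fraction: float = 0.25, seed: int = 0):
    """Shuffle the rows and split them into training and control samples."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - control_fraction))
    return shuffled[:cut], shuffled[cut:]


rows = simulate()
train, control = split(rows)
```

Changing the agent parameters or the seed yields a new data set for different simulated conditions, and the training and control samples can then be passed to whichever classification, clustering, or regression method is under evaluation.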

Key words
Information processing, data analysis, agent-based approach, AnyLogic.


Bibliographic description
Smirnov, A.A. and Kudriavtsev, A.M. (2024), "Automation of the marked-up data sets formation for machine learning based on simulation modeling", Robotics and Technical Cybernetics, vol. 12, no. 2, pp. 109-117, DOI: 10.31776/RTCJ.12204. (in Russian).

UDC identifier


