Hybrid algorithm for object recognition in computer vision system

Sergey O. Vlasov
Postgraduate Student, Keldysh Institute of Applied Mathematics of the Russian Academy of Sciences (Keldysh IAM RAS), 4, Miusskaya pl., Moscow, 125047, Russia, ORCID: 0009-0003-3144-0973

Andrey A. Boguslavsky
Doctor of Physical and Mathematical Sciences, Associate Professor, Keldysh IAM RAS, Leading Research Scientist, 4, Miusskaya pl., Moscow, 125047, Russia, ORCID: 0000-0001-7560-339X

Sergey M. Sokolov
Doctor of Physical and Mathematical Sciences, Professor, Keldysh IAM RAS, Chief Research Scientist, 4, Miusskaya pl., Moscow, 125047, Russia, ORCID: 0000-0001-6923-2510


Received January 11, 2025

Abstract
The paper addresses the problem of recognizing a man-made object in video sequences using a hybrid algorithm that combines a traditional computer vision method, keypoint detection, with a neural network approach. The simultaneous use of different approaches to detecting and classifying an object of interest in an image is one way to improve the quality of computer vision systems, reducing the number of recognition errors while maintaining the data processing speed. The operation scheme of the proposed algorithm is presented. The choice of convolutional neural network architectures and of the keypoint detector type, which make real-time operation feasible, is described. The implemented algorithm searches for keypoints in areas of the original image that presumably contain an object of interest and are identified by a convolutional neural network pre-trained on a data set. At the next stage of the algorithm, the keypoints detected in these areas are compared with reference fragments. Based on this comparison, a conclusion is drawn about the presence of an object of interest in the considered area of the original image. A series of computational experiments was conducted to evaluate the efficiency of the proposed algorithm in terms of accuracy and execution time. The influence of the adjustable parameters and of the convolutional neural network architecture on the quality of the method was estimated.
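The two-stage decision described in the abstract (CNN-proposed regions confirmed by comparing keypoints against reference fragments) can be sketched as follows. The descriptor format, the Euclidean distance, Lowe's ratio test and the match-count threshold are illustrative assumptions rather than the authors' exact procedure; in practice the descriptors would be produced by a keypoint detector (e.g. ORB or SIFT) run on each region proposed by the pre-trained convolutional network.

```python
import math

def _dist(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def count_good_matches(region_desc, ref_desc, ratio=0.75):
    """For each descriptor extracted in a CNN-proposed region, find its two
    nearest reference descriptors and accept the match only if the best
    distance is clearly smaller than the second best (Lowe's ratio test)."""
    good = 0
    for d in region_desc:
        dists = sorted(_dist(d, r) for r in ref_desc)
        if len(dists) >= 2 and dists[0] < ratio * dists[1]:
            good += 1
    return good

def confirm_object(region_desc, ref_desc, min_matches=10, ratio=0.75):
    """Second-stage decision: the region is reported as containing the
    object of interest only if enough keypoints match the reference."""
    return count_good_matches(region_desc, ref_desc, ratio) >= min_matches
```

The `min_matches` and `ratio` parameters play the role of the adjustable parameters whose influence on accuracy the paper evaluates experimentally.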

Key words
Computer vision systems, object detection in images, hybrid algorithm.

Bibliographic description
Vlasov, S.O., Boguslavsky, A.A. and Sokolov, S.M. (2025), "Hybrid algorithm for object recognition in computer vision system", Robotics and Technical Cybernetics, vol. 13, no. 2, pp. 121-128, EDN: GEBKJB. (in Russian).

EDN
GEBKJB

UDC identifier
004.93'1

References

  1. Ardalani, N., Pal, S. and Gupta, P. (2022), “DeepFlow: A cross-stack pathfinding framework for distributed AI systems”, arXiv, arXiv:2211.03309, DOI: 10.48550/arXiv.2211.03309, available at: https://arxiv.org/abs/2211.03309 (Accessed 21 May 2024).
  2. Taspinar, Y. and Murat, S. (2020), “Object recognition with hybrid deep learning methods and testing on embedded systems”, International Journal of Intelligent Systems and Applications in Engineering, vol. 8, pp. 71-77, DOI: 10.18201/ijisae.2020261587.
  3. Varga, L.A. and Zell, A. (2021), “Tackling the background bias in sparse object detection via cropped windows”, Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pp. 2768-2777.
  4. Strotov, V.V. and Zhgutov, P.E. (2022), “Combining several algorithms to increase the accuracy of pedestrian detection and localization”, Proceedings of the International Conference on Computer Graphics and Vision “Graphicon”, Ryazan, Russia, no. 32, pp. 628-635, DOI: 10.20948/graphicon-2022-628-635. (in Russian).
  5. Volkova, M.A. (2024), Algorithmic support for processing segmented sensory information for tracking moving objects in real time, Ph.D. Thesis, System analysis, control and information processing, MIREA — Russian Technological University, Moscow, Russia. (in Russian).
  6. Avramenko, V.S. and Chichkov, E.S. (2024), “Object recognition by unmanned aerial vehicles based on the MobileNet neural network”, Proceedings of the International Scientific and Technological Conference EXTREME ROBOTICS, Saint Petersburg, Russia, no. 1(34), pp. 261-266. (in Russian).
  7. Xu, Z., Hrustic, E. and Vivet, D. (2020), “CenterNet heatmap propagation for real-time video object detection”, Proceedings of the European Conference on Computer Vision (ECCV), Springer, Cham, pp. 220-234, DOI: 10.1007/978-3-030-58595-2_14.
  8. Beresnev, A.P., Zoev, I.V. and Markov, N.G. (2018), “Research on convolutional neural networks of YOLO class for mobile object detection system”, Proceedings of the International Conference on Computer Graphics and Vision “Graphicon”, Tomsk, Russia, no. 28, pp. 196-199. (in Russian).
  9. Kaehler, A. and Bradski, G. (2017), Izuchaem OpenCV 3 [Learning OpenCV 3], Translated by Slinkin, A.A., DMK Press, Moscow, Russia. (in Russian).
  10. GitHub, “TensorFlow 2 Detection Model Zoo, Google”, available at: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md (Accessed 20 May 2024).
  11. Tan, M.X., Pang, R.M. and Le, Q.V. (2020), “EfficientDet: Scalable and efficient object detection”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10778-10787, DOI: 10.1109/CVPR42600.2020.01079.
  12. Khryashchev, V.V. and Priorov, A.L. (2023), “The use of the EfficientDet neural network in the task of detecting stomach pathologies on video images of endoscopic examination”, Modeli, sistemy, seti v ekonomike, tekhnike, prirode i obshchestve [Models, systems, networks in economics, technology, nature and society], no. 2, pp. 185–192, DOI: 10.21685/2227-8486-2023-2-12. (in Russian).
  13. Song, S., Jing, J., Huang, Y. and Shi, M. (2021), “EfficientDet for fabric defect detection based on edge computing”, Journal of Engineered Fibers and Fabrics, vol. 16, no. 1, p. 155892502110083, DOI: 10.1177/1558925021100834.
  14. Reddy, K.S.S., Ramesh, G., Praveen, J., Surekha, P. and Sharma, A. (2023), “A real-time automated system for object detection and facial recognition”, E3S Web of Conferences, no. 430, DOI: 10.1051/e3sconf/202343001076.
  15. Duan, K., Bai, S., Xie, L., Qi, H. et al. (2022), “CenterNet++ for object detection”, arXiv, arXiv:2204.08394, DOI: 10.48550/arXiv.2204.08394, available at: https://arxiv.org/abs/2204.08394 (Accessed 28 April 2024).
  16. Suma, K.G., Sunitha, G., Karnati, R., Aruna, E.R. et al. (2024), “CETR: CenterNet-Vision transformer model for wheat head detection”, Journal of Autonomous Intelligence, vol. 7, no. 3, pp. 1-12, DOI: 10.32629/jai.v7i3.1189.
  17. Ma, G., Wang, X., Liu, J., Chen, W. et al. (2022), “Intelligent detection of foreign matter in coal mine transportation belt based on convolution neural network”, Scientific Programming, 3, DOI: 10.1155/2022/9740622.
  18. Lim, J. and Astrid, M. (2019), “Small object detection using context and attention”, arXiv, arXiv:1912.06319, DOI: 10.48550/arXiv.1912.06319, available at: https://arxiv.org/abs/1912.06319 (Accessed 28 April 2024).
  19. Burdukowsky, S.O. (2022), “The efficiency of the TensorFlow models in the application to the task of detection of eyes in the photo”, Vestnik NSUEM, vol. 2, pp. 228-238, DOI: 10.34020/2073-6495-2022-2-228-238. (in Russian).
  20. GitHub, “Drone-tracking-datasets”, available at: https://github.com/CenekAlbl/drone-tracking-datasets (Accessed 27 May 2024).
  21. GitHub, “LabelImg”, available at: https://github.com/heartexlabs/labelImg (Accessed 21 May 2024).