Development of an AI/CV algorithm for recording and tracking user actions on the SIT Alemira Virtual Labs platform

Abstract

This paper describes the development and evaluation of a Computer Vision solution to the
User Interface Component Detection problem. To solve this problem, test data sets were
generated (in total, more than 60,000 images, including augmented copies produced by
resizing, color changes, displacement, and stretching), and 17 different object-detection
models were developed and tested. The most successful of these was a three-stage composite
model that recognizes six types of User Interface Components (button, checkbox, folder,
icon, text, toggle-switch). The first stage is a fast binary classifier that detects
whether an object is present in the image. The second stage is a classifier that
determines the object's type. The third stage is a detector that recognizes text on the
object image. Each of the resulting models achieved an accuracy of more than 85%.
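The three-stage composition described above can be sketched as a simple pipeline. The class and function names below are illustrative assumptions, not the paper's actual implementation; in practice each stage would be a trained model (e.g. a CNN classifier or an OCR engine).

```python
# Hypothetical sketch of a three-stage composite UI-component detector.
# All names and signatures here are assumptions for illustration only.
from dataclasses import dataclass
from typing import Callable, Optional

# The six UI component types recognized by the model.
UI_CLASSES = ["button", "checkbox", "folder", "icon", "text", "toggle-switch"]

@dataclass
class Detection:
    ui_class: str          # one of UI_CLASSES
    text: Optional[str]    # recognized label text, if any

class CompositeDetector:
    def __init__(self,
                 has_object: Callable[[bytes], bool],          # stage 1: binary classifier
                 classify: Callable[[bytes], str],             # stage 2: 6-way type classifier
                 read_text: Callable[[bytes], Optional[str]]): # stage 3: text recognizer
        self.has_object = has_object
        self.classify = classify
        self.read_text = read_text

    def detect(self, image: bytes) -> Optional[Detection]:
        # Stage 1: cheap filter -- bail out early if no component is present.
        if not self.has_object(image):
            return None
        # Stage 2: determine which of the six UI component types it is.
        ui_class = self.classify(image)
        # Stage 3: run text recognition on the component image.
        return Detection(ui_class=ui_class, text=self.read_text(image))
```

The early exit in stage 1 is what makes the composition fast: the more expensive type classifier and text detector only run on images that actually contain a component.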
To work with the developed model, a simple UI was created in the form of a web
application. It provides the ability to record user activity (each user step is captured
and processed by the model described above), to manually edit the resulting activity
logs, and to "replay a log": performing the steps described in the log one by one, with
each step automatically verified against the recording using image-comparison algorithms.
The web application is implemented in a generalized form, so that it can be extended and
its functionality embedded into other applications via an API.

Subject
Computer Science