Hey @Sobia, in environments such as virtual machines, some applications cannot be automated with the usual screen scraping or UI automation techniques. For these cases UiPath Studio provides OCR activities, which scan the screen of the machine and find all the characters displayed on it. This lets you build automations based on what is visible on the screen, which simplifies automation in virtual machine environments. Automating Citrix and other remote desktop utilities relies largely on OCR-based activities, because these tools only stream an image of the desktop to the user, so normal UI selectors cannot be found. Some of the OCR-based activities are:
1. Double Click OCR Text / Click OCR Text / Hover OCR Text: These use OCR to scan the screen of the machine for a given text and perform the corresponding action relative to it. They are very useful for automating basic actions in virtual machine environments.
2. Get OCR Text: Extracts a string and its information from an indicated UI element using the OCR screen scraping method. By default, the Google OCR engine is used, but you can easily swap it for the Abbyy or Microsoft engine. This activity returns a string variable containing the text found in the UI element, and a TextInfo variable that contains the screen coordinates of all the found words.
3. Find OCR Text Position: Searches for a given string in a UI element and returns a UIElement variable that contains the string and the position where it was found. This activity is useful for locating UI elements relative to text on the screen.
4. OCR Text Exists: Uses OCR to check whether a given text is found in a UI element and returns a Boolean variable that is true if the text exists and false otherwise. This activity is useful in all types of text-based automation, because it lets you make decisions based on whether a given string is displayed.
5. OCR Engines: OCR engines such as Google OCR, Google Cloud OCR, Microsoft OCR, Microsoft Cloud OCR and Abbyy Cloud OCR are available as separate activities. Each one extracts text and its position from a provided image using its own engine, and can be used together with the other OCR activities. As output, each returns an IEnumerable<KeyValuePair<Rectangle,String>> variable, which contains the extracted words and their on-screen coordinates, and a string variable which contains the extracted text.
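
To make the idea behind these activities a bit more concrete, here is a minimal Python sketch of what can be done with a list of words and their screen rectangles, which is roughly the shape of the engine output described above. This is not UiPath code and does not call any UiPath API: `Rect`, `find_text_position`, `text_exists`, `click_point` and the sample data are all hypothetical names made up for illustration, and the matching rule (case-insensitive, whole words, in reading order) is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned box in screen pixels (hypothetical stand-in for Rectangle)."""
    x: int
    y: int
    width: int
    height: int


# One OCR result: (bounding box, recognized word), assumed to be in reading order.
OcrWord = Tuple[Rect, str]


def find_text_position(words: List[OcrWord], phrase: str) -> Optional[Rect]:
    """Return the box enclosing the first run of words that matches `phrase`."""
    target = phrase.lower().split()
    if not target:
        return None
    for start in range(len(words) - len(target) + 1):
        window = words[start:start + len(target)]
        if [text.lower() for _, text in window] == target:
            # Merge the boxes of the matched words into one enclosing box.
            left = min(r.x for r, _ in window)
            top = min(r.y for r, _ in window)
            right = max(r.x + r.width for r, _ in window)
            bottom = max(r.y + r.height for r, _ in window)
            return Rect(left, top, right - left, bottom - top)
    return None


def text_exists(words: List[OcrWord], phrase: str) -> bool:
    """True if `phrase` appears among the OCR words, False otherwise."""
    return find_text_position(words, phrase) is not None


def click_point(box: Rect, offset_x: int = 0, offset_y: int = 0) -> Tuple[int, int]:
    """Center of `box`, shifted by an offset, e.g. to hit a field next to a label."""
    return (box.x + box.width // 2 + offset_x,
            box.y + box.height // 2 + offset_y)


# Made-up sample data standing in for what an OCR engine might return.
sample_words: List[OcrWord] = [
    (Rect(100, 50, 40, 12), "User"),
    (Rect(145, 50, 45, 12), "Name"),
    (Rect(100, 80, 60, 12), "Password"),
]

label = find_text_position(sample_words, "user name")
if label is not None:
    # Aim 120 px to the right of the label's center, where an input box might be.
    print(click_point(label, offset_x=120))
```

The point of the sketch is only that once text comes back with coordinates, "does this text exist", "where is it" and "click relative to it" are all simple operations on the same list, which is why these activities work even when the application is just a streamed image.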