How to recognize/identify different types of objects on the screen (textfields, labels, buttons, etc.) from a screenshot?

2023-09-11 23:25:16 Author: 泡芙.

I would like to make something that can recognize different objects on the screen. Let's say I take a screenshot of a window with textfields, labels and buttons. I would like to pass in the image and have it distinguish one from the other. In other words, it should put the name 'textfield' on top of the position where the textfields are located, 'button' on top of buttons and 'label' on top of labels.

Here is a sample image from the internet, to visualize a 'registration window': https://m.xsw88.com/allimgs/daicuo/20230911/4452.png.jpg

I would like to do this in Java, but I'm unsure if this is even possible. Does anyone have any ideas where I should start looking? Edge detection? Feature detection? OCR/ICR?

Does this already exist? Has anyone come across something like this before?

Could someone please point me in the right direction? I would highly appreciate it.

Thank you! :)

Recommended answer

This is how I would work on it:

A) Identification/Segmentation. Without knowing your data, you might be fine with something like "Find a rectangle (or something close to it, since edges are rounded) of less than half of your window's area" (depends on your data).
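
As a rough illustration of step A, here is a minimal sketch in plain Java (standard library only, no OpenCV). It assumes widgets are drawn with dark outlines on a light background, groups dark pixels into connected regions with a flood fill, and keeps the bounding boxes that cover less than half of the image area. The class name `WidgetSegmenter`, the method names and the `darkThreshold` parameter are made up for this example. Real screenshots will likely need more care, since text glyphs also show up as small regions.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenCV or any other library.
class WidgetSegmenter {

    // Returns bounding boxes of connected dark regions whose box covers
    // less than half of the image area.
    static List<Rectangle> findCandidates(BufferedImage img, int darkThreshold) {
        int w = img.getWidth();
        int h = img.getHeight();
        boolean[] seen = new boolean[w * h];
        List<Rectangle> boxes = new ArrayList<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (seen[y * w + x] || !isDark(img, x, y, darkThreshold)) {
                    continue;
                }
                Rectangle box = floodFill(img, x, y, darkThreshold, seen);
                if (2L * box.width * box.height < (long) w * h) {
                    boxes.add(box);
                }
            }
        }
        return boxes;
    }

    // A pixel counts as dark when its average RGB value is below the threshold.
    static boolean isDark(BufferedImage img, int x, int y, int threshold) {
        int rgb = img.getRGB(x, y);
        int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
        return gray < threshold;
    }

    // Iterative 4-connected flood fill; returns the bounding box of the region.
    static Rectangle floodFill(BufferedImage img, int startX, int startY,
                               int threshold, boolean[] seen) {
        int w = img.getWidth();
        int h = img.getHeight();
        int minX = startX, maxX = startX, minY = startY, maxY = startY;
        int[] dx = {1, -1, 0, 0};
        int[] dy = {0, 0, 1, -1};
        Deque<int[]> stack = new ArrayDeque<>();
        stack.push(new int[] {startX, startY});
        seen[startY * w + startX] = true;
        while (!stack.isEmpty()) {
            int[] p = stack.pop();
            minX = Math.min(minX, p[0]);
            maxX = Math.max(maxX, p[0]);
            minY = Math.min(minY, p[1]);
            maxY = Math.max(maxY, p[1]);
            for (int i = 0; i < 4; i++) {
                int nx = p[0] + dx[i];
                int ny = p[1] + dy[i];
                if (nx < 0 || ny < 0 || nx >= w || ny >= h) {
                    continue;
                }
                if (seen[ny * w + nx] || !isDark(img, nx, ny, threshold)) {
                    continue;
                }
                seen[ny * w + nx] = true;
                stack.push(new int[] {nx, ny});
            }
        }
        return new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
    }
}
```

Calling `WidgetSegmenter.findCandidates(screenshot, 128)` would return one `Rectangle` per outlined region. The window frame itself is dropped by the "less than half of the area" rule from the answer.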

B) Classification. Personally, I'd scale every object you found to size 100*100 (or whatever) and compare it with sample data (yes, you can scale a mini checkbox to that size. It won't look pretty, but it doesn't matter how it looks). Either "brute force" (which is why I scaled) or some nice classification algorithm. (Don't use neural networks, go with an SVM or nearest neighbour.) For classification, I'd mostly look at histograms and shape factors/moments inside the rectangle. If the text confuses the data, get rid of it with some morphology before classification.
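
To make step B more concrete, here is a small sketch of the "scale, then compare with samples" idea in plain Java. It scales each crop to 100*100, builds a 16-bin grayscale histogram and picks the label of the closest stored sample (1-nearest-neighbour with squared Euclidean distance). This is only one possible reading of the advice above: it ignores shape factors/moments, and the class name `WidgetClassifier`, the bin count and the distance measure are assumptions made for the example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical nearest-neighbour classifier for illustration only.
class WidgetClassifier {
    static final int SIZE = 100; // every crop is scaled to SIZE x SIZE
    static final int BINS = 16;  // number of grayscale histogram bins (assumed)

    private final List<double[]> features = new ArrayList<>();
    private final List<String> labels = new ArrayList<>();

    // Stores a labelled example crop, e.g. ("button", "textfield", "label").
    void addSample(BufferedImage crop, String label) {
        features.add(histogram(crop));
        labels.add(label);
    }

    // Returns the label of the stored sample with the closest histogram.
    String classify(BufferedImage crop) {
        if (features.isEmpty()) {
            throw new IllegalStateException("no samples added");
        }
        double[] query = histogram(crop);
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < features.size(); i++) {
            double dist = 0;
            double[] f = features.get(i);
            for (int b = 0; b < BINS; b++) {
                double d = f[b] - query[b];
                dist += d * d;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return labels.get(best);
    }

    // Scales the crop to SIZE x SIZE and returns a normalised gray histogram.
    static double[] histogram(BufferedImage crop) {
        BufferedImage scaled = new BufferedImage(SIZE, SIZE, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(crop, 0, 0, SIZE, SIZE, null);
        g.dispose();
        double[] hist = new double[BINS];
        for (int y = 0; y < SIZE; y++) {
            for (int x = 0; x < SIZE; x++) {
                int rgb = scaled.getRGB(x, y);
                int gray = (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
                hist[gray * BINS / 256] += 1.0 / (SIZE * SIZE);
            }
        }
        return hist;
    }
}
```

In use, you would call `addSample` once per hand-labelled crop and then `classify` on each box found in step A. With only a histogram as the feature, widgets with similar fill colours will be confused, which is where the shape factors mentioned above would come in.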

Textfields are a bit tricky, but for that, I'd use some OCR library and look at the entire picture. (Personally, I've done well with IMAQ, but it's commercial.) If the text is outside a box, you've got yourself a label.
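
The "text outside a box is a label" rule can be sketched as a simple containment test. The example below assumes you already have text bounding boxes from whatever OCR library you choose, plus the widget boxes from step A. Both inputs and the class name `LabelFinder` are hypothetical; no real OCR API is shown here.

```java
import java.awt.Rectangle;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: separates free-standing text (labels) from text
// that sits inside a detected widget box.
class LabelFinder {

    // textBoxes: regions where an OCR step found text (assumed input).
    // widgetBoxes: rectangles found by the segmentation step (assumed input).
    static List<Rectangle> findLabels(List<Rectangle> textBoxes,
                                      List<Rectangle> widgetBoxes) {
        List<Rectangle> labels = new ArrayList<>();
        for (Rectangle text : textBoxes) {
            boolean insideWidget = false;
            for (Rectangle widget : widgetBoxes) {
                if (widget.contains(text)) {
                    insideWidget = true;
                    break;
                }
            }
            if (!insideWidget) {
                labels.add(text); // text outside every box is treated as a label
            }
        }
        return labels;
    }
}
```

Text that falls inside a box would then be button captions or textfield contents, which the classification step can tell apart.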

You should probably look into OpenCV.