Class: OCRResult

2022/10/22 · 老猫

ocr.OCRResult

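The element type of the array returned by OCR.detect. Each object holds the recognition confidence, the text content, the text bounds, the text rotation, and the confidence of that rotation.

An OCRResult object represents one piece of text recognized by OCR, together with its position in the image and how reliable the recognition is. Call click() to click the recognized text's position directly.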

Constructors

constructor

• new OCRResult(javaObject, bounds, text, confidence, rotation, rotationConfidence)

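Creates an OCR result object.

Parameters

Name	Type	Description
`javaObject`	`JavaObject`	The raw Java object behind the OCR result
`bounds`	`Rect`	The area the recognized text occupies in the image
`text`	`string`	The recognized text
`confidence`	`number`	Confidence of the recognized text
`rotation`	`number`	Rotation angle of the recognized text in the image
`rotationConfidence`	`number`	Confidence of the detected rotation angle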

Properties

javaObject

• Readonly javaObject: JavaObject

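The raw Java object behind the OCR result. It is of little use with the official PaddleOCR engine, but other OCR engines may expose extra information through it, such as line, paragraph, or word segmentation.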

bounds

• Readonly bounds: Rect

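The area the recognized text occupies in the image. Use left, top, right, and bottom to read its position, centerX and centerY for its center point, and width and height for the size of the text region.

Property details

bounds.left - X coordinate of the left edge of the text region
bounds.top - Y coordinate of the top edge of the text region
bounds.right - X coordinate of the right edge of the text region
bounds.bottom - Y coordinate of the bottom edge of the text region
bounds.centerX - X coordinate of the center of the text region (read-only)
bounds.centerY - Y coordinate of the center of the text region (read-only)
bounds.width - Width of the text region (read-only)
bounds.height - Height of the text region (read-only)

Example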

"nodejs";
const { createOCR } = require("ocr");
const { requestScreenCapture } = require("media_projection");

async function main() {
  const ocr = await createOCR();
  const capturer = await requestScreenCapture();
  const img = await capturer.nextImage();
  const results = await ocr.detect(img);

  for (const result of results) {
    const bounds = result.bounds;
    console.log(`Text: ${result.text}`);
    console.log(`Position: (${bounds.left}, ${bounds.top})`);
    console.log(`Size: ${bounds.width} x ${bounds.height}`);
    console.log(`Center: (${bounds.centerX}, ${bounds.centerY})`);
    console.log(`Bottom-right corner: (${bounds.right}, ${bounds.bottom})`);
  }

  ocr.release();
  capturer.stop();
}
main();
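
The derived properties can be understood in terms of the four edges. The sketch below assumes the Rect follows the usual convention (width is right minus left, the center is the midpoint of the edges); the exact rounding used by the real Rect is not covered here. deriveBounds and containsPoint are hypothetical helpers, not part of the ocr module.

```javascript
// Hypothetical helper: recomputes the derived values from the four edges
// of a Rect-like object. Assumes plain midpoints with no rounding.
function deriveBounds({ left, top, right, bottom }) {
  return {
    width: right - left,
    height: bottom - top,
    centerX: (left + right) / 2,
    centerY: (top + bottom) / 2,
  };
}

// Hypothetical helper: true when the point (x, y) lies inside the rectangle.
// Treating the right and bottom edges as exclusive is an assumption.
function containsPoint({ left, top, right, bottom }, x, y) {
  return x >= left && x < right && y >= top && y < bottom;
}

// A stand-in for result.bounds, so the sketch runs without a device.
const sample = { left: 100, top: 200, right: 300, bottom: 260 };
console.log(deriveBounds(sample));
console.log(containsPoint(sample, 150, 230));
```

A check like containsPoint can be handy for keeping only the results that fall inside a known screen region, such as a toolbar.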

text

• Readonly text: string

The text recognized by OCR.

Example

"nodejs";
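const { createOCR } = require("ocr");
const { requestScreenCapture } = require("media_projection");

async function main() {
  const ocr = await createOCR();
  const capturer = await requestScreenCapture();
  // Capture a screenshot to run recognition on
  const img = await capturer.nextImage();
  const results = await ocr.detect(img);

  for (const result of results) {
    console.log(`Recognized text: ${result.text}`);
  }

  ocr.release();
  capturer.stop();
}
main();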
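
Recognized text may differ from the on-screen label in spacing or letter case, so an exact comparison can miss a match. The sketch below shows one tolerant way to search a result array by text. normalizeText and findByText are hypothetical helpers, and the plain objects stand in for real OCRResult instances.

```javascript
// Hypothetical helper: removes all whitespace and lowercases the text
// so that small recognition differences do not block a match.
function normalizeText(text) {
  return text.replace(/\s+/g, "").toLowerCase();
}

// Hypothetical helper: returns every result whose normalized text
// contains the normalized query.
function findByText(results, query) {
  const wanted = normalizeText(query);
  return results.filter((r) => normalizeText(r.text).includes(wanted));
}

// Stand-ins for OCRResult objects; only the text property is used here.
const fakeResults = [
  { text: " Sign In " },
  { text: "Settings" },
  { text: "sign in with email" },
];
console.log(findByText(fakeResults, "Sign in").map((r) => r.text));
```

Stripping all whitespace is a deliberate trade-off: it tolerates split or padded words, but it also lets a query match across word boundaries.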

confidence

• Readonly confidence: number

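Confidence of the recognized text, in the range [0, 1]. The closer the value is to 1, the more accurate and reliable the result.

Confidence levels

0.9 - 1.0: very high confidence; the result is essentially accurate
0.7 - 0.9: fairly high confidence; the result is usually accurate
0.5 - 0.7: medium confidence; the result may contain errors
0.0 - 0.5: low confidence; the result may be inaccurate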
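
The bands above can be turned into a small classifier. This is a sketch only: confidenceBand is a hypothetical helper, and because the listed ranges share their boundary values, assigning a boundary value such as 0.9 to the higher band is an assumption.

```javascript
// Hypothetical helper: maps a confidence in [0, 1] to one of the bands
// listed above. Boundary values go to the higher band (an assumption).
function confidenceBand(confidence) {
  if (confidence >= 0.9) return "very high";
  if (confidence >= 0.7) return "high";
  if (confidence >= 0.5) return "medium";
  return "low";
}

for (const value of [0.95, 0.75, 0.6, 0.2]) {
  console.log(value, confidenceBand(value));
}
```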

Example

"nodejs";
const { createOCR } = require("ocr");
const { requestScreenCapture } = require("media_projection");

async function main() {
  const ocr = await createOCR();
  const capturer = await requestScreenCapture();
  const img = await capturer.nextImage();
  const results = await ocr.detect(img);

  // Print only the results with confidence above 0.8
  const highConfidenceResults = results.filter((r) => r.confidence > 0.8);
  console.log(`High-confidence results: ${highConfidenceResults.length}`);

  for (const result of highConfidenceResults) {
    console.log(
      `Text: ${result.text}, confidence: ${(result.confidence * 100).toFixed(1)}%`
    );
  }

  // Sort a copy by confidence, highest first, so results keeps its order
  const sortedResults = [...results].sort((a, b) => b.confidence - a.confidence);
  console.log("Most confident text:", sortedResults[0]?.text);

  ocr.release();
  capturer.stop();
}
main();

rotation

• Readonly rotation: number

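Rotation angle of the recognized text in the image, in the range [0, 360). Typical values are 0 and 180 degrees. This field is only meaningful when detectRotation in OCRDetectionOptions is true.

Notes

If OCRDetectionOptions.detectRotation is false (the default), this field may be inaccurate or 0
The angle is in degrees: 0 means the text is upright, 180 means it is upside down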
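
Since typical angles sit at 0 or 180, a script may want to bucket the rotation instead of comparing exact numbers. The sketch below does that with a tolerance. angleDistance and orientationOf are hypothetical helpers, and the 10 degree tolerance is an arbitrary assumption.

```javascript
// Hypothetical helper: smallest absolute difference between two angles,
// in degrees, taking wrap-around at 360 into account.
function angleDistance(a, b) {
  const diff = Math.abs(a - b) % 360;
  return diff > 180 ? 360 - diff : diff;
}

// Hypothetical helper: labels a rotation as "upright", "upside-down",
// or "other". The default tolerance of 10 degrees is an assumption.
function orientationOf(rotation, tolerance = 10) {
  if (angleDistance(rotation, 0) <= tolerance) return "upright";
  if (angleDistance(rotation, 180) <= tolerance) return "upside-down";
  return "other";
}

for (const angle of [0, 355, 180, 90]) {
  console.log(angle, orientationOf(angle));
}
```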

Example

"nodejs";
const { createOCR } = require("ocr");
const { requestScreenCapture } = require("media_projection");

async function main() {
  const ocr = await createOCR();
  const capturer = await requestScreenCapture();
  const img = await capturer.nextImage();

  // Enable rotation detection
  const results = await ocr.detect(img, {
    detectRotation: true,
  });

  for (const result of results) {
    if (result.rotation !== 0) {
      console.log(`Text: ${result.text}`);
      console.log(`Rotation: ${result.rotation} degrees`);
      console.log(`Rotation confidence: ${result.rotationConfidence}`);
    }
  }

  ocr.release();
  capturer.stop();
}
main();

rotationConfidence

• Readonly rotationConfidence: number

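Confidence of the detected rotation angle, in the range [0, 1]. This field is only meaningful when detectRotation in OCRDetectionOptions is true.

Notes

If OCRDetectionOptions.detectRotation is false (the default), this field may be inaccurate or 0
The closer the value is to 1, the more reliable the detected rotation angle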
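
One way to use this value is to accept the reported rotation only when its confidence is high enough. The sketch below is illustrative: trustedRotation is a hypothetical helper, the 0.8 threshold is an assumption, and falling back to 0 (upright) is a design choice rather than documented behavior.

```javascript
// Hypothetical helper: returns the reported rotation when its confidence
// reaches minConfidence, otherwise falls back to 0 (treated as upright).
function trustedRotation(result, minConfidence = 0.8) {
  return result.rotationConfidence >= minConfidence ? result.rotation : 0;
}

// Stand-ins for OCRResult objects; only the two rotation fields are used.
const confident = { rotation: 180, rotationConfidence: 0.97 };
const doubtful = { rotation: 180, rotationConfidence: 0.4 };
console.log(trustedRotation(confident));
console.log(trustedRotation(doubtful));
```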

Methods

click

▸ click(): Promise<boolean>

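Clicks the screen at the center of the OCR result's bounds in the image and returns whether the click succeeded. It is effectively equivalent to click(result.bounds.centerX, result.bounds.centerY).

Notes

This method performs the click through the accessibility service
The click position is the center of the text region, that is, (bounds.centerX, bounds.centerY)
If the screenshot no longer matches what is on screen, the click position may be wrong
Run recognition and click right after taking the screenshot, before the screen content changes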
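
The equivalence described above can be sketched as a small wrapper. clickCenter is a hypothetical helper, and clickFn stands in for whichever coordinate-click function the script has available; it is injected here so the sketch runs without a device.

```javascript
// Hypothetical helper: clicks the center of a result's bounds using the
// supplied clickFn(x, y), which is expected to return Promise<boolean>.
async function clickCenter(result, clickFn) {
  const { centerX, centerY } = result.bounds;
  return clickFn(centerX, centerY);
}

// Stand-in click function that only records the coordinates it receives.
const clicks = [];
const fakeClick = async (x, y) => {
  clicks.push([x, y]);
  return true;
};

clickCenter({ bounds: { centerX: 200, centerY: 230 } }, fakeClick).then((ok) => {
  console.log(ok, clicks);
});
```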

Example

"nodejs";
const { createOCR } = require("ocr");
const { requestScreenCapture } = require("media_projection");
const { delay } = require("lang");

async function main() {
  const ocr = await createOCR();
  const capturer = await requestScreenCapture();

  // Wait for the first screenshot
  await capturer.awaitForImageAvailable();

  // Take the latest screenshot and recognize its text
  const img = capturer.latestImage();
  const results = await ocr.detect(img);

  // Find a result containing the target text and click it
  // ("确定" is the Chinese label for "OK")
  for (const result of results) {
    if (result.text.includes("确定") || result.text.includes("OK")) {
      console.log(`Found button: ${result.text}`);
      console.log(`Position: (${result.bounds.centerX}, ${result.bounds.centerY})`);

      const success = await result.click();
      console.log(`Click result: ${success ? "succeeded" : "failed"}`);

      if (success) {
        await delay(1000); // Give the UI time to respond
      }
      break;
    }
  }

  ocr.release();
  capturer.stop();
}
main();

More usage scenarios

"nodejs";
const { createOCR } = require("ocr");
const { requestScreenCapture } = require("media_projection");
const { delay } = require("lang");

async function main() {
  const ocr = await createOCR();
  const capturer = await requestScreenCapture();

  // Scenario 1: find a specific text and click it
  const img = await capturer.nextImage();
  const results = await ocr.detect(img);

  const targetText = "登录"; // Chinese label for "Log in"
  const targetResult = results.find((r) => r.text === targetText);
  if (targetResult && targetResult.confidence > 0.8) {
    await targetResult.click();
  }

  // Scenario 2: drop low-confidence results, then click
  const highConfidenceResults = results.filter((r) => r.confidence > 0.9);
  for (const result of highConfidenceResults) {
    if (/^\d+$/.test(result.text)) {
      // Click only results made up entirely of digits
      await result.click();
      await delay(500);
    }
  }

  // Scenario 3: sort a copy by position, then click each result in turn
  const sortedResults = [...results].sort((a, b) => {
    if (a.bounds.top !== b.bounds.top) {
      return a.bounds.top - b.bounds.top; // Sort by Y coordinate first
    }
    return a.bounds.left - b.bounds.left; // Then by X coordinate
  });

  for (const result of sortedResults) {
    console.log(`Clicking: ${result.text}`);
    await result.click();
    await delay(300);
  }

  ocr.release();
  capturer.stop();
}
main();

Returns

Promise<boolean>

Returns a Promise that resolves to whether the click succeeded: true if the click succeeded, false if it failed.
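
Because the result is a boolean rather than an exception, a script can branch or retry on it. The sketch below retries a failed click a few times. clickWithRetry is a hypothetical helper, the attempt count and pause are assumptions, and the fake result stands in for a real OCRResult.

```javascript
// Hypothetical helper: calls result.click() up to `attempts` times,
// pausing between tries, and stops at the first success.
async function clickWithRetry(result, attempts = 3, pauseMs = 200) {
  for (let i = 0; i < attempts; i++) {
    if (await result.click()) return true;
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, pauseMs));
    }
  }
  return false;
}

// Stand-in result whose click() fails a given number of times, then succeeds.
function makeFlakyResult(failures) {
  let calls = 0;
  return {
    click: async () => ++calls > failures,
    callCount: () => calls,
  };
}

const flaky = makeFlakyResult(2);
clickWithRetry(flaky, 3, 10).then((ok) => {
  console.log(ok, flaky.callCount());
});
```

Every retry clicks the same coordinates, so this only helps with transient failures. If the screen has changed, taking a new screenshot and running detection again is the safer path.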