Tesseract.js: A Pure-JavaScript OCR Engine

JavaScript | 梦想屋 | 2020-4-3

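Introduction

Tesseract.js is a popular OCR engine written in pure JavaScript. The library supports more than 100 languages (including Chinese), automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js runs both in the browser and on a Node.js server.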
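
To make the bounding-box interface a little more concrete, here is a minimal sketch that filters a mocked recognition result. It assumes the result exposes a `words` array whose items carry `text`, `confidence` (0 to 100), and a `bbox` with `x0`/`y0`/`x1`/`y1`. The helper name `confidentWords` and the sample values are hypothetical, so check the field names against the version you install.

```javascript
// Hypothetical helper: keep only words recognized with enough confidence
// and convert each corner-based bbox into x / y / width / height.
function confidentWords(data, minConfidence = 60) {
  return (data.words || [])
    .filter((word) => word.confidence >= minConfidence)
    .map(({ text, bbox }) => ({
      text,
      x: bbox.x0,
      y: bbox.y0,
      width: bbox.x1 - bbox.x0,
      height: bbox.y1 - bbox.y0,
    }));
}

// Mock data standing in for a real result (all values are made up).
const sampleData = {
  words: [
    { text: 'Hello', confidence: 93, bbox: { x0: 10, y0: 20, x1: 70, y1: 40 } },
    { text: 'w0rld', confidence: 41, bbox: { x0: 80, y0: 20, x1: 140, y1: 40 } },
  ],
};

console.log(confidentWords(sampleData));
```

With a real result, the same helper could be handed the `data` object that `recognize` resolves with.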

GitHub

https://github.com/naptha/tesseract.js

Usage

# For v2 (use either npm or yarn)
npm install tesseract.js
yarn add tesseract.js

# For v1 (use either npm or yarn)
npm install tesseract.js@1
yarn add tesseract.js@1

You can bundle it with webpack or reference it directly in the browser.

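import Tesseract from 'tesseract.js';

// One-shot recognition: image source, language code, options.
Tesseract.recognize(
  'url.png',
  'eng',
  { logger: m => console.log(m) } // log progress messages
).then(({ data: { text } }) => {
  console.log(text); // the recognized text
});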
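
The `logger` callback above prints raw message objects. As a small sketch, the helper below turns such a message into a one-line progress string. It assumes each message has a `status` string and a `progress` number between 0 and 1. The function name `formatProgress` is hypothetical and not part of the library.

```javascript
// Hypothetical helper: format a logger message as "status: NN%".
// Assumes messages look like { status: 'recognizing text', progress: 0.5 }.
function formatProgress(message) {
  const status = message.status || 'working';
  const progress = typeof message.progress === 'number' ? message.progress : 0;
  // Clamp to [0, 1] so unexpected values never print outside 0-100%.
  const percent = Math.round(Math.min(Math.max(progress, 0), 1) * 100);
  return `${status}: ${percent}%`;
}

console.log(formatProgress({ status: 'recognizing text', progress: 0.5 }));
```

It could then be plugged in as `logger: m => console.log(formatProgress(m))`.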

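import { createWorker } from 'tesseract.js';

// Worker API: create a worker, load a language, recognize, then clean up.
const worker = createWorker({
  logger: m => console.log(m) // log progress messages
});

(async () => {
  await worker.load();                // load the Tesseract core
  await worker.loadLanguage('eng');   // fetch the English trained data
  await worker.initialize('eng');     // initialize the engine with that language
  const { data: { text } } = await worker.recognize('url.png');
  console.log(text);                  // the recognized text
  await worker.terminate();           // release the worker
})();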
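
Both snippets use `'eng'` only. Tesseract accepts several languages at once when their codes are joined with `+` (for example English plus Simplified Chinese, `chi_sim`). The sketch below builds such a string. The helper name `languageString` is hypothetical, used here only for illustration.

```javascript
// Hypothetical helper: build a multi-language string such as 'eng+chi_sim'.
// Trims whitespace, drops empty entries and duplicates, keeps first-seen order.
function languageString(codes) {
  const unique = [...new Set(codes.map((code) => code.trim()).filter(Boolean))];
  if (unique.length === 0) {
    throw new Error('at least one language code is required');
  }
  return unique.join('+');
}

const langs = languageString(['eng', 'chi_sim', 'eng']);
console.log(langs); // eng+chi_sim
```

In the worker example, the resulting string would be passed to both `loadLanguage` and `initialize` in place of `'eng'`.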

Use Cases

You can use it wherever you need OCR. The project lists 10 official examples:

Offline version

https://github.com/jeromewu/tesseract.js-offline

Electron version

https://github.com/jeromewu/tesseract.js-electron

Custom trained data

https://github.com/jeromewu/tesseract.js-custom-traineddata

Chrome extension

https://github.com/jeromewu/tesseract.js-chrome-extension

Chrome Extension #2: https://github.com/fxnoob/image-to-text

Vue version

https://github.com/jeromewu/tesseract.js-vue-app

React version

https://github.com/jeromewu/tesseract.js-react-app

Angular version

https://github.com/jeromewu/tesseract.js-angular-app

TypeScript version

https://github.com/jeromewu/tesseract.js-typescript

Real-time video recognition

https://github.com/jeromewu/tesseract.js-video

Summary

OCR comes up fairly often in day-to-day development. If you happen to need it, give Tesseract.js a try. Enjoy it!
