您现在的位置：程式師世界 >> 編程語言 > >> 更多編程語言 >> Python

Python: under window10 OCR in actual combat

編輯：Python

前言

最近在做一個ocr的項目,Mainly identification Numbers and English letters,So now trying various open source based onpython語言(其實是想在pytorch框架下使用）的ocr庫

Enclosed test images as before：

easyocr

 pip install easyocr

在大量warningAfter successful installation！！！

Because I mainly identification Numbers and English letters,Using the Chinese version the effect not beautiful,It calls the principle seems to beeasyocr.Reader()The first parameter to the list will only call first,After the switch to English error rate as1/14(Only with the above measure)

import easyocr
path='C:/Users/11027/Desktop/test1.jpg'
reader = easyocr.Reader(['en'],gpu = False)
result = reader.readtext(path)
print(result)
#22G1會識別成'2261'

輸出：

Using CPU. Note: This module is much faster with a GPU.
[([[307, 67], [349, 67], [349, 121], [307, 121]], '[0]', 0.2496181348819496), ([[82.817179374673, 18.268717498692013], [226.37590341350617, -1.1571357200124375], [232.182820625327, 75.73128250130799], [88.62409658649383, 95.15713572001243]], 'TBJU', 0.8707327246665955), ([[86.65181877371504, 84.3117294627456], [289.02336666519545, 60.9225513668679], [292.34818122628496, 138.6882705372544], [89.97663333480457, 162.0774486331321]], '004057', 0.9909699998969699), ([[88.96405360576286, 157.26340370189882], [224.4601580727819, 137.28127584248975], [231.03594639423716, 210.73659629810118], [96.5398419272181, 230.71872415751025]], '2261', 0.9822619656327679)]

In order not to let you see too hard,截了一個圖

pytesseract

先去下載安裝包
然後在終端

pip install pytesseract

Give a piece of the installation is complete figure：

I started an error.

Didn't check the two points on the path：
①Environment path to increase the two system environment variables(To be honest I'm not sure whether the two are useful but nothing effect）

The specific variables increaseAccording to the location of the download.For example, I download location isD:\computervision\ocr\ocr_project\Tesseract-OCR.

The first is the addition of

第二個是加在path中

②修改pytesseract.py
This is also see their own Settings,我是直接pip install pytesseract在虛擬環境裡,So its position inD:\conda\envs\pytorch\Lib\site-packages\pytesseract

測試代碼

from PIL import Image
import pytesseract
path='C:/Users/11027/Desktop/test1.jpg'
text = pytesseract.image_to_string(path)
print(text)

Big to send！Didn't recognize anything to（是有效的,But not as for the English character and digital effects）

Retraining is too much trouble to skip

cnocr

下載的話：

pip3 install cnocr -i https://mirrors.aliyun.com/pypi/simple/

會有一堆warning,但是不影響！

And then for differentocrTasks have different solutions,Specific can see officialgithub

我用的是1.8MThe but it is still compares the other open sourceocrEffect of library is the most sweet

from cnocr import CnOcr
path='C:/Users/11027/Desktop/test1.jpg'
ocr = CnOcr(rec_model_name='en_number_mobile_v2.0')
out = ocr.ocr(path)
print(out)
# import testcnocr
# print(cnocr.__file__)

All can identify to！！！