1. On Windows, install tesseract-ocr-w64-setup-v5.0.0.20190623.exe. Note the installation directory and add it to the PATH environment variable.
Download: https://digi.bib.uni-mannheim.de/tesseract/
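2. On Linux, install the tesseract-ocr package. Adding the PPA only registers the repository, so update the package index and install the package afterwards:
sudo add-apt-repository ppa:alex-p/tesseract-ocr
sudo apt-get update
sudo apt-get install tesseract-ocr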
3. Install the pytesseract Python package:
pip install pytesseract
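Before running any OCR code, it can help to confirm that the tesseract executable is actually reachable from Python. The following is a minimal sketch using only the standard library; the helper names find_tesseract and tesseract_version are made up for this illustration and are not part of pytesseract.

```python
import shutil
import subprocess


def find_tesseract(name="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


def tesseract_version(path):
    """Return the first line printed by `tesseract --version`, or None on failure."""
    try:
        result = subprocess.run(
            [path, "--version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some builds print the version to stderr
            universal_newlines=True,
            timeout=10,
        )
    except (OSError, subprocess.SubprocessError):
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract was not found on PATH")
    else:
        print("found:", path, "|", tesseract_version(path))
```

If this reports that tesseract was not found, the PATH setting from step 1 (or the package installation from step 2) is the first thing to check.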
Try running a demo:
import pytesseract
from PIL import Image
image = Image.open("code.png")
text = pytesseract.image_to_string(image)
print(text)
You will quite likely get this error: tesseract is not installed or it's not in your path
Solution:
Open pytesseract.py and find the following code:
try:
    from PIL import Image
except ImportError:
    import Image
tesseract_cmd = 'tesseract'
Change tesseract_cmd to the full path of tesseract.exe in the Tesseract-OCR installation directory, for example r'D:\Program Files\Tesseract-OCR\tesseract.exe'. The result looks like this:
try:
    from PIL import Image
except ImportError:
    import Image
# tesseract_cmd = 'tesseract'
tesseract_cmd = r'D:\Program Files\Tesseract-OCR\tesseract.exe'
Save the change, restart Python, and run the script again.
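An edit inside the installed pytesseract.py is lost whenever the package is reinstalled or upgraded. As an alternative, the same setting can be applied from your own script by assigning to pytesseract.pytesseract.tesseract_cmd. The sketch below is only an illustration: the helper names and the C: candidate path are assumptions, and the candidate list should be adjusted to wherever Tesseract is really installed.

```python
import os
import shutil

# Candidate locations to try. The D: path is the one used in this note; the C:
# path is an assumed alternative. Adjust both to match your own installation.
CANDIDATES = [
    r"D:\Program Files\Tesseract-OCR\tesseract.exe",
    r"C:\Program Files\Tesseract-OCR\tesseract.exe",
]


def resolve_tesseract_cmd(candidates=CANDIDATES):
    """Prefer tesseract on PATH, then the first candidate file that exists."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    return None


def configure_pytesseract():
    """Point pytesseract at the resolved executable without editing the library."""
    import pytesseract  # imported lazily so the helper above works without it

    cmd = resolve_tesseract_cmd()
    if cmd is None:
        raise FileNotFoundError("tesseract executable not found")
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd
```

Calling configure_pytesseract() once at the start of the script, before image_to_string, should have the same effect as the manual edit while leaving the library file untouched.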
This library's recognition accuracy is not very high, so it is not recommended.