Saturday, June 2, 2018

OpenCV + tesseract and the GPU on Gentoo

Extracting text from an image


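I just wanted to give OCR a quick try. To state the result up front: the accuracy was not as good as I had hoped, and Japanese extraction did not work at all. (Maybe I was doing it wrong...)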

Setup and installing various packages


This machine has a GeForce GTX 1050 installed, so I first switch the active OpenCL and OpenGL implementations with eselect.
ugui7 ~ # eselect opencl list
Available OpenCL implementations:
  [1]   mesa
  [2]   nvidia *
ugui7 ~ # eselect opengl list
Available OpenGL implementations:
  [1]   nvidia *
  [2]   xorg-x11
ugui7 ~ # 

Incidentally, switching the OpenCL implementation to mesa makes the call into /usr/lib64/libtesseract.so.3 die with a SEGV. Probably no OpenclDevice comes back.
#0  0x00007fffefd624b6 in strlen () from /lib64/libc.so.6
#1  0x00007fffee5ab5bf in OpenclDevice::getDeviceSelection() () from /usr/lib64/libtesseract.so.3
#2  0x00007fffee5ad1f8 in OpenclDevice::InitOpenclRunEnv_DeviceSelection(int) () from /usr/lib64/libtesseract.so.3
#3  0x00007fffee5ad25b in OpenclDevice::InitEnv() () from /usr/lib64/libtesseract.so.3
#4  0x00007fffee3b2e6a in tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, GenericVector<STRING> const*, GenericVector<STRING> const*, bool) () from /usr/lib64/libtesseract.so.3
#5  0x00007ffff38ae61e in cv::text::OCRTesseract::create(char const*, char const*, char const*, int, int) () from /usr/lib64/libopencv_text.so.3.4
#6  0x000055555555785d in main () at ocrtesseract.cpp:16

Back to the topic: the packages look like this.
ugui7 ~ # emerge -pv app-text/tesseract media-libs/opencv

 * IMPORTANT: config file '/etc/portage/package.keywords' needs updating.
 * See the CONFIGURATION FILES and CONFIGURATION FILES UPDATE TOOLS
 * sections of the emerge man page to learn how to update config files.

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] app-text/tesseract-3.05.01::gentoo  USE="doc jpeg opencl png tiff -examples -math -osd -scrollview -static-libs -training -webp" L10N="ja -ar -bg -ca -chr -cs -da -de -el -es -fi -fr -he -hi -hu -id -it -ko -lt -lv -nl -no -pl -pt -ro -ru -sk -sl -sr -sv -th -tl -tr -uk -vi -zh-CN -zh-TW" 0 KiB
[ebuild   R   ~] media-libs/opencv-3.4.1-r2:0/3.4.1::gentoo  USE="contrib contrib_dnn eigen ffmpeg gtk ieee1394 jpeg jpeg2k opencl opengl openmp png python tesseract threads tiff -contrib_cvv -contrib_hdf -contrib_sfm -contrib_xfeatures2d -cuda -debug -dnn_samples -examples -gdal -gflags -glog -gphoto2 -gstreamer (-ipp) -java -lapack -libav -openexr -pch -qt5 -testprograms -v4l -vaapi -vtk -webp -xine" ABI_X86="32 (64) (-x32)" CPU_FLAGS_X86="sse sse2 -avx -avx2 -fma3 -popcnt -sse3 -sse4_1 -sse4_2 -ssse3" PYTHON_TARGETS="python2_7 python3_6 -python3_4 -python3_5" 0 KiB

Total: 2 packages (2 reinstalls), Size of downloads: 0 KiB

 * IMPORTANT: 36 news items need reading for repository 'gentoo'.
 * Use eselect news read to view new items.

ugui7 ~ # 

Trying the extraction


Here is the image.



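Changing eng to jpn in the OCRTesseract::create call should enable Japanese recognition, but the extraction did not work for me, so I verified the behavior with the English data instead.
#include <cstdio>
#include <iostream>
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>
#include <opencv2/text.hpp>

using namespace std;

int main(void)
{
    // cv::imread returns an empty Mat when the file cannot be read.
    auto image = cv::imread("test.jpg");
    if (image.empty()) {
        fprintf(stderr, "Failed to read test.jpg\n");
        return 1;
    }
    // imread loads pixels in BGR order, so convert with COLOR_BGR2GRAY.
    cv::Mat gray;
    cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);

    string result;
    vector<cv::Rect> boxes;
    vector<string> words;
    vector<float> confidences;
    printf("Initialize OCRTesseract...\n");
    // Arguments: tessdata directory, language, character whitelist (none),
    // OCR engine mode, page segmentation mode.
    auto ocr = cv::text::OCRTesseract::create("/usr/share/tessdata", "eng", NULL, cv::text::OEM_DEFAULT, cv::text::PSM_AUTO);
    // boxes, words and confidences are filled in parallel, one entry per word.
    ocr->run(gray, result, &boxes, &words, &confidences);

    cout << " String              | Posistion  | Size       | confidences" << endl;
    cout << "---------------------+------------+------------+------------" << endl;
    for (size_t i = 0; i < boxes.size(); i++) {
        printf("%-20s | (%3d, %3d) | (%3d, %3d) | %f\n",
               words[i].c_str(),
               boxes[i].x, boxes[i].y,
               boxes[i].width, boxes[i].height,
               confidences[i]);
    }
    cout << endl << "Result:\n-------" << endl;
    cout << result;

    return 0;
}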

Then the build, and the output of the run.
cuomo@ugui7 ~/opencv $ gcc -g -O0 `pkg-config opencv --cflags --libs` -lstdc++ ocrtesseract.cpp
cuomo@ugui7 ~/opencv $ ./a.out 
Initialize OCRTesseract...
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1050 score is 0.087704
[DS] Device[2] 0:(null) score is 0.372587
[DS] Selected Device[1]: "GeForce GTX 1050" (OpenCL)
 String              | Posistion  | Size       | confidences
---------------------+------------+------------+------------
The                  | ( 39,  54) | ( 37,  17) | 87.536118
first                | ( 91,  54) | ( 63,  17) | 71.520576
step                 | (169,  56) | ( 50,  19) | 85.904022
is                   | (234,  54) | ( 23,  17) | 73.084877
always               | (273,  54) | ( 75,  21) | 78.761841
the                  | (364,  54) | ( 37,  17) | 88.950020
hardest              | (417,  54) | ( 97,  17) | 46.006390
There                | ( 39,  80) | ( 63,  17) | 85.972023
is                   | (117,  80) | ( 23,  17) | 73.084877
no                   | (157,  85) | ( 23,  12) | 85.654243
royal                | (196,  80) | ( 62,  21) | 79.272545
road                 | (274,  80) | ( 48,  17) | 85.654243
to                   | (338,  82) | ( 24,  15) | 85.466003
learning             | (377,  80) | (111,  21) | 46.006390
It s                 | ( 39, 105) | ( 49,  18) | 26.470589
never                | (105, 111) | ( 61,  12) | 85.544647
too                  | (182, 108) | ( 37,  15) | 85.466003
late                 | (234, 106) | ( 50,  17) | 78.761841
to                   | (299, 108) | ( 24,  15) | 85.466003
learn                | (338, 106) | ( 72,  17) | 46.006390
Kuso                 | ( 39, 134) | ( 50,  15) | 85.498596
ka2                  | (104, 132) | ( 36,  17) | 72.482178
Dnaeha               | (156, 132) | (111,  17) | 46.006390

Result:
-------
The first step is always the hardest 
There is no royal road to learning 
It s never too late to learn 

Kuso ka2 Dnaeha

Plainly written English sentences were extracted fine, but it seems to fall apart when something like "?" is in there.
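One way to deal with doubtful words is to drop everything below a confidence cutoff. The following is only a minimal sketch and was not part of the program I ran. filter_by_confidence is a name I made up for illustration, and it assumes the words and confidences vectors filled by ocr->run are parallel, one entry per word.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not an OpenCV API): keeps only the words whose
// confidence is at least min_confidence. words and confidences are
// assumed to be parallel vectors as filled by OCRTesseract::run.
std::vector<std::string> filter_by_confidence(
    const std::vector<std::string>& words,
    const std::vector<float>& confidences,
    float min_confidence)
{
    std::vector<std::string> kept;
    // If the vectors differ in length, only the common prefix is examined.
    const std::size_t n = std::min(words.size(), confidences.size());
    for (std::size_t i = 0; i < n; i++) {
        if (confidences[i] >= min_confidence) {
            kept.push_back(words[i]);
        }
    }
    return kept;
}
```

With the run above, a cutoff of 50 would drop "It s" (26.47), but it would also drop correctly read words such as "hardest" (46.01), so the threshold would need tuning per image.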

Is it really using the GPU?


The x11-drivers/nvidia-drivers-396.24 package ships a command called nvidia-smi, so I used it to check.
cuomo@ugui7 ~ $ nvidia-smi -l 1 -i 0
Sat Jun  2 10:00:37 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24                 Driver Version: 396.24                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0  On |                  N/A |
| 35%   35C    P0    N/A /  75W |    244MiB /  1999MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       445      G   /usr/bin/X                                   189MiB |
|    0     23555      C   ./a.out                                       43MiB |
+-----------------------------------------------------------------------------+
...

The Type is C, so a.out does appear to be running as a Compute Process. The OCR side, however, is not as good as I had hoped.
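To check the Type column without reading the table by eye, the process rows can be parsed. This is a minimal sketch, assuming the row layout shown in the nvidia-smi output above and process names without spaces. ProcessRow, parse_process_row and is_compute_process are hypothetical names of mine, not part of any NVIDIA tool.

```cpp
#include <sstream>
#include <string>

// One row of the "Processes" table as printed by nvidia-smi above.
struct ProcessRow {
    int gpu = -1;
    int pid = -1;
    std::string type;    // "C" (compute), "G" (graphics) or "C+G"
    std::string name;    // assumed to contain no spaces
    std::string memory;  // e.g. "43MiB"
};

// Returns true and fills row when line looks like
// "|    0     23555      C   ./a.out      43MiB |".
// Header and separator lines make it return false and leave row untouched.
bool parse_process_row(const std::string& line, ProcessRow& row)
{
    std::istringstream in(line);
    std::string bar;
    ProcessRow tmp;
    if (!(in >> bar) || bar != "|") {
        return false;
    }
    if (!(in >> tmp.gpu >> tmp.pid >> tmp.type >> tmp.name >> tmp.memory)) {
        return false;
    }
    row = tmp;
    return true;
}

// True when the Type column marks the process as using a compute context.
bool is_compute_process(const ProcessRow& row)
{
    return row.type.find('C') != std::string::npos;
}
```

Feeding it the a.out row above gives pid 23555 with type C, while the /usr/bin/X row comes back as type G.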
