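Extracting text from an image
I just wanted to try plain OCR, but to give the result up front: the accuracy was not as good as I had hoped, and Japanese extraction did not work at all. (Maybe I was doing it wrong...)
Setup and installing various things
This machine has a GeForce GTX 1050 installed, so I use eselect to switch OpenCL and OpenGL to implementations that actually work.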
ugui7 ~ # eselect opencl list
Available OpenCL implementations:
  [1]   mesa
  [2]   nvidia *
ugui7 ~ # eselect opengl list
Available OpenGL implementations:
  [1]   nvidia *
  [2]   xorg-x11
ugui7 ~ #
Incidentally, if I switch the OpenCL implementation to mesa, the call into /usr/lib64/libtesseract.so.3 dies with a SEGV. Probably no OpenclDevice is being returned.
#0  0x00007fffefd624b6 in strlen () from /lib64/libc.so.6
#1  0x00007fffee5ab5bf in OpenclDevice::getDeviceSelection() () from /usr/lib64/libtesseract.so.3
#2  0x00007fffee5ad1f8 in OpenclDevice::InitOpenclRunEnv_DeviceSelection(int) () from /usr/lib64/libtesseract.so.3
#3  0x00007fffee5ad25b in OpenclDevice::InitEnv() () from /usr/lib64/libtesseract.so.3
#4  0x00007fffee3b2e6a in tesseract::TessBaseAPI::Init(char const*, char const*, tesseract::OcrEngineMode, char**, int, GenericVector<STRING> const*, GenericVector<STRING> const*, bool) () from /usr/lib64/libtesseract.so.3
#5  0x00007ffff38ae61e in cv::text::OCRTesseract::create(char const*, char const*, char const*, int, int) () from /usr/lib64/libopencv_text.so.3.4
#6  0x000055555555785d in main () at ocrtesseract.cpp:16
Back on topic, the packages look like this:
ugui7 ~ # emerge -pv app-text/tesseract media-libs/opencv

 * IMPORTANT: config file '/etc/portage/package.keywords' needs updating.
 * See the CONFIGURATION FILES and CONFIGURATION FILES UPDATE TOOLS
 * sections of the emerge man page to learn how to update config files.

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild   R    ] app-text/tesseract-3.05.01::gentoo  USE="doc jpeg opencl png tiff -examples -math -osd -scrollview -static-libs -training -webp" L10N="ja -ar -bg -ca -chr -cs -da -de -el -es -fi -fr -he -hi -hu -id -it -ko -lt -lv -nl -no -pl -pt -ro -ru -sk -sl -sr -sv -th -tl -tr -uk -vi -zh-CN -zh-TW" 0 KiB
[ebuild   R   ~] media-libs/opencv-3.4.1-r2:0/3.4.1::gentoo  USE="contrib contrib_dnn eigen ffmpeg gtk ieee1394 jpeg jpeg2k opencl opengl openmp png python tesseract threads tiff -contrib_cvv -contrib_hdf -contrib_sfm -contrib_xfeatures2d -cuda -debug -dnn_samples -examples -gdal -gflags -glog -gphoto2 -gstreamer (-ipp) -java -lapack -libav -openexr -pch -qt5 -testprograms -v4l -vaapi -vtk -webp -xine" ABI_X86="32 (64) (-x32)" CPU_FLAGS_X86="sse sse2 -avx -avx2 -fma3 -popcnt -sse3 -sse4_1 -sse4_2 -ssse3" PYTHON_TARGETS="python2_7 python3_6 -python3_4 -python3_5" 0 KiB

Total: 2 packages (2 reinstalls), Size of downloads: 0 KiB

 * IMPORTANT: 36 news items need reading for repository 'gentoo'.
 * Use eselect news read to view new items.

ugui7 ~ #
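Trying the extraction
Here is the image.
Changing "eng" to "jpn" in the OCRTesseract::create call is supposed to enable Japanese recognition, but the extraction did not work for me, so I checked the behavior with the English model instead.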
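#include <cstdio>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include <opencv2/text.hpp>

using namespace std;

int main(void)
{
    auto image = cv::imread("test.jpg");
    if (image.empty()) {
        // imread returns an empty Mat when the file cannot be read
        fprintf(stderr, "failed to read test.jpg\n");
        return 1;
    }

    // imread loads pixels in BGR order, so convert with COLOR_BGR2GRAY
    cv::Mat gray;
    cv::cvtColor(image, gray, cv::COLOR_BGR2GRAY);

    string result;
    vector<cv::Rect> boxes;
    vector<string> words;
    vector<float> confidences;

    printf("Initialize OCRTesseract...\n");
    // Arguments: tessdata directory, language, character whitelist (none),
    // OCR engine mode, page segmentation mode
    auto ocr = cv::text::OCRTesseract::create("/usr/share/tessdata", "eng", NULL,
                                              cv::text::OEM_DEFAULT, cv::text::PSM_AUTO);
    // Fills the full text plus per-word boxes, strings and confidences
    ocr->run(gray, result, &boxes, &words, &confidences);

    cout << "       String        |  Posistion |    Size    | confidences" << endl;
    cout << "---------------------+------------+------------+------------" << endl;
    for (size_t i = 0; i < boxes.size(); i++) {
        printf("%-20s | (%3d, %3d) | (%3d, %3d) | %f\n",
               words[i].c_str(),
               boxes[i].x, boxes[i].y,
               boxes[i].width, boxes[i].height,
               confidences[i]);
    }

    cout << endl << "Result:\n-------" << endl;
    cout << result;
    return 0;
}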
Then the build and the execution result:
cuomo@ugui7 ~/opencv $ gcc -g -O0 `pkg-config opencv --cflags --libs` -lstdc++ ocrtesseract.cpp
cuomo@ugui7 ~/opencv $ ./a.out
Initialize OCRTesseract...
[DS] Profile read from file (tesseract_opencl_profile_devices.dat).
[DS] Device[1] 1:GeForce GTX 1050 score is 0.087704
[DS] Device[2] 0:(null) score is 0.372587
[DS] Selected Device[1]: "GeForce GTX 1050" (OpenCL)
       String        |  Posistion |    Size    | confidences
---------------------+------------+------------+------------
The                  | ( 39,  54) | ( 37,  17) | 87.536118
first                | ( 91,  54) | ( 63,  17) | 71.520576
step                 | (169,  56) | ( 50,  19) | 85.904022
is                   | (234,  54) | ( 23,  17) | 73.084877
always               | (273,  54) | ( 75,  21) | 78.761841
the                  | (364,  54) | ( 37,  17) | 88.950020
hardest              | (417,  54) | ( 97,  17) | 46.006390
There                | ( 39,  80) | ( 63,  17) | 85.972023
is                   | (117,  80) | ( 23,  17) | 73.084877
no                   | (157,  85) | ( 23,  12) | 85.654243
royal                | (196,  80) | ( 62,  21) | 79.272545
road                 | (274,  80) | ( 48,  17) | 85.654243
to                   | (338,  82) | ( 24,  15) | 85.466003
learning             | (377,  80) | (111,  21) | 46.006390
It s                 | ( 39, 105) | ( 49,  18) | 26.470589
never                | (105, 111) | ( 61,  12) | 85.544647
too                  | (182, 108) | ( 37,  15) | 85.466003
late                 | (234, 106) | ( 50,  17) | 78.761841
to                   | (299, 108) | ( 24,  15) | 85.466003
learn                | (338, 106) | ( 72,  17) | 46.006390
Kuso                 | ( 39, 134) | ( 50,  15) | 85.498596
ka2                  | (104, 132) | ( 36,  17) | 72.482178
Dnaeha               | (156, 132) | (111,  17) | 46.006390

Result:
-------
The first step is always the hardest
There is no royal road to learning
It s never too late to learn
Kuso ka2 Dnaeha
Ordinary English text was extracted fine, but when characters such as "?" are present, the result seems to get garbled.
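One rough way to act on the per-word confidence column in the table above is to drop words below a threshold. The following is only a minimal sketch under assumptions: `OcrWord` and `filterByConfidence` are hypothetical names that are not part of OpenCV or Tesseract, and the inputs are assumed to be the `words` and `confidences` vectors filled by `ocr->run()`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pair of a recognized word and its confidence, mirroring the
// `words` and `confidences` vectors from the program above.
struct OcrWord {
    std::string text;
    float confidence;  // same scale as the "confidences" column
};

// Hypothetical helper (not an OpenCV API): keep only the words whose
// confidence is at least min_confidence, preserving their order.
std::vector<OcrWord> filterByConfidence(const std::vector<std::string>& words,
                                        const std::vector<float>& confidences,
                                        float min_confidence)
{
    std::vector<OcrWord> kept;
    // Guard against vectors of different length
    const std::size_t n = std::min(words.size(), confidences.size());
    for (std::size_t i = 0; i < n; i++) {
        if (confidences[i] >= min_confidence) {
            kept.push_back(OcrWord{words[i], confidences[i]});
        }
    }
    return kept;
}
```

Calling `filterByConfidence(words, confidences, 50.0f)` right after `ocr->run()` would drop "It s" (26.470589) in the run above, but it would also drop correctly recognized words such as "hardest", "learning" and "learn" (all 46.006390), so a fixed threshold is a crude filter at best.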
Is it really using the GPU?
The x11-drivers/nvidia-drivers-396.24 package ships a command called nvidia-smi, so I used it to check.
cuomo@ugui7 ~ $ nvidia-smi -l 1 -i 0
Sat Jun  2 10:00:37 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.24                 Driver Version: 396.24                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:01:00.0  On |                  N/A |
| 35%   35C    P0    N/A /  75W |    244MiB /  1999MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       445      G   /usr/bin/X                                   189MiB |
|    0     23555      C   ./a.out                                       43MiB |
+-----------------------------------------------------------------------------+
...
The Type is C, so a.out does appear to be running as a compute process on the GPU. Still, the OCR itself is not as good as I had hoped.