
[Open Source] EasyOCR: an easy, free way to try text detection and recognition AI models

by 뭅즤 2022. 12. 16.

https://github.com/JaidedAI/EasyOCR

 


 

OCR (Optical Character Recognition) reads text from documents and images, turning analog data into digital data. It may look trivial, but I think it is one of the harder AI technologies to refine: training data is difficult to build, languages differ from country to country, and even within a single language the typefaces and handwriting styles vary widely. Because it is essential to many automation tasks, however, a large number of companies now either develop their own OCR models or buy API access from external solution vendors.

 

Several open-source projects provide OCR models for free. One of them, EasyOCR, offers text detection and recognition for about 80 languages. The repository's README is easy to follow, and the library is set up so that trying it out takes very little effort.

On top of that, it is released under the Apache-2.0 license, so it can be used without much worry.

 

EasyOCR framework

 

Admittedly, the models inside EasyOCR are relatively old: text detection uses Naver Clova AI's CRAFT model, and recognition uses a CRNN model. Even so, when I tried it myself the results were quite satisfying.

 

As with most publicly released AI models, the accuracy is not good enough to drop straight into a commercial service. But for anyone who needs a reasonable level of text detection and recognition, I doubt there is a better option.

 

In particular, the pre-, mid-, and post-processing steps that OCR needs are built in at each stage of the framework, which should help anyone who has to put together an end-to-end OCR pipeline.
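To make the post-processing idea concrete, here is a minimal sketch of greedy CTC decoding, the kind of step that turns a CRNN's per-timestep scores into a string. It assumes a CTC-trained recognizer with the blank symbol at index 0. The function name `ctc_greedy_decode` and the toy alphabet are illustrative and not taken from EasyOCR's source; EasyOCR's own pipeline has more steps than this.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(step_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: take the best class at every timestep,
    merge consecutive repeats, then drop blanks.

    alphabet[i] is the character for class index i + 1.
    """
    # Best-scoring class index per timestep.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in step_scores]

    chars: List[str] = []
    prev = BLANK
    for idx in best:
        # Emit a character only when it is not blank and not a repeat
        # of the immediately preceding timestep.
        if idx != BLANK and idx != prev:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)
```

A blank between two identical predictions is what lets a doubled letter survive: the class sequence 1, 1, 0, 1 decodes to two characters, not one.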

 

 

 

EasyOCR usage

Install the EasyOCR package (pip install easyocr), create a Reader with the languages you want to recognize, and pass it an image.

import easyocr

reader = easyocr.Reader(['ko','en']) # Korean ('ko', not 'kr'), English
result = reader.readtext('text.jpg')
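As a follow-up sketch: by default `readtext` returns a list of detections, each a `(bounding_box, text, confidence)` tuple. The helper below filters that list by confidence. The names `keep_confident` and `run_ocr` and the 0.5 cutoff are assumptions made for illustration, not part of EasyOCR.

```python
from typing import List, Sequence, Tuple

# One detection in the shape readtext() returns by default:
# (four corner points, recognized text, confidence between 0 and 1).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def keep_confident(detections: List[Detection], min_conf: float = 0.5) -> List[str]:
    """Return the recognized strings whose confidence is at least min_conf."""
    return [text for _bbox, text, conf in detections if conf >= min_conf]


def run_ocr(image_path: str, min_conf: float = 0.5) -> List[str]:
    """Run EasyOCR on one image and keep only the confident strings."""
    # Imported lazily: constructing a Reader downloads model weights on first use.
    import easyocr

    reader = easyocr.Reader(['ko', 'en'])  # Korean, English
    return keep_confident(reader.readtext(image_path), min_conf)
```

Calling `run_ocr('text.jpg')` would then give just the strings the recognizer was reasonably sure about; the right threshold depends on the images, so treat 0.5 as a starting point.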

 

Training a custom EasyOCR model

EasyOCR also supports training custom models (see the trainer folder in the repository).

 

This matters most for CRAFT: the official CRAFT repository does not provide training code, so most people who want to train a CRAFT model seem to rely on EasyOCR.
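For the recognition side only (not CRAFT), EasyOCR's documentation describes loading a trained custom model by name through the `recog_network` argument of `Reader`. The weights (`.pth`) go in the model storage directory, and the network definition (`.py`) and config (`.yaml`) go in the user network directory. The sketch below assumes that layout. The helper names and any model name or paths passed in are hypothetical.

```python
from pathlib import Path
from typing import List, Sequence


def missing_custom_files(name: str, model_dir: str, network_dir: str) -> List[Path]:
    """List the files a custom recognizer called `name` is expected to have
    but that are not on disk (assumed layout: name.pth, name.py, name.yaml)."""
    expected = [
        Path(model_dir) / f"{name}.pth",     # trained weights
        Path(network_dir) / f"{name}.py",    # network definition
        Path(network_dir) / f"{name}.yaml",  # character set and network parameters
    ]
    return [p for p in expected if not p.is_file()]


def load_custom_reader(name: str, model_dir: str, network_dir: str,
                       langs: Sequence[str] = ("en",)):
    """Create a Reader that uses the custom recognition network `name`."""
    missing = missing_custom_files(name, model_dir, network_dir)
    if missing:
        raise FileNotFoundError(f"missing custom model files: {missing}")

    # Imported lazily so the file check above works without EasyOCR installed.
    import easyocr

    return easyocr.Reader(
        list(langs),
        recog_network=name,
        model_storage_directory=model_dir,
        user_network_directory=network_dir,
    )
```

Checking the files first gives a clearer error than letting the Reader fail partway through loading.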
