[Paper Review] Character Region Awareness for Text Detection / CRAFT / Text Detection
·
🏛 Research/OCR
This paper, presented by Naver Clova at CVPR 2019, is a text detection paper that proposes a model called CRAFT. It is one of the best-known papers in text detection, and I personally find it an appealing piece of research that makes very efficient use of both the characteristics of text and the way deep learning models learn. Other blogs already explain it in detail, so I will only summarize the parts that matter for training the model. Core of the CRAFT model: instead of predicting word bboxes directly, CRAFT predicts a region score, which indicates where each character is, and an affinity score, which indicates the spacing between adjacent characters. This requires character-level annotation, but each and every character..
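As a rough illustration of the two score maps mentioned in this excerpt, the sketch below renders a simplified region score map (one Gaussian bump per character box) and derives an affinity box between two adjacent characters. It makes several assumptions: the boxes are axis-aligned, whereas the paper warps a 2D Gaussian onto arbitrary character quadrilaterals, and the names `region_score_map`, `affinity_box`, and `sigma_ratio` are hypothetical and not taken from the official CRAFT code.

```python
import math


def region_score_map(height, width, boxes, sigma_ratio=0.25):
    """Render one Gaussian bump per axis-aligned (x0, y0, x1, y1) box.

    Simplification: the paper warps a Gaussian onto each character
    quadrilateral; here every box is assumed to be axis-aligned.
    """
    score = [[0.0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        # Spread of the bump scales with the box size (assumed ratio).
        sx = max((x1 - x0) * sigma_ratio, 1e-6)
        sy = max((y1 - y0) * sigma_ratio, 1e-6)
        for y in range(max(0, y0), min(height, y1 + 1)):
            for x in range(max(0, x0), min(width, x1 + 1)):
                g = math.exp(-0.5 * (((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2))
                # Keep the strongest response where boxes overlap.
                score[y][x] = max(score[y][x], g)
    return score


def affinity_box(box_a, box_b):
    """Box spanned by the upper/lower triangle centers of two adjacent boxes.

    For an axis-aligned box, the upper and lower triangles formed by its
    diagonals have centroids one sixth of the height below the top edge
    and above the bottom edge, on the vertical center line.
    """
    def triangle_centers(box):
        x0, y0, x1, y1 = box
        cx = (x0 + x1) / 2.0
        h = y1 - y0
        return cx, y0 + h / 6.0, y1 - h / 6.0

    ax, a_top, a_bot = triangle_centers(box_a)
    bx, b_top, b_bot = triangle_centers(box_b)
    return (round(ax), round(min(a_top, b_top)), round(bx), round(max(a_bot, b_bot)))


# Two adjacent character boxes on an 18 x 20 canvas.
char_a, char_b = (2, 2, 8, 14), (10, 2, 16, 14)
region = region_score_map(18, 20, [char_a, char_b])
affinity = region_score_map(18, 20, [affinity_box(char_a, char_b)])
```

In this toy setup the region map peaks at each character center, while the affinity map peaks in the gap between the two characters, which is what lets the characters be linked into a word afterwards.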
[Paper Review] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
·
🏛 Research/OCR
This is a text recognition paper presented at CVPR 2021, written by Jeonghun Baek, who also proposed the TRBA model ('What is wrong with scene text recognition model comparisons? dataset and model analysis'). Main content: in Scene Text Recognition (STR) research, real data is scarce, so models are generally trained on large-scale synthetic datasets. As a result, there was reportedly an implicit assumption (?) that training an STR model on real data alone is nearly impossible. The paper argues, however, that this assumption has held STR research back. The paper consolidates recently accumulated real datasets and designated real da..
[Paper Review] What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
·
🏛 Research/OCR
This is a text recognition paper presented by Naver Clova at ICCV 2019. (Official repository) Proposal: the paper organizes the previously unorganized STR (Scene Text Recognition) datasets and establishes a four-stage framework for STR. The four stages are listed below, and the paper reports experiments on how much the module at each stage contributes. Transformation Stage : normalizes the image with TPS (Thin-Plate Spline), a method similar to an STN (Spatial Transformer Network), converting distorted text into the form the recognition model can read most easily. Feature Extraction Stage : a typical CNN architecture...
[Paper Review] Data Augmentation for Scene Text Recognition
·
🏛 Research/OCR
Wondering whether any augmentation method focuses specifically on text recognition, I searched for papers and found a data augmentation paper for STR presented at ICCV 2021, so I will summarize it here. Part of the abstract: Scene Text Recognition (STR) models are evaluated on real data, so the mismatch between the training and test data distributions, driven mainly by noise, artifacts, geometry, structure, and similar factors, leads to degraded performance. To address this, the paper introduces STRAug, which consists of 36 image augmentation functions. For each function, image properties that can be found in natural scenes, are caused by camera sensors, or arise during signal processing..
[Research Introduction] Document Image Shadow Removal / Improving Document OCR Results
·
🏛 Research/OCR
These days people often photograph documents and submit them to companies or public institutions. The company then uses OCR to digitize the text in the received document and store it. However, documents photographed with a phone often have heavy shadows that degrade image quality, and this causes text recognition errors. But... it turns out there is research on removing shadows from images. People really are clever, and there is hardly anything that doesn't already exist... Paper : BEDSR-Net A Deep Shadow Removal Network from a Single Document Image / CVPR 2020 github : https://github.com/IsHYuhi/BEDSR-Net_A_Deep_Shadow_Removal_..
[Open Source] EasyOCR: Try a Text Detection/Recognition AI Model Easily and for Free
·
🏛 Research/OCR
https://github.com/JaidedAI/EasyOCR GitHub - JaidedAI/EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc. OCR (Optical Character..