🏛 Research/OCR (8 posts)

[Paper Review] Character Region Awareness for Text Detection / CRAFT / Text Detection
This paper, presented by Naver Clova at CVPR 2019, is a text detection paper that proposes a model called CRAFT. It is one of the best-known papers in text detection, and I personally find it an appealing piece of work that makes very efficient use of both the characteristics of text and the way deep networks learn. Other blogs already cover the details well, so I will summarize only the parts that matter for training the model. Core of the CRAFT model: rather than predicting word bounding boxes directly, CRAFT predicts a region score, which marks where each character is, and an affinity score, which represents the spacing between adjacent characters. This requires character-level annotation, but each and every character..
2023. 3. 13.

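To make the region score and affinity score in the CRAFT entry above more concrete, here is a minimal sketch of how such ground-truth maps could be rendered from character boxes. It is a simplification under stated assumptions, not the paper's procedure: boxes are axis-aligned `(x0, y0, x1, y1)` tuples, and the helper names (`gaussian_patch`, `paint`, `affinity_box`, `make_score_maps`) and the `sigma_ratio` value are hypothetical.

```python
import math

def gaussian_patch(height, width, sigma_ratio=0.25):
    """Return a height x width grid holding a 2D Gaussian peaked at the patch center."""
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    sy = max(height * sigma_ratio, 1e-6)
    sx = max(width * sigma_ratio, 1e-6)
    return [
        [math.exp(-0.5 * (((y - cy) / sy) ** 2 + ((x - cx) / sx) ** 2))
         for x in range(width)]
        for y in range(height)
    ]

def paint(score_map, box):
    """Paint a Gaussian into box (x0, y0, x1, y1), keeping the per-pixel maximum."""
    x0, y0, x1, y1 = box
    for dy, row in enumerate(gaussian_patch(y1 - y0, x1 - x0)):
        for dx, value in enumerate(row):
            y, x = y0 + dy, x0 + dx
            if 0 <= y < len(score_map) and 0 <= x < len(score_map[0]):
                score_map[y][x] = max(score_map[y][x], value)

def affinity_box(left, right):
    """Assumed simplification: a box spanning the centers of two adjacent characters."""
    top = min(left[1], right[1])
    bottom = max(left[3], right[3])
    quarter = (bottom - top) // 4
    return ((left[0] + left[2]) // 2, top + quarter,
            (right[0] + right[2]) // 2, bottom - quarter)

def make_score_maps(height, width, char_boxes):
    """Build region and affinity maps for one word given left-to-right character boxes."""
    region = [[0.0] * width for _ in range(height)]
    affinity = [[0.0] * width for _ in range(height)]
    for box in char_boxes:
        paint(region, box)
    for left, right in zip(char_boxes, char_boxes[1:]):
        paint(affinity, affinity_box(left, right))
    return region, affinity

# Two adjacent 9x9 characters on a 14x24 canvas.
region, affinity = make_score_maps(14, 24, [(2, 2, 11, 11), (12, 2, 21, 11)])
```

The region map peaks at each character center, while the affinity map peaks in the gap between the two characters, which is what lets characters be grouped into words afterwards.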
[Paper Review] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
This text recognition paper was presented at CVPR 2021 and is by Jeonghun Baek, who also proposed the TRBA model ('What is wrong with scene text recognition model comparisons? dataset and model analysis'). Summary: because real data is scarce, Scene Text Recognition (STR) research generally trains on large-scale synthetic datasets. As a result, it was reportedly treated as implicit common knowledge that training an STR model on real data alone is nearly impossible. The paper argues that this assumption has held STR research back. It consolidates the real datasets accumulated in recent years and the specified real data..
2023. 3. 12.

[Paper Review] What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
This text recognition paper was presented by Naver Clova at ICCV 2019. (official repository) Contributions: the paper organizes the previously inconsistent STR (Scene Text Recognition) datasets and formalizes STR as a four-stage framework. The four stages are listed below, and the paper reports experiments measuring the contribution of each stage's modules. Transformation Stage: normalizes the image with TPS (Thin-Plate Spline), a method similar to an STN (Spatial Transformer Network), turning distorted text into the form the recognition model can read most easily. Feature Extraction Stage: a standard CNN architecture...
2023. 3. 12.

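The four-stage decomposition in the entry above can be pictured as a pipeline of swappable modules. The sketch below is only a toy illustration of that idea: the excerpt is cut off after the second stage, so the last two field names (`sequence_modeling`, `prediction`) follow the paper's stage names rather than this page, and the class name `STRPipeline` and the string-based stand-in stages are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

Stage = Callable[[Any], Any]

def identity(x):
    """Stand-in for a stage that is switched off."""
    return x

@dataclass
class STRPipeline:
    transformation: Stage = identity      # e.g. TPS-based normalization
    feature_extraction: Stage = identity  # e.g. a CNN backbone
    sequence_modeling: Stage = identity   # name taken from the paper, not the excerpt
    prediction: Stage = identity          # name taken from the paper, not the excerpt

    def __call__(self, image):
        x = self.transformation(image)
        x = self.feature_extraction(x)
        x = self.sequence_modeling(x)
        return self.prediction(x)

# Toy stand-ins: a padded string plays the role of a distorted text image.
pipeline = STRPipeline(
    transformation=str.strip,                          # "normalize" the input
    feature_extraction=list,                           # one "feature" per column
    prediction=lambda feats: "".join(feats).upper(),   # decode features to text
)
print(pipeline("  hello "))  # prints HELLO
```

Because each stage is an independent field, one module can be replaced while the others stay fixed, which is the kind of per-module comparison the entry describes.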
[Paper Review] Data Augmentation for Scene Text Recognition
I was looking for augmentation aimed specifically at text recognition and found a paper on data augmentation for STR presented at ICCV 2021, so I am summarizing it here. Abstract: some Scene Text Recognition (STR) models are evaluated on real data, so the mismatch between the training and test data distributions, driven mainly by noise, artifacts, geometry, and structure, leads to degraded performance. To address this, the paper introduces STRAug, a set of 36 image augmentation functions. Each function mimics image properties that can be found in natural scenes, are caused by camera sensors, or arise during signal processing..
2023. 3. 11.

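As a rough picture of what a pool of augmentation functions like the one in the entry above looks like in use, here is a small self-contained sketch. It does not use the STRAug library or reproduce its functions: the three toy augmentations, their names, and the `prob` parameter are assumptions for illustration, and images are plain nested lists of grayscale values in 0..255.

```python
import random

def gaussian_noise(image, rng, sigma=20.0):
    """Add zero-mean Gaussian noise to every pixel, clamped to 0..255 (sensor-like noise)."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in image]

def invert(image, rng):
    """Invert intensities (a simple signal-processing style change)."""
    return [[255 - p for p in row] for row in image]

def shift_right(image, rng, max_shift=2):
    """Shift pixels right by a random offset, padding with white (a crude geometry change)."""
    shift = rng.randint(0, max_shift)
    shifted = []
    for row in image:
        s = min(shift, len(row))
        shifted.append([255] * s + row[:len(row) - s])
    return shifted

AUGMENTATIONS = [gaussian_noise, invert, shift_right]

def random_augment(image, rng, prob=0.5):
    """With probability prob, apply one randomly chosen augmentation; else return the input."""
    if rng.random() >= prob:
        return image
    return rng.choice(AUGMENTATIONS)(image, rng)

rng = random.Random(0)  # seeded so a training run is reproducible
image = [[0, 128, 255], [10, 20, 30]]
augmented = random_augment(image, rng)
```

Applying one randomly chosen function per training sample keeps the output the same size as the input, so it can sit in front of a recognizer without other changes.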
[Web Demo] Naver Clova OCR Demo
OCR is the technology for finding and reading text in images. Recently it has reached the point of extracting only the text information you want, and Naver has some of the strongest technology in the industry in this area. Naver has published many OCR papers: CRAFT, the text detection model presented at CVPR 2019; Donut, an end-to-end document understanding model released in 2021; and most recently DEER, released in 2022. The numbers reported in the papers are clearly strong, but how well do the models work in practice? Naver Clova offers an OCR web demo that anyone can try. (link) On the Naver Clova OCR web demo page, General OCR, receipts, credit cards, and so on ..
2023. 3. 1.

[Research Intro] Document image shadow removal / to improve document OCR results
These days people often photograph documents and submit them to companies or public agencies. The company then uses OCR to digitize the text in the received document and store it. When a document is photographed with a phone, however, shadows often lower the image quality, which causes text recognition errors. It turns out there is research on removing shadows from images. People really are clever, and there is hardly anything that does not already exist... Paper: BEDSR-Net A Deep Shadow Removal Network from a Single Document Image / CVPR 2020 github: https://github.com/IsHYuhi/BEDSR-Net_A_Deep_Shadow_Removal_..
2022. 12. 20.