
🏛 Research (58)

[Open Source] OpenMMLab Computer Vision Open-Source Libraries | A Wide Range of Computer Vision Research Topics
OpenMMLab: OpenMMLab provides a variety of open-source computer vision projects for academic research and industrial applications. OpenMMLab lists the strengths below. There are so many projects and tools that I have tried only a tiny fraction of them, but my impression was that the features I needed were implemented well and were convenient to use. (1) High-quality libraries that reduce the difficulty of reimplementing algorithms; (2) efficient deployment tools targeting a variety of backends and devices; (3) a solid foundation for computer vision research and development; (4) a full-stack toolchain that bridges the gap between academic research and industrial applications. Personally, I liked that it is built on PyTorch... OpenMMLab provides state-of-the-art deep learning models and high-performance code across a very wide range of computer vision research topics..
2023. 4. 16.
[Tech Intro] 3D Object Scanning | MVS | Object Scanning | Real-Time 3D Object Reconstruction
3D Object Scanning: 3D object scanning is a technique that reconstructs the 3D shape of an object using multi-view stereo (MVS). In the video below, a company called Niantic has added a fast non-LiDAR scanning tool to a Unity SDK so that users can scan objects in real time. The object is captured from various angles with a smartphone and then reconstructed, and the quality looks quite good. You can also try 3D scanning easily on a smartphone with an app such as RealityScan. Object scanning example from Niantic. Example result from RealityScan - 3D Scanning App. Source: https://sketchfab.com/3d-models..
2023. 4. 7.
[Tech Intro] Text-to-Image Generation | Image Generation AI | DALL-E | GPT | dVAE
Text-to-Image Generation: Text-to-image generation is a technique that takes text as input and generates an image matching that text. Driven by advances in deep learning, development began in the mid-2010s, and in 2022 the output of state-of-the-art text-to-image models such as OpenAI's DALL-E 2, Google Brain's Imagen, and StabilityAI's Stable Diffusion began to approach the quality of real photographs and human-made art. Text-to-image generation is commonly implemented by training a generative model, such as a GAN (Generative Adversarial Network), on a dataset of paired text and images. For example, "..
2023. 4. 6.
[Paper Review] Character Region Awareness for Text Detection / CRAFT / Text Detection
This is a text detection paper presented by Naver Clova at CVPR 2019, and it proposes a model called CRAFT. It is very well known in the text detection field, and I personally find it an appealing piece of research that makes highly effective use of both the characteristics of text and the way deep learning models learn. Detailed explanations are already available on other blogs, so here I summarize only the parts that are essential for training the model. The core of CRAFT: rather than predicting word bounding boxes directly, CRAFT predicts a region score, which indicates where each character is located, and an affinity score, which indicates the spacing between adjacent characters. This requires character-level annotation, but annotating each character one by one..
2023. 3. 13.
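As a rough illustration of the two score maps mentioned in the CRAFT excerpt above, here is a minimal sketch in plain Python. It is not the official CRAFT code: the paper warps an isotropic Gaussian onto each (possibly rotated) character quadrilateral, whereas this sketch assumes axis-aligned boxes. The function names `gaussian_score_map` and `affinity_box`, the `sigma_ratio` value, and the way the affinity box is built are all hypothetical simplifications.

```python
import math


def gaussian_score_map(height, width, boxes, sigma_ratio=0.25):
    """Build a height x width heatmap with one Gaussian bump per box.

    boxes holds (x_min, y_min, x_max, y_max) tuples in pixel coordinates.
    sigma_ratio is an illustrative choice, not a value taken from the paper.
    """
    heatmap = [[0.0] * width for _ in range(height)]
    for x_min, y_min, x_max, y_max in boxes:
        cx = (x_min + x_max) / 2.0
        cy = (y_min + y_max) / 2.0
        sx = max((x_max - x_min) * sigma_ratio, 1e-6)
        sy = max((y_max - y_min) * sigma_ratio, 1e-6)
        for y in range(max(y_min, 0), min(y_max, height - 1) + 1):
            for x in range(max(x_min, 0), min(x_max, width - 1) + 1):
                value = math.exp(-(((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2) / 2.0)
                # Keep the strongest response where boxes overlap.
                heatmap[y][x] = max(heatmap[y][x], value)
    return heatmap


def affinity_box(box_a, box_b):
    """Simplified stand-in for the affinity box between two adjacent characters.

    It spans horizontally from the center of box_a to the center of box_b and
    covers the middle half of their combined vertical extent.
    """
    ax = (box_a[0] + box_a[2]) // 2
    bx = (box_b[0] + box_b[2]) // 2
    y_min = min(box_a[1], box_b[1])
    y_max = max(box_a[3], box_b[3])
    quarter = (y_max - y_min) // 4
    return (min(ax, bx), y_min + quarter, max(ax, bx), y_max - quarter)


# Two made-up character boxes sitting side by side on a 24x16 image.
chars = [(2, 2, 10, 12), (12, 2, 20, 12)]
region = gaussian_score_map(16, 24, chars)
affinity = gaussian_score_map(16, 24, [affinity_box(chars[0], chars[1])])
```

In this toy setup the region map peaks at each character center, while the affinity map peaks in the gap between the two characters, which is the signal that lets neighboring characters be grouped into a word.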
[Paper Review] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels
This is a text recognition paper presented at CVPR 2021, and it is also by Jeonghun Baek, who proposed the TRBA model ('What is wrong with scene text recognition model comparisons? dataset and model analysis'). Main content: because real data is scarce in Scene Text Recognition (STR) research, models are usually trained on large-scale synthetic datasets. As a result, it had reportedly become an unspoken assumption that training an STR model on real data alone is nearly impossible. The paper argues that this assumption has held STR research back. The paper consolidates the real datasets accumulated in recent years and the designated real ..
2023. 3. 12.
[Paper Review] What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis
This is a text recognition paper presented by Naver Clova at ICCV 2019. (official repository) What it proposes: the paper organizes the previously inconsistent STR (Scene Text Recognition) datasets and formalizes STR as a four-stage framework. The four stages proposed in the paper are listed below, and the paper reports experiments showing the contribution of each module at each stage. Transformation Stage: normalizes the image with TPS (Thin-Plate Spline), a method similar to an STN (Spatial Transformer Network), converting distorted text into the form that is easiest for the recognition model to read. Feature Extraction Stage: a standard CNN architecture...
2023. 3. 12.