[CV] SfM (Structure from Motion): Reconstructing Camera Poses and 3D Shape from a Sequence of 2D Images

2022. 6. 5. 03:23 · 📖 Fundamentals/3D vision & Graphics
๋ฐ˜์‘ํ˜•

This post covers Structure from Motion (SfM), one of the core techniques in Visual Localization. SfM recovers 3D structure and camera poses from 2D images, so the 3D structure of a scene can be reconstructed from nothing more than a set of photographs.

 

Rather than working through heavy math, this post introduces the full SfM pipeline by focusing on the purpose and meaning of each stage.

 

SfM is typically run through tools such as COLMAP, a powerful SfM and MVS (multi-view stereo) pipeline that also provides a GUI. Given an ordered image sequence, or just an unordered collection of multi-view images, it automatically recovers the camera poses and the 3D structure. Thanks to this, SfM is widely used in spatial-perception applications such as Visual SLAM, AR/VR, robotics, and digital twins.

 

* Algorithms similar to SfM?

Techniques similar to SfM (Structure from Motion) include SLAM (Simultaneous Localization and Mapping) and Visual Odometry (VO). All three recover 3D spatial information and estimate camera motion from images or other sensor data, but they differ in purpose and in how they operate.

Visual Odometry estimates the relative camera motion (trajectory) between consecutive images or sensor frames. It generally does not include loop closure or map building. SLAM can be viewed as VO with loop closure added: it estimates the camera position while simultaneously building a 3D map of the environment. It is designed to run in real time, mainly on robots, autonomous vehicles, and AR devices, so lightweight algorithms are preferred. SfM, by contrast, runs offline. Because real-time operation is not required, it can afford more precise and more computationally expensive processing. It jointly recovers camera poses and 3D points from many images and is mainly used for 3D reconstruction of static scenes.

 

* COLMAP's SfM is described in the CVPR 2016 paper "Structure-from-Motion Revisited".

 

SFM (Structure From Motion)

Structure From Motion

 

Structure from Motion (SfM) takes multi-view images of the same scene, captured from different viewpoints, and recovers the camera pose of each image together with the 3D structure of the scene. SfM generally consists of the following three main stages.

  1. Feature Detection and Extraction
    Feature points are extracted from each image with a local descriptor such as SIFT, chosen so that the features are robust to radiometric and geometric changes.
  2. Feature Matching and Geometric Verification
    Identical feature points are matched across different images, and epipolar geometry is used to verify that the matches are geometrically valid. Algorithms such as RANSAC are used in this step.
  3. Structure and Motion Reconstruction
    Reconstruction starts from an initial image pair, and camera poses and 3D points are then estimated incrementally through PnP and triangulation. Finally, Bundle Adjustment optimizes the whole structure.

 

This SfM pipeline is available in well-packaged form in GUI-based tools such as COLMAP: feeding in a large set of unordered images is enough to estimate the camera poses and reconstruct the 3D structure automatically.

 

Getting a good reconstruction out of COLMAP, however, requires attention to a few conditions. If the following are not met, the SfM result may be of low quality, or the reconstruction may fail altogether.

  • Use images with enough texture
    Feature extraction and matching work well only when surfaces carry sufficient, non-repeating texture. Plain or uniform surfaces (e.g. a white wall or the sky) yield almost no feature points, which makes 3D reconstruction difficult.
  • Keep lighting conditions similar
    If the lighting differs drastically, features on the same object are extracted differently and matching can fail. Whenever possible, use images captured under similar brightness and shadow conditions.
  • Capture images with high visual overlap
    Different images must share sufficiently overlapping regions for the same point to be matched robustly. The amount of scene overlap largely determines the quality of the reconstruction.
  • Capture images from diverse viewpoints
    Instead of shifting the camera only slightly, include images taken from a variety of angles to obtain a more accurate and more complete 3D structure. The camera should actually move between shots: rotating it in place provides no baseline, so depth cannot be triangulated. Wide-baseline images are particularly effective when structural reconstruction is the goal.

1. Correspondence Search

The first stage of Structure from Motion (SfM) is Correspondence Search, which finds the regions shared between images (scene overlap). Its output is a set of geometrically verified image pairs, together with an image projection graph that records, for each feature point, the images in which it was observed.
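
One way to picture the image projection graph is as a set of feature tracks: groups of 2D features, spread over several images, that are believed to observe the same scene point. The sketch below merges pairwise matches into tracks with a union-find structure. The input layout (`pairwise_matches` keyed by image-name pairs) and the function name `build_tracks` are assumptions made for illustration, not COLMAP's internal representation.

```python
from collections import defaultdict

def build_tracks(pairwise_matches):
    """Merge pairwise feature matches into multi-image tracks.

    pairwise_matches: {(image_a, image_b): [(feature_idx_a, feature_idx_b), ...]}
    Returns a list of tracks, each a sorted list of (image, feature_idx) nodes.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Every verified match links two observations of the same scene point.
    for (img_a, img_b), matches in pairwise_matches.items():
        for feat_a, feat_b in matches:
            parent[find((img_a, feat_a))] = find((img_b, feat_b))

    tracks = defaultdict(list)
    for node in parent:
        tracks[find(node)].append(node)
    return [sorted(nodes) for nodes in tracks.values() if len(nodes) >= 2]

# Hypothetical matches between three images A, B, C.
pairwise_matches = {
    ("A", "B"): [(0, 5), (1, 7)],
    ("B", "C"): [(5, 2)],
}
tracks = sorted(build_tracks(pairwise_matches))
for track in tracks:
    print(track)
```

Feature 0 of image A ends up in the same track as feature 2 of image C even though A and C were never matched directly, which is exactly the kind of multi-view link the later triangulation step relies on.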

 

1.1. Feature Extraction

SfM needs to identify distinctive local structures in each image, and the extracted feature points should be invariant to radiometric changes (such as illumination and exposure) and to geometric changes (such as viewpoint, scale, and rotation).

 

For this purpose, features are usually extracted from each image with a strong local descriptor such as SIFT (Scale-Invariant Feature Transform).
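
To make the invariance requirement concrete, here is a toy descriptor, deliberately much simpler than SIFT and not a substitute for it. It builds a magnitude-weighted histogram of gradient orientations over a patch and normalizes it to unit length. Under a gain-and-bias brightness change (I' = a·I + b with a > 0), the gradients only get scaled by a, so the normalized histogram stays the same. The function name `toy_descriptor` and the 8-bin layout are assumptions for illustration.

```python
import numpy as np

def toy_descriptor(patch, n_bins=8):
    """Unit-length histogram of gradient orientations, weighted by magnitude."""
    gy, gx = np.gradient(patch.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)  # orientation in [-pi, pi]
    hist, _ = np.histogram(angle, bins=n_bins, range=(-np.pi, np.pi),
                           weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

rng = np.random.default_rng(0)
patch = rng.random((16, 16))
brighter = 1.7 * patch + 0.3  # simulated radiometric change (gain and bias)

d1 = toy_descriptor(patch)
d2 = toy_descriptor(brighter)
print("max descriptor difference:", np.abs(d1 - d2).max())
```

The bias disappears when the gradient is taken and the gain disappears in the normalization, which is the same basic idea that makes gradient-based descriptors tolerant to lighting changes.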

 

1.2. Matching

Feature Matching

 

๊ฐ ์ด๋ฏธ์ง€์—์„œ ์ถ”์ถœํ•œ ํŠน์ง•์ ๋“ค์„ ๋น„๊ตํ•˜์—ฌ ์„œ๋กœ ๊ฐ™์€ ๋ฌผ์ฒด์˜ ๋™์ผํ•œ ์œ„์น˜๋ฅผ ๋ฐ”๋ผ๋ณด๊ณ  ์žˆ๋Š” ์ง€์ ๋“ค์„ ์ฐพ๋Š”๋‹ค. ์ด ๋‹จ๊ณ„๋Š” ์ฃผ๋กœ ์™ธํ˜•์  ์œ ์‚ฌ์„ฑ์„ ๊ธฐ์ค€์œผ๋กœ ํ•˜๋ฉฐ, ์•„์ง ๊ธฐํ•˜ํ•™์ ์ธ ๊ฒ€์ฆ์€ ํฌํ•จ๋˜์ง€ ์•Š๋Š”๋‹ค. ๋”ฐ๋ผ์„œ ์ด ๋‹จ๊ณ„์—์„œ ์ž˜๋ชป๋œ ๋งค์นญ(outlier)๋„ ๋‹ค์ˆ˜ ํฌํ•จ๋  ์ˆ˜ ์žˆ๋‹ค

 

1.3. Geometric Verification

Epipolar Geometry

 

 

๋งค์นญ๋œ feature๋“ค์ด ์‹ค์ œ๋กœ ๊ธฐํ•˜ํ•™์ ์œผ๋กœ ์ผ์น˜ํ•˜๋Š”์ง€๋ฅผ ๊ฒ€์ฆํ•˜๋Š” ๋‹จ๊ณ„์ด๋‹ค. ์ด๋Š” Epipolar Geometry ๊ธฐ๋ฐ˜์˜ ๊ฒ€์ฆ์„ ํ†ตํ•ด ์ˆ˜ํ–‰๋œ๋‹ค.

Because the two images are taken from different camera positions and orientations, the way the same 3D point is observed in each image obeys a fixed constraint. The mathematical expression of this constraint is the Fundamental Matrix (uncalibrated case) or the Essential Matrix (calibrated case).
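
For reference, the constraint can be written compactly. The notation is introduced here and is not from the original post: x and x' are the homogeneous pixel coordinates of a match in the first and second image, K and K' are the intrinsic matrices, and (R, t) is the relative pose mapping first-camera coordinates to second-camera coordinates. The equalities for E hold up to scale.

```latex
\mathbf{x}'^{\top} F \,\mathbf{x} = 0,
\qquad
E = K'^{\top} F \, K = [\mathbf{t}]_{\times} R
```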

 

The matrix is computed from a minimal set of matched points, and the matches are judged geometrically valid according to how well the computed matrix explains the full set of matches.

 

 

Feature Matching after RANSAC

 

Because the matched points can still contain many outliers, a robust estimation algorithm such as RANSAC (Random Sample Consensus) is used. RANSAC repeatedly estimates candidate Essential/Fundamental matrices from small, randomly drawn subsets of the matches and scores each candidate model by how many inliers it has over the full set of matches.

The final result of this process is the set of geometrically valid image pairs, along with the inlier correspondences and the geometric relation between the images of each pair.
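
The verification loop described above can be sketched in NumPy as follows. This is an illustrative toy under stated assumptions, not COLMAP's implementation: it uses the normalized eight-point algorithm as the solver, the Sampson error as the inlier test, a fixed iteration count, and synthetic data with 30% random outliers. All function names and parameter values are hypothetical.

```python
import numpy as np

def normalize(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    return np.column_stack([pts, np.ones(len(pts))]) @ T.T, T

def eight_point(x1, x2):
    """Estimate F such that x2^T F x1 = 0 from at least 8 matches."""
    n1, T1 = normalize(x1)
    n2, T2 = normalize(x2)
    A = np.column_stack([n2[:, 0:1] * n1, n2[:, 1:2] * n1, n1])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt  # enforce rank 2
    return T2.T @ F @ T1                     # undo the normalization

def sampson_error(F, x1, x2):
    """First-order approximation of the squared geometric error."""
    x1h = np.column_stack([x1, np.ones(len(x1))])
    x2h = np.column_stack([x2, np.ones(len(x2))])
    Fx1, Ftx2 = x1h @ F.T, x2h @ F
    num = np.sum(x2h * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

def ransac_fundamental(x1, x2, n_iters=500, threshold=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(x1), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(x1), size=8, replace=False)
        F = eight_point(x1[sample], x2[sample])
        mask = sampson_error(F, x1, x2) < threshold ** 2
        if mask.sum() > best_mask.sum():
            best_mask = mask
    if best_mask.sum() < 8:
        raise RuntimeError("too few inliers to verify this image pair")
    # Refit on all inliers of the best candidate.
    return eight_point(x1[best_mask], x2[best_mask]), best_mask

# Synthetic two-view scene: 100 points, the first 30 matches corrupted.
rng = np.random.default_rng(1)
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(100, 3))
c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t = np.array([-0.5, 0.0, 0.0])

def project(X, R, t):
    x = (X @ R.T + t) @ K.T
    return x[:, :2] / x[:, 2:]

x1 = project(X, np.eye(3), np.zeros(3))
x2 = project(X, R, t)
x2[:30] = rng.uniform([0, 0], [640, 480], size=(30, 2))  # outliers

F, mask = ransac_fundamental(x1, x2)
print("inliers kept:", mask[30:].sum(), "outliers kept:", mask[:30].sum())
```

A minimal solver (seven points for F, five for E) would need fewer iterations for the same outlier ratio; the eight-point version is used here only because it fits in a few lines.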

 

 

2. Incremental Reconstruction

This stage reconstructs the actual 3D structure from the image-to-image correspondences obtained above. The input is the image projection graph, and the output is the camera pose (position and orientation) of each image plus the 3D points of the scene.

 

2.1. Initialization

SfM reconstruction usually starts with a two-view reconstruction. The initial image pair must be chosen carefully: it is important to pick a pair with sufficient overlap between the cameras and plenty of feature matches, because the quality of the initial reconstruction strongly affects the quality of the whole model afterwards.

 

2.2. Image Registration

Once correspondences to 3D points exist for the already registered images, a new image is registered by solving the PnP (Perspective-n-Point) problem.

 

PnP is the problem of estimating the camera pose (position and orientation) of an image, given a set of 3D points and their corresponding 2D points in that image. If needed, the camera's intrinsic parameters are estimated at the same time.
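
As a rough illustration, the sketch below solves PnP linearly with the Direct Linear Transform (DLT), assuming known intrinsics K and at least six noise-free 2D-3D correspondences. Production pipelines usually wrap a minimal solver in RANSAC and refine the result non-linearly; the DLT is swapped in here only because it is short. The function name `pnp_dlt` and the synthetic data are hypothetical.

```python
import numpy as np

def pnp_dlt(X, x, K):
    """Estimate R, t such that x ~ K (R X + t), from n >= 6 correspondences."""
    # Move pixel coordinates into normalized camera coordinates.
    xn = np.column_stack([x, np.ones(len(x))]) @ np.linalg.inv(K).T
    Xh = np.column_stack([X, np.ones(len(X))])
    rows = []
    for Xi, (u, v, _) in zip(Xh, xn):
        rows.append(np.concatenate([Xi, np.zeros(4), -u * Xi]))
        rows.append(np.concatenate([np.zeros(4), Xi, -v * Xi]))
    # The null vector of the stacked system is [R | t] up to scale and sign.
    P = np.linalg.svd(np.array(rows))[2][-1].reshape(3, 4)
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt               # closest orthogonal matrix
    t = P[:, 3] / S.mean()   # remove the unknown scale
    if np.linalg.det(R) < 0:  # resolve the sign ambiguity
        R, t = -R, -t
    return R, t

rng = np.random.default_rng(0)
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
c, s = np.cos(0.2), np.sin(0.2)
R_true = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
t_true = np.array([0.3, -0.1, 0.5])

X = rng.uniform([-1, -1, 4], [1, 1, 8], size=(12, 3))  # known 3D points
proj = (X @ R_true.T + t_true) @ K.T
x = proj[:, :2] / proj[:, 2:]                           # their 2D observations

R_est, t_est = pnp_dlt(X, x, K)
print("rotation error:", np.abs(R_est - R_true).max())
print("translation error:", np.abs(t_est - t_true).max())
```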

 

2.3. Triangulation

 

๋™์ผํ•œ 3D ํฌ์ธํŠธ๋ฅผ ์„œ๋กœ ๋‹ค๋ฅธ ์นด๋ฉ”๋ผ์—์„œ ๊ด€์ธกํ–ˆ์„ ๋•Œ ์ƒ๊ธฐ๋Š” ๋‘ ๊ฐœ์˜ ๊ด‘์„ (ray)์€, ์ด์ƒ์ ์œผ๋กœ๋Š” ํ•˜๋‚˜์˜ 3D ํฌ์ธํŠธ์—์„œ ๊ต์ฐจํ•ด์•ผ ํ•œ๋‹ค. ํ•˜์ง€๋งŒ ์‹ค์ œ ๊ด€์ธก์—๋Š” ์žก์Œ์ด ์กด์žฌํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ •ํ™•ํžˆ ๊ต์ฐจํ•˜์ง€ ์•Š๊ณ  ์—‡๊ฐˆ๋ฆฌ๋Š” ๊ฒฝ์šฐ๊ฐ€ ๋งŽ๋‹ค.

 

๋”ฐ๋ผ์„œ ์ด ๋‘ ray๊ฐ€ ๊ฐ€์žฅ ๊ฐ€๊น๊ฒŒ ์ ‘๊ทผํ•˜๋Š” ์ง€์ ์„ ์‚ผ๊ฐ์ธก๋Ÿ‰(triangulation)ํ•˜์—ฌ, 3D ํฌ์ธํŠธ๋ฅผ ์ถ”์ •ํ•œ๋‹ค. ์ด ๊ณผ์ •์€ ๋ชจ๋ธ์˜ 3์ฐจ์› ๊ตฌ์กฐ๋ฅผ ์™„์„ฑํ•˜๊ณ , ์ƒˆ๋กœ์šด ์ด๋ฏธ์ง€ ๋“ฑ๋ก์„ ์œ„ํ•œ ์ถ”๊ฐ€์ ์ธ 2D-3D correspondence๋ฅผ ์ œ๊ณตํ•˜๋Š” ํ•ต์‹ฌ ๋‹จ๊ณ„์ด๋‹ค.

 

2.4. Bundle Adjustment

 

 

Bundle Adjustment globally readjusts and optimizes the poses of all images together with the 3D points. The goal is to minimize the reprojection error, i.e. the difference between where each 3D point projects into an image and where it was actually observed in 2D.
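
To make the reprojection error concrete, the toy example below refines one camera pose and 20 points by minimizing it with a small damped Gauss-Newton (Levenberg-Marquardt-style) loop. Everything here is a simplifying assumption for illustration: two cameras only, the first fixed at the origin, a single known focal length `f`, noise-free observations, a finite-difference Jacobian, and dense linear algebra. Real bundle adjusters use analytic derivatives and exploit the sparsity of the problem.

```python
import numpy as np

def rodrigues(r):
    """Rotation vector -> rotation matrix."""
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    k = r / theta
    Kx = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
    return np.eye(3) + np.sin(theta) * Kx + (1 - np.cos(theta)) * (Kx @ Kx)

def project(X, r, t, f):
    Xc = X @ rodrigues(r).T + t
    return f * Xc[:, :2] / Xc[:, 2:]

def residuals(params, obs1, obs2, f):
    """Stacked reprojection errors; params = [r (3), t (3), points (3N)]."""
    r, t, X = params[:3], params[3:6], params[6:].reshape(-1, 3)
    e1 = project(X, np.zeros(3), np.zeros(3), f) - obs1  # camera 1 is fixed
    e2 = project(X, r, t, f) - obs2
    return np.concatenate([e1.ravel(), e2.ravel()])

def numeric_jacobian(fun, p, eps=1e-6):
    r0 = fun(p)
    J = np.empty((r0.size, p.size))
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = eps
        J[:, i] = (fun(p + step) - r0) / eps
    return J

def levenberg_marquardt(fun, p, n_iters=50, lam=1e-3):
    cost = 0.5 * np.sum(fun(p) ** 2)
    for _ in range(n_iters):
        r, J = fun(p), numeric_jacobian(fun, p)
        H = J.T @ J
        damped = H + lam * np.diag(np.diag(H))
        delta = np.linalg.lstsq(damped, -J.T @ r, rcond=None)[0]
        new_cost = 0.5 * np.sum(fun(p + delta) ** 2)
        if new_cost < cost:   # accept the step, trust the model more
            p, cost, lam = p + delta, new_cost, lam / 3
        else:                 # reject the step, increase damping
            lam *= 10
    return p, cost

rng = np.random.default_rng(0)
f = 800.0
X_true = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))
r_true, t_true = np.array([0.0, 0.1, 0.0]), np.array([-0.5, 0.0, 0.0])
obs1 = project(X_true, np.zeros(3), np.zeros(3), f)
obs2 = project(X_true, r_true, t_true, f)

# Start from a perturbed pose and perturbed points.
p0 = np.concatenate([r_true + 0.01 * rng.standard_normal(3),
                     t_true + 0.05 * rng.standard_normal(3),
                     (X_true + 0.05 * rng.standard_normal((20, 3))).ravel()])
fun = lambda p: residuals(p, obs1, obs2, f)
initial_cost = 0.5 * np.sum(fun(p0) ** 2)
p_opt, final_cost = levenberg_marquardt(fun, p0)
print("cost before:", initial_cost, "cost after:", final_cost)
```

Note that the refined points need not equal `X_true` exactly: the overall scale of a two-view reconstruction is not observable, so only the reprojection error is expected to go down.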

 

The optimization involves complex relationships between many cameras and many points, and it is usually solved with a non-linear optimization algorithm such as Gauss-Newton or Levenberg-Marquardt.

The term "bundle" comes from the bundles of light rays that run from each 3D point toward the cameras observing it. This step is key to improving the accuracy of the whole reconstruction and is usually the most computationally expensive part of the SfM pipeline.

 

 

3. Dense Reconstruction

 

์œ„์˜ SfM ๋‹จ๊ณ„์—์„œ๋Š” ์ฃผ๋กœ ํŠน์ง•์  ๊ธฐ๋ฐ˜์˜ sparse point cloud๋ฅผ ์ƒ์„ฑํ•˜๊ฒŒ ๋œ๋‹ค. ํ•˜์ง€๋งŒ ์‹ค์ œ๋กœ ํ™œ์šฉ ๊ฐ€๋Šฅํ•œ 3D ๋ชจ๋ธ์„ ์–ป๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋” ์ •๋ฐ€ํ•˜๊ณ  ์กฐ๋ฐ€ํ•œ ๊ตฌ์กฐ๊ฐ€ ํ•„์š”ํ•˜๋‹ค. ์ด๋ฅผ ์œ„ํ•œ ๋‹จ๊ณ„๊ฐ€ Dense Reconstruction์ด๋‹ค.

3.1. Depth Map Estimation

๋“ฑ๋ก๋œ ๊ฐ ์ด๋ฏธ์ง€์— ๋Œ€ํ•ด ํ•ด๋‹น ๋ทฐ์—์„œ์˜ depth map (๊นŠ์ด ์ •๋ณด)์„ ์ถ”์ •ํ•œ๋‹ค. ์ด๋Š” stereo matching ๋˜๋Š” multi-view stereo(MVS) ๊ธฐ๋ฒ•์„ ํ†ตํ•ด ์ˆ˜ํ–‰๋œ๋‹ค. ๊ฐ ํ”ฝ์…€์— ๋Œ€ํ•ด ๊ฐ€์žฅ ์ ์ ˆํ•œ depth ๊ฐ’์„ ์˜ˆ์ธกํ•˜์—ฌ, ๊ฐ ์ด๋ฏธ์ง€์—์„œ ๋ณผ ์ˆ˜ ์žˆ๋Š” 3์ฐจ์› ํฌ์ธํŠธ๋ฅผ ์ด˜์ด˜ํžˆ ์ถ”์ •ํ•˜๋Š” ๊ฒƒ์ด๋‹ค.

3.2. Depth Fusion

The depth maps estimated from different images cover overlapping regions. This step registers them against each other to produce a single, consistent 3D point cloud. Various filtering schemes or confidence-based weighting are often used to reduce uncertainty and noise.
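
A minimal sketch of the fusion idea: each depth map is back-projected into world coordinates using its camera pose, and points that land in the same voxel are averaged. Voxel averaging is a simple stand-in chosen for brevity; it is an assumption here, not the consistency-based filtering that production MVS systems apply. The function names, intrinsics, and the planar test scene are hypothetical.

```python
import numpy as np

def backproject(depth, K, R, t):
    """Lift a depth map to world points, assuming x_cam = R @ x_world + t."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    rays = pix @ np.linalg.inv(K).T   # viewing rays with z = 1
    X_cam = rays * depth.reshape(-1, 1)
    return (X_cam - t) @ R            # world = R^T (X_cam - t)

def fuse_voxel(clouds, voxel=0.05):
    """Merge point clouds by averaging all points that share a voxel."""
    pts = np.vstack(clouds)
    keys = np.floor(pts / voxel).astype(np.int64)
    uniq, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.ravel()
    sums = np.zeros((len(uniq), 3))
    np.add.at(sums, inverse, pts)
    counts = np.bincount(inverse, minlength=len(uniq))
    return sums / counts[:, None]

K = np.array([[50.0, 0, 20], [0, 50.0, 15], [0, 0, 1]])
depth = np.full((30, 40), 2.0)  # both views see a fronto-parallel plane at z = 2

cloud1 = backproject(depth, K, np.eye(3), np.zeros(3))
cloud2 = backproject(depth, K, np.eye(3), np.array([-0.1, 0.0, 0.0]))
fused = fuse_voxel([cloud1, cloud2])
print(len(cloud1) + len(cloud2), "points before fusion,", len(fused), "after")
```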

3.3. Surface Reconstruction

Finally, a surface in mesh form is reconstructed from the resulting dense point cloud. Representative methods include Poisson Surface Reconstruction and Marching Cubes, which produce a 3D mesh suitable for actual modeling and visualization.

Dense Reconstruction provides far more detailed structure than the output of SfM alone, which makes it an essential step in fields that need high-precision 3D models (AR/VR, robotics, digital twins, and so on).

๋ฐ˜์‘ํ˜•

'๐Ÿ“– Fundamentals > 3D vision & Graphics' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€

[3D Vision] Marching Cubes: 3D ๋ณผ๋ฅจ ๋ฐ์ดํ„ฐ๋ฅผ Mesh๋กœ ๋ฐ”๊พธ๋Š” ๋ฐฉ๋ฒ•  (0) 2025.03.24
[3D Vision] Point Cloud vs. Mesh: ์ฐจ์ด์ , ๋ณ€ํ™˜ ๋ฐฉ๋ฒ•  (0) 2025.03.12
[3D Vision] 3D ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ(Mesh, Point Cloud)์™€ ํฌ๋งท(OBJ, PLY, PCD)  (0) 2025.03.05
[CV] 3D Geometry ์„ค๋ช…  (0) 2022.04.04
[Graphics] 3D ๋ชจ๋ธ๋ง์„ ์œ„ํ•œ OBJ & MTL ํŒŒ์ผ ๊ตฌ์กฐ์™€ PBR ์žฌ์งˆ ์ •๋ฆฌ  (0) 2022.04.04
'๐Ÿ“– Fundamentals/3D vision & Graphics' ์นดํ…Œ๊ณ ๋ฆฌ์˜ ๋‹ค๋ฅธ ๊ธ€
  • [3D Vision] Point Cloud vs. Mesh: ์ฐจ์ด์ , ๋ณ€ํ™˜ ๋ฐฉ๋ฒ•
  • [3D Vision] 3D ๋ฐ์ดํ„ฐ ๊ตฌ์กฐ(Mesh, Point Cloud)์™€ ํฌ๋งท(OBJ, PLY, PCD)
  • [CV] 3D Geometry ์„ค๋ช…
  • [Graphics] 3D ๋ชจ๋ธ๋ง์„ ์œ„ํ•œ OBJ & MTL ํŒŒ์ผ ๊ตฌ์กฐ์™€ PBR ์žฌ์งˆ ์ •๋ฆฌ
๋ญ…์ฆค
๋ญ…์ฆค
AI ๊ธฐ์ˆ  ๋ธ”๋กœ๊ทธ
    ๋ฐ˜์‘ํ˜•
  • ๋ญ…์ฆค
    CV DOODLE
    ๋ญ…์ฆค
  • ์ „์ฒด
    ์˜ค๋Š˜
    ์–ด์ œ
  • ๊ณต์ง€์‚ฌํ•ญ

    • โœจ About Me
    • ๋ถ„๋ฅ˜ ์ „์ฒด๋ณด๊ธฐ (198)
      • ๐Ÿ“– Fundamentals (33)
        • Computer Vision (9)
        • 3D vision & Graphics (6)
        • AI & ML (15)
        • NLP (2)
        • etc. (1)
      • ๐Ÿ› Research (64)
        • Deep Learning (7)
        • Image Classification (2)
        • Detection & Segmentation (17)
        • OCR (7)
        • Multi-modal (4)
        • Generative AI (6)
        • 3D Vision (2)
        • Material & Texture Recognit.. (8)
        • NLP & LLM (11)
        • etc. (0)
      • ๐ŸŒŸ AI & ML Tech (7)
        • AI & ML ์ธ์‚ฌ์ดํŠธ (7)
      • ๐Ÿ’ป Programming (85)
        • Python (18)
        • Computer Vision (12)
        • LLM (4)
        • AI & ML (17)
        • Database (3)
        • Apache Airflow (6)
        • Docker & Kubernetes (14)
        • ์ฝ”๋”ฉ ํ…Œ์ŠคํŠธ (4)
        • C++ (1)
        • etc. (6)
      • ๐Ÿ’ฌ ETC (3)
        • ์ฑ… ๋ฆฌ๋ทฐ (3)
  • ๋งํฌ

  • ์ธ๊ธฐ ๊ธ€

  • ํƒœ๊ทธ

    ๋”ฅ๋Ÿฌ๋‹
    Text recognition
    ๊ฐ์ฒด๊ฒ€์ถœ
    multi-modal
    object detection
    GPT
    OCR
    CNN
    ํ”„๋กฌํ”„ํŠธ์—”์ง€๋‹ˆ์–ด๋ง
    ๊ฐ์ฒด ๊ฒ€์ถœ
    Python
    ์ปดํ“จํ„ฐ๋น„์ „
    VLP
    Computer Vision
    pytorch
    Image Classification
    AI
    3D Vision
    OpenCV
    OpenAI
    ChatGPT
    deep learning
    nlp
    pandas
    ๋„์ปค
    ํŒŒ์ด์ฌ
    airflow
    segmentation
    material recognition
    LLM
  • ์ตœ๊ทผ ๋Œ“๊ธ€

  • ์ตœ๊ทผ ๊ธ€

  • hELLOยท Designed By์ •์ƒ์šฐ.v4.10.3
๋ญ…์ฆค
[CV] SFM (Structure From Motion) : ์—ฐ์†๋œ 2D ์ด๋ฏธ์ง€๋“ค๋กœ ์นด๋ฉ”๋ผ ํฌ์ฆˆ์™€ 3D shape ์žฌ๊ตฌ์„ฑํ•˜๊ธฐ
์ƒ๋‹จ์œผ๋กœ

ํ‹ฐ์Šคํ† ๋ฆฌํˆด๋ฐ”