
[Paper Review] A 4D Light-Field Dataset and CNN Architectures for Material Recognition

by 뭅즤, 2021. 10. 6.

This paper was published at ECCV 2016 and is the first work to use a 4D light-field dataset for material recognition.

 

Light field?

A light field can be defined by the plenoptic function, which describes light over space and time. A 4D light field keeps only four of its parameters: the direction of the ray (horizontal and vertical angle) and its 2D position (x, y). 4D light-field cameras such as the Lytro Illum place a micro lens array between the main lens and the photo sensor, so that light radiated in several directions from a single point on an object and passing through the main lens is separated by the micro lenses and recorded separately. This can be viewed as a multi-view camera with an extremely short baseline, so many studies have applied 4D light-field cameras to tasks previously done with multi-view cameras, such as depth estimation, 3D reconstruction, and refocusing.

A 4D light-field image has a spatial domain and an angular domain, which is quite confusing when you only read about it in text. The spatial domain is the 2D spatial coordinate system of the image, while the angular domain is the coordinate system of views that look at the same object point from slightly different angles. If a regular camera has a resolution of 100x100, that is its spatial domain. For a light field with a 100x100 spatial domain and a 7x7 angular domain, the raw data image is 700x700 in total, and it can be seen as storing light that arrived from 7*7 different angles. Depending on the task, the raw data (700x700) is used as is, or it is post-processed and split into 7x7 (49) images of 100x100 each.
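The 700x700 versus 49 x (100x100) relationship described above can be sketched with a reshape. This is only an illustration: the function names are hypothetical, and it assumes that each 7x7 block of the raw image holds the angular samples recorded under one micro lens (real camera files need calibration and decoding first).

```python
import numpy as np

def lenslet_to_views(raw, n_ang):
    """Split a raw lenslet image into n_ang x n_ang sub-aperture views.

    Assumption: raw[y * n_ang + v, x * n_ang + u] is the sample seen at
    spatial pixel (y, x) from angular position (v, u).
    """
    ny, nx = raw.shape[0] // n_ang, raw.shape[1] // n_ang
    blocks = raw.reshape(ny, n_ang, nx, n_ang)   # axes: [y, v, x, u]
    return blocks.transpose(1, 3, 0, 2)          # axes: [v, u, y, x]

def views_to_lenslet(views):
    """Inverse of lenslet_to_views: interleave the views back into one image."""
    n_v, n_u, ny, nx = views.shape
    return views.transpose(2, 0, 3, 1).reshape(ny * n_v, nx * n_u)

raw = np.arange(700 * 700, dtype=np.float32).reshape(700, 700)
views = lenslet_to_views(raw, 7)
print(views.shape)  # (7, 7, 100, 100)
```

Both layouts contain exactly the same samples; `views[v, u]` is one 100x100 image seen from angular position (v, u).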

Introduction

This paper aims to improve material recognition by using such a 4D light-field camera. Because a light-field camera can be viewed as a multi-view camera with a short baseline, partial reflectance of a surface can be extracted from a light-field image. Reflectance is a property that is, to some extent, specific to the type of material and its surface, so it can be used for material classification. (Of course, even for the same material, reflectance can vary a lot with surface roughness and the like, but in general it is a feature that helps material classification considerably.)

The paper provides a dataset of 12 material classes with 100 images per class. A patch model is trained on this dataset through patch-wise classification and then transferred to an FCN model to perform semantic segmentation as well. The paper also introduces several CNN architectures for encoding 4D light-field data. Using 4D images instead of 2D images improves material classification performance by about 6-7%.

 

Light-field material dataset

The dataset has a spatial resolution of 376x541. The camera's angular resolution is 14x14, but only 7x7 views are provided. (By the nature of light-field cameras, illumination drops off considerably toward the outer angles.)
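As a rough sketch of what keeping 7x7 out of 14x14 views could look like, the snippet below crops the central views of a (v, u, y, x) array. The function name, the exact crop offset (14 is even, so there is no unique center), and treating 376 as the number of rows are all assumptions for illustration, not details taken from the dataset.

```python
import numpy as np

def central_views(lf, keep):
    """Keep the central keep x keep angular views of a (V, U, H, W) light field."""
    n_v, n_u = lf.shape[:2]
    v0, u0 = (n_v - keep) // 2, (n_u - keep) // 2   # assumed offset
    return lf[v0:v0 + keep, u0:u0 + keep]

# Placeholder light field with the resolutions quoted above.
lf = np.zeros((14, 14, 376, 541), dtype=np.uint8)
print(central_views(lf, 7).shape)  # (7, 7, 376, 541)
```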

 

Interestingly, unlike other material datasets, this one includes confusable classes that only a light field can separate well. In the paper's example figure, 2D images of paper and sky are hard to tell apart even for a person, but in the light-field image the paper is close to the camera, so the micro lenses produce a grid pattern. Likewise, real fabric and paper printed with a photo of fabric have similar color and texture, but the authors argue that the network can distinguish them because their reflectance differs.

 

CNN architecture for 4D light-fields

 

1) View pool

Each view's features are encoded separately and then max pooled.

This seems to be a method that collects only the features observed most strongly across views and classifies from them, and it is a structure that struggles to encode reflectance.

In fact, because the baseline is extremely short, the per-view features will not differ much, so this is probably not a good approach. (It seems like it could help when there are large angular differences or when the views look at different parts of the object.)
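To make the view-pool idea concrete, here is a minimal sketch. The encoder is a toy stand-in (one linear map plus ReLU) rather than the CNN used in the paper, and all names and sizes are hypothetical. It also illustrates why angular structure is hard to keep: the max over views does not depend on which view a response came from.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(view, weight):
    """Toy stand-in for a shared CNN encoder: flatten, linear map, ReLU."""
    return np.maximum(weight @ view.ravel(), 0.0)

def view_pool(views, weight):
    """Encode every view with the same weights, then max-pool over views."""
    n_v, n_u = views.shape[:2]
    feats = np.stack([encode(views[v, u], weight)
                      for v in range(n_v) for u in range(n_u)])
    return feats.max(axis=0)

views = rng.standard_normal((7, 7, 8, 8))   # 49 tiny 8x8 views
weight = rng.standard_normal((16, 64))      # hypothetical 16-d feature

pooled = view_pool(views, weight)

# Shuffle the angular order of the views and pool again.
shuffled = views.reshape(49, 8, 8)[rng.permutation(49)].reshape(7, 7, 8, 8)
same = np.allclose(pooled, view_pool(shuffled, weight))
print(pooled.shape, same)
```

Because shuffling the views leaves the pooled feature unchanged, how the appearance changes from one angle to the next is not available to the classifier.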

2) Stack

The features extracted from all views are stacked and then passed through convolution filters again to capture the correlation across views.

This is the most brute-force way to encode the correlation across views, but since every view is stacked, the number of conv filter parameters seems likely to grow considerably.
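A quick back-of-the-envelope count shows how much the layer that receives the stacked input grows. The layer sizes here (3x3 kernels, 64 output filters, RGB views stacked channel-wise) are hypothetical values chosen only for illustration, not the paper's configuration.

```python
def conv_params(in_channels, out_channels, kernel):
    """Weights plus biases of one 2D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels

n_views = 7 * 7
single = conv_params(3, 64, 3)              # one RGB view
stacked = conv_params(3 * n_views, 64, 3)   # 49 RGB views stacked channel-wise
print(single, stacked)  # 1792 84736
```

Only the layer that sees the stacked channels grows (here by roughly 47x); later layers keep their size.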

3) EPI

Convolution is performed in the EPI (epipolar plane image) style that is widely used in multi-view work. (Its performance was worse than stack, so I did not look at it closely.)
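For readers unfamiliar with EPIs, the sketch below shows how an epipolar plane image can be sliced out of a light field stored as a (v, u, y, x) array: one angular and one spatial coordinate are fixed, and the remaining angular/spatial pair forms a 2D image. The array layout and function names are assumptions for illustration; how the paper feeds EPIs to the network is not reproduced here.

```python
import numpy as np

def horizontal_epi(lf, v, y):
    """EPI over (u, x): fix the vertical view index v and the image row y."""
    return lf[v, :, y, :]

def vertical_epi(lf, u, x):
    """EPI over (v, y): fix the horizontal view index u and the image column x."""
    return lf[:, u, :, x]

lf = np.random.default_rng(0).random((7, 7, 100, 100))  # axes: (v, u, y, x)
epi_h = horizontal_epi(lf, v=3, y=50)
epi_v = vertical_epi(lf, u=3, x=50)
print(epi_h.shape, epi_v.shape)  # (7, 100) (7, 100)
```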

4) Angular filter

On the raw data image, a 7x7 conv (stride 7), matching the angular resolution, is applied first to capture the correlation across views within a single pixel. The correlation in the spatial domain is examined afterwards.
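Since the 7x7 windows do not overlap at stride 7, this first layer amounts to one dot product per micro-lens block. The following sketch assumes a single-channel raw image in which each 7x7 block holds the angular samples of one spatial pixel. The function name and filter count are hypothetical.

```python
import numpy as np

def angular_filter(raw, kernels):
    """Apply n_ang x n_ang kernels with stride n_ang to a raw lenslet image.

    raw:     (H * n_ang, W * n_ang) lenslet image
    kernels: (K, n_ang, n_ang) filter bank
    returns: (K, H, W), one response per filter and spatial pixel
    """
    n_ang = kernels.shape[1]
    h, w = raw.shape[0] // n_ang, raw.shape[1] // n_ang
    blocks = raw.reshape(h, n_ang, w, n_ang)            # axes: [y, v, x, u]
    return np.einsum("yvxu,kvu->kyx", blocks, kernels)  # sum over (v, u)

rng = np.random.default_rng(0)
raw = rng.random((700, 700))
kernels = rng.standard_normal((4, 7, 7))
out = angular_filter(raw, kernels)
print(out.shape)  # (4, 100, 100)
```

The output has the spatial resolution of a single view, so ordinary spatial convolutions can follow it.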

 

 


5) 4D filter

This is called an interleaved filter. It alternates between a spatial filter, which looks only at spatial-domain correlation, and an angular filter, which looks only at angular-domain correlation, so that features are encoded progressively from low level to high level in both the spatial and angular domains at the same time.
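The alternation can be sketched on a (v, u, y, x) array as below. This is a simplified single-channel illustration with random 3x3 kernels and "valid" padding, not the paper's implementation. All function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_last2(x, kernel):
    """'Valid' 2D cross-correlation over the last two axes of x."""
    windows = sliding_window_view(x, kernel.shape, axis=(-2, -1))
    return np.einsum("...ij,ij->...", windows, kernel)

def spatial_conv(lf, kernel):
    """Filter every view over (y, x); lf has axes (v, u, y, x)."""
    return conv_last2(lf, kernel)

def angular_conv(lf, kernel):
    """Filter every pixel over (v, u) by moving the angular axes last."""
    moved = np.transpose(lf, (2, 3, 0, 1))               # axes: (y, x, v, u)
    return np.transpose(conv_last2(moved, kernel), (2, 3, 0, 1))

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
lf = rng.random((7, 7, 32, 32))

x = lf
for _ in range(2):  # two interleaved spatial/angular stages
    x = relu(spatial_conv(x, rng.standard_normal((3, 3))))
    x = relu(angular_conv(x, rng.standard_normal((3, 3))))
print(x.shape)  # (3, 3, 28, 28)
```

Each stage shrinks both the spatial and the angular extent a little, so both domains are summarized gradually rather than collapsing the angular domain in one step.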

 

The results show that the two methods that look at angular-domain correlation (the angular filter and the 4D filter) are the most effective.

 

Full scene material segmentation

The patch model is transferred to an FCN and fine-tuned so that segmentation of the full scene is performed as well.

 

My Conclusion

The paper shows that the reflectance observable from the small angular differences with which light-field data views a surface can help material classification, and it also provides a light-field material recognition dataset.

As of fall 2021, about two more papers have been published that improve on this paper's classification performance.