
🏛 Research (58)

[Paper Review] SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers. This paper, published at NeurIPS 2021, proposes SegFormer, a simple yet powerful Transformer for semantic segmentation. Abstract: The paper proposes SegFormer, a simple, efficient, and powerful semantic segmentation framework. SegFormer 1) consists of a novel hierarchically structured Transformer encoder that extracts multi-scale features. It needs no positional encoding, so when the test-image resolution differs from the training-image resolution, the performance-degrading positiona.. 2022. 8. 9.
[Paper Review] Deep Learning for Large-Scale Traffic-Sign Detection and Recognition / Traffic Sign Detection. This post introduces two papers on traffic sign detection: Traffic-Sign Detection and Classification in the Wild / CVPR 2016; Deep Learning for Large-Scale Traffic-Sign Detection and Recognition / IEEE T-ITS 2019. Traffic sign detection can be viewed as a subtask of object detection, and it is essential for autonomous driving and for generating road information. I had been curious about methods for detecting very small objects, and the traffic sign detection papers seem to help with that. "Traffic-Sign De.. 2022. 7. 8.
[Paper Review] Class-Balanced Loss Based on Effective Number of Samples / A Method for Overcoming Class Imbalance. This post introduces a CVPR 2019 paper that proposes a method for addressing the class imbalance problem. The review explains the problem definition and the solution conceptually (details omitted). Class imbalance? Class imbalance refers to the situation in which the number of samples per class in the training data used to train a deep learning network is not balanced. It occurs very often in real-world data, which makes it an important problem. Academic work compares performance under class imbalance on long-tail data, that is, datasets whose classes range from those with many samples to those with very few. Common solutions? Clas.. 2022. 5. 21.
[Paper Review] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation / DeepLab v3+ / Fundamentals of Semantic Segmentation. If object detection has YOLO, then DeepLab seems to be the really famous name in segmentation. This paper was presented at ECCV 2018 and proposes DeepLabV3+. It teaches the important elements of segmentation, is still widely used for baseline experiments, and is a network I used in my own research, so I want to write it up here. Abstract: Spatial pyramid pooling modules or encoder-decoder structures are used in deep neural networks for semantic segmentation. The former, by applying filters or pooling at multiple effective FoVs to the incoming features, multi-sca.. 2022. 5. 15.
[Paper Review] END-TO-END OPTIMIZED IMAGE COMPRESSION | Deep-Learning-Based Image Compression. Presented at ICLR 2017, this paper proposes, as the title says, a method for optimizing a deep learning image compression model end to end. I do not know this field very well, so the review may be a bit sloppy(?)... - Basic explanation of image compression: https://mvje.tistory.com/86?category=1033082 (Image and Video Compression - JPEG, MPEG). Abstract: Nonlinear analysis transformation, uniform quantizer, no.. 2022. 5. 14.
[Paper Review] MVSNet: Depth Inference for Unstructured Multi-view Stereo / A Deep-Learning Method for Multi-view Stereo Reconstruction. I am introducing this paper because it is the first work to propose a learning-based method using a CNN architecture, rather than a traditional method, for the multi-view stereo reconstruction task that takes an arbitrary number N of views as input. Many networks now outperform the MVSNet proposed in this paper, but many studies still build on its ideas. (As of early 2022, the SoTA is the Transformer-based TransMVSNet.) Abstract: This paper introduces an end-to-end deep learning architecture for depth map inference from multi-view images. After the network extracts features from the multi-view images, differentiable homography warping.. 2022. 3. 29.