
🏛 Research/Deep Learning (9)

[Paper Review] Class-Balanced Loss Based on Effective Number of Samples / A method for overcoming class imbalance This post introduces a paper presented at CVPR 2019 that proposes a way to address the class imbalance problem. The review explains the problem definition and the proposed solution at a conceptual level (details omitted). Class Imbalance? Class imbalance refers to the situation in which the number of samples per class in the training data used to train a deep learning network is not balanced. This happens very often in real-world data, so it can be regarded as an important task. In academia, methods are compared on the class imbalance problem using long-tailed datasets, in which the number of samples ranges from classes with many samples to classes with very few. Common solutions? Clas.. 2022. 5. 21.
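As a rough illustration of the idea behind the Class-Balanced Loss entry above, the sketch below computes per-class weights from the effective number of samples, E_n = (1 - β^n) / (1 - β), the quantity named in the paper's title. The function name, the example class counts, and the choice to normalise the weights so that they sum to the number of classes are assumptions made for this example, not details taken from the review.

```python
def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights proportional to 1 / (effective number of samples)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must be in [0, 1)")
    if any(n <= 0 for n in samples_per_class):
        raise ValueError("every class needs at least one sample")
    # Effective number of samples for a class with n samples:
    #   E_n = (1 - beta^n) / (1 - beta)
    effective = [(1.0 - beta ** n) / (1.0 - beta) for n in samples_per_class]
    raw = [1.0 / e for e in effective]
    # Assumed convention: rescale so the weights sum to the number of classes.
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical long-tailed class counts (head, middle, tail).
counts = [5000, 500, 50]
weights = class_balanced_weights(counts)
print(weights)  # rarer classes receive larger weights
```

With beta = 0 every class gets the same weight (no re-weighting), and as beta approaches 1 the weights approach inverse class frequency, so beta acts as a dial between the two.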
[Paper Review] END-TO-END OPTIMIZED IMAGE COMPRESSION | Deep-learning-based image compression Presented at ICLR 2017, this paper proposes, as the title says, a way to optimize a deep learning image compression model end to end. I do not know this field very well, so the review may be a bit rough (?)... - Basic explanation of video compression: https://mvje.tistory.com/86?category=1033082 (Video Compression - JPEG, MPEG) Abstract Nonlinear analysis transformation, uniform quantizer, no.. 2022. 5. 14.
[Paper Review] Learning Transferable Visual Models From Natural Language Supervision / CLIP / Multi-modal network This post introduces the OpenAI paper (ICML 2021) that proposes Contrastive Language-Image Pre-training (CLIP). Introduction & Motivation Deep learning is used very successfully in almost every area of computer vision, but the current approach has several problems. Existing vision models perform well on the task they were trained for, but applying them to a new task means training them again (which in turn requires a new dataset and additional labeling..), which is a hassle (?). Some models that do well on benchmarks also show poor results on stress tests. As an alternative, methods that train on pairs of raw text and images.. 2022. 2. 26.
[Paper Review] A Simple Framework for Contrastive Learning of Visual Representations / SimCLR / Self-supervised To introduce contrastive learning, a concept that performs well in self-supervised learning, this post explains this paper, published at ICML 2020. The site below explains it well with illustrations, so please refer to it for details. https://amitness.com/2020/03/illustrated-simclr/ Contrastive Learning First, contrastive learning is a training method that, given two inputs fed to the network, teaches it to tell whether they are similar inputs or different inputs. For example, in the figure below, the Image is similar to the cat and different from the dog and the elephant. However, .. 2022. 1. 27.
[Paper Review] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows / An improved form of ViT This post explains the Vision Transformer (ViT), which brings the transformer architecture that drew so much attention in NLP ('Attention Is All You Need / NIPS 2017') to vision tasks, and the Swin Transformer, an architecture that improves on ViT. * Papers A. AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE / ICLR 2021 B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows / ICCV 2021 1. Vision Transformer (ViT) In computer vision, existing self attent.. 2022. 1. 8.
[Paper Review] Non-local Neural Networks / The origin of the Vision Transformer Notes on the non-local network... A CNN can be viewed as a local operator that extracts correlations over local regions of the spatial domain in shallow layers and over relatively global regions in deep layers. Even as the layers get deeper, this still differs from a non-local operation, which extracts correlations over the entire region in a single operation. As a result, a CNN has a structure in which correlations between features that are far apart in the spatial or temporal domain are hard to extract. This paper proposes the non-local operation to improve on this. The figure below shows the non-local block.. 2021. 12. 12.
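To make the contrast with a local operator concrete, here is a minimal sketch of a non-local operation in plain Python: each output position is a weighted sum over all positions, with weights given by a softmax over pairwise dot products. This is a simplified variant written for illustration. It omits the learned embeddings and the residual connection of the full non-local block, and the function name and toy inputs are assumptions of this example.

```python
import math


def non_local(features):
    """y_i = sum_j softmax_j(x_i . x_j) * x_j, taken over ALL positions j."""
    outputs = []
    for x_i in features:
        # Pairwise similarity between position i and every position j.
        scores = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in features]
        # Numerically stable softmax over all positions.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the features of every position, near or far.
        y_i = [
            sum(w * x_j[d] for w, x_j in zip(weights, features))
            for d in range(len(x_i))
        ]
        outputs.append(y_i)
    return outputs


# Toy input: three positions with 2-dimensional features (hypothetical values).
toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = non_local(toy)
print(result)
```

Because every position attends to every other position in one step, the distance between two positions in the input plays no role, which is the property the entry above contrasts with stacked convolutions.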