[Paper Review] Class-Balanced Loss Based on Effective Number of Samples / A method for overcoming class imbalance
·
🏛 Research/Deep Learning
This post introduces a paper published at CVPR 2019 that proposes a method for addressing the class imbalance problem. This review explains the problem definition and the solution conceptually (details omitted). Class Imbalance? The class imbalance problem refers to the situation where the number of samples per class in the training data used to train a deep learning network is not balanced. It happens very frequently in real-world data, so it can be considered an important task. In academia, performance on the class imbalance problem is compared using long-tail datasets, in which classes range from those with many samples to those with very few. Common solutions? Clas..
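As a rough illustration of the idea in the paper's title, here is a minimal sketch that turns per-class sample counts into class-balanced weights using the effective number of samples, (1 - beta^n) / (1 - beta). The excerpt above omits these details, so the function name `class_balanced_weights`, the toy labels, and the choice `beta=0.999` are assumptions made for this example, not the post's own code.

```python
from collections import Counter


def class_balanced_weights(labels, beta=0.999):
    """Return one weight per class, larger for rarer classes (illustrative sketch)."""
    counts = Counter(labels)
    # Effective number of samples for a class with n samples: (1 - beta^n) / (1 - beta).
    effective = {c: (1.0 - beta ** n) / (1.0 - beta) for c, n in counts.items()}
    # Weight each class by the inverse of its effective number.
    raw = {c: 1.0 / e for c, e in effective.items()}
    # Normalize so the weights sum to the number of classes.
    scale = len(raw) / sum(raw.values())
    return {c: w * scale for c, w in raw.items()}


# Hypothetical long-tail label list: one head class and two tail classes.
labels = ["cat"] * 1000 + ["dog"] * 100 + ["fox"] * 10
weights = class_balanced_weights(labels)
for name in ("cat", "dog", "fox"):
    print(name, round(weights[name], 4))
```

The weights would typically multiply the per-sample loss, so that the rare class contributes more per sample than the head class.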
[Paper Review] END-TO-END OPTIMIZED IMAGE COMPRESSION | Deep-learning-based image compression
·
🏛 Research/Deep Learning
As the title suggests, this ICLR 2017 paper proposes a method for optimizing a deep-learning image compression model end to end. I don't know this field all that well, so this review may be a bit sloppy(?)... - Basic explanation of image compression: https://mvje.tistory.com/86?category=1033082 (Image Compression - JPEG, MPEG) Abstract Nonlinear analysis transformation, uniform quantizer, no..
[Paper Review] A Simple Framework for Contrastive Learning of Visual Representations / SimCLR / Self-supervised
·
🏛 Research/Deep Learning
To introduce contrastive learning, a concept that performs well in self-supervised learning, this post explains this paper, published at ICML 2020. The site below explains it well with illustrations, so please refer to it for details. https://amitness.com/2020/03/illustrated-simclr/ Contrastive Learning First, contrastive learning is a training method for distinguishing whether two inputs fed into a network are similar or different. For example, in the figure below, the image is similar to the cat and different from the dog and the elephant. However, ..
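The similar-versus-different idea can be sketched with cosine similarity between embedding vectors and a softmax-style contrastive loss for a single anchor. This is only an illustration: the three-dimensional embeddings, the function names, and the temperature of 0.5 are assumptions for the example, and a real setup would get the embeddings from a trained encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """Low when the anchor is closer to the positive than to the negatives."""
    pos = math.exp(cosine_similarity(anchor, positive) / temperature)
    neg = sum(math.exp(cosine_similarity(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical embeddings: the query image lies close to "cat".
image = [0.9, 0.1, 0.0]
cat = [0.8, 0.2, 0.1]
dog = [0.1, 0.9, 0.2]
elephant = [0.0, 0.2, 0.9]

# Treating the cat as the positive gives a lower loss than treating the dog as one.
good = contrastive_loss(image, cat, [dog, elephant])
bad = contrastive_loss(image, dog, [cat, elephant])
print(round(good, 3), round(bad, 3))
```

Minimizing a loss of this shape pulls the similar pair together and pushes the different pairs apart in embedding space.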
[Paper Review] Swin Transformer: Hierarchical Vision Transformer using Shifted Windows / An advanced form of ViT
·
🏛 Research/Deep Learning
This post explains the Vision Transformer (ViT), which applies the transformer architecture ('Attention Is All You Need' / NIPS 2017) that drew attention in NLP to vision tasks, and the Swin Transformer, an architecture that improves on ViT. * Papers A. AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE / ICLR 2021 B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows / ICCV 2021 1. Vision Transformer (ViT) In computer vision, existing self attent..
[Paper Review] Non-local Neural Networks / The precursor of the Vision Transformer
·
🏛 Research/Deep Learning
A summary of Non-local networks... A CNN can be viewed as a local operator that extracts correlations within local regions of the spatial domain in shallow layers and correlations over relatively global regions in deeper layers. Even as the layers get deeper, though, this differs from a non-local operation, which extracts correlations over the entire region in a single operation. As a result, the structure of a CNN makes it hard to extract correlations between features that are far apart in the spatial or temporal domain. This paper proposes a non-local operation to address this. In the figure below, the non-local block..
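To make the contrast with a local operator concrete, here is a minimal sketch of a non-local operation over a short sequence of feature vectors: every output position is a weighted sum over all positions, with weights from a softmax over pairwise dot products. The function name `non_local` and the toy features are assumptions for illustration, and the learned embeddings and residual connection of the paper's full block are left out.

```python
import math


def non_local(features):
    """For each position, aggregate features from every position in one step."""
    out = []
    for xi in features:
        # Pairwise similarity between this position and all positions.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        # Softmax over all positions (shifted by the max for numerical stability).
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of all feature vectors, regardless of distance.
        yi = [sum(w * xj[d] for w, xj in zip(weights, features))
              for d in range(len(xi))]
        out.append(yi)
    return out


# Hypothetical features: positions 0 and 3 are far apart but carry the same pattern.
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
out = non_local(features)
print([[round(v, 3) for v in row] for row in out])
```

Positions 0 and 3 end up with identical outputs here because the operation weights positions by similarity rather than by distance, which is the property a stack of small convolution kernels only approximates with depth.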
[Paper Review] SHAPE-TEXTURE DEBIASED NEURAL NETWORK TRAINING / The relationship between shape and texture in neural networks
·
🏛 Research/Deep Learning
Published at ICLR 2021, this paper examines the relationship between objects and their shape and texture, and proposes a shape-texture debiased neural network that improves performance on vision tasks such as object recognition by learning from both shape and texture information. Introduction Shape and texture are both important cues for recognizing objects. Earlier object recognition research has already shown that properly combining shape and texture can improve recognition performance. ‘IMAGENET-TRAINED CNNS ARE BIASED TOWARDS TEXTURE; INCREASING SHAPE BIAS IMPROVES A..