Research (51)

VAE (Variational Autoencoder) Explained | VAE PyTorch Code Example
VAE (Variational Autoencoder) is a generative model, a neural network architecture used mainly for dimensionality reduction and generation tasks. A VAE learns the latent variables of the data and uses them to generate new data, and it is widely used in applications such as image and speech generation. A VAE consists of two main parts, an encoder and a decoder. It is easily confused with the autoencoder: an autoencoder aims to produce a latent variable z from which the input can be reconstructed exactly, so its main purpose is to train the encoder, whereas a VAE extracts a latent vector that represents the input x well, and through it the input ..
2024. 1. 6.
[Paper Review] End-to-End Object Detection with Transformers | DETR Explained
Today I review DETR (ECCV 2020), the model Meta released in 2020. It has close to 9,000 citations, and DETR-based work shows up regularly among recent object detection papers: Deformable DETR, Conditional DETR, Group DETR, Co-DETR, ... DETR (DEtection TRansformer): DETR introduces a new detection approach based on a transformer and bipartite matching, and is described as a model architecture that needs no hand-crafted engineering such as RPN or NMS. It is structurally very simple, extends well to other tasks, and because it uses the attention mechanism, its ability to detect large objects is Faste..
2023. 11. 25.
[NLP] BERT Briefly Explained | Bi-Directional LM | Bidirectional Language Model
BERT (Bidirectional Encoder Representations from Transformers): BERT is one of the groundbreaking models in natural language processing (NLP), developed by Google and released in 2018. It outperformed earlier NLP models and achieved top results on a wide range of NLP tasks. It drew particular attention as a general-purpose model: a pre-trained language model that can be applied to other NLP tasks. The paper title is given below; with about 80,000 citations (as of September 2023), it can now be called truly foundational work in the LM field. paper : BERT: Pre-training of Deep Bidirectional Transformers for Languag..
2023. 9. 25.
[Tutorial] Introduction to Image Classification Example Code | Image Classification | PyTorch
Image classification is one of the simplest examples in computer vision and deep learning, and many examples are publicly available, typically a digit classifier on the MNIST dataset or training on a small dataset such as CIFAR-10. Unless the task is very complex, training something on the order of a ResNet already gives quite decent performance, so it is convenient to set up training and evaluation code once and reuse it, swapping only the dataset and the model. *For complex tasks, porting a new model into your own training and evaluation code is not easy. In most such cases it is easier to use the official repository provided by the authors of the paper (study) as-is. For writing image classification training and evaluation code, useful sites and reposito..
2023. 8. 11.
[Paper Review] NeRF Briefly Explained & Understanding How It Works | A Technique for Generating Views from New Directions
- paper : NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis / ECCV2020
It has been quite a while since the NeRF paper came out. When it was presented at ECCV 2020 it drew attention as a remarkable, groundbreaking view synthesis method, but several drawbacks made it hard to apply in real services. At CVPR 2023, however, use of the word "radiance" rose 80% compared with 2022 and "NeRF" rose 39%, which shows how actively NeRF is being studied. In particular, research has now moved beyond proof of concept to view editing and various applications. In other words, NeRF is now about ready to be used in various services..
2023. 8. 10.
[Paper Review] Fast Segment Anything | Fast SAM | A Lightweight SAM
SAM (Segment Anything Model) explanation and usage: [Meta AI] How to Use SAM (Segment Anything Model) | A Vision AI Model That Segments Every Object. SAM (Segment Anything Model): Meta has released SAM (Segment Anything Model), a model that can segment anything. The paper title itself is 'Segment Anything', which is very confident wording. A brief explanation .. mvje.tistory.com
It has not been long since Meta AI's Segment Anything Model (SAM) was released, and already Fast SAM, a faster version of SAM, is out. Innovative AI models from big tech companies are continuously..
2023. 7. 2.