[Meta AI] How to Use SAM (Segment Anything Model) | A Vision AI Model That Segments Every Object

2023. 4. 21. 18:09 · 💻 Programming/Computer Vision
SAM (Segment Anything Model)

 
Meta has released SAM (Segment Anything Model), a model that can segment anything. The paper itself is titled 'Segment Anything', which is remarkably confident wording.
 
According to the brief description, SAM produces high-quality object masks from input prompts such as points or boxes, and it can also be used to generate masks for all objects in an image. It was trained on a dataset of about 11 million images and 1.1 billion masks, and it is reported to show strong zero-shot performance on a variety of segmentation tasks.
 
 
Segment Anything Web Demo

 

Segment Anything

Meta AI Computer Vision Research

segment-anything.com

 


 
Meta AI provides a web demo of SAM, so you can try the model on a variety of sample images or on images you upload. As mentioned above, you can segment an object by clicking a point or drawing a box on the image, or run segmentation over the whole image.
 
Using SAM from code allows a much wider range of applications, so this post walks through a tutorial on Google Colab. (tutorial code)
 

SAM Tutorial #1 - Automatic Segmentation

 
The first tutorial segments objects automatically, without any point or box input. Even in automatic mode, various parameters can be adjusted to change the level of detail.


 

  • First, set up the environment needed to run SAM
  • Declare a function for displaying the output masks
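The tutorial code for these two steps is not reproduced in this post, so the following is only a minimal sketch. It assumes the official `segment-anything` package and its ViT-H checkpoint for the setup, and `overlay_masks` is a hypothetical NumPy-only stand-in for the display helper, not the tutorial's own function.

```python
# Colab setup, run in notebook cells (assumes the official repository and checkpoint):
#   !pip install git+https://github.com/facebookresearch/segment-anything.git
#   !wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
import numpy as np


def overlay_masks(image, anns, alpha=0.35, seed=0):
    """Blend each mask onto a copy of an RGB uint8 image using a random color.

    `anns` is a list of dicts that each hold a boolean HxW array under
    "segmentation" and a number under "area", which is the record layout
    produced by SAM's automatic mask generator.
    """
    rng = np.random.default_rng(seed)
    out = image.astype(np.float32).copy()
    # Draw large masks first so that small ones stay visible on top.
    for ann in sorted(anns, key=lambda a: a["area"], reverse=True):
        mask = ann["segmentation"].astype(bool)
        color = rng.integers(0, 256, size=3).astype(np.float32)
        out[mask] = (1.0 - alpha) * out[mask] + alpha * color
    return out.clip(0, 255).astype(np.uint8)
```

The result is an ordinary image array, so it can be shown with something like `plt.imshow(overlay_masks(image, masks))`.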

 

  • You can use the downloaded 'dog.jpg', or upload your own image as in the example
  • An uploaded image arrives as byte data and must be converted (see the tutorial code)
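The byte conversion mentioned above might look roughly like this. It is a sketch that assumes the Colab upload helper `google.colab.files.upload()`, which returns a `{filename: bytes}` dict, and OpenCV for decoding; the function names `first_upload` and `bytes_to_rgb` are made up for illustration.

```python
import numpy as np


def first_upload(uploaded):
    """Return (filename, raw bytes) for the first entry of an upload dict."""
    if not uploaded:
        raise ValueError("no file was uploaded")
    name = next(iter(uploaded))
    return name, uploaded[name]


def bytes_to_rgb(data):
    """Decode encoded image bytes (JPEG, PNG, ...) into an HxWx3 RGB uint8 array."""
    import cv2  # imported lazily; assumes opencv-python is installed

    buffer = np.frombuffer(data, dtype=np.uint8)
    bgr = cv2.imdecode(buffer, cv2.IMREAD_COLOR)
    if bgr is None:
        raise ValueError("bytes could not be decoded as an image")
    # OpenCV decodes to BGR channel order, while SAM is given RGB images.
    return cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)


# In a Colab cell (not executed here):
#   from google.colab import files
#   _, data = first_upload(files.upload())
#   image = bytes_to_rgb(data)
```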

 

 

  • Segment objects from the image input alone
  • Run with the default parameter values
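As a rough sketch of this step, the `segment-anything` package exposes `SamAutomaticMaskGenerator`, whose `generate` method takes an RGB image and returns one dict per mask (with keys such as `segmentation`, `area`, `bbox`, `predicted_iou` and `stability_score`). The checkpoint path, model type and GPU device below are assumptions, and `summarize` is a small hypothetical helper for inspecting the result.

```python
def generate_masks(image, checkpoint="sam_vit_h_4b8939.pth", model_type="vit_h", device="cuda"):
    """Run SAM's automatic mask generator with its default parameters.

    `image` is an HxWx3 RGB uint8 array. The checkpoint path and device
    are assumptions; adjust them to the runtime in use.
    """
    # Lazy import: requires the segment-anything package and PyTorch.
    from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint)
    sam.to(device=device)
    return SamAutomaticMaskGenerator(sam).generate(image)


def summarize(masks, top_k=3):
    """Return (number of masks, areas of the top_k largest masks)."""
    areas = sorted((m["area"] for m in masks), reverse=True)
    return len(masks), areas[:top_k]
```

Calling `summarize(generate_masks(image))` gives a quick feel for how many segments were found and how large the biggest ones are before plotting them.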

 

  • Segment objects after changing the mask generator's parameters
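The post does not list which parameters were changed, so the values below are an assumption borrowed from the official example notebook and should be treated as starting points only. The keyword names are real `SamAutomaticMaskGenerator` arguments; `merged_params` and `build_generator` are hypothetical helpers.

```python
# Starting values similar to the official example notebook, not recommendations.
TUNED_PARAMS = {
    "points_per_side": 32,                # density of the sampled point-prompt grid
    "pred_iou_thresh": 0.86,              # drop masks with low predicted quality
    "stability_score_thresh": 0.92,       # drop masks that are unstable under thresholding
    "crop_n_layers": 1,                   # additionally run on crops of the image
    "crop_n_points_downscale_factor": 2,  # fewer points per side on each crop layer
    "min_mask_region_area": 100,          # remove tiny regions and holes (needs OpenCV)
}


def merged_params(overrides=None):
    """Return TUNED_PARAMS with any user overrides applied, without mutating it."""
    return {**TUNED_PARAMS, **(overrides or {})}


def build_generator(sam, overrides=None):
    """Create a mask generator for an already loaded SAM model."""
    # Lazy import: requires the segment-anything package.
    from segment_anything import SamAutomaticMaskGenerator

    return SamAutomaticMaskGenerator(model=sam, **merged_params(overrides))
```

Lower thresholds and extra crop layers generally yield more, finer masks at the cost of a longer run time.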

 
 

SAM Tutorial #2 - Selective Segmentation

The second tutorial segments objects from point or box inputs. Compared with the automatic method, it is easier to get the result you want, and combining boxes and points allows fairly fine-grained segmentation (e.g. segmenting only the tire of a car wheel, excluding the rim).
 


*ํ™˜๊ฒฝ ์„ธํŒ…์€ ์ƒ๋žต

  • You can upload any image you like, or use the downloaded 'truck.jpg' image

 

 

  • To segment an object with a point, set a coordinate at a specific location in the image
  • In the example, the point is placed on the truck's window

 

  • Setting multimask_output = True returns three masks, which form a hierarchy of segmentations
  • In the example they cover the smallest single window, the connected windows, and the whole vehicle
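The point-prompt steps above can be sketched with the package's `SamPredictor` class. The checkpoint path, model type and device are assumptions, and `load_predictor` and `predict_from_point` are hypothetical wrappers around the real `set_image` and `predict` calls.

```python
import numpy as np


def load_predictor(image, checkpoint="sam_vit_h_4b8939.pth", model_type="vit_h", device="cuda"):
    """Load SAM and compute the embedding of an HxWx3 RGB uint8 image."""
    # Lazy import: requires the segment-anything package and PyTorch.
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint)
    sam.to(device=device)
    predictor = SamPredictor(sam)
    predictor.set_image(image)
    return predictor


def predict_from_point(predictor, x, y):
    """Prompt SAM with one foreground point and return candidates, best score first."""
    point_coords = np.array([[x, y]], dtype=np.float32)  # shape (1, 2), pixel (x, y)
    point_labels = np.array([1])                         # 1 marks a foreground point
    masks, scores, _logits = predictor.predict(
        point_coords=point_coords,
        point_labels=point_labels,
        multimask_output=True,  # ask for three candidate masks
    )
    order = np.argsort(scores)[::-1]
    return masks[order], scores[order]
```

For instance, `predict_from_point(predictor, 500, 375)` would prompt with a single point; the coordinates are illustrative and need to be chosen per image.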

 

  • With two or more points, you can segment the object that the points share

 

  • Setting each point's input_label (1 to include, 0 to exclude) lets you include some points in the segment and exclude others
  • In the example, the first point (truck window) is included and the second point (truck front door) is excluded
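A minimal sketch of multi-point prompts with include and exclude labels follows. `make_point_prompt` and `predict_single_mask` are hypothetical helpers, and `predictor` is assumed to be a `SamPredictor` on which `set_image` has already been called.

```python
import numpy as np


def make_point_prompt(include, exclude=()):
    """Build (point_coords, point_labels) arrays from lists of (x, y) tuples.

    Points in `include` get label 1 (foreground); points in `exclude`
    get label 0 (background).
    """
    include, exclude = list(include), list(exclude)
    points = include + exclude
    if not points:
        raise ValueError("at least one point is required")
    coords = np.array(points, dtype=np.float32).reshape(-1, 2)
    labels = np.array([1] * len(include) + [0] * len(exclude), dtype=np.int64)
    return coords, labels


def predict_single_mask(predictor, include, exclude=()):
    """Return the single mask (and its score) for a set of labeled points."""
    coords, labels = make_point_prompt(include, exclude)
    masks, scores, _logits = predictor.predict(
        point_coords=coords,
        point_labels=labels,
        multimask_output=False,  # several points are usually unambiguous
    )
    return masks[0], float(scores[0])
```

Passing both points in `include` asks for the object they share, while moving the second point to `exclude` asks for the first object without the second region.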

 

  • A box can also be used to segment the object inside it

 

  • Segment with a combination of a box and points
  • With point_labels set accordingly, you can extract a segment that excludes the pointed-at region inside the box
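Both the box-only prompt and the box-plus-point prompt can be sketched with one hypothetical helper, `predict_in_box`. It assumes a `SamPredictor` with the image already set and boxes given in XYXY pixel coordinates, which is the format `predict` expects for its `box` argument.

```python
import numpy as np


def predict_in_box(predictor, box_xyxy, exclude_points=()):
    """Segment the object inside a box, optionally carving out background points.

    `box_xyxy` is (x_min, y_min, x_max, y_max) in pixels; each entry of
    `exclude_points` is an (x, y) location that should not be in the mask.
    """
    box = np.asarray(box_xyxy, dtype=np.float32)
    if box.shape != (4,) or box[0] >= box[2] or box[1] >= box[3]:
        raise ValueError("box must be (x_min, y_min, x_max, y_max) with positive size")
    exclude_points = list(exclude_points)
    coords = labels = None
    if exclude_points:
        coords = np.array(exclude_points, dtype=np.float32).reshape(-1, 2)
        labels = np.zeros(len(exclude_points), dtype=np.int64)  # 0 marks background
    masks, _scores, _logits = predictor.predict(
        point_coords=coords,
        point_labels=labels,
        box=box[None, :],  # shape (1, 4)
        multimask_output=False,
    )
    return masks[0]
```

With a box drawn around a wheel and one excluded point on the rim, this is the kind of call that would yield a tire-only segment.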

 

 

  • ํ•˜๋‚˜์˜ ์ด๋ฏธ์ง€์™€ ๋‹ค์ค‘ ์ž…๋ ฅ ํ”„๋กฌํ”„ํŠธ๋กœ ์—ฌ๋Ÿฌ segment ์ถ”์ถœ ๊ฐ€๋Šฅ