[Object Detection] What Is an Anchor Box? | Its Role and Limitations in Object Detection Models

2024. 8. 10. 15:33 · Fundamentals/Computer Vision

An object detection model must predict the location and size of the objects in an image. One of the key concepts used for this is the anchor box.


 

1. What is an anchor box?

 

Anchor boxes are bounding boxes of several sizes and aspect ratios that are predefined at each position in the image. The model predicts the location and class of the actual objects relative to these anchor boxes. Several anchor boxes are placed at every position so that objects of widely varying sizes and shapes can be detected effectively.
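As a minimal sketch of this idea, the snippet below tiles anchors over a small feature map. The function name `generate_anchors`, the stride of 16, and the particular scales and ratios are illustrative assumptions, not values taken from any specific model.

```python
import math

def generate_anchors(feature_h, feature_w, stride, scales, ratios):
    """Return anchors as (x1, y1, x2, y2) tuples in image pixel coordinates."""
    anchors = []
    for row in range(feature_h):
        for col in range(feature_w):
            # Center of this feature-map cell, mapped back to image pixels.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:  # ratio = width / height
                    # Keep the area at scale**2 while changing the shape.
                    w = scale * math.sqrt(ratio)
                    h = scale / math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# Hypothetical settings: a 2x2 feature map, 3 scales and 3 ratios.
anchors = generate_anchors(feature_h=2, feature_w=2, stride=16,
                           scales=[32, 64, 128], ratios=[0.5, 1.0, 2.0])
print(len(anchors))  # 4 positions * 9 anchors each = 36
```

Every position gets the same set of shapes; only the center changes.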

 

2. How anchor boxes work and how they are used

For each anchor box, an object detection model predicts two things.

  • Which class, if any, is present in the box
  • How the box should be adjusted so that it matches the actual object's location

To evaluate how much a predicted bounding box overlaps the ground-truth bounding box, the IoU (Intersection over Union) metric is used: the area of the intersection of the two boxes divided by the area of their union.
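IoU for two axis-aligned boxes can be sketched as follows. The `(x1, y1, x2, y2)` corner format is an assumption made for this example; some frameworks use center/width/height instead.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, about 0.143
```

An IoU of 1 means the boxes coincide, and 0 means they do not overlap at all.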

The training process, by example

  1. Generate the predefined anchor boxes
    Anchor boxes of various sizes and aspect ratios are defined in advance for each position in the image. For example, using 3 sizes and 3 aspect ratios produces 9 anchor boxes per position.
  2. Match them to the actual objects
    During training, the IoU between each anchor box and each ground-truth bounding box is computed, and the anchor box with the highest IoU is assigned to that object. Many detectors also treat any anchor whose IoU exceeds a threshold as a positive match.
  3. Learn the class and the location
    A matched anchor box is trained to predict the class of its object and, at the same time, to learn how to adjust its bounding box coordinates to fit the object's actual location.
  4. Compute the loss function
    The loss is computed from how far the predicted class and location are from the ground truth. Typically, Cross-Entropy Loss is used for class prediction, and L1 Loss (or Smooth L1 Loss) or GIoU Loss for bounding box prediction.
  5. Update the model
    The model's parameters are updated according to the loss, and repeating this process makes the predictions progressively more accurate.
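Steps 2 to 4 above can be sketched for a handful of boxes as follows. This is an illustration under stated assumptions, not any particular model's implementation: the helper names (`match_anchors`, `encode`, `detection_loss`), the 0.5 threshold, and the center/log-size offset encoding (the parameterization popularized by the R-CNN family) are choices made for the example.

```python
import math

def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    inter = (max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
             * max(0.0, min(a[3], b[3]) - max(a[1], b[1])))
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def match_anchors(anchors, gt_boxes, pos_thresh=0.5):
    """Per anchor, return the index of its matched ground-truth box, or -1."""
    matches = [-1] * len(anchors)
    if not gt_boxes:
        return matches
    for i, anchor in enumerate(anchors):
        overlaps = [iou(anchor, gt) for gt in gt_boxes]
        best = max(range(len(gt_boxes)), key=lambda j: overlaps[j])
        if overlaps[best] >= pos_thresh:
            matches[i] = best
    # Make sure every ground-truth box keeps its highest-IoU anchor.
    for j, gt in enumerate(gt_boxes):
        best_anchor = max(range(len(anchors)), key=lambda i: iou(anchors[i], gt))
        matches[best_anchor] = j
    return matches

def encode(anchor, gt):
    """Regression target (tx, ty, tw, th) that moves `anchor` onto `gt`."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    ax, ay = anchor[0] + aw / 2, anchor[1] + ah / 2
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    gx, gy = gt[0] + gw / 2, gt[1] + gh / 2
    return ((gx - ax) / aw, (gy - ay) / ah, math.log(gw / aw), math.log(gh / ah))

def detection_loss(class_probs, target_class, pred_deltas, target_deltas):
    """Cross-entropy on the class plus L1 on the box offsets, for one anchor."""
    cls_loss = -math.log(class_probs[target_class])
    box_loss = sum(abs(p - t) for p, t in zip(pred_deltas, target_deltas))
    return cls_loss + box_loss

anchors = [(0, 0, 10, 10), (20, 20, 40, 40), (100, 100, 120, 120)]
gt_boxes = [(22, 18, 42, 38)]
matches = match_anchors(anchors, gt_boxes)
target = encode(anchors[1], gt_boxes[0])
loss = detection_loss([0.1, 0.9], 1, (0.0, 0.0, 0.0, 0.0), target)
print(matches, target, loss)
```

Only the second anchor overlaps the object enough to be matched; the other two stay at -1 and would be trained as background.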

 

3. Advantages and disadvantages of anchor boxes

Advantages

  • They handle objects of various sizes well.
  • Multi-scale prediction can improve detection performance.

Disadvantages

  • Configuration is complex.
    Suitable sizes, aspect ratios, and counts have to be chosen, and these must be tuned experimentally.
  • The computational cost is high.
    Several anchor boxes have to be processed at every position, so both the amount of computation and the memory usage are large.
  • Post-processing is required.
    A post-processing step such as NMS (Non-Maximum Suppression) is needed to remove duplicate predictions.
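The post-processing step mentioned above can be sketched as a greedy NMS. The 0.5 threshold and the sample boxes are illustrative assumptions; production code would normally call an optimized library routine instead.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    inter = (max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
             * max(0.0, min(a[3], b[3]) - max(a[1], b[1])))
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: return the indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        current = order.pop(0)  # highest-scoring box still in play
        keep.append(current)
        # Drop every remaining box that overlaps the kept one too much.
        order = [i for i in order if iou(boxes[current], boxes[i]) <= iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]: the second box duplicates the first and is suppressed
```

In a multi-class detector this is typically applied per class, so that overlapping boxes of different classes do not suppress each other.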

 

4. Representative object detection models that use anchor boxes

  • Faster R-CNN
    The Region Proposal Network (RPN) uses anchor boxes to generate candidate regions, and objects are detected based on those regions.
  • YOLOv3
    Objects are predicted in each grid cell of the image relative to predefined anchor boxes.
  • RetinaNet
    Predictions are made from anchor boxes of various sizes and aspect ratios. Focal Loss addresses the class imbalance between the few anchors that contain objects and the many background anchors, and the feature pyramid helps with objects of different sizes, including small ones.
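To make the Focal Loss idea concrete, here is a minimal per-anchor sketch of the binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t). The defaults alpha=0.25 and gamma=2 are the commonly cited RetinaNet settings, used here as assumptions; the function name and sample probabilities are illustrative.

```python
import math

def focal_loss(p, target, alpha=0.25, gamma=2.0):
    """Binary focal loss for one anchor; p is the predicted object probability."""
    # p_t is the probability the model assigned to the correct answer.
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of examples that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

easy_background = focal_loss(0.01, target=0)  # confidently correct
hard_background = focal_loss(0.90, target=0)  # confidently wrong
print(easy_background, hard_background)
```

The easy background anchor contributes almost nothing, so the thousands of trivially negative anchors no longer dominate the total loss. With gamma=0 the expression reduces to alpha-weighted cross-entropy.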

 

More recently, models that do not use anchor boxes at all, such as DETR (Detection Transformer), have also appeared. DETR offers a different approach that predicts objects end-to-end, without hand-tuned anchor settings or NMS.

 

 

[Object Detection] Mastering the DETR Model | Object Detection | Studying Object Detection Trends

mvje.tistory.com

 
