[ONNX] Accelerating Deep Learning Models on CPU with ONNX Runtime

2023. 11. 16. 20:48 · 💻 Programming/AI & ML

These days it is hard to run most deep learning models without a GPU, but surprisingly, lightweight models run fine on a CPU alone. When that is possible it also cuts cloud costs, so unless you need online (real-time) prediction, running inference on CPU is worth considering.

 

Of course, deep learning inference on a CPU is quite slow. Converting the model to ONNX and running inference with ONNX Runtime can recover at least some of that speed. Another advantage of ONNX export is that the input tensor size can stay dynamic just by declaring the dynamic axes at export time, whereas TensorRT needs explicit optimization profiles to handle dynamic shapes.

 

Keep in mind that the size of the speed-up depends on the hardware, the model, and the input tensor size, and in some cases inference can even get slower, so you have to measure it yourself.
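One way to run that kind of measurement is a small timing helper that warms up first and then reports a few statistics instead of a single average. This is only a sketch using the standard library: the names `benchmark` and `speedup` are hypothetical, and the two lambdas are stand-in workloads, not real models.

```python
import statistics
import time


def benchmark(fn, warmup=5, repetitions=10):
    """Call fn() repeatedly and return latency statistics in seconds."""
    # Warm-up calls are not timed; they absorb one-off costs such as
    # lazy initialization and cache population.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(samples),
        "median": statistics.median(samples),
        "min": min(samples),
    }


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


# Stand-in workloads (assumptions for illustration only).
baseline = benchmark(lambda: sum(i * i for i in range(200_000)))
candidate = benchmark(lambda: sum(i * i for i in range(100_000)))
print(f"speed-up: {speedup(baseline['median'], candidate['median']):.2f}x")
```

With real models, `fn` would wrap one forward pass, for example a PyTorch call inside `torch.no_grad()` for the baseline and an ONNX Runtime `session.run(...)` call for the candidate. Comparing medians rather than means makes the result less sensitive to an occasional slow run.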

 

In a quick test with ResNet I saw a speed-up of roughly 1.5 to 1.7x, and the CNN-based detector I currently use sped up by a similar margin.

 

If you do not need very high speed and the model is reasonably light, deploying it on CPU with ONNX Runtime seems like an option well worth considering.

 

ONNX Runtime example code

import time

import onnx
import onnxruntime as ort
import torch
import torchvision
from onnx import shape_inference


# Load the PyTorch model (random weights are enough for a timing test).
# torchvision >= 0.13 uses weights=None; older versions use pretrained=False.
torch_model = torchvision.models.resnet18(weights=None)
torch_model.eval()

# Create a dummy input
dummy_input = torch.randn(1, 3, 500, 500)

repetitions = 10

# Warm up, then time PyTorch inference with autograd disabled
with torch.no_grad():
    for _ in range(5):
        _ = torch_model(dummy_input)

    start = time.perf_counter()
    for rep in range(repetitions):
        torch_out = torch_model(dummy_input)
    end = time.perf_counter()

print('Average PyTorch inference time (s): ', (end - start) / repetitions)


# Export the model to ONNX
torch.onnx.export(torch_model,               # model to run
                  dummy_input,               # model input (a tuple works for multiple inputs)
                  "test_resnet18.onnx",      # where to save (file path or file-like object)
                  export_params=True,        # store the trained weights in the model file
                  opset_version=10,          # ONNX opset version used for the export
                  do_constant_folding=True,  # apply constant folding as an optimization
                  input_names=['input'],     # name of the model input
                  output_names=['output'],   # name of the model output
                  dynamic_axes={'input': {0: 'batch_size', 2: 'height', 3: 'width'}},  # variable-length dimensions
                  )


path = "./test_resnet18.onnx"

# Annotate intermediate tensor shapes, validate the graph, and save it back
onnx_model = shape_inference.infer_shapes(onnx.load(path))
onnx.checker.check_model(onnx_model)
onnx.save(onnx_model, path)

# Open an ONNX Runtime session (CPU execution provider)
ort_session = ort.InferenceSession(path, providers=['CPUExecutionProvider'])
print(ort.get_device())  # device reported by the installed onnxruntime build

# Run inference
ort_inputs = {ort_session.get_inputs()[0].name: dummy_input.numpy()}

# Warm up, then time ONNX Runtime inference
for _ in range(5):
    _ = ort_session.run(None, ort_inputs)

start = time.perf_counter()
for rep in range(repetitions):
    ort_outputs = ort_session.run(None, ort_inputs)
end = time.perf_counter()

print('Average ONNX Runtime inference time (s): ', (end - start) / repetitions)
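
The script above only compares speed. Before deploying, it is also worth confirming that the ONNX Runtime output still matches the PyTorch output. The following is a minimal sketch, assuming both outputs are available as NumPy arrays. The helper name `check_outputs_close` and the tolerances are assumptions chosen for illustration, and the arrays are stand-ins rather than real model outputs.

```python
import numpy as np


def check_outputs_close(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return the largest absolute difference; raise if the outputs disagree."""
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Raises AssertionError when any element falls outside the tolerances.
    np.testing.assert_allclose(candidate, reference, rtol=rtol, atol=atol)
    return float(np.max(np.abs(reference - candidate)))


# Stand-in arrays shaped like ResNet-18 logits (1 x 1000); not real outputs.
rng = np.random.default_rng(0)
reference = rng.standard_normal((1, 1000)).astype(np.float32)
candidate = reference + np.float32(1e-6)
print("max abs diff:", check_outputs_close(reference, candidate))
```

In the script above, the equivalent call would be `check_outputs_close(torch_out.detach().numpy(), ort_outputs[0])`. The tolerances may need loosening for larger models, since the exported graph can fuse or reorder floating-point operations.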