
[python] How to use multiprocessing Pool, with code examples | multiprocessing.Pool | Speeding up Python

by 뭅즀 2024. 1. 7.

Multiprocessing creates several independent processes and has them work in parallel. Each process has its own memory space, so processes do not see each other's variables; they exchange data through inter-process communication (IPC) mechanisms.
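As a minimal sketch of the two points above (separate memory, data exchange over IPC), the snippet below changes a module-level variable inside a child process and sends the new value back through a `multiprocessing.Queue`. The names `counter`, `bump_and_report`, and `run_demo` are made up for this illustration.

```python
from multiprocessing import Process, Queue

counter = 0  # module-level state; every process works on its own copy


def bump_and_report(queue):
    """Runs in the child: modify the child's copy, then send it to the parent."""
    global counter
    counter += 100
    queue.put(counter)  # IPC: the value is pickled and passed through the queue


def run_demo():
    queue = Queue()
    child = Process(target=bump_and_report, args=(queue,))
    child.start()
    value_from_child = queue.get()  # blocks until the child has sent its value
    child.join()
    return value_from_child, counter


if __name__ == "__main__":
    child_value, parent_value = run_demo()
    print("child saw:", child_value)     # 100
    print("parent sees:", parent_value)  # 0, the parent's copy is untouched
```

The parent only learns about the child's result because it was sent explicitly; assigning to `counter` in the child has no effect on the parent.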

 

The main module for multiprocessing in Python is multiprocessing. It provides tools for running work in parallel across multiple processes. Unlike the threading module, it is not limited by the Global Interpreter Lock (GIL), because each process runs its own interpreter with its own GIL, which makes it effective for CPU-bound work.
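One way to see the difference is to compare process IDs. The sketch below assumes a standard CPython build and uses `multiprocessing.pool.ThreadPool` as the thread-based counterpart of `Pool`; `worker_pid` and `pids_seen` are illustrative names. Threads all run inside the parent process and share its interpreter, while pool workers are separate processes.

```python
import os
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def worker_pid(_):
    # Report which OS process executed this task
    return os.getpid()


def pids_seen(pool_cls, n_tasks=8, workers=2):
    """Run n_tasks on the given pool type and collect the distinct process IDs."""
    with pool_cls(workers) as pool:
        return set(pool.map(worker_pid, range(n_tasks)))


if __name__ == "__main__":
    parent = os.getpid()
    # Threads live in the parent process, so they share one interpreter
    print("threads run in the parent process:", pids_seen(ThreadPool) == {parent})
    # Pool workers are separate processes, each with its own interpreter
    print("pool workers are separate processes:", parent not in pids_seen(Pool))
```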

 


multiprocessing.Pool

The multiprocessing.Pool class makes multiprocessing easy to use: a few lines of code are enough to run many tasks in parallel. A Pool creates and manages a set of worker processes and distributes tasks among them.

One of the most important Pool methods is map. It splits the input data across the worker processes, runs the given function on each item, and collects the results. The results are returned in the same order as the inputs.

 

 

multiprocessing.Pool(processes=None, initializer=None, initargs=())
  • processes: number of worker processes to create. Defaults to the number of CPUs reported by the operating system
  • initializer: function that each worker process calls once when it starts
  • initargs: tuple of arguments passed to the initializer
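A minimal sketch of initializer and initargs, assuming each worker needs a shared setting before it processes any input. The names `_scale`, `init_worker`, `scaled`, and `scale_all` are hypothetical.

```python
from multiprocessing import Pool

_scale = None  # set once per worker process by init_worker


def init_worker(scale):
    """Called once in every worker when it starts; receives initargs."""
    global _scale
    _scale = scale


def scaled(x):
    # Uses the per-worker value prepared by the initializer
    return x * _scale


def scale_all(values, scale, processes=2):
    with Pool(processes=processes, initializer=init_worker, initargs=(scale,)) as pool:
        return pool.map(scaled, values)


if __name__ == "__main__":
    print(scale_all([1, 2, 3], scale=10))  # [10, 20, 30]
```

This pattern is commonly used for setup that is expensive or cannot be pickled per task, such as opening a connection or loading a model once per worker.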

 

pool.map(func, iterable, chunksize=None)
  • func: function to run on each input
  • iterable: input data passed to the function, one item per call
  • chunksize: approximate size of the batches the iterable is split into; each batch is handed to a worker as a single task. Smaller chunks balance the load better when task durations vary, while larger chunks reduce communication overhead when there are many short tasks.
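To make chunksize more concrete, the sketch below uses a helper named `chunks`, which is not part of multiprocessing and only approximates how items are grouped, and then shows that the value of chunksize changes how work is batched but not what map returns.

```python
from itertools import islice
from multiprocessing import Pool


def chunks(iterable, size):
    """Illustrative helper (an assumption, not a library function): group items in batches of `size`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def double(x):
    return 2 * x


def doubled(values, chunksize):
    with Pool(processes=2) as pool:
        return pool.map(double, values, chunksize=chunksize)


if __name__ == "__main__":
    data = list(range(10))
    print(list(chunks(data, 4)))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
    # Different batching, identical results
    print(doubled(data, chunksize=1) == doubled(data, chunksize=4))  # True
```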

 

Multiprocessing code examples

# multiprocessing.Pool

from multiprocessing import Pool

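def my_function(arg):
    # Runs in a worker process; prints from different workers may interleave
    result = arg ** 2
    print(f"Result: {result}")
    return result

# The guard keeps worker processes from re-running this block when they import the module
if __name__ == "__main__":
    # The with-block shuts the pool down on exit
    with Pool(processes=4) as pool:
        inputs = [1, 2, 3, 4, 5]
        # Blocks until every input is processed; results follow input order
        results = pool.map(my_function, inputs)

    print("Results:", results)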

 

  • Define the function to run with multiprocessing (my_function)
  • Create a multiprocessing.Pool
  • Pass the function and an iterable of input data to pool.map

 

 

# Single process vs. multiprocessing: measuring speed

import multiprocessing
import time

def my_function(number):
    # CPU-bound work: sum of the squares of 0 .. number-1
    result = 0
    for i in range(number):
        result += i * i
    return result

def run_single_process(number, num_tasks):
    start_time = time.perf_counter()

    # Run every task one after another in the current process
    for _ in range(num_tasks):
        my_function(number)

    elapsed_time = time.perf_counter() - start_time
    print(f"Single process - Elapsed Time: {elapsed_time:.3f} seconds")

def run_with_multiprocessing(number, num_tasks, num_processes):
    start_time = time.perf_counter()

    # Same tasks, spread over the workers; pool startup is part of the measured time
    with multiprocessing.Pool(processes=num_processes) as pool:
        pool.map(my_function, [number] * num_tasks)

    elapsed_time = time.perf_counter() - start_time
    print(f"With multiprocessing - Elapsed Time: {elapsed_time:.3f} seconds")

if __name__ == "__main__":
    number_to_process = 2_000_000  # loop iterations per task
    num_tasks = 8                  # both runs execute the same number of tasks
    num_processes = 4

    print("Running with single process:")
    run_single_process(number_to_process, num_tasks)

    print("\nRunning with multiprocessing:")
    run_with_multiprocessing(number_to_process, num_tasks, num_processes)

 

Results

  • The code above measures how long a simple function takes when run in a single process and when run with multiprocessing
  • In the results, processing was much faster with multiprocessing. The size of the gain depends on the number of CPU cores and on how much work each task does; for very light tasks, the cost of starting processes and passing data between them can outweigh it

 


 

The post linked below covers multiprocessing with multiprocessing.Process and the differences between the Pool and Process classes. In particular, Process lets you assign a different task to each process.

 

 

[python] How to use multiprocessing Process, with code examples | multiprocessing.Process | multiple processes, each with a different…


mvje.tistory.com

 
