[pandas] Finding rows where a specific column contains a specific string | str.contains

2023. 11. 17. 08:47 · 💻 Programming/Python

pandas lets you find the rows where a given column contains a specific string. To do so, call the str.contains() method on that column; it returns a boolean Series that you can use to filter the DataFrame.

 

str.contains()

Basic usage

contains_apple = df[df['컬럼 A'].str.contains('가나다라')]
  • To find the rows where "컬럼 A" (column A) contains "가나다라", use the expression above.
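To make the mechanics visible, here is a minimal sketch that splits the one-liner into two steps. The sample rows and the names mask and matches are made up for illustration, and regex=False is passed on the assumption that the search term is meant as a plain substring.

```python
import pandas as pd

# Hypothetical sample data; the column name mirrors the snippet above.
df = pd.DataFrame({'컬럼 A': ['가나다라마', '바사아', '가나다라']})

# Step 1: str.contains returns a boolean Series, one value per row.
mask = df['컬럼 A'].str.contains('가나다라', regex=False)

# Step 2: indexing the DataFrame with the mask keeps only the True rows.
matches = df[mask]

print(mask.tolist())  # [True, False, True]
print(matches)
```

Keeping the mask in its own variable can be handy when the same condition is reused or combined with other conditions.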

 

 

Using regular expressions

contains_pattern = df[df['A'].str.contains('사과|바나나', regex=True)]
  • With regex=True, which is the default, the pattern is interpreted as a regular expression; pass regex=False to match it as a literal string.
  • This lets you search for a pattern rather than a single fixed string.
  • The example above finds the rows that contain '사과' (apple) or '바나나' (banana).
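Because the pattern is treated as a regular expression by default, characters such as '.' carry special meaning. The sketch below uses made-up version strings (not from the original post) to contrast regex matching, literal matching with regex=False, and escaping with re.escape from the standard library.

```python
import re

import pandas as pd

# Hypothetical data chosen so that '.' makes a visible difference.
df = pd.DataFrame({'A': ['v1.5', 'v125', 'v2.0']})

# Default (regex=True): '.' matches any character, so 'v125' also matches.
as_regex = df[df['A'].str.contains('1.5')]

# regex=False: '1.5' is matched literally.
as_literal = df[df['A'].str.contains('1.5', regex=False)]

# Alternative: stay in regex mode but escape the metacharacters.
escaped = df[df['A'].str.contains(re.escape('1.5'))]

print(as_regex['A'].tolist())    # ['v1.5', 'v125']
print(as_literal['A'].tolist())  # ['v1.5']
print(escaped['A'].tolist())     # ['v1.5']
```

If the search term comes from user input or another column, regex=False or re.escape is usually the safer choice.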

 

 

Case sensitivity option

contains_case_sensitive = df[df['A'].str.contains('Apple', case=True)]
contains_case_insensitive = df[df['A'].str.contains('Apple', case=False)]
  • Use the case parameter to make the match case-sensitive (case=True, the default) or case-insensitive (case=False).
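As a quick illustration of the difference, the following sketch runs both variants on a small made-up column (the sample values are assumptions, not data from the original post).

```python
import pandas as pd

# Hypothetical data with different capitalizations of the same word.
df = pd.DataFrame({'A': ['Apple pie', 'apple juice', 'APPLE', 'banana']})

# case=True (default): only the exact capitalization 'Apple' matches.
case_sensitive = df[df['A'].str.contains('Apple', case=True)]

# case=False: capitalization is ignored.
case_insensitive = df[df['A'].str.contains('Apple', case=False)]

print(case_sensitive['A'].tolist())    # ['Apple pie']
print(case_insensitive['A'].tolist())  # ['Apple pie', 'apple juice', 'APPLE']
```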

 

 

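Handling NA (missing values)

contains_with_na = df[df['A'].str.contains('사과', na=False)]
  • Use the na parameter to specify the result for missing (NaN) values. With na=False, rows whose value is missing are treated as non-matches and excluded from the result.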
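The effect of na is easiest to see on a column that actually has a missing entry. The sketch below uses made-up data for that purpose. If na is left unset, the mask can itself contain missing values, and depending on the pandas version and the column dtype, filtering with such a mask can raise an error, so setting it explicitly is a reasonable habit.

```python
import pandas as pd

# Hypothetical data with one missing value in the middle.
df = pd.DataFrame({'A': ['사과바나나', None, '포도']})

# na=False: the missing entry counts as "no match" and is dropped.
without_missing = df[df['A'].str.contains('사과', na=False)]

# na=True: the missing entry counts as a match and is kept.
with_missing = df[df['A'].str.contains('사과', na=True)]

print(without_missing.index.tolist())  # [0]
print(with_missing.index.tolist())     # [0, 1]
```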

 

 

Example

import pandas as pd

# Create a sample DataFrame
# (values: apple-banana, grape-strawberry, peach, apple-orange)
data = {'A': ['사과바나나', '포도딸기', '복숭아', '사과오렌지']}
df = pd.DataFrame(data)

# Find the rows where column 'A' contains '사과' (apple)
contains_apple = df[df['A'].str.contains('사과')]

# Print the result
print(contains_apple)
  • An example that finds the rows where column A contains '사과' (apple).