[Open Source] Korean Named Entity Recognition with BERT | NER (Named Entity Recognition)

by 뭅즤 2022. 12. 15.

NER(Named Entity Recognition)

 

Named Entity Recognition (NER)์€ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๊ธฐ์ˆ  ์ค‘ ํ•˜๋‚˜๋กœ, ๋ฌธ์žฅ ๋‚ด์—์„œ ํŠน์ •ํ•œ ์œ ํ˜•์˜ ๋ช…์นญ(๊ฐœ์ฒด)์„ ์ธ์‹ํ•˜๋Š” ์ž‘์—…์ด๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, "Steve Jobs๋Š” Apple์˜ ์ฐฝ์—…์ž์ž…๋‹ˆ๋‹ค" ๋ผ๋Š” ๋ฌธ์žฅ์ด ์žˆ๋‹ค๋ฉด, "Steve Jobs"๋Š” ์ธ๋ฌผ(person), "Apple"์€ ์กฐ์ง(organization)์ด๋ผ๋Š” ์œ ํ˜•์˜ ๊ฐœ์ฒด๋กœ ์ธ์‹๋œ๋‹ค. ์ด์™ธ์—๋„ ์žฅ์†Œ, ์‹œ๊ฐ„ ๋“ฑ ๋‹ค์–‘ํ•œ ๊ฐœ์ฒด๋ฅผ ์ธ์‹ํ•  ์ˆ˜ ์žˆ๋‹ค.
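To make the input and output of the task concrete, here is a minimal sketch of how NER results are often represented: one tag per token, then grouped into entity spans. The BIO tagging scheme, the tag names `PER` and `ORG`, and the helper `bio_to_spans` are assumptions chosen for illustration and are not taken from the repository introduced below. The sentence is the English rendering of the example above.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration only.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close the previous one if it is still open.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = [token]
            current_type = tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O" (outside) or a stray I- tag: close any open entity.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_type))
            current_tokens = []
            current_type = None

    # Close an entity that runs to the end of the sentence.
    if current_tokens:
        spans.append((" ".join(current_tokens), current_type))
    return spans


# Assumed tagging of the example sentence (PER = person, ORG = organization).
tokens = ["Steve", "Jobs", "is", "the", "founder", "of", "Apple"]
tags = ["B-PER", "I-PER", "O", "O", "O", "O", "B-ORG"]

entities = bio_to_spans(tokens, tags)
print(entities)  # [('Steve Jobs', 'PER'), ('Apple', 'ORG')]
```

In a real system the tags would come from a trained model rather than being written by hand; the grouping step stays the same.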

 

NER is useful in many areas, including information extraction, question answering, review analysis, and machine translation. The use I had not thought of at all is machine translation. When translating English into Korean, "Apple" referring to the company must be rendered as "애플", not "사과" (the fruit). Producing a translation that fits the context therefore requires identifying what kind of entity a word names within its sentence.
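The "Apple" case can be sketched as a lookup that consults the entity type before choosing a translation. This is a toy illustration, not how a real translation system is built: the two lexicons, the `ORG` label, and the function `translate_word` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical toy lexicons, for illustration only.
COMMON_NOUN_KO: Dict[str, str] = {"apple": "사과"}  # the fruit
ORG_NAME_KO: Dict[str, str] = {"apple": "애플"}     # the company


def translate_word(word: str, entity_type: Optional[str]) -> str:
    """Pick a Korean rendering for `word`, using its entity type if it has one."""
    key = word.lower()
    # Words tagged as an organization use the company-name lexicon.
    if entity_type == "ORG" and key in ORG_NAME_KO:
        return ORG_NAME_KO[key]
    # Otherwise fall back to the common-noun lexicon, or keep the word unchanged.
    return COMMON_NOUN_KO.get(key, word)


company = translate_word("Apple", "ORG")  # entity type supplied by an NER step
fruit = translate_word("apple", None)     # no entity detected
print(company, fruit)  # 애플 사과
```

The entity type passed in here is exactly what an NER model would provide for each word in context.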

 

 

Pytorch-BERT-CRF-NER

The repository recommended here is implemented in PyTorch and reportedly trains on top of KoBERT, a BERT model that SKTBrain pre-trained on Korean text. For simple applications of NER, the already-trained model can also be used as is.

- Korean NER (KoBERT + CRF) : https://github.com/eagle705/pytorch-bert-crf-ner

 

- SKTBrain KoBERT : https://github.com/SKTBrain/KoBERT

 

 
