Feature | Description |
---|---|
Competition topic | 'Open-Domain Question Answering (Machine Reading Comprehension)', the Level 2 domain-fundamentals competition of the NLP Track in Naver Boostcamp AI Tech, 7th cohort. |
Competition description | Given a question, the goal is to find the exact answer to it in the provided Documents. |
Data composition | The data consists of document data, drawn mostly from Wikipedia, together with Questions and Answers. |
Evaluation metric | The EM (Exact Match) metric was used to check whether answers were extracted exactly. |

Project result: ranked 2nd on the Public leaderboard and 2nd on the Private leaderboard.

Team member | Role |
---|---|
김진재 | (Team lead) Wrote and improved the baseline code, project management and environment management, developed the josa (Korean particle) preprocessing algorithm, proposed new approaches, ensembling |
박규태 | Data characteristics analysis, EDA, Retrieval implementation, comparative experiments and improvements (hybrid search, Re-ranking, Dense, SPLADE, etc.), Reader model fine-tuning |
윤선웅 | KorQuAD 1.0 data augmentation, model fine-tuning, Reader model improvement (added a CNN layer), Retrieval model implementation (BM25), ensembling |
이정민 | Data augmentation (AEDA, Truncation, etc.), Question dataset tuning, KorQuAD dataset tuning |
임한택 | EDA, Retrieval model improvement (BM25Plus, Re-ranking hyperparameter optimization), Reader model improvement (PLM selection and Trainer parameter optimization) |
Overview | Description |
---|---|
Topic | Within machine reading comprehension (MRC), the project addresses 'Open-Domain Question Answering': retrieving documents relevant to a given query and then finding or generating an appropriate answer from those documents |
Architecture | A two-stage structure consisting of a Retrieval stage and a Reader stage |
Evaluation metric | EM Score (Exact Match Score): a point is awarded only when the text predicted by the model is identical to the answer text, character for character |
Development environment | GPU: 4 Tesla V100 servers, IDE: VS Code, Jupyter Notebook |
Collaboration environment | Notion (progress sharing), GitHub (code and data sharing), Slack (real-time communication), W&B (visualization, hyperparameter tuning) |
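
The EM metric described in the table above can be illustrated with a minimal sketch. It assumes one reference answer per question and a strict string comparison with no normalization; the function name `em_score` and the sample strings are hypothetical, and the competition's official scorer may treat whitespace or multiple references differently.

```python
from typing import Sequence


def em_score(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Return the percentage of predictions that equal their answer exactly.

    Assumption: one reference answer per question, compared character for
    character without any normalization.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    # A prediction scores only when it is identical to the answer string.
    matches = sum(pred == ans for pred, ans in zip(predictions, answers))
    return 100.0 * matches / len(answers)


if __name__ == "__main__":
    preds = ["Seoul", "in 1945"]
    golds = ["Seoul", "1945"]
    # The second prediction contains the answer but is not identical to it,
    # so it earns no credit under exact match.
    print(em_score(preds, golds))
```

Because partial overlap earns nothing, small differences such as a trailing particle attached to the answer span are enough to lose the point.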
- The techniques below were tested stage by stage over the course of the project and then applied.
Process | Description |
---|---|
Data processing | AEDA, Swap Sentence, Truncation, Question emphasis using Mecab, LLM-based josa (particle) removal |
Model fine-tuning | Added the KorQuAD dataset; fine-tuned a KorQuAD 1 PLM on the KorQuAD 2 dataset |
Retriever model improvement | BM25Plus, DPR, Hybrid Search, Re-rank (2-stage) |
Reader model improvement | Added a CNN layer, Head Customizing, Dropout, learning rate tuning |
Ensemble method | Soft Voting: uses the per-word probability values provided in nbest_predictions.json, averaging each word's probability across the files and then selecting the one with the highest value |
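
The soft-voting idea in the last row can be sketched roughly as follows. This is an illustrative sketch, not the repository's ensemble script: it assumes each nbest_predictions.json has already been loaded into a dict that maps a question id to a list of candidates with `text` and `probability` keys, and the `soft_vote` function and its `weights` parameter (for ratio-style weighted ensembles) are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence

# Assumed shape of one loaded nbest file: {question_id: [{"text", "probability"}, ...]}
NBest = Dict[str, List[dict]]


def soft_vote(nbest_files: Sequence[NBest],
              weights: Optional[Sequence[float]] = None) -> Dict[str, str]:
    """Pick, per question, the answer text with the highest weighted mean probability."""
    if weights is None:
        weights = [1.0] * len(nbest_files)
    if len(weights) != len(nbest_files):
        raise ValueError("one weight is required per nbest file")
    total = float(sum(weights))

    question_ids = set()
    for nbest in nbest_files:
        question_ids.update(nbest.keys())

    final = {}
    for qid in question_ids:
        scores = defaultdict(float)
        for nbest, weight in zip(nbest_files, weights):
            # A candidate missing from a file simply contributes 0 for that file.
            for cand in nbest.get(qid, []):
                scores[cand["text"]] += weight * cand["probability"] / total
        if scores:
            final[qid] = max(scores, key=scores.get)
    return final


if __name__ == "__main__":
    model_a = {"q1": [{"text": "Seoul", "probability": 0.6},
                      {"text": "Busan", "probability": 0.4}]}
    model_b = {"q1": [{"text": "Busan", "probability": 0.7},
                      {"text": "Seoul", "probability": 0.3}]}
    print(soft_vote([model_a, model_b]))                  # equal weights
    print(soft_vote([model_a, model_b], weights=[3, 1]))  # 3:1 weighting
```

With equal weights the second model's confident vote wins; weighting the first model 3:1 flips the result, which is the effect a ratio-weighted ensemble relies on.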
No. | Model + technique | EM (Public) |
---|---|---|
1 | uomnf97+BM25+CNN | 66.67 |
7 | Curtis+CNN+dropout(only_FC_0.05)+BM25Plus | 66.25 |
8 | Curtis+Truncation | 66.25 |
9 | HANTAEK_hybrid_optuna_topk20(k1=1.84) | 63.15 |
10 | HANTAEK_hybrid_optuna_topk20(k1=0.73) | 63.75 |
11 | HANTAEK_hybrid_optuna_topk10(k1=0.73) | 63.75 |
12 | uomnf97+BM25 | 67.08 |
13 | uomnf97+CNN+Re_rank500_20+Cosine | 67.08 |
14 | curtis+CNN+Re_rank_500_20 | 65.42 |
15 | nlp04_finetuned+CNN+BM25Plus+epoch1_predictions | 67.5 |
| Final submission | Ensemble | EM (Public) | EM (Private) |
|---|---|---|---|
| O | Models 7,8,9,10,11,12,13, 1:1:1:1:1:2:3 ensemble + josa (particle) LLM | 77.08 | 71.11 |
| O | Models 14,8,15,10,11,12,13, 1:1:1:1:2:3 ensemble + josa (particle) LLM | 77.08 | 71.67 |
| | Models 1,7,8,9,10,11,12, average ensemble + josa (particle) LLM | 76.67 | 71.67 |
| | Models 7,8,9,10,11,12,13, 1:1:1:2:2:2 ensemble + josa (particle) LLM | 76.67 | 70.83 |
| 1st SOTA | Models 15,9,10,11,12,13, 1:1:1:1:3:3 ensemble + josa (particle) LLM | 75.42 | 74.17 |
| | Models 7,8,9,10,11,12,13, 1:1:1:1:2:3 ensemble + josa (particle) LLM (n=5) | 75.42 | 71.67 |
| | Models 7,8,9,10,11,12,13,14, 1:1:1:1:2:2 ensemble + josa (particle) LLM | 74.58 | 71.11 |
| 2nd SOTA | Models 14,8,9,10,11,12,13, 1:1:1:1:2:3 ensemble + josa (particle) LLM | 74.58 | 72.22 |
The project folder structure is as follows.
level2-mrc-nlp-15
├── data
│   ├── test_dataset
│   ├── train_dataset
│   └── wikipedia_documents.json
├── docs
│   ├── github_official_logo.png
│   ├── leaderboard_final.png
│   └── leaderboard_mid.png
├── models
├── output
├── README.md
├── requirements.txt
├── run.py
└── src
    ├── arguments.py
    ├── CNN_layer_model.py
    ├── data_analysis.py
    ├── ensemble
    │   ├── probs_voting_ensemble_n.py
    │   ├── probs_voting_ensemble.py
    │   └── scores_voting_ensemble.py
    ├── korquad_finetuning_v2.ipynb
    ├── main.py
    ├── optimize_retriever.py
    ├── preprocess_answer.ipynb
    ├── qa_trainer.py
    ├── retrieval_2s_rerank.py
    ├── retrieval_BM25.py
    ├── retrieval_Dense.py
    ├── retrieval_hybridsearch.py
    ├── retrieval.py
    ├── retrieval_SPLADE.py
    ├── retrieval_tfidf.py
    ├── utils.py
    └── wandb
- arguments.py : file that performs data augmentation
- CNN_layer_model.py : class file that adds a CNN layer to the PLM
- data_analysis.py : file that analyzes the dataset
- ensemble : folder for model ensembling (supports Soft and Hard voting)
- main.py : file that runs model train, eval, and prediction
- optimize_retriever.py : file that optimizes the retriever's hyperparameters
- qa_trainer.py : custom Trainer class file for the MRC task
- retrieval_2s_rerank.py : re-rank retriever file
- retrieval_BM25.py : BM25 retriever file
- retrieval_Dense.py : DPR retriever file
- retrieval_hybridsearch.py : hybrid-search retriever file
- retrieval_SPLADE.py : SPLADE retriever file
- retrieval_tfidf.py : TF-IDF retriever file
In a python=3.10 environment, install requirements.txt with pip (`pip install -r requirements.txt`).
Then run the program by entering `python run.py`.