This repository has been archived by the owner on Oct 17, 2024. It is now read-only.
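Hello!
The GitHub README says that self-reported results are accepted, so I am submitting this PR.
The details of the model submitted in this PR are as follows.
The materials we used as the basis for our self-evaluation are as follows.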
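Of the templates used in the LogicKor code, we used
zero-shot (default)
and ran the evaluation with the gpt4-1106-preview model as the judge. For each model, we are sharing jsonl files that record the
answer (output), judge rationale (judge_message), and score (score).
We are always rooting for the LogicKor leaderboard.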
Thank you!
Per-category scores are also listed in the README on Hugging Face.
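
For anyone who wants to check the shared jsonl files, the following is a minimal sketch of how they could be read and averaged. It assumes each line is a JSON object with the `output`, `judge_message`, and `score` fields named above, and that `score` is numeric. The helper names and the sample records are hypothetical and are not taken from the LogicKor code.

```python
import io
import json
from statistics import mean


def load_records(lines):
    """Parse an iterable of JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def average_score(records):
    """Return the mean of the `score` field (assumed numeric) across records."""
    return mean(float(record["score"]) for record in records)


# Hypothetical sample standing in for one of the shared jsonl files.
sample = io.StringIO(
    '{"output": "answer A", "judge_message": "rationale A", "score": 8}\n'
    '{"output": "answer B", "judge_message": "rationale B", "score": 6}\n'
)

records = load_records(sample)
print(average_score(records))
```

To run this against a real file, pass an open file handle to `load_records` in place of the in-memory sample.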