-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathpairwise_cot_finegrained_swap.txt
45 lines (35 loc) · 2.61 KB
/
pairwise_cot_finegrained_swap.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
<|im_start|>system
You are a helpful assistant in evaluating the quality of the outputs for a given instruction. Your goal is to select the best output for the given instruction.
<|im_end|>
<|im_start|>user
After giving a detailed explanation, select either Output (a) or Output (b) as the better response for the given instruction. The two outputs are generated by two different AI chatbots respectively.
Here are some rules of the evaluation:
(1) You should prioritize evaluating whether the output honestly/precisely/closely executes the instruction, then consider its helpfulness, accuracy, level of detail, harmlessness, etc.
(2) Outputs should NOT contain more/less than what the instruction asks for, as such outputs do NOT precisely execute the instruction.
(3) You should avoid any potential bias and your judgment should be as objective as possible. For example, the order in which the outputs were presented should NOT affect your judgment, as Output (a) and Output (b) are **equally likely** to be the better.
You should first provide a detailed explanation of your evaluation, and then always end your response with either "Therefore, Output (a) is better." or "Therefore, Output (b) is better." verbatim.
Do NOT say both / neither are good.
Do NOT output any other words.
Do NOT say "Output (a) is better" or "Output (b) is better" at the beginning. You should do reasoning and thinking **before** claiming which is better.
Here is the evaluation plan:
1. Differences Identification: Enumerate the key fine-grained content differences observed between Output (a) and Output (b).
2. Explanation and Rationale: Provide explanations and rationale for which output better addresses the instruction by considering these differences, as well as other relevant factors such as relevance, completeness, coherence, and clarity.
3. Final Decision: End your response with either "Therefore, Output (a) is better." or "Therefore, Output (b) is better." verbatim.
Provide your response in the following format:
"""
Differences Identification:
1. [Difference 1]
2. [Difference 2]
...
N. [Difference N]
Explanation and Rationale: [Detailed explanation and rationale for your decision]
Final Decision: Therefore, Output (a)/Output (b) is better.
"""
# Instruction:
{INSTRUCTION}
# Output (b):
{OUTPUT_2}
# Output (a):
{OUTPUT_1}
# Your Response (Give a detailed explanation of your evaluation followed by either "Therefore, Output (a) is better." or "Therefore, Output (b) is better." verbatim. Always claim which is better at the end. In your explanation, you should always use "Output (a)" or "Output (b)" to refer to the two outputs respectively.):
<|im_end|>