0d87d2a6 is ambiguous #95

Closed
Mindjacker opened this issue Sep 6, 2023 · 13 comments

Comments

@Mindjacker

In all three examples, the blue lines cut through the red boxes. But in the "correct" output, a box is colored blue, despite the fact that the blue line was merely adjacent.

@fchollet
Owner

This is an acceptable level of ambiguity. You can simply try both solutions (change on touch vs change on cross).
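
The two readings can be sketched on a toy case. The sketch assumes a vertical line that spans the full grid height at a given column, and a box described only by its leftmost and rightmost columns; these names and this geometry are illustrative assumptions, not the task's actual encoding.

```python
def line_crosses_box(line_col: int, box_left: int, box_right: int) -> bool:
    """True when the line's column lies within the box's column span."""
    return box_left <= line_col <= box_right


def line_touches_box(line_col: int, box_left: int, box_right: int) -> bool:
    """True when the line runs immediately beside the box without entering it."""
    return line_col == box_left - 1 or line_col == box_right + 1


def box_turns_blue(line_col: int, box_left: int, box_right: int, rule: str) -> bool:
    """Apply one of the two candidate rules: 'cross' or 'touch_or_cross'."""
    crossed = line_crosses_box(line_col, box_left, box_right)
    if rule == "cross":
        return crossed
    if rule == "touch_or_cross":
        return crossed or line_touches_box(line_col, box_left, box_right)
    raise ValueError(f"unknown rule: {rule}")
```

For a hypothetical box spanning columns 3 to 5, a line at column 2 recolors the box only under `touch_or_cross`, while a line at column 4 recolors it under both rules.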

@narvind2003

Try both because there are 2 attempts given? That seems fair.

@Peter-Flynn

Peter-Flynn commented Dec 21, 2024

I would argue that this problem is unacceptably ambiguous, and request that this issue be reopened.

There are two questions left unanswered by the examples:

  1. "Does a box become blue if a line merely touches it, rather than crossing through?" (mentioned above)
  2. "Is a vertical line always drawn between blue squares which share a column, or is such a line only drawn when those squares also touch the top and bottom of the grid?"

Both questions are reasonable, as they verifiably satisfy the examples no matter how they're answered. However, they lead to four different potential answers for the test question. With only two guesses available, this question introduces an element of luck in getting the answer correct.
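
The counting argument can be made concrete with a short sketch. The question and option labels below are hypothetical names chosen for illustration, not part of the task definition; the code only enumerates the combinations of two independent binary readings and compares the count to the two-attempt budget.

```python
from itertools import product

# Hypothetical labels for the two open questions listed above.
QUESTIONS = {
    "box_turns_blue": ("on_cross_only", "on_touch_or_cross"),
    "vertical_line_drawn": ("edge_to_edge_only", "between_any_column_pair"),
}

ATTEMPTS_ALLOWED = 2  # two guesses per test input, as stated in this thread


def enumerate_readings(questions):
    """Return every combination of answers as a list of dicts."""
    names = list(questions)
    return [dict(zip(names, combo)) for combo in product(*questions.values())]


readings = enumerate_readings(QUESTIONS)
for reading in readings:
    print(reading)

print(f"{len(readings)} candidate readings, {ATTEMPTS_ALLOWED} attempts")
```

Two binary questions give four readings, so at most half of them can be covered by two attempts.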

OpenAI's o3 made two reasonable guesses on this problem, but was marked "incorrect".

@p1geoneon

Actually, you can't say that giving both answers is enough just because you're given two tries. This question is four-way ambiguous: there are four correct answers that work with the examples given. This must be reopened.

@chbran76

The two right answers could also have been the two different ways of creating a cross.

The official answer is more an example of human arbitrariness than of logic, and is therefore implicitly wrong and should be corrected.

@m-vz

m-vz commented Dec 25, 2024

> There are two questions left unanswered by the examples:
>
> 1. "Does a box become blue if a line merely touches it, rather than crossing through?" (mentioned above)
> 2. "Is a vertical line always drawn between blue squares which share a column, or is such a line only drawn when those squares also touch the top and bottom of the grid?"

I disagree with your statement that the second question is left unanswered. Since the examples only ever show lines drawn across the grid (instead of along), that property should be implied.

Even if you argue there are four possible answers, two of them do something none of the examples show (connecting dots along the board). So with two possible responses, those should be tried last, since they deviate from the examples the most.

@p1geoneon

>> There are two questions left unanswered by the examples:
>>
>> 1. "Does a box become blue if a line merely touches it, rather than crossing through?" (mentioned above)
>> 2. "Is a vertical line always drawn between blue squares which share a column, or is such a line only drawn when those squares also touch the top and bottom of the grid?"
>
> I disagree with your statement that the second question is left unanswered. Since the examples only ever show lines drawn across the grid (instead of along), that property should be implied.
>
> Even if you argue there are four possible answers, two of them do something none of the examples show (connecting dots along the board). So with two possible responses, those should be tried last, since they deviate from the examples the most.

There are also zero examples of a blue line being adjacent to a red square and turning it blue. However, that's the """correct""" answer, so the answer Chollet says is right also does something none of the examples show.

@m-vz

m-vz commented Dec 26, 2024

I agree, that is less likely to be the answer than not filling it in, so that would be my second try and not my first. My argument is that if you order the suggested solutions by increasing ambiguity, you will succeed with two tries.
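
A minimal sketch of this ordering idea, assuming each candidate reading is scored by hypothetical penalties for behaviours that the examples never show. The labels and the penalty values are assumptions chosen to mirror the ranking argued here; they are not derived from the task itself.

```python
# Hypothetical penalties: how far each behaviour departs from what the examples show.
# The weights are assumptions for illustration only.
PENALTY = {
    "on_cross_only": 0,            # lines cut through the boxes in all examples
    "on_touch_or_cross": 1,        # recoloring on mere adjacency is never shown
    "edge_to_edge_only": 0,        # lines connect two edges in all examples
    "between_any_column_pair": 2,  # never shown; weighted heavier under this argument
}

CANDIDATES = [
    ("on_cross_only", "edge_to_edge_only"),
    ("on_cross_only", "between_any_column_pair"),
    ("on_touch_or_cross", "edge_to_edge_only"),
    ("on_touch_or_cross", "between_any_column_pair"),
]


def rank(candidates, penalty):
    """Sort candidates from least to most departure from the examples."""
    return sorted(candidates, key=lambda c: sum(penalty[rule] for rule in c))


ordered = rank(CANDIDATES, PENALTY)
first_two = ordered[:2]
print(first_two)
```

Under these weights the two attempts go to the cross-only and touch-or-cross readings with edge-to-edge lines. The order depends on the weights: if both unseen behaviours were penalized equally, the second and third candidates would tie.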

@Peter-Flynn

Peter-Flynn commented Dec 26, 2024

> Since the examples only ever show lines drawn across the grid (instead of along), that property should be implied.

@m-vz, this line of thinking requires that there's something unique about the edges of the grid. That's "open question #2" in my post, and it goes unanswered by the examples, so we've looped back to the same issue.

@p1geoneon

> I agree, that is less likely to be the answer than not filling it in, so that would be my second try and not my first. My argument is that if you order the suggested solutions by increasing ambiguity, you will succeed with two tries.

OK, sure, fine, I guess you could do that, except now you're just making excuses for why this question is OK. This is unacceptably ambiguous. Even though you technically should be able to get the correct answer in at least one of your attempts, maybe, that doesn't mean this question should be a valid question on the test.

@m-vz

m-vz commented Dec 27, 2024

> @m-vz, this line of thinking requires that there's something unique about the edges of the grid. That's "open question #2" in my post, and it goes unanswered by the examples, so we've looped back to the same issue.

There doesn't need to be anything unique about the edges. In all examples, the lines connect two different edges, which implies a solution that only connects different edges is more likely and should be tried first. But maybe we can agree to disagree.

> OK, sure, fine, I guess you could do that, except now you're just making excuses for why this question is OK. This is unacceptably ambiguous. Even though you technically should be able to get the correct answer in at least one of your attempts, maybe, that doesn't mean this question should be a valid question on the test.

As far as I understand, part of the test can be seen as whether the test-taker can prioritise more likely solutions in case of ambiguity, and if you do so, you are able to get the correct answer without guessing.

@p1geoneon

>> @m-vz, this line of thinking requires that there's something unique about the edges of the grid. That's "open question #2" in my post, and it goes unanswered by the examples, so we've looped back to the same issue.
>
> There doesn't need to be anything unique about the edges. In all examples, the lines connect two different edges, which implies a solution that only connects different edges is more likely and should be tried first. But maybe we can agree to disagree.
>
>> OK, sure, fine, I guess you could do that, except now you're just making excuses for why this question is OK. This is unacceptably ambiguous. Even though you technically should be able to get the correct answer in at least one of your attempts, maybe, that doesn't mean this question should be a valid question on the test.
>
> As far as I understand, part of the test can be seen as whether the test-taker can prioritise more likely solutions in case of ambiguity, and if you do so, you are able to get the correct answer without guessing.

There is nothing more likely about the intended solution compared to any other solution. The fact that this is even being discussed in the first place is bad for this question. It could be fixed so simply; you would literally need to add one line to one example to fix this question. The fact that you could technically get the """correct""" solution by guessing and trying to get into the mind of the test maker and think, "Oh, I think they would say this is the right answer," is not good. You should not have to play mind games because the question is ambiguous. You are simply making excuses.

@Peter-Flynn

Peter-Flynn commented Dec 28, 2024

> implies a solution that only connects different edges is more likely

@m-vz, I think you'd struggle to justify this. It'd be just as valid to argue that the presence of two squares, rather than one, initiating a line is an implication that any two squares in a column should be connected by a line, and that alone is sufficient. To be clear, I don't think either implication is valid.

There simply isn't enough evidence for either conclusion, and neither is inherently more likely by reasoning. Hence the ambiguity.

7 participants