Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.6] Fixed stable sortBy() not sorting correctly when chained #21366

Closed
wants to merge 1 commit into from
Closed

[5.6] Fixed stable sortBy() not sorting correctly when chained #21366

wants to merge 1 commit into from

Conversation

JeppeKnockaert
Copy link

This is a repost of #21364, but this time as a request for 5.6, as the sort shouldn't change anymore in 5.5 (see the previously mentioned pull request).

This pull request is related to #21270 and #21214).

Problem

Suppose we have a collection like this:

$grades = collect([
    ['name' => 'Ann', 'grade' => 'B'],
    ['name' => 'Boris', 'grade' => 'B'],
    ['name' => 'Dave', 'grade' => 'A'],
    ['name' => 'Cathy', 'grade' => 'A'],
]);

We want everything sorted by grade first; if the grades are equal, we want to sort by name. In Laravel 5.4, we could do this:

$grades->sortBy('name')->sortBy('grade');

Because of how sortBy was changed to work stable with duplicate entries in 5.5 (cfr. pull requests mentioned earlier), this doesn't work anymore.

The result from the chained sortBy() methods is this in Laravel 5.5:

Illuminate\Support\Collection {#14
  #items: array:4 [
    0 => array:2 [
      "name" => "Dave"
      "grade" => "A"
    ]
    1 => array:2 [
      "name" => "Cathy"
      "grade" => "A"
    ]
    2 => array:2 [
      "name" => "Ann"
      "grade" => "B"
    ]
    3 => array:2 [
      "name" => "Boris"
      "grade" => "B"
    ]
  ]
}

This is not the expected behaviour: Cathy should come before Dave.

Cause

The cause of this becomes clear when we look at the keys of the array between the two sort operations. After performing the sortBy('name'), this is the collection:

Illuminate\Support\Collection {#16
  #items: array:4 [
    0 => array:2 [
      "name" => "Ann"
      "grade" => "B"
    ]
    1 => array:2 [
      "name" => "Boris"
      "grade" => "B"
    ]
    3 => array:2 [
      "name" => "Cathy"
      "grade" => "A"
    ]
    2 => array:2 [
      "name" => "Dave"
      "grade" => "A"
    ]
  ]
}

When we try to sort by grade now, it will see the grade of Cathy and Dave as equal and sort based on theirs keys instead (this was introduced by the stable sort).

Solution

One possible solution to this is to add a ->values() call in the chain:

$grades->sortBy('name')->values()->sortBy('grade');

This will reset the keys, resulting in key '2' for Cathy and key '3' for Dave.

I've used this idea to try and implement a fix for this issue. Instead of using $this->items directly, I use the values of $this->items. The keys for these values are used in the array_multisort call to make sure we don't lose the stable sort.

However, the keys of the values array, are not the original keys (which we want to keep too, especially when working with associative arrays). This is why I keep the $resultKeys variable containing the mapping from the keys of the values array to the original keys.

@taylorotwell
Copy link
Member

Can we just revert the sortBy behavior to its old functionality? @themsaid ... would that be a problem? I don't see why there is all of a sudden a big flurry of activity around this function.

@themsaid
Copy link
Member

@taylorotwell we can revert safely, I suggest we don't merge this change either and leave the method as in the 5.4 state, if people want a different behaviour they can add a macro.

@damiani
Copy link
Contributor

damiani commented Nov 12, 2017

@JeppeKnockaert This was a good solution, although it looks like sortBy has been reverted to it's 5.4 behavior and will remain that way.

So your example above will now respond correctly to the chained sort, but that can't be relied upon—add a couple more items, for example, and it breaks:

$grades = collect([
    ['name' => 'Ann', 'grade' => 'B'],
    ['name' => 'Boris', 'grade' => 'B'],
    ['name' => 'Dave', 'grade' => 'B'],
    ['name' => 'Cathy', 'grade' => 'C'],
    ['name' => 'Caleb', 'grade' => 'B'],
    ['name' => 'Alex', 'grade' => 'A'],
    ['name' => 'Edward', 'grade' => 'A'],
]);

will sort (incorrectly) to:

$grades->sortBy('name')->sortBy('grade');

Collection { 
    6 => [
      "name" => "Edward"
      "grade" => "A"
    ]
    0 => [
      "name" => "Alex"
      "grade" => "A"
    ]
    1 => [
      "name" => "Ann"
      "grade" => "B"
    ]
    2 => [
      "name" => "Boris"
      "grade" => "B"
    ]
    3 => [
      "name" => "Caleb"
      "grade" => "B"
    ]
    5 => [
      "name" => "Dave"
      "grade" => "B"
    ]
    4 => [
      "name" => "Cathy"
      "grade" => "C"
    ]
  ]
}

@biiiizzzz
Copy link

@damiani
Can I ask question? I am using laravel 5.5
When I have collection with less than 16 items, and I use both sort fuction. Use sortBy() and sortByDesc() on my code. The result is ok.
But with my collection have more than 16 items. I use two sort , and it's not correct.
Can you help me check this ?

@damiani
Copy link
Contributor

damiani commented Feb 9, 2020

Chaining multiple sortBy() clauses in a collection will produce potentially unexpected results (as discussed in #21270, #21214, #21364, etc.), because the underlying PHP sort methods do not perform a "stable sort" — PHP doesn't reliably preserve key order for items that are identical. For example, sorting an array of 16 identical values works as expected:

$a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
asort($a);
print_r($a);

Array
(
    [0] => 1
    [1] => 1
    [2] => 1
    [3] => 1
    [4] => 1
    [5] => 1
    [6] => 1
    [7] => 1
    [8] => 1
    [9] => 1
    [10] => 1
    [11] => 1
    [12] => 1
    [13] => 1
    [14] => 1
    [15] => 1
)

... but add a 17th item, and PHP does not maintain the original order:

$a = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1];
asort($a);
print_r($a);

Array
(
    [0] => 1
    [9] => 1
    [15] => 1
    [14] => 1
    [13] => 1
    [12] => 1
    [11] => 1
    [10] => 1
    [8] => 1
    [1] => 1
    [7] => 1
    [6] => 1
    [5] => 1
    [4] => 1
    [3] => 1
    [2] => 1
    [16] => 1
)

That means the results of your first sortBy() might end up getting re-shuffled by the second sortBy(), and whether or not this happens depends on the nature of the data and the length of the collection.

For example, given this collection:

$names = collect([
    ['first' => 'Ann', 'last' => 'Smith'],
    ['first' => 'Alex', 'last' => 'Doe'],
    ['first' => 'Boris', 'last' => 'Doe'],
    ['first' => 'Betty', 'last' => 'Smith'],
    ['first' => 'Arthur', 'last' => 'Doe'],
    ['first' => 'Ann', 'last' => 'Brown'],
    ['first' => 'Zephyr', 'last' => 'Brown'],
]);

If you try $names->sortBy('first')->sortBy('last'), you'll get the following incorrect order (the first two results, Zephyr Brown and Ann Brown, should be reversed):

    6 => array:2 [
      "first" => "Zephyr"
      "last" => "Brown"
    ]
    5 => array:2 [
      "first" => "Ann"
      "last" => "Brown"
    ]
    1 => array:2 [
      "first" => "Alex"
      "last" => "Doe"
    ]
    4 => array:2 [
      "first" => "Arthur"
      "last" => "Doe"
    ]
    2 => array:2 [
      "first" => "Boris"
      "last" => "Doe"
    ]
    0 => array:2 [
      "first" => "Ann"
      "last" => "Smith"
    ]
    3 => array:2 [
      "first" => "Betty"
      "last" => "Smith"
    ]

To get around this problem, I wrote a collection macro called sortByMultiple. (If you're not familiar with macros in Laravel, here are two good articles about them, one by Peter Fox and one by Caleb Porzio.)

\Illuminate\Support\Collection::macro(
    'sortByMultiple',
    function (array $keys) {
      	return collect($keys)
          ->reverse()
          ->reduce(function ($carry, $item) use (&$count) {
            $key = ltrim($item, '+-');
            $descending = $item[0] === '-';
            
            return $count++ 
              ? $carry->groupBy($key)
              	    ->sortKeys(SORT_REGULAR, $descending)
              	    ->flatten(1)
              : $carry->sortBy($key, SORT_REGULAR, $descending);
          }, $this);
    }
);

sortByMultiple takes an array of keys to sort by, so $names->sortByMultiple(['last', 'first']) will yield the correct order, sorted by last name, and then by first name:

    0 => array:2 [
      "first" => "Ann"
      "last" => "Brown"
    ]
    1 => array:2 [
      "first" => "Zephyr"
      "last" => "Brown"
    ]
    2 => array:2 [
      "first" => "Alex"
      "last" => "Doe"
    ]
    3 => array:2 [
      "first" => "Arthur"
      "last" => "Doe"
    ]
    4 => array:2 [
      "first" => "Boris"
      "last" => "Doe"
    ]
    5 => array:2 [
      "first" => "Ann"
      "last" => "Smith"
    ]
    6 => array:2 [
      "first" => "Betty"
      "last" => "Smith"
    ]

This macro will sort in descending order if the key is prefixed with -, e.g.:

$names->sortByMultiple(['-last', 'first'])

Note that this macro relies on the collection method sortKeys, which isn't available until Laravel 5.6. For Laravel 5.5 and below, you could add sortKeys as another macro, though:

\Illuminate\Support\Collection::macro(
    'sortKeys',
    function ($options = SORT_REGULAR, $descending = false) {
        $items = $this->items;
        $descending 
          ? krsort($items, $options) 
          : ksort($items, $options);

        return new static($items);
    }
);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants