'_' can't be right sorted in Windows #41
Comments
Please give specific examples of the input you are passing, the output you are getting, and the desired output. |
Just a guess, but have you tried using |
It does seem that Microsoft has a custom sorting order for characters (at least for Excel) as can be seen in the table given here. In this table the Having said this, it's not clear to me what the request is here. As filed, the issue simply states that
|
Allowing the user to modify the sorting table to match NTFS would be a good optional feature. Note: on Windows an alphabetical sorting order is baked into NTFS (see: https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-findfirstfileexa). The API also reports underscores after numbers when using an NTFS file system. Output from "dir" request from cmd.exe:
Output from bash "ls" request on the same filesystem:
Conceptually, if the user requests alg.PATH semantics, and is on a Windows system, then Windows PATH should be chosen. When developing UI stuff, you typically want native OS semantics. If someone wants an OS-insensitive sort order, the built-in |
Do you have a suggestion for how this might be implemented? At the end of the day, the sorting order of non-alphanumeric characters is arbitrary, so unlike a modification like being case insensitive (which just uses |
To follow up with my above comment - I would welcome any PR that attempts to solve this issue. |
@earonesty Do you envision the user being able to customize the translation table, or would there be pre-defined translation tables? I think this could work by using |
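As a rough illustration of the translation-table idea discussed above, here is a minimal sketch in plain Python. The names `PUNCTUATION_FIRST`, `rank`, and `table_key` are hypothetical, and the ordering used is an assumption for demonstration only: it is not the real NTFS or Explorer table and it is not how natsort implements anything.

```python
import re

# Hypothetical ordering, for illustration only. This is NOT the real
# NTFS/Explorer table: it merely places a few punctuation characters
# ahead of digits, and digits ahead of everything else.
PUNCTUATION_FIRST = "_-. "


def rank(ch):
    """Rank one non-digit character as a 2-tuple of ints."""
    pos = PUNCTUATION_FIRST.find(ch)
    if pos != -1:
        return (0, pos)   # listed punctuation sorts first
    return (2, ord(ch))   # everything else sorts after digits, by code point


def table_key(name):
    """Build a sort key: digit runs compare as integers, other chars via rank()."""
    key = []
    for token in re.findall(r"[0-9]+|[^0-9]", name.lower()):
        if "0" <= token[0] <= "9":
            key.append((1, int(token)))  # numbers sit between the two groups
        else:
            key.append(rank(token))
    return key


names = ["file10", "1_file", "a_file", "_file", "file2", "_1"]
print(sorted(names, key=table_key))
```

Because every key element is a 2-tuple of integers, keys always compare cleanly. A user-customizable variant would only need to swap out `PUNCTUATION_FIRST`; a complete table matching Windows would still have to come from somewhere else.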
@earonesty I went to implement this today, and I realized that the table in the link I gave above is incomplete, so I was unable to implement this solution. Do you know where a full table of ASCII to NTFS equivalence exists? Alternatively, is there an existing module or library that provides a collation function that makes strings sort as they do on NTFS? TO ANYONE INTERESTED IN HELPING: I would like to implement this with a new enum ( |
I got the same problem when sorting folder names, that now appear in a different order when used on Windows. The library fails when sorting numbers with prepending 'special' characters:
Results in: Regarding your question about a complete ASCII table, I could not find any. I put together a list of the characters found on a 'normal' western keyboard, so basically all the ASCII chars (and some from extended ASCII) in the correct order: https://i.postimg.cc/cL5hNSnd/image.png That is: Windows Explorer sorting seems to be different from NTFS sorting: https://devblogs.microsoft.com/oldnewthing/20050617-10/?p=35293. I think Explorer sorting should be used as this is what people usually see. Hope that might be helpful. |
I would imagine |
@ganego One of the articles you linked indicated that Windows Explorer uses the locale to sort... can you see if using |
I think the locale for Windows Explorer sorting only plays a role when going to extended-ASCII with all the special letters like äöüØ... See: https://www.ascii-code.com/ The basic-ASCII should be locale-independent. Results (W7, German):
Without Locale: Code used:
Besides the obvious differences, the reason I stumbled upon this issue was the difference in handling of <specialChar&Letter> vs <specialChar&Number>, see the EDIT: Here is a string that can serve as Python list for testing with all the folder names from above: |
I am all-in on adding this type of functionality. Having said that, I have reservations on how successful it can be:
I do not use Windows so I do not think I am in a position to implement this. But I would gladly accept a PR from the brave soul who wants to take a stab at this. |
Well, best case: We ask MS to open-source StrCmpLogicalW... I found this: https://gist.github.com/mcmarcu/7899295 which uses the correct DLL on Windows and some other function on Linux. Then of course you will only get "proper" Windows sorting on Windows - the question would be, do Linux users even need this? Maybe not. (But to be honest, I don't understand why a Linux user would be ok with their system sorting I then found this: https://psycodedeveloper.wordpress.com/2013/04/12/c-numeric-sorting-revisited/ Sorting right now mostly works, except for one big issue with <specialChar&Letter> vs <specialChar&Number>. It would be great if that could somehow be fixed for the numbers that are currently sorted incorrectly. |
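The approach in the linked gist (call the Windows comparison function when it is available, use something else otherwise) can be sketched roughly as follows. `StrCmpLogicalW` is exported by `shlwapi.dll` on Windows; the wrapper name `explorer_sorted` and the fallback to plain `sorted` on other platforms are assumptions made here for illustration, not something taken from the gist or from natsort.

```python
import sys
from functools import cmp_to_key


def explorer_sorted(names):
    """Sort names with StrCmpLogicalW on Windows; plain sorted() elsewhere.

    The non-Windows fallback is only a placeholder (an assumption of this
    sketch) and does NOT reproduce Explorer ordering.
    """
    if sys.platform != "win32":
        return sorted(names)

    import ctypes  # imported lazily; ctypes.windll only exists on Windows

    compare = ctypes.windll.shlwapi.StrCmpLogicalW
    compare.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p]
    compare.restype = ctypes.c_int
    # StrCmpLogicalW returns a negative, zero, or positive int,
    # which is exactly the contract cmp_to_key expects.
    return sorted(names, key=cmp_to_key(compare))


print(explorer_sorted(["b", "a"]))
```

As noted in the comment above, this only gives "proper" Windows ordering when actually run on Windows.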
Have you tried sorting with With respect to the WordPress article: it looks like they are using a comparator called |
That won't help. In the meantime, while looking to solve some other unrelated problem I found this:
Amazing, Python comes with this built-in, and it works:
But wait! It also has a bug:
So while it sorts cases like the problem mentioned before perfectly, it now fails at natural sorting :( I will open a bug report, since this function clearly does not produce the correct order for Windows despite the name. |
It's your decision to file a bug report on this if you want (it's unrelated to |
OK - I have finally decided how I want to implement this.
|
Good to hear. |
Well, my original plan was to not even export |
That's an even more elegant solution. I like it. |
seems like one lambda is all you need: |
@ganego @earonesty Check out #123, which is the PR for this feature. I plan to merge sometime tomorrow. I would be open to some feedback, especially the documentation or the tests. |
Just looked over the code but did not run any tests. Seems you used my list for the test, so if it gives the correct results I guess it works. Doc also looks ok. When will you publish a new version? |
I am trying to see if I can solve #122. If I cannot by tonight, I think I will release tonight or tomorrow. |
@ganego It's out - |
I wonder if WINE's version is accurate. |
On Windows, I want to sort the files in a folder in natural order, but your code does not seem to work well there. The '_' is sorted after the digits '0-9', whereas on Windows the '_' is shown before the digits '0-9'.
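For reference, a plain code point comparison in Python also places '_' after the digits, since '_' is U+005F and '0' to '9' are U+0030 to U+0039. The small check below shows only that default behavior; the file names are made up for illustration and it says nothing about natsort's internals.

```python
# Made-up file names, for illustration only.
names = ["a.txt", "_.txt", "1.txt"]

# Default string comparison is by code point: digits < '_' < lowercase letters.
print(sorted(names))
print(hex(ord("0")), hex(ord("9")), hex(ord("_")), hex(ord("a")))
```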