Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort fields and properties by MetadataToken. #157

Merged
merged 2 commits into from
Nov 17, 2020
Merged

Sort fields and properties by MetadataToken. #157

merged 2 commits into from
Nov 17, 2020

Conversation

jhickson
Copy link
Contributor

@jhickson jhickson commented Nov 17, 2020

I found an issue that when writing parquet files using a writer like:

var writer =
    ParquetFie.CreateRowWriter<(long, int, double, int, long, short)>(
        "path/to/my.parquet",
        new [] {"C1", "C2", "C3", "C4", "C5", "C6"}
    );

the resulting parquet file would have column C1 typed as int and C2 as long - i.e. the first two column types were swapped.

Upon debugging it transpired that this was when Type.GetFields() was being used to build the write delegate. Type.GetFields() was not returning fields in declaration order so the mapping of column name to type failed. The .NET docs for this method (and Type.GetProperties()) explicitly state that order is not guaranteed. So, I have changed the code to sort by MemberInfo.MetadataToken as advised in this SO article.

Note that no unit tests have been added to cover this as it proved impossible to find a circumstance which reliably failed with the old code. It seems the order GetFields uses is usually declaration order but might change under certain runtime conditions. I have verified though that my change here fixed the issue I was seeing in my code.

@jhickson jhickson marked this pull request as ready for review November 17, 2020 14:21
GPSnoopy
GPSnoopy previously approved these changes Nov 17, 2020
@jhickson
Copy link
Contributor Author

I'm not sure why the MacOS build is failing - I haven't touched the native code.

@GPSnoopy
Copy link
Contributor

macOS failure is likely unrelated. Leave it with me.

#158 shows this should work,
@GPSnoopy GPSnoopy merged commit 48f2041 into G-Research:master Nov 17, 2020
@jhickson jhickson deleted the honour_tuple_order branch November 17, 2020 18:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants