Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to include int32 hashmap keytype #778

Merged
merged 14 commits into from
Apr 25, 2024

Conversation

chuckyvt
Copy link
Contributor

This is a fairly minor update to include int32 key types to hash maps. I have tested this a good bit in my own code. It has worked well and is a nice addition to the hashmap capabilities.

Update to stdlib_hashmap_wrappers.f90 that includes support for int32 vector key types.
Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @chuckyvt. This is indeed a useful procedure to avoid multiple ...transfer(value, 1_int8).... with the default integer on the user side.
Could you also udpate the specs, please?
Could you also add a test for this specific procedure, please?
Should it be a corresponding get function (quite a long time I did use hashmap, so this question might be irrelevant)?

chuckyvt and others added 2 commits March 27, 2024 08:56
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Added get_inte32_key function.
@chuckyvt
Copy link
Contributor Author

In progress code updates. Added a 'get_int32_key' function to match set function.

Fixed end subroutine bug
Copy link
Contributor

@gareth-nx gareth-nx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this @chuckyvt. I agree that some tests/specs are needed and understand this is in progress. But it looks like a nice addition.

chuckyvt added 2 commits April 3, 2024 19:59
Minor comment cleanup
Update to include native support for int32 key types.
@chuckyvt
Copy link
Contributor Author

chuckyvt commented Apr 4, 2024

Requested code updates are complete. Also updated the specs document. The testing item is a bit more work however. I could use some help or suggestions on the best approach. All the current tests are based around native int8 keys. For example, there are no test cases I can find for the existing character strings? My thought was it might make sense to add an extract test file dedicated to non int8 keys. Character key, int32 key, and could be extended in the future.

@gareth-nx
Copy link
Contributor

Thanks @chuckyvt. The addition of extra tests sounds good. Indeed, we will have to add tests for other reasons too (not necessarily in this pull request), see #785.

Updated test_maps.fypp to include tests for int32 and character key types, in addition to base int8.
@chuckyvt
Copy link
Contributor Author

Have pushed an updated to test_maps.fypp that includes tests for int32 and character key types, in addition to the existing base int8 keys. I'm not very familiar with the test drive framework, so I tried to stay as close as possible to the existing testing layout.

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @chuckyvt for adding the tests. I am a bit puzzled by your approach (but it does not mean it is wrong). Could you add a few comments to help the readers and future developers, please? MY main question is to understand why test_8_bits was changed to an array (see comments below).
Note: would it be an idea to add an new array test_32_bits and duplicate the tests of test_8_bits for 32bits (and leave unchanged test_8_bits)?

@@ -53,10 +54,10 @@ contains
type(error_type), allocatable, intent(out) :: error

type(chaining_hashmap_type) :: map
integer(int8) :: test_8_bits(test_size)
integer(int8) :: test_8_bits(test_size,key_types)

call generate_vector(test_8_bits)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_8_bits is not a vector anymore.
Could you explain why test_8_bits is changed to an array? I am not sure to understand the aim here.

Copy link
Contributor Author

@chuckyvt chuckyvt Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach is to create a unique int8 vector for each key type that can then be used via transfer or other mechanism to generate a unique key hash for the test program. Using a single int8 vector will result in duplicate keys and mapping conflicts, so that must be managed. There are other approaches, but this seemed to be the easiest to add and is extensible to future key types. For example, we could add a dedicated int32 vector generation function, but not sure how that improves on using transfer with an int8 array to create an int32 array, and would require a dedicated function for each key type. (Versus for example if there is want to add int64 or real key types, increase the size of test_8_bits as needed and add appropriate transfer function).

Copy link
Contributor Author

@chuckyvt chuckyvt Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its also worth noting that the structure of test_maps.fypp is not ideal with the code base being repeated for open and chaining maps. I believe refactoring this code into a single source approach should be done before any additional function additions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully this is straightforward with fypp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its also worth noting that the structure of test_maps.fypp is not ideal with the code base being repeated for open and chaining maps. I believe refactoring this code into a single source approach should be done before any additional function additions.

Could you open an issue for this, please?

Comment on lines 150 to 158
call set( key, test_8_bits( index2:index2+test_block-1, 1 ) )
call map % key_test( key, present )
call check(error, present, "Int8 KEY not found in map KEY_TEST.")

call set( key, transfer( test_8_bits( index2:index2+test_block-1, 2 ), [0_int32] ) )
call map % key_test( key, present )
call check(error, present, "Int32 KEY not found in map KEY_TEST.")

call set( key, to_string( transfer( test_8_bits( index2:index2+test_block-1, 3 ), 0_int32 ) ) )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment to explain the different tests (and their purposes)? It might help reviewers and future developers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some additional comments per discussion above. Feel free to add more. I've also removed the character key type test. The code worked fine on my computer, but failed the CI checks for reasons I don't understand. It was a nice to have, but out of scope for this pull request which is for int32 key types.

Also, per the note above, I think test_maps.fypp should be refactored to remove duplicate code prior to adding other key type checks to make code addition less error prone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added some additional comments per discussion above.

Great. Thank you.

I've also removed the character key type test. The code worked fine on my computer, but failed the CI checks for reasons I don't understand. It was a nice to have, but out of scope for this pull request which is for int32 key types.

I agree with you. Let's keep this PR focused on int32

chuckyvt and others added 3 commits April 15, 2024 21:08
Co-authored-by: Jeremie Vandenplas <jeremie.vandenplas@gmail.com>
Added addition commenting.  Removed the character key type test as that was failing CI checks for unknown reasons.
Copy link
Contributor

@gareth-nx gareth-nx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. I have one query though about transfer when the mold argument is not allocated -- is this standard? I wonder if this could be related to the issue with strings as keys that you mentioned earlier.

type(key_type), intent(in) :: key
integer(int32), allocatable, intent(out) :: value(:)

value = transfer( key % value, value )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I wonder whether it matters that value is unallocated in the RHS call to transfer (since it will be deallocated on input if allocated)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've asked this question at fortran discourse here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for asking. Intuitively it seems like it shouldn't matter since its only pulling type info, not size. But I also did a quick search and didn't come up with anything. We could hardcode it to [0_int8], but I don't like that quite as much.

@@ -53,10 +54,10 @@ contains
type(error_type), allocatable, intent(out) :: error

type(chaining_hashmap_type) :: map
integer(int8) :: test_8_bits(test_size)
integer(int8) :: test_8_bits(test_size,key_types)

call generate_vector(test_8_bits)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully this is straightforward with fypp.

Added doc strings to explain the key_type approach in test_map.fypp.

Indentation cleanup

Added pure specifier to get and set int32 key type routines.
@chuckyvt
Copy link
Contributor Author

Note with the latest submit I added pure to the int32 get and set subroutines. Eventually I think adding this to the other interfaces makes sense but is probably a separate pull request. (Also can make sure there are no issues with pure with the int32 type)

Copy link
Member

@jvdp1 jvdp1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Pending the issue with transfer, it can be merged IMO. Thank you @chuckyvt for this PR.

@gareth-nx
Copy link
Contributor

I'm also happy for this to be merged. It looks like the use of transfer with an unallocated mold is OK.

Do we have a record of what usage was causing segfaults? Might come in handy.

@jvdp1
Copy link
Member

jvdp1 commented Apr 25, 2024

I'll merge this PR. Thank you @chuckyvt

@jvdp1 jvdp1 merged commit f44133b into fortran-lang:master Apr 25, 2024
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants