Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul enum conversions #33262

Merged
merged 2 commits into from
Aug 22, 2019
Merged

Conversation

jbytheway
Copy link
Contributor

@jbytheway jbytheway commented Aug 16, 2019

Summary

SUMMARY: Infrastructure "Overhaul of enum <-> string conversion code"

Purpose of change

The previous approach to conversion between enum and string values had two function templates which had to be specialized for every enum (enum_to_string and string_to_enum).

Typically, they were implemented by having a global static map from strings to enum values which was used for lookup in one direction. Conversion in the other direction (enum -> string) was usually not implemented, but when it was it used a slow linear search through the map.

The problems I see with this approach are primarily:

  • enum_to_string should not be slow, but we don't want to have a separate map or array defined for every enum to make it fast.
  • There was nothing to ensure that when a new enum value was added it would also be added to the map.

Describe the solution

Change this to a new approach where only enum_to_string needs to be specialized, and string_to_enum is automatically synthesized from it.

The advantages of this approach are:

  • enum_to_string should be much more efficient.
  • enum_to_string can be implemented via a switch statement, which means the compiler will automatically let you know to update it when a new enum value is added to any enum (via the warning about unhandled enum values).
  • No need to implement string_to_enum. Consistency between the conversions in each direction happens automatically.
  • We get some additional error checking for free; if any two enum values map to the same string that will be reported (which could easily happen through a copy/paste error).
  • If we ever want to switch to a more performant hashtable implementation, we only need to change the code in one place.

The disadvantages include:

  • A little more code in headers.
  • Every string_to_enum contains a const static map initialized on first use, so there will be a check that it's initialized on every call. But this should be cheap compared to an unordered_map lookup.

To implement this I took advantage of the enum_traits originally added for enum_bitset. Every enum supporting such conversion now needs a 'last' member and an enum_traits specialization for the code to know which member this is (many of them already had such members). This new 'last' member needs to be handled somehow in any switch over the enum.

So, as a happy side-effect, all these enums will now work with enum_bitset without further changes.

I used astyle markup around the big switch statements to get one case per line; I find this much easier to read and see at a glance that the values match in a reasonable way.

Describe alternatives you've considered

Many things could be done differently:

  • Could use regular astyle formatting of the switch statements.
  • Could use an array rather than a switch statement (I didn't do this because I like the compiler warning that comes with unhandled enum values in switch statements).
  • Could make the maps global statics rather than function-local statics, but then we'd have to explicitly define them for each enum.

Additional context

I was inspired to do this by working on #32891 where I'm adding two new enum types that will need fast conversion in both directions.

I was hoping this change would be an overall reduction in lines of code. Alas, it is not.

src/enum_traits.h Outdated Show resolved Hide resolved
src/monstergenerator.cpp Outdated Show resolved Hide resolved
@ZhilkinSerg ZhilkinSerg added [C++] Changes (can be) made in C++. Previously named `Code` Code: Infrastructure / Style / Static Analysis Code internal infrastructure and style labels Aug 16, 2019
@jbytheway jbytheway force-pushed the enum_conversions branch 2 times, most recently from c1b40be to 89ed7a0 Compare August 16, 2019 11:53
The previous solution had two function templates which had to be
specialized for every enum (one for conversion in each direction).

Typically, they were implemented by having a global static map from
strings to enum values which was used for lookup in one direction.
Conversion in the other direction (enum -> string) was either not
implemented or used a slow linear search through the map.

Change this to a new approach where only enum_to_string needs to be
specialized, and string_to_enum is automatically synthesized from it.

The advantages of this approach are:
- enum_to_string should be much more efficient.
- enum_to_string can be implemented via a switch statement, which means
  the compiler will automatically let you know to update it when a new
  enum value is added to any enum.
- No need to implement string_to_enum.  Consistency between the
  conversions in each direction happens automatically.
- We get some additional error checking for free; if any two enum values
  map to the same string that will be reported (which could easily
  happen through a copy/paste error).
- If we ever want to switch to a more performant hashtable
  implementation, we only need to change the code in one place.

The disadvantages are:
- A little more code in headers.
- Every string_to_enum contains a const static, so there needs to be a
  check that it's initialized on every call.  But this should be cheap
  compared to an unordered_map lookup.

To implement this I took advantage of the enum_traits originally added
for enum_bitset.  Every enum supporting such conversion now needs a
'last' member and an enum_traits specialization for the code to know
which member this is (many of them already had such members).

So, as a happy side-effect, all these enums will now work with
enum_bitset without further changes.

I used astyle markup around the big switch statements to get one case
per line; I find this much easier to read and see at a glance that the
values match in a reasonable way.
No need to write a lambda when we already have a perfectly good
function.
@kevingranade kevingranade merged commit 16fb9ca into CleverRaven:master Aug 22, 2019
@jbytheway jbytheway deleted the enum_conversions branch August 22, 2019 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C++] Changes (can be) made in C++. Previously named `Code` Code: Infrastructure / Style / Static Analysis Code internal infrastructure and style
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants