Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output adapters should be templated. #2172

Closed
FrancoisChabot opened this issue Jun 5, 2020 · 7 comments
Closed

Output adapters should be templated. #2172

FrancoisChabot opened this issue Jun 5, 2020 · 7 comments
Labels
kind: enhancement/improvement state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated

Comments

@FrancoisChabot
Copy link
Contributor

FrancoisChabot commented Jun 5, 2020

Which feature do you want to see in the library?

While input adapters have been refactored to ditch polymorphism and add support for user-defined iterators, the output part of the system has remained the same. I propose that similar changes can be done there.

There is at least some evidence that having a faster encoding time would be desirable, so this would not be just a matter of increasing consistency within the codebase:

Originally posted by @abrownsword in #1457 (comment)

This would involve:

  • Migrating the internals to operate on templates, instead of an abstract base class.
  • Retrofitting all current valid syntax to make use of this.
  • Expanding user-facing API to support user-defined sinks through an output iterator.

Bonus:

  • convert to UTF-16 or UTF-32 when the destination iterator/container has a multi-byte value_type when dumping a non-binary format. (probably through a separate PR)

How would the feature be usable for other users?

  • The speedups incurred by moving from an abstract class to a template should be entirely transparent to users.
  • By accepting an arbitrary LegacyOutputIterator, we could support user-defined sinks easily.

For this, I suggest adding a the following method to basic_json:

template<typename OutputIteratorType>
void dump_to(OutputIteratorType ite, 
             const int indent = -1,
             const char indent_char = ' ',
             const bool ensure_ascii = false,
             const error_handler_t error_handler = error_handler_t::strict) const;

Which would be used like so:

  json data;
  std::vector<char> destination;
  data.dump_to(std::back_inserter(destination));
@abrownsword
Copy link

#1534 (comment)

@FrancoisChabot
Copy link
Contributor Author

FrancoisChabot commented Jun 5, 2020

Welp, looks like Output Iterators are going to be an issue here:

http://quick-bench.com/i9aGfwlqjxVfsgAsjb_jHEYDaSU

I was kinda hoping that the exponential allocation increase would lead to a lot less copying, but it seems that this only becomes relevant after 10s of millions of insertions...

We could always add overloads for types that support multiple insertions at once, like ostream, string and vector, but really, we'd want user types to be able to leverage that optimization if possible... So it looks like output_adapter_protocol will have to stay as a concept :(.

So the interface would be:

template<typename OutputAdapterType>
void dump_to(OutputAdapterType&& output_adapter, 
             const int indent = -1,
             const char indent_char = ' ',
             const bool ensure_ascii = false,
             const error_handler_t error_handler = error_handler_t::strict) const;

Where OutputAdapterType would be a type that implements the following API:

struct OutputAdapter
{
    using char_type= ... ;

    void put(char_typec);
    void write(const char_type* s, std::size_t count);
};

(std::ostream uses basically that exact same API, so we might as well match it)

The next question is: Given that this duck-typed API will be required, is there any reason to still support LegacyOutputIterator at all?

@abrownsword
Copy link

Was the legacy output interface actually usable externally? I didn't think it was.

I'm not sure I follow -- above you state that by accepting LegacyOutputIterator, we can accept arbitrary user-sinks (a highly desired feature, IMO). But at the bottom you ask if we need to support LegacyOutputIterator at all? Are you simply saying that via duck-typing the user can insert their own adapter, thus we get our customizable functionality, and providing a legacy binding isn't necessary?

@FrancoisChabot
Copy link
Contributor Author

FrancoisChabot commented Jun 5, 2020

My original proposal was to add support for LegacyOutputIterator as the way to accomplish user-defined sinks. That feature is not currently in the library in any way shape or form. That's my default preferred way because it matches the stdlib algorithms (like std::transform).

However, as I started investigating, I discovered that it's not possible to achieve the current performance while using LegacyOutputIterator because that API would be limited to only feeding a single character at a time.

So, we definitely have a need for something else than LegacyOutputIterator. But I was essentially asking: should support for LegacyOutputIterator still be part of the design of this feature for completeness? Or should we just support the ostream-like interface as the only way to implement user-defined sinks?

@abrownsword
Copy link

I see... I thought LegacyOutputIterator was just glue to support the old output adapter interface that's in the library already.

@stale
Copy link

stale bot commented Jul 5, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jul 5, 2020
@stale stale bot closed this as completed Jul 12, 2020
@nlohmann nlohmann reopened this Jul 12, 2020
@stale stale bot removed the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Jul 12, 2020
@stale
Copy link

stale bot commented Aug 16, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated label Aug 16, 2020
@stale stale bot closed this as completed Aug 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement/improvement state: stale the issue has not been updated in a while and will be closed automatically soon unless it is updated
Projects
None yet
Development

No branches or pull requests

3 participants