Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

git_buf: now a public-only API (git_str is our internal API) #6078

Merged
merged 2 commits into from
Oct 17, 2021

Conversation

ethomson
Copy link
Member

To achieve the same result as #5534 (but in a completely opposite way), this makes git_buf a public only structure.

  1. We have not been completely consistent about git_bufs use in public APIs. This provides documentation as to how the APIs that return data in a git_buf will behave.
  2. Separate the public git_buf API from our internal string manipulation functions. The public git_buf is now only for providing a hunk of memory back to a user with a mechanism for them to reliably free it. Our internal string (and buffer) manipulation functions are now git_str.

After the work that we did throughout the post-v1.0 releases, point (2) is largely a mechanical exercise.

This introduces some helper functions to (safely) adapt a git_buf into a git_str for use internally.

Since many APIs are not strictly public-only (we want to be able to call them internally), this introduces a helpful wrapper function that will allow us to simply produce a wrapper API over our private APIs that expose it with a git_buf. This allows us to have a simple private API using a git_str that is wrapped with a git_buf without much boilerplate.

@ethomson ethomson force-pushed the ethomson/gitstr branch 2 times, most recently from f70883f to fe07fa6 Compare September 30, 2021 16:42
Copy link
Member

@carlosmn carlosmn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a diff. I'll admit I didn't check every single file for the conversion but did go over the other ones.

It's kinda big for everything to be changing, but it is better to have different types for internal and external uses, and the only users who might notice are those implementing things close to the library and which should expect changes either way.

docs/buffers.md Outdated Show resolved Hide resolved
docs/buffers.md Outdated Show resolved Hide resolved
git_str str = GIT_STR_INIT; \
int error; \
if ((error = git_buf_tostr(&str, buf)) == 0 && \
(error = fn(&str, __VA_ARGS__)) == 0) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the opposite logic from what we would normally use in our series of ifs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is - that's intentional. Since fn is arbitrary, it might return > 0 for some reason. (I don't recall if any of them do offhand) and that should be returned to the user. I find the above more readable than what would be closer to our normal logic:

if ((error = git_buf_tostr(&str, buf)) < 0 || \
    (error = fn(&str, __VA_ARGS__)) != 0) \

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair enough

We have been inconsistent about the way that we handle `git_buf`s
provided by users.  _Usually_ we require that it has been properly
initialized with `GIT_BUF_INIT`, but _sometimes_ we simply overwrite
the data in it regardless.  And even more rarely, we will grow a
user-provided buffer and concatenate data onto it (see
`git_diff_format_email`).

Document the path forward for `git_buf`, which is that we always
require that the buffer is intitialized with `GIT_BUF_INIT`.

`git_diff_format_email` will be kept backward compatible but users
are encouraged to switch to the new `git_email` APIs.
libgit2 has two distinct requirements that were previously solved by
`git_buf`.  We require:

1. A general purpose string class that provides a number of utility APIs
   for manipulating data (eg, concatenating, truncating, etc).
2. A structure that we can use to return strings to callers that they
   can take ownership of.

By using a single class (`git_buf`) for both of these purposes, we have
confused the API to the point that refactorings are difficult and
reasoning about correctness is also difficult.

Move the utility class `git_buf` to be called `git_str`: this represents
its general purpose, as an internal string buffer class.  The name also
is an homage to Junio Hamano ("gitstr").

The public API remains `git_buf`, and has a much smaller footprint.  It
is generally only used as an "out" param with strict requirements that
follow the documentation.  (Exceptions exist for some legacy APIs to
avoid breaking callers unnecessarily.)

Utility functions exist to convert a user-specified `git_buf` to a
`git_str` so that we can call internal functions, then converting it
back again.
@ethomson ethomson merged commit e61abdc into main Oct 17, 2021
@ethomson ethomson deleted the ethomson/gitstr branch October 17, 2021 16:55
craigds added a commit to koordinates/cppgit2 that referenced this pull request Mar 2, 2022
This change is slightly backward incompatible. libgit2 1.4.0 made a
chagne to the `git_buf` structure and deprecated a bunch of associated
functions, to the effect that `git_buf` is no longer modifiable by
users. It is expected that `git_buf` should only be set by libgit2
itself as a return from various functions.

So this change to the `data_buffer` class removes all modifier methods.
A data_buffer is returned where libgit2 returns a `git_buf`, but we no
longer allow the caller to modify the buffer data or grow the buffer.

related: libgit2/libgit2#6078

Perhaps we should change methods returning a `data_buffer` to return a
`std::string` instead?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants