-
-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[std] Add StringBuf.clear() #11848
base: development
Are you sure you want to change the base?
[std] Add StringBuf.clear() #11848
Conversation
@@ -101,6 +101,10 @@ | |||
throw "Invalid unicode char " + c; | |||
} | |||
|
|||
public function clear():Void { | |||
pos = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if the hl.Bytes
instance should also be reinitialized for memory reasons. In theory you could have a huge StringBuf that never releases its memory after being cleared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The StringBuf
staying huge after being cleared is intentional! For example, if you built a huge string once:
hl.Gc.major();
Sys.println("before start: " + hl.Gc.stats());
var sb = new StringBuf();
for (i in 0...200000) {
sb.add("hello");
}
hl.Gc.major();
Sys.println("after filling: " + hl.Gc.stats());
before start: {currentMemory : 589824, allocationCount : 137, totalAllocated : 7104}
after filling: {currentMemory : 3145728, allocationCount : 191, totalAllocated : 7706696}
You may also want to build another one in the future. In that case, you don't pay the cost of allocating buffers. Multiple, if dealing with small elements and the internals need to constantly reallocate and blit.
@:privateAccess sb.pos = 0; // clear()
for (i in 0...200000) {
sb.add("world");
}
hl.Gc.major();
Sys.println("after reusing: " + hl.Gc.stats());
after reusing: {currentMemory : 3145728, allocationCount : 212, totalAllocated : 7707912}
If you worry however about hl.Bytes
leaking, the semantics around it did not change - if Bytes
are indeed GC-able, they should leak if and only if the StringBuf
also leaks (user error 😅), the same as it currently functions. (Ditto for compacting the heap/releasing memory back to the OS - there's currently nothing stopping users from creating big, non-clearable StringBufs. If this was a concern, this would be a concern right now on stable.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you're saying, it's just that this is not exactly the behavior I'd expect when clearing a buffer. From a design perspective, I expect existingBuf = new StringBuf()
and existingBuf.clear()
to have the same memory behavior. Introducing a subtle difference like this can easily lead to confusion.
eb20fac
to
b658699
Compare
Documentation isn't the solution here, I'll always expect a clear function to just clear the buffer without keeping some reserved memory around. Let's please make sure that this is the standard behavior on all targets. What you're describing has merit, but it is an orthogonal problem: we would want to be able to pre-allocate memory on a newly created buffer as well, without having to fill it once. Something like Java's |
Motivation
StringBuf
was created as an efficient way to concatenate multiple strings. Its main characteristic is that, on all targets, it's never slower than naively joining those strings, often being much faster.Given that a substantial portion of Haxe developers are gamedevs, who tend to obsess over extraneous allocations (GC is unpredictable), another important fact about
StringBuf
is that it also typically allocates less memory than the alternative approach.However, even in these performance-aware contexts it is necessary to construct many new
StringBuf
instances. There is no way to reuse the existing instance, perhaps with an already extended internal buffer.Implementation
This PR introduces a new method,
StringBuf.clear(): Void
.Its purpose is to remove all characters from that
StringBuf
. After clearing a buffer, itslength
is 0 andtoString()
returns an empty string. The buffer can then be used again.add(x)
is implemented as a naive concatb += x
,clear
simply sets the buffer to an empty string. (ThisStringBuf
instance does not have to be reallocated.)clear
reallocates the internal buffer. (The buffer's wrapper abstract and thisStringBuf
instance do not have to be reallocated.)clear
moves the buffer index/pointer to 0 (beginning). For performance purposes, the previous data in the buffer is not zeroed.Closing thoughts
Another suggested name for such a method is
reset()
. I have not chosen this name, because it implies to me that the whole implementation is being brought back to its original state, including zeroing the internal buffer – which, for performance, is not the case here.Reusable
StringBuf
s can, potentially, be pooled. However, given many different project needs, especially with regards to threaded usage support, such a pool was not implemented in this PR.