text/v2: Draw performances using DrawImage #2976
Can you point to the code? Intuitively, it looks to me like each potential draw command in progress could cache a few values that could be compared quickly to know whether we are possibly breaking the draw command or not; I don't see why there's any need to iterate. E.g., we can quickly access the image's atlas pointer to know whether it's the same one we have had throughout the draw. I'm probably missing many things, but it would be nice to contextualize first. What cases make traversals really necessary?
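To make the "cache a few values and compare" idea concrete, here is a minimal sketch. Every name in it (`drawKey`, `batch`, `queue`, `enqueue`) is hypothetical and does not mirror Ebitengine's internals; it only assumes that whatever decides whether two draws can share a command fits in a small comparable struct.

```go
package main

import "fmt"

// drawKey holds the few values a pending draw command would compare.
// Hypothetical: the real set of values in Ebitengine may differ.
type drawKey struct {
	dstAtlas int // identifier of the destination atlas
	srcAtlas int // identifier of the source atlas
	blend    int // blend mode identifier
	shader   int // shader identifier
}

// batch is one pending draw command and the vertices merged into it.
type batch struct {
	key      drawKey
	vertices int
}

type queue struct{ batches []batch }

// enqueue merges the draw into the last batch when the keys are equal,
// and starts a new batch otherwise. The check is a single struct comparison.
func (q *queue) enqueue(k drawKey, nVertices int) {
	if n := len(q.batches); n > 0 && q.batches[n-1].key == k {
		q.batches[n-1].vertices += nVertices
		return
	}
	q.batches = append(q.batches, batch{key: k, vertices: nVertices})
}

func main() {
	var q queue
	glyph := drawKey{dstAtlas: 0, srcAtlas: 1}
	sprite := drawKey{dstAtlas: 0, srcAtlas: 2}
	for i := 0; i < 3; i++ {
		q.enqueue(glyph, 4) // three glyphs on the same atlas merge into one batch
	}
	q.enqueue(sprite, 4) // a different source atlas breaks the batch
	q.enqueue(glyph, 4)  // back to the glyph atlas: a third batch
	fmt.Println(len(q.batches)) // prints 3
}
```

This only illustrates the comparison itself; it says nothing about the other work (transformations, copies) that happens before a draw reaches this point.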
By "traversal", I mean the path each
So, I assume the performance penalty comes from traversing this for each glyph of a big text, even though they get merged in the end. By doing I made another local test by having the glyphs slice already available for both:
So, in the end I don't think
What I'm suggesting is that in scenarios where it is possible to control both the triangles to be drawn and the source image, it should be done this way, since I just see it as a "more advanced" usage of the Ebitengine API, which can happen in user projects as well (think of many sprites, for example).
Note: again, we should not forget that the glyphs' source atlas cannot be controlled, since Ebitengine treats them as random images. This means that if a user is manipulating many source images as well as some glyphs, the glyph images might not be on the same source atlas => which means that a single
edit: Just another tl;dr clarification: it's not triangles vs. image, it's "many images at once" vs. "many times 1 image". It's just that the only way to submit many images at once is by using DrawTriangles + a source atlas.
As for this question specifically, I think @hajimehoshi would have a better answer, but
My understanding is that @tinne26's question is 'why not merge DrawImage calls in an upper layer rather than
Assuming this is the question, yeah, this is possible, but I don't want to do that since the merging logic would be split and distributed. If the performance were much improved, we could consider this.
Ok, a few miscellaneous observations:
So, I'm not saying we try to do any of this, but I definitely feel like meaningful progress on this type of issue would require much more groundwork. Ok, Hajime would be able to do it on his own, sure, but my main point is that if only he is able to do it / visualize it / follow it / understand it, then it's probably not an optimization we want to add (obviously, talking about
Regarding the
Ebitengine should be dead simple for users, but unfortunately not so simple for implementers 😄
I'm happy to draw a big diagram explaining how an image is rendered, from DrawImage down to the graphics driver. Give me some time.
For your first point, yeah I took a few shortcuts obviously, but I think the data in its current form is already relevant. I was hoping for Hajime or someone else to assess if I missed something, by having the gist shared and for people wanting to join to check against it.
For your second point, I suggested this once, but it can probably be done by a user (I wanted to do it) as an external repository to test scenarios against each version of Ebitengine.
I agree that the way this issue is written is mostly addressed to Hajime, because I believe I'm starting to have some knowledge regarding the things you mention, so this is addressed to either Hajime or people who know what I'm referring to.
I mentioned optimizing
And this point is actually the point of the issue. IMO the only consideration is: how difficult would it be to manage an extra atlas at
And I know there would be complexity involved in managing an extra atlas at a higher level, and I'm not really knowledgeable on the topic. I know @hajimehoshi implemented this logic internally, so that's why I left this unsolved (and actually mentioned it in the original post). I wanted his input on this because I know this can be a critical point (if managing a text-level image atlas of glyphs taxes more resources than simply using DrawImage, but I said that in the OP already).
So, just to be completely clear: whether
I made another profile, with the glyphs cached (text.AppendGlyphs) and a simple for loop with
This is just a general suggestion; I can already implement my own
For a user atlas:
IMO we should not go with 2 unless we find a strong reason.
Regarding what Zyko said (agree with the rest of the comment on general context, so focusing only on the technicalities of this particular issue):
So, the solution for preallocating all necessary space for a font atlas would go like this: for each font "face" (combination of size + font; you don't really need all glyphs ever in a single atlas), you iterate from 0 to
To be honest, precomputing the font atlas size is not a good general solution, because in many cases you would end up creating a big "face" for only a few letters, and reserving a lot of space, iterating over all glyphs, and computing their bounds (which, unlike general text positioning with advance and so on, requires iterating over the whole glyph outline) is too wasteful.
Partial recap:
And for that last case, what we have is a general problem that should be solved with an orthogonal API, which can be externally written as a sub-atlas that basically does everything an Ebitengine atlas does, plus being able to reassign and reorganize itself if more space is needed, without breaking contents into multiple separate atlases, and then a
@tinne26 I agree with all of the concerns and ideas.
Yeah, I had 1 in mind so that glyphs share the same source, but a text.Draw call can also share the same source with sprites and other rendering operations (also not sure what the need for unmanaged would be, so regular would be the way to go).
If a user atlas is too big, there would be a lot of unused space on the user atlas and also on an internal atlas. If a user atlas is too small, you might have to recreate it at a larger size, and extending an atlas this way is not very efficient: the new user atlas image would not belong to the original image for a while, since such an image is recognized as a newly created offscreen image used as a rendering destination.
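To illustrate the too-big / too-small trade-off, here is a rough sketch of a fixed-size shelf allocator for a user-side glyph atlas. The type `shelfAtlas` and its `alloc` method are hypothetical, and this is not how Ebitengine's internal atlas packs images; the point is only that a fixed-size atlas eventually reports "no room", at which point the caller has to recreate a larger image and pay the cost described above.

```go
package main

import "fmt"

// shelfAtlas places regions left to right on horizontal "shelves".
// Hypothetical user-side helper, not an Ebitengine type.
type shelfAtlas struct {
	width, height int // fixed size of the backing image
	x, y          int // cursor of the current shelf
	rowH          int // height of the tallest region on the current shelf
}

// alloc reserves a w x h region and returns its top-left corner.
// ok is false when the region does not fit; the atlas state is left
// unchanged in that case, and the caller would have to grow the atlas.
func (a *shelfAtlas) alloc(w, h int) (px, py int, ok bool) {
	if w > a.width {
		return 0, 0, false
	}
	x, y, rowH := a.x, a.y, a.rowH
	if x+w > a.width { // no room left on this shelf: open a new one below
		x, y, rowH = 0, y+rowH, 0
	}
	if y+h > a.height {
		return 0, 0, false
	}
	if h > rowH {
		rowH = h
	}
	a.x, a.y, a.rowH = x+w, y, rowH
	return x, y, true
}

func main() {
	a := &shelfAtlas{width: 16, height: 16}
	for i := 0; i < 5; i++ {
		x, y, ok := a.alloc(8, 8)
		fmt.Println(x, y, ok) // the fifth 8x8 glyph no longer fits in 16x16
	}
}
```

A real implementation would also need eviction or regrowth, which is exactly the complexity being weighed in this thread.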
I'm not sure if my issue is related because I didn't dig into it too much but my observation is that text rendering is too expensive in ebiten.
I use an Intel Mac with an integrated GPU. According to profiling, 95-99% of CPU time is spent inside CGO in every case. My solution/workaround so far is caching: I render every unique text once, cache the images, and then draw them from the cache. It gives me a steady ~10-14% CPU load.
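A minimal sketch of that caching workaround, assuming the rendering step can be hidden behind a function. `renderedText` is a hypothetical stand-in for an offscreen `*ebiten.Image`, and `textCache` / `newTextCache` are made-up names for illustration; in a real game the `render` callback would draw the string into an image once.

```go
package main

import "fmt"

// renderedText stands in for an offscreen image holding pre-rendered text.
// Hypothetical type: a real game would store *ebiten.Image here.
type renderedText struct{ width, height int }

// textCache renders each unique string once and reuses the result afterwards.
type textCache struct {
	entries map[string]*renderedText
	render  func(s string) *renderedText
}

func newTextCache(render func(s string) *renderedText) *textCache {
	return &textCache{entries: map[string]*renderedText{}, render: render}
}

// get returns the cached image for s, rendering it on the first request only.
func (c *textCache) get(s string) *renderedText {
	if r, ok := c.entries[s]; ok {
		return r
	}
	r := c.render(s)
	c.entries[s] = r
	return r
}

func main() {
	renders := 0
	cache := newTextCache(func(s string) *renderedText {
		renders++ // the expensive per-glyph drawing would happen here
		return &renderedText{width: 8 * len(s), height: 16}
	})
	for frame := 0; frame < 60; frame++ {
		cache.get("Score: 10") // drawn from the cache after the first frame
	}
	fmt.Println(renders) // prints 1
}
```

Note that this cache never evicts entries, so text that changes every frame (timers, counters) would make it grow without bound; a size limit or per-frame expiry would be needed in that case.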
@frolosofsky While you are all talking about text rendering performance, there are different issues here:
If you are worried about text rendering performance, I think a reasonable expectation would be that rendering text takes an amount of time in the same order of magnitude as N
Note: We're talking about a CPU bottleneck mostly
Note2: Old related issue #1880
Currently, `text.Draw` makes a call to `DrawImage` for each glyph of a string, which requires a traversal of the internal images' `DrawTriangles` pipeline (transformations, copy, triangles merging mechanisms) before resolving to a draw command.
This can be an issue with large texts, as the performance cost would scale based on the number of glyphs to render.
Since the glyphs share a source, they could benefit from a higher level atlas (owned as an image by `text/v2` or each source).
Then, triangles can be built and submitted all at once in a `DrawTriangles` call instead of many `DrawImage` calls, which would trigger the internal draw pipeline only once.
Potential hidden but unmeasured benefit (discussed on Discord):
It could become difficult for Ebitengine to reliably assign every glyph of the same text command to the same single atlas, because at this stage Ebitengine's internals are not aware of the glyphs' meaning and their parent context.
For example, this could potentially happen if many images are created by a user, as well as many glyphs (from different fonts, for example), or if some new glyphs from a registered font are used for the first time later in the game execution.
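As a rough illustration of the "build all triangles, submit once" idea (this is not the code from the gist), the sketch below turns a list of glyph placements into one vertex slice and one index slice. The `vertex` type is a local stand-in that mirrors only the position and source fields of `ebiten.Vertex`, and `glyph` / `appendGlyphQuads` are hypothetical names; in a real program the two slices would be passed to a single `DrawTriangles` call with the glyph atlas as the source image.

```go
package main

import "fmt"

// vertex mirrors the position/source fields of ebiten.Vertex (stand-in type).
type vertex struct {
	DstX, DstY float32 // position on the destination image
	SrcX, SrcY float32 // position on the shared glyph atlas
}

// glyph is a hypothetical placement: where the glyph lives in the atlas
// and where its top-left corner should land on the destination.
type glyph struct {
	srcX, srcY, w, h int
	x, y             float32
}

// appendGlyphQuads appends one quad (4 vertices, 6 indices) per glyph.
// With uint16 indices, at most 16384 glyphs fit in one vertex slice.
func appendGlyphQuads(vs []vertex, is []uint16, glyphs []glyph) ([]vertex, []uint16) {
	for _, g := range glyphs {
		base := uint16(len(vs))
		w, h := float32(g.w), float32(g.h)
		sx0, sy0 := float32(g.srcX), float32(g.srcY)
		sx1, sy1 := sx0+w, sy0+h
		vs = append(vs,
			vertex{g.x, g.y, sx0, sy0},         // top-left
			vertex{g.x + w, g.y, sx1, sy0},     // top-right
			vertex{g.x, g.y + h, sx0, sy1},     // bottom-left
			vertex{g.x + w, g.y + h, sx1, sy1}, // bottom-right
		)
		is = append(is, base, base+1, base+2, base+1, base+2, base+3)
	}
	return vs, is
}

func main() {
	glyphs := []glyph{
		{srcX: 0, srcY: 0, w: 8, h: 10, x: 0, y: 0},
		{srcX: 8, srcY: 0, w: 6, h: 10, x: 9, y: 0},
		{srcX: 14, srcY: 0, w: 7, h: 10, x: 16, y: 0},
	}
	vs, is := appendGlyphQuads(nil, nil, glyphs)
	fmt.Println(len(vs), len(is)) // prints 12 18
}
```

Reusing the `vs` and `is` slices between frames (resetting them with `vs[:0]`) would avoid reallocating them for every text draw.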
Here is a gist of a beginning of an implementation (profiling code is included because that's how I could measure): https://gist.github.com/Zyko0/413c536b625b7a7dc27ef2030ddfa027
Here is the pprof output of the `Game.Draw` function for a 95-second profile:
(`TextDraw` is the triangles implementation from the gist above)
Extra notes:
`MaxIndicesCount`, the triangles commands from this function would have to be split maybe?
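If such a split turned out to be necessary, it could look roughly like the sketch below. `splitIndices` is a hypothetical helper and the limit is passed in as a plain parameter; whether Ebitengine actually requires callers to split is the open question raised in the note above.

```go
package main

import "fmt"

// splitIndices cuts a triangle index list into chunks of at most maxIndices
// entries. The limit is rounded down to a multiple of 3 so that no triangle
// is split across two chunks. The chunks share the original backing array.
func splitIndices(indices []uint16, maxIndices int) [][]uint16 {
	limit := maxIndices / 3 * 3
	if limit <= 0 {
		return nil
	}
	var chunks [][]uint16
	for len(indices) > 0 {
		n := limit
		if n > len(indices) {
			n = len(indices)
		}
		chunks = append(chunks, indices[:n])
		indices = indices[n:]
	}
	return chunks
}

func main() {
	// Two glyph quads: 12 indices, with an artificially small limit of 7.
	indices := []uint16{0, 1, 2, 1, 2, 3, 4, 5, 6, 5, 6, 7}
	for _, chunk := range splitIndices(indices, 7) {
		fmt.Println(chunk) // each chunk would be one DrawTriangles call
	}
}
```

Each chunk can be submitted together with the full vertex slice, since the indices still refer to positions in that slice.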