Rewrite AtlasEngine to allow arbitrary overhangs #14959

Merged
44 commits
f24a9ea
A minor AtlasEngine refactoring
lhecker Mar 6, 2023
6ba233a
Fix transparency, scrolling, dirty rects
lhecker Mar 7, 2023
7ddddfe
Improve performance, Fix OOM when drawing whitespace
lhecker Mar 8, 2023
1eafcd4
Fix glyph rounding error, Fix custom shaders
lhecker Mar 8, 2023
01e596c
Adapter selection, Overlapping gridlines, QuadInstance simplification
lhecker Mar 15, 2023
c270284
Better dirty rect tracking and partial rerendering (WIP)
lhecker Mar 15, 2023
339b892
Mostly fix dirty rects, Reduce memory/PCIe usage
lhecker Mar 16, 2023
d44974a
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Mar 20, 2023
694daa7
Fix dirty area calculation, Add ATLAS_DEBUG_SHOW_DIRTY
lhecker Mar 20, 2023
badbd49
Finally fix broken rendering in BackendD3D
lhecker Mar 21, 2023
0d44fe4
Fix dirty rects in BackendD2D, Investigate broken support for hinted …
lhecker Mar 21, 2023
6232dfd
Implement line renditions for BackendD2D
lhecker Mar 23, 2023
da40a01
Fix D2D emoji rendering, Add support for line renditions
lhecker Mar 25, 2023
5d16e7e
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Mar 25, 2023
4879a36
Fix glyph measurements, Fix font axis support, Begin implementing sof…
lhecker Mar 27, 2023
2c06f8b
Fix glyph retry crash, Hyperlink hovering, Swap chain startup crash, …
lhecker Mar 28, 2023
c32bfec
Silence spell check
lhecker Mar 28, 2023
f068688
Simplify dxgi adapter invalidation, Fix dirty rect on backend recreation
lhecker Mar 28, 2023
f95d435
DWM folks said to test for IsCurrent(), Added basic soft font support
lhecker Mar 30, 2023
0f3b1d3
Implement line renditions for soft fonts
lhecker Mar 30, 2023
ec5f208
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Mar 30, 2023
20cb489
Fix AuditMode failures
lhecker Mar 30, 2023
4caf341
Fix background opacity in BackendD2D
lhecker Mar 31, 2023
d0fcc5b
Slightly reduce memory usage, Clean up AntialiasingMode, Document IDW…
lhecker Mar 31, 2023
4803617
Fix line endings, Remove weird IDWriteFontFace_SoftFont, Add flat_set…
lhecker Apr 2, 2023
4ef2b3f
Implement inverted cursors for D2D, Make _appendQuad a prettier & fas…
lhecker Apr 3, 2023
4aa71a1
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Apr 3, 2023
f8f0ea1
Fix background color alpha
lhecker Apr 3, 2023
2e03220
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Apr 4, 2023
7f1707b
Integrate changes to linear_flat_set from main
lhecker Apr 4, 2023
2602fa3
Fix AuditMode failures
lhecker Apr 4, 2023
d9b66ab
Some cleanup, Ligature per-cell coloring
lhecker Apr 6, 2023
b60bbc9
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Apr 7, 2023
da93dbd
Improve vertical coloring of overhangs, Fix hyperlink underline
lhecker Apr 7, 2023
1b9cd8d
Fix overlap split for double width glyphs
lhecker Apr 7, 2023
039e27f
Fix AuditMode, Fix DRCS baseline, Fix DECDWL color bitmaps
lhecker Apr 7, 2023
93722f8
Lots and lots and lots of fixes
lhecker Apr 11, 2023
9a5a8ec
Move swap chain responsibility from backends to AtlasEngine
lhecker Apr 15, 2023
26a5ab3
Merge remote-tracking branch 'origin/main' into dev/lhecker/atlas-eng…
lhecker Apr 21, 2023
270c1ba
Begin writing documentation, Fix some ATLAS_ATTR_COLD on BackendD2D
lhecker Apr 21, 2023
900b6a9
Change the inverted cursor rendering approach
lhecker Apr 25, 2023
b1590cc
Improve inverted cursor via hole punching
lhecker Apr 25, 2023
990f57a
Fix hole punching algorithm, Implement semi-reverse cursors
lhecker Apr 26, 2023
ab13e16
Add an Emoji shortcut
lhecker Apr 26, 2023
Change the inverted cursor rendering approach
lhecker committed Apr 25, 2023
commit 900b6a9d53dbc6da8ecd37489607eee28879674b
133 changes: 73 additions & 60 deletions src/renderer/atlas/AtlasEngine.cpp
@@ -87,75 +87,88 @@ try
_api.invalidatedRows.start = std::min(_api.invalidatedRows.start, _p.s->cellCount.y);
_api.invalidatedRows.end = clamp(_api.invalidatedRows.end, _api.invalidatedRows.start, _p.s->cellCount.y);
}

const auto allInvalid = _api.invalidatedRows == range<u16>{ 0, _p.s->cellCount.y };

// Avoid scrolling if everything's invalid anyways. This isn't here for performance or correctness
// (the code also works without this), but rather because it helps me reason about the way this works.
// For instance it ensures we don't pass a scroll rect to Present1() when effectively nothing is scrolling.
if (allInvalid)
{
const auto limit = gsl::narrow_cast<i16>(_p.s->cellCount.y & 0x7fff);
_api.scrollOffset = gsl::narrow_cast<i16>(clamp<int>(_api.scrollOffset, -limit, limit));
_api.scrollOffset = 0;
}

// Scroll the buffer by the given offset and mark the newly uncovered rows as "invalid".
if (const auto offset = _api.scrollOffset)
else
{
const auto nothingInvalid = _api.invalidatedRows.start == _api.invalidatedRows.end;

if (offset < 0)
{
// scrollOffset/offset = -1
// +----------+ +----------+
// | | | xxxxxxxxx|
// | xxxxxxxxx| -> |xxxxxxx |
// |xxxxxxx | | |
// +----------+ +----------+
const u16 begRow = _p.s->cellCount.y + offset;
_api.invalidatedRows.start = nothingInvalid ? begRow : std::min(_api.invalidatedRows.start, begRow);
_api.invalidatedRows.end = _p.s->cellCount.y;

const auto dst = std::copy_n(_p.rows.begin() - offset, _p.rows.size() + offset, _p.rowsScratch.begin());
std::copy_n(_p.rows.begin(), -offset, dst);
}
else
{
// scrollOffset/offset = 1
// +----------+ +----------+
// | xxxxxxxxx| | |
// |xxxxxxx | -> | xxxxxxxxx|
// | | |xxxxxxx |
// +----------+ +----------+
const u16 endRow = offset;
_api.invalidatedRows.start = 0;
_api.invalidatedRows.end = nothingInvalid ? endRow : std::max(_api.invalidatedRows.end, endRow);

const auto dst = std::copy_n(_p.rows.end() - offset, offset, _p.rowsScratch.begin());
std::copy_n(_p.rows.begin(), _p.rows.size() - offset, dst);
}
const auto limit = gsl::narrow_cast<i16>(_p.s->cellCount.y & 0x7fff);
const auto offset = gsl::narrow_cast<i16>(clamp<int>(_api.scrollOffset, -limit, limit));

std::swap(_p.rows, _p.rowsScratch);
_api.scrollOffset = offset;

// Scrolling the background bitmap is a lot easier because we can rely on memmove which works
// with both forwards and backwards copying. It's a mystery why the STL doesn't have this.
// Scroll the buffer by the given offset and mark the newly uncovered rows as "invalid".
if (offset)
{
const auto srcOffset = std::max<ptrdiff_t>(0, -offset) * gsl::narrow_cast<ptrdiff_t>(_p.colorBitmapRowStride);
const auto dstOffset = std::max<ptrdiff_t>(0, offset) * gsl::narrow_cast<ptrdiff_t>(_p.colorBitmapRowStride);
const auto count = _p.colorBitmapDepthStride - std::max(srcOffset, dstOffset);
assert(dstOffset >= 0 && dstOffset + count <= _p.colorBitmapDepthStride);
assert(srcOffset >= 0 && srcOffset + count <= _p.colorBitmapDepthStride);
const auto nothingInvalid = _api.invalidatedRows.start == _api.invalidatedRows.end;

auto src = _p.colorBitmap.data() + srcOffset;
auto dst = _p.colorBitmap.data() + dstOffset;
const auto bytes = count * sizeof(u32);
if (offset < 0)
{
// scrollOffset/offset = -1
// +----------+ +----------+
// | | | xxxxxxxxx|
// | xxxxxxxxx| -> |xxxxxxx |
// |xxxxxxx | | |
// +----------+ +----------+
const u16 begRow = _p.s->cellCount.y + offset;
_api.invalidatedRows.start = nothingInvalid ? begRow : std::min(_api.invalidatedRows.start, begRow);
_api.invalidatedRows.end = _p.s->cellCount.y;

const auto dst = std::copy_n(_p.rows.begin() - offset, _p.rows.size() + offset, _p.rowsScratch.begin());
std::copy_n(_p.rows.begin(), -offset, dst);
}
else
{
// scrollOffset/offset = 1
// +----------+ +----------+
// | xxxxxxxxx| | |
// |xxxxxxx | -> | xxxxxxxxx|
// | | |xxxxxxx |
// +----------+ +----------+
const u16 endRow = offset;
_api.invalidatedRows.start = 0;
_api.invalidatedRows.end = nothingInvalid ? endRow : std::max(_api.invalidatedRows.end, endRow);

const auto dst = std::copy_n(_p.rows.end() - offset, offset, _p.rowsScratch.begin());
std::copy_n(_p.rows.begin(), _p.rows.size() - offset, dst);
}

for (size_t i = 0; i < 2; ++i)
std::swap(_p.rows, _p.rowsScratch);

// Scrolling the background bitmap is a lot easier because we can rely on memmove which works
// with both forwards and backwards copying. It's a mystery why the STL doesn't have this.
{
// Avoid bumping the colorBitmapGeneration unless necessary. This approx. further halves
// the (already small) GPU load. This could easily be replaced with some custom SIMD
// to avoid going over the memory twice, but... that's a story for another day.
if (memcmp(dst, src, bytes) != 0)
const auto srcOffset = std::max<ptrdiff_t>(0, -offset) * gsl::narrow_cast<ptrdiff_t>(_p.colorBitmapRowStride);
const auto dstOffset = std::max<ptrdiff_t>(0, offset) * gsl::narrow_cast<ptrdiff_t>(_p.colorBitmapRowStride);
const auto count = _p.colorBitmapDepthStride - std::max(srcOffset, dstOffset);
assert(dstOffset >= 0 && dstOffset + count <= _p.colorBitmapDepthStride);
assert(srcOffset >= 0 && srcOffset + count <= _p.colorBitmapDepthStride);

auto src = _p.colorBitmap.data() + srcOffset;
auto dst = _p.colorBitmap.data() + dstOffset;
const auto bytes = count * sizeof(u32);

for (size_t i = 0; i < 2; ++i)
{
memmove(dst, src, bytes);
_p.colorBitmapGenerations[i].bump();
// Avoid bumping the colorBitmapGeneration unless necessary. This approx. further halves
// the (already small) GPU load. This could easily be replaced with some custom SIMD
// to avoid going over the memory twice, but... that's a story for another day.
if (memcmp(dst, src, bytes) != 0)
{
memmove(dst, src, bytes);
_p.colorBitmapGenerations[i].bump();
}

src += _p.colorBitmapDepthStride;
dst += _p.colorBitmapDepthStride;
}

src += _p.colorBitmapDepthStride;
dst += _p.colorBitmapDepthStride;
}
}
}
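Reader's illustration (not part of the commit): the hunk above clamps the scroll offset to the row count, rotates the row list, and widens the invalidated range so that it covers the newly uncovered rows. The sketch below reproduces that logic in isolation. `RowRange` and `scrollRows` are hypothetical names, and it uses `std::rotate` where the commit uses two `std::copy_n` calls into `_p.rowsScratch` followed by a swap, so treat it as an approximation of the behavior rather than the engine's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the engine's range<u16>.
struct RowRange
{
    uint16_t start = 0;
    uint16_t end = 0;
};

// Scrolls `rows` by `offset` rows (negative = content moves up) and returns
// the rows that need redrawing. `invalidated` is what was dirty beforehand.
template<typename Row>
RowRange scrollRows(std::vector<Row>& rows, int offset, RowRange invalidated)
{
    const int count = static_cast<int>(rows.size());
    // Clamp so that the iterator arithmetic below can never leave the vector.
    offset = std::max(-count, std::min(offset, count));
    if (offset == 0)
    {
        return invalidated;
    }

    const bool nothingInvalid = invalidated.start == invalidated.end;

    if (offset < 0)
    {
        // The bottom -offset rows are newly uncovered.
        const auto begRow = static_cast<uint16_t>(count + offset);
        invalidated.start = nothingInvalid ? begRow : std::min(invalidated.start, begRow);
        invalidated.end = static_cast<uint16_t>(count);
        std::rotate(rows.begin(), rows.begin() - offset, rows.end());
    }
    else
    {
        // The top offset rows are newly uncovered.
        const auto endRow = static_cast<uint16_t>(offset);
        invalidated.start = 0;
        invalidated.end = nothingInvalid ? endRow : std::max(invalidated.end, endRow);
        std::rotate(rows.begin(), rows.end() - offset, rows.end());
    }

    return invalidated;
}
```

With four rows `{0, 1, 2, 3}` and an offset of -1, the rows become `{1, 2, 3, 0}` and the returned range is `[3, 4)`, matching the first ASCII diagram in the hunk.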
@@ -177,7 +190,7 @@ try
_p.cursorRect = {};
_p.scrollOffset = _api.scrollOffset;

if (_api.invalidatedRows.start != _api.invalidatedRows.end)
if (_api.invalidatedRows.non_empty())
{
const auto deltaPx = _api.scrollOffset * _p.s->font->cellSize.y;
const til::CoordType targetSizeX = _p.s->targetSize.x;
@@ -213,7 +226,7 @@ try
// I feel a little bit like this is a hack, but I'm not sure how to better express this.
// This ensures that we end up calling Present1() without dirty rects if the swap chain is
// recreated/resized, because DXGI requires you to then call Present1() without dirty rects.
if (_api.invalidatedRows == range<u16>{ 0, _p.s->cellCount.y })
if (allInvalid)
{
_p.dirtyRectInPx.top = 0;
_p.dirtyRectInPx.bottom = targetSizeY;
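Reader's illustration (not part of the commit): the background bitmap scroll in the first AtlasEngine.cpp hunk relies on `memmove` because the source and destination regions overlap, and it skips both the copy and the generation bump when `memcmp` finds the two regions already identical. A minimal sketch of that idea for a single bitmap plane follows. `scrollBitmap` is a hypothetical helper, and the returned flag stands in for the `colorBitmapGenerations[i].bump()` call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Scrolls a row-major bitmap by `offset` rows (negative = content moves up).
// Returns true only if the pixel data actually changed.
bool scrollBitmap(std::vector<uint32_t>& bitmap, std::size_t rowStride, std::ptrdiff_t offset)
{
    const auto total = static_cast<std::ptrdiff_t>(bitmap.size());
    const auto stride = static_cast<std::ptrdiff_t>(rowStride);
    const auto srcOffset = std::max<std::ptrdiff_t>(0, -offset) * stride;
    const auto dstOffset = std::max<std::ptrdiff_t>(0, offset) * stride;
    const auto count = total - std::max(srcOffset, dstOffset);
    if (offset == 0 || count <= 0)
    {
        return false;
    }

    const auto src = bitmap.data() + srcOffset;
    const auto dst = bitmap.data() + dstOffset;
    const auto bytes = static_cast<std::size_t>(count) * sizeof(uint32_t);

    // Uniform backgrounds are common, so comparing first often avoids
    // dirtying the bitmap (and the GPU upload that would follow).
    if (std::memcmp(dst, src, bytes) == 0)
    {
        return false;
    }

    // memmove handles overlapping regions in both directions.
    std::memmove(dst, src, bytes);
    return true;
}
```

The commit runs the same comparison and copy once per bitmap plane (background and foreground), advancing both pointers by `colorBitmapDepthStride` between iterations.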
68 changes: 20 additions & 48 deletions src/renderer/atlas/BackendD2D.cpp
@@ -484,38 +484,17 @@ void BackendD2D::_drawCursorPart1(const RenderingPayload& p)
}

const auto cursorColor = p.s->cursor->cursorColor;
if (cursorColor == 0xffffffff)
{
const auto cursorSize = p.cursorRect.size();
if (cursorSize != _cursorBitmapSize)
{
_resizeCursorBitmap(p, cursorSize);
}

const auto backgroundBitmapOffset = p.cursorRect.top * p.colorBitmapRowStride;
const auto cellSizeX = static_cast<f32>(p.s->font->cellSize.x);
const auto cellSizeY = static_cast<f32>(p.s->font->cellSize.y);
const auto offsetX = p.cursorRect.left * cellSizeX;
const auto offsetY = p.cursorRect.top * cellSizeY;

D2D1_RECT_F srcRect{
.bottom = cursorSize.height * cellSizeY,
};
D2D1_RECT_F dstRect{
.top = offsetY,
.bottom = offsetY + srcRect.bottom,
if (cursorColor != 0xffffffff)
{
const D2D1_RECT_F rect{
static_cast<f32>(p.cursorRect.left * p.s->font->cellSize.x),
static_cast<f32>(p.cursorRect.top * p.s->font->cellSize.y),
static_cast<f32>(p.cursorRect.right * p.s->font->cellSize.x),
static_cast<f32>(p.cursorRect.bottom * p.s->font->cellSize.y),
};

for (til::CoordType x = 0; x < cursorSize.width; ++x)
{
const auto bg = p.backgroundBitmap[backgroundBitmapOffset + x];
const auto brush = _brushWithColor(bg ^ 0x3f3f3f);
srcRect.left = x * cellSizeX;
srcRect.right = srcRect.left + cellSizeX;
dstRect.left = srcRect.left + offsetX;
dstRect.right = srcRect.right + offsetX;
_renderTarget->FillOpacityMask(_cursorBitmap.get(), brush, &dstRect, &srcRect);
}
const auto brush = _brushWithColor(cursorColor);
_drawCursor(p, _renderTarget.get(), rect, brush);
}
}

@@ -526,26 +505,19 @@ void BackendD2D::_drawCursorPart2(const RenderingPayload& p)
return;
}

const auto cursorColor = p.s->cursor->cursorColor;
const D2D1_POINT_2F target{
static_cast<f32>(p.cursorRect.left * p.s->font->cellSize.x),
static_cast<f32>(p.cursorRect.top * p.s->font->cellSize.y),
};

if (cursorColor == 0xffffffff)
{
_renderTarget->DrawImage(_cursorBitmap.get(), &target, nullptr, D2D1_INTERPOLATION_MODE_NEAREST_NEIGHBOR, D2D1_COMPOSITE_MODE_MASK_INVERT);
}
else
if (p.s->cursor->cursorColor == 0xffffffff)
{
const D2D1_RECT_F rect{
target.x,
target.y,
static_cast<f32>(p.cursorRect.right * p.s->font->cellSize.x),
static_cast<f32>(p.cursorRect.bottom * p.s->font->cellSize.y),
const auto cursorSize = p.cursorRect.size();
if (cursorSize != _cursorBitmapSize)
{
_resizeCursorBitmap(p, cursorSize);
}

const D2D1_POINT_2F target{
static_cast<f32>(p.cursorRect.left * p.s->font->cellSize.x),
static_cast<f32>(p.cursorRect.top * p.s->font->cellSize.y),
};
const auto brush = _brushWithColor(cursorColor);
_drawCursor(p, _renderTarget.get(), rect, brush);
_renderTarget->DrawImage(_cursorBitmap.get(), &target, nullptr, D2D1_INTERPOLATION_MODE_NEAREST_NEIGHBOR, D2D1_COMPOSITE_MODE_MASK_INVERT);
}
}

163 changes: 84 additions & 79 deletions src/renderer/atlas/BackendD3D.cpp
@@ -177,24 +177,6 @@ BackendD3D::BackendD3D(const RenderingPayload& p)
THROW_IF_FAILED(p.device->CreateBlendState(&desc, _blendState.addressof()));
}

{
static constexpr D3D11_BLEND_DESC desc{
.RenderTarget = { {
.BlendEnable = TRUE,
.SrcBlend = D3D11_BLEND_ONE,
.DestBlend = D3D11_BLEND_ONE,
.BlendOp = D3D11_BLEND_OP_SUBTRACT,
// In order for D3D to be okay with us using dual source blending in the shader, we need to use dual
// source blending in the blend state. Alternatively we could write an extra shader for these cursors.
.SrcBlendAlpha = D3D11_BLEND_SRC1_ALPHA,
.DestBlendAlpha = D3D11_BLEND_ZERO,
.BlendOpAlpha = D3D11_BLEND_OP_ADD,
.RenderTargetWriteMask = D3D11_COLOR_WRITE_ENABLE_ALL,
} },
};
THROW_IF_FAILED(p.device->CreateBlendState(&desc, _blendStateInvert.addressof()));
}

#ifndef NDEBUG
_sourceDirectory = std::filesystem::path{ __FILE__ }.parent_path();
_sourceCodeWatcher = wil::make_folder_change_reader_nothrow(_sourceDirectory.c_str(), false, wil::FolderChangeEvents::FileName | wil::FolderChangeEvents::LastWriteTime, [this](wil::FolderChangeEvent, PCWSTR path) {
Review comment from a project member: note to self: copy this for hot-reloading user-defined pixel shaders

@@ -239,9 +221,9 @@ void BackendD3D::Render(RenderingPayload& p)
#endif

_drawBackground(p);
_drawCursorPart1(p);
_drawCursorBackground(p);
_drawText(p);
_drawCursorPart2(p);
//_drawCursorPart2(p);
_drawSelection(p);
#if ATLAS_DEBUG_SHOW_DIRTY
_debugShowDirty(p);
@@ -803,11 +785,6 @@ void BackendD3D::_resizeGlyphAtlas(const RenderingPayload& p, const u16 u, const
_rectPackerData = Buffer<stbrp_node>{ u };
}

void BackendD3D::_markStateChange(ID3D11BlendState* blendState)
{
_instancesStateChanges.emplace_back(blendState, _instancesCount);
}

BackendD3D::QuadInstance& BackendD3D::_getLastQuad() noexcept
{
assert(_instancesCount != 0);
@@ -847,6 +824,11 @@ void BackendD3D::_flushQuads(const RenderingPayload& p)
return;
}

if (p.s->cursor->cursorColor == 0xffffffff && !_cursorRects.empty())
{
_drawCursorInvert();
}

// TODO: Shrink instances buffer
if (_instancesCount > _instanceBufferCapacity)
{
@@ -879,24 +861,7 @@ void BackendD3D::_flushQuads(const RenderingPayload& p)
// Instead I found that packing instance data as tightly as possible made the biggest performance difference,
// and packing 16 bit integers with ID3D11InputLayout is quite a bit more convenient too.

// This will cause the loop below to emit one final DrawIndexedInstanced() for the remainder of instances.
_markStateChange(nullptr);

size_t previousOffset = 0;
for (const auto& state : _instancesStateChanges)
{
if (const auto count = state.offset - previousOffset)
{
p.deviceContext->DrawIndexedInstanced(6, count, 0, 0, previousOffset);
}
if (state.blendState)
{
p.deviceContext->OMSetBlendState(state.blendState, nullptr, 0xffffffff);
}
previousOffset = state.offset;
}

_instancesStateChanges.clear();
p.deviceContext->DrawIndexedInstanced(6, static_cast<UINT>(_instancesCount), 0, 0, 0);
_instancesCount = 0;
}

@@ -945,7 +910,6 @@ void BackendD3D::_drawBackground(const RenderingPayload& p)
.shadingType = ShadingType::Background,
.size = p.s->targetSize,
};
_flushQuads(p);
}

void BackendD3D::_uploadBackgroundBitmap(const RenderingPayload& p)
@@ -1256,18 +1220,18 @@ bool BackendD3D::_drawGlyph(const RenderingPayload& p, const BackendD3D::AtlasFo
// This calculates the black box of the glyph, or in other words,
// its extents/size relative to its baseline origin (at 0,0).
//
// box.top --------++-----######--+
// bounds.top ------++-----######--+
// (-7) || ############
// ||#### ####
// |### #####
// baseline _____ |### #####|
// origin \ |############# |
// (= 0,0) \||########### |
// baseline ______ |### #####|
// origin \|############# |
// (= 0,0) \|########### |
// ++-------###---+
// ## ### |
// box.bottom -----+#########-----+
// bounds.bottom ---+#########-----+
// (+2) | |
// box.left box.right
// bounds.left bounds.right
// (-1) (+14)
//

@@ -1625,7 +1589,7 @@ void BackendD3D::_drawGridlineRow(const RenderingPayload& p, const ShapedRow* ro
}
}

void BackendD3D::_drawCursorPart1(const RenderingPayload& p)
void BackendD3D::_drawCursorBackground(const RenderingPayload& p)
{
_cursorRects.clear();

@@ -1654,7 +1618,7 @@ void BackendD3D::_drawCursorPart1(const RenderingPayload& p)
static_cast<u16>(p.s->font->cellSize.x * (x1 - x0)),
p.s->font->cellSize.y,
};
const auto color = cursorColor == 0xffffffff ? bg ^ 0x3f3f3f : cursorColor;
const auto color = cursorColor == 0xffffffff ? bg ^ 0xc0c0c0 : cursorColor;
auto& c0 = _cursorRects.emplace_back(position, size, color);

switch (static_cast<CursorType>(p.s->cursor->cursorType))
@@ -1716,51 +1680,92 @@ void BackendD3D::_drawCursorPart1(const RenderingPayload& p)
}
}

for (const auto& c : _cursorRects)
{
_appendQuad() = {
.shadingType = ShadingType::SolidFill,
.position = c.position,
.size = c.size,
.color = c.color,
};
}

if (cursorColor == 0xffffffff)
{
for (auto& c : _cursorRects)
{
_appendQuad() = {
.shadingType = ShadingType::SolidFill,
.position = c.position,
.size = c.size,
.color = c.color,
};
c.color = 0xffffffff;
}
}
}

void BackendD3D::_drawCursorPart2(const RenderingPayload& p)
void BackendD3D::_drawCursorInvert()
{
if (_cursorRects.empty())
// NOTE: _appendQuad() may reallocate the _instances vector. It's important to iterate
// by index, because pointers (or iterators) would get invalidated. It's also important
// to cache the original _instancesCount since it'll get changed with each append.
const auto instancesCount = _instancesCount;

for (const auto& c : _cursorRects)
{
return;
}
const int cursorL = c.position.x;
const int cursorT = c.position.y;
const int cursorR = cursorL + c.size.x;
const int cursorB = cursorT + c.size.y;

const auto color = p.s->cursor->cursorColor;
for (size_t i = 0; i < instancesCount; ++i)
{
const auto& it = _instances[i];
const auto shadingType = it.shadingType;

if (color == 0xffffffff)
{
_markStateChange(_blendStateInvert.get());
}
if (shadingType < ShadingType::TextGrayscale || shadingType > ShadingType::TextClearType)
{
continue;
}

for (const auto& c : _cursorRects)
{
_appendQuad() = {
.shadingType = ShadingType::SolidFill,
.position = c.position,
.size = c.size,
.color = c.color,
};
}
const int instanceL = it.position.x;
const int instanceT = it.position.y;
const int instanceR = instanceL + it.size.x;
const int instanceB = instanceT + it.size.y;

if (color == 0xffffffff)
{
_markStateChange(_blendState.get());
if (instanceL < cursorR && cursorL < instanceR && instanceT < cursorB && cursorT < instanceB)
{
// The _instances vector is _huge_ (easily up to 100k items) whereas only 1-2 items will actually overlap
// with the cursor. --> Make this loop more compact by putting as much as possible into a function call.
_drawCursorInvertSlowPath(c, it);
}
}
}
}

void BackendD3D::_drawCursorInvertSlowPath(const CursorRect& c, const QuadInstance& it)
{
const int cursorL = c.position.x;
const int cursorT = c.position.y;
const int cursorR = cursorL + c.size.x;
const int cursorB = cursorT + c.size.y;

const int instanceL = it.position.x;
const int instanceT = it.position.y;
const int instanceR = instanceL + it.size.x;
const int instanceB = instanceT + it.size.y;

const auto l = std::max<int>(cursorL, instanceL);
const auto t = std::max<int>(cursorT, instanceT);
const auto w = std::min<int>(cursorR, instanceR) - l;
const auto h = std::min<int>(cursorB, instanceB) - t;
const auto u = it.texcoord.x + l - instanceL;
const auto v = it.texcoord.y + t - instanceT;

_appendQuad() = {
it.shadingType,
{ static_cast<i16>(l), static_cast<i16>(t) },
{ static_cast<u16>(w), static_cast<u16>(h) },
{ static_cast<u16>(u), static_cast<u16>(v) },
it.color ^ 0x00c0c0c0,
};
}

void BackendD3D::_drawSelection(const RenderingPayload& p)
{
u16 y = 0;
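Reader's illustration (not part of the commit): `_drawCursorInvert` and `_drawCursorInvertSlowPath` above replace the subtractive blend state with a geometric approach. For every text quad that overlaps a cursor rectangle, an extra quad is appended that covers only the overlap, samples the matching part of the glyph texture, and uses the glyph color XORed with `0x00c0c0c0`. The sketch below isolates the clipping arithmetic. `Rect`, `Quad`, and `clipInvertedQuad` are hypothetical simplified types, not the `CursorRect` and `QuadInstance` structs from BackendD3D.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical simplified types for illustration only.
struct Rect
{
    int x, y, w, h;
};

struct Quad
{
    int x, y, w, h; // position and size on the render target, in pixels
    int u, v;       // top-left corner of the glyph in the atlas texture
    uint32_t color;
};

// If `glyph` overlaps `cursor`, writes the clipped, color-inverted quad to
// `out` and returns true. Otherwise returns false and leaves `out` untouched.
bool clipInvertedQuad(const Rect& cursor, const Quad& glyph, Quad& out)
{
    const int cursorL = cursor.x, cursorT = cursor.y;
    const int cursorR = cursorL + cursor.w, cursorB = cursorT + cursor.h;
    const int glyphL = glyph.x, glyphT = glyph.y;
    const int glyphR = glyphL + glyph.w, glyphB = glyphT + glyph.h;

    // Half-open rectangles overlap only if they overlap on both axes.
    if (!(glyphL < cursorR && cursorL < glyphR && glyphT < cursorB && cursorT < glyphB))
    {
        return false;
    }

    const int l = std::max(cursorL, glyphL);
    const int t = std::max(cursorT, glyphT);
    out.x = l;
    out.y = t;
    out.w = std::min(cursorR, glyphR) - l;
    out.h = std::min(cursorB, glyphB) - t;
    // Shift the texture coordinates by the amount clipped off the left/top.
    out.u = glyph.u + l - glyphL;
    out.v = glyph.v + t - glyphT;
    // Flip the top two bits of each RGB channel and leave alpha alone.
    out.color = glyph.color ^ 0x00c0c0c0u;
    return true;
}
```

Because the cursor background is drawn with `bg ^ 0xc0c0c0` and the overlapping glyph pixels with `it.color ^ 0x00c0c0c0`, both sides of the cursor shift by the same mask, which appears to be the point of changing the constant from `0x3f3f3f` in this commit.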
31 changes: 10 additions & 21 deletions src/renderer/atlas/BackendD3D.h
@@ -172,6 +172,13 @@ namespace Microsoft::Console::Render::Atlas
};

private:
struct CursorRect
{
i16x2 position;
u16x2 size;
u32 color;
};

ATLAS_ATTR_COLD void _handleSettingsUpdate(const RenderingPayload& p);
void _updateFontDependents(const RenderingPayload& p);
void _d2dRenderTargetUpdateFontSettings(const RenderingPayload& p) const noexcept;
@@ -187,7 +194,6 @@ namespace Microsoft::Console::Render::Atlas
void _d2dEndDrawing();
ATLAS_ATTR_COLD void _resetGlyphAtlas(const RenderingPayload& p);
ATLAS_ATTR_COLD void _resizeGlyphAtlas(const RenderingPayload& p, u16 u, u16 v);
void _markStateChange(ID3D11BlendState* blendState);
QuadInstance& _getLastQuad() noexcept;
QuadInstance& _appendQuad();
ATLAS_ATTR_COLD void _bumpInstancesSize();
@@ -202,8 +208,9 @@ namespace Microsoft::Console::Render::Atlas
void _drawGlyphPrepareRetry(const RenderingPayload& p);
void _splitDoubleHeightGlyph(const RenderingPayload& p, const AtlasFontFaceEntryInner& fontFaceEntry, AtlasGlyphEntry& glyphEntry);
void _drawGridlineRow(const RenderingPayload& p, const ShapedRow* row, u16 y);
void _drawCursorPart1(const RenderingPayload& p);
void _drawCursorPart2(const RenderingPayload& p);
void _drawCursorBackground(const RenderingPayload& p);
ATLAS_ATTR_COLD void _drawCursorInvert();
ATLAS_ATTR_COLD void _drawCursorInvertSlowPath(const CursorRect& c, const QuadInstance& it);
void _drawSelection(const RenderingPayload& p);
void _executeCustomShader(RenderingPayload& p);

@@ -212,7 +219,6 @@ namespace Microsoft::Console::Render::Atlas
wil::com_ptr<ID3D11VertexShader> _vertexShader;
wil::com_ptr<ID3D11PixelShader> _pixelShader;
wil::com_ptr<ID3D11BlendState> _blendState;
wil::com_ptr<ID3D11BlendState> _blendStateInvert;
wil::com_ptr<ID3D11Buffer> _vsConstantBuffer;
wil::com_ptr<ID3D11Buffer> _psConstantBuffer;
wil::com_ptr<ID3D11Buffer> _vertexBuffer;
@@ -222,17 +228,6 @@ namespace Microsoft::Console::Render::Atlas
Buffer<QuadInstance, 32> _instances;
size_t _instancesCount = 0;

// This allows us to batch inverted cursors into the same
// _instanceBuffer upload as the rest of all other instances.
struct StateChange
{
ID3D11BlendState* blendState;
size_t offset;
};
// 3 allows for 1 state change to _blendStateInvert, followed by 1 change back to _blendState,
// and finally 1 entry to signal the past-the-end size, as used by _flushQuads.
til::small_vector<StateChange, 3> _instancesStateChanges;

wil::com_ptr<ID3D11RenderTargetView> _customRenderTargetView;
wil::com_ptr<ID3D11Texture2D> _customOffscreenTexture;
wil::com_ptr<ID3D11ShaderResourceView> _customOffscreenTextureView;
@@ -276,12 +271,6 @@ namespace Microsoft::Console::Render::Atlas

// An empty-box cursor spanning a wide glyph that has different
// background colors on each side results in 6 lines being drawn.
struct CursorRect
{
i16x2 position;
u16x2 size;
u32 color;
};
til::small_vector<CursorRect, 6> _cursorRects;

bool _requiresContinuousRedraw = false;
10 changes: 10 additions & 0 deletions src/renderer/atlas/common.h
@@ -108,6 +108,16 @@ namespace Microsoft::Console::Render::Atlas

ATLAS_POD_OPS(range)

constexpr bool empty() const noexcept
{
return start >= end;
}

constexpr bool non_empty() const noexcept
{
return start < end;
}

constexpr bool contains(T v) const noexcept
{
return v >= start && v < end;
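
Reader's illustration (not part of the commit): the new `empty()` and `non_empty()` helpers are not exactly equivalent to the `start != end` check they replace in AtlasEngine.cpp. They also treat an inverted range (`start > end`) as empty. The reduced copy below shows the difference. `Range` is a hypothetical cut-down version of the `range<T>` struct in common.h, without `ATLAS_POD_OPS`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reduced copy of range<T>; half-open interval [start, end).
template<typename T>
struct Range
{
    T start{};
    T end{};

    constexpr bool empty() const noexcept
    {
        return start >= end;
    }

    constexpr bool non_empty() const noexcept
    {
        return start < end;
    }

    constexpr bool contains(T v) const noexcept
    {
        return v >= start && v < end;
    }
};
```

For `Range<uint16_t>{5, 2}`, `start != end` is true, yet `non_empty()` is false. In the engine the invalidated rows are clamped so that `end >= start`, so the two checks should agree there, but the helper is the more defensive of the two.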