Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AtlasEngine: Fix bugs introduced in 66f4f9d and d74b66a #13496

Merged
3 commits merged into from
Jul 15, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
67 changes: 54 additions & 13 deletions src/renderer/atlas/AtlasEngine.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -360,6 +360,40 @@ try
_api.dirtyRect = til::rect{ 0, _api.invalidatedRows.x, _api.cellCount.x, _api.invalidatedRows.y };
}

// This is an important block of code for our TileHashMap.
// We only process glyphs within the dirtyRect, but glyphs outside of the
// dirtyRect are still in use and shouldn't be discarded. This is critical
// if someone uses a tool like tmux to split the terminal horizontally.
// If they then print a lot of Unicode text on just one side, we have to
// ensure that the (for example) plain ASCII glyphs on the other half of the
// viewport are still retained. This bit of code "refreshes" those glyphs and
// brings them to the front of the LRU queue to prevent them from being reused.
lhecker marked this conversation as resolved.
Show resolved Hide resolved
{
const std::array<til::point, 2> ranges{ {
{ 0, _api.dirtyRect.top },
{ _api.dirtyRect.bottom, _api.cellCount.y },
} };
const auto stride = static_cast<size_t>(_r.cellCount.x);

for (const auto& p : ranges)
{
// We (ab)use the .x/.y members of the til::point as the
// respective [from,to) range of rows we need to makeNewest().
Comment on lines +380 to +381
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hahahaha, ok that makes me feel much better about this

const auto from = p.x;
const auto to = p.y;
Comment on lines +382 to +383
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm just really confused about this. Why are we iterating from the x-coord to the y-coord? The inner loop makes sense because we're looping over each y value. Shouldn't this be looping over the x values then?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a brief comment explaining this peculiar code.


for (auto y = from; y < to; ++y)
{
auto it = _r.cellGlyphMapping.data() + stride * y;
const auto end = it + stride;
for (; it != end; ++it)
{
_r.glyphs.makeNewest(*it);
}
}
}
}

return S_OK;
}
catch (const wil::ResultException& exception)
Expand Down Expand Up @@ -883,10 +917,12 @@ void AtlasEngine::_recreateSizeDependentResources()
// This buffer is a bit larger than the others (multiple MB).
// Prevent a memory usage spike, by first deallocating and then allocating.
_r.cells = {};
_r.cellGlyphMapping = {};
// Our render loop heavily relies on memcpy() which is between 1.5x
// and 40x faster for allocations with an alignment of 32 or greater.
// (40x on AMD Zen1-3, which have a rep movsb performance issue. MSFT:33358259.)
_r.cells = Buffer<Cell, 32>{ totalCellCount };
_r.cellGlyphMapping = Buffer<TileHashMap::iterator>{ totalCellCount };
_r.cellCount = _api.cellCount;
_r.tileAllocator.setMaxArea(_api.sizeInPixel);

Expand Down Expand Up @@ -1062,6 +1098,13 @@ AtlasEngine::Cell* AtlasEngine::_getCell(u16 x, u16 y) noexcept
return _r.cells.data() + static_cast<size_t>(_r.cellCount.x) * y + x;
}

AtlasEngine::TileHashMap::iterator* AtlasEngine::_getCellGlyphMapping(u16 x, u16 y) noexcept
{
assert(x < _r.cellCount.x);
assert(y < _r.cellCount.y);
return _r.cellGlyphMapping.data() + static_cast<size_t>(_r.cellCount.x) * y + x;
}

void AtlasEngine::_setCellFlags(u16r coords, CellFlags mask, CellFlags bits) noexcept
{
assert(coords.left <= coords.right);
Expand Down Expand Up @@ -1399,9 +1442,9 @@ void AtlasEngine::_emplaceGlyph(IDWriteFontFace* fontFace, size_t bufferPos1, si
attributes.cellCount = cellCount;

AtlasKey key{ attributes, gsl::narrow<u16>(charCount), chars };
const AtlasValue* valueRef = _r.glyphs.find(key);
auto it = _r.glyphs.find(key);

if (!valueRef)
if (it == _r.glyphs.end())
{
// Do fonts exist *in practice* which contain both colored and uncolored glyphs? I'm pretty sure...
// However doing it properly means using either of:
Expand Down Expand Up @@ -1440,26 +1483,24 @@ void AtlasEngine::_emplaceGlyph(IDWriteFontFace* fontFace, size_t bufferPos1, si
coords[i] = _r.tileAllocator.allocate(_r.glyphs);
}

const auto it = _r.glyphs.insert(std::move(key), std::move(value));
valueRef = &it->second;
it = _r.glyphs.insert(std::move(key), std::move(value));
_r.glyphQueue.emplace_back(&it->first, &it->second);
}

// For some reason MSVC doesn't understand that valueRef is overwritten in the branch above, resulting in:
// C26430: Symbol 'valueRef' is not tested for nullness on all paths (f.23).
__assume(valueRef != nullptr);

const auto valueData = valueRef->data();
const auto valueData = it->second.data();
const auto coords = &valueData->coords[0];
const auto data = _getCell(x1, _api.lastPaintBufferLineCoord.y);
const auto cells = _getCell(x1, _api.lastPaintBufferLineCoord.y);
const auto cellGlyphMappings = _getCellGlyphMapping(x1, _api.lastPaintBufferLineCoord.y);

for (u32 i = 0; i < cellCount; ++i)
{
data[i].tileIndex = coords[i];
cells[i].tileIndex = coords[i];
// We should apply the column color and flags from each column (instead
// of copying them from the x1) so that ligatures can appear in multiple
// colors with different line styles.
data[i].flags = valueData->flags | _api.bufferLineMetadata[static_cast<size_t>(x1) + i].flags;
data[i].color = _api.bufferLineMetadata[static_cast<size_t>(x1) + i].colors;
cells[i].flags = valueData->flags | _api.bufferLineMetadata[static_cast<size_t>(x1) + i].flags;
cells[i].color = _api.bufferLineMetadata[static_cast<size_t>(x1) + i].colors;
}

std::fill_n(cellGlyphMappings, cellCount, it);
}
51 changes: 41 additions & 10 deletions src/renderer/atlas/AtlasEngine.h
Original file line number Diff line number Diff line change
Expand Up @@ -166,6 +166,12 @@ namespace Microsoft::Console::Render
};

private:
// I wrote `Buffer` instead of using `std::vector`, because I want to convey that these things
// explicitly _don't_ hold resizeable contents, but rather plain content of a fixed size.
// For instance I didn't want a resizeable vector with a `push_back` method for my fixed-size
// viewport arrays - that doesn't make sense after all. `Buffer` also doesn't initialize
// contents to zero, allowing rapid creation/destruction and you can easily specify a custom
// (over-)alignment which can improve rendering perf by up to ~20% over `std::vector`.
template<typename T, size_t Alignment = alignof(T)>
struct Buffer
{
Expand All @@ -175,19 +181,24 @@ namespace Microsoft::Console::Render
_data{ allocate(size) },
_size{ size }
{
std::uninitialized_default_construct_n(_data, size);
}

Buffer(const T* data, size_t size) :
_data{ allocate(size) },
_size{ size }
{
static_assert(std::is_trivially_copyable_v<T>);
memcpy(_data, data, size * sizeof(T));
// Changing the constructor arguments to accept std::span might
// be a good future extension, but not to improve security here.
// You can trivially construct std::span's from invalid ranges.
// Until then the raw-pointer style is more practical.
lhecker marked this conversation as resolved.
Show resolved Hide resolved
#pragma warning(suppress : 26459) // You called an STL function '...' with a raw pointer parameter at position '3' that may be unsafe [...].
std::uninitialized_copy_n(data, size, _data);
}

~Buffer()
{
deallocate(_data);
destroy();
}

Buffer(Buffer&& other) noexcept :
Expand All @@ -199,7 +210,7 @@ namespace Microsoft::Console::Render
#pragma warning(suppress : 26432) // If you define or delete any default operation in the type '...', define or delete them all (c.21).
Buffer& operator=(Buffer&& other) noexcept
{
deallocate(_data);
destroy();
_data = std::exchange(other._data, nullptr);
_size = std::exchange(other._size, 0);
return *this;
Expand Down Expand Up @@ -288,6 +299,12 @@ namespace Microsoft::Console::Render
}
#pragma warning(pop)

void destroy() noexcept
{
std::destroy_n(_data, _size);
deallocate(_data);
}
Comment on lines +302 to +306
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These 3 Buffer changes should've been here all along. The 3 corresponding STL functions already do exactly the optimal thing: nothing for trivial types (i.e. leave the memory uninitialized) and properly construct/destruct the types if they're proper classes. And in case of std::uninitialized_copy_n in particular it already uses memmove for trivial types internally. Perfect!
This change allows us to construct a Buffer<iterator> which is an array of non-trivial types.

BTW/FYI: I didn't use std::vector and wrote Buffer instead, because I want to convey that these things explicitly don't hold resizeable contents (in general). For instance I didn't want a push_back method on my viewport arrays - that wouldn't make sense so that always felt a bit wrong to me. It also doesn't initialize contents to zero, allowing rapid creation/destruction and you can specify a custom alignment (improves rendering perf by about 20% over std::vector IIRC).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you pull these classes out to their own files, explain that ^^ in a comment!


T* _data = nullptr;
size_t _size = 0;
};
Expand Down Expand Up @@ -553,21 +570,28 @@ namespace Microsoft::Console::Render

struct TileHashMap
{
using iterator = std::list<std::pair<AtlasKey, AtlasValue>>::iterator;

TileHashMap() noexcept = default;

AtlasValue* find(const AtlasKey& key)
iterator end() noexcept
{
return _lru.end();
}

iterator find(const AtlasKey& key)
lhecker marked this conversation as resolved.
Show resolved Hide resolved
{
const auto it = _map.find(key);
if (it != _map.end())
{
// Move the key to the head of the LRU queue.
_lru.splice(_lru.begin(), _lru, *it);
return &(*it)->second;
makeNewest(*it);
return *it;
}
return nullptr;
return end();
}

std::list<std::pair<AtlasKey, AtlasValue>>::iterator insert(AtlasKey&& key, AtlasValue&& value)
iterator insert(AtlasKey&& key, AtlasValue&& value)
{
// Insert the key/value right at the head of the LRU queue, just like find().
//
Expand All @@ -579,6 +603,11 @@ namespace Microsoft::Console::Render
return it;
}

void makeNewest(const iterator& it)
{
_lru.splice(_lru.begin(), _lru, it);
}

void popOldestTiles(std::vector<u16x2>& out) noexcept
{
Expects(!_lru.empty());
Expand All @@ -602,7 +631,7 @@ namespace Microsoft::Console::Render
// If you need a LRU hash-map, write a custom one with an intrusive
// prev/next linked list (it's easier than you might think!).
std::list<std::pair<AtlasKey, AtlasValue>> _lru;
std::unordered_set<std::list<std::pair<AtlasKey, AtlasValue>>::iterator, AtlasKeyHasher, AtlasKeyEq> _map;
std::unordered_set<iterator, AtlasKeyHasher, AtlasKeyEq> _map;
};

// TileAllocator yields `tileSize`-sized tiles for our texture atlas.
Expand Down Expand Up @@ -844,6 +873,7 @@ namespace Microsoft::Console::Render
IDWriteTextFormat* _getTextFormat(bool bold, bool italic) const noexcept;
const Buffer<DWRITE_FONT_AXIS_VALUE>& _getTextFormatAxis(bool bold, bool italic) const noexcept;
Cell* _getCell(u16 x, u16 y) noexcept;
TileHashMap::iterator* _getCellGlyphMapping(u16 x, u16 y) noexcept;
void _setCellFlags(u16r coords, CellFlags mask, CellFlags bits) noexcept;
void _flushBufferLine();
void _emplaceGlyph(IDWriteFontFace* fontFace, size_t bufferPos1, size_t bufferPos2);
Expand Down Expand Up @@ -911,6 +941,7 @@ namespace Microsoft::Console::Render
wil::com_ptr<IDWriteTypography> typography;

Buffer<Cell, 32> cells; // invalidated by ApiInvalidations::Size
Buffer<TileHashMap::iterator> cellGlyphMapping; // invalidated by ApiInvalidations::Size
f32x2 cellSizeDIP; // invalidated by ApiInvalidations::Font, caches _api.cellSize but in DIP
u16x2 cellSize; // invalidated by ApiInvalidations::Font, caches _api.cellSize
u16x2 cellCount; // invalidated by ApiInvalidations::Font|Size, caches _api.cellCount
Expand Down
19 changes: 14 additions & 5 deletions src/renderer/atlas/AtlasEngine.r.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -293,11 +293,16 @@ void AtlasEngine::_drawCursor()
// At 150% scale lineWidth thus needs to be 1.33333... because at a zoom scale of 1.5 this results in a 2px wide line.
const auto lineWidth = std::max(1.0f, static_cast<float>((_r.dpi + USER_DEFAULT_SCREEN_DPI / 2) / USER_DEFAULT_SCREEN_DPI * USER_DEFAULT_SCREEN_DPI) / static_cast<float>(_r.dpi));
const auto cursorType = static_cast<CursorType>(_r.cursorOptions.cursorType);
D2D1_RECT_F rect;
rect.left = 0.0f;
rect.top = 0.0f;
rect.right = _r.cellSizeDIP.x;
rect.bottom = _r.cellSizeDIP.y;

// `clip` is the rectangle within our texture atlas that's reserved for our cursor texture, ...
D2D1_RECT_F clip;
clip.left = 0.0f;
clip.top = 0.0f;
clip.right = _r.cellSizeDIP.x;
clip.bottom = _r.cellSizeDIP.y;

// ... whereas `rect` is just the visible (= usually white) portion of our cursor.
auto rect = clip;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need clip? I mean, I see that we call PushAxisAlignedClip() down below, but it would be nice to add a comment here explaining the purpose of clip and what it fixes, exactly. More just so that anybody that comes along understands this code better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if the comment I just added is sufficient. 🙂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it. Thanks!


switch (cursorType)
{
Expand Down Expand Up @@ -328,6 +333,9 @@ void AtlasEngine::_drawCursor()
}

_r.d2dRenderTarget->BeginDraw();
// We need to clip the area we draw in to ensure we don't
// accidentally draw into any neighboring texture atlas tiles.
_r.d2dRenderTarget->PushAxisAlignedClip(&clip, D2D1_ANTIALIAS_MODE_ALIASED);
lhecker marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually understand why we need this clip! We are only filling or stroking within 0,0 .. cellSizeDIP already; were we measurably "accidentally" drawing into the neighboring tiles? When? How?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit concerned that D2D might draw lines/rectangles outside of the coordinates I specify, be it due to off-by-fractional-1 bugs or due to AA. I think using a clip rect is more robust and doesn't hurt performance since we only seldomly draw a cursor.

_r.d2dRenderTarget->Clear();

if (cursorType == CursorType::EmptyBox)
Expand All @@ -346,5 +354,6 @@ void AtlasEngine::_drawCursor()
_r.d2dRenderTarget->FillRectangle(&rect, _r.brush.get());
}

_r.d2dRenderTarget->PopAxisAlignedClip();
THROW_IF_FAILED(_r.d2dRenderTarget->EndDraw());
}