Skip to content

Commit

Permalink
Refactor TextBuffer::GenHTML/RTF to read the buffer directly (#16377)
Browse files Browse the repository at this point in the history
`TextBuffer::GenHTML` and `TextBuffer::GenRTF` now read directly from
the TextBuffer.

- Since we're reading from the buffer, we can now read _all_ the
attributes saved in the buffer. Formatted copy now copies most (if not
all) font/color attributes in the requested format (RTF/HTML).
- Use `TextBuffer::CopyRequest` to pass all copy-related options into
text generation functions as one unit.
- Helper function `TextBuffer::CopyRequest::FromConfig()` generates a
copy request based on Selection mode and user configuration.
- Both formatted text generation functions now use `std::string` and
`fmt::format_to` to generate the required strings. Previously, we were
using `std::ostringstream` which is not recommended due to its potential
overhead.
- Reading attributes from `ROW`'s attribute RLE simplified the logic as
we don't have to track attribute change between the text.
- On the caller side, we do not have to rebuild the plain text string
from the vector of strings anymore. `TextBuffer::GetPlainText()` returns
the entire text as one `std::string`.
- Removed `TextBuffer::TextAndColors`.
- Removed `TextBuffer::GetText()`. `TextBuffer::GetPlainText()` took its
place.

This PR also fixes two bugs in the formatted copy:

- We were applying line breaks after each selected row, even though the
row could have been a Wrapped row. This caused the wrapped rows to break
when they shouldn't.
- We mishandled Unicode text (\uN) within the RTF copy. Every next
character that uses a surrogate pair or high codepoint was missing in
the copied text when pasted to MSWord. The command `\uc4` should have
been `\uc1`, which is used to tell how many fallback characters are used
for each Unicode codepoint (\u). We always use one `?` character as the
fallback.

Closes #16191

**References and Relevant Issues**

- #16270

**Validation Steps Performed**

- Casual copy-pasting from Terminal or OpenConsole to word editors works
as before.
- Verified HTML copy by copying the generated HTML string and running it
through an HTML viewer.
[Sample](https://codepen.io/tusharvickey/pen/wvNXbVN)
- Verified RTF copy by copy-pasting the generated RTF string into
MSWord.
- SingleLine mode works (<kbd>Shift</kbd>+ copy)
- BlockSelection mode works (<kbd>Alt</kbd> selection)
  • Loading branch information
tusharsnx committed Jan 29, 2024
1 parent 5d2fa47 commit a3ac337
Show file tree
Hide file tree
Showing 23 changed files with 763 additions and 635 deletions.
7 changes: 6 additions & 1 deletion .github/actions/spelling/expect/expect.txt
Original file line number Diff line number Diff line change
Expand Up @@ -1113,8 +1113,8 @@ msix
msrc
MSVCRTD
MTSM
munged
munges
Munged
murmurhash
muxes
myapplet
Expand Down Expand Up @@ -1870,7 +1870,12 @@ uiautomationcore
uielem
UIELEMENTENABLEDONLY
UINTs
ul
ulcch
uld
uldb
uldash
ulwave
Unadvise
unattend
UNCPRIORITY
Expand Down
61 changes: 41 additions & 20 deletions src/buffer/out/Row.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -404,6 +404,18 @@ til::CoordType ROW::AdjustToGlyphStart(til::CoordType column) const noexcept
return _adjustBackward(_clampedColumn(column));
}

// Returns the (exclusive) ending column of the glyph at the given column.
// In other words, if you have 3 wide glyphs
// AA BB CC
// 01 23 45 <-- column
// Examples:
// - `AdjustToGlyphEnd(4)` returns 6.
// - `AdjustToGlyphEnd(3)` returns 4.
til::CoordType ROW::AdjustToGlyphEnd(til::CoordType column) const noexcept
{
return _adjustForward(_clampedColumnInclusive(column));
}

// Routine Description:
// - clears char data in column in row
// Arguments:
Expand Down Expand Up @@ -939,6 +951,32 @@ uint16_t ROW::size() const noexcept
return _columnCount;
}

// Routine Description:
// - Retrieves the column that is one after the last non-space character in the row.
til::CoordType ROW::GetLastNonSpaceColumn() const noexcept
{
const auto text = GetText();
const auto beg = text.begin();
const auto end = text.end();
auto it = end;

for (; it != beg; --it)
{
// it[-1] is safe as `it` is always greater than `beg` (loop invariant).
if (til::at(it, -1) != L' ')
{
break;
}
}

// We're supposed to return the measurement in cells and not characters
// and therefore simply calculating `it - beg` would be wrong.
//
// An example: The row is 10 cells wide and `it` points to the second character.
// `it - beg` would return 1, but it's possible it's actually 1 wide glyph and 8 whitespace.
return gsl::narrow_cast<til::CoordType>(GetReadableColumnCount() - (end - it));
}

til::CoordType ROW::MeasureLeft() const noexcept
{
const auto text = GetText();
Expand All @@ -957,6 +995,8 @@ til::CoordType ROW::MeasureLeft() const noexcept
return gsl::narrow_cast<til::CoordType>(it - beg);
}

// Routine Description:
// - Retrieves the column that is one after the last valid character in the row.
til::CoordType ROW::MeasureRight() const noexcept
{
if (_wrapForced)
Expand All @@ -969,26 +1009,7 @@ til::CoordType ROW::MeasureRight() const noexcept
return width;
}

const auto text = GetText();
const auto beg = text.begin();
const auto end = text.end();
auto it = end;

for (; it != beg; --it)
{
// it[-1] is safe as `it` is always greater than `beg` (loop invariant).
if (til::at(it, -1) != L' ')
{
break;
}
}

// We're supposed to return the measurement in cells and not characters
// and therefore simply calculating `it - beg` would be wrong.
//
// An example: The row is 10 cells wide and `it` points to the second character.
// `it - beg` would return 1, but it's possible it's actually 1 wide glyph and 8 whitespace.
return gsl::narrow_cast<til::CoordType>(_columnCount - (end - it));
return GetLastNonSpaceColumn();
}

bool ROW::ContainsText() const noexcept
Expand Down
2 changes: 2 additions & 0 deletions src/buffer/out/Row.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -137,6 +137,7 @@ class ROW final
til::CoordType NavigateToPrevious(til::CoordType column) const noexcept;
til::CoordType NavigateToNext(til::CoordType column) const noexcept;
til::CoordType AdjustToGlyphStart(til::CoordType column) const noexcept;
til::CoordType AdjustToGlyphEnd(til::CoordType column) const noexcept;

void ClearCell(til::CoordType column);
OutputCellIterator WriteCells(OutputCellIterator it, til::CoordType columnBegin, std::optional<bool> wrap = std::nullopt, std::optional<til::CoordType> limitRight = std::nullopt);
Expand All @@ -151,6 +152,7 @@ class ROW final
TextAttribute GetAttrByColumn(til::CoordType column) const;
std::vector<uint16_t> GetHyperlinks() const;
uint16_t size() const noexcept;
til::CoordType GetLastNonSpaceColumn() const noexcept;
til::CoordType MeasureLeft() const noexcept;
til::CoordType MeasureRight() const noexcept;
bool ContainsText() const noexcept;
Expand Down
Loading

0 comments on commit a3ac337

Please sign in to comment.