Support for downloadable "soft fonts" (DRCS) #9164

Closed
j4james opened this issue Feb 14, 2021 · 16 comments · Fixed by #10011
Labels
Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-VT Virtual Terminal sequence support Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Product-Conhost For issues in the Console codebase Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.

Comments

@j4james
Collaborator

j4james commented Feb 14, 2021

Description of the new feature/enhancement

Starting with the VT220 terminal, it was possible for apps to define their own "soft fonts", also known as dynamically redefinable character sets (DRCS). You would download the font to the terminal with a DECDLD escape sequence, and assign it a character set ID that could then be designated via the usual SCS escape sequences.
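For readers who haven't met DECDLD before, here is a minimal Python sketch of what such a download might look like. It assumes the parameter order from the DEC manuals (Pfn; Pcn; Pe; Pcmw; Pss; Pt; Pcmh; Pcss), sixel-style glyph data with `/` separating six-pixel bands, and the designator ` @` as the character set ID. The helper names and the parameter values are illustrative choices, not anything taken from this issue.

```python
ESC = "\x1b"
DCS = ESC + "P"   # 7-bit form of the Device Control String introducer
ST = ESC + "\\"   # String Terminator


def encode_glyph(rows):
    """Encode one glyph (equal-length strings of '#' and '.') as sixel columns."""
    width = len(rows[0])
    bands = []
    for top in range(0, len(rows), 6):
        band = ""
        for x in range(width):
            value = 0
            for bit, row in enumerate(rows[top:top + 6]):
                if row[x] == "#":
                    value |= 1 << bit  # bit 0 is the topmost pixel of the band
            band += chr(0x3F + value)
        bands.append(band)
    return "/".join(bands)


def build_decdld(glyphs, width, height, dscs=" @", first_char=1):
    """Build a DECDLD sequence. The parameter values are assumptions."""
    # Pfn; Pcn; Pe; Pcmw; Pss; Pt; Pcmh; Pcss
    params = [0, first_char, 0, width, 0, 0, height, 0]
    header = ";".join(str(p) for p in params)
    body = ";".join(encode_glyph(g) for g in glyphs)
    return f"{DCS}{header}{{{dscs}{body}{ST}"


def designate_g0(dscs=" @"):
    """SCS sequence that designates the downloaded set as G0."""
    return f"{ESC}({dscs}"
```

Writing `build_decdld(...)` followed by `designate_g0()` to a terminal that supports soft fonts should make subsequent printable characters use the downloaded glyphs. Which matrix sizes are accepted varies by terminal model.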

Some example use cases:

Custom fonts

image
More examples at https://vt100.net/dec/vt320/fonts

Simple monochromatic images

image

CMatrix with Japanese characters

Source: https://github.com/jhamby/cmatrix
image

Game sprites

image

Proposed technical implementation details (optional)

The screenshots above were taken from a POC I've been working on for conhost. In the current implementation, when you've designated a DRCS character set, those characters get mapped to values in the Unicode PUA area. Then when the renderer encounters values in that range, it uses a BitBlt to render the glyph in place of the usual PolyTextOut calls.

The quality isn't fantastic - I'm just using StretchBlt to resize the provided glyphs to match the current font size. And the performance can seem a bit sluggish on apps like CMatrix when writing a lot of content to the screen. Nevertheless, I think it works reasonably well, and I'm sure someone smarter than me will have suggestions for how it could be improved.

I'm still a long way from producing a PR, but I wanted to raise the issue now to see how much interest there was in the idea before spending too much time on it. Initially it would be conhost-only - I'm not sure about the conpty feasibility without #1173.

@j4james j4james added the Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. label Feb 14, 2021
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Feb 14, 2021
@DHowett
Member

DHowett commented Feb 15, 2021

This is extremely cool. I'd love to see it, and I'm sure @miniksa would agree. It does open up a minor can of worms though...

  • If we have support for custom bitmap fonts, folks may revive their (sometimes quite angry) requests for supporting old FON files or whatever weird old bitmap format they wanted
  • I'd love to keep renderer feature divergence between conhost and terminal to a minimum since eventually we're gonna use the DX renderer in conhost; does that make this significantly more difficult as to not be feasible?
  • ConPTY feasibility, as you mentioned -- though I think that it might work if it's treated as a graphic rendition for a line/segment of a line and we have a way to pass off DECDLD¹

None of these concerns is significant enough to stop me from wanting it. Michael may feel differently?

¹ we may experience a 1-frame or (-1)-frame "tear" if the font changes and the sequence is passed through directly (using the quasi-broken conpty "pass through unknown sequences" feature, and without the renderer's consent/knowledge), right? but that's pretty minor.

@j4james
Collaborator Author

j4james commented Feb 15, 2021

  • I'd love to keep renderer feature divergence between conhost and terminal to a minimum since eventually we're gonna use the DX renderer in conhost; does that make this significantly more difficult as to not be feasible?

I'd very much like that too. And in theory it shouldn't be a problem to add support in the DX renderer, although it might be a bit tricky to test without the conpty side of things. I could probably hack something though.

If you're also thinking about the double-width/height escape sequences, that's more of a political problem. I did initially implement a basic DX renderer for that, but it became clear that it would never work correctly with the way we're handling RTL in Windows Terminal, and I didn't want to start another argument on that subject.

  • ConPTY feasibility, as you mentioned -- though I think that it might work if it's treated as a graphic rendition for a line/segment of a line and we have a way to pass off DECDLD¹

I wouldn't say it definitely can't be done, but it's more complicated than you might think, because fonts can be updated after they've been rendered. To correctly handle that behaviour over conpty would probably require tracking character set data associated with each character. Maybe not impossible, but it seems pointlessly complicated for something that should be trivial with a pass-through version of conpty.

@j4james
Collaborator Author

j4james commented Feb 15, 2021

It just occurred to me that it might be possible to get this working over conpty without that much extra effort if we limited our support to a single font buffer (which was standard for most of the earlier terminals). That way we probably only have to remember the last charset ID, and it should be safe to use that ID for forwarding all soft characters, regardless of the charset they were originally generated with.

I may be missing something, though. I've been through this several times before where I thought I had a plan that would work, then later realized there was a complication I had overlooked. Either way, I'd still prefer to leave this for a follow-up PR. Just getting it working in conhost is going to be complicated enough as it is.

@DHowett DHowett added this to the Console Backlog milestone Feb 15, 2021
@DHowett DHowett added Product-Conhost For issues in the Console codebase Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-VT Virtual Terminal sequence support labels Feb 15, 2021
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Feb 15, 2021
@zadjii-msft zadjii-msft removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Feb 16, 2021
@miniksa
Member

miniksa commented Feb 16, 2021

  • ConPTY feasibility, as you mentioned -- though I think that it might work if it's treated as a graphic rendition for a line/segment of a line and we have a way to pass off DECDLD¹

Oh no. What if the conhost can handle all of these fonts and it just emits any of the complexity out the PTY side as sixels and we make Terminal understand sixels? :P

This is extremely cool. I'd love to see it, and I'm sure @miniksa would agree. It does open up a minor can of worms though...

None of these concerns is significant enough to stop me from wanting it. Michael may feel differently?

I do agree though that this is exciting and I'd love to see it. I'm not concerned about any of the issues up front to wholesale stop us from trying to do it.

The quality isn't fantastic - I'm just using StretchBlt to resize the provided glyphs to match the current font size. And the performance can seem a bit sluggish on apps like CMatrix when writing a lot of content to the screen. Nevertheless, I think it works reasonably well, and I'm sure someone smarter than me will have suggestions for how it could be improved.

If you're having trouble with the performance in GDI... I can look at it under the performance analyzer once there's a prototype and see if there's anything to be done. But generally speaking, BitBlt and StretchBlt ought to be hardware accelerated and pretty quick per https://docs.microsoft.com/en-us/windows-hardware/drivers/display/specifying-gdi-hardware-accelerated-rendering-operations. It might be faster to pre-stretch them all to the right size on another in-memory canvas and blt them over from there instead of making it do the stretch calculation every time.
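The pre-stretching idea can be sketched independently of GDI. The Python below is only an illustration under assumptions: glyphs are stored as strings of `#` and `.`, the resize is a plain nearest-neighbour one, and `GlyphCache` is a hypothetical name. The point is that scaling happens once per cell size rather than on every draw.

```python
def scale_glyph(rows, new_width, new_height):
    """Nearest-neighbour resize of a glyph stored as strings of '#' and '.'."""
    old_height, old_width = len(rows), len(rows[0])
    return [
        "".join(
            rows[y * old_height // new_height][x * old_width // new_width]
            for x in range(new_width)
        )
        for y in range(new_height)
    ]


class GlyphCache:
    """Hypothetical cache: scale every glyph once per cell size, then reuse."""

    def __init__(self, glyphs):
        self._glyphs = glyphs
        self._cell_size = None
        self._scaled = []
        self.rescale_count = 0  # exposed only to show when work is done

    def get(self, index, cell_width, cell_height):
        # Rescale the whole set only when the target cell size changes.
        if self._cell_size != (cell_width, cell_height):
            self._scaled = [
                scale_glyph(g, cell_width, cell_height) for g in self._glyphs
            ]
            self._cell_size = (cell_width, cell_height)
            self.rescale_count += 1
        return self._scaled[index]
```

In a real renderer the cached result would be a bitmap ready to blit, and the cache would also need invalidating whenever a new font is downloaded.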

I'd very much like that too. And in theory it shouldn't be a problem to add support in the DX renderer, although it might be a bit tricky to test without the conpty side of things. I could probably hack something though.

If you're also thinking about the double-width/height escape sequences, that's more of a political problem. I did initially implement a basic DX renderer for that, but it became clear that it would never work correctly with the way we're handling RTL in Windows Terminal, and I didn't want to start another argument on that subject.

And as for the DX side...I'm sure we can figure something out. I almost have the vaguest idea of writing all the glyphs into an in-memory font that exposes the interfaces and applying that... or a sprite map sort of thing with blits... hmmmm.

I'm not that worried about the RTL either. If we have to refine RTL further, then we do.

The screenshots above were taken from a POC I've been working on for conhost. In the current implementation, when you've designated a DRCS character set, those characters get mapped to values in the Unicode PUA area.

I'm a bit worried about using the PUA and having someone have another legitimate use of it that gets in the way. But maybe it's OK to say you can't do both at the same time? I think the only way of avoiding it otherwise would be to attach a specific font override to pieces of the text buffer. Or some sort of flag. A bit ick, but possible.

  • If we have support for custom bitmap fonts, folks may revive their (sometimes quite angry) requests for supporting old FON files or whatever weird old bitmap format they wanted

NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN. Not excited about that potential. I want FON files to go die. But it might make it easier to do if there's some sort of bltting thing already established for this.

@j4james
Collaborator Author

j4james commented Feb 16, 2021

It might be faster to pre-stretch them all to the right size on another in-memory canvas and blt them over from there instead of making it do the stretch calculation every time.

Yeah, I am doing that now, and I think it was an improvement, but it still feels sluggish. My current plan is to see if I can get it working with a run-time raster font, and see if that's any faster. That way I think I can render a whole line of characters with a single GDI call instead of having to do a separate BitBlt for each character - I'm hoping that might make a bit of a difference.

I'm a bit worried about using the PUA and having someone have another legitimate use of it that gets in the way.

This concerned me too. But if we only support a single font buffer, then it's only 96 characters, so I figured we should be able to find a safe gap somewhere that we can reserve. That said, I don't think adding a flag in the TextAttribute class would be such a bad idea either. It might actually make the renderer a little simpler too.

@miniksa
Member

miniksa commented Feb 16, 2021

It might be faster to pre-stretch them all to the right size on another in-memory canvas and blt them over from there instead of making it do the stretch calculation every time.

Yeah, I am doing that now, and I think it was an improvement, but it still feels sluggish. My current plan is to see if I can get it working with a run-time raster font, and see if that's any faster. That way I think I can render a whole line of characters with a single GDI call instead of having to do a separate BitBlt for each character - I'm hoping that might make a bit of a difference.

Yeah that's sort of the idea I was going for with what I was describing with DirectX... maybe making a fake in-memory font for it and letting those optimizations happen in the text drawing code.

The problem with BitBlt is honestly probably that you're doing a syscall for every single operation. I don't think there's a way around that beyond the font trickery or... composing your own complete bitmap in a chunk of memory in usermode somehow and transferring that whole thing to GDI for rendering at the end. That's really why I was using PolyTextOut in the GDI renderer... calling TextOut in a loop was DEATH in just the syscalls when I did a perf trace on it.

This is where things like DirectWrite will have the blting advantage by composing up a ton of instructions in a queue in user mode and then dispatching them in a chunk. Also it'll run some of it on the GPU probably versus GDI doing virtually all of the work on the CPU. Though I still bet if you can fit it into a font-shaped thing, it'd be faster for both DX and GDI. Or there is technically a DX/DWrite/GDI compatibility shim thing to allow interoperability.... but that may be too crazy.

@miniksa
Member

miniksa commented Feb 16, 2021

https://devblogs.microsoft.com/oldnewthing/20170331-00/?p=95875 and https://docs.microsoft.com/en-us/windows/win32/api/wingdi/nf-wingdi-createdibsection?

Maybe make a memory section with the bits, fill it up, share it with GDI, and push it all over at once? Woof.

@j4james
Collaborator Author

j4james commented Feb 18, 2021

Maybe make a memory section with the bits, fill it up, share it with GDI, and push it all over at once? Woof.

Now that you mention that, I did actually try using SetDIBitsToDevice, thinking it might help if I could use it to blit out a whole sequence of characters in one call, but that also didn't seem to work very well with the CMatrix test. I've since realised that that's probably because of the way CMatrix is updating the screen - we get a lot of individual characters being drawn at different locations rather than nice neat sequences that can be chunked together. But maybe there's still hope for a font-based approach using PolyText.

Also I need to slow down a bit and do things properly. So far I've just been hacking things together to get a general idea of what works and what doesn't, but it's quite possible my hacky code is part of the problem. I'm not ruling anything out until I've had a chance to try and implement things properly.

@j4james
Collaborator Author

j4james commented Apr 6, 2021

FYI, I've got a PR ready to submit for this, but it's branched off of PR #9307, so I'm waiting for that to merge first. This is still conhost only, and I've no immediate plans to work on the DX/conpty side of things, so there's no urgency to prioritize this from a Windows Terminal point of view. Just letting you know it's available if you want it.

@miniksa
Member

miniksa commented Apr 14, 2021

FYI, I've got a PR ready to submit for this, but it's branched off of PR #9307, so I'm waiting for that to merge first. This is still conhost only, and I've no immediate plans to work on the DX/conpty side of things, so there's no urgency to prioritize this from a Windows Terminal point of view. Just letting you know it's available if you want it.

Yes I want it even if it's conhost only. At least there's a reference impl for me to work from to port it to DX eventually. Sorry I've been so out of it on reviewing these.

@dotnetCarpenter
Contributor

dotnetCarpenter commented Apr 15, 2021

Just a quick question to set expectations for this feature.

@DHowett mentions that PCF fonts will never be supported but that DRCS fonts can be used in place of PCF. I googled conversions from PCF to DRCS but came up empty handed (it seems that DRCS font by itself does not yield many results).

But how about .fnt files? Will those be supported or is there a conversion tool available to generate DRCS fonts from various fonts? I guess what I'm aiming at, is how would I go about using this feature without being an artist?

Screen-shot of the rendered matrix.fnt
Screen-shot rendering of matrix.fnt with RECOIL for Windows

fnt.zip

@j4james
Collaborator Author

j4james commented Apr 15, 2021

In theory, converting a raster .fnt file to DRCS shouldn't be too difficult, but there are a couple of limitations you need to be aware of.

  1. DRCS fonts are limited to 96 characters, while a .fnt file can have up to 256 I think. In theory you could have two DRCS fonts, one mapping characters 32 to 127, and another mapping 160 to 255, which gets you closer to a full .fnt file, but my DRCS implementation doesn't support multiple font buffers yet, so you can't display both sets at the same time. Maybe in a future update.

  2. Raster fonts are designed to work at a specific resolution, but in the terminal we need to render the glyphs at the same size as your current terminal font. That will often require some stretching or shrinking, so the quality won't be perfect. But if you choose your terminal font and font size appropriately, you can potentially work around those issues.
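The split described in the first point is simple to sketch. The Python below assumes the .fnt glyphs have already been decoded into a 256-entry list indexed by character code; the decoding itself is not shown, and the function name is hypothetical.

```python
def split_into_drcs_sets(glyphs):
    """Split a 256-entry glyph table into two 96-glyph DRCS-sized sets."""
    if len(glyphs) != 256:
        raise ValueError("expected one glyph per character code 0..255")
    low = glyphs[32:128]    # character codes 32..127
    high = glyphs[160:256]  # character codes 160..255
    return low, high
```

Codes 0 to 31 and 128 to 159 have no slot in either set, so any glyphs a font defines there would be lost in the conversion.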

But if you just want CMatrix with the cool characters, there's already a fork that uses DRCS to achieve that. See the link and screenshot above.

ghost pushed a commit that referenced this issue Apr 30, 2021
This PR introduces a mechanism via which DCS data strings can be passed
through directly to the dispatch method that will be handling them, so
the data can be processed as it is received, rather than being buffered
in the state machine. This also simplifies the way string termination is
handled, so it now more closely matches the behaviour of the original
DEC terminals.

* Initial support for DCS sequences was introduced in PR #6328.
* Handling of DCS (and other) C1 controls was added in PR #7340.
* This is a prerequisite for Sixel (#448) and Soft Font (#9164) support.

The way this now works, a `DCS` sequence is dispatched as soon as the
final character of the `VTID` is received. Based on that ID, the
`OutputStateMachineEngine` should forward the call to the corresponding
dispatch method, and it's the responsibility of that method to return an
appropriate handler function for the sequence.

From then on, the `StateMachine` will pass on all of the remaining bytes
in the data string to the handler function. When a data string is
terminated (with `CAN`, `SUB`, or `ESC`), the `StateMachine` will pass
on one final `ESC` character to let the handler know that the sequence
is finished. The handler can also end a sequence prematurely by
returning false, and then all remaining data bytes will be ignored.
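The flow described above can be modelled in a few lines of Python. This is a simplified sketch, not the actual C++ `StateMachine` code: `make_collector` stands in for a dispatch method that returns a handler, `feed_data_string` stands in for the state machine, and both names are hypothetical. It also assumes that the final `ESC` is not delivered once a handler has returned false.

```python
ESC = "\x1b"
CAN = "\x18"
SUB = "\x1a"
TERMINATORS = {CAN, SUB, ESC}


def make_collector(sink, limit=None):
    """Hypothetical dispatch method: returns the handler for the data string."""
    def handler(ch):
        if ch == ESC:
            sink.append("<end>")  # the final ESC marks the end of the string
            return True
        sink.append(ch)
        # Returning False asks the caller to ignore the rest of the string.
        return limit is None or len(sink) < limit
    return handler


def feed_data_string(handler, data):
    """Pass data to the handler; return how many characters were consumed."""
    active = True
    for index, ch in enumerate(data):
        if ch in TERMINATORS:
            if active:
                handler(ESC)  # CAN, SUB, and ESC all end the string
            return index + 1
        if active and not handler(ch):
            active = False  # handler ended early: drop the remaining data
    return len(data)
```

Because each character reaches the handler as soon as it arrives, a consumer such as a font or image parser can act on partial data without the state machine buffering the whole string.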

Note that once a `DCS` sequence has been dispatched, it's not possible
to abort the data string. Both `CAN` and `SUB` are considered valid
forms of termination, and an `ESC` doesn't necessarily have to be
followed by a `\` for the string terminator. This is because the data
string is typically processed as it's received. For example, when
outputting a Sixel image, you wouldn't erase the parts that had already
been displayed if the data string is terminated early.

With this new way of handling the string termination, I was also able to
simplify some of the `StateMachine` processing, and get rid of a few
states that are no longer necessary. These changes don't apply to the
`OSC` sequences, though, since we're more likely to want to match the
XTerm behavior for those cases (which requires a valid `ST` control for
the sequence to be accepted).

## Validation Steps Performed

For the unit tests, I've had to make a few changes to some of the
`OutputEngineTests` to account for the updated `StateMachine`
processing. I've also added a new `StateMachineTest` to confirm that the
data strings are correctly passed through to the string handler under
all forms of termination.

To test whether the framework is actually usable, I've been working on
DRCS Soft Font support branched off of this PR, and haven't encountered
any problems. To test the throughput speed, I also hacked together a
basic Sixel parser, and that seemed to perform reasonably well.

Closes #7316
@ghost ghost added the In-PR This issue has a related PR label May 1, 2021
@ghost ghost closed this as completed in #10011 Aug 6, 2021
@ghost ghost added Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. and removed In-PR This issue has a related PR labels Aug 6, 2021
ghost pushed a commit that referenced this issue Aug 6, 2021
This PR adds conhost support for downloadable soft fonts - also known as
dynamically redefinable character sets (DRCS) - using the `DECDLD`
escape sequence.

These fonts are typically designed to work on a specific terminal model,
and each model tends to have a different character cell size. So in
order to support as many models as possible, the code attempts to detect
the original target size of the font, and then scale the glyphs to fit
our current cell size.

Once a font has been downloaded to the terminal, it can be designated in
the same way you would a standard character set, using an `SCS` escape
sequence. The identification string for the set is defined by the
`DECDLD` sequence. Internally we map the characters in this set to code
points `U+EF20` to `U+EF7F` in the Unicode private use area (PUA).
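As a rough illustration of that mapping, the Python sketch below assumes a fixed offset of `0xEF00`, inferred only from the range endpoints given above (0x20 to 0x7F landing on `U+EF20` to `U+EF7F`). The function names are hypothetical and the real code may be organised differently.

```python
DRCS_PUA_BASE = 0xEF00  # assumed offset, inferred from the stated range


def to_soft_font_char(ch):
    """Map a character in the range 0x20..0x7F to its PUA stand-in."""
    code = ord(ch)
    if not 0x20 <= code <= 0x7F:
        raise ValueError("soft font sets cover only 0x20..0x7F")
    return chr(DRCS_PUA_BASE + code)


def is_soft_font_char(ch):
    """True if the character falls in the range reserved for soft fonts."""
    return 0xEF20 <= ord(ch) <= 0xEF7F
```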

Then in the renderer, any characters in that range are split off into
separate runs, which get painted with a special font. The font itself is
dynamically generated as an in-memory resource, constructed from the
downloaded character bitmaps which have been scaled to the appropriate
size.
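Splitting a line into runs can be sketched as follows. This is a minimal Python illustration using `itertools.groupby`, with the range check taken from the code points mentioned earlier; the renderer's actual run-splitting logic is not shown in this thread, so treat the function names as hypothetical.

```python
from itertools import groupby


def is_soft_font_char(ch):
    """True for code points in the range reserved for soft font glyphs."""
    return 0xEF20 <= ord(ch) <= 0xEF7F


def split_runs(line):
    """Split a line into (is_soft_font, text) runs of consecutive characters."""
    return [
        (is_soft, "".join(chars))
        for is_soft, chars in groupby(line, key=is_soft_font_char)
    ]
```

Each run can then be painted with a single text call using either the regular font or the generated soft font, which avoids a separate draw call per character.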

If no soft fonts are in use, then no mapping of the PUA code points will
take place, so this shouldn't interfere with anyone using those code
points for something else, as long as they aren't also trying to use
soft fonts. I also tried to pick a PUA range that hadn't already been
snatched up by Nerd Fonts, but if we do receive reports of a conflict,
it's easy enough to change.

## Validation Steps Performed

I added an adapter test that runs through a bunch of parameter
variations for the `DECDLD` sequence, to make sure we're correctly
detecting the font sizes for most of the known DEC terminal models.

I've also tested manually on a wide range of existing fonts, of varying
dimensions, and from multiple sources, and made sure they all worked
reasonably well.

Closes #9164
@dotnetCarpenter
Contributor

@j4james I assume you mean https://github.com/jhamby/cmatrix and you're absolutely right!

I removed my apt version of cmatrix and installed jhamby/cmatrix from source.

However, using Terminal version 1.9.1942.0 with Ubuntu 21.04, the output of cmatrix -3 looks nothing like expected. With cmatrix -lba, I'm back to #9830, which is closed as a duplicate of this issue...

How did you run your tests for #10011?

PS. Sorry for the very late reply

@zadjii-msft
Member

@dotnetCarpenter FWIW #10011 hasn't shipped in any terminal release yet - 1.11 will be the first that has it

@DHowett
Member

DHowett commented Aug 26, 2021

However, Terminal doesn't support DRCS yet. Only the Windows Console host shipped inside Terminal does!

@dotnetCarpenter
Contributor

@DHowett cmatrix for Windows then? ;-D

This issue was closed.