Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize C1 control chars in SetConsoleTitle API #10847

Merged
5 commits merged into from
Aug 2, 2021

Conversation

j4james
Copy link
Collaborator

@j4james j4james commented Aug 1, 2021

When the SetContoleTitle API is called with a title containing control
characters, we need to filter out those characters before we can forward
the title change over conpty as an escape sequence. If we don't do that,
the receiving terminal will end up executing the control characters
instead of updating the title. We were already filtering out the C0
control characters, but with this PR we're now filtering out C1 controls
characters as well.

I've simply updated the sanitizing routine in DoSrvSetConsoleTitleW to
filter our characters in the range 0x80 to 0x9F. This is in addition
to the C0 range (0x00 to 0x1F) that was already excluded.

Validation Steps Performed

I've added a conpty unit test that calls DoSrvSetConsoleTitleW with
titles containing a variety of C0 and C1 controls characters, and which
verifies that those characters are stripped from the title forwarded to
conpty.

I've also confirmed that the test case in issue #10312 is now working
correctly in Windows Terminal.

Closes #10312

@ghost ghost added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-VT Virtual Terminal sequence support Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-2 A description (P2) Product-Conpty For console issues specifically related to conpty labels Aug 1, 2021
@github-actions
Copy link

github-actions bot commented Aug 1, 2021

@check-spelling-bot Report

Unrecognized words, please review:

  • propogated
Previously acknowledged words that are now absent SPACEBAR Unregister
To accept these unrecognized words as correct (and remove the previously acknowledged and now absent words), run the following commands

... in a clone of the git@github.com:j4james/terminal.git repository
on the fix-title-control-chars branch:

update_files() {
perl -e '
my @expect_files=qw('".github/actions/spelling/expect/alphabet.txt
.github/actions/spelling/expect/expect.txt
.github/actions/spelling/expect/web.txt"');
@ARGV=@expect_files;
my @stale=qw('"$patch_remove"');
my $re=join "|", @stale;
my $suffix=".".time();
my $previous="";
sub maybe_unlink { unlink($_[0]) if $_[0]; }
while (<>) {
if ($ARGV ne $old_argv) { maybe_unlink($previous); $previous="$ARGV$suffix"; rename($ARGV, $previous); open(ARGV_OUT, ">$ARGV"); select(ARGV_OUT); $old_argv = $ARGV; }
next if /^(?:$re)(?:(?:\r|\n)*$| .*)/; print;
}; maybe_unlink($previous);'
perl -e '
my $new_expect_file=".github/actions/spelling/expect/4b45bb8df1f90e6fc463172cb861bf968e162e77.txt";
use File::Path qw(make_path);
use File::Basename qw(dirname);
make_path (dirname($new_expect_file));
open FILE, q{<}, $new_expect_file; chomp(my @words = <FILE>); close FILE;
my @add=qw('"$patch_add"');
my %items; @items{@words} = @words x (1); @items{@add} = @add x (1);
@words = sort {lc($a)."-".$a cmp lc($b)."-".$b} keys %items;
open FILE, q{>}, $new_expect_file; for my $word (@words) { print FILE "$word\n" if $word =~ /\w/; };
close FILE;
system("git", "add", $new_expect_file);
'
}

comment_json=$(mktemp)
curl -L -s -S \
  --header "Content-Type: application/json" \
  "https://api.github.com/repos/microsoft/terminal/issues/comments/890599901" > "$comment_json"
comment_body=$(mktemp)
jq -r .body < "$comment_json" > $comment_body
rm $comment_json

patch_remove=$(perl -ne 'next unless s{^</summary>(.*)</details>$}{$1}; print' < "$comment_body")
  

patch_add=$(perl -e '$/=undef;
$_=<>;
s{<details>.*}{}s;
s{^#.*}{};
s{\n##.*}{};
s{(?:^|\n)\s*\*}{}g;
s{\s+}{ }g;
print' < "$comment_body")
  
update_files
rm $comment_body
git add -u
✏️ Contributor please read this

By default the command suggestion will generate a file named based on your commit. That's generally ok as long as you add the file to your commit. Someone can reorganize it later.

⚠️ The command is written for posix shells. You can copy the contents of each perl command excluding the outer ' marks and dropping any '"/"' quotation mark pairs into a file and then run perl file.pl from the root of the repository to run the code. Alternatively, you can manually insert the items...

If the listed items are:

  • ... misspelled, then please correct them instead of using the command.
  • ... names, please add them to .github/actions/spelling/allow/names.txt.
  • ... APIs, you can add them to a file in .github/actions/spelling/allow/.
  • ... just things you're using, please add them to an appropriate file in .github/actions/spelling/expect/.
  • ... tokens you only need in one place and shouldn't generally be used, you can add an item in an appropriate file in .github/actions/spelling/patterns/.

See the README.md in each directory for more information.

🔬 You can test your commits without appending to a PR by creating a new branch with that extra change and pushing it to your fork. The check-spelling action will run in response to your push -- it doesn't require an open pull request. By using such a branch, you can limit the number of typos your peers see you make. 😉

🗜️ If you see a bunch of garbage

If it relates to a ...

well-formed pattern

See if there's a pattern that would match it.

If not, try writing one and adding it to a patterns/{file}.txt.

Patterns are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your lines.

Note that patterns can't match multiline strings.

binary-ish string

Please add a file path to the excludes.txt file instead of just accepting the garbage.

File paths are Perl 5 Regular Expressions - you can test yours before committing to verify it will match your files.

^ refers to the file's path from the root of the repository, so ^README\.md$ would exclude README.md (on whichever branch you're using).

Comment on lines 1936 to 1938
for (size_t i = 0; i < title.size(); i++)
{
if (title.at(i) >= UNICODE_SPACE)
const auto ch = title.at(i);
Copy link
Member

@lhecker lhecker Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you convert this to a for (const auto ch : title) loop?

On top of that I'd personally do it this way - it reduces binary size by half (but is unrelated to your PR):

std::wstring sanitized{title};
sanitized.erase(std::remove_if(sanitized.begin(), sanitized.end(), [](auto ch) -> bool {
    return ch < 0x20 || (ch > 0x7f && ch < 0xa0);
}), sanitized.end());
return sanitized;

push_back and emplace_back are very expensive once they're inlined.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with James keeping the code the way that it was (it's not his mess to clean up!), but if he wants to fix it all the more power to him. 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I like that approach. Not that thrilled with the code health bot's formatting, but at least it's more efficient now. Thanks @lhecker.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this range stuff gets a bit neater with C++20, but I'm assuming we're not using that yet?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this range stuff gets a bit neater with C++20, but I'm assuming we're not using that yet?

alas no, we can't use c++20 in the OS build system quite yet ☹️

Comment on lines 1936 to 1938
for (size_t i = 0; i < title.size(); i++)
{
if (title.at(i) >= UNICODE_SPACE)
const auto ch = title.at(i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with James keeping the code the way that it was (it's not his mess to clean up!), but if he wants to fix it all the more power to him. 😄

@zadjii-msft zadjii-msft added the AutoMerge Marked for automatic merge by the bot when requirements are met label Aug 2, 2021
@ghost
Copy link

ghost commented Aug 2, 2021

Hello @zadjii-msft!

Because this pull request has the AutoMerge label, I will be glad to assist with helping to merge this pull request once all check-in policies pass.

p.s. you can customize the way I help with merging this pull request, such as holding this pull request until a specific person approves. Simply @mention me (@msftbot) and give me an instruction to get started! Learn more here.

@ghost ghost merged commit 9ba2080 into microsoft:main Aug 2, 2021
DHowett pushed a commit that referenced this pull request Aug 25, 2021
When the `SetContoleTitle` API is called with a title containing control
characters, we need to filter out those characters before we can forward
the title change over conpty as an escape sequence. If we don't do that,
the receiving terminal will end up executing the control characters
instead of updating the title. We were already filtering out the C0
control characters, but with this PR we're now filtering out C1 controls
characters as well.

I've simply updated the sanitizing routine in `DoSrvSetConsoleTitleW` to
filter our characters in the range `0x80` to `0x9F`. This is in addition
to the C0 range (`0x00` to `0x1F`) that was already excluded. 

## Validation Steps Performed

I've added a conpty unit test that calls `DoSrvSetConsoleTitleW` with
titles containing a variety of C0 and C1 controls characters, and which
verifies that those characters are stripped from the title forwarded to
conpty.

I've also confirmed that the test case in issue #10312 is now working
correctly in Windows Terminal.

Closes #10312
@ghost
Copy link

ghost commented Aug 31, 2021

🎉Windows Terminal Preview v1.10.2383.0 has been released which incorporates this pull request.:tada:

Handy links:

@ghost
Copy link

ghost commented Aug 31, 2021

🎉Windows Terminal Preview v1.11.2421.0 has been released which incorporates this pull request.:tada:

Handy links:

@j4james j4james deleted the fix-title-control-chars branch September 26, 2021 15:58
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-VT Virtual Terminal sequence support AutoMerge Marked for automatic merge by the bot when requirements are met Issue-Bug It either shouldn't be doing this or needs an investigation. Priority-2 A description (P2) Product-Conpty For console issues specifically related to conpty
Projects
None yet
Development

Successfully merging this pull request may close these issues.

C1 control characters break SetConsoleTitleW API
4 participants