COOKED_READ (cmd.exe) doesn't properly support emoji input #1503

Closed
Tracked by #190
peter-bertok opened this issue Jun 24, 2019 · 4 comments · Fixed by #15783
Labels
Area-CookedRead The cmd.exe COOKED_READ handling Area-Input Related to input processing (key presses, mouse, etc.) In-PR This issue has a related PR Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Needs-Tag-Fix Doesn't match tag requirements Priority-1 A description (P1) Product-Conhost For issues in the Console codebase

Comments

@peter-bertok

peter-bertok commented Jun 24, 2019

Environment

Windows build number: 1903
Windows Terminal version (if applicable): 0.2.1715.0

Steps to reproduce

Paste text containing complex Unicode characters such as emoji into a PowerShell tab as a string literal.
Emoji will be displayed as "??" placeholders, but then display correctly when the literal is "output" by pressing enter.

Expected behavior

Unicode characters such as Emoji should be consistently displayed, including in string literals, input text, command-line arguments, etc...

Actual behavior

Inconsistent display:

Screenshot

@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Jun 24, 2019
@DHowett-MSFT
Contributor

This one is fascinating. I spy two bugs here. One is with the emoji input (this could just be PSReadline's fault), and the other is that some emoji are still too tiny!

@DHowett-MSFT DHowett-MSFT added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Bug It either shouldn't be doing this or needs an investigation. Product-Terminal The new Windows Terminal. labels Jun 24, 2019
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Jun 24, 2019
@miniksa miniksa added Area-Input Related to input processing (key presses, mouse, etc.) Product-Conhost For issues in the Console codebase and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Jun 27, 2019
@miniksa miniksa changed the title Bug Report Emoji input displays replacement character � Jun 27, 2019
@Shorotshishir

Dumping either of the two files into the terminal doesn't render any data properly. These two files contain all the emoji that Windows provides.

CMD:
use more or type to dump content into the terminal.
shows garbage text, but doesn't crash

PowerShell
use get-content to dump the data into the terminal
nothing shows, hangs the terminal app and crashes

WSL
use cat to dump the content into the terminal
nothing shows, hangs the terminal app and crashes

N.B. --> Crash doesn't terminate terminal app

@davidhewitt

Pasting emoji input is also an issue with cmd. In the screenshot below, I paste a string of smileys in, but they come out as invalid glyphs.

Hitting enter and up displays the correct input string though, so it is making it into the console buffer correctly.

image

@zadjii-msft zadjii-msft added the Priority-2 A description (P2) label Jan 22, 2020
@DHowett-MSFT DHowett-MSFT changed the title Emoji input displays replacement character � COOKED READ: Emoji input displays replacement character � Feb 10, 2020
@zadjii-msft zadjii-msft self-assigned this Mar 26, 2020
@zadjii-msft
Member

zadjii-msft commented Mar 27, 2020

For me later:

Emoji.txt

😁😁😂🤣😃😄😅😆😉😊😋😎😍😘🥰😗😙😚☺🙂🤗🤩🤔🤨😐😑😶🙄😏😣😥😮🤐😯😪😫🥱😴😌😛😜😝🤤😒😓😔😕🙃🤑😲☹🙁😖😞😟😤😢😭😦😧😨😩🤯😬😰😱🥵🥶😳🤪😵🥴😠😡🤬😷🤒🤕🤢🤮🤧😇🥳🥺🤠🤡🤥🤫🤭🧐🤓😈👿👹👺💀☠👻👽👾🤖💩😺😸😹😻😼😽🙀😿😾🐱‍👤🐱‍🏍🐱‍💻🐱‍🐉🐱‍👓🐱‍🚀🙈🙉🙊🐵🐶🐺🐱🦁🐯🦒🦊🦝🐮🐷🐗🐭🐹🐰🐻🐨🐼🐸🦓🐴🦄🐔🐲🐽🐾🐒🦍🦧🦮🐕‍🦺🐩🐕🐈🐅🐆🐎🦌🦏🦛🐂🐃🐄🐖🐏🐑🐐🐪🐫🦙🦘🦥🦨🦡🐘🐁🐀🦔🐇🐿🦎🐊🐢🐍🐉🦕🦖🦦🦈🐬🐳🐋🐟🐠🐡🦐🦑🐙🦞🦀🐚🦆🐓🦃🦅🕊🦢🦜🦩🦚🦉🐦🐧🐥🐤🐣🦇🦋🐌🐛🦟🦗🐜🐝🐞🦂🕷🕸🦠🧞‍♀️🧞‍♂️🗣👤👥👁👀🦴🦷👅👄🧠🦾🦿👣🤺⛷🤼‍♂️🤼‍♀️👯‍♂️👯‍♀️💑👩‍❤️‍👩👨‍❤️‍👨💏👩‍❤️‍💋‍👩👨‍❤️‍💋‍👨👪👨‍👩‍👦👨‍👩‍👧👨‍👩‍👧‍👦👨‍👩‍👦‍👦👨‍👩‍👧‍👧👨‍👨‍👦👨‍👨‍👧👨‍👨‍👧‍👦👨‍👨‍👦‍👦👨‍👨‍👧‍👧👩‍👩‍👦👩‍👩‍👧👩‍👩‍👧‍👦👩‍👩‍👦‍👦👩‍👩‍👧‍👧👩‍👦👩‍👧👩‍👧‍👦👩‍👦‍👦👩‍👧‍👧👨‍👦👨‍👧👨‍👧‍👦👨‍👦‍👦👨‍👧‍👧👭👩🏻‍🤝‍👩🏻👩🏼‍🤝‍👩🏻👩🏼‍🤝‍👩🏼👩🏽‍🤝‍👩🏻👩🏽‍🤝‍👩🏼👩🏽‍🤝‍👩🏽👩🏾‍🤝‍👩🏻👩🏾‍🤝‍👩🏼👩🏾‍🤝‍👩🏽👩🏾‍🤝‍👩🏾👩🏿‍🤝‍👩🏻👩🏿‍🤝‍👩🏼👩🏿‍🤝‍👩🏽👩🏿‍🤝‍👩🏾👩🏿‍🤝‍👩🏿👫👩🏻‍🤝‍🧑🏻👩🏻‍🤝‍🧑🏼👩🏻‍🤝‍🧑🏽👩🏻‍🤝‍🧑🏾👩🏻‍🤝‍🧑🏿👩🏼‍🤝‍🧑🏻👩🏼‍🤝‍🧑🏼👩🏼‍🤝‍🧑🏽👩🏼‍🤝‍🧑🏾👩🏼‍🤝‍🧑🏿👩🏽‍🤝‍🧑🏻👩🏽‍🤝‍🧑🏼👩🏽‍🤝‍🧑🏽👩🏽‍🤝‍🧑🏾👩🏽‍🤝‍🧑🏿👩🏾‍🤝‍🧑🏻👩🏾‍🤝‍🧑🏼👩🏾‍🤝‍🧑🏽👩🏾‍🤝‍🧑🏾👩🏾‍🤝‍🧑🏿👩🏿‍🤝‍🧑🏻👩🏿‍🤝‍🧑🏼👩🏿‍🤝‍🧑🏽👩🏿‍🤝‍🧑🏾👩🏿‍🤝‍🧑🏿👬👨🏻‍🤝‍👨🏻👨🏼‍🤝‍👨🏻👨🏼‍🤝‍👨🏼👨🏽‍🤝‍👨🏻👨🏽‍🤝‍👨🏼👨🏽‍🤝‍👨🏽👨🏾‍🤝‍👨🏻👨🏾‍🤝‍👨🏼👨🏾‍🤝‍👨🏽👨🏾‍🤝‍👨🏾👨🏿‍🤝‍👨🏻👨🏿‍🤝‍👨🏼👨🏿‍🤝‍👨🏽👨🏿‍🤝‍👨🏾👨🏿‍🤝‍👨🏿👨🏿‍🤝‍👨
👩👨🧑👧👦🧒👶👵👴🧓👩‍🦰👨‍🦰👩‍🦱👨‍🦱👩‍🦲👨‍🦲👩‍🦳👨‍🦳👱‍♀️👱‍♂️👸🤴👳‍♀️👳‍♂️👲🧔👼🤶🎅👮‍♀️👮‍♂️🕵️‍♀️🕵️‍♂️💂‍♀️💂‍♂️👷‍♀️👷‍♂️👩‍⚕️👨‍⚕️👩‍🎓👨‍🎓👩‍🏫👨‍🏫👩‍⚖️👨‍⚖️👩‍🌾👨‍🌾👩‍🍳👨‍🍳👩‍🔧👨‍🔧👩‍🏭👨‍🏭👩‍💼👨‍💼👩‍🔬👨‍🔬👩‍💻👨‍💻👩‍🎤👨‍🎤👩‍🎨👨‍🎨👩‍✈️👨‍✈️👩‍🚀👨‍🚀👩‍🚒👨‍🚒🧕👰🤵🤱🤰🦸‍♀️🦸‍♂️🦹‍♀️🦹‍♂️🧙‍♀️🧙‍♂️🧚‍♀️🧚‍♂️🧛‍♀️🧛‍♂️🧜‍♀️🧜‍♂️🧝‍♀️🧝‍♂️🧟‍♀️🧟‍♂️🙍‍♀️🙍‍♂️🙎‍♀️🙎‍♂️🙅‍♀️🙅‍♂️🙆‍♀️🙆‍♂️🧏‍♀️🧏‍♂️💁‍♀️💁‍♂️🙋‍♀️🙋‍♂️🙇‍♀️🙇‍♂️🤦‍♀️🤦‍♂️🤷‍♀️🤷‍♂️💆‍♀️💆‍♂️💇‍♀️💇‍♂️🧖‍♀️🧖‍♂️🤹‍♀️🤹‍♂️👩‍🦽👨‍🦽👩‍🦼👨‍🦼👩‍🦯👨‍🦯🧎‍♀️🧎‍♂️🧍‍♀️🧍‍♂️🚶‍♀️🚶‍♂️🏃‍♀️🏃‍♂️💃🕺🧗‍♀️🧗‍♂️🧘‍♀️🧘‍♂️🛀🛌🕴🏇🏂🏌️‍♀️🏌️‍♂️🏄‍♀️🏄‍♂️🚣‍♀️🚣‍♂️🏊‍♀️🏊‍♂️🤽‍♀️🤽‍♂️🤾‍♀️🤾‍♂️⛹️‍♀️⛹️‍♂️🏋️‍♀️🏋️‍♂️🚴‍♀️🚴‍♂️🚵‍♀️🚵‍♂️🤸‍♀️🤸‍♂️🤳💪🦵🦶👂🦻👃🤏👈👉☝👆👇✌🤞🖖🤘🤙🖐✋👌👍👎✊👊🤛🤜🤚👋🤟✍👏👐🙌🤲🙏🤝💅
🎈🎆🎇🧨✨🎉🎊🎃🎄🎋🎍🎎🎏🎐🎑🧧🎀🎁🎗🎞🎟🎫🎠🎡🎢🎪🎭🖼🎨🧵🧶🛒👓🕶🦺🥽🥼🧥👔👕👖🩳🧣🧤🧦👗🥻👘👚🩲🩱👙👛👜👝🛍🎒👞👟🥾🥿👠👡👢🩰👑🧢⛑👒🎩🎓💋💄💍💎⚽⚾🥎🏀🏐🏈🏉🎱🎳🥌⛳⛸🎣🤿🎽🛶🎿🛷🥅🏒🥍🏏🏑🏓🏸🎾🥏🪁🎯🥊🥋🥇🥈🥉🏅🎖🏆🎮🕹🎰🎲🔮🧿🧩🧸🪀🎴🃏🀄♟♠♣♥♦🔈🔉🔊📢📣🔔🎼🎵🎶🎙🎤🎚🎛🎧📯🥁🎷🎺🎸🪕🎻🎹📻🔒🔓🔏🔐🔑🗝🪓🔨⛏⚒🛠🔧🔩🧱⚙🗜🛢⚗🧪🧫🧬🩺💉🩸🩹💊🔬🔭⚖📿🔗⛓🧰🧲🦯🛡🏹🗡⚔🔪💣🔫☎📞📟📠📱📲📳📴🚬⚰⚱🗿🔋🔌💻🖥🖨⌨🖱🖲💽💾💿📀🧮🎥🎬📽📡📺📷📸📹📼🔍🔎🕯🪔💡🔦🏮📔📕📖📗📘📙📚📓📒📃📜📄📑📰🗞🔖🏷💰💴💵💶💷💸💳🧾🏧✉📧📨📩📤📥📦📫📪📬📭📮🗳✏✒🖋🖊🖌🖍📝🗒💼📁📂🗂📅📆🗓📇📈📉📊📋📌📍📎🖇📏📐✂🗃🗄🗑⌛⏳⌚⏰⏱⏲🕰
🍕🍔🍟🌭🍿🧂🥓🥚🍳🧇🥞🧈🍞🥐🥨🥯🥖🧀🥗🥙🥪🌮🌯🥫🍖🍗🥩🍠🥟🥠🥡🍱🍘🍙🍚🍛🍜🦪🍣🍤🍥🥮🍢🧆🥘🍲🍝🥣🥧🍦🍧🍨🍩🍪🎂🍰🧁🍫🍬🍭🍡🍮🍯🍼🥛🧃☕🍵🧉🍶🍾🍷🍸🍹🍺🍻🥂🥃🧊🥤🥢🍽🍴🥄🏺🥝🥥🍇🍈🍉🍊🍋🍌🍍🥭🍎🍏🍐🍑🍒🍓🍅🍆🌽🌶🍄🥑🥒🥬🥦🥔🧄🧅🥕🌰🥜💐🌸🏵🌹🌺🌻🌼🌷🥀☘🌱🌲🌳🌴🌵🌾🌿🍀🍁🍂🍃
🚗🚓🚕🛺🚙🚌🚐🚎🚑🚒🚚🚛🚜🚘🚔🚖🚍🦽🦼🛹🚲🛴🛵🏍🏎🚄🚅🚈🚝🚞🚃🚋🚆🚉🚊🚇🚟🚠🚡🚂🛩🪂✈🛫🛬💺🚁🚀🛸🛰⛵🚤🛥⛴🛳🚢⚓🚏⛽🚨🚥🚦🚧🏁🏳‍🌈🏳🏴🏴‍☠️🚩🌌🪐🌍🌎🌏🗺🧭🏔⛰🌋🗻🛤🏕🏞🛣🏖🏜🏝🏟🏛🏗🏘🏙🏚🏠🏡⛪🕋🕌🛕🕍⛩🏢🏣🏤🏥🏦🏨🏩🏪🏫🏬🏭🏯🏰💒🗼🌉🗽🗾🎌⛲⛺🌁🌃🌄🌅🌆🌇♨💈🛎🧳🪑🚪🛏🛋🚽🧻🚿🛁🧼🧽🧴🪒🧷🧹🧺🧯☁⛅⛈🌤🌥🌦🌧🌨🌩🌪🌫🌝🌑🌒🌓🌔🌕🌖🌗🌘🌙🌚🌛🌜☀🌞⭐🌟🌠☄🌡🌬🌀🌈🌂☂☔⛱⚡❄☃⛄🔥💧🌊
❤🧡💛💚💙💜🤎🖤🤍💔❣💕💞💓💗💖💘💝💟💌💢💥💤💦💨💫🕳☮✝☪🕉☸✡🔯🕎☯☦🛐⛎♈♉♊♋♌♍♎♏♐♑♒♓🆔⚕♾⚛🈳🈹🈶🈚🈸🈺🈷✴🆚🉑💮🉐㊙㊗🈴🈵🈲🚼🅰🅱🆎🆑🅾🆘⛔🛑📛❌⭕🚫🔇🔕🚭🚷🚯🚳🚱🔞📵❗❕❓❔‼⁉💯🔅🔆🔱⚜〽☢☣⚠🚸🔰♻🈯💹❇✳❎✅💠🌐Ⓜ🈂➿🛂🛃🛄🛅♿🚾🅿🚰🚹🚺🚻🚮📶🈁🆖🆗🆙🆒🆕🆓#️⃣*️⃣0️⃣1️⃣2️⃣3️⃣4️⃣5️⃣6️⃣7️⃣8️⃣9️⃣🔟🔢▶⏸⏯⏹⏺⏭⏮⏩⏪🔀🔁🔂◀🔼⏫🔽⏬⏏🎦➡⬅⬆⬇↗↘↙↖↕↔🔄↪↩⤴⤵ℹ🔤🔡🔠🔣🔃🔛🔝🔜☑🔚🔙〰➰✔💲💱➕➖✖➗©®™🔘🔴🟠🟡🟢🔵🟣🟤⚫⚪🟥🟧🟨🟩🟦🟪🟫⬛⬜◼◻◾◽▪▫🔶🔸🔷🔹🔺🔻🔲🔳💭🗯💬🗨👁‍🗨🕐🕑🕒🕓🕔🕕🕖🕗🕘🕙🕚🕛🕜🕝🕞🕟🕠🕡🕢🕣🕤🕥🕦🕧

;-)¯_(ツ)/¯( ••)>⌐■-■(⌐■_■):-P:-((••)( ´・・)ノ(..)༼ つ ◕_◕ ༽つ(ˉ﹃ˉ)(╯°□°)╯︵ ┻━┻ಠ_ಠಥ_ಥ:-Dᓚᘏᗢ(┬┬﹏┬┬)^_^:-)(^///^)╰(*°▽°*)╯☆*: .。. o(≧▽≦)o .。.:*☆(*/ω\*)(●'◡'●)(❁´◡❁)(☞゚ヮ゚)☞☜(゚ヮ゚☜)(¬‿¬)(¬_¬ )(T_T)(⊙_⊙;)

😁 is "\uD83D\uDE01", or 0n55357, 0n56833
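For reference, the two values above follow from the standard UTF-16 encoding of U+1F601. A minimal sketch of that arithmetic, where `toSurrogatePair` is a hypothetical helper written for illustration and not something from the conhost codebase:

```cpp
#include <utility>

// Hypothetical helper (not part of conhost): split a supplementary-plane
// code point (U+10000..U+10FFFF) into its UTF-16 lead/trail surrogate pair.
inline std::pair<char16_t, char16_t> toSurrogatePair(char32_t codePoint)
{
    // Subtracting 0x10000 leaves a 20-bit value.
    const char32_t offset = codePoint - 0x10000;
    // The top 10 bits select the lead (high) surrogate in 0xD800..0xDBFF.
    const auto lead = static_cast<char16_t>(0xD800 + (offset >> 10));
    // The low 10 bits select the trail (low) surrogate in 0xDC00..0xDFFF.
    const auto trail = static_cast<char16_t>(0xDC00 + (offset & 0x3FF));
    return { lead, trail };
}
```

Calling `toSurrogatePair(0x1F601)` yields 0xD83D and 0xDE01, i.e. 0n55357 and 0n56833.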

OutputCellView OutputCellIterator::s_GenerateView(const std::wstring_view view,
                                                  const TextAttribute attr,
                                                  const TextAttributeBehavior behavior)
{
    const auto glyph = Utf16Parser::ParseNext(view);
    DbcsAttribute dbcsAttr;
    if (IsGlyphFullWidth(glyph))
    {
        dbcsAttr.SetLeading();
    }

    return OutputCellView(glyph, dbcsAttr, attr, behavior);
}

As the two wchar_ts get written to the buffer by WriteCharsLegacy, we create an OutputCellIterator to write each half of the emoji. Unfortunately, we write each half one wchar_t at a time. Utf16Parser::ParseNext doesn't like that. It knows the first wchar_t is a leading surrogate, but can also tell there's no trailing surrogate, so it just returns a replacement character.
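The failure mode can be reproduced in isolation. The following is a simplified stand-in, not the real `Utf16Parser::ParseNext` (whose exact behavior may differ); `parseNextSketch` and the helper names are assumptions made for illustration. It returns the next glyph of a UTF-16 view, or U+FFFD when the view starts with an unpaired surrogate:

```cpp
#include <string_view>

// Backing storage for the replacement character U+FFFD.
constexpr char16_t kReplacementText[] = u"\uFFFD";

inline bool isLeadSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
inline bool isTrailSurrogate(char16_t c) { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical sketch of a "parse next glyph" routine over UTF-16 text.
inline std::u16string_view parseNextSketch(std::u16string_view text)
{
    const std::u16string_view replacement{ kReplacementText, 1 };
    if (text.empty())
    {
        return text;
    }
    if (isLeadSurrogate(text[0]))
    {
        // A lead surrogate is only valid when a trail surrogate follows it.
        if (text.size() >= 2 && isTrailSurrogate(text[1]))
        {
            return text.substr(0, 2);
        }
        return replacement;
    }
    if (isTrailSurrogate(text[0]))
    {
        // A trail surrogate with no preceding lead is also unpaired.
        return replacement;
    }
    return text.substr(0, 1);
}
```

Given the whole pair, the sketch returns both code units as one glyph. Given either half on its own, which is what a writer feeding one wchar_t at a time does, it can only return the replacement character.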

The character does end up getting inserted into the cooked read data correctly, which is why hitting enter to submit the commandline in cmd works just fine. The data in the cooked read data is correct, but the text buffer has the wrong data.

Presumably, the cooked read is just writing the text buffer wrong. COOKED_READ_DATA::ProcessInput can only handle one wchar_t at a time.

When you use the emoji picker to input the character, it first comes through ConversionAreaInfo::WriteText straight to _screenBuffer->Write to draw the composition buffer. Then, once the dialog is dismissed, the keys get sent to the input buffer in ConsoleImeInfo::_InsertConvertedString, where again the cooked read gets them one char at a time to display broken in the buffer.


EDIT: March 30th 2020

I've investigated this a bit, and this is one of those terrible rabbit-hole issues. Even if we do add support for simply typing/pasting emoji to COOKED_READ, that opens up a whole other can of bugs. For one, COOKED_READ should probably also be enlightened to support backspacing an emoji. Also, what happens for applications that are expecting UCS-2 input, not UTF-16? It's an unfortunately complex issue that we'll have to resolve on the console side of things.
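To make the backspace problem concrete, here is a minimal sketch of surrogate-aware erasing on a plain `std::u16string`. The names `codeUnitsToErase` and `backspace` are hypothetical and this is not how COOKED_READ_DATA is structured. It also only handles surrogate pairs; ZWJ sequences, variation selectors and other multi-code-point clusters would need full grapheme segmentation:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: how many UTF-16 code units one backspace should
// remove from the end of `line` (2 for a well-formed surrogate pair, else 1).
inline std::size_t codeUnitsToErase(const std::u16string& line)
{
    const std::size_t n = line.size();
    if (n == 0)
    {
        return 0;
    }
    const bool endsWithTrail = line[n - 1] >= 0xDC00 && line[n - 1] <= 0xDFFF;
    const bool precededByLead = n >= 2 && line[n - 2] >= 0xD800 && line[n - 2] <= 0xDBFF;
    return (endsWithTrail && precededByLead) ? 2 : 1;
}

// Erase one code point from the end of the line; a no-op on an empty line.
inline void backspace(std::u16string& line)
{
    line.erase(line.size() - codeUnitsToErase(line));
}
```

With a line holding `a` followed by 😁, the first `backspace` removes both halves of the emoji and the second removes the `a`. Decrementing the buffer pointer by a single wchar_t would instead leave a dangling lead surrogate behind.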

This is now the "COOKED_READ (cmd.exe) doesn't properly support emoji input" issue, and I'm moving this to 21H1 as a "Feature", so we can try and prioritize for the next Windows release.

Code Snippet for future developers

I found that I could get cooked read to draw the emoji right every time by re-printing the buffer each time, but that wouldn't work for backspacing through emoji. Take a look at this segment of code from COOKED_READ_DATA::ProcessInput

    if (AtEol())
    {
        // If at end of line, processing is relatively simple. Just store the character and write it to the screen.
        if (wch == UNICODE_BACKSPACE2)
        {
            wch = UNICODE_BACKSPACE;
        }

        if (wch != UNICODE_BACKSPACE || _bufPtr != _backupLimit)
        {
            fStartFromDelim = IsWordDelim(_bufPtr[-1]);

            bool loop = true;
            while (loop)
            {
                loop = false;
                if (wch == UNICODE_BACKSPACE && _processedInput)
                {
                    _bufPtr -= 1;
                    // clang-format off
#pragma prefast(suppress: __WARNING_POTENTIAL_BUFFER_OVERFLOW_HIGH_PRIORITY, "This access is fine")
                    // clang-format on
                    *_bufPtr = (WCHAR)' ';
                    _currentPosition -= 1;

                    _screenInfo.GetTextBuffer().GetCursor().SetPosition(_originalCursorPosition);
                    status = WriteCharsLegacy(_screenInfo,
                                              _backupLimit,
                                              _backupLimit,
                                              _backupLimit,
                                              &_bytesRead,
                                              &NumSpaces,
                                              _originalCursorPosition.X,
                                              WC_DESTRUCTIVE_BACKSPACE | WC_KEEP_CURSOR_VISIBLE | WC_ECHO,
                                              &ScrollY);
                    _bytesRead -= sizeof(WCHAR);

                    // Repeat until it hits the word boundary
                    if (wchOrig == EXTKEY_ERASE_PREV_WORD &&
                        _bufPtr != _backupLimit &&
                        fStartFromDelim ^ !IsWordDelim(_bufPtr[-1]))
                    {
                        loop = true;
                    }
                }
                else
                {
                    *_bufPtr = wch;
                    _bytesRead += sizeof(WCHAR);
                    _bufPtr += 1;
                    _currentPosition += 1;
                }
                if (_echoInput)
                {
                    NumToWrite = sizeof(WCHAR);
                    _screenInfo.GetTextBuffer().GetCursor().SetPosition(_originalCursorPosition);

                    status = WriteCharsLegacy(_screenInfo,
                                              _backupLimit,
                                              _backupLimit,
                                              _backupLimit,
                                              &_bytesRead,
                                              &NumSpaces,
                                              _originalCursorPosition.X,
                                              WC_DESTRUCTIVE_BACKSPACE | WC_KEEP_CURSOR_VISIBLE | WC_ECHO,
                                              &ScrollY);
                    if (NT_SUCCESS(status))
                    {
                        _originalCursorPosition.Y += ScrollY;
                    }
                    else
                    {
                        RIPMSG1(RIP_WARNING, "WriteCharsLegacy failed %x", status);
                    }
                }

                // if (wch == UNICODE_BACKSPACE && _processedInput)
                // {
                //     _bytesRead -= sizeof(WCHAR);
                // }
                _visibleCharCount += NumSpaces;
            }
        }
    }
    else

@zadjii-msft zadjii-msft removed their assignment Mar 30, 2020
@zadjii-msft zadjii-msft added Priority-1 A description (P1) and removed Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Priority-2 A description (P2) Product-Terminal The new Windows Terminal. labels Mar 30, 2020
@zadjii-msft zadjii-msft removed this from the Terminal v1.0 milestone Mar 30, 2020
ghost pushed a commit that referenced this issue Mar 30, 2020
This PR updates our internal tool `conechokey` to use `ReadConsoleInputW` by default. It also adds a flag `-a` to force it to use `ReadConsoleInputA`.

I discovered this while digging around for #1503, but figured I'd get this checked in now while I'm still investigating.

Since this is just a helper tool, I spent as little effort as possible writing this change - yea the whole tool could benefit from cleaner code but _ain't nobody got time for that_.
@zadjii-msft zadjii-msft modified the milestones: Windows vNext, 22H2 Jan 4, 2022
@zadjii-msft zadjii-msft added the Area-CookedRead The cmd.exe COOKED_READ handling label Feb 23, 2022
@zadjii-msft zadjii-msft modified the milestones: 22H2, Backlog Dec 5, 2022
@microsoft-github-policy-service microsoft-github-policy-service bot added the In-PR This issue has a related PR label Aug 2, 2023
@zadjii-msft zadjii-msft modified the milestones: Backlog, Terminal v1.19 Aug 23, 2023
DHowett pushed a commit that referenced this issue Aug 25, 2023
This massive refactoring has two goals:
* Enable us to go beyond UCS-2 support for input editing
* Bring clarity into `COOKED_READ_DATA`'s inner workings

Unfortunately, over time, knowledge about its exact operation was lost.
While the new code is still complex it reduces the amount of code by 4x
which will make preserving knowledge hopefully significantly easier.

The new implementation is simpler and slower than the old one in a way,
because every time the input line is modified it's rewritten to the text
buffer from scratch. This however massively simplifies the underlying
algorithm and the amount of state that needs to be tracked and results
in a significant reduction in code size. It also makes it more robust,
because there's less code now that can be incorrect.

This "optimization laziness" can be afforded due to the recent >10x
improvements to `TextBuffer`'s text ingestion performance.
For short inputs (<1000 characters) I still expect this implementation
to outperform the conhost from the past.
It has received one optimization already however: While reading text
from the `InputBuffer` we'll now defer writing into the `TextBuffer`
until we've stopped reading. This improves the overhead of pasting text
from O(n^2) to O(n), which is immediately noticeable for inputs >100kB.
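A rough cost model of why deferring helps, under the assumption that every redraw rewrites the whole input line and costs one unit per code unit written. `eagerRedrawCost` and `deferredRedrawCost` are illustrative names, not functions from the codebase:

```cpp
#include <cstddef>

// Assumed model: the line is redrawn from scratch after each of the n pasted
// code units, so the k-th redraw writes k units (1 + 2 + ... + n in total).
inline std::size_t eagerRedrawCost(std::size_t n)
{
    std::size_t written = 0;
    for (std::size_t len = 1; len <= n; ++len)
    {
        written += len;
    }
    return written;
}

// Assumed model: all n code units are read first and the line is drawn once.
inline std::size_t deferredRedrawCost(std::size_t n)
{
    return n;
}
```

For a 1000-unit paste this model gives 500500 units written eagerly versus 1000 when deferred, which is the O(n^2) versus O(n) gap described above.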

Resizing the text buffer still ends up corrupting the input line
however, which unfortunately cannot be fixed in `COOKED_READ_DATA`.
The issue occurs due to bugs in `TextBuffer::Reflow` itself, as it
misplaces the cursor if the prompt is on the last line of the buffer.

Closes #1377
Closes #1503
Closes #4628
Closes #4975
Closes #5033
Closes #8008

This commit is required to fix #797

## Validation Steps Performed
* ASCII input ✅
* Chinese input (中文維基百科) ❔
  * Resizing the window properly wraps/unwraps wide glyphs ❌
    Broken due to `TextBuffer::Reflow` bugs
* Surrogate pair input (🙂) ❔
  * Resizing the window properly wraps/unwraps surrogate pairs ❌
    Broken due to `TextBuffer::Reflow` bugs
* In cmd.exe
  * Create 2 files: "a😊b.txt" and "a😟b.txt"
  * Press tab: Autocompletes "a😊b.txt" ✅
  * Navigate the cursor right past the "a"
  * Press tab twice: Autocompletes "a😟b.txt" ✅
* Backspace deletes preceding glyphs ✅
* Ctrl+Backspace deletes preceding words ✅
* Escape clears input ✅
* Home navigates to start ✅
* Ctrl+Home deletes text between cursor and start ✅
* End navigates to end ✅
* Ctrl+End deletes text between cursor and end ✅
* Left navigates over previous code points ✅
* Ctrl+Left navigates to previous word-starts ✅
* Right and F1 navigate over next code points ✅
  * Pressing right at the end of input copies characters
    from the previous command ✅
* Ctrl+Right navigates to next word-ends ✅
* Insert toggles overwrite mode ✅
* Delete deletes next code point ✅
* Up and F5 cycle through history ✅
  * Doesn't crash with no history ✅
  * Stops at first entry ✅
* Down cycles through history ✅
  * Doesn't crash with no history ✅
  * Stops at last entry ✅
* PageUp retrieves the oldest command ✅
* PageDown retrieves the newest command ✅
* F2 starts "copy to char" prompt ✅
  * Escape dismisses prompt ✅
  * Typing a character copies text from the previous command up
    until that character into the current buffer (acts identical
    to F3, but with automatic character search) ✅
* F3 copies the previous command into the current buffer,
  starting at the current cursor position,
  for as many characters as possible ✅
  * Doesn't erase trailing text if the current buffer
    is longer than the previous command ✅
  * Puts the cursor at the end of the copied text ✅
* F4 starts "copy from char" prompt ✅
  * Escape dismisses prompt ✅
  * Erases text between the current cursor position and the
    first instance of a given char (but not including it) ✅
* F6 inserts Ctrl+Z ✅
* F7 without modifiers starts "command list" prompt ✅
  * Escape dismisses prompt ✅
  * Minimum size of 40x10 characters ✅
  * Width expands to fit the widest history command ✅
  * Height expands up to 20 rows with longer histories ✅
  * F9 starts "command number" prompt ✅
  * Left/Right paste replace the buffer with the given command ✅
    * And put cursor at the end of the buffer ✅
  * Up/Down navigate selection through history ✅
    * Stops at start/end with <10 entries ✅
    * Stops at start/end with >20 entries ✅
    * Wide text rendering during pagination with >20 entries ✅
  * Shift+Up/Down moves history items around ✅
  * Home navigates to first entry ✅
  * End navigates to last entry ✅
  * PageUp navigates by 20 items at a time or to first ✅
  * PageDown navigates by 20 items at a time or to last ✅
* Alt+F7 clears command history ✅
* F8 cycles through commands that start with the same text as
  the current buffer up until the current cursor position ✅
  * Doesn't crash with no history ✅
* F9 starts "command number" prompt ✅
  * Escape dismisses prompt ✅
  * Ignores non-ASCII-decimal characters ✅
  * Allows entering between 1 and 5 digits ✅
  * Pressing Enter fetches the given command from the history ✅
* Alt+F10 clears doskey aliases ✅
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Tag-Fix Doesn't match tag requirements label Aug 25, 2023
8 participants