Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replaced the usage of ReadConsoleInputW and fgetwc #2477

Closed
wants to merge 1 commit into from

Conversation

tokenizer-decode
Copy link

@tokenizer-decode tokenizer-decode commented Aug 1, 2023

The previous implementation of getchar32() relied on low-level console functions, which caused issues when running the code in subprocesses with redirected stdin. The ReadConsoleInputW function is designed to read from the console input buffer and may not function correctly with input redirection. Additionally, using fgetwc in a Windows environment could lead to potential issues in certain scenarios.

I encountered unexpected results when redirecting stdin, even without passing any argument. To replicate the issue, try the following command in your C# code:

            ProcessStartInfo startInfo = new()
            {
                FileName = ".\\main.exe",
                Arguments = "-m .\\7b-q4.bin -n 256 --repeat_penalty 1.0 --interactive-first -r \"User:\" -f .\\chat-with-bob.txt",
                RedirectStandardOutput = true,
                RedirectStandardInput = true,
                UseShellExecute = false,
                CreateNoWindow = true
            };

To address this problem and enable people to stream the program directly to a UI without worrying about the C++ part, I made the following changes:

  • Replaced ReadConsoleInputW with std::getwchar(), a standard C++ input function that reads wide characters from std::wcin. This change allows getchar32() to handle both console and redirected stdin scenarios consistently.

…input functions to make getchar32() work consistently in all environments, including cases when stdin is redirected.

The previous implementation of getchar32() relied on low-level console functions, which caused issues when running the code in subprocesses with redirected stdin. The ReadConsoleInputW function is designed to read from the console input buffer and may not function correctly with input redirection. Additionally, using fgetwc in a Windows environment could lead to potential issues in certain scenarios.

I encountered unexpected results when redirecting stdin, even without passing any argument. To replicate the issue, try the following command in your C# code:

```
            ProcessStartInfo startInfo = new()
            {
                FileName = ".\\main.exe",
                Arguments = "-m .\\7b-q4.bin -n 256 --repeat_penalty 1.0 --interactive-first -r \"User:\" -f .\\chat-with-bob.txt",
                RedirectStandardOutput = true,
                RedirectStandardInput = true,
                UseShellExecute = false,
                CreateNoWindow = true
            };
```

To address this problem and enable people to stream the program directly to a UI without worrying about the C++ part, I made the following changes:
- Replaced ReadConsoleInputW with std::getwchar(), a standard C++ input function that reads wide characters from std::wcin. This change allows getchar32() to handle both console and redirected stdin scenarios consistently.

With these modifications, getchar32() now functions as intended in various environments and ensures that console interactions work correctly, even when stdin is redirected.
@tokenizer-decode tokenizer-decode changed the title Replaced the usage of ReadConsoleInputW and fgetwc with standard C++ … Replaced the usage of ReadConsoleInputW and fgetwc Aug 1, 2023
@DannyDaemonic
Copy link
Collaborator

Believe it or not, this is actually how the function was written earlier. We had issues both with mingw (#1462) and gcc for Windows when using getwchar. For some reason the function is entirely untested and full of bugs in those environments - I guess everyone just uses the Win32 API. It can either fail to read certain characters or worse, get stuck in an infinite loop.

I have a pending pull request (#1558) that introduces a --simple-io that fixes the subprocess redirect issue that you and others were getting. It's quite old now but there weren't enough people reviewing ui code at the time. It just needs to be brought up to date to merge, but it works in mingw, windows gcc, linux, and any other environment I could find to test on.

If you want to test the getwchar bug to see if it's there in your environment, you can do so with this code:

#include <windows.h>
#include <winnls.h>
#include <fcntl.h>
#include <wchar.h>
#include <stdio.h>
#include <io.h>

int main() {
    // Initialize for reading wchars and writing out UTF-8
    DWORD dwMode = 0;
    HANDLE hConsole = GetStdHandle(STD_OUTPUT_HANDLE);
    if (hConsole == INVALID_HANDLE_VALUE || !GetConsoleMode(hConsole, &dwMode)) {
        hConsole = GetStdHandle(STD_ERROR_HANDLE);
        if (hConsole != INVALID_HANDLE_VALUE && (!GetConsoleMode(hConsole, &dwMode))) {
            hConsole = NULL;
        }
    }
    if (hConsole) {
        SetConsoleMode(hConsole, dwMode | ENABLE_VIRTUAL_TERMINAL_PROCESSING);
        SetConsoleOutputCP(CP_UTF8);
    }
    HANDLE hConIn = GetStdHandle(STD_INPUT_HANDLE);
    if (hConIn != INVALID_HANDLE_VALUE && GetConsoleMode(hConIn, &dwMode)) {
        _setmode(_fileno(stdin), _O_WTEXT);
        dwMode &= ~(ENABLE_LINE_INPUT | ENABLE_ECHO_INPUT);
        SetConsoleMode(hConIn, dwMode);
    }

    // Echo input
    while (1) {
        // Read
        wchar_t wcs[2] = { getwchar(), L'\0' };
        if (wcs[0] == WEOF) break;
        if (wcs[0] >= 0xD800 && wcs[0] <= 0xDBFF) { // If we have a high surrogate...
            wcs[1] = getwchar(); // Read the low surrogate
            if (wcs[1] == WEOF) break;
        }
        // Write
        char utf8[5] = {0};
        int result = WideCharToMultiByte(CP_UTF8, 0, wcs, (wcs[1] == L'\0') ? 1 : 2, utf8, 4, NULL, NULL);
        if (result > 0) {
            printf("%s", utf8);
        }
    }
    return 0;
}

@DannyDaemonic
Copy link
Collaborator

I've rebased #1558 if you're interested in trying out that patch.

@tokenizer-decode
Copy link
Author

First of all, getwchar is quite problematic, and I would prefer to avoid using it as much as possible. Secondly, I've experimented with your patch. Have you run tests on it using Windows? Because redirecting stdin still seems to fail; it cannot capture stdin correctly and doesn't respect the reverse prompt. On my system, which runs Windows 11, the changes I've made work fine both in the console and upon redirection.

@DannyDaemonic

@DannyDaemonic
Copy link
Collaborator

Are you using the --simple-io flag?

@tokenizer-decode
Copy link
Author

Oh sorry didn't see that flag. Works fine with --simple-io. @DannyDaemonic

@DannyDaemonic
Copy link
Collaborator

DannyDaemonic commented Aug 3, 2023

Glad to hear it works. It's also confirmed to work here for #1491, and here for #1547. Both of which are also subprocess related.

Again, the code you suggest, was originally changed due to a bug in compilers. Your patch would work in Windows when compiled with the official MS compiler and with Intel's compiler. But it doesn't work in Windows when compiled with:

  1. Standalone MinGW-w64
  2. w64devkit
  3. Cygwin
  4. MSYS2
  5. Qt (native Windows compilation)
  6. Code::Blocks

@tokenizer-decode
Copy link
Author

Thank you for quick responses. I'm thinking whether it might be beneficial to include a sample UI implementation within the project. Currently, for anyone who wishes to create a UI, the available options seem to be:

  1. The use of a subprocess, which, despite its functional utility, introduces some overhead and might not be the most elegant solution (as we've seen the potential issues so far).
  2. The comprehensive implementation of the entire inference logic, which entails an understanding of the llama library and involves substantial effort.

If we could incorporate a straightforward UI example into our project, I believe it would provide a valuable template for developers to customize and adapt according to their specific requirements. This is just a suggestion, and I'd like to hear your thoughts on it.

@DannyDaemonic

@DannyDaemonic
Copy link
Collaborator

I think the other option, which has gotten way more popular lately, is to start a server instance listening locally only, and connecting to that with your UI app. Either way, I'm totally for having such an example, but it just comes down to time.

@DannyDaemonic
Copy link
Collaborator

Fixed with #1558

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants