9x/ME: Implement proper handling for console writes where character lengths differ between utf16 and target codepage #14

seritools · 2024-01-01T22:00:36Z

For #13 I have implemented a kind of crappy workaround. This ticket tracks actually implementing a proper solution.

The workaround:

rust/library/std/src/sys/windows/stdio.rs

Lines 205 to 209 in cdf0f73

    
    if !compat::is_windows_nt() {
        // FIXME: This function should manually convert to the target codepage on 9x/ME, and
        // handle incomplete writes by calculating how many utf8-effective bytes were written.
        return Ok(utf8.len());
    }

Problem description

Whenever the "character count" differs between the UTF-16 side and the codepage side, the reported "number of characters written" differs between the two sides as well, so it cannot be mapped directly back to the input.
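To make the mismatch concrete, here is a minimal sketch that counts the same string three ways. `assumed_codepage_len` is a hypothetical stand-in, not real conversion code: it assumes a double-byte codepage where ASCII takes one byte and every other character takes two. Real code on 9x/ME would have to ask the system (for example via `WideCharToMultiByte`) instead.

```rust
/// Hypothetical model of a double-byte codepage: ASCII is one byte,
/// everything else is ASSUMED to take two bytes. Illustration only.
fn assumed_codepage_len(s: &str) -> usize {
    s.chars().map(|c| if c.is_ascii() { 1 } else { 2 }).sum()
}

/// Returns (UTF-8 bytes, UTF-16 code units, assumed codepage bytes).
fn lengths(s: &str) -> (usize, usize, usize) {
    (s.len(), s.encode_utf16().count(), assumed_codepage_len(s))
}
```

For the string from #13, `lengths("Hello, 世界!")` gives `(14, 10, 12)` under this assumption: three different counts for the same text, so a "written" count from one side says little about the others.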

Solution idea

Rough idea for the conversion code

Because of the multiple lossy conversions needed on 9x/ME (subset of UTF8 → WTF16 → codepage), the only sensible way would be to do the codepage conversion in the stdlib code, then loop until the entire converted buffer has been written, thus confirming that the entire input buffer of up to MAX_BUFFER_SIZE bytes has been written. Only that way can we ensure that all input characters are accounted for in some capacity.
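A rough sketch of that loop, under stated assumptions: `write_console_9x`, `write_all_converted`, and the `convert` and `write_chunk` closures are all hypothetical names, not the actual stdlib code. `convert` stands in for the codepage conversion and `write_chunk` for a console write that may accept only part of the buffer.

```rust
use std::io;

/// Keeps calling `write_chunk` until all of `converted` has been accepted.
/// `write_chunk` is a hypothetical stand-in for the console write call and
/// returns how many bytes it took.
fn write_all_converted<F>(converted: &[u8], mut write_chunk: F) -> io::Result<()>
where
    F: FnMut(&[u8]) -> io::Result<usize>,
{
    let mut remaining = converted;
    while !remaining.is_empty() {
        match write_chunk(remaining)? {
            // No progress: bail out instead of looping forever.
            0 => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "console accepted no bytes",
                ))
            }
            // Skip what was accepted (clamped, in case the writer over-reports).
            n => remaining = &remaining[n.min(remaining.len())..],
        }
    }
    Ok(())
}

/// Converts first, writes everything, and only then reports the whole
/// UTF-8 input as consumed. `convert` is a hypothetical codepage converter.
fn write_console_9x<C, F>(utf8: &str, convert: C, write_chunk: F) -> io::Result<usize>
where
    C: FnOnce(&str) -> Vec<u8>,
    F: FnMut(&[u8]) -> io::Result<usize>,
{
    let converted = convert(utf8);
    write_all_converted(&converted, write_chunk)?;
    Ok(utf8.len())
}
```

Because the function only returns once the converted buffer is fully written, returning `utf8.len()` is then justified rather than assumed, which is the point of looping instead of reporting a partial count.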

seritools added the enhancement and help wanted labels Jan 1, 2024

seritools changed the title ~~Implement proper MBCS handling for console writes~~ 9x/ME: Implement proper MBCS handling for console writes Jan 1, 2024

seritools changed the title ~~9x/ME: Implement proper MBCS handling for console writes~~ 9x/ME: Implement proper handling for console writes where character lengths differ between utf16 and target codepage Jan 1, 2024

seritools mentioned this issue Jan 1, 2024

Win95 println!("Hello, 世界!") panics when using Chinese locale #13

Closed
