keyboard/mouse context mgrs, printable width of sequences, sequence parsers #27

Closed
wants to merge 78 commits into from

jquast
Collaborator

@jquast jquast commented Apr 11, 2013

An overview of the changes made:

  1. keyboard input & stream definitions
  2. an additional stream, self.i_stream, which is always sys.stdin. You may prefer to allow an input_stream keyword to Terminal(). The variable 'descriptor' is refactored into 'o_fd' and 'i_fd' to differentiate which file descriptor it refers to.
  3. primary routine kbhit() is styled after the win32 name of the same purpose. It is a select() wrapper on stdin, "any input on keyboard?".
  4. primary context manager cbreak() is a simple wrapper around tty.setcbreak() that saves the terminal settings and restores them on exit in a finally clause.
  5. primary context manager mouse_tracking() is a simple wrapper for sending the sequences that enable mouse input as terminal sequences. Mouse sequences are not decoded by inkey(); an example decoder is provided in test_mouse.py.
  6. backend initializer _init_keystrokes() abuses curses to build a comprehensive keystroke database of multibyte sequences, code definitions and values.
  7. primary routine Terminal.inkey() is styled after qbasic? The classic issue of "did a user press the escape key alone? Or a multibyte sequence representing an application key?" is solved by timing with kbhit(). inkey() also decodes multibyte sequences into application keys, and decodes the byte-based input that os.read() returns into Unicode.
  8. primary routine Terminal.wrap() and backend routine ansiwrap() are styled after textwrap.wrap, but are sequence-aware. This is achieved by backend class AnsiWrapper, which overrides _wrap_chunks and makes use of AnsiString.
  9. backend generator _resolve_mbs(ucs) yields a Keystroke instance for each "keyboard hit" detected on input. This is the primary multibyte sequence parsing routine.
  10. backend function _sqlen(ucs) returns the length of the terminal output sequence pointed to by ucs as an integer.
  11. backend function _is_movement(ucs) returns True if the output sequence pointed to by ucs is "unhealthy for padding", that is, has positional effects on the terminal that would cause the printable width of the string to be indeterminable. Examples are term.clear or term.move.
  12. backend class AnsiString(unicode) overrides len, ljust, rjust, and center to provide sequence-aware printable width of strings. These methods are duplicated to Terminal.center, ljust, and rjust. You may choose to provide access to this routine with a method such as Terminal.printable_length(). If you choose to do so, I recommend also supporting East Asian double-width characters with wcswidth() routines, let me know.
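
For readers unfamiliar with the two primitives in items 3 and 4, here is a minimal sketch of what a select()-based kbhit() and a restoring cbreak() context manager could look like. It is written for Python 3 on a POSIX system and is not the code from this pull request: the function signatures (taking a file descriptor and a timeout) are assumptions made for illustration.

```python
import contextlib
import select
import termios
import tty


def kbhit(fd, timeout=0.0):
    """Return True if `fd` has input ready to read within `timeout` seconds.

    Hypothetical signature: the patch's kbhit() always checks stdin.
    """
    ready, _, _ = select.select([fd], [], [], timeout)
    return bool(ready)


@contextlib.contextmanager
def cbreak(fd):
    """Put the tty at `fd` into cbreak mode; restore its settings on exit."""
    saved = termios.tcgetattr(fd)      # remember the current terminal mode
    try:
        tty.setcbreak(fd)              # no line buffering, no echo
        yield
    finally:
        # Runs even if the body raises, so the terminal is never left
        # in cbreak mode after an exception.
        termios.tcsetattr(fd, termios.TCSAFLUSH, saved)
```

On a real terminal this would be used as `with cbreak(sys.stdin.fileno()):` followed by polling `kbhit(sys.stdin.fileno(), 0.1)` before each read.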

- add unicode-derived 'Keystroke' class, which is yielded by the
generator method _resolve_multibyte, adding additional properties,
.is_sequence (bool), .code (int), and .name (str).
- detect encoding and initialize incrementaldecoder in __init__, used
by _resolve_multibyte.
- inherit curses key capability names, values, and create multibyte
sequence pattern lookup in __init__
- add _resolve_keycode method, returns key capability name for an
integer.
- add _resolve_multibyte(text) method, given a string of multibyte
input, 'text', returns instances of Keystroke class, which may be
checked for .is_sequence, .code, and .name.
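
A rough sketch of the Keystroke idea described in this commit message, written for Python 3 (where str is already Unicode). The constructor arguments are assumptions; only the three property names come from the text above.

```python
class Keystroke(str):
    """A str subclass that also carries key metadata (illustrative sketch)."""

    def __new__(cls, ucs='', code=None, name=None):
        # str is immutable, so the extra attributes are attached in __new__.
        self = super().__new__(cls, ucs)
        self._code = code
        self._name = name
        return self

    @property
    def is_sequence(self):
        """True if this keystroke is an application key, not plain text."""
        return self._code is not None

    @property
    def code(self):
        """Integer key code (for example the value of curses.KEY_UP), or None."""
        return self._code

    @property
    def name(self):
        """Key capability name such as 'KEY_UP', or None."""
        return self._name


KEY_UP = 259  # the value of curses.KEY_UP, hard-coded to avoid importing curses
up = Keystroke('\x1b[A', code=KEY_UP, name='KEY_UP')
plain = Keystroke('a')
```

Because the class derives from str, callers that only care about the text can keep comparing against ordinary strings, while callers that care about application keys check .is_sequence first.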
@jquast jquast mentioned this pull request Apr 11, 2013
jquast added 16 commits April 11, 2013 02:37
Add wrap method to Terminal, with accompanying AnsiWrapper class, a
derived version of textwrap.TextWrapper, with a modified version of
_wrap_chunks that is sequence-safe.

A module-global function, 'ansiwrap' is exposed. Sequences are detected
using a range of byte matching and regular expressions. This sequence
padding code has been tested thoroughly on ANSI Art from the "Dark Domains"
DVD by "ACiD Productions". Example usage:

>>> import blessings
>>> t = blessings.Terminal()
>>> seq = u''
>>> for num, byte in enumerate('Pony Express, Choo! CHooooooOOoooOOo!'):
...    seq += t.color(num % 7) + t.bold + byte + t.normal
>>> import textwrap
>>> print u'\n'.join(textwrap.wrap(seq, 15))

(as you can see, textwrap fails horribly !!)

>>> print u'\n'.join(t.wrap(seq, 15))
Pony Express,
Choo!
CHooooooOOoooOOo!

winning !!
Continuing the example from the previous commit, it is now possible:

>>> print t.center(seq)
                     Pony Express, Choo! CHooooooOOoooOOo!
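
The reason textwrap and str.center go wrong on such strings is that len() counts the bytes of each escape sequence as if they were printable. A minimal sketch of the sequence-aware alternative follows. It assumes only CSI-style sequences (ESC [ ... final byte), which is narrower than what the patch detects, and the names printable_length and center are illustrative rather than the patch's own.

```python
import re

# CSI sequences: ESC [, parameter bytes, intermediate bytes, one final byte.
# Assumption: other sequence families (OSC, charset selection, ...) are ignored.
_CSI = re.compile(r'\x1b\[[0-9;?]*[ -/]*[@-~]')


def printable_length(text):
    """Return the number of characters that occupy a cell on screen."""
    return len(_CSI.sub('', text))


def center(text, width, fillchar=' '):
    """Center `text` in `width` columns, ignoring escape sequences."""
    pad = max(0, width - printable_length(text))
    left = pad // 2
    return fillchar * left + text + fillchar * (pad - left)


styled = '\x1b[1m\x1b[31mPony\x1b[0m'   # bold red "Pony", then reset
centered = center(styled, 10)
```

This sketch also ignores East Asian double-width characters; handling those would need a wcswidth()-style width lookup instead of len().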
Instantiate AnsiString from Terminal, so AnsiString contains these
padding methods, requires less instantiation.
also readability from pexpect
* add _conanical and _echo terminal state helpers and related functions.
* remove another magic bit, '27', by explicitly setting to curses.ascii.ESC
* fix for incremental decoder
* add warning for failed decodes in _resolve_multibyte
* placeholder for inkey() with docstrings of expected behavior
use tty.LFLAGS instead of magic attrs[3] bit. Also rename 'attrs' references to 'mode' to match python standard module naming conv.
in the usual style of mine, fully written but not yet run, this intermittent commit has complete docstrings for the interface in mind.

I haven't a win32 platform, but diligent effort was made to support it.
This commit also includes a possible fix for windows systems to determine the terminal height and width without calls to fcntl.

- remove unused defaultdict import
- refactor input stream key capability from __init__ to
  _init_keystrokes()
- remove redundant input key capability definitions
- store _state_echo and _state_canonical; necessary for win32 systems
- refactor term.stream to term.o_stream to accompany term.i_stream
- refactor documented @Property, do_style
- refactor _resolve_multibyte to _resolve_msb, consistent reference
  'MSB' can be either 'escape sequence' for application keys, or utf-8 bytes
  as part of a hamsterface emoticon.
getch() so far works in non-canonical mode, as unimpressive as it is.

There are various issues with multibyte sequence decoding; it currently looks like
the loop timing and buffering are at issue.
This is what the curses wrapper is missing; it is very obnoxious when developing with curses that, when your program crashes, you have to type 'reset' after each crash.

That's fine, we simply check the state and always restore it on __del__. If python throws an exception and kicks you out while the terminal is in cbreak mode, a warning is issued to let you know it was restored after doing so.

This will appear after your traceback, so it won't be any surprise what happened.
Also sprinkle various print debugs for MSB; there is a timing issue related to the 2nd byte of a '\x1b[A' sequence failing the kbhit() check. I think I've seen this before, it has to do with flushing something correctly at a particular time? Will look tomorrow.
self._does_styling = ((self.is_a_tty or force_styling) and
# os.isatty returns True if output stream is an open file descriptor
# connected to the slave end of a terminal.
self.is_a_tty = (o_stream is not None and os.isatty(o_fd))
Owner

Unnecessary parens

@erikrose
Owner

Okay, got a chance to read over the rest of it. This is a BIG code dump, and I'd like to restructure some parts of it; the API feels geared toward experts in tty terminology and workings, but the draw of blessings is that it's dead-easy to pick up for anyone. However, the mechanics look sound—it's clear you have serious Terminal fu. :-)

Rather than nit pick our way to release by commenting line by line on the pull (and enduring the communication lag for each exchange), I'd like to do a series of commits atop master myself, drawing from what you have here. That way, we can start releasing pieces of functionality without waiting for everything (mouse, Windows, etc.) to be perfect. How does that sound to you?

Looking forward to 1.6, which will be prominently ballyhooed as the "Jeff Quast release"!

@jquast
Collaborator Author

jquast commented Apr 15, 2013

Sounds good to me. Enjoy !

@sweenzor

I am pumped for 1.6!

@jquast
Collaborator Author

jquast commented May 25, 2013

To clarify on iso8859-1,

  •        if seq is not None:
    
  •            self._keymap[seq.decode('iso8859-1')] = i_val
    

What's the significance of this particular encoding?

If the 8th bit is set high on a non-utf8 session, it is trying to indicate that "meta sends 8-bit output", that is, it sets the MSB to 1 to indicate Meta was depressed. If this value is decoded as utf-8, it raises UnicodeDecodeError, as it is not a valid UTF-8 start byte. If it is decoded as 'ascii', it also raises an exception, as ascii is 7-bit.
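
The behaviour described above is easy to check directly. The snippet below is a small Python 3 illustration, not code from the patch; try_decode is a hypothetical helper that exists only to show which codecs accept the byte.

```python
# Meta-x in "meta sends 8-bit" mode: the byte for 'x' with the high bit set.
meta_x = bytes([ord('x') | 0x80])     # b'\xf8'


def try_decode(raw, encoding):
    """Return the decoded text, or None if `raw` is not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(meta_x, 'utf-8')         # None: 0xf8 is not a valid start byte
as_ascii = try_decode(meta_x, 'ascii')        # None: ascii is 7-bit
as_latin1 = try_decode(meta_x, 'iso8859-1')   # 'ø': every byte maps to a code point
```

iso8859-1 maps each of the 256 byte values to the code point with the same number, so decoding with it cannot fail, which is what makes it safe for building a keymap from raw sequences.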

I made a gist that demonstrates what could happen if you attempted to decode this input through a utf-8 decoder:

https://gist.github.com/jquast/5649654

It also displays the values of, say, 'M-x' etc. Although there is not any support for binding Meta or Ctrl characters in the proposed patch, the input of any such characters (keyboard inputs 254, but not legal in utf-8) should not raise an exception during multibyte sequence decoding. It would simply be a key event that is not currently mapped.

If Ctrl and Meta were wanted, I would recommend something like adding additional functions or attributes to the Keypress class that matches a constant returned by Terminal.Meta('x') or Terminal.Ctrl('z').

as meta could be sent as \033x ("meta sends escape") or as the single byte b'\xf8' (248), since ord('x') ^ 128 is 248.

The user would enable/disable this option in their client, and through some configuration interface, such as XResources in X11, or through software UI, choose "meta sends escape" .. or not. When enabled, any meta would be escaped as \033 x. This is the preferred new input method for meta, as it supports both locale + meta.
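
To make the two conventions concrete, here is a small sketch that recognises both forms of a Meta keypress. parse_meta is a hypothetical helper, not part of the proposed patch, and it assumes an 8-bit (non-UTF-8) session: in a UTF-8 session a byte with the high bit set is part of a multibyte character and must not be interpreted this way.

```python
ESC = 0x1b


def parse_meta(raw):
    """Return (is_meta, key) for one key event given as bytes.

    Assumes the input holds exactly one event from an 8-bit session.
    """
    if len(raw) == 2 and raw[0] == ESC:
        # "meta sends escape": ESC followed by the plain key.
        return True, chr(raw[1])
    if len(raw) == 1 and raw[0] >= 0x80:
        # "meta sends 8-bit": the key with its high bit set.
        return True, chr(raw[0] & 0x7f)
    return False, raw.decode('iso8859-1')


escape_form = parse_meta(b'\x1bx')
eight_bit_form = parse_meta(bytes([ord('x') ^ 128]))
plain_form = parse_meta(b'x')
```

Both Meta forms normalise to the same (True, 'x') pair, which is the kind of value a Terminal.Meta('x') constant could be compared against.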

Some poorly programmed terminal software (I think 'tetradraw' does this) will simply check if(chr == 'ø').

It may be very hard to reproduce on OSX, easier on xterm/linux. I seem to recall using both modes successfully in rxvt many years past. In "irssi" you would switch windows by using alt+1 -- but for us UTF-8 folk, this now inputs ¡ instead. This can be worked around by simply pressing Escape (\e), then 1, which simulates "meta sends escape", though very slowly! This has the same escape delay issue that the timer of inkey() presents when encountering a single \e -- wait for more, or signal a single escape key?? Irssi seems to wait indefinitely for subsequent bytes; it does not provide any single "escape key" binding. For applications that support both Escape and Meta, some timing is required, as I think is already impl...
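
The escape-delay trade-off can be sketched as follows. This is an illustration for Python 3 on POSIX, not the inkey() from the patch: read_key, the KNOWN table and the 50 ms default are all assumptions. After reading a byte that could start a known sequence, it waits up to esc_delay for the next byte; if nothing arrives, a lone ESC is reported as the Escape key.

```python
import os
import select

# A tiny, illustrative table; the patch builds a far larger one from curses.
KNOWN = {
    b'\x1b[A': 'KEY_UP',
    b'\x1b[B': 'KEY_DOWN',
    b'\x1b[C': 'KEY_RIGHT',
    b'\x1b[D': 'KEY_LEFT',
}


def _is_proper_prefix(buf):
    """True if `buf` could still grow into one of the known sequences."""
    return any(seq.startswith(buf) and seq != buf for seq in KNOWN)


def read_key(fd, esc_delay=0.05):
    """Read one key event from `fd`; return (raw_bytes, name_or_None)."""
    buf = os.read(fd, 1)
    # Keep reading only while a longer match is possible AND more input
    # shows up within esc_delay seconds.
    while _is_proper_prefix(buf) and select.select([fd], [], [], esc_delay)[0]:
        chunk = os.read(fd, 1)
        if not chunk:          # end of file
            break
        buf += chunk
    if buf in KNOWN:
        return buf, KNOWN[buf]
    if buf == b'\x1b':
        return buf, 'KEY_ESCAPE'
    return buf, None
```

A shorter esc_delay makes a bare Escape feel more responsive but risks splitting a sequence that arrives slowly, for example over a laggy network link; waiting indefinitely, as irssi appears to do, avoids the split at the cost of never reporting Escape on its own.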

It's up to the implementer what to do about bytes > 128. I could very well instruct the user to press alt+1, and seek '\e1' OR '¡'. But I'm making assumptions about UTF-8 and the user's keyboard. This is why emacs is so Control-Key heavy, and few terminal applications make use of meta -- the meta key is clearly a difficult thing to support!

I'm using python 3 now and noticing my patch doesn't work. Sorry about that.. I just thought I'd let you make any comments before I continued making more changes. Just let me know if you have trouble, no pressure.

@jquast
Collaborator Author

jquast commented May 25, 2013

Ahh, from iTerm Profiles->Keys ->

left/right Alt key acts as:
Meta (presumably 8-bit input): You have chosen to have an option key act as Meta. This option is useful for backward compatibility with older systems. The "+Esc" option is recommended for most users.

so there you go. The idea is somebody using blessings with these older systems won't throw decoding exceptions when trying to match it with a MBS. Or if somebody really wanted to, they could program blessings to map them.

@jquast
Collaborator Author

jquast commented Aug 30, 2013

I can try to bring this all in again, one stage/feature/pull request at a time, sometime in the near future. I'm currently unemployed and homeless (I prefer 'nomadic', heh) and could use the distraction.

@erikrose
Owner

That would certainly make me less afraid of it and more likely to give it a timely review. :-)

Sorry to hear about your troubles! Do you know any web stuff? Ever thought about applying at Mozilla? You're clearly a very smart guy.

@jquast
Collaborator Author

jquast commented Aug 30, 2013

Unfortunately not any good at web stuff, really. I've avoided javascript/css/html 2, 3, 4, and now 5 for a long while; it's sadly a rapidly moving target that I stopped keeping up with not long after html 2.0 came out. The languages themselves I can do fine, it's the rapidly evolving "everything else" that's around it. Appreciate the smart guy comment, though :-)

(Crazy how employers expect you to do all of front and back-end, having such a diverse range of expertise to get your foot in the door -- seems when I'm actually there, working the back-end, I end up knowing more than the front-end teams, maybe I should just start lying on my resume like the rest of them, lol - http://jeffquast.com/resume.pdf btw)

@sweenzor

Two things!

  1. I'm glad to see this thread again!
  2. @jquast, you open to relocation?

@jquast
Collaborator Author

jquast commented Sep 1, 2013

  1. Indeed, I'm seeking work along the W. Coast, though I'm currently in Northern Michigan, will be driving my '79 toyota there within the next two weeks or bust.


3 participants