[BUG] Upper case non-ASCII keysyms in .mim files do not work in ibus-m17n #90

Closed
mike-fabian opened this issue Nov 5, 2024 · 4 comments

Key symbols starting with a capital letter like (Udiaeresis) work in ibus-typing-booster but not in ibus-m17n.

Example test file:

;; t-test-mike.mim -- test input method

(input-method t test-mike)

(description
"Mike's test input method")

(title "Test Mike")

(map
 (trans
 ;; Lines marked with PASS work with ibus-m17n and ibus-typing-booster.
 ;; Lines marked with FAIL do not work and probably need an enhancement
 ;; in the m17n library.
  ((0x0061) "test1") ; PASS a U+0061 
  ((0x0100263A) "test3") ; FAIL ☺ U+263A WHITE SMILING FACE
  ((udiaeresis) "test40") ; PASS
  ((udiaeresis udiaeresis) "test41") ; PASS
  ;; Without the line with the single adiaeresis, the following line
  ;; with the double adiaeresis still works, but when a single ä
  ;; is typed, it is not shown in the preedit; only when the second ä is
  ;; typed does the final result appear. If some other letter like x
  ;; comes instead of the second ä, the ä vanishes completely.
  ;; This is different from the behaviour of the ((x x) "test61") line:
  ;; typing a single x puts the x into preedit, typing the second x produces
  ;; 'test61'. If a different letter is typed instead of the second x,
  ;; the x in preedit gets committed.
  ;; I suspect a bug in the m17n library here, as this problem is reproducible
  ;; both with ibus-typing-booster's m17n_translit.py and with ibus-m17n.
  ;; Workaround: add ((adiaeresis) "ä"), then the single ä appears in preedit
  ;; and everything seems to work fine.
  ;((adiaeresis) "ä") ; PASS
  ((adiaeresis adiaeresis) "test50") ; PASS
  ((x x) "test61") ; PASS, does not need an extra ((x) "test60") to work well
  ;; ((0x00DC) "test7") ; FAIL Ü U+00DC LATIN CAPITAL LETTER U WITH DIAERESIS
  ;; key symbols for capital letters seem to fail in ibus-m17n:
  ((Udiaeresis) "test8") ; PASS in ibus-typing-booster, FAIL in ibus-m17n. ibus-m17n bug?
  ))

(state
  (init
    (trans)))
mike-fabian added the bug label Nov 5, 2024
mike-fabian self-assigned this Nov 5, 2024
mike-fabian moved this to In Progress in Mike’s project Nov 5, 2024

mike-fabian commented Nov 5, 2024

Debugging ibus-m17n shows this:

** (ibus-engine-m17n:2651747): DEBUG: 09:57:01.237: FIXME keysym->str=u
** (ibus-engine-m17n:2651747): DEBUG: 09:57:04.396: FIXME keysym->str=udiaeresis
** (ibus-engine-m17n:2651747): DEBUG: 09:57:05.838: FIXME keysym->str=udiaeresis
** (ibus-engine-m17n:2651747): DEBUG: 09:57:08.917: FIXME keysym->str=S-Udiaeresis
** (ibus-engine-m17n:2651747): DEBUG: 10:24:12.292: FIXME keysym->str=U
** (ibus-engine-m17n:2651747): DEBUG: 10:30:17.963: FIXME keysym->str=S-C-Return
** (ibus-engine-m17n:2651747): DEBUG: 10:31:54.782: FIXME keysym->str=Return
** (ibus-engine-m17n:2651747): DEBUG: 10:32:03.734: FIXME keysym->str=Up
** (ibus-engine-m17n:2651747): DEBUG: 10:32:11.895: FIXME keysym->str=S-Right

Adding the S- prefix causes the problem:

https://github.com/ibus/ibus-m17n/blob/main/src/engine.c#L707

    if (mask & IBUS_SHIFT_MASK) {
        g_string_prepend (keysym, "S-");
    }
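To make the mismatch concrete, here is a minimal Python sketch of the unconditional prepend shown above. It is not the engine.c code: `build_key_name`, `SHIFT_MASK`, and `map_keys` are hypothetical names introduced for illustration, and the sketch models only the prefix step, not the rest of the key handling.

```python
# Illustrative stand-in for IBUS_SHIFT_MASK; the concrete value is an assumption.
SHIFT_MASK = 1 << 0


def build_key_name(keysym_name: str, mask: int) -> str:
    """Mimic the unconditional prepend: add 'S-' whenever Shift is held."""
    if mask & SHIFT_MASK:
        return "S-" + keysym_name
    return keysym_name


# Key names as they appear in the (trans ...) map of the test file.
map_keys = {"udiaeresis", "Udiaeresis"}

lower = build_key_name("udiaeresis", 0)
upper = build_key_name("Udiaeresis", SHIFT_MASK)

# The lower-case name matches a map entry; the prefixed upper-case name does not.
print(lower, lower in map_keys)
print(upper, upper in map_keys)
```

Under these assumptions, the map entry `(Udiaeresis)` can never be reached, because the name that is looked up is `S-Udiaeresis`, as seen in the debug output.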


mike-fabian commented Nov 5, 2024

However, never adding the S- prefix would break input methods that use (S-\ ), (S-C-Return), (S-Right), or (S-Left):

mfabian@hathi:/usr/share/m17n
$ grep '(S-' *.mim 
as-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
as-itrans.mim:  ((S-\ ) "‌")
bn-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
bn-itrans.mim:  ((S-\ ) "‌")				; not in ITRANS Bengali table
fa-isiri.mim:  ((S-\ ) "‌")				; zero width non joiner
global.mim:  (S-Right) (C-O))
global.mim:  (S-Left) (C-I))
gu-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
gu-itrans.mim:  ((S-\ ) "‌")				; not in ITRANS Gujarati table
hi-brahmi-itrans.mim:((S-C-Return)))
hi-itrans.mim:  ((S-C-Return)))
hi-optitransv2.mim:  ((S-\ ) "‌")
hi-optitransv2.mim:  ((S-C-Return)))
hi-vedmata.mim:		(".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
ja-anthy.mim:  ((S-Left) (call libmimx-anthy resize t))
ja-anthy.mim:  ((S-Right) (call libmimx-anthy resize nil)))
kn-optitransv2.mim:  ((S-\ ) "‌")
kn-optitransv2.mim:  ((S-C-Return)))
ko-romaja.mim:  ((S-\ ))))
ks-sharada-itrans.mim:((S-C-Return)))
lsymbol.mim:  ((S-\ ))
minglish.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
minglish.mim:  ((S-\ ) "‌")			     
ml-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
ml-itrans.mim:  ((S-\ ) "‌")
ml-mozhi.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
ml-mozhi.mim:  ((S-\ ) "‌")
ml-swanalekha.mim:  ((S-\ ))))
mr-gamabhana.mim:  (".") ("~") ("#") ("$") ("*") ("]") (":") ((S-\ )) ((C-@))
mr-gamabhana.mim:  ((S-\ ) "‌")
mr-gamabhana.mim:  ((S-C-Return)))
mr-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
mr-itrans.mim:  ((S-\ ) "‌")			      ; not in ITRANS Devanagari table
mr-modi-itrans.mim:((S-C-Return)))
or-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
or-itrans.mim:  ((S-\ ) "‌")
pa-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
pa-itrans.mim:  ((S-\ ) "‌")				; not in ITRANS Gurmukhi table
sa-grantha-itrans.mim:((S-C-Return)))
sa-harvard-kyoto.mim:  ((S-\ )) ((C-@))
sa-harvard-kyoto.mim:  ((S-\ ) "‌")
sa-vedic-itrans.mim:  ((S-C-Return)))
si-phonetic-dynamic.mim:  ((S-\ ) " ")		; 0x00a0 - no-break space
si-sumihiri.mim:  ((S-\ ) " ")		; 0x00a0 - no-break space
si-trans.mim:  ((S-\ ) " ")		; 0x00a0 - no-break space
si-wijesekara.mim:  ((S-\ )) ((BackSpace)) ((Delete)))
si-wijesekara.mim:   ((S-\ ) " ")				; NBSP
si-wijesekara.mim:   ((S-\ ) " ")				; NBSP (00A0)
ta-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
ta-itrans.mim:  ((S-\ ) "‌")				; not in ITRANS Tamil table
te-itrans.mim:  (".") ("~") ("#") ("$") ("^") ("*") ((S-\ )) ((C-@))
te-itrans.mim:  ((S-\ ) "‌")				; not in ITRANS Telugu table
te-pothana.mim:  ((S-\ )) ((C-@))
te-pothana.mim:;;  ((S-M-Y)) need to check what 
te-pothana.mim:  ((S-\ ) "‌")				; not in ITRANS Telugu table
te-rts.mim:  ((S-\ )) ((C-@))
te-rts.mim:  ((S-\ ) "‌")
zh-util.mim:  ((S-\ ))
zh-zhuyin.mim:  ((S-\ ) (select @\[))
zh-zhuyin.mim: (toggle-fullshape ((S-\ ))
mfabian@hathi:/usr/share/m17n
$ 

@mike-fabian

Another file that contains upper-case non-ASCII keysyms is hu-rovas-post.mim from m17n-db:

$ rpm -qf  /usr/share/m17n/hu-rovas-post.mim
m17n-db-1.8.8-1.fc41.noarch

github-project-automation bot moved this from In Progress to Done in Mike’s project Nov 7, 2024
mike-fabian added a commit to mike-fabian/ibus-typing-booster that referenced this issue Nov 8, 2024
…ed and is is a space or not printable

Previously I added 'S-' only in the special cases where msymbol was
'C-Return' or ' ', because 'S-C-Return' and 'S- ' were used in some
input methods in m17n-db.  But 'S-Left' and 'S-Right' are in fact
already used as well, and others like 'S-Up' or 'S-Down' could surely be
used. So the code handled only two special cases and was not generic enough.

But for characters which are printable and **not** whitespace, the
'S-' should **not** be added!  For example, Ü (keysym 'Udiaeresis')
is typed by pressing Shift+ü. The Shift has then been “absorbed”
in making the character uppercase, so prepending 'S-' is wrong.

See also ibus/ibus-m17n#90 for the same problem in ibus-m17n.
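
The rule described in this commit message could be sketched roughly as follows. This is not the code from the commit: the function names `needs_shift_prefix` and `key_name`, and the convention of passing an empty string for keys that produce no character, are assumptions made for illustration.

```python
def needs_shift_prefix(text: str) -> bool:
    """Decide whether 'S-' should be prepended when Shift is held.

    `text` is the character the key produces, or '' for keys such as
    Left, Right, or Return that produce none (an assumed convention).
    """
    if not text:
        # No character: Shift cannot have been absorbed into it.
        return True
    # Whitespace and non-printable characters keep the explicit prefix.
    return text.isspace() or not text.isprintable()


def key_name(msymbol: str, text: str, shift: bool) -> str:
    """Build the key name to look up in the input method's map."""
    if shift and needs_shift_prefix(text):
        return "S-" + msymbol
    return msymbol


print(key_name("Udiaeresis", "Ü", True))   # Shift absorbed by the capital letter
print(key_name(" ", " ", True))            # whitespace keeps the prefix
print(key_name("C-Return", "", True))      # no character, prefix added
print(key_name("Right", "", True))         # no character, prefix added
```

With this check, `(Udiaeresis)` and entries such as `(S-\ )`, `(S-C-Return)`, and `(S-Right)` can all be matched by the same code path, instead of listing special cases one by one.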
mike-fabian added a commit to mike-fabian/ibus-typing-booster that referenced this issue Nov 9, 2024
…ed and is is a space or not printable
