http://invisible-island.net/xterm/modified-keys/
Copyright © 2019 by Thomas E. Dickey

XTerm – “Other” Modified Keys

Background

As with the modified function- and cursor-keys introduced in patch #94 (1999) and refined in patch #167 (2002), xterm can send other modified keys as special escape sequences. Again, that came from another developer's suggestion.

In the beginning

It was simple enough, at first:

Subject: support for key bindings for C-. C-, etc etc in xterm?
From: Dan Nicolaescu <dann@ics.uci.edu>
Date: Tue, 04 Apr 2006 15:53:39 -0700

Hi!

Do you plan to add support for the few C-KEY combinations that are not
currently supported (i.e. the ones that don't correspond to any ASCII
sequence).

The reason I am asking is that these keys could be used in emacs, some
modes use them in X11, but they cannot be used in an xterm.
If you add such support to xterm, I'll add the corresponding support
to the emacs CVS right away, the only think needed is the
corresponding escape sequences.

I understand that key bindings are a twisted and hard to understand
business, and I am not sure if this idea has been considered and
rejected before, I apologize if that is indeed the case.

Thanks
        --dan

Emacs users like key bindings, so the request was not really surprising. I had not thought of extending xterm in this manner:

Subject: Re: support for key bindings for C-. C-, etc etc in xterm?
From: Dan Nicolaescu <dann@ics.uci.edu>
Date: Tue, 04 Apr 2006 16:56:04 -0700

Thomas Dickey <dickey@his.com> writes:

  > On Tue, 4 Apr 2006, Dan Nicolaescu wrote:
  >
  > >
  > > Hi!
  > >
  > > Do you plan to add support for the few C-KEY combinations that are not
  > > currently supported (i.e. the ones that don't correspond to any ASCII
  > > sequence).
  >
  > I hadn't thought of that - which ones, for example?
  >
  > xterm supports control-modifiers for function- and keypad-keys.

I know, and I added support for all of them in emacs :-)

  >  Perhaps you're talking about things like control/{, which doesn't
  > correspond to an ASCII control character?

Right.

Right now I am thinking: Control-. Control-, Control-TAB,
At least these have uses in emacs. I would have to think more what
else would be useful, in case you are willing to add support for them.

Thanks
                --dan

I was willing to do this:

Subject: Re: support for key bindings for C-. C-, etc etc in xterm?
From: Dan Nicolaescu <dann@ics.uci.edu>
Date: Wed, 05 Apr 2006 10:10:02 -0700

Thomas Dickey <dickey@his.com> writes:

  > On Tue, 4 Apr 2006, Dan Nicolaescu wrote:
  >
  > > Thomas Dickey <dickey@his.com> writes:
  > >
  > >  > On Tue, 4 Apr 2006, Dan Nicolaescu wrote:
  > >  >
  > >  > >
  > >  > > Hi!
  > >  > >
  > >  > > Do you plan to add support for the few C-KEY combinations that are not
  > >  > > currently supported (i.e. the ones that don't correspond to any ASCII
  > >  > > sequence).
  > >  >
  > >  > I hadn't thought of that - which ones, for example?
  > >  >
  > >  > xterm supports control-modifiers for function- and keypad-keys.
  > >
  > > I know, and I added support for all of them in emacs :-)
  > >
  > >  >  Perhaps you're talking about things like control/{, which doesn't
  > >  > correspond to an ASCII control character?
  > >
  > > Right.
  > >
  > > Right now I am thinking: Control-. Control-, Control-TAB,
  > > At least these have uses in emacs. I would have to think more what
  > > else would be useful, in case you are willing to add support for them.
  >
  > well, offhand (I'll have to think about this, of course), it would be
  > straightforward to add an option that would pass the extra characters
  > as if they were function-keys in the form
  >
  >     \E[27;modifier;code~
  >
  > just like the existing function keys - using the decimal value of the
  > character as the code.

That would work perfectly well from the point of view of emacs.
If you decide this is feasible, please let me know if you need any help
with this.

Thanks
                --dan

I chose this format because it was consistent with the earlier work, and (noticing that 27 was unused) chose that as the primary code. After all, decimal 27 is the “same” as the ASCII escape character.

Subject: Re: support for key bindings for C-. C-, etc etc in xterm?
From: Dan Nicolaescu <dann@ics.uci.edu>
Date: Fri, 12 May 2006 09:20:51 -0700

Thomas Dickey <dickey@his.com> writes:

  > On Sat, 6 May 2006, Dan Nicolaescu wrote:
  >
  > > Have you reached a decision about adding these key bindings to xterm?
  >
  > It's on my list - just lots of realtime nuisances recently.  I thought
  > I might have time in #213, but that became a short bug-fix, rather
  > than new development.  In my to-do list, I have this marked for #214
  > (and will probably do it unless someone points out a nasty
  > fix-it-right-now-bug).

Thanks!
Based on that I have added these key bindings to emacs with a note
that they will be available in xterm-214 or later:
\e[27;5;9~  control-TAB
\e[27;5;44~ control-,
\e[27;5;46~ control-.
\e[27;5;47~ control-/
\e[27;5;92~ control-\

(I'll add more such key bindings as soon as I find them potentially useful).

      --dan

As promised, I added the feature in patch #214. Nicolaescu added support in Emacs in xterm.el in stages beginning in May 2006.
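The sequences listed in that message follow directly from the format. Here is a minimal sketch, assuming the modifier parameter is 1 plus a bit mask (shift=1, alt=2, control=4, meta=8), as described in xterm's control-sequences documentation. The helper name `other_key_sequence` is hypothetical and is not part of xterm or Emacs.

```python
# Bit values for the modifier parameter (assumption taken from xterm's
# control-sequences documentation); the parameter sent is 1 + the sum.
SHIFT, ALT, CONTROL, META = 1, 2, 4, 8


def other_key_sequence(char: str, modifiers: int) -> str:
    """Return ESC [ 27 ; modifier ; code ~ for one character.

    The code is the decimal value of the character, as in the
    format proposed in the mail exchange above.
    """
    return f"\x1b[27;{1 + modifiers};{ord(char)}~"


if __name__ == "__main__":
    for name, char in [("control-TAB", "\t"), ("control-,", ","), ("control-.", ".")]:
        print(name, repr(other_key_sequence(char, CONTROL)))
```

This only illustrates the shape of the sequence. Whether xterm actually sends it for a given key depends on the modifyOtherKeys setting and the special cases discussed later.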

Seeing that there would be different user preferences for this area, I added resource settings (and, when appropriate, new control sequences) for enabling the corresponding function-key feature. In summary:

modifyCursorKeys, in patch #167 (2002/8/24).
modifyOtherKeys, in patch #214 (2006/6/18).
modifyFunctionKeys, in patch #216 (2006/8/3).
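A program can change these settings at run time rather than relying on the resource values. The sketch below assumes the "set key modifier options" sequence `CSI > Pp ; Pv m` and the Pp values listed in xterm's control-sequences documentation. The helper name is hypothetical.

```python
# Values of the first parameter (Pp) of "CSI > Pp ; Pv m", as given in
# xterm's control-sequences documentation (an assumption of this sketch).
RESOURCE_CODES = {
    "modifyCursorKeys": 1,
    "modifyFunctionKeys": 2,
    "modifyOtherKeys": 4,
}


def set_key_modifier_option(resource: str, value: int) -> str:
    """Build the sequence which sets one resource to the given value."""
    return f"\x1b[>{RESOURCE_CODES[resource]};{value}m"


if __name__ == "__main__":
    # Show the sequence in readable form instead of sending it.
    print(repr(set_key_modifier_option("modifyOtherKeys", 2)))
```

Writing the returned string to the terminal is what changes the setting. The example prints its `repr` so that running it has no side effect.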

An alternative

In March 2008, I received mail from Paul Evans, regarding the modifyOtherKeys feature. He asked about the reason for choosing the 27 encoding. Also, he asked for more information:

> I'm also interested to know the selection criteria on how XTerm decides
> to use this representation or not - when modifyOtherKeys=2, it seems to
> pick it for just about everything - Shift-Q, for example, leaving plain
> "q" to represent unshifted, yet "1" and "!" each encode natively - this
> mode seems a little overkill. Alternatively, in modifyOtherKeys=1, it
> doesn't seem to go far enough - Ctrl-R and Ctrl-Shift-R both using the
> same bytesequence to represent them, seeming to defeat the point of using
> such a mode.

It was a tradeoff (with some feedback from two different people at extremes).
Some of the modified-key combinations are generally predefined (and hard to
work around).  For the predefined combinations, xterm doesn't actually
see the key+modifier - it only gets whatever the keyboard configuration
has first mapped it to.  That seems to be the issue with "1" vs "!".

Also, because xterm's using the *LookupString() functions to resolve events,
it doesn't see quite as much as xev does.

In response to my explanation of the 27, Evans responded:

I've been thinking over [the above], and come to the conclusion it isn't
very neat.

What would you think instead to using one of the other Private Use CSIs
here, say, "u"

Ctrl-A, for example, could become

  CSI 65 ; 5 u

The "u" is 0x75, so Private Use. It's almost self-documenting - the "u"
hints at being "Unicode". And it fits in with the use of ~ much nicer...
CSI 65 ; 5 ~ is the 65th function key with the 5 modifier bits. This
would also strengthen one of my standard arguments I make at everyone on
why they should implement a Proper CSI Parser.

Evans was unfamiliar with the nuances of the X11 library (and how its keycodes and keysyms are related), and did not pick up on the mention of *LookupString() or xev.

He also ignored the fact that it was used in Emacs, stating:

I think it's fairly safe to say that nobody really uses the CSI 27
feature of XTerm thus far - it's taken enough beating with sticks for
people to even get as far as noticing the modifiers. If we catch this
arbitrary unicode sequence now, and do it right, we should be able to
fix it easily.

However, to give him something to experiment with, I added the formatOtherKeys resource in patch #235 (2008/04/20).
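An application which wants to work with either setting has to recognize both forms. The following is a minimal sketch of such a decoder, assuming that formatOtherKeys 0 (the default) gives `CSI 27 ; modifier ; code ~`, that formatOtherKeys 1 gives `CSI code ; modifier u`, and that the modifier parameter is 1 plus a bit mask. The function name is hypothetical.

```python
import re

# ESC [ 27 ; modifier ; code ~   (formatOtherKeys: 0, the default)
_FORMAT_27 = re.compile(r"\x1b\[27;(?P<mod>\d+);(?P<code>\d+)~")
# ESC [ code ; modifier u        (formatOtherKeys: 1)
_FORMAT_U = re.compile(r"\x1b\[(?P<code>\d+);(?P<mod>\d+)u")

# Bit values of (modifier parameter - 1); an assumption of this sketch.
_MODIFIER_BITS = ((1, "shift"), (2, "alt"), (4, "control"), (8, "meta"))


def decode_other_key(seq: str):
    """Return (character, [modifier names]) or None if seq is neither form."""
    for pattern in (_FORMAT_27, _FORMAT_U):
        match = pattern.fullmatch(seq)
        if match:
            bits = max(int(match["mod"]) - 1, 0)
            names = [name for bit, name in _MODIFIER_BITS if bits & bit]
            return chr(int(match["code"])), names
    return None


if __name__ == "__main__":
    print(decode_other_key("\x1b[27;5;46~"))
    print(decode_other_key("\x1b[46;5u"))
```

Both calls in the example describe the same key (control with "."), differing only in the format of the escape sequence.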

Two years passed, and Evans came back:

Date: Mon, 16 Aug 2010 19:06:06 +0100
From: Paul LeoNerd Evans <leonerd@leonerd.org.uk>
To: dickey@invisible-island.net
Subject: Xterm and modified keys

On the subject of xterm and modified keypresses, you may recall we were
discussing this quite a while ago.

Well, since then I've now got a nice client-side library for programs to
use to parse VT-style keypresses, plus XTerm's extensions for modified
cursor/function keys, and the CSI u style we came up with. See

  http://www.leonerd.org.uk/code/libtermkey/

I've also had a stab at actually documenting the terminal interface
here:

  http://www.leonerd.org.uk/hacks/fixterms/

How is xterm support looking in this direction? I was hoping at some
point it might be possible to enable a mode where it would properly send
keypresses in this neat scheme all the time; e.g. that Ctrl-D would
continue to work as it does now, but then Ctrl-Shift-D sends its proper
sequence; namely,  CSI 68;5 u

Actually, the “fixterms” documentation is far shorter than this page, and does not go into much detail. It mentions a few specific control characters (tab, return, escape), but states that other control characters use the CSI-u encoding. The page also conflates the modified cursor- and function-keys with the “other” keys (in the section “Really special keypresses”).

If Evans had constructed a table to illustrate the difference between xterm and his objective (as well as participating in development), he might have gotten somewhere.

Later he continued:

Date: Wed, 23 Feb 2011 16:29:10 +0000
From: Paul LeoNerd Evans <leonerd@leonerd.org.uk>
To: Thomas Dickey <dickey@his.com>
Subject: XTerm, modifyOtherKeys, modified unicode, and libtermkey

How is the current state-of-play regarding xterm and modified unicode
keypresses? I was hoping by now there'd be a mode I could set my xterm
into, such that the following happened:

  Ctrl-D                 \x04
  Ctrl-Shift-D           CSI 68;5 u

  Tab                    \x09
  Ctrl-I                 CSI 105;5 u
  Ctrl-Shift-I           CSI 73;5 u

This would help strengthen my proposal to get libtermkey more supported
by applications, such as my ever-present case-in-point being vim;
currently under debate in this thread:

  https://groups.google.com/group/vim_dev/browse_thread/thread/d9ba7d51d7d9eb73?pli=1

Specifically see Benjamin R Haskell's message of Feb 23rd, and my
response to it.

Still more time passed:

Date: Wed, 19 Nov 2014 18:46:59 +0000
From: "Paul \"LeoNerd\" Evans" <leonerd@leonerd.org.uk>
To: dickey@his.com
Subject: Re: A mode switch number for enabling CSI u keypress encodings

On Sun, 12 Oct 2014 19:29:27 -0400
Thomas Dickey <dickey@his.com> wrote:

> I haven't revisited this (in xterm, fonts and the ReGIS stuff have
> been most of the recent work).
> ...in a quick check, it seems quite a while since we discussed this.
> I have some to-do item from 2008 updated in 2009.

Yeah ;) Well.. that all still needs fixing. Last time I checked the
behaviour was still far from ideal.

Do you need a more comprehensive check list / comparison chart, or
somesuch? Or are you aware what needs to happen and just have to get
around to actually doing it..?

Between those last two messages, Evans had done some development. While he had begun libvterm late in 2007, it was not until late 2011 that he began making changes to support the CSI-u format:

initial commit (November 20, 2007).
note the "TODO" in input.c (October 3, 2011).
relevant development with commit comments:
- Much improved input_char function - UTF-8 encoding, CSI u for modified unicode where appropriate (October 4, 2011).
- CSI u encoding also for LITERAL keys (backspace/tab/enter/escape) (October 4, 2011).
- Bugfix for UTF-8 input (October 4, 2011).
- Ensure that Shift-Space is still representable, as the one Unicode character we allow with Shift (October 4, 2011).
- Ignore Shift if it's the only modifier to Space or Enter, as it's too easy to mistype (October 5, 2011).
- Also don't represent Shift-Backspace because of mistypeability (October 6, 2011).
- Shift-Tab has its own different representation than normal Tab (October 6, 2011).

Thus, Evans provided the third implementation of CSI-u late in 2011 (after xterm in mid-2008, mintty in mid-2009). According to Evans' comment in freebsd-arch (March 2014), he completed pangoterm around that time (see source-tree for October 4, 2011).

There is no documentation for pangoterm aside from its source-code. Likewise, documentation for libvterm is scanty, consisting of a list of output control sequences which it supports, noting those which were from VT100, VT220 or VT320 (some inaccuracies noted). The list does not mention which were from xterm or ECMA-48 (about 10% is from xterm versus the other categories), nor does it mention input sequences. For that, one must read the source-code or test it, e.g., with tack. As one might guess from the commit-comments, there are some quirks.

For example (testing 0~bzr607-1) typing control-alt-a and control-1 both produced escape-1. Other sequences using alt produced a CSI-u escape sequence (usually).

How It Works

Very briefly:

xterm receives key events from the X server. These contain key codes and modifiers.
xterm calls Xutf8LookupString or XmbLookupString to obtain equivalent characters and key symbol.
Both XKB (keyboard) and the X11 library contribute to the key symbol returned by the *LookupString functions.
- The former uses the shift modifier to do case-conversion, etc.
- The latter has special cases for certain control keys, e.g., converting control-3 to the escape character.
xterm has its own special cases such as the VT100 keypad escape sequences.
When handling modifyOtherKeys, xterm uses the key symbol from the *LookupString functions, and the modifier information from the original XKeyPressed event, sending an escape sequence in certain cases rather than a character.
For formatOtherKeys, xterm uses the same information as modifyOtherKeys, but changes the format of the escape sequence.

All of the relevant processing is done in xterm's input.c, and the resources which control this are documented in the manual page. But the interaction between these resources and which special cases apply to a given key can be hard for users to follow, without testing.

Developers can have a hard time with this as well. One could use xev to see what events the X server may send xterm, but most of xev's output is not useful. xterm has a debugging trace which shows the useful information. For this example:

control-F1
Help!, followed by Enter
exit, followed by Enter
control-d

the debugging trace contains this information:

Input(0,0) keysym 0xFFE3, 0:'' 7bit
Input(0,0) keysym 0xFFBE, 0:'' Control 7bit FKey
Input(0,7) keysym 0xFFE2, 0:'' 7bit
Input(0,7) keysym 0x0048, 1:'H' Shift 7bit
Input(0,8) keysym 0x0065, 1:'e' 7bit
Input(0,9) keysym 0x006C, 1:'l' 7bit
Input(0,10) keysym 0x0070, 1:'p' 7bit
Input(0,11) keysym 0xFFE2, 0:'' 7bit
Input(0,11) keysym 0x0021, 1:'!' Shift 7bit
Input(0,12) keysym 0xFF0D, 1:'\r' 7bit
Input(2,0) keysym 0xFFE3, 0:'' 7bit
Input(2,0) keysym 0x0065, 1:'e' 7bit
Input(2,1) keysym 0x0078, 1:'x' 7bit
Input(2,2) keysym 0x0069, 1:'i' 7bit
Input(2,3) keysym 0x0074, 1:'t' 7bit
Input(2,4) keysym 0xFF0D, 1:'\r' 7bit
Input(4,0) keysym 0xFFE3, 0:'' 7bit
Input(4,0) keysym 0x0064, 1:'\004' Control 7bit

Some of the lines have nothing in the quotes, e.g., the lines with keysyms 0xFFE3 and 0xFFE2. Those events are for modifiers. The X library takes those into account when it delivers the function key F1 (0xFFBE) and the 'H' (shifted 'h').

The case-conversion (“h” to “H”) is handled by the keyboard configuration. This aspect of the X library is documented in The X Keyboard Extension: Library Specification. The fact that uppercase and lowercase letters happen to be on the same key is incidental to the overall scheme of groups and modifiers.
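To pick out the interesting events from a trace like the one above, the lines can be parsed mechanically. This is a rough sketch which assumes the line layout shown in the example (the real trace format may differ between xterm versions). The two numbers in parentheses are left uninterpreted, and the names `TraceEvent`, `parse_trace_line` and `is_modifier_key` are hypothetical. The range 0xFFE1 to 0xFFEE covers the modifier keysyms from Shift_L through Hyper_R in keysymdef.h.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Layout assumed from the example: Input(n,n) keysym 0xHHHH, len:'text' flags...
TRACE_LINE = re.compile(
    r"Input\((\d+),(\d+)\) keysym 0x([0-9A-Fa-f]+), (\d+):'(.*)'\s*(.*)"
)


class TraceEvent(NamedTuple):
    keysym: int              # key symbol, e.g., 0xFFBE for F1
    length: int              # number of characters from the lookup
    text: str                # the characters, as printed in the trace
    flags: Tuple[str, ...]   # remaining words, e.g., ("Shift", "7bit")


def parse_trace_line(line: str) -> Optional[TraceEvent]:
    """Parse one trace line, or return None if it does not match."""
    match = TRACE_LINE.fullmatch(line.strip())
    if match is None:
        return None
    return TraceEvent(
        int(match[3], 16), int(match[4]), match[5], tuple(match[6].split())
    )


def is_modifier_key(event: TraceEvent) -> bool:
    """True for Shift_L (0xFFE1) through Hyper_R (0xFFEE)."""
    return 0xFFE1 <= event.keysym <= 0xFFEE


if __name__ == "__main__":
    sample = [
        "Input(0,0) keysym 0xFFE3, 0:'' 7bit",
        "Input(0,0) keysym 0xFFBE, 0:'' Control 7bit FKey",
        "Input(0,7) keysym 0x0048, 1:'H' Shift 7bit",
    ]
    for line in sample:
        event = parse_trace_line(line)
        if event is not None and not is_modifier_key(event):
            print(hex(event.keysym), repr(event.text), event.flags)
```

Filtering out the modifier-only events leaves the lines which correspond to keys that xterm would actually translate into characters or escape sequences.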

Examples

In October 2019, Bram Moolenaar asked for more information about how xterm prefixes keys. That led into a discussion of modifyOtherKeys, noting that the manual page description is rather short. I suggested that the best way to document this would be to construct a table, to make it simpler to see the pattern (and special cases). The resulting script (modify-keys.pl) does this:

reads keysymdef.h to obtain the key symbols which xterm may use
pipes the output of setxkbmap through xkbcomp, to obtain tables relating the XKB key symbol and the definitions from keysymdef.h
```
# setxkbmap -model pc105 -layout us -print | xkbcomp - -C -o -
```
imitates the logic of xterm for interpreting modifyOtherKeys with the different modes (i.e., 0=off, 1=user, 2=program).
optionally writes a report in either plain text or html.

I verified the script (for my keyboard, at least) using a curses application xterm_keys which I wrote in 2006 to help with my other-key and function-key development. That sets xterm into the various keyboard modes, and prints the result from typed keys in readable form.

This example shows xterm_keys, using xdotool to send all 16 combinations of shift, alt, control, and meta for a given keycode to an xterm whose Allow SendEvents has been enabled.

That is an animated GIF, showing the various menus. The data window briefly shows the difference between sending 0x000d and 0xff0d (XK_Return). The latter can have a shift modifier associated with the event.
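The 16 combinations can be enumerated mechanically. This sketch is not the script used for the test above. It assumes xdotool's "+"-separated keystroke notation with its `shift`, `alt`, `ctrl` and `meta` aliases, and pairs each combination with the modifier parameter (1 plus a bit mask) which would appear in the resulting escape sequence. The function name is hypothetical.

```python
# Modifier aliases (as accepted by xdotool) and the bit each contributes
# to the modifier parameter; both are assumptions of this sketch.
MODIFIERS = (("shift", 1), ("alt", 2), ("ctrl", 4), ("meta", 8))


def modifier_combinations(key: str):
    """Return a list of (keystroke, modifier parameter) for all 16 masks."""
    result = []
    for mask in range(16):
        names = [name for name, bit in MODIFIERS if mask & bit]
        result.append(("+".join(names + [key]), 1 + mask))
    return result


if __name__ == "__main__":
    for keystroke, parameter in modifier_combinations("Return"):
        print(f"{parameter:2d}  {keystroke}")
```

Each keystroke string could be passed to `xdotool key`, but the sketch only builds the strings and does not invoke xdotool.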

Not all keyboard configurations contain all modifiers. My MacPorts configuration does:

xmodmap:  up to 2 keys per modifier, (keycodes in parentheses):

shift       Shift_L (0x40),  Shift_R (0x44)
lock        Caps_Lock (0x41)
control     Control_L (0x43),  Control_R (0x46)
mod1        Alt_L (0x42),  Alt_R (0x45)
mod2        Meta_L (0x3f),  Meta_R (0x47)
mod3      
mod4      
mod5

but others do not, e.g., Debian/testing:

xmodmap:  up to 4 keys per modifier, (keycodes in parentheses):

shift       Shift_L (0x32),  Shift_R (0x3e)
lock        Caps_Lock (0x42)
control     Control_L (0x25),  Control_R (0x69)
mod1        Alt_L (0x40),  Alt_R (0x6c),  Meta_L (0xcd)
mod2        Num_Lock (0x4d)
mod3      
mod4        Super_L (0x85),  Super_R (0x86),  Super_L (0xce),  Hyper_L (0xcf)
mod5        ISO_Level3_Shift (0x5c),  Mode_switch (0xcb)

The modify-keys.pl script does not take that into account; it is always possible to send keypress events directly to xterm without limiting those to the keys listed in the XKB configuration. On the other hand, although xdotool can send keypress events to pangoterm (a GDK application), GDK does not process those correctly. Typing directly into pangoterm, I find also that it does not recognize either the alt or meta key. Omitting those from the test does not help with the xdotool script; testing has to be done manually. Here is a screenshot showing the part of the test which it can do:

Here are some examples of the output from modify-keys.pl. Because my machine (and Bram's) uses keyboard layout “us” with model “pc105”, that combination has been tested best, e.g., using the xterm_keys program.

Other programs

Because Evans heavily promotes his alternative, this elicits some response. To date (late 2019) there are a few implementations, and some related discussion of applications which could be modified to work with the feature:

mintty provides an implementation (without documentation beyond mentioning “modifyOtherKeys”). The wiki changelog for 0.4.0 (probably by Andy Koppe) gives some insight:
```
### 0.4.0 (7 Jun 2009) ###
...
  * Xterm's "modifyOtherKeys" mode for encoding key combinations without
    standard keycodes is now supported, whereby the 'CSI u' format enabled by
    setting the "formatOtherKeys" resource to 1 in xterm is used.
```
To see what it does, read the source-code (i.e., see how the other_code function is used), or test it.

Four years later, Ren Victor requested that Emacs support it:
- Fwd: xterm/mintty control sequences support when formatOtherKeys = 1
- bug#13839: xterm/mintty control sequences support when formatOtherKeys = 1

In “Fix ambiguous terminal key strokes? [$35]” (neovim #176, 2014), Charles Strahan mentioned Tom Feist's attempt to incorporate the CSI-u format into iTerm2 (discussed in “[iterm2-discuss] Re: Alternative Keystroke generation” in 2011). The associated script, by the way, confused alt and meta. After several years, Nachman mentioned that he had added the feature (January 2019).

In the neovim issue, there is a comment by Evans:

leonerd commented on Mar 8, 2014

You mention wanting a standard. There is one. I wrote it
http://www.leonerd.org.uk/hacks/fixterms/
This is what libtermkey parses, and also what
libvterm/pangoterm will output.  xterm will /mostly/ output
that, but gets it a bit wrong in places depending on the value
of the formatOtherKeys setting.  It's either too eager or not
eager enough to use CSI u encoding.  But that's a relatively
small bug that should be easy enough for xterm to fix.

Others reading Evans' comments come away with the understanding that Evans is saying that xterm imitates pangoterm, and does a poor job at that. For instance, in “Add Escapes for Ctrl-Tab Shift-Ctrl-Tab” (vte #94), Martin Hostettler wrote:

\e[27;<mod>;<char>~ is "modify other" aka xterm's interpretation of leonerd's fixterm.

and later (after some discussion) altered his comment to say:

\e[27;<mod>;<char>~ is "modify other" aka xterm's interpretation of leonerd's fixterm.

edit: Actually it seems \e[27;<mod>;<char>~ predates fixterm (which proposes CSI u not 27) and is an xterm extension.

With more insight, Hostettler might have pointed out that xterm's CSI-u implementation predates fixterm by a few years.