# Wayland and input methods

Wayland is gradually gaining native support for input methods. Actually, it's gaining several kinds of support, because the functionality is split across several pieces. Working in this area, I have had to explain this to newcomers so often that I decided to write this blog post to explain what's going on here, once and for all.

## Text input

Quick recap. The purpose of an input method is twofold: to give applications text from the user, and to recognize when and what kind of text is expected.

*Figure: Input methods help the user enter text in an efficient way.*

The most basic building block for that under Wayland is the **text-input** protocol. It carries text from the compositor to applications, and it lets applications tell the compositor when they need text and what kind of text they need. The protocol doesn't concern itself with the user, leaving that to the compositor.

*Figure: Text input connects applications to the compositor.*
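The two directions described above can be sketched as a toy model. This is not a Wayland binding: `Application`, `Compositor`, and the method names are hypothetical stand-ins, loosely named after the idea of enabling text input and committing a string, and the `purpose` value is just an illustrative label.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """A toy client that owns one text field (hypothetical model)."""
    name: str
    received: list = field(default_factory=list)

    def on_commit_string(self, text):
        # Compositor -> application: finished text arrives here.
        self.received.append(text)


class Compositor:
    """A toy compositor that remembers who asked for text, and what kind."""

    def __init__(self):
        self.active = None   # application currently expecting text
        self.purpose = None  # kind of text it asked for

    def enable(self, app, purpose):
        # Application -> compositor: "a text field is focused, I need text".
        self.active, self.purpose = app, purpose

    def disable(self, app):
        # Application -> compositor: "I no longer need text".
        if self.active is app:
            self.active, self.purpose = None, None

    def commit(self, text):
        # Compositor -> application: deliver text only if someone asked.
        if self.active is None:
            return False
        self.active.on_commit_string(text)
        return True


compositor = Compositor()
editor = Application("editor")
compositor.enable(editor, purpose="email")
compositor.commit("user@example.org")
compositor.disable(editor)
```

Note that nothing in the sketch says where the compositor gets the text from. That is the part the protocol deliberately leaves open.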

## Input method

The compositor can take one of two paths to let the user input text. Either it takes on the burden of communicating with the user, by handling input itself, or it delegates that task to some other program.

*Figure: Input method inserts itself between compositor and privileged program.*

In Wayland, the **input-method** protocol was designed to help. It is very similar to *text-input*, because it lets *a program* send text *to the compositor*, and allows *the compositor* to tell *the program* what kind of text is needed. Notice the inversion! This time, the program is special. It communicates with the user, and then gives the text to the compositor. The compositor is then typically going to send the text onward to the currently focused application using *text-input*, creating a chain: special program → compositor → focused application.

*Figure: Input method is one, text inputs are many.*
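The chain from the special program through the compositor to the focused application can be sketched as follows. Everything here is a hypothetical model for illustration: the class and method names are made up, and real protocol traffic is asynchronous and carries much more state.

```python
class FocusedApp:
    """Toy application that accumulates the text it is given."""

    def __init__(self):
        self.text = ""

    def commit_string(self, s):
        self.text += s


class Compositor:
    """Toy compositor sitting between one input method and many apps."""

    def __init__(self):
        self.focused = None
        self.input_method = None

    def set_focus(self, app, purpose):
        # Tell the single input method what the newly focused app needs.
        self.focused = app
        if self.input_method is not None:
            self.input_method.activate(purpose)

    def commit_from_input_method(self, s):
        # Forward text onward to whichever application is focused.
        if self.focused is not None:
            self.focused.commit_string(s)


class InputMethod:
    """Toy 'special program': talks to the user, then to the compositor."""

    def __init__(self, compositor):
        self.compositor = compositor
        self.purpose = None
        compositor.input_method = self

    def activate(self, purpose):
        # Compositor -> program: what kind of text is needed now.
        self.purpose = purpose

    def user_typed(self, s):
        # Program -> compositor: here is the text from the user.
        self.compositor.commit_from_input_method(s)


compositor = Compositor()
ime = InputMethod(compositor)
editor, dialer = FocusedApp(), FocusedApp()

compositor.set_focus(editor, "normal")
ime.user_typed("hello")
compositor.set_focus(dialer, "digits")
ime.user_typed("42")
```

One input method serves both applications. Only the compositor knows which of them is focused.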

This protocol has room for some additional responsibilities, too. Because there is typically only one program using this protocol, it can do things which would not work with multiple programs. One of them is grabbing the keyboard, which will be familiar to users of CJKV languages. *Input-method* allows the special program to ask the compositor to send it all key presses ("exclusive grab"). Taking over the keyboard makes it possible to send the text "你好" when "nihao" is typed. The other extension is meant for creating a special pop-up window, which the compositor places next to the text field, and which can be used to show typing completions.
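To make the keyboard grab concrete, here is a minimal sketch of what a program might do with the key presses it receives. The two-entry `TABLE`, the greedy matching, and the use of a `"space"` key to confirm are all assumptions made for illustration; real input methods use large dictionaries and let the user pick between candidates.

```python
# Hypothetical, tiny conversion table; a real one has thousands of entries.
TABLE = {"ni": "你", "hao": "好"}


def convert(latin):
    """Greedy longest-match conversion; unknown letters pass through."""
    out, i = [], 0
    while i < len(latin):
        for j in range(len(latin), i, -1):
            if latin[i:j] in TABLE:
                out.append(TABLE[latin[i:j]])
                i = j
                break
        else:
            # No table entry starts here: keep the letter as typed.
            out.append(latin[i])
            i += 1
    return "".join(out)


class GrabbingInputMethod:
    """Toy input method that receives every key while the grab is active."""

    def __init__(self, commit):
        self.commit = commit  # stand-in for "send text to the compositor"
        self.preedit = ""     # letters typed but not yet converted

    def key_press(self, key):
        if key == "space":
            # Confirm: convert what was typed and hand the text over.
            if self.preedit:
                self.commit(convert(self.preedit))
            self.preedit = ""
        else:
            self.preedit += key


sent = []
ime = GrabbingInputMethod(sent.append)
for key in ["n", "i", "h", "a", "o", "space"]:
    ime.key_press(key)
```

Without the grab, the letters "nihao" would go straight to the focused application as key events, and the input method would never get a chance to replace them.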

## Virtual keyboard

Input method support would be complete at this point if all we cared about was text. However, the world is not so simple, and we have to deal with two more categories of input before this becomes useful:

- text in legacy applications which don't support *text-input*, and
- triggering actions which would normally need a keyboard.

Both of them can be handled with a keyboard. But what if we're using a tablet computer, a TV, a game console, or a phone, and there isn't one to speak of? We can work around this by emulating key presses. Again, there are two basic ways to do it: the compositor can come up with something on its own, or it can delegate the task to another program.

The protocol **virtual-keyboard** is designed for programs which want to tell the compositor to issue "fake" keyboard button press events.
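As a rough illustration of "fake" key events, the sketch below models a program asking the compositor to issue a press followed by a release. The names `KeyEvent`, `inject_key`, and `tap` are hypothetical, and keys are identified by plain strings here, whereas real keyboard events carry key codes that are interpreted through a keymap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeyEvent:
    key: str       # symbolic key name, a simplification for this sketch
    pressed: bool  # True for press, False for release


class Compositor:
    """Toy compositor that records the synthetic events it would forward."""

    def __init__(self):
        self.delivered = []

    def inject_key(self, event):
        # Stand-in for relaying the event to the focused application.
        self.delivered.append(event)


class VirtualKeyboard:
    """Toy client asking the compositor to issue key events on its behalf."""

    def __init__(self, compositor):
        self.compositor = compositor

    def tap(self, key):
        # A full key stroke is a press followed by a release.
        self.compositor.inject_key(KeyEvent(key, pressed=True))
        self.compositor.inject_key(KeyEvent(key, pressed=False))


compositor = Compositor()
keyboard = VirtualKeyboard(compositor)
keyboard.tap("Return")  # e.g. trigger "submit" in a form
```

From the application's point of view, these events look the same as ones coming from physical hardware, which is what makes this approach work with legacy applications.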

## Together

A fully-fledged input method program will be a Wayland client using the *input-method* protocol for submitting text, but also supporting *virtual-keyboard* for submitting actions, and as a fallback for legacy applications.

*Figure: Virtual keyboard in parallel to input method.*

A compositor would ferry text around between the input method program and whichever application is focused. It would also carry synthetic keyboard events from the input method program to the focused application.

An application consuming text would support *text-input*, and it would send enable and disable requests whenever a text input field gains or loses focus. It would also accept keyboard events.

Legacy applications won't send enable and disable, even when a text field is focused and the user is ready to type. When that happens, the compositor and the input method won't realize that text should be submitted now. If the input method uses an on-screen keyboard, it could remain hidden! Because of that, it's best to always make sure the user can bring up the input method and input text, which would then be delivered as keyboard events (which are always supported by applications).
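The fallback described above boils down to one decision, sketched here as a hypothetical helper. The function name, the tuple format, and the idea of sending one key per character are simplifying assumptions. In practice, text that the current keymap cannot express needs extra handling.

```python
def deliver(text, text_input_enabled):
    """Choose how text reaches the focused application (toy model).

    Returns the list of messages that would be sent.
    """
    if text_input_enabled:
        # Modern path: the application asked for text, so send it whole.
        return [("commit_string", text)]

    # Legacy path: the application never enabled text input, so fall back
    # to synthetic key presses and releases, one pair per character.
    messages = []
    for ch in text:
        messages.append(("key_press", ch))
        messages.append(("key_release", ch))
    return messages


modern = deliver("ok", text_input_enabled=True)
legacy = deliver("ok", text_input_enabled=False)
```

The same user action ends up as a single message on the modern path and as four key events on the legacy one.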

## Current state

As of 2020-08-15, the latest versions of relevant protocols are:

### text-input-unstable-v3

Accepted in wayland-protocols. Designed by me, based on *gtk-text-input* by Carlos Garnacho, and on *text-input-unstable-v1*.

### input-method-unstable-v2

Used in wlroots. Designed by me, based on *text-input-unstable-v3* and *input-method-unstable-v1*.

### virtual-keyboard-unstable-v1

Used in wlroots. Designed by me, based on the *wl_keyboard* interface.

## Future

There are still some open topics. The most important one is fixing the deficiencies in *text-input*, and updating *input-method* to match. Another is whether *virtual-keyboard* is even worth the effort, considering how it stirs up some conflict with Wayland's design.

Less important is implementing the additional features of *input-method*.

There's also the exploratory idea of designing a protocol dedicated to submitting actions like "undo", "submit", "next field", but not text, in order to eliminate the need to emulate keys in a modern keyboard.
