Fedora desktop Planet

Please welcome Star Labs to the LVFS

Posted by Richard Hughes on January 22, 2019 02:11 PM

A few weeks ago Sean from Star Labs asked me to start the process of joining the LVFS. Star Labs is a smaller Linux-friendly OEM based just outside London not far from where I grew up. We identified three types of firmware we could update, the system firmware, the EC firmware and the SSD firmware. The first should soon be possible with the recent pledge of capsule support from AMI, and we’ve got a binary for testing now. The EC firmware will need further work, although now I can get the IT8987 into (and out of) programming mode. The SSD firmware needed a fix to fwupd (to work around a controller quirk), but with the soon-to-be released fwupd it can already be updated:

Sean also shipped me some loaner hardware that could be recovered using manufacturing tools if I broke it, which makes testing the ITE EC flashing possible. The IT89 chip is subtly different to the IT87 chip in other hardware like the Clevo reference designs, but this actually makes it easier to support as there are fewer legacy modes. This will be a blog post all of its own.

Playing with the hardware intermittently for a few weeks, I’ve got a pretty good feel for the “Lap Top” and “Star Lite” models. There is a lot to like: the aluminium cases feel both solid and tactile (like an XPS 13) and the machines feel really “premium”, unlike the Clevo reference hardware. Star Labs doesn’t use the Clevo platform any more, which allows it to make some bolder system design choices. Some things I love: the LED IPS screen, USB-C charging, the trackpad and the keyboard. The custom keyboard design is a delight to use; I prefer it to my Lenovo P50 and XPS 13 for key placement and key travel. The touchpad seems responsive, and the virtual buttons work well, unlike some of the touchpads from other Linux-friendly OEMs. The battery life seems superb, although I’ve not really done enough discharge→charge→discharge cycles to be able to measure it properly. The front-facing camera is at the top of the bezel where it belongs, which is something the XPS 13 has only just fixed in the latest models. Nobody needs to see up my nose.

There are a few things that could be improved with the hardware in my humble opinion: The shiny bezel around the touchpad is somewhat distracting on an otherwise beautifully matte chassis. There is also only a microSD slot, when all my camera cards are full sized. The RAM is soldered in, and so can’t be upgraded in the future, and the case screws are not “captive” like on the new Lenovos. It also doesn’t seem to have a Thunderbolt interface, which might matter if you want to use this thing docked with a bazillion things plugged in. Some of these are probably cost choices; the Lap Top is significantly cheaper than the XPS 13 developer edition I keep comparing it against in my head.

I was also curious to try the customized Ubuntu install that was supplied with the hardware. It just worked, in every way, and for those installing other operating systems like Fedora or Arch, all the different distros have been pre-tested with extra notes – a really nice touch. This is where Star Labs really shines: these guys really care about Linux and it shows. I’ve been impressed with the Lap Top and I’ll be sad to return it when all the hardware is supported by fwupd and firmware releases are available on the LVFS.

So, if you’re using Star Drive hardware already then upgrade fwupd to the latest development version, enable the LVFS testing remote using fwupdmgr enable-remote lvfs-testing and tell us how the process goes. For technical reasons you need to power down the machine and power it back up rather than just doing a “warm” reboot. In a few weeks we’ll do a proper fwupd release and push the firmware to stable.
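For reference, the whole procedure from the terminal should look roughly like this (the first command is the one from above; output and device names will obviously vary):

fwupdmgr enable-remote lvfs-testing
fwupdmgr refresh
fwupdmgr get-updates
fwupdmgr update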

Phoenix joins the LVFS

Posted by Richard Hughes on January 09, 2019 02:15 PM

Just like AMI, Phoenix is a huge firmware vendor, providing the firmware for millions of machines. If you’re using a ThinkPad right now, you’re most probably using Phoenix code in your mainboard firmware. Phoenix have been working with Lenovo and their ODMs on LVFS support for a while, fixing all the niggles that were stopping the capsule from working with the loader used by Linux. Phoenix can help customers build deliverables for the LVFS that use UX capsule support to make flashing beautiful, although it’s up to the OEM whether that’s used or not.

It might seem slightly odd for me to be working with the firmware suppliers, rather than just OEMs, but I’m actually just doing both in parallel. From my point of view, both of the biggest firmware suppliers now understand the LVFS, and provide standards-compliant capsules by default. This should hopefully mean smaller Linux-specific OEMs like Tuxedo and Star Labs might be able to get signed UEFI capsules, rather than just getting a ROM file and an unsigned loader.

We’re still waiting for the last remaining huge OEM, but fingers crossed that should be any day now.

Fedora Firefox heads to updates with PGO/LTO.

Posted by Martin Stransky on January 08, 2019 10:29 AM

I’ve had lots of fun with GCC performance tuning at Fedora, but without many results. When Mozilla switched its official builds to clang I considered doing the same, due to the difficulties with the GCC PGO/LTO setup and the inferior speed of Fedora Firefox builds compared to Mozilla’s official builds.

That move woke up GCC fans to parry the threat. Lots of arguments about clang insecurity and missing features were brought to the relevant ticket. More importantly, upstream developer Honza Hubicka found and fixed a profile data generation bug (among others) and Jakub Jelinek worked out a GCC bug which caused a Firefox crash at startup.

That effort helped me to convince GCC to behave and thanks to those two guys Fedora can offer GCC Firefox builds with PGO (Profile-Guided Optimization) and LTO (Link-Time Optimization).

The new builds are waiting for you at Koji (Fedora 28, Fedora 29). Don’t hesitate to take them for a test drive; I use Speedometer as a general browser responsiveness benchmark. You can also compare them with official Mozilla builds, which are built with clang PGO/LTO.
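If you prefer to grab the RPMs directly, koji download-build can fetch a specific build; the NVR below is just a placeholder, so substitute whichever build is current for your release:

koji download-build --arch=x86_64 firefox-64.0-1.fc29
sudo dnf install ./firefox-64.0-*.fc29.x86_64.rpm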


Flatpak commandline design

Posted by Matthias Clasen on December 19, 2018 03:03 PM

Flatpak is made to run desktop apps, and there are apps like KDE Discover or GNOME Software that let you manage the Flatpak installations on your system.

Of course, we still need a way to handle Flatpak from the commandline. The flatpak commandline tool in 1.0 is powerful without being overwhelming (like git) and way friendlier than some other tools (for example, ostree).

But we can always do better. For Flatpak 1.2, we’ve gone back to the drawing board and done some designs for the commandline user experience (yes, that needs design too).

Powerful: Columns

Many Flatpak commands produce information in tabular form. You can list remotes, apps, documents, etc. And all of these have a bunch of information that can be displayed. It can be overwhelming when you only need a particular piece of information (and it can overflow the available space in a terminal).

To help with that, we’ve introduced a --columns option with which you can select which information you want to see.

You can explore the available columns using --columns=help. In Flatpak 1.2, all the list-producing commands will support the --columns option: list, search, remotes, remote-ls, ps, history.

One nice side-effect of having --columns is that we can add more information to the output of these commands without overflowing the display, by adding optional columns that are not included in the default output.
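For example, something like the following should work in 1.2 (the exact set of column names may differ slightly):

flatpak list --columns=help
flatpak list --columns=name,version,origin
flatpak remote-ls flathub --columns=name,version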

Concise: Errors

It happens all the time: I want to type search, but my c key is sticky, and it comes out as searh. Flatpak used to react to an unknown command by dumping out its --help output, which is long and a bit overwhelming.

Now we try to be more concise and more helpful at the same time, by making a guess at what was meant, and just pointing out the --help option.

Friendly: Search

In the same vein, the reverse-DNS style application ID that Flatpak relies on has been criticized as unwieldy and hard to handle. Nobody wants to type

flatpak install flathub org.gnome.meld

and commandline completion does not help too much if you have no idea what the application ID might be.

Thankfully, that is no longer necessary. With Flatpak 1.2, you can type

flatpak install meld

and Flatpak will ask you a few questions to confirm which remote to use and what exact application you meant. Much friendlier. This search also works for the uninstall command, and may be added to more commands in the future.

Informative: Descriptions

Flatpak repos contain appstream data describing the apps in detail. That is what e.g. GNOME Software uses on its detail page for an application.

So far, the Flatpak commandline has not used appstream data at all. But that is changing. In 1.2, a number of commands will show useful information from appstream data, such as the description shown here by the list and info commands:

If you pay close attention you may notice that column names can be abbreviated with --columns.

Fun: Prompts

Here’s the commandline version of theming. We now set a custom prompt to let you know what context you are in when using a shell in a flatpak sandbox.

You can customize the prompt using

flatpak override --env=PS1="..."

The application ID for the sandbox is available in the FLATPAK_ID environment variable.
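For example, a prompt that shows the application ID could be set roughly like this (bash expands $FLATPAK_ID when displaying the prompt; adjust to taste):

flatpak override --user --env=PS1='[$FLATPAK_ID \W]\$ '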

The beast: Progress

Updates and installs can take a long time – there are possibly big downloads and there may be dependencies that need to be updated as well, etc. For a good user experience it is  important to provide some feedback on the progress and the expected remaining time.

The Ostree library which Flatpak uses for downloads provides progress information, but it is very detailed and hard to digest.  To make this even more challenging, there is not that much space in a terminal window to display all the relevant information.

For 1.2, we’ve come up with  a combination of a table that gets updated to display overall status and a progress line for the current operation:

(Video demo of an install: https://blogs.gnome.org/mclasen/files/2018/12/install.webm)

A similar, but simpler layout is used for uninstalls.

(Video demo of an uninstall: https://blogs.gnome.org/mclasen/files/2018/12/uninstall.webm)

Coming Soon

All of these improvements will appear in Flatpak 1.2, hopefully very soon after the New Year. Something to look forward to.  📦📦📦

Understanding HID report descriptors

Posted by Peter Hutterer on December 15, 2018 04:47 AM

This time we're digging into HID - Human Interface Devices and more specifically the protocol your mouse, touchpad, joystick, keyboard, etc. use to talk to your computer.

Remember the good old days when you had to install a custom driver for every input device? Remember when PS/2 (the protocol) had to be extended to accommodate mouse wheels, and then again for five-button mice? And you had to select the right protocol to make it work. Yeah, me neither, I tend to suppress those memories because the world is awful enough as it is.

As users we generally like devices to work out of the box. Hardware manufacturers generally like to add bits and bobs because otherwise who would buy that new device when last year's device looks identical. This difference in needs can only be solved by one superhero: Committee-man, with the superpower to survive endless meetings and get RFCs approved.

Many many moons ago, when USB itself was in its infancy, Committee man and his sidekick Caffeine boy got the USB consortium to agree on a standard for input devices that is so self-descriptive that operating systems (Win95!) can write one driver that can handle this year's device, and next year's, and so on. No need to install extra drivers, your device will just work out of the box. And so HID was born. This may only be an approximate summary of history.

Originally HID was designed to work over USB. But just like Shrek the technology world is obsessed with layers so these days HID works over different transport layers. HID over USB is what your mouse uses, HID over i2c may be what your touchpad uses. HID works over Bluetooth and its celebrity-diet version, BLE. Somewhere, someone out there is very slowly moving a mouse pointer by sending HID over carrier pigeons just to prove a point. Because there's always that one guy.

HID is incredibly simple in that the static description of the device can just be bytes burnt into the ROM like the Australian sun into unprepared English backpackers. And the event frames are often an identical series of bytes where every bit is filled in by the firmware according to the axis/buttons/etc.

HID is incredibly complicated because parsing it is a stack-based mental overload. Each individual protocol item is simple but getting it right and all into your head is tricky. Luckily, I'm here for you to make this simpler to understand or, failing that, at least more entertaining.

As said above, the purpose of HID is to make devices describe themselves in a generic manner so that you can have a single driver handle any input device. The idea is that the host parses that standard protocol and knows exactly how the device will behave. This has worked out great, we only have around 200 files dealing with vendor- and hardware-specific HID quirks as of v4.20.

HID messages are Reports. And to know what a Report means and how to interpret it, you need a Report Descriptor. That Report Descriptor is static and contains a series of bytes detailing "what" and "where", i.e. what a sequence of bits represents and where to find those bits in the Report. So let's try and parse one of those Report Descriptors, let's say for a fictional mouse with a few buttons. How exciting, we're at the forefront of innovation here.

The Report Descriptor consists of a bunch of Items. A parser reads the next Item, processes the information within and moves on. Items are small (1 byte header, 0-4 bytes payload) and generally carry exactly one tiny little bit of information. You need to accumulate several Items to build up enough information to actually know what's happening.

The "what" question of the Report Descriptor is answered with the so-called Usage. This could be something simple like X or Y (0x30 and 0x31) or something more esoteric like System Menu Exit (0x88). A Usage is 16 bits but all Usages are grouped into so-called Usage Pages. A Usage Page too is a 16 bit value and together they form the 32-bit value that tells us what the device can do. Examples:


0001 0031 # Generic Desktop, Y
0001 0088 # Generic Desktop, System Menu Exit
0003 0005 # VR Controls, Head Tracker
0003 0006 # VR Controls, Head Mounted Display
0007 0031 # Keyboard, Keyboard \ and |
Note how the Usage in the last item is the same as the first one; without the Usage Page you will mix things up. It helps if you always think of the Usage as a 32-bit number. For your kids' bed-time story time, here are the HID Usage Tables from 2004 and the approved HID Usage Table Review Requests of the last decade. Because nothing puts them to sleep quicker than droning on about hex numbers associated with remote control buttons.
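As a tiny illustration of that mental model (just a sketch, not code from any real parser):

def usage32(usage_page, usage):
    # the Usage Page ends up in the high 16 bits, the Usage in the low 16
    return (usage_page << 16) | usage

usage32(0x01, 0x31)   # 0x00010031, Generic Desktop / Y
usage32(0x07, 0x31)   # 0x00070031, Keyboard / "Keyboard \ and |"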

To successfully interpret a Report from the device, you need to know which bits have which Usage associated with them. So let's go back to our innovative mouse. We would want a report descriptor with 6 items like this:


Usage Page (Generic Desktop)
Usage (X)
Report Size (16)
Usage Page (Generic Desktop)
Usage (Y)
Report Size (16)
This basically tells the host: X and Y both have 16 bits. So if we get a 4-byte Report from the device, we know two bytes are for X, two for Y.

HID was invented at a time when bits were more expensive than printer ink, so we can't afford to waste any bits (still the case because who would want to spend an extra penny on more ROM). HID makes use of so-called Global items: once those are set, their value applies to all following items until changed. Usage Page and Report Size are such Global items, so the above report descriptor is really implemented like this:


Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)
The Report Count just tells us that 2 fields of the current Report Size are coming up. We have two usages, two fields, and 16 bits each so we know what to do. The Input item is sort-of the marker for the end of the stack, it basically tells us "process what you've seen so far", together with a few flags. Rel in this case means that the Usages are relative. Oh, and Input means that this is data from device to host. Output would be data from host to device, e.g. to set LEDs on a keyboard. There's also Feature which indicates configurable items.

Buttons on a device are generally just numbered so it'd be a monumental 16-bits-at-a-time waste to have HID send Usage (Button1), Usage (Button2), etc. for every button on the device. HID instead provides a Usage Minimum and Usage Maximum to order them sequentially. This looks like this:


Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)
So we have 5 buttons here and each button has one bit. Note how the buttons are Abs because a button state is not a relative value, it's either down or up. HID is quite intolerant to Schrödinger's thought experiments.

Let's put the two things together and we have an almost-correct Report descriptor:


Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)

Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)

Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)
New here is Cnst. This signals that the bits have a constant value, thus don't need a Usage and basically don't matter (haha. yeah, right. in theory). Linux does indeed ignore those. Cnst is used for padding to align on byte boundaries - 5 bits for buttons plus 3 bits padding make 8 bits. Which makes one byte as everyone agrees except for granddad over there in the corner. I don't know how he got in.

Were we to get a 5-byte Report from the device, we'd parse it approximately like this:


button_state = bytes[0] & 0x1f
x = bytes[1] | (bytes[2] << 8)
y = bytes[3] | (bytes[4] << 8)
Hooray, we're almost ready. Except not. We may need more info to correctly interpret the data within those reports.

The Logical Minimum and Logical Maximum specify the value range of the actual data. We need this to tell us whether the data is signed and what the allowable range is. Together with the Physical Minimum and the Physical Maximum they specify what the values really mean. In the simple case:


Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Logical Minimum (-32767)
Logical Maximum (32767)
Input (Data,Var,Rel)
This just means our x/y data is signed. Easy. But consider this combination:

...
Logical Minimum (0)
Logical Maximum (1)
Physical Minimum (1)
Physical Maximum (12)
This means that if the bit is 0, the effective value is 1. If the bit is 1, the effective value is 12.
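A generic parser handles this with a plain linear mapping from the Logical range to the Physical range; roughly like this (a sketch, not the kernel's actual code):

def to_physical(value, log_min, log_max, phys_min, phys_max):
    # linearly interpolate the raw logical value into the physical range
    return phys_min + (value - log_min) * (phys_max - phys_min) / (log_max - log_min)

to_physical(0, 0, 1, 1, 12)   # -> 1.0
to_physical(1, 0, 1, 1, 12)   # -> 12.0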

Note that the above is one report only. Devices may have multiple Reports, indicated by the Report ID. So our Report Descriptor may look like this:


Report ID (01)
Usage Page (Button)
Usage Minimum (1)
Usage Maximum (5)
Report Count (5)
Report Size (1)
Input (Data,Var,Abs)
Report Size (3)
Report Count (1)
Input (Cnst,Arr,Abs)

Report ID (02)
Usage Page (Generic Desktop)
Usage (X)
Usage (Y)
Report Count (2)
Report Size (16)
Input (Data,Var,Rel)
If we were to get a Report now, we need to check byte 0 for the Report ID so we know what this is. i.e. our single-use hard-coded parser would look like this:

if bytes[0] == 0x01:
    button_state = bytes[1] & 0x1f
else if bytes[0] == 0x02:
    x = bytes[2] | (bytes[3] << 8)
    y = bytes[4] | (bytes[5] << 8)
A device may use multiple Reports if the hardware doesn't gather all data within the same hardware bits. Now, you may ask: if I get fifteen reports, how should I know what belongs together? Good question, and lucky for you the HID designers are miles ahead of you. Report IDs are grouped into Collections.

Collections can have multiple types. An Application Collection describes a set of inputs that make sense as a whole. Usually, every Report Descriptor must define at least one Application Collection but you may have two or more. For example, a keyboard with integrated trackpoint should and/or would use two. This is how the kernel knows it needs to create two separate event nodes for the device. Application Collections have a few reserved Usages that indicate to the host what type of device this is. These are e.g. Mouse, Joystick, Consumer Control. If you ever wondered why you have a device named like "Logitech G500s Laser Gaming Mouse Consumer Control" this is the kernel simply appending the Application Collection's Usage to the device name.

A Physical Collection indicates that the data is collected at one physical point though what a point is is a bit blurry. Theoretical physicists will disagree but a point can be "a mouse". So it's quite common for all reports on a mouse to be wrapped in one Physical Collection. If you have a device with two sets of sensors, you'd have two collections to illustrate which ones go together. Physical Collections also have reserved Usages like Pointer or Head Tracker.

Finally, a Logical Collection just indicates that some bits of data belong together, whatever that means. The HID spec uses the example of buffer length field and buffer data but it's also common for all inputs from a mouse to be grouped together. A quick check of my mice here shows that Logitech doesn't wrap the data into a Logical Collection but Microsoft's firmware does. Because where would we be if we all did the same thing...

Anyway. Now that we know about collections, let's look at a whole report descriptor as seen in the wild:


Usage Page (Generic Desktop)
Usage (Mouse)
Collection (Application)
  Usage Page (Generic Desktop)
  Usage (Mouse)
  Collection (Logical)
    Report ID (26)
    Usage (Pointer)
    Collection (Physical)
      Usage Page (Button)
      Usage Minimum (1)
      Usage Maximum (5)
      Report Count (5)
      Report Size (1)
      Logical Minimum (0)
      Logical Maximum (1)
      Input (Data,Var,Abs)
      Report Size (3)
      Report Count (1)
      Input (Cnst,Arr,Abs)
      Usage Page (Generic Desktop)
      Usage (X)
      Usage (Y)
      Report Count (2)
      Report Size (16)
      Logical Minimum (-32767)
      Logical Maximum (32767)
      Input (Data,Var,Rel)
      Usage (Wheel)
      Physical Minimum (0)
      Physical Maximum (0)
      Report Count (1)
      Report Size (16)
      Logical Minimum (-32767)
      Logical Maximum (32767)
      Input (Data,Var,Rel)
    End Collection
  End Collection
End Collection
We have one Application Collection (Generic Desktop, Mouse) that contains one Logical Collection (Generic Desktop, Mouse). That contains one Physical Collection (Generic Desktop, Pointer). Our actual Report (and we have only one but it has the decimal ID 26) has 5 buttons, two 16-bit axes (x and y) and finally another 16 bit axis for the Wheel. This device will thus send 8-byte reports and our parser will do:

if bytes[0] != 0x1a: # it's decimal in the above descriptor
    error, should be 26
button_state = bytes[1] & 0x1f
x = bytes[2] | (bytes[3] << 8)
y = bytes[4] | (bytes[5] << 8)
wheel = bytes[6] | (bytes[7] << 8)
That's it. Now, obviously, you can't write a parser for every HID descriptor out there so your actual parsing code needs to be generic. The Linux kernel does exactly that and so does everything else that needs to parse HID. There's a huge variety in devices out there, all with HID descriptors that may or may not be correct. As with so much in life, correct HID implementations are often defined by "whatever Windows accepts" so if you like playing catch, Linux development is for you.

Oh, in case you just got a bit too optimistic about the state of the world: HID allows for vendor-defined usages. Which does exactly what you'd think it does, it hides vendor-specific protocol inside what should be a generic protocol. There are devices with hidden report IDs that you can only unlock by sending the right magic sequence to the report and/or by defeating the boss on Level 4. Usually those devices present themselves as basic/normal devices over HID but if you know the magic sequence you get to use *gasp* all buttons. Or access the device-specific configuration features. Logitech's HID++ is just one example here but at least that's one where we have most of the specs available.

The above describes how to parse the HID report descriptor and interpret the reports. But what happens once you have a HID report correctly parsed? In the case of the Linux kernel, once the report descriptor is parsed evdev nodes are created (one per Application Collection, more or less). As the Reports come in, they are mapped to evdev codes and the data appears on the evdev node. That's where userspace like libinput can pick it up. That bit is actually quite simple (mostly anyway).

The above output was generated with the tools from the hid-tools repository. Go forth and hid-record.

Firmware Attestation

Posted by Richard Hughes on December 14, 2018 08:23 PM

When fwupd writes firmware to devices, it often writes it, then does a verify pass. This is to read back the firmware to check that it was written correctly. For some devices we can do one better, and read the firmware hash and compare it against a previously cached value, or match it against the version published by the LVFS. This means we can detect some unintentional corruption or malicious firmware running on devices, on the assumption that the bad firmware isn’t just faking the requested checksum. Still, better than nothing.

Any processor better than the most basic PIC or Arduino (e.g. even a tiny $5 ARM core) is capable of doing public/private key firmware signing. This would use standard crypto using X.509 keys or GPG to ensure the device only runs signed firmware. This protects against both accidental bitflips and also naughty behaviour, and is unofficial industry recommended practice for firmware updates. Older generations of the Logitech Unifying hardware were unsigned, and this made the MouseJack hack almost trivial to deploy on an unmodified dongle. Newer Unifying hardware requires a firmware image signed by Logitech, which makes deploying unofficial or modified firmware almost impossible.

There is a snag with UEFI capsule updates, which is how you probably applied your last “BIOS” firmware update. Although the firmware capsule is signed by the OEM or ODM, we can’t reliably read the SPI EEPROM from userspace. It’s fair to say flashrom does work on some older hardware but it also likes disabling keyboard controllers and making the machine reboot when probing hardware. We can get a hash of the firmware, or rather, a hash derived from the firmware image with other firmware-related things added for good measure. This is helpfully stored in the TPM chip, which most modern laptops have installed.

Although the SecureBoot process cares about the higher PCR values to check all manner of userspace, we only care about index zero of this register, the so-called PCR0. If you change your firmware, for any reason, the PCR0 will change. There is one PCR0 checksum (or a number slightly higher than one, for reasons) on all hardware of a given SKU. If you somehow turn the requirement for the hardware signing key off on your machine (e.g. due to a newly found security issue), or your firmware is flashed using another method than UpdateCapsule (e.g. DediProg) then you can basically flash anything. This would be unlikely, but really bad.

If we include the PCR0 in the vendor-supplied firmware.metainfo.xml file, or set it in the admin console of the LVFS then we can verify that the firmware we’re running right now is the firmware the ODM or OEM uploaded. This means you can have firmware 100% verified, where you’re sure that the firmware version that was uploaded by the vendor is running on your machine right now. This is good.
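If you are curious what your own PCR0 looks like, you can read it from the TPM today; on a TPM 1.2 system the sysfs file below works, and with tpm2-tools the exact command depends on the version you have installed:

grep PCR-00 /sys/class/tpm/tpm0/pcrs
tpm2_pcrlist -L sha1:0     # older tpm2-tools; newer versions use: tpm2_pcrread sha1:0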

As an incentive for vendors to support signing, there’ll soon be an easy-to-understand shield system on the LVFS. A wooden shield means the firmware was uploaded to the LVFS by the OEM or authorized ODM on behalf of the OEM. A plain metal shield means the above, plus the firmware is signed using strong encryption. A crested shield means the vendor is trusted, the firmware is signed, and we can do secure attestation and be sure the firmware hasn’t been tampered with.

Obviously some protocols can’t get either the last, or the last two shield types (e.g. ColorHug, where even symmetric crypto isn’t good enough) but that’s okay. It’s still more secure than flashing a random binary from an FTP site, which is what most people were doing before. Not upstream yet, and not quite finished, so comments welcome.

The tools of libfprint

Posted by Bastien Nocera on December 14, 2018 04:04 PM
libfprint, the fingerprint reader driver library, is nearing a 1.0 release.

Since the last time I reported on the status of the library, we've made some headway modernising the library, using a variety of different tools. Let's go through them and how they were used.

Callcatcher

When libfprint was in its infancy, Daniel Drake found that the NBIS fingerprint processing library matched what was required to provide fingerprint matching algorithms, and imported it into libfprint. Since then, the code in this copy-pasted library in libfprint has stayed the same. When updating it to the latest available version (from 2015 rather than 2007), as well as splitting off a patch to make it easier to update the library again in the future, I used Callcatcher to cull the unused functions.

Callcatcher is not a "production-level" tool (too many false positives, lack of support for many common architectures, etc.), but coupled with manual checking, it allowed us to greatly reduce the number of functions in our copy, so they weren't reported when using other source code quality checking tools.

LLVM's scan-build

This is a particularly easy one to use as its use is integrated into meson, and available through ninja scan-build. The output of the tool, whether on stderr or on the HTML pages, is pretty similar to Coverity's, but the tool is free, and easily integrated into a CI (once you've fixed all the bugs, obviously). We found plenty of possible memory leaks and uninitialised variables using this, with more flexibility than using Coverity's web interface, and avoiding going through hoops when using its "source code check as a service" model.
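If you want to try it on your own meson-based project, the invocation is roughly this (scan-build itself needs to be installed for the target to appear):

meson _build
ninja -C _build scan-build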

cflow and callgraph

LLVM has another tool, called callgraph. It's not yet integrated into meson, so it was a bit of a problem to get some output out of it. But combined with cflow, we used it to find where certain functions were called, trying to find the origin of some variables (whether they were internal or device-provided, for example), which helped with implementing additional guards and assertions in some parts of the library, in particular inside the NBIS sub-directory.

0.99.0 is out

We're not yet completely done with the first pass at modernising libfprint and its ecosystem, but we released an early Yule present with version 0.99.0. It will be integrated into Fedora after the holidays if the early testing goes according to plan.

We also expect a great deal from our internal driver API reference. If you have a fingerprint reader that's unsupported, contact your laptop manufacturer about them providing a Linux driver for it and point them at this documentation.

A number of laptop vendors are already asking their OEM manufacturers to provide drivers to be merged upstream, but a little nudge probably won't hurt.

Happy holidays to you all, and see you for some more interesting features in the new year.

High resolution wheel scrolling on Linux v4.21

Posted by Peter Hutterer on December 12, 2018 04:27 AM

Disclaimer: this is pending for v4.21 and thus not yet in any kernel release.

Most wheel mice have a physical feature to stop the wheel from spinning freely. That feature is called detents, notches, wheel clicks, stops, or something like that. On your average mouse that is 24 wheel clicks per full rotation, resulting in the wheel rotating by 15 degrees before its motion is arrested. On some other mice that angle is 18 degrees, so you get 20 clicks per full rotation.

Of course, the world wouldn't be complete without fancy hardware features. Over the last 10 or so years devices have added free-wheeling scroll wheels or scroll wheels without distinct stops. In many cases wheel behaviour can be configured on the device, e.g. with Logitech's HID++ protocol. A few weeks back, Harry Cutts from the chromium team sent patches to enable Logitech high-resolution wheel scrolling in the kernel. Succinctly, these patches added another axis next to the existing REL_WHEEL named REL_WHEEL_HI_RES. Where available, the latter axis would provide finer-grained scroll information than the click-by-click REL_WHEEL. At the same time I accidentally stumbled across the documentation for the HID Resolution Multiplier Feature. A few patch revisions later and we now have everything queued up for v4.21. Below is a summary of the new behaviour.

The kernel will continue to provide REL_WHEEL as axis for "wheel clicks", just as before. This axis provides the logical wheel clicks, (almost) nothing changes here. In addition, a REL_WHEEL_HI_RES axis is available which allows for finer-grained resolution. On this axis, the magic value 120 represents one logical traditional wheel click but a device may send a fraction of 120 for a smaller motion. Userspace can either accumulate the values until it hits a full 120 for one wheel click or it can scroll by a few pixels on each event for a smoother experience. The same principle is applied to REL_HWHEEL and REL_HWHEEL_HI_RES for horizontal scroll wheels (which these days is just tilting the wheel). The REL_WHEEL axis is now emulated by the kernel and simply sent out whenever we have accumulated 120.
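As a rough sketch of the accumulating approach in userspace (hypothetical handler and callback names, not libinput's actual code):

acc = 0

def handle_hi_res_event(value):
    # value is a signed fraction of 120, e.g. 15 per event on a multiplier-8 mouse
    global acc
    acc += value
    while abs(acc) >= 120:
        direction = 1 if acc > 0 else -1
        emit_wheel_click(direction)   # hypothetical callback into the scroll handling
        acc -= direction * 120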

Important to note: REL_WHEEL and REL_HWHEEL are now legacy axes and should be ignored by code handling the respective high-resolution version.

The magic value of 120 is taken directly from Windows. That value was chosen because it has a good number of integer factors, so dividing 120 by whatever multiplier the mouse uses gives you an integer fraction of 120. And because HW manufacturers want it to work on Windows, we can rely on them doing it right, provided we use the same approach.

There are two implementations that matter. Harry's patches enable the high-resolution scrolling on Logitech mice which seem to mostly have a multiplier of 8 (i.e. REL_WHEEL_HI_RES will send eight events with a value of 15 before REL_WHEEL sends 1 click). There are some interesting side-effects with e.g. the MX Anywhere 2S. In high-resolution mode with a multiplier of 8, a single wheel movement does not always give us 8 events, the firmware does its own magic here. So we have some emulation code in place with the goal of making the REL_WHEEL event happen on the mid-point of a wheel click motion. The exact point can shift a bit when the device sends 7 events instead of 8 so we have a few extra bits in place to reset after timeouts and direction changes to make sure the wheel behaviour is as consistent as possible.

The second implementation is for the generic HID protocol. This was all added for Windows Vista, so we're only about a decade behind here. Microsoft got the Resolution Multiplier feature into the official HID documentation (possibly in the hope that other HW manufacturers implement it which afaict didn't happen). This feature effectively provides a fixed value multiplier that the device applies in hardware when enabled. It's basically the same as the Logitech one except it's set through a HID feature instead of a vendor-specific protocol. On the devices tested so far (all Microsoft mice because no-one else seems to implement this) the multipliers vary a bit, ranging from 4 to 12. And the exact behaviour varies too. One mouse behaves correctly (Microsoft Comfort Optical Mouse 3000) and sends more events than before. Other mice just send the multiplied value instead of the normal value, so nothing really changes. And at least one mouse (Microsoft Sculpt Ergonomic) sends the tilt-wheel values more frequently and with a higher value. So instead of one event with value 1 every X ms, we now get an event with value 3 every X/4 ms. The mice tested do not drop events like the Logitech mice do, so we don't need fancy emulation code here. Either way, we map this into the 120 range correctly now, so userspace gets to benefit.

As mentioned above, the Resolution Multiplier HID feature was introduced for Windows Vista which is... not the most recent release. I have a strong suspicion that Microsoft dumped this feature as well, the most recent set of mice I have access to don't provide the feature anymore (they have vendor-private protocols that we don't know about instead). So the takeaway for all this is: if you have a Logitech mouse, you'll get higher-resolution scrolling on v4.21. If you have a Microsoft mouse a few years old, you may get high-resolution wheel scrolling if the device supports it. Any other vendor or a new Microsoft mouse, you don't get it.

Coincidentally, if you know anyone at Microsoft who can provide me with the specs for their custom protocol, I'd appreciate it. We'd love to have support for it both in libratbag and in the kernel. Or any other vendor, come to think of it.

AMI joins the LVFS

Posted by Richard Hughes on December 07, 2018 11:49 AM

American Megatrends Inc. may not be a company you’ve heard of, unless perhaps you like reading early-boot BIOS messages. AMI is the world’s largest BIOS firmware vendor, supplying firmware and tools to customers such as Asus, Clevo, Intel, AMD and many others. If you’ve heard of a vendor using Aptio for firmware updates, that means it’s from them. AMI has been testing the LVFS, UpdateCapsule and fwupd for a few months and is now fully compatible. They are updating their whitepapers for customers explaining the process of generating a capsule, using the ESRT, and generating deliverables for the LVFS.

This means “LVFS Support” becomes a first class citizen alongside Windows Update for the motherboard manufacturers. This should trickle down to the resellers, so vendors using Clevo motherboards like Tuxedo get LVFS support almost for free. This will take a bit of time to trickle down to the smaller OEMs.

Also, expect another large vendor announcement soon. It’s the one quite a few people have been waiting for.

Flatpaks in Fedora – now live

Posted by Owen Taylor on December 04, 2018 07:42 PM


I’m pleased to announce that we now have full initial support for Flatpak creation in Fedora infrastructure: Flatpaks can be built as containers, pushed to testing and stable via Bodhi, and installed by users from registry.fedoraproject.org through the command line, GNOME Software, or KDE Discover.

The goal of this work has been to enable creating Flatpaks from Fedora packages on Fedora infrastructure – this will expand the set of Flatpaks that are available to all Flatpak users, provide a runtime that gets updates as bugs and security fixes appear in Fedora, and provide Fedora users, especially on Fedora Silverblue, with an out-of-the-box set of Flatpak applications enabled by default.

At a technical level, a very brief summary of the approach we’ve taken is to take Fedora RPMs, rebuild them with prefix=/app using the Fedora modularity framework, then, using the same container build service we use for server-side containers, create Flatpaks as OCI images. Flatpak has been extended to know how to browse, download, and install Flatpaks packaged as OCI images from a container registry. See my talk at DevConf.CZ last year for a slightly longer introduction to what we are doing.

Right now, there are only a few applications in the registry, but we will work to build up the set of applications over the next few months, and hopefully by the time that Fedora 30 comes out in the spring, will have something that will be genuinely useful for Fedora users and that can be enabled by default in Fedora Workstation.

Special thanks to Clement Verna, Randy Barlow, Kevin Fenzi, and the rest of the Fedora infrastructure team for a lot of help in making the Fedora deployment happen, as well as to the OSBS team and Alex Larsson!

Using it

From the command line, either add the stable remote:

flatpak remote-add fedora oci+https://registry.fedoraproject.org

Or add the testing remote:

flatpak remote-add fedora-testing oci+https://registry.fedoraproject.org#testing

There is no point in adding both, since all Flatpaks in the stable remote are also in the testing remote. (The plan is that eventually most users will simply use the stable remote, and there will be links in Bodhi to make it easy to install a single application from testing.)

Note that some fixes were needed to make OCI remotes work properly system-wide. These were backported to the Fedora Flatpak-1.0.6 packages, but are only in master upstream, so if you aren’t using Fedora, you should add the remote per-user instead.
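In that case the command is the same, just with the --user flag added:

flatpak remote-add --user fedora oci+https://registry.fedoraproject.org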

Creating Flatpaks

If you are a Fedora packager and want to create a Flatpak of a graphical application you maintain, or want to help out creating Flatpaks in general, see the packager documentation. There is also a list of applications that are easy to package as Flatpaks.

Future work

One thing that should make things easier for packagers is the flatpak common module – by depending on this module, different Flatpaks can share the same binary builds of common libraries. This is particularly important for libraries that take a long time to build, or when an application needs to bundle a large set of libraries (think KDE or TeX). The current flatpak-common is a prototype, and there needs to be some thought given to the policies and tools for updating it.

Automatic rebuilds of Flatpaks are essential: when a fix is added to a library that an application bundles, the module and Flatpak should automatically be rebuilt without the maintainer of the Flatpak having to know that there was something to do – and then the maintainer should be notified, either with a link to a build failure, or, more hopefully, with a link to a Bodhi update that was automatically filed. Adding Flatpak support to Freshmaker and deploying it for Fedora is probably the right course here – though not a small task.

Another thing that I hope to address in the near term is signatures. Right now the authenticity is checked by downloading a master index from https://registry.fedoraproject.org that contains hashes for the latest versions of applications. But having signatures on the images would add further protection against tampering, and depending on how they were implemented, could allow things like third-party signatures added by an organization’s IT department. There’s quite a bit of complexity here, because there are multiple competing signature frameworks to coordinate: not just Flatpak’s native signatures, but multiple different ways of signing container images.

A more long-term goal is to create a way to download updates to Flatpak container images as deltas, so that every update is not a full download. Reusing OCI images for Fedora Flatpaks has strong advantages for creating a common ecosystem between server applications and desktop applications, but on the server side, reducing bandwidth usage for server updates was usually not an important consideration, so the distribution strategy is simply to download everything from scratch. Hopefully work here could be shared between Flatpaks and the server usage of OCI images.

A PolicyKit refresher

Posted by Matthias Clasen on December 04, 2018 05:08 AM

PolicyKit has been around for a long time, and it is a mature system that basically does its job. I recently spent some time to improve the PolicyKit integration in Flatpak and thought it might be useful to write up some of the details.

Details, details

It is irritating when a dialog  pops up unexpectedly and asks questions without providing sufficient details to really know where it is coming from and what the context is. Like

Authentication is required to install software

It would be much better to say what software is being installed, and where it is coming from:

PolicyKit lets you do this by using variables in your message and providing the replacement for them via a PolkitDetails object:

Authentication is required to install $(ref) from $(origin)
details = polkit_details_new ();  /* provides the values for $(origin) and $(ref) */
polkit_details_insert (details, "origin", origin);
polkit_details_insert (details, "ref", ref);

No display – no problem

It’s nice to get a PolicyKit dialog when you are using a desktop app that needs to carry out a privileged operation. But PolicyKit is also used by commandline tools, such as flatpak. If you are running a command in a terminal, a dialog might still be ok. But what if you are using flatpak on the console? A dialog is not available there, so privileged operations will fail.

PolicyKit provides the necessary plumbing to solve this situation, by letting apps register their own ‘agent’, which can handle authorization requests if no other agent is available. (In a graphical session, GNOME Shell provides an agent.)

Glancing over some details,  the code to do this looks roughly like this:

listener = polkit_agent_text_listener_new (NULL,
                                         &error);
polkit_agent_listener_register (listener,
            flags, subject, NULL, NULL, &error);

And it yields a text-mode authentication prompt in the terminal. If PolicyKit’s built-in text agent does not fit your needs, you can implement a PolkitAgentListener yourself. That is what I ended up doing for Flatpak.

Psst, don’t interrupt

As I said earlier, unexpected dialogs are annoying. Ideally, PolicyKit dialogs should only ever appear in response to a direct user action. For example, a dialog is ok if I am clicking the “Install” button in GNOME Software, but not if GNOME Software decides on it own that it is time to install some updates.

PolicyKit has a means to achieve this, by not passing the

POLKIT_CHECK_AUTHORIZATION_FLAGS_ALLOW_USER_INTERACTION

flag when checking for authorization. But this check happens in the system service (or mechanism, in PolicyKit lingo), not in your client. In order to take advantage of this flag, the client needs to pass information about  user interaction along whenever it calls a privileged method of the mechanism.

In the Flatpak case, I added a ‘no-interaction’ flag to all the flatpak-system-helper methods, and library API that GNOME Software can use to set this flag for background operations.
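On the mechanism side this boils down to choosing the flags for the authorization check; roughly like this (a sketch with a made-up action id, not the actual flatpak-system-helper code):

/* only let PolicyKit interact with the user when the client did not
 * mark the operation as a background (non-interactive) one */
flags = no_interaction
    ? POLKIT_CHECK_AUTHORIZATION_FLAGS_NONE
    : POLKIT_CHECK_AUTHORIZATION_FLAGS_ALLOW_USER_INTERACTION;

result = polkit_authority_check_authorization_sync (authority, subject,
                 "org.example.install", /* made-up action id */
                 details, flags, NULL, &error);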

So there should be less unexpected PolicyKit dialogs in the future.

Update: A useful PolicyKit feature that I forgot to mention is implications. If the permissions (or actions, in PolicyKit lingo) are ordered in some way (e.g. if you are allowed to install applications, you should also be allowed to update them), you can express these relations with “imply” annotations. This can further reduce the need for duplicate dialogs.

ggkbdd is a generic gaming keyboard daemon

Posted by Peter Hutterer on December 03, 2018 05:43 AM

Last week while reviewing a patch I read that some gaming keyboards have two modes - keyboard mode and gaming mode. When in gaming mode, the keys send out pre-recorded macros when pressed. Presumably (I am not a gamer) this is to record keyboard shortcuts to have quicker access to various functionalities. The macros are stored in the hardware and are thus relatively independent of the host system. Provided you have access to the custom protocol, which you probably don't when you're on Linux. But I digress.

I reckoned this could be done in software and work with any 5 dollar USB keyboard. A few hours later, I have this working now: ggkbdd. It sits directly above the kernel and waits for key events. Once the 'mode key' is hit, the keyboard will send pre-configured key sequences for the respective keys. Hitting the mode key again (or ESC) switches back to normal mode.
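To give a flavour of the approach, here is a rough proof-of-concept sketch (not ggkbdd's actual code) using the python-evdev bindings, with a made-up device path, mode key and macro table:

from evdev import InputDevice, UInput, ecodes as e

MODE_KEY = e.KEY_RIGHTCTRL                     # made-up mode-toggle key
MACROS = {e.KEY_1: [e.KEY_H, e.KEY_I]}         # made-up macro: "1" types "hi"

dev = InputDevice('/dev/input/event3')         # your keyboard's event node
ui = UInput()                                  # virtual device used to re-inject events
dev.grab()                                     # careful: this takes over the keyboard
gaming_mode = False

for ev in dev.read_loop():
    if ev.type != e.EV_KEY:
        continue
    if ev.code == MODE_KEY:
        if ev.value == 1:                      # toggle on press, swallow the key
            gaming_mode = not gaming_mode
    elif gaming_mode and ev.code in MACROS:
        if ev.value == 1:                      # replay the macro on press
            for key in MACROS[ev.code]:
                ui.write(e.EV_KEY, key, 1)
                ui.write(e.EV_KEY, key, 0)
            ui.syn()
    else:                                      # pass everything else through unmodified
        ui.write(e.EV_KEY, ev.code, ev.value)
        ui.syn()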

There's a lot of functionality that is missing such as integration with the desktop (probably via DBus), better security (dropping privs, masking the fd to avoid accidental key logging), better system integration (request fds from logind, possibly through the compositor). And error handling, etc. I think the total time on this spent is somewhere between 3 and 4h, and that includes the time to write this blog post and debug the systemd unit autostartup. There are likely other projects that solve it the same way, or at least in a similar manner. I didn't check.

This was done as proof-of-concept and

  • I don't know if it's useful and if so, what the use-cases are
  • I don't know if I will have any time to fix things on this
  • I don't know if other (better developed) projects already occupy that space
In the grand glorious future and provided this is indeed something generally useful, this would need compositor integration. Not sure we'll ever get to that point. Meanwhile, consider this a code drop for a proof-of-concept and expect that you'll have to fix any bugs yourself.

An update on Flatpak updates

Posted by Matthias Clasen on November 26, 2018 11:10 PM

A while ago, I described how to handle restarting Flatpak apps when they are updated.

While I showed working code back then, the example was a bit  contrived, since it had to work around GApplication’s built-in uniqueness features.

With the just-landed support for replacement in GApplication, this is now much more straightforward, and I’ve updated my example to show how it works.

Restart yourself

There are a few steps to this.

First, we need to opt into allowing replacement by passing G_APPLICATION_ALLOW_REPLACEMENT when creating our GApplication:

app = g_object_new (portal_test_app_get_type (),
        "application-id", "org.gnome.PortalTest",
        "flags", G_APPLICATION_ALLOW_REPLACEMENT,
        NULL);

This flag makes GApplication handle a --gapplication-replace commandline option, which we will see in action later.

Next, we monitor the /app/.updated file. Flatpak creates this file inside a running sandbox when a new version of the app is available, so this is our signal to restart ourselves:

file = g_file_new_for_path ("/app/.updated");
update_monitor =
       g_file_monitor_file (file, 0, NULL, NULL);
g_signal_connect (update_monitor, "changed",
       G_CALLBACK (update_monitor_changed), win);

The update_monitor_changed function presents a dialog that offers the user the option to restart the app:

 if (response == GTK_RESPONSE_OK)
      portal_test_app_restart (app);

The portal_test_app_restart function is where we take advantage of the new GApplication functionality:

const char *argv[3] = {
    "/app/bin/portal-test",
    "--gapplication-replace",
    NULL
};

xdp_flatpak_call_spawn (flatpak, cwd, argv, ...);

We call the Spawn method of the Flatpak portal, passing --gapplication-replace as a commandline option, and everything else is handled for us. Easy!

Update 1: If D-Bus is not your thing, there is a simple commandline tool called flatpak-spawn that can be used for the same purpose. It is available inside the sandbox.
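From inside the sandbox that looks roughly like this, reusing the binary path from the snippet above:

flatpak-spawn /app/bin/portal-test --gapplication-replace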

Update 2: I forgot to mention that GApplication has also gained a name-lost signal. It can be used to save state in the old instance that can then be loaded in the new instance.

Adding an optional install duration to LVFS firmware

Posted by Richard Hughes on November 13, 2018 09:54 AM

We’ve just added an optional feature to fwupd and the LVFS that some people might find useful: The firmware update process can now tell the user how long in seconds the update is going to take.

This means that users can know that a dock update might take 5 minutes, and so they can start the update process before they go to lunch. A UEFI update will require multiple reboots and will take 45 minutes to apply, and so the user will only apply the update at the end of the day rather than losing access to their computer for nearly an hour.

If you want to use this feature there are currently three ways to assign the duration to the update:

  • Changing the value on the LVFS admin console — the component update panel now has an extra input field to enter the duration in seconds
  • Adding a new attribute to the <release> element, for instance:
    <release version="3.0.2" date="2018-11-09" install_duration="120">
    
  • Adding a ‘quirk’ to fwupd, for instance:
    [DeviceInstanceId=USB\VID_1234&PID_5678]
    InstallDuration = 40
    
For updates requiring a reboot the install duration should include the time to POST the system both before and after the update has run, but it can be approximate. Only users running very new versions of fwupd and gnome-software will be shown the install duration, and older versions will be unchanged as the new property will just be ignored. It’s therefore safe to include in all versions of firmware without adding a dependency on a specific fwupd version.

More fun with libxmlb

Posted by Richard Hughes on November 12, 2018 03:49 PM

A few days ago I cut the 0.1.4 release of libxmlb, which is significant because it includes the last three features I needed in gnome-software to achieve the same search results as appstream-glib.

The first is something most users of database libraries will be familiar with: Bound variables. The idea is you prepare a query which is parsed into opcodes, and then at a later time you assign one of the ? opcode values to an actual integer or string. This is much faster as you do not have to re-parse the predicate, and also means you avoid failing in incomprehensible ways if the user searches for nonsense like ]@attr. Borrowing from SQL, the syntax should be familiar:

g_autoptr(XbQuery) query = xb_query_new (silo, "components/component/id[text()=?]/..", &error);
xb_query_bind_str (query, 0, "gimp.desktop", &error);

The second feature makes the caller jump through some hoops, but hoops that make things faster: Indexed queries. As it might be apparent to some, libxmlb stores all the text in a big deduplicated string table after the tree structure is defined. That means if you do <component component="component">component</component> then we only store one string! When we actually set up an object to check a specific node for a predicate (for instance, text()='fubar') we actually do strcmp("fubar", "component") internally, which in most cases is very fast…

Unless you do it 10 million times…

Using indexed strings tells the XbMachine processing the predicate to first check if fubar exists in the string table, and if it doesn’t, the predicate can’t possibly match and is skipped. If it does exist, we know the integer position in the string table, and so when we compare the strings we can just check two uint32_t’s which is quite a lot faster, especially on ARM for some reason. In the case of fwupd, it is searching for a specific GUID when returning hardware results. Using an indexed query takes the per-device query time from 3.17ms to about 0.33ms – which if you have a large number of connected updatable devices makes a big difference to the user experience. As using the indexed queries can have a negative impact and requires extra code it is probably only useful in a handful of cases. In case you do need this feature, this is the code you would use:

xb_silo_query_build_index (silo, "component/id", NULL, &error); // the cdata
xb_silo_query_build_index (silo, "component", "type", &error); // the @type attr
g_autoptr(XbNode) n = xb_silo_query_first (silo, "component/id[text()=$'test.firmware']", &error);

The indexing being denoted by $'' rather than the normal pair of single quotes. If there is something more standard to denote this kind of thing, please let me know and I’ll switch to that instead.

The third feature is: Stemming; which means you can search for “gaming mouse” and still get results that mention games, game and Gaming. This is also how you can search for words like Kongreßstraße which matches kongressstrasse. In an ideal world stemming would be computationally free, but if we are comparing millions of records each call to libstemmer sure adds up. Adding the stem() XPath operator took a few minutes, but making it usable took up a whole weekend.

The query we wanted to run would be of the form id[text()~=stem('?')] but the stem() would be called millions of times on the very same string for each comparison. To fix this, and to make other XPath operators faster, I implemented an opcode rewriting optimisation pass in the XbMachine parser. This means if you call lower-case(text())==lower-case('GIMP.DESKTOP') we only call the UTF-8 strlower function N+1 times, rather than 2N times. For lower-case() the performance increase is slight, but for stem() it actually makes the feature usable in gnome-software. The opcode rewriting optimisation pass is kinda dumb in how it works (“let’s try all combinations!”), but works with all of the registered methods, and makes all existing queries faster for almost free.
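
Putting those pieces together, a stemmed search might look something like the sketch below; the element path, the search term and the use of 0 as “no limit” in xb_silo_query() are my own illustrative choices rather than anything lifted from gnome-software:

// stem the search term once (thanks to the opcode rewriting pass) and
// compare it against every <name> node using the token-match operator
g_autoptr(GError) error = NULL;
g_autoptr(GPtrArray) results =
    xb_silo_query (silo, "components/component/name[text()~=stem('gaming')]", 0, &error);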

One common question I’ve had is if libxmlb is supposed to obsolete appstream-glib, and the answer is “it depends”. If you’re creating or building AppStream metadata, or performing any AppStream-specific validation then stick to the appstream-glib or appstream-builder libraries. If you just want to read AppStream metadata you can use either, but if you can stomach a binary blob of rewritten metadata stored somewhere, libxmlb is going to be a couple of orders of magnitude faster and use a ton less memory.

If you’re thinking of using libxmlb in your project send me an email and I’m happy to add more documentation where required. At the moment libxmlb does everything I need for fwupd and gnome-software and so apart from bugfixes I think it’s basically “done”, which should make my manager somewhat happier. Comments welcome.

Pipewire Hackfest 2018

Posted by Bastien Nocera on October 31, 2018 11:44 AM
Good morning from Edinburgh, where the breakfast contains haggis, and the charity shops have some interesting finds.

My main goal in attending this hackfest was to discuss Pipewire integration in the desktop, and how it will eventually replace PulseAudio as the audio daemon.

The main problems GNOME has had over the years with PulseAudio relate mostly to how PulseAudio was a black box when it came to its routing policy. What happens when you plug an HDMI cable into your laptop? Or turn on your Bluetooth headset? I've heard the stories of folks with highly mobile workstations having to constantly visit the Sound settings panel.

PulseAudio has policy scattered in a number of places (do a "git grep routing" inside the sources to see that): some of it is in the device manager, and the modules themselves can set priorities for their outputs and inputs. But there's nothing that takes all the information in and makes a decision based on the hardware that's plugged in and the applications currently in use.

For Pipewire, the policy decisions would be split off from the main daemon. Pipewire, as it gains PulseAudio compatibility layers, will grow a default/example policy engine that will try to replicate PulseAudio's behaviour. At the very least, that will mean that Pipewire won't regress compared to PulseAudio, and might even be able to take better decisions in the short term.

For GNOME, we still wanted to take control of that part of the experience, and make our own policy decisions. It's very possible that this engine will end up being featureful and generic enough that it will be used by more than just GNOME, or even become the default Pipewire one, but it's far too early to make that particular decision.

In the meantime, we didn't want the GNOME policies to be written in C, which is difficult for power users to experiment with, especially for edge use cases. We could have started writing a configuration language, but it would have been too specific, and there are plenty of embeddable languages around. It was also a good opportunity for me to finally write the helper library I've been meaning to write for years, based on my favourite embedded language, Lua.

So I'm introducing Anatole. The goal of the project is to make it trivial to write chunks of programs in Lua, while the core of your project is written in C (we might even be able to embed it in Python or Javascript, once introspection support is added).

It's still in the very early days, and unusable for anything as of yet, but progress should be pretty swift. The code is mostly based on Victor Toso's incredible "Lua factory" plugin in Grilo. (I'm hoping that, once finished, I won't have to remember on which end of the stack I need to push stuff for Lua to do something with it ;)

Using the LVFS to influence procurement decisions

Posted by Richard Hughes on October 30, 2018 12:41 PM

The National Cyber Security Centre (part of GCHQ, the UK version of the NSA) wrote a nice article on using the LVFS to influence procurement decisions. It’s probably also worth noting that the two biggest OEMs making consumer hardware require all their ODMs to support firmware updates on the LVFS. More and more mega-corporations also have “supports the LVFS” as a requirement for procurement.

The LVFS is slowly and carefully moving to the Linux Foundation, so expect more outreach and announcements soon.

libxmlb now a dependency of fwupd and gnome-software

Posted by Richard Hughes on October 22, 2018 07:31 AM

I’ve just released libxmlb 0.1.3, and merged the branches for fwupd and gnome-software so that it becomes a hard dependency of both projects. A few people have reviewed the libxmlb code, and Mario, Kalev and Robert reviewed the fwupd and gnome-software changes, so I’m pretty confident I’ve not broken anything too important — but more testing is very welcome. GNOME Software RSS usage is about 50% of what ships in 3.30.x and fwupd is down by 65%! If you want to ship the upcoming fwupd 1.2.0 or gnome-software 3.31.2 in your distro you’ll need to have libxmlb packaged, or be happy using a Meson subproject to download libxmlb during the build of each dependent project.

Initial thoughts on MongoDB's new Server Side Public License

Posted by Matthew Garrett on October 16, 2018 10:43 PM
MongoDB just announced that they were relicensing under their new Server Side Public License. This is basically the Affero GPL except with section 13 largely replaced with new text, as follows:

If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge, under the terms of this License. Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Software or modified version.

“Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.


MongoDB admit that this license is not currently open source in the sense of being approved by the Open Source Initiative, but say: "We believe that the SSPL meets the standards for an open source license and are working to have it approved by the OSI."

At the broadest level, AGPL requires you to distribute the source code to the AGPLed work[1] while the SSPL requires you to distribute the source code to everything involved in providing the service. Having a license place requirements around things that aren't derived works of the covered code is unusual but not entirely unheard of - the GPL requires you to provide build scripts even if they're not strictly derived works, and you could probably make an argument that the anti-Tivoisation provisions of GPL3 fall into this category.

A stranger point is that you're required to provide all of this under the terms of the SSPL. If you have any code in your stack that can't be released under those terms then it's literally impossible for you to comply with this license. I'm not a lawyer, so I'll leave it up to them to figure out whether this means you're now only allowed to deploy MongoDB on BSD because the license would require you to relicense Linux away from the GPL. This feels sloppy rather than deliberate, but if it is deliberate then it's a massively greater reach than any existing copyleft license.

You can definitely make arguments that this is just a maximalist copyleft license, the AGPL taken to extreme, and therefore it fits the open source criteria. But there's a point where something is so far from the previously accepted scenarios that it's actually something different, and should be examined as a new category rather than already approved categories. I suspect that this license has been written to conform to a strict reading of the Open Source Definition, and that any attempt by OSI to declare it as not being open source will receive pushback. But definitions don't exist to be weaponised against the communities that they seek to protect, and a license that has overly onerous terms should be rejected even if that means changing the definition.

In general I am strongly in favour of licenses ensuring that users have the freedom to take advantage of modifications that people have made to free software, and I'm a fan of the AGPL. But my initial feeling is that this license is a deliberate attempt to make it practically impossible to take advantage of the freedoms that the license nominally grants, and this impression is strengthened by it being something that's been announced with immediate effect rather than something that's been developed with community input. I think there's a bunch of worthwhile discussion to have about whether the AGPL is strong and clear enough to achieve its goals, but I don't think that this SSPL is the answer to that - and I lean towards thinking that it's not a good faith attempt to produce a usable open source license.

(It should go without saying that this is my personal opinion as a member of the free software community, and not that of my employer)

[1] There's some complexities around GPL3 code that's incorporated into the AGPLed work, but if it's not part of the AGPLed work then it's not covered


Firefox on Wayland update

Posted by Martin Stransky on October 09, 2018 09:38 AM

As a next step in the Wayland effort we have fresh new Firefox packages [1] with all the goodies from Firefox 63/64 (Nightly) for you. They come with better (and fixed) rendering, v-sync support, and working HiDPI. Support for hi-res displays is not perfect yet and more fixes are on the way – thanks to Jan Horak who wrote those patches.

The builds also ship the PipeWire WebRTC patch for desktop sharing created by Jan Grulich and Tomas Popela. Wayland applications are isolated from the desktop and don’t have access to other windows (as they do on X11), so PipeWire supplies the missing functionality alongside the browser sandbox.

I think the rendering is generally covered now and the browser should work smoothly with the Wayland backend. That’s also why I’m making it the default on Fedora 30 (Rawhide), with the firefox-x11 package available as an X11 fallback. Fedora 29 and earlier stay with the default X11 backend and Wayland is provided by the firefox-wayland package.

And there’s surely some work left to make Firefox perfect on Wayland – for instance correctly placing popups on GTK 3.24, updating WebRender/EGL, fixing the KDE compositor, and so on.

[1] Fedora 27 Fedora 28 Fedora 29

Flatpak, after 1.0

Posted by Matthias Clasen on October 08, 2018 05:45 PM

Flatpak 1.0 happened a while ago, and we now have a stable base. We’re up to the 1.0.3 bug-fix release at this point, and hope to get 1.0.x adopted in all major distros before too long.

Does that mean that Flatpak is done? Far from it! We have just created a stable branch and started landing some queued-up feature work on the master branch. This includes things like:

  • Better life-cycle control with ps and kill commands
  • Logging and history
  • File copy-paste and DND
  • A better testsuite, including coverage reports

Beyond these, we have a laundry list of things we want to work on in the near future, including

  • Using  host GL drivers (possibly with libcapsule)
  • Application renaming and end-of-life migration
  • A portal for dconf/gsettings (a stopgap measure until we have D-Bus container support)
  • A portal for webcam access
  • More tests!

We are also looking at improving the scalability of the Flathub infrastructure. The repository has grown to more than 400 apps, and Buildbot is not really meant to be used the way we use it.

What about releases?

We have not set a strict schedule, but the consensus seems to be that we are aiming for roughly quarterly releases, with more frequent devel snapshots as needed. Looking at the calendar, that would mean we should expect a stable 1.2 release around the end of the year.

Open for contribution

One of the easiest ways to help Flatpak is to get your favorite applications on Flathub, either by packaging them yourself, or by convincing the upstream to do it.

If you feel like contributing to Flatpak itself,  please do! Flatpak is still a young project, and there are plenty of small to medium-size features that can be added. The tests are also a nice place to stick your toe in and see if you can improve the coverage a bit and maybe find a bug or two.

Or, if that is more your thing, we have a nice design for improving the flatpak commandline user experience that is waiting to be implemented.

Announcing the first release of libxmlb

Posted by Richard Hughes on October 04, 2018 06:36 PM

Today I did the first 0.1.0 preview release of libxmlb. We’re at the “probably API stable, but no promises” stage. This is the library I introduced a couple of weeks ago, and since then I’ve been porting both fwupd and gnome-software to use it. The former is almost complete, and nearly ready to merge, but the latter is still work in progress with a fair bit of code to write. I did manage to launch gnome-software with libxmlb yesterday, and modulo a bit of brokenness it’s both faster to start (over 800ms faster from cold boot!) and uses an amazing 90Mb less RSS at runtime. I’m planning to merge the libxmlb branch into the unstable branch of fwupd in the next few weeks, so I need volunteers to package up the new hard dep for Debian, Ubuntu and Arch.

The tarball is in the usual place – it’s a simple Meson-built library that doesn’t do anything crazy. I’ve imported and built it already for Fedora, much thanks to Kalev for the super speedy package review.

I guess I should explain how applications are expected to use this library. At its core, there are just five different kinds of objects you need to care about:

  • XbSilo – a deduplicated string pool and a read only node tree. This is typically kept alive for the application lifetime.
  • XbNode – a GObject-wrapped immutable node available as a query result from XbSilo.
  • XbBuilder – a “compiler” to build the XbSilo from XbBuilderNodes or XbBuilderSources. This is typically created and destroyed at startup or when the blob needs regenerating.
  • XbBuilderNode – a mutable node that can have a parent, children, attributes and a value.
  • XbBuilderSource – a source of data for XbBuilder, e.g. a .xml.gz file or just a raw XML string.

The way most applications will use libxmlb is to create a local XbBuilder instance, add some XbBuilderSources and then “ensure” a local cache file. The “ensure” process either mmap()-loads the binary blob if all the file mtimes are unchanged, or compiles a blob and saves it to a new file. You can also tell the XbSilo to watch all the sources that it was built with, so that if any files change at runtime the valid property gets set to FALSE and the application can call xb_builder_ensure() again at a convenient time.
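
For readers who want the shape of that in code, here is a minimal sketch of the build-or-load flow with a single source; the file paths are made up, error handling is truncated, and the function and flag names follow the libxmlb API as I understand it, so they may differ slightly in a given release:

g_autoptr(GError) error = NULL;
g_autoptr(XbBuilder) builder = xb_builder_new ();
g_autoptr(XbBuilderSource) source = xb_builder_source_new ();
g_autoptr(GFile) xml = g_file_new_for_path ("/usr/share/app-info/xmls/fedora.xml.gz");
g_autoptr(GFile) blob = g_file_new_for_path ("/var/tmp/example.xmlb");

// load one XML source and hand it to the builder
if (!xb_builder_source_load_file (source, xml, XB_BUILDER_SOURCE_FLAG_NONE, NULL, &error))
	return FALSE;
xb_builder_import_source (builder, source);

// either mmap-loads the existing blob, or recompiles it if the source mtimes changed
g_autoptr(XbSilo) silo = xb_builder_ensure (builder, blob, XB_BUILDER_COMPILE_FLAG_NONE, NULL, &error);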

Once the XbBuilder has been compiled, an XbSilo pops out. With the XbSilo you can query using most common XPath statements – I actually ended up implementing a FORTH-style stack interpreter so we can now do queries like /components/component/id[contains(upper-case(text()),'GIMP')] – I’ll say a bit more on that in a minute. Queries can limit the number of results (for speed), and are deduplicated in a sane way, so it’s really quite a simple process to achieve something that would otherwise be a lot of C code. It’s possible to directly query an attribute or text value from a node, so the silo doesn’t have to be passed around either.
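
As a short hedged example of a direct query against the silo built above (the XPath and the component ID are made up, and error handling is skipped), grabbing a single node and reading its text value looks roughly like this:

g_autoptr(XbNode) id = xb_silo_query_first (silo, "components/component/id[text()='gimp.desktop']", &error);
if (id != NULL)
	g_print ("found %s\n", xb_node_get_text (id));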

In the process of porting gnome-software, I had to make libxmlb thread-safe – which required some internal reorganisation. We now have a non-exported XbMachine stack interpreter, and the XbSilo registers the XML-specific methods (like contains()) and operators (like ~=). These get passed some per-method user data, and also some per-query private data that is shared with the node tree – allowing things like [last()] and position()=3 to work. The function callbacks just get passed a query-specific stack, which means you can allow things like comparing “1” to 1.00f. This makes it easy to support more of XPath in the future, or to support something completely application-specific like gnome-software-search() without editing the library.

If anyone wants to do any API or code review I’d be super happy to answer any questions. Coverity and valgrind seem happy enough with all the self tests, but that’s no replacement for a human asking tricky questions. Thanks!

Speeding up AppStream: mmap’ing XML using libxmlb

Posted by Richard Hughes on September 20, 2018 09:27 AM

AppStream and the related AppData are XML formats that have been adopted by thousands of upstream projects and are being used in about a dozen different client programs. The AppStream metadata shipped in Fedora is currently a huge 13Mb XML file, which with gzip compresses down to a more reasonable 3.6Mb. AppStream is awesome; it provides translations of lots of useful data into basically all languages and includes screenshots for almost everything. GNOME Software is built around AppStream, and we even use a slightly extended version of the same XML format to ship firmware update metadata from the LVFS to fwupd.

XML does have two giant weaknesses. The first is that you have to decompress and then parse the files – which might include all the ~300 tiny AppData files as well as the distro-provided AppStream files, if you want to list installed applications not provided by the distro. Seeking lots of small files isn’t so slow on an SSD, and loading+decompressing a small file is actually quicker than loading an uncompressed larger file. Parsing an XML file typically means you set up some callbacks, which then get called for every start tag, text section, then end tag – so for a 13Mb XML document that’s nested very deeply you have to do a lot of callbacks. This means you have to process the description of GIMP in every language before you can even see if Shotwell exists at all.

The second weakness is memory usage. The typical way of parsing XML involves creating a “node tree” as you parse. This allows you to treat the XML document as a Document Object Model (DOM), which lets you navigate the tree and parse the contents in an object-oriented way. It also means you typically allocate the nodes themselves on the heap, plus copies of all the string data. AsNode in libappstream-glib has a few tricks to reduce RSS usage after parsing, which include:

  • Interning common element names like description, p, ul, li
  • Freeing all the nodes, but retaining all the node data
  • Ignoring node data for languages you don’t understand
  • Reference counting the strings from the nodes into the various appstream-glib GObjects

This still has two drawbacks: we need to store in hot memory all the screenshot URLs of all the apps you’re never going to search for, and we also need to parse all the long translated description data just to find out if gimp.desktop is actually installable. Deduplicating strings at runtime takes nontrivial amounts of CPU and means we build a huge hash table that uses nearly as much RSS as we save by deduplicating.

On a modern system, parsing ~300 files takes less than a second, and the total RSS is only a few tens of Mb – which is fine, right? Except on resource-constrained machines it takes 20+ seconds to start, and 40Mb is nearly 10% of the total memory available on the system. We have exactly the same problem with fwupd, where we get one giant file from the LVFS, all of which gets stored in RSS even though you never have the hardware that it matches against. Slow starting of fwupd and gnome-software is one of the reasons they stay resident, and don’t shut down when idle and restart when required.

We can do better.

We do need to keep the source format, but that doesn’t mean we can’t create a managed cache to do some clever things. Traditionally I’ve been quite vocal against squashing structured XML data into databases like SQLite and Xapian, as it’s like pushing a square peg into a round hole, and forces you to think like a database, doing 10-level nested joins to query some simple thing. What we want to use is something like XPath, where you can query data using the XML structure itself.

We also want to be able to navigate the XML document as if it were a DOM, i.e. be able to jump from one node to its sibling without parsing all the great-great-great-grandchild nodes to get there. This means storing the offset to the sibling in a binary file.

If we’re creating a cache, we might as well do the string deduplication at creation time once, rather than every time we load the data. This has the added benefit that we’re converting the string data from variable-length strings that you compare using strcmp() to quarks that you can compare just by checking two integers. This is much faster, as any SAT solver will tell you. If we’re storing a string table, we can also store the NUL byte. This seems wasteful at first, but has one huge advantage – you can mmap() the string table. In fact, you can mmap the entire cache. If you order the string table in a sensible way then you store all the related data in one block (e.g. the <id> values) so that you don’t jump all over the cache invalidating almost everything just for a common query. mmap’ing the strings means you can avoid strdup()ing every string just in case; under memory pressure the kernel automatically reclaims the memory, and the next time the data is needed it is automatically loaded from disk as required. It’s almost magic.

I’ve spent the last few days prototyping a library, which is called libxmlb until someone comes up with a better name. I’ve got a test branch of fwupd that I’ve ported from libappstream-glib and I’m happy to say that RSS has reduced from 3Mb (peak 3.61Mb) to 1Mb (peak 1.07Mb) and the startup time has gone from 280ms to 250ms. Unless I’ve missed something drastic I’m going to port gnome-software too, and will expect even bigger savings as the amount of XML is two orders of magnitude larger.

So, how do you use this thing? First, let’s create a baseline doing things the old way:

$ time appstream-util search gimp.desktop
real	0m0.645s
user	0m0.800s
sys	0m0.184s

To create a binary cache:

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.497s
user	0m0.453s
sys	0m0.028s

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.016s
user	0m0.004s
sys	0m0.006s

Notice the second time it compiled nearly instantly, as none of the filename or modification timestamps of the sources changed. This is exactly what programs would do every time they are launched.

$ df -h appstream.xmlb
4.2M	appstream.xmlb

$ time xb-tool query appstream.xmlb "components/component[@type='desktop']/id[text()='firefox.desktop']"
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
real	0m0.008s
user	0m0.007s
sys	0m0.002s

8ms includes the time to load the file, search for all the components that match the query and the time to export the XML. You get three results as there’s one AppData file, one entry in the distro AppStream, and an extra one shipped by Fedora to make Firefox featured in gnome-software. You can see the whole XML component of each result by appending /.. to the query. Unlike appstream-glib, libxmlb doesn’t try to merge components – which makes it much less magic, and a whole lot simpler.

Some questions answered:

  • Why not just use a GVariant blob?: I did initially, and the cache was huge. The deeply nested structure was packed inefficiently as you have to assume everything is a hash table of a{sv}. It was also slow to load; not much faster than just parsing the XML. It also wasn’t possible to implement the zero-copy XPath queries this way.
  • Is this API and ABI stable?: Not yet; it will be as soon as gnome-software is ported.
  • You implemented XPath in C‽: No, only a tiny subset. See the README.md

Comments welcome.

Thunderbird 60 with title bar hidden

Posted by Martin Stransky on September 18, 2018 08:58 AM

Many users like the hidden system title bar as a Firefox feature, although it’s not finished yet. But we’re very close, and I hope to have Firefox 64 in a shape where the title bar can be disabled by default, at least on GNOME, matching how Firefox looks on Windows and Mac.

Thunderbird 60 was finally released for Fedora and comes with a basic version of the feature as it was introduced in Firefox 60 ESR. There’s a simple checkbox on the “Customize” page in Firefox, but Thunderbird is missing an easy switch.

To disable the title bar in Thunderbird 60, go to the Edit -> Preferences menu and choose the Advanced tab. Then click Config Editor in the bottom left corner of the page, open it and look for mail.tabs.drawInTitlebar. Double-click on it and your bird should be titleless 🙂

Fedora Firefox – GCC/CLANG dilemma

Posted by Martin Stransky on September 17, 2018 09:00 PM

After reading Mike’s blog post about official Mozilla Firefox switch to LLVM Clang, I was wondering if we should also use that setup for official Fedora Firefox binaries.

The numbers look strong, but as Honza Hubicka mentioned, Mozilla uses the pretty ancient GCC 6 to create binaries and it’s not very fair to compare it with an up-to-date LLVM Clang 6.

Also, if I’m reading the Mozilla bug correctly, PGO/LTO is not yet enabled for Linux; only plain optimized builds are used for now… which means the transition at Mozilla is not as far along as I expected.

I also went through some Phoronix tests, which indicate it’s not a black-and-white situation, although Mike claimed that LLVM Clang is generally better than GCC. But it’s possible that the Firefox codebase simply suits LLVM Clang better than GCC.

After some consideration I think we’ll stay with GCC for now, and I’m going to compare Fedora GCC 8 builds with the Mozilla LLVM Clang ones when they are available. Neither build can use -march=native, so it should be a fair comparison. Fedora should also enable the PGO+LTO GCC setup to get the best from GCC.

[Update] I was wrong: PGO+LTO should now be enabled for the Linux builds too. The numbers look very good and I wonder if we can match them with GCC 8! 🙂

The Commons Clause doesn't help the commons

Posted by Matthew Garrett on September 10, 2018 11:26 PM
The Commons Clause was announced recently, along with several projects moving portions of their codebase under it. It's an additional restriction intended to be applied to existing open source licenses with the effect of preventing the work from being sold[1], where the definition of being sold includes being used as a component of an online pay-for service. As described in the FAQ, this changes the effective license of the work from an open source license to a source-available license. However, the site doesn't go into a great deal of detail as to why you'd want to do that.

Fortunately one of the VCs behind this move wrote an opinion article that goes into more detail. The central argument is that Amazon make use of a great deal of open source software and integrate it into commercial products that are incredibly lucrative, but give little back to the community in return. By adopting the commons clause, Amazon will be forced to negotiate with the projects before being able to use covered versions of the software. This will, apparently, prevent behaviour that is not conducive to sustainable open-source communities.

But this is where things get somewhat confusing. The author continues:

Our view is that open-source software was never intended for cloud infrastructure companies to take and sell. That is not the original ethos of open source.

which is a pretty astonishingly unsupported argument. Open source code has been incorporated into proprietary applications without giving back to the originating community since before the term open source even existed. MIT-licensed X11 became part of not only multiple Unixes, but also a variety of proprietary commercial products for non-Unix platforms. Large portions of BSD ended up in a whole range of proprietary operating systems (including older versions of Windows). The only argument in favour of this assertion is that cloud infrastructure companies didn't exist at that point in time, so they weren't taken into consideration[2] - but no argument is made as to why cloud infrastructure companies are fundamentally different to proprietary operating system companies in this respect. Both took open source code, incorporated it into other products and sold them on without (in most cases) giving anything back.

There's one counter-argument. When companies sold products based on open source code, they distributed it. Copyleft licenses like the GPL trigger on distribution, and as a result selling products based on copyleft code meant that the community would gain access to any modifications the vendor had made - improvements could be incorporated back into the original work, and everyone benefited. Incorporating open source code into a cloud product generally doesn't count as distribution, and so the source code disclosure requirements don't trigger. So perhaps that's the distinction being made?

Well, no. The GNU Affero GPL has a clause that covers this case - if you provide a network service based on AGPLed code then you must provide the source code in a similar way to if you distributed it under a more traditional copyleft license. But the article's author goes on to say:

AGPL makes it inconvenient but does not prevent cloud infrastructure providers from engaging in the abusive behavior described above. It simply says that they must release any modifications they make while engaging in such behavior.

IE, the problem isn't that cloud providers aren't giving back code, it's that they're using the code without contributing financially. There's no difference between what cloud providers are doing now and what proprietary operating system vendors were doing 30 years ago. The argument that "open source" was never intended to permit this sort of behaviour is simply untrue. The use of permissive licenses has always allowed large companies to benefit disproportionately when compared to the authors of said code. There's nothing new to see here.

But that doesn't mean that the status quo is good - the argument for why the commons clause is required may be specious, but that doesn't mean it's bad. We've seen multiple cases of open source projects struggling to obtain the resources required to make a project sustainable, even as many large companies make significant amounts of money off that work. Does the commons clause help us here?

As hinted at in the title, the answer's no. The commons clause attempts to change the power dynamic of the author/user role, but it does so in a way that's fundamentally tied to a business model and in a way that prevents many of the things that make open source software interesting to begin with. Let's talk about some problems.

The power dynamic still doesn't favour contributors

The commons clause only really works if there's a single copyright holder - if not, selling the code requires you to get permission from multiple people. But the clause does nothing to guarantee that the people who actually write the code benefit, merely that whoever holds the copyright does. If I rewrite a large part of a covered work and that code is merged (presumably after I've signed a CLA that assigns a copyright grant to the project owners), I have no power in any negotiations with any cloud providers. There's no guarantee that the project stewards will choose to reward me in any way. I contribute to them but get nothing back in return - instead, my improved code allows the project owners to charge more and provide stronger returns for the VCs. The inequity has shifted, but individual contributors still lose out.

It discourages use of covered projects

One of the benefits of being able to use open source software is that you don't need to fill out purchase orders or start commercial negotiations before you're able to deploy. Turns out the project doesn't actually fill your needs? Revert it, and all you've lost is some development time. Adding additional barriers is going to reduce uptake of covered projects, and that does nothing to benefit the contributors.

You can no longer meaningfully fork a project

One of the strengths of open source projects is that if the original project stewards turn out to violate the trust of their community, someone can fork it and provide a reasonable alternative. But if the project is released with the commons clause, it's impossible to sell any forked versions - anyone who wishes to do so would still need the permission of the original copyright holder, and they can refuse that in order to prevent a fork from gaining any significant uptake.

It doesn't inherently benefit the commons

The entire argument here is that the cloud providers are exploiting the commons, and by forcing them to pay for a license that allows them to make use of that software the commons will benefit. But there's no obvious link between these things. Maybe extra money will result in more development work being done and the commons benefiting, but maybe extra money will instead just result in greater payout to shareholders. Forcing cloud providers to release their modifications to the wider world would be of benefit to the commons, but this is explicitly ruled out as a goal. The clause isn't inherently incompatible with this - the negotiations between a vendor and a project to obtain a license to be permitted to sell the code could include a commitment to provide patches rather than money, for instance, but the focus on money makes it clear that this wasn't the authors' priority.

What we're left with is a license condition that does nothing to benefit individual contributors or other users, and costs us the opportunity to fork projects in response to disagreements over design decisions or governance. What it does is ensure that a range of VC-backed projects are in a better position to improve their returns, without any guarantee that the commons will be left better off. It's an attempt to solve a problem that's existed since before the term "open source" was even coined, by simply layering on a business model that's also existed since before the term "open source" was even coined[3]. It's not anything new, and open source derives from an explicit rejection of this sort of business model.

That's not to say we're in a good place at the moment. It's clear that there is a giant level of power disparity between many projects and the consumers of those projects. But we're not going to fix that by simply discarding many of the benefits of open source and going back to an older way of doing things. Companies like Tidelift[4] are trying to identify ways of making this sustainable without losing the things that make open source a better way of doing software development in the first place, and that's what we should be focusing on rather than just admitting defeat to satisfy a small number of VC-backed firms that have otherwise failed to develop a sustainable business model.

[1] It is unclear how this interacts with licenses that include clauses that assert you can remove any additional restrictions that have been applied
[2] Although companies like Hotmail were making money from running open source software before the open source definition existed, so this still seems like a reach
[3] "Source available" predates my existence, let alone any existing open source licenses
[4] Disclosure: I know several people involved in Tidelift, but have no financial involvement in the company


ASG! 2018 Tickets

Posted by Lennart Poettering on September 10, 2018 10:00 PM

All Systems Go! 2018 Tickets Selling Out Quickly!

Buy your tickets for All Systems Go! 2018 soon, they are quickly selling out! The conference takes place on September 28-30, in Berlin, Germany, in a bit over two weeks.

Why should you attend? If you are interested in low-level Linux userspace, then All Systems Go! is the right conference for you. It covers all topics relevant to foundational open-source Linux technologies. For details on the covered topics see our schedule for day #1 and for day #2.

For more information please visit our conference website!

See you in Berlin!

On Flatpak dependencies

Posted by Matthias Clasen on September 07, 2018 05:50 PM

Package managers have to deal with dependencies – too many of them. Over time things have gotten complicated: there are now soft dependencies, reverse dependencies and boolean conditions. So complicated that you can probably do general computation in the dependency solver now.

Thankfully flatpak is a lot simpler: there’s apps, and there’s runtimes, and every app depends on exactly one runtime. Simple and beautiful.

Of course, that’s not the whole story. Let’s take a look.

Dependencies

The only hard dependencies in Flatpak are between an app and the runtime that it uses.

Every app uses exactly one runtime. When installing the app, Flatpak will automatically install the runtime it needs, and it will refuse to uninstall the runtime as long as there are apps using it. You can override this using the --force-remove option.

Related refs

The common term Flatpak uses for apps and runtimes is refs. Apart from dependencies, Flatpak has a notion of related refs. These are what other packaging systems might classify as soft dependencies, and they come in various forms and shapes.

The first group are standard pieces that get split off at build time, like .Locale and .Debug refs. The first contain translations and are similar to what other packaging systems call langpacks. The second contain debug information for binaries, and are the equivalent of debuginfo packages in other systems.

The next group are extensions. Both applications and runtimes can declare extension points, and Flatpak will look for refs that match those when setting up a sandbox, and mount the ones it finds inside the sandbox.

Some extensions are a bit more special, in that they are conditional on some property of the system. For example, they might only be relevant if the system has an Nvidia GPU, or apply for a certain GTK+ theme.

Another kind of relationship exists between a runtime and its matching sdk.

What’s used?

The Flatpak uninstall command has an --unused option that makes an educated guess about which refs are no longer needed on your system. For each ref, it checks whether it is an application, or the runtime of an installed application, or related to one of these. In each of these cases, it considers the ref used and skips it. Whatever is left afterwards gets removed.

There are some heuristics involved when looking at related refs, and we are still fine-tuning these. One change that we’ve recently made is to consider SDKs used, to make --unused usable for developers.

There is still more fine-tuning to be done. For example, Flatpak will happily use a runtime from the system installation to run an application from the user installation. But the uninstall command works only on a single installation, so it does not see these dependencies, and might remove the runtime. Thankfully, it is easy to recover, should this happen to you: just install the runtime again.

3 Million Firmware Files and Counting…

Posted by Richard Hughes on September 07, 2018 09:55 AM

In the last two years the LVFS has supplied over 3 million firmware files to end users. We now have about two dozen companies uploading firmware, of which 9 are multi-billion dollar companies.

Every month about 200,000 more devices get upgraded and from the reports so far the number of failed updates is less than 0.01% averaged over all firmware types. The number of downloads is going up month-on-month, although we’re no longer growing exponentially, thank goodness. The server load average is 0.18, and we’ve made two changes recently to scale even more for less money: signing files in a 30 minute cron job rather than immediately, and switching from Amazon to BunnyCDN.

The LVFS is mainly run by just one person (me!) and my time is sponsored by the ever-awesome Red Hat. The hardware costs, which recently included random development tools for testing the dfu and nvme plugins, and the server and bandwidth costs are being paid from charitable donations from the community. We’re even cost positive now, so I’m building up a little pot for the next server or CDN upgrade. By pretty much any metric, the LVFS is a huge success, and I’m super grateful to all the people that helped the project grow.

The LVFS does have one weakness: it has a bus factor of one. In other words, if I got splattered by a bus, the LVFS would probably cease to exist in its current form. To further grow the project, and to reduce the dependence on me, we’re going to be moving various parts of the LVFS to the Linux Foundation. This means that there’ll be sysadmins who don’t have to google basic server things, a proper community charter, and access to an actual legal team. From an OEM point of view, nothing much should change, including the most important thing: it’ll continue to be free to use for everyone. The existing server and all the content will be migrated to the Linux Foundation infrastructure. From a user’s point of view, new metadata and firmware will be signed by the Linux Foundation key, rather than my key, although we haven’t decided on a date for the switch-over yet. The LF key has been trusted by fwupd for firmware since 1.0.8 and it’s trivial to backport to older branches if required.

Before anyone gets too excited and starts pointing me at all my existing bugs for my other stuff: I’ll probably still be the person “onboarding” vendors onto the LVFS, and I’m fully expecting to remain the maintainer and core contributor to the lvfs-website code itself — but I certainly should have a bit more time for GNOME Software and color stuff.

In related news, even more vendors are jumping on the LVFS. No more public announcements yet, but hopefully soon. For a lot of hardware companies they’ll be comfortable “going public” when the new hardware currently in development is on shelves in stores. So, please raise a glass to the next 3 million downloads!

What's new in libinput 1.12

Posted by Peter Hutterer on September 04, 2018 03:34 AM

libinput 1.12 was a massive development effort (over 300 patchsets) with a bunch of new features being merged. It'll be released next week or so, so it's worth taking a step back and looking at what actually changed.

The device quirks files replace the previously used hwdb-based udev properties. I've written about this in more detail here, but the gist is: we have our own .ini-style file format that can match on devices and apply the various quirks devices need. This simplifies debugging a lot: we can now reliably tell users why a quirks file applies or doesn't apply, which was historically a problem with the hwdb.

The sphinx-based documentation was merged, fixed and added to. We switched to sphinx for the docs and the result is much more user-friendly. Which was the point: it was a switch from developer-oriented documentation to user-oriented documentation. Not that documentation is ever finished.

The usual set of touchpad improvements went in, e.g. the slight motion on finger up is now ignored. We have size-based thumb detection now (useful for Apple touchpads!). And of course various quirks for better pressure ranges, etc. Tripletap on some synaptics touchpads had a tendency to cause multiple taps because of some weird event sequence. Movement in the software button now generates events, the buttons are not just a dead zone anymore. Pointer jump detection is more adaptive now and catches and discards smaller jumps that previously slipped through the cracks. A particularly quirky behaviour was seen on Dell XPS i2c touchpads that exhibit a huge pointer jump, courtesy of the trackpoint controller going to sleep and taking its time to wake up. The delay is still there but the pointer at least lands in the correct location.

We now have improved direction-locking for two-finger scrolling on touchpads. Scrolling up/down should not generate horizontal scroll events anymore as long as the movement is close enough to vertical. This feature is transparent, a diagonal or horizontal movement will immediately disable the direction lock and produce horizontal scroll events as expected.

The trackpoint acceleration has been re-done, see this post for more details and links to the background articles. I've only received one bug report for the new acceleration so it seems to work quite well now. Trackpoints that send events in bursts (e.g. bluetooth ones) are smoothened now to avoid jerky movement.

Velocity averaging was dropped to increase pointer accuracy. Previously we averaged the velocity across multiple events which makes the motion smoother on jittery devices but less accurate on good devices.

We build on FreeBSD now. Presumably this also means it works on FreeBSD :)

libinput now supports palm detection on touchscreens, at least where the ABS_MT_TOOL_TYPE evdev bit is provided.

I think that's about it. Busy days...

Realtek on the LVFS!

Posted by Richard Hughes on August 27, 2018 01:02 PM

For the last week I’ve been working with Realtek engineers adding USB3 hub firmware support to fwupd. We’re still fleshing out all the details, as we also want to update any devices attached to the hub using i2c – which is more important than it first seems. A lot of “multifunction” dongles or USB3 hubs are actually USB3 hubs with other hardware connected internally. We’re going to be working on updating the HDMI converter firmware next, probably just by dropping a quirk file and adding some standard keys to fwupd. This will let us use the same plugin for any hardware that uses the rts54xx chipset as the base.

Realtek have been really helpful and open about the hardware, which is a refreshing difference to a lot of other hardware companies. I’m hopeful we can get the new plugin in fwupd 1.1.2 although supported hardware won’t be available for a few months yet, which also means there’s no panic getting public firmware on the LVFS. It will mean we get a “works out of the box” experience when the new OEM branded dock/dongle hardware starts showing up.

About Flatpak installations

Posted by Matthias Clasen on August 26, 2018 09:04 PM

If you have tried Flatpak, you probably know that it can install apps per user or system-wide.  Installing an app system-wide has the advantage that all users on the system can use it. Installing per-user has the advantage that it doesn’t require privileges.

Most of the time, that’s more than enough choice. But there is more.

Standard locations

Before moving on, it is useful to briefly look at where Flatpak installs apps by default.

flatpak install --system flathub org.gnome.Todo

When you install an app like this, it ends up in /var/lib/flatpak. If you instead use the --user option, it ends up in ~/.local/share/flatpak. To be 100% correct, I should say $XDG_DATA_HOME/flatpak, since Flatpak does respect the XDG basedir spec.

Anatomy of an installation

Flatpak calls the places where it installs apps installations. Every installation has a few subdirectories that are useful to know:

  • repo – this is the OSTree repository where the files for the installed apps and runtimes reside. It is a single repository, so all the apps and runtimes that are part of the same installation get the benefit of deduplication via content-addressing.
  • exports – when an app is installed, Flatpak extracts some files that need to be visible to the outside world, such as desktop files, icons and D-Bus service files, and this is where they end up.
  • appstream – a flatpak repository contains the appstream data for the apps it contains as a separate branch, and Flatpak extracts it on the client-side for consumers like KDE’s Discover or GNOME Software.
  • app, runtime – the deployed versions of apps and runtimes get checked out here. Diving deeper, you see the files of an app in app/org.gimp.GIMP/current/active/files. This directory is what gets mounted in the sandbox as /app if you run the GIMP.

Custom installations

So far, so good. But maybe you have dozens of machines and an existing setup where /opt is shared. Wouldn’t it be nice to have a Flatpak installation there, instead of duplicating it in /var on every machine? This is where custom installations come in.

You can tell Flatpak about another place to  install apps by dropping a file in /etc/flatpak/installations.d/. It can be as simple as the following:

[Installation "bigleaf"]
Path=/opt/flatpak
DisplayName=bigleaf

See the flatpak-installation man page for all the details about this file.

GNOME Software currently doesn’t know about such custom installations, and you will have to adjust the shell glue in /etc/profile.d/flatpak.sh for GNOME Shell to see apps from there, but that is easy enough.

A patch to make flatpak.sh pick up custom installations automatically would be a welcome contribution!

Apps on a stick

Flatpak has a few more tricks up its sleeve when it comes to sharing apps between machines. A pretty cool one is the recently added create-usb command. It lets you copy one (or more) apps onto a USB stick, and install them from there on another machine. While trying this out, I hit a few hurdles that I’ll briefly point out here.

To make this work, Flatpak relies on an extra piece of information about the remote, the collection ID. The collection ID is a property of the actual remote repository. Flathub has one, org.flathub.Stable.

To make use of it, we need to add it to the configuration for the remote, like this:

$ flatpak remote-modify --collection-id=org.flathub.Stable flathub
$ flatpak update

If you don’t add the collection ID to your remote configuration, you will be greeted by an error saying “Remote ‘flathub’ does not have a collection ID set”. If you omit the flatpak update, the error will say “No such branch (org.flathub.Stable, ostree-metadata) in repository”.

Another error you may hit is “fsetxattr: Operation not supported”.  I think the create-usb command is meant to work with FAT-formatted usb sticks, so this will hopefully be fixed soon. For now, just format your usb stick as EXT4.

After these preparations, we are ready for:

$ flatpak --verbose create-usb /run/media/mclasen/Flatpak org.gimp.GIMP

which will take some time to copy things to the USB stick (which happens to be mounted at /run/media/mclasen/Flatpak). When this command is done, we can inspect the contents of the OSTree repository like this:

$ flatpak remote-ls file:///run/media/mclasen/Flatpak/.ostree/repo
Ref 
org.freedesktop.Platform.Icontheme.Adwaita
org.freedesktop.Platform.VAAPI.Intel 
org.freedesktop.Platform.ffmpeg 
org.gimp.GIMP 
org.gnome.Platform

Flatpak copied not just the GIMP itself, but also runtimes and extensions that it uses. This ensures that we can install the app from the usb stick even if some of these related refs are missing on the target system.

But of course, we still need to see it work! So I uninstalled the GIMP, disabled my network, plugged the usb stick back in, and:

$ flatpak install --user flathub org.gimp.GIMP
0 metadata, 0 content objects imported; 569 B transferred in 0 seconds

flatpak install: Error updating remote metadata for 'flathub': [6] Couldn't resolve host name
Installing in user:
org.gimp.GIMP/x86_64/stable flathub 1eb97e2d4cde
permissions: ipc, network, wayland, x11
file access: /tmp, host, xdg-config/GIMP, xdg-config/gtk-3.0
dbus access: org.gtk.vfs, org.gtk.vfs.*
Is this ok [y/n]: y
Installing for user: org.gimp.GIMP/x86_64/stable from flathub
[####################] 495 metadata, 4195 content objects imported; 569 B transferred in 1 seconds
Now at 1eb97e2d4cde.

Voilà, an offline installation of a Flatpak. I left the error message in there as proof that I was actually offline :) A nice detail of the collection ID approach is that Flatpak knows that it can still update the GIMP from flathub when I’m online.

Coming soon, peer-to-peer

This post is already too long, so I’ll leave peer-to-peer and advertising Flatpak repositories on the local network via avahi for another time.

Until then, happy Flatpaking! 💓📦💓📦

 

Fun with SuperIO

Posted by Richard Hughes on August 23, 2018 11:44 AM

While I’m waiting to hear back from NVMe vendors (already one tentatively onboard!) I’ve started looking at “embedded controller” devices. The EC on your laptop historically used to just control the PS/2 keyboard and mouse, but now does fan control, power management, UARTs, GPIOs, LEDs, SMBUS, and various tasks the main CPU is too important to care about. Vendors issue firmware updates for this kind of device, but normally wrap up the EC update as part of the “BIOS” update as the system firmware and EC work together using various ACPI methods. Some vendors do the EC update out-of-band and so we need to teach fwupd how to query the EC to get the model and version on that specific hardware. The Linux laptop vendor Tuxedo wants to update the EC and system firmware separately using the LVFS, and helpfully loaned me an InfinityBook Pro 13 that was immediately disassembled and connected to all kinds of exotic external programmers. On first impressions the N131WU seems quick, stable and really well designed internally — I’m sure it would get a 10/10 for repairability.

At the moment I’m just concentrating on SuperIO devices from ITE. If you’re interested what SuperIO chip(s) you have on your machine you can either use superiotool from coreboot-utils or sensors-detect from lm_sensors. If you’ve got a SuperIO device from ITE please post what signature, vendor and model machine you have in the comments and I’ll ask if I need any more information from you. I’m especially interested in vendors that use devices with the signature 0x8587, which seems to be a favourite with the Clevo reference board. Thanks!

Please welcome AKiTiO to the LVFS

Posted by Richard Hughes on August 23, 2018 08:08 AM

Another week, another vendor. This time the vendor is called AKiTiO, a company that makes a large number of very nice ThunderBolt peripherals.

Over the last few weeks AKiTiO added support for the Node and Node Lite devices, and I’m sure there’ll be more in the future. It’s been a pleasure working with their engineers and getting them up to speed with uploading to the LVFS.

In other news, Lenovo also added support for the ThinkPad T460 on the LVFS, so get any updates while they’re hot. If you want to try this you’ll have to enable the lvfs-testing remote either using fwupdmgr enable-remote lvfs-testing or using the sources dialog in recent versions of GNOME Software. More Lenovo updates coming soon, and hopefully even more vendor announcements too.

Adventures with NVMe, part 2

Posted by Richard Hughes on August 21, 2018 08:01 AM

A few days ago I asked people to upload their NVMe “cns” data to the LVFS. So far, 908 people did that, and I appreciate each and every submission. I promised I’d share my results, and this is what I’ve found:

Number of vendors implementing the slot 1 read-only (“s1ro”) factory fallback: 10 – this was way less than I hoped. Not all is lost: the number of slots in a device (“nfws”) indicates how many different versions of firmware the drive can hold, just like some wireless broadband cards. The idea is that a bad firmware flash means you can “fall back” to an old version that actually works. It was surprising how many drives didn’t have this feature because they only had one slot in total.

I also wanted to know how many firmware versions there were for a specific model (deduping by removing the capacity string in the model); the idea being that if drives with the same model string all had the same version firmware then the vendor wasn’t supplying firmware updates at all, and might be a lost cause, or have perfect firmware. Vendors don’t usually change shipped firmware on NVMe drives for no reason, and so a vendor having multiple versions of firmware for a given model could indicate a problem or enhancement important enough to re-run all the QA checks:

So, not all bad, but we can’t just assume that trying to flash a firmware is a safe thing to do for all drives. The next, much bigger problem was trying to identify which drives should be flashed with a specific firmware. You’d think this would be a simple problem, where the existing firmware version would be stored in the “fr” firmware revision string and the model name would be stored in the “mn” string. Alas, only Lenovo and Apple store a sane semver like 1.2.3; other vendors seem to encode the firmware revision using as-yet-unknown methods. Unhelpfully, the model name alone isn’t all we need to identify the firmware to flash, as different drives can have different firmware for the laptop OEM without changing the mn or fr. For this I think we need to look into the elusive “vs” vendor-defined block, which was the reason I was asking for the binary dump of the CNS rather than the nvme -H or nvme -o json output. The vendor block isn’t formally defined as part of the NVMe specification and the ODM (and maybe the OEM?) can use this however they want.
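If you want to see what your own drive reports, nvme-cli can decode the relevant fields; a rough sketch (the exact wording of the decoded frmw bits varies between nvme-cli versions, and /dev/nvme0 is just an example):

sudo nvme id-ctrl /dev/nvme0 | grep -E '^(mn|fr) '    # model name and firmware revision strings
sudo nvme id-ctrl /dev/nvme0 -H | grep -A 4 frmw      # slot count and the slot-1-read-only flag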

Only 137 out of the supplied ~650 NVMe CNS blobs contained vendor data. SK hynix drives contain an interesting-looking string of something like KX0WMJ6KS0760T6G01H0, but I have no idea how to parse that. Seagate has simply 2002. Liteon has a string like TW01345GLOH006BN05SXA04. Some Samsung drives have things like KR0N5WKK0184166K007HB0 and CN08Y4V9SSX0087702TSA0 – the same format as Toshiba CN08D5HTTBE006BEC2K1A0 but it’s weird that the blob is all ASCII – I was somewhat hoping for a packed GUID in the sea of NULs. They do have some common sub-sections, so if you know what these are please let me know!

I’ve built a fwupd plugin that should be able to update firmware on NVMe drives, but it’s 100% untested. I’m going to use the leftover donation money for the LVFS to buy various types of NVMe hardware that I can flash with different firmware images and not cry if all the data gets wiped or the device gets bricked. I’ve already emailed my contact at Samsung and fingers crossed something nice happens. I’ll do the same with Toshiba and Lenovo next week. I’ll also update this blog post next week with the latest numbers, so if you upload your data now it’s still useful.

NVMe Firmware: I Need Your Data

Posted by Richard Hughes on August 17, 2018 02:45 PM

In a recent Google Plus post I asked what kind of hardware was most interesting to be focusing on next. UEFI updating is now working well with a large number of vendors, and the LVFS “onboarding” process is well established now. On that topic we’ll hopefully have some more announcements soon. Anyway, back to the topic in hand: The overwhelming result from the poll was that people wanted NVMe hardware supported, so that you can trivially update the firmware of your SSD. Firmware updates for SSDs are important, as most either address data consistency issues or provide nice performance fixes.

Unfortunately there needs to be some plumbing put in place first, so don’t expect anything awesome very fast. The NVMe ecosystem is pretty new, and things like “what version number firmware am I running now” and “is this firmware OEM firmware or retail firmware” are still queried using vendor-specific extensions. I only have two devices to test with (Lenovo P50 and Dell XPS 13) and so I’m asking for some help with data collection. Primarily I’m trying to find out what NVMe hardware people are actually using, so I can approach the most popular vendors first (via the existing OEMs). I’m also going to be looking at the firmware revision string that each vendor sets to find quirks we need — for instance, Toshiba encodes MODEL VENDOR, and everyone else specifies VENDOR MODEL. Some drives contain the vendor data with a GUID, some don’t; I have no idea of the relative number or how many different formats there are. I’d also like to know how many firmware slots the average SSD has, and the percentage of drives that have a protected slot 1 firmware. This all lets us work out how safe it would be to attempt a new firmware update on specific hardware — the very last thing we want to do is brick an expensive new NVMe SSD with all your data on it.

So, this is what I would like you to do. You don’t need to reboot, unmount any filesystems or anything like that. Just:

  1. Install nvme-cli (e.g. dnf install nvme-cli) or build it from source
  2. Run the following command:
    sudo nvme id-ctrl --raw-binary /dev/nvme0 > /tmp/id-ctrl
    
  3. If that worked, run the following command:
    curl -F type=nvme \
        -F "machine_id="`cat /etc/machine-id` \
        -F file=@/tmp/id-ctrl \
        https://staging.fwupd.org/lvfs/upload_hwinfo

If you’re not sure if you have an NVMe drive you can check with the nvme command above. The command isn’t doing anything with the firmware; it’s just asking the NVMe drive to report what it knows about itself. It should be 100% safe; the kernel already did the same request at system startup.
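If you just want to check which NVMe drives the system sees first, something like this is equally harmless (a sketch):

sudo nvme list    # prints the node, model, serial and firmware revision of each NVMe drive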

We are sending your random machine ID to ensure we don’t record duplicate submissions — if that makes you unhappy for some reason just choose some other 32-character hex string. In the binary file created by nvme there is the encoded model number and serial number of your drive; if this makes you uneasy please don’t send the file.

Many thanks, and needless to say I’ll be posting some stats here when I’ve got enough submissions to be statistically valid.

libinput's "new" trackpoint acceleration method

Posted by Peter Hutterer on August 16, 2018 04:47 AM

This is mostly a request for testing, because I've received zero feedback on the patches that I merged a month ago and libinput 1.12 is due to be out. No comments so far on the RC1 and RC2 either, so... well, maybe this gets a bit broader attention so we can address some things before the release. One can hope.

Required reading for this article: Observations on trackpoint input data and X server pointer acceleration analysis - part 5.

As the blog posts linked above explain, the trackpoint input data is difficult and largely arbitrary between different devices. The previous pointer acceleration in libinput relied on a fixed reporting rate, which doesn't hold at low speeds, so the new acceleration method switches back to velocity-based acceleration, i.e. we convert the input deltas to a speed, then apply the acceleration curve on that. It's not speed, it's pressure, but it doesn't really matter unless you're a stickler for technicalities.

Because basically every trackpoint has different random data ranges not linked to anything easily measurable, libinput's device quirks now support a magic multiplier to scale the trackpoint range into something resembling a sane range. This is basically what we did before with the systemd POINTINGSTICK_CONST_ACCEL property except that we're handling this in libinput now (which is where acceleration is handled, so it kinda makes sense to move it here). There is no good conversion from the previous trackpoint range property to the new multiplier because the range didn't really have any relation to the physical input users expected.
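For illustration only (the attribute and file names below are my understanding of the new quirks system, so treat them as assumptions and check the libinput documentation for the authoritative syntax), a local override plus a quick check of what actually gets applied could look like this:

sudo mkdir -p /etc/libinput
sudo tee /etc/libinput/local-overrides.quirks <<'EOF'
[My Trackpoint Override]
MatchUdevType=pointingstick
MatchName=*TPPS/2 IBM TrackPoint*
AttrTrackpointMultiplier=1.25
EOF
sudo libinput quirks list /dev/input/event17    # event node is just an example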

So what does this mean for you? Test the libinput RCs or, better, libinput from master (because it's stable anyway), or from the Fedora COPR and check if the trackpoint works. If not, check the Trackpoint Configuration page and follow the instructions there.

How the 60-evdev.hwdb works

Posted by Peter Hutterer on August 09, 2018 02:17 AM

libinput made a design decision early on to use physical reference points wherever possible. So your virtual buttons are X mm high/across, the pointer movement is calculated in mm, etc. Unfortunately this exposed us to a large range of devices that don't bother to provide that information or just give us the wrong information to begin with. Patching the kernel for every device is not feasible so in 2015 the 60-evdev.hwdb was born and it has seen steady updates since. Many a libinput bug was fixed by just correcting the device's axis ranges or resolution. To take the magic out of the 60-evdev.hwdb, here's a blog post for your perusal, appreciation or, failing that, shaking a fist at. Note that the below is caller-agnostic, it doesn't matter what userspace stack you use to process your input events.

There are four parts that come together to fix devices: a kernel ioctl and a trifecta of udev rules, hwdb entries and a udev builtin.

The kernel's EVIOCSABS ioctl

It all starts with the kernel's struct input_absinfo.


struct input_absinfo {
        __s32 value;
        __s32 minimum;
        __s32 maximum;
        __s32 fuzz;
        __s32 flat;
        __s32 resolution;
};
The three values that matter right now: minimum, maximum and resolution. The "value" is just the most recent value on this axis, ignore fuzz/flat for now. The min/max values simply specify the range of values the device will give you, the resolution how many values per mm you get. Simple example: an x axis given at min 0, max 1000 at a resolution of 10 means your device is 100mm wide. There is no requirement for min to be 0, btw, and there's no clipping in the kernel so you may get values outside min/max. Anyway, your average touchpad looks like this in evemu-record:

# Event type 3 (EV_ABS)
# Event code 0 (ABS_X)
# Value 2572
# Min 1024
# Max 5112
# Fuzz 0
# Flat 0
# Resolution 41
# Event code 1 (ABS_Y)
# Value 4697
# Min 2024
# Max 4832
# Fuzz 0
# Flat 0
# Resolution 37
This is the information returned by the EVIOCGABS ioctl (EVdev IOCtl Get ABS). It is usually run once on device init by any process handling evdev device nodes.
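Applying the arithmetic from above to this example output: the x axis spans (5112 - 1024) / 41 ≈ 100mm and the y axis spans (4832 - 2024) / 37 ≈ 76mm, which is a plausible physical size for a touchpad.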

Because plenty of devices don't announce the correct ranges or resolution, the kernel provides the EVIOCSABS ioctl (EVdev IOCtl Set ABS). This allows overwriting the in-kernel struct with new values for min/max/fuzz/flat/resolution, processes that query the device later will get the updated ranges.

udev rules, hwdb and builtins

The kernel has no notification mechanism for updated axis ranges so the ioctl must be applied before any process opens the device. This effectively means it must be applied by a udev rule. udev rules are a bit limited in what they can do, so if we need to call an ioctl, we need to run a program. And while udev rules can do matching, the hwdb is easier to edit and maintain. So the pieces we have is: a hwdb that knows when to change (and the values), a udev program to apply the values and a udev rule to tie those two together.

In our case the rule is 60-evdev.rules. It checks the 60-evdev.hwdb for matching entries [1], then invokes the udev-builtin-keyboard if any matching entries are found. That builtin parses the udev properties assigned by the hwdb and converts them into EVIOCSABS ioctl calls. These three pieces need to agree on each other's formats - the udev rule and hwdb agree on the matches and the hwdb and the builtin agree on the property names and value format.

By itself, the hwdb has no specific format beyond this:


some-match-that-identifies-a-device
PROPERTY_NAME=value
OTHER_NAME=othervalue
But since we want to match for specific use-cases, our udev rule assembles several specific match lines. Have a look at 60-evdev.rules again, the last rule in there assembles a string in the form of "evdev:name:the device name:content of /sys/class/dmi/id/modalias". So your hwdb entry could look like this:

evdev:name:My Touchpad Name:dmi:*svnDellInc*
EVDEV_ABS_00=0:1:3
If the name matches and you're on a Dell system, the device gets the EVDEV_ABS_00 property assigned. The "evdev:" prefix in the match line is merely to distinguish from other match rules to avoid false positives. It can be anything, libinput unsurprisingly used "libinput:" for its properties.

The last part now is understanding what EVDEV_ABS_00 means. It's a fixed string with the axis number as hex number - 0x00 is ABS_X. And the values afterwards are simply min, max, resolution, fuzz, flat, in that order. So the above example would set min/max to 0:1 and resolution to 3 (not very useful, I admit).

Trailing bits can be skipped altogether and bits that don't need overriding can be skipped as well provided the colons are in place. So the common use-case of overriding a touchpad's x/y resolution looks like this:


evdev:name:My Touchpad Name:dmi:*svnDellInc*
EVDEV_ABS_00=::30
EVDEV_ABS_01=::20
EVDEV_ABS_35=::30
EVDEV_ABS_36=::20
0x00 and 0x01 are ABS_X and ABS_Y, so we're setting those to 30 units/mm and 20 units/mm, respectively. And if the device is multitouch capable we also need to set ABS_MT_POSITION_X and ABS_MT_POSITION_Y to the same resolution values. The min/max ranges for all axes are left as-is.

The most confusing part is usually this: the hwdb uses a binary database that needs updating whenever the hwdb entries change. A call to systemd-hwdb update does that job.
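In practice, after editing a hwdb entry the sequence is something like this (the event node is just an example):

sudo systemd-hwdb update                          # recompile the binary hwdb
sudo udevadm trigger /sys/class/input/event4      # re-run the rules so the new properties are applied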

So with all the pieces in place, let's see what happens when the kernel tells udev about the device:

  • The udev rule assembles a match and calls out to the hwdb,
  • The hwdb applies udev properties where applicable and returns success,
  • The udev rule calls the udev keyboard-builtin
  • The keyboard builtin parses the EVDEV_ABS_xx properties and issues an EVIOCSABS ioctl for each axis,
  • The kernel updates the in-kernel description of the device accordingly
  • The udev rule finishes and udev sends out the "device added" notification
  • The userspace process sees the "device added" and opens the device which now has corrected values
  • Celebratory champagne corks are popping everywhere, hands are shaken, shoulders are patted in congratulations of another device saved from the tyranny of wrong axis ranges/resolutions

Once you understand how the various bits fit together it should be quite easy to understand what happens. Then the remainder is just adding hwdb entries where necessary but the touchpad-edge-detector tool is useful for figuring those out.
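For the resolution case, the usual approach is to physically measure the touchpad and let the tool calculate the values; roughly like this (the size and event node are made-up examples):

sudo touchpad-edge-detector 100x55 /dev/input/event4    # touchpad measured as 100mm wide, 55mm high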

[1] Not technically correct, the udev rule merely calls the hwdb builtin which searches through all hwdb entries. It doesn't matter which file the entries are in.

GNOME Software and automatic updates

Posted by Richard Hughes on August 08, 2018 07:20 PM

For GNOME 3.30 we’ve enabled something that people have been asking for since at least the birth of the gnome-software project: automatically installing updates.

This of course comes with some caveats. Since it’s still not safe to auto-update packages (trust me, I triaged the hundreds of bugs) we will restrict automatic updates to Flatpaks. Although we do automatically download things like firmware updates, ostree content, and package updates by default, they’re deployed manually like before. I guess it’s important to say that the auto-update of Flatpaks is optional and can easily be turned off in the GUI, and that you’ll be notified when applications have been auto-updated and need restarting.

Another common complaint with gnome-software was that it didn’t show the same list of updates as command line tools like dnf. The internal refactoring required for auto-deploying updates also allows us to show updates that are available, but not yet downloaded. We’ll still try and auto-download them ahead of time if possible, but we no longer hide them while the download is pending. This does mean that “new” updates could take some time to download in the updates panel before either the firmware update is performed or the offline update is scheduled.

This also means we can add some additional UI controlling whether updates should be downloaded and deployed automatically. This doesn’t override the existing logic regarding metered connections or available battery power, but does give the user some more control without resorting to gsettings invocations on the command line.
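For reference, the command-line dance we’d like to spare people looks roughly like this (I believe download-updates is the relevant key, but treat the exact key name as an assumption):

gsettings get org.gnome.software download-updates
gsettings set org.gnome.software download-updates true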

Also notable for GNOME 3.30 is that we’ve switched to the new libflatpak transaction API, which both simplifies the flatpak plugin considerably and means we install the same runtimes and extensions as the flatpak CLI. This was another common source of frustration, as anyone trying to install from a flatpakref with RuntimeRepo set will testify.

With these changes we’ve also bumped the plugin interface version, so if you have out-of-tree plugins they’ll need recompiling before they work again. After a little more polish, the new GNOME Software 3.29.90 will soon be available in Fedora Rawhide, and will thus be available in Fedora 29. If 3.30 is as popular as I think it might be, we might even backport gnome-software 3.30.1 into Fedora 28 like we did for 3.28 and Fedora 27 all those moons ago.

Comments welcome.

Please welcome Lenovo to the LVFS

Posted by Richard Hughes on August 06, 2018 08:45 AM

I’d like to formally welcome Lenovo to the LVFS. For the last few months Peter Jones and I have been working with partners of Lenovo and the ThinkPad, ThinkStation and ThinkCentre groups inside Lenovo to get automatic firmware updates working across a huge number of different models of hardware.

Obviously, this is a big deal. Tens of thousands of people are likely to be offered a firmware update in the next few weeks, and hundreds of thousands over the next few months. Understandably we’re not just flipping a switch and opening the floodgates, so if you’ve not seen anything appear in fwupdmgr update or in GNOME Software don’t panic. Over the next couple of weeks we’ll be moving a lot of different models from the various testing and embargoed remotes to the stable remote, and so the list of supported hardware will grow. That said, we’ll only be supporting UEFI hardware produced fairly recently, so there’s no point looking for updates on your beloved T61. I also can’t comment on what other Lenovo branded hardware is going to be supported in the future as I myself don’t know.

Bringing Lenovo to the LVFS has been a lot of work. It needed changes to the low level fwupdate library, fwupd, and even the LVFS admin portal itself for various vendor-defined reasons. We’ve been working in semi-secret for a long time, and I’m sure it’s been frustrating to all involved not being able to speak openly about the grand plan. I do think Lenovo should be applauded for the work done so far due to the enormity of the task, rather than chastised about coming to the party a little late. If anyone from HP is reading this, you’re now officially late.

We’re still debugging a few remaining issues, and also working on making the update metadata better quality, so please don’t judge Lenovo (or me!) too harshly if there are initial niggles with the update process. Updating the firmware is slightly odd in that it sometimes needs to reboot a few times with some scary-sounding beeps, and on some hardware the first UEFI update you do might look less than beautiful. If you want to do the firmware update on Lenovo hardware, you’ll have a lot more success with newer versions of fwupd and fwupdate, although we should do a fairly good job of not offering the update if it’s not going to work. All our testing has been done with a fully updated Fedora 28 workstation. It of course works with SecureBoot turned on, but if you’ve enabled the BootOrder lock manually you’ll need to turn that off first.

I’d like to personally thank all the Lenovo engineers and managers I’ve worked with over the last few months. All my time has been sponsored by Red Hat, and they rightfully deserve love too.

Flatpak portal experiments

Posted by Matthias Clasen on August 04, 2018 03:19 AM

One of the signs that a piece of software is reaching a mature state is its ability to serve use cases that nobody had anticipated when it was started. I’ve recently had this experience with Flatpak.

We have been discussing some possible new directions for the GTK+ file chooser. And it occurred to me that it might be convenient to use the file chooser portal as a way to experiment with different file choosers without having to change either GTK+ itself or the applications.

To verify this idea, I wrote a quick portal implementation that uses the venerable GTK+ 2 file chooser.

Here is Corebird (a GTK+ 3 application) using the GTK+ 2 file chooser to select an image.

On Flatpak updates

Posted by Matthias Clasen on August 02, 2018 05:06 PM

Maybe you remember times when updating your system was risky business – your web browser might crash or start to behave funny because the update pulled data files or fonts out from underneath the running process, leading to fireworks or, more likely, crashes.

Flatpak updates on the other hand are 100% safe. You can call

 flatpak update

and the running instances are not affected in any way. Flatpak keeps existing deployments around until the last user is gone. If you quit the application and restart it, you will get the updated version, though.

This is very nice, and works just fine. But maybe we can do even better?

Improving the system

It would be great if the system was aware of the running instances, and offered to restart them to take advantage of the new version that is now available. There is a good chance that GNOME Software will gain this feature before too long.

But for now, it does not have it.

Do it yourself

Many apps, in particular those that are not native to the Linux distro world, expect to update themselves, and we have had requests to enable this functionality in flatpak. We do think that updating software is a system responsibility that should be controlled by global policies and be under the user’s control, so we haven’t quite followed the request.

But Flatpak 1.0 does have an API that is useful in this context, the “Flatpak portal”. It has a Spawn method that allows applications to launch a process in a new sandbox.

Spawn (IN  ay    cwd_path,
       IN  aay   argv,
       IN  a{uh} fds,
       IN  a{ss} envs,
       IN  u     flags,
       IN  a{sv} options,
       OUT u     pid)

There are several use cases for this, from sandboxing thumbnailers (which create thumbnails for possibly untrusted content files) to sandboxing web browser tabs individually. The use case we are interested in here is restarting the latest version of the app itself.
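If you want to poke at the portal yourself, the D-Bus service can be introspected from inside a sandbox, and the flatpak-spawn helper (from flatpak-xdg-utils, if you have it installed) wraps the Spawn call for you. A rough sketch, to be run inside a sandboxed app:

gdbus introspect --session \
    --dest org.freedesktop.portal.Flatpak \
    --object-path /org/freedesktop/portal/Flatpak
flatpak-spawn -- sh -c 'echo hello from a fresh sandbox'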

One complication that I’ve run into when trying this out is the “unique application” pattern that is built into GApplication and similar application classes: Since there is already an owner for the application ID on the session bus, my newly spawned version will just back off and exit. Which is clearly not what I intended in this case.

Make it stop

The workaround I came up with is not very pretty, but functional. It requires several parts.

First, I need a “quit” action exported on the session bus. The newly spawned version will activate this action of the running instance to convince it to go away. Thankfully, my example app already had this action, for the Quit item in the app menu.

I don’t want this to happen unconditionally, but only if I am spawning a new version. To achieve this, I made my app only activate “quit” if the --replace option is present, and add that option to the commandline that I pass to the “Spawn” call.

The code for this part is less pretty than it could be, since GApplication gets in the way a bit. I have to manually check for the --replace option and do the “quit” D-Bus call by hand.

Doing the “quit” call synchronously is not quite enough to avoid a race condition between the running instance dropping the bus name and my new instance attempting to take it. Therefore, I explicitly wait for the bus name to become unowned before entering g_application_run().

Screencast: https://blogs.gnome.org/mclasen/files/2018/08/Screencast-from-08-02-2018-124710-PM.webm

But it all works fine. To test it, I exported a “restart” action and added it to the app menu.

Tell me about it

But who can remember to open the app menu and click “Restart”? That is just too cumbersome. Thankfully, flatpak has a solution for this: when you update an app that is running, it creates a marker file named

/app/.updated

inside the sandbox for each running instance.

That makes it very easy for the app to find out when it has been updated, by just monitoring this file. Once the file appears, it can pop up a dialog that offers the user the chance to restart the newer version of the app. A good quality implementation of this will of course save and restore the state when doing this.
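A crude way to watch for the marker from a shell inside the sandbox, assuming inotify-tools happen to be available there, is simply:

# illustrative only: block until flatpak creates the marker, then tell the user
while ! [ -e /app/.updated ]; do
    inotifywait -qq -e create -e moved_to /app
done
echo "A newer version has been installed; restart the app to pick it up."

A real application would of course use a file monitor from its toolkit rather than a shell loop.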

Screencast: https://blogs.gnome.org/mclasen/files/2018/08/Screencast-from-08-02-2018-125142-PM.webm

Voilà, updates made easy!

You can find the working example in the portal-test repository.

A Fedora COPR for libinput git master

Posted by Peter Hutterer on August 01, 2018 01:07 AM

To make testing libinput git master easier, I set up a whot/libinput-git Fedora COPR yesterday. This repo gets the push triggers directly from GitLab so it will rebuild with whatever is currently on git master.

To use the COPR, simply run:


sudo dnf copr enable whot/libinput-git
sudo dnf upgrade libinput
This will give you the libinput package from git. It'll have a date/time/git sha based NVR, e.g. libinput-1.11.901-201807310551git22faa97.fc28.x86_64. Easy to spot at least.

To revert back to the regular Fedora package run:


sudo dnf copr disable whot/libinput-git
sudo dnf distro-sync "libinput-*"

Disclaimer: This is an automated build so not every package is tested. I'm running git master exclusively (from a ninja install) and I don't push to master unless the test suite succeeds. So the risk for ending up with a broken system is low.

On that note: if you are maintaining a similar repo for other distributions and would like me to add a push trigger in GitLab for automatic rebuilds, let me know.

Porting Coreboot to the 51NB X210

Posted by Matthew Garrett on July 31, 2018 05:28 AM
The X210 is a strange machine. A set of Chinese enthusiasts developed a series of motherboards that slot into old Thinkpad chassis, providing significantly more up to date hardware. The X210 has a Kabylake CPU, supports up to 32GB of RAM, has an NVMe-capable M.2 slot and has eDP support - and it fits into an X200 or X201 chassis, which means it also comes with a classic Thinkpad keyboard. We ordered some from a Facebook page (a process that involved wiring a large chunk of money to a Chinese bank which wasn't at all stressful), and a couple of weeks later they arrived. Once I'd put mine together I had a quad-core i7-8550U with 16GB of RAM, a 512GB NVMe drive and a 1920x1200 display. I'd transplanted over the drive from my XPS13, so I was running stock Fedora for most of this development process.

The other fun thing about it is that none of the firmware flashing protection is enabled, including Intel Boot Guard. This means running a custom firmware image is possible, and what would a ridiculous custom Thinkpad be without ridiculous custom firmware? A shadow of its potential, that's what. So, I read the Coreboot[1] motherboard porting guide and set to.

My life was made a great deal easier by the existence of a port for the Purism Librem 13v2. This is a Skylake system, and Skylake and Kabylake are very similar platforms. So, the first job was to just copy that into a new directory and start from there. The first step was to update the Inteltool utility so it understood the chipset - this commit shows what was necessary there. It's mostly just adding new PCI IDs, but it also needed some adjustment to account for the GPIO allocation being different on mobile parts when compared to desktop ones. One thing that bit me - Inteltool relies on being able to mmap() arbitrary bits of physical address space, and the kernel doesn't allow that if CONFIG_STRICT_DEVMEM is enabled. I had to disable that first.

The GPIO pins got dropped into gpio.h. I ended up just pushing the raw values into there rather than parsing them back into more semantically meaningful definitions, partly because I don't understand what these things do that well and largely because I'm lazy. Once that was done, on to the next step.

High Definition Audio devices (or HDA) have a standard interface, but the codecs attached to the HDA device vary - both in terms of their own configuration, and in terms of dealing with how the board designer may have laid things out. Thankfully the existing configuration could be copied from /sys/class/sound/card0/hwC0D0/init_pin_configs[2] and then hda_verb.h could be updated.

One more piece of hardware-specific configuration is the Video BIOS Table, or VBT. This contains information used by the graphics drivers (firmware or OS-level) to configure the display correctly, and again is somewhat system-specific. This can be grabbed from /sys/kernel/debug/dri/0/i915_vbt.

A lot of the remaining platform-specific configuration has been split out into board-specific config files, and this also needed updating. Most stuff was the same, but I confirmed the GPE and genx_dec register values by using Inteltool to dump them from the vendor system and copy them over. lspci -t gave me the bus topology and told me which PCIe root ports were in use, and lsusb -t gave me port numbers for USB. That let me update the root port and USB tables.
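None of this needed special tooling beyond what's mentioned above; on the running vendor system the data can be collected along these lines (paths as described earlier):

cat /sys/class/sound/card0/hwC0D0/init_pin_configs    # HDA codec pin defaults
sudo cp /sys/kernel/debug/dri/0/i915_vbt vbt.bin      # Video BIOS Table
lspci -t                                              # PCIe bus topology and root ports
lsusb -t                                              # USB port numbering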

The final code update required was to tell the OS how to communicate with the embedded controller. Various ACPI functions are actually handled by this autonomous device, but it's still necessary for the OS to know how to obtain information from it. This involves writing some ACPI code, but that's largely a matter of cutting and pasting from the vendor firmware - the EC layout depends on the EC firmware rather than the system firmware, and we weren't planning on changing the EC firmware in any way. Using ifdtool told me that the vendor firmware image wasn't using the EC region of the flash, so my assumption was that the EC had its own firmware stored somewhere else. I was ready to flash.

The first attempt involved isis' machine, using their Beaglebone Black as a flashing device - the lack of protection in the firmware meant we ought to be able to get away with using flashrom directly on the host SPI controller, but using an external flasher meant we stood a better chance of being able to recover if something went wrong. We flashed, plugged in the power and… nothing. Literally. The power LED didn't turn on. The machine was very, very dead.

Things like managing battery charging and status indicators are up to the EC, and the complete absence of anything going on here meant that the EC wasn't running. The most likely reason for that was that the system flash did contain the EC's firmware even though the descriptor said it didn't, and now the system was very unhappy. Worse, the flash wouldn't speak to us any more - the power supply from the Beaglebone to the flash chip was sufficient to power up the EC, and the EC was then holding onto the SPI bus desperately trying to read its firmware. Bother. This was made rather more embarrassing because isis had explicitly raised concern about flashing an image that didn't contain any EC firmware, and now I'd killed their laptop.

After some digging I was able to find EC firmware for a related 51NB system, and looking at that gave me a bunch of strings that seemed reasonably identifiable. Looking at the original vendor ROM showed very similar code located at offset 0x00200000 into the image, so I added a small tool to inject the EC firmware (basing it on an existing tool that does something similar for the EC in some HP laptops). I now had an image that I was reasonably confident would get further, but we couldn't flash it. Next step seemed like it was going to involve desoldering the flash from the board, which is a colossal pain. Time to sleep on the problem.

The next morning we were able to borrow a Dediprog SPI flasher. These are much faster than doing SPI over GPIO lines, and also support running the flash at different voltages. At 3.5V the behaviour was the same as we'd seen the previous night - nothing. According to the datasheet, the flash required at least 2.7V to run, but flashrom listed 1.8V as the next lower voltage so we tried. And, amazingly, it worked - not reliably, but sufficiently. Our hypothesis is that the chip is marginally able to run at that voltage, but that the EC isn't - we were no longer powering the EC up, so we could communicate with the flash. After a couple of attempts we were able to write enough that we had EC firmware on there, at which point we could shift back to flashing at 3.5V because the EC was leaving the flash alone.

So, we flashed again. And, amazingly, we ended up staring at a UEFI shell prompt[3]. USB wasn't working, and nor was the onboard keyboard, but we had graphics and were executing actual firmware code. I was able to get USB working fairly quickly - it turns out that Linux numbers USB ports from 1 and the FSP numbers them from 0, and fixing that up gave us working USB. We were able to boot Linux! Except there were a whole bunch of errors complaining about EC timeouts, and also we only had half the RAM we should.

After some discussion on the Coreboot IRC channel, we figured out the RAM issue - the Librem13 only has one DIMM slot. The FSP expects to be given a set of i2c addresses to probe, one for each DIMM socket. It is then able to read back the DIMM configuration and configure the memory controller appropriately. Running i2cdetect against the system SMBus gave us a range of devices, including one at 0x50 and one at 0x52. The detected DIMM was at 0x50, which made 0x52 seem like a reasonable bet - and grepping the tree showed that several other systems used 0x52 as the address for their second socket. Adding that to the list of addresses and passing it to the FSP gave us all our RAM.

So, now we just had to deal with the EC. One thing we noticed was that if we flashed the vendor firmware, ran it, flashed Coreboot and then rebooted without cutting the power, the EC worked. This strongly suggested that there was some setup code happening in the vendor firmware that configured the EC appropriately, and if we duplicated that it would probably work. Unfortunately, figuring out what that code was was difficult. I ended up dumping the PCI device configuration for the vendor firmware and for Coreboot in case that would give us any clues, but the only thing that seemed relevant at all was that the LPC controller was configured to pass io ports 0x4e and 0x4f to the LPC bus with the vendor firmware, but not with Coreboot. Unfortunately the EC was supposed to be listening on 0x62 and 0x66, so this wasn't the problem.

I ended up solving this by using UEFITool to extract all the code from the vendor firmware, and then disassembled every object and grepped them for port io. x86 systems have two separate io buses - memory and port IO. Port IO is well suited to simple devices that don't need a lot of bandwidth, and the EC is definitely one of these - there's no way to talk to it other than using port IO, so any configuration was almost certainly happening that way. I found a whole bunch of stuff that touched the EC, but was clearly depending on it already having been enabled. I found a wide range of cases where port IO was being used for early PCI configuration. And, finally, I found some code that reconfigured the LPC bridge to route 0x4e and 0x4f to the LPC bus (explaining the configuration change I'd seen earlier), and then wrote a bunch of values to those addresses. I mimicked those, and suddenly the EC started responding.

It turns out that the writes that made this work weren't terribly magic. PCs used to have a SuperIO chip that provided most of the legacy port functionality, including the floppy drive controller and parallel and serial ports. Individual components (called logical devices, or LDNs) could be enabled and disabled using a sequence of writes that was fairly consistent between vendors. Someone on the Coreboot IRC channel recognised that the writes that enabled the EC were simply using that protocol to enable a series of LDNs, which apparently correspond to things like "Working EC" and "Working keyboard". And with that, we were done.

Coreboot doesn't currently have ACPI support for the latest Intel graphics chipsets, so right now my image doesn't have working backlight control. Backlight control also turned out to be interesting. Most modern Intel systems handle the backlight via registers in the GPU, but the X210 uses the embedded controller (possibly because it supports both LVDS and eDP panels). This means that adding a simple display stub is sufficient - all we have to do on a backlight set request is store the value in the EC, and it does the rest.

Other than that, everything seems to work (although there's probably a bunch of power management optimisation to do). I started this process knowing almost nothing about Coreboot, but thanks to the help of people on IRC I was able to get things working in about two days of work[4] and now have firmware that's about as custom as my laptop.

[1] Why not Libreboot? Because modern Intel SoCs haven't had their memory initialisation code reverse engineered, so the only way to boot them is to use the proprietary Intel Firmware Support Package.
[2] Card 0, device 0
[3] After a few false starts - it turns out that the initial memory training can take a surprisingly long time, and we kept giving up before that had happened
[4] Spread over 5 or so days of real time


libinput now has ReadTheDocs-style documentation

Posted by Peter Hutterer on July 30, 2018 04:16 AM

libinput's documentation started out as doxygen of the developer API - they were the main target 4 years ago. Over time, more and more extra documentation was added and now most of it is aimed at users (for self-debugging and troubleshooting or just to explain concepts and features). Unfortunately, with doxygen this all ends up in the "Related Pages". The developer API documentation itself became a less important part, by now all the major compositors have libinput support and it doesn't change much. So while it needs to be there, most of the traffic goes to the user documentation (I think, it's not like I'm running stats).

Something more suited for prose-style docs was needed. I prefer the RTD look so last week I converted most of the libinput documentation into RST format and it's now built with sphinx and the RTD theme. Same URL as before: http://wayland.freedesktop.org/libinput/doc/latest/.

The biggest difference is that the Developer API Documentation (still doxygen) is now at http://wayland.freedesktop.org/libinput/doc/latest/api/, (i.e. add /api/ to the link). If you're programming against libinput's API (e.g. because you're writing a compositor), that's where you need to go.

It's still basically the same content as before, I'll be tidying things up and adding to it over the next few weeks. Hopefully without breaking existing links. There is probably detritus from the doxygen → rst change floating around, I'll be fixing that too. If you want to help out please don't hesitate, I'll do my best to be quick to review any merge requests.

ASG! 2018 CfP Closes TODAY

Posted by Lennart Poettering on July 29, 2018 10:00 PM

The All Systems Go! 2018 Call for Participation Closes TODAY!

The Call for Participation (CFP) for All Systems Go! 2018 will close TODAY, on 30th of July! We’d like to invite you to submit your proposals for consideration to the CFP submission site quickly!

ASG image

All Systems Go! is everybody's favourite low-level Userspace Linux conference, taking place in Berlin, Germany in September 28-30, 2018.

For more information please visit our conference website!

The Ascendance of nftables

Posted by Dan Williams on July 27, 2018 07:20 PM
The Sun sets on iptables (image by fdecomite, CC BY 2.0)

iptables is the default Linux firewall and packet manipulation tool. If you’ve ever been responsible for a Linux machine (aside from an Android phone perhaps) then you’ve had to touch iptables. It works, but that’s about the best thing anyone can say about it.

At Red Hat we’ve been working hard to replace iptables with its successor, nftables, which has actually been around for years but for various reasons was unable to completely replace iptables. Until now.

What’s Wrong With iptables?

iptables is slow. It processes rules linearly which was fine in the days of 10/100Mbit ethernet. But we can do better, and nftables does; it uses maps and concatenations to touch packets as little as possible for a given action.

Most of nftables’ intelligence is in the userland tools rather than the kernel, reducing the possibility for downtime due to kernel bugs. iptables puts most of its logic in the kernel and you can guess where that leads.

When adding or updating even a single rule, iptables must read the entire existing table from the kernel, make the change, and send the whole thing back. iptables also requires locking workarounds to prevent parallel processes from stomping on each other or returning errors. Updating an entire table requires some synchronization across all CPUs meaning the more CPUs you have, the longer it takes. These issues cause problems in container orchestration systems (like OpenShift and Kubernetes) where 100,000 rules and 15 second iptables-restore runs are not uncommon. nftables can update one or many rules without touching any of the others.

iptables requires duplicate rules for IPv4 and IPv6 packets and for multiple actions, which just makes the performance and maintenance problems worse. nftables allows the same rule to apply to both IPv4 and IPv6 and supports multiple actions in the same rule, keeping your ruleset small and simple.
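To make that concrete, a single rule in the inet family covers both IPv4 and IPv6, and an anonymous set keeps multiple ports in one rule. A minimal sketch that creates its own table and chain:

sudo nft add table inet filter
sudo nft add chain inet filter input '{ type filter hook input priority 0; }'
sudo nft add rule inet filter input tcp dport '{ 22, 80, 443 }' accept
sudo nft list ruleset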

If you’ve ever had to log or debug iptables, you know how awful that can be. nftables allows logging and other actions in the same rule, saving you time, effort, and cirrhosis of the liver. It also provides the “nft monitor trace” command to watch how rules apply to live packets.

nftables also uses the same netlink API infrastructure as other modern kernel systems like /sbin/ip, the Wi-Fi stack, and others, so it’s easier to use in other programs without resorting to command-line parsing and execing random binaries.

Finally, nftables has integrated set support with consistent syntax rather than requiring a separate tool like ipset.

What about eBPF?

You might have heard that eBPF will replace everything and give everyone a unicorn. It might, if/when it gets enhancements for accountability, traceability, debuggability, auditability, and broad driver support for XDP. But nftables has been around for years and has most (all?) of these things today.

nftables Everywhere

I’d like to highlight the great work by members of my team to bring nftables over the finish line:

  • Phil Sutter is almost done with compat versions of arptables and ebtables and has been adding testcases everywhere. He also added a JSON interface to libnftables (much like /sbin/ip) for easier programmatic use which firewalld will use in the near future.
  • Eric Garver updated firewalld (the default firewall manager on Fedora, RHEL, and other distros) to use nftables by default. This change alone will seamlessly flip the nftables switch for countless users. It’s a huge deal.
  • Florian Westphal figured out how to make nftables and iptables NAT coexist in the kernel. He also fixed up the iptables compat commands and handles the upstream releases to make sure we can actually use this stuff.
  • And of course the upstream netfilter community!

Thanks iptables; it’s been a nice ride. But nftables is better.

 

Why it's not a good idea to handle evdev directly

Posted by Peter Hutterer on July 25, 2018 02:34 AM

Gather round children, it's story time. Especially for you children who lurk on /r/linux and think you may learn something there. Today, I'll tell you a horror story. The one where we convert kernel input events into touchpad events, with the subtle subtitle of "friends don't let friends handle evdev events".

The question put forward is "why do we need libinput at all", when, as frequently suggested on the usual websites, it's sufficient to just read evdev data and there's really no need for libinput. That is of course true. You can use evdev events from the kernel directly. Did you know that the events the kernel gives you are absolute coordinates? And that not all touchpads have buttons? Or that some touchpads have specific event sequences that need to be filtered? No? Well, boy, are you in for a few surprises! Anyway, let's go and handle evdev events ourselves and write our own libmyinput.

How do we know something is a touchpad? Well, we look at the exposed evdev bits. We need ABS_X, ABS_Y and BTN_TOOL_FINGER but don't want INPUT_PROP_DIRECT. If the latter bit is set then we have a touchscreen (probably). We don't actually care about buttons here, that comes later. ABS_X and ABS_Y give us device-absolute coordinates. On touch down you get the evdev frame of "a finger is down at x/y device units from the top-left". As you move around, you get the x/y coordinate updates. The data itself is exactly the same as you would get from a touchscreen, but we know it's a touchpad because we queried the other bits at startup. So your first job is to convert the absolute x/y coordinates to deltas by subtracting the previous position.

Touchpads have different resolutions for x and y so a delta of 10/10 does not mean it's a 45-degree movement. Better check with the resolution to convert this to physical distances to be on the safe side. Oh, btw, the axes aren't reliable. The min/max ranges and the resolutions are wrong on a large number of touchpads. Luckily systemd fixes this for you with the 60-evdev.hwdb. But I should probably note that hwdb only exists because of libinput... Either way, you don't have to care about it because the road's already paved. You're welcome.

Oh wait, you do have to care a little because there are touchpads (e.g. HP Stream 11, ZBook Studio G3, ...) where bits are missing or wrong. So you better write a device database that tells you when you have to correct the evdev bits. You could implement this as a config option but that's just saying "I know what's wrong here, I know how to fix it but I'm still going to make you google for it and edit a local configuration file to make it work". You could treat your users this way, but you really shouldn't.

As you're happily processing your deltas, you notice that on some touchpads you get motion before you touch the touchpad. Ooops, we need a way to tell whether a finger is down. Luckily the kernel gives you BTN_TOUCH for that event, so you switch your implementation to only calculate deltas when BTN_TOUCH is set. But then you realise that is effectively a hardcoded threshold in the kernel and does not match a lot of devices. Some devices require too-hard finger pressure to trigger BTN_TOUCH, others send it on super-light pressure or even while hovering. After grinding some enamel away you find that many touchpads give you ABS_PRESSURE. Awesome, let's make touches pressure-based instead. Let's use a threshold, no, I mean a device-specific threshold (because if two touchpads would be the same the universe will stop doing whatever a universe does, I clearly haven't thought this through). Luckily we already have the device database so we just add the thresholds there.

Oh, if you want this to run on an Apple touchpad you'd better implement touch size handling (ABS_MT_TOUCH_MAJOR/ABS_MT_TOUCH_MINOR). These axes give you the size of the touching ellipse which is great. Except that the value is just an arbitrary number range that has no relation to physical properties, so better update your database so you can add those thresholds.

Ok, now we have single-finger handling in our libnotinput. Let's add some sophisticated touchpad features like button clicks. Buttons are easy, the kernel gives us BTN_LEFT and BTN_RIGHT and, if you're lucky, BTN_MIDDLE. Unless you have a clickpad of course in which case you only ever get BTN_LEFT because the whole touchpad can be depressed (much like you, if you continue writing your own evdev handling). Those clickpads are in the majority of laptops these days, so we have to deal with them. The two approaches we have are "software button areas" and "clickfinger". The former detects where your finger is when you push the touchpad down - if it's in the bottom right corner we convert the kernel's BTN_LEFT to a BTN_RIGHT and pass that on. Decide how big the buttons will be (note: some touchpads that need software buttons are only 50mm high, others exceed 100mm height). Whatever size you choose, it's an invisible line on the touchpad. Do you know yet how you will handle a finger that moves from outside the button area into the button area before the click? Or the other way round? Maybe add this to your todo list for fixing later.

Maybe "clickfinger" is easier? It counts how many fingers are on the touchpad when clicking (1 finger == left click, 2 fingers == right click, 3 fingers == middle click). Much easier, except that so far we only handle one finger. The easy fix is to use BTN_TOOL_DOUBLETAP and BTN_TOOL_TRIPLETAP which are bitflags that tell you when a second/third finger are down. Add that to your libthisisnotlibinput. Coincidentally, users often click with their thumb while moving. So you have one finger moving the pointer, then a thumb click. Two fingers down but the user doesn't perceive it as such, this should be a left click. Oops, we don't actually know where the second finger is.

Let's switch our libstillnotlibinput to use ABS_MT_POSITION_X and ABS_MT_POSITION_Y because that gives us per-finger position information (once you understand how the kernel's MT protocol slots work). And when I say "switch" of course I meant "add" because there are still touchpads in use that don't support multitouch so you get to keep both implementations. There are also a bunch of touchpads that can give you the position of two fingers but not of the third. Wipe that tear away and pencil that into your todo list. I haven't mentioned semi-mt devices yet that will give you multitouch position data for two fingers but it won't track them correctly - the first touch position is always the top/left of the bounding box, the second touch is always the bottom/right of the bounding box. Do the right thing for our libwhathaveidone and just pretend semi-mt devices are single-touch touchpads. libinput (the real one) does the same because my sanity is something to be cherished.

Oh, on another note, some touchpads don't have any buttons (some Wacom tablets are large touchpads). Add that to your todo list. You wanted middle buttons to work? Few touchpads have a middle button (clickpads never do anyway). Better write a middle button emulation system that generates BTN_MIDDLE when both buttons are pressed. Or when a finger is on the left and another finger is on the right software button. Or when a finger is in a virtual middle button area. All these need to be present because if not, you get dissed by users for not implementing their favourite interaction method.

So we're several paragraphs in and so far we have: finger tracking and some button handling. And a bunch of things on the todo list. We haven't even started with other fancy features like edge scrolling, two-finger scrolling, pinch/swipe gestures or thumb and palm detection. Oh, and you're not yet handling any other devices like graphics tablets which are a world of their own. If you think all the other features and devices are any less of a mess... well, an Austrian comedian once said (paraphrased): "optimism is just a fancy word for ignorance".

All this is just handling features that users have come to expect. Examples of non-features that you'll have to implement: on some Lenovo series (*50 and newer) you will get a pointer jump after a series of events that only have pressure information. You'll have to detect and discard that jump. The HP Pavilion DM4 touchpad has random jumps in the slot data. Synaptics PS/2 touchpads may 'randomly' end touches and restart them on the next event frame 10ms later. If you don't handle that you'll get ghost taps. And so on and so forth.

So as you, happily or less so, continue writing your libthisismoreworkthanexpected you'll eventually come to realise that you're just reimplementing libinput. Congratulations or condolences, whichever applies.

libinput's raison d'etre is that it deals with all the mess above so that compositor authors can be blissfully unaware of all this. That's the reason why all the major/general-purpose compositors have switched to libinput. That's the reason most distributions now use libinput with the X server (through the xf86-input-libinput driver). libinput has made some design decisions that you may disagree with but honestly, that's life. Deal with it. It doesn't even do all I want and I wrote >90% of it. Suggesting that you can just handle evdev directly is like suggesting you can use GPS coordinates directly to navigate. Sure you can, but there's a reason why people instead use a Tom Tom or Google Maps.

ASG! 2018 CfP Closes Soon

Posted by Lennart Poettering on July 22, 2018 10:00 PM

The All Systems Go! 2018 Call for Participation Closes in One Week!

The Call for Participation (CFP) for All Systems Go! 2018 will close in one week, on 30th of July! We’d like to invite you to submit your proposals for consideration to the CFP submission site quickly!

ASG image

Notification of acceptance and non-acceptance will go out within 7 days of the closing of the CFP.

All topics relevant to foundational open-source Linux technologies are welcome. In particular, however, we are looking for proposals including, but not limited to, the following topics:

  • Low-level container executors and infrastructure
  • IoT and embedded OS infrastructure
  • BPF and eBPF filtering
  • OS, container, IoT image delivery and updating
  • Building Linux devices and applications
  • Low-level desktop technologies
  • Networking
  • System and service management
  • Tracing and performance measuring
  • IPC and RPC systems
  • Security and Sandboxing

While our focus is definitely more on the user-space side of things, talks about kernel projects are welcome, as long as they have a clear and direct relevance for user-space.

For more information please visit our conference website!

Flatpak – a look behind the portal

Posted by Matthias Clasen on July 19, 2018 04:58 PM

Flatpak allows sandboxed apps to interact with the rest of the system via portals. Portals are simply D-Bus services that are designed to be safe to expose to untrusted apps.

Principles

There are several principles that have guided the design of the existing portals.

 Keep the user in control

To achieve this, most portals will show a dialog to let the user accept or deny the application’s request. This is not a hard rule — in some cases, a dialog is just not practical.

Avoid yes/no questions

Direct questions about permissions tend to be dismissed without much thought, since they get in the way of the task at hand. Therefore, portals avoid this kind of question whenever possible and instead just let the user get on with the task.

For example, when an app is requesting to open a file on the host, we just present the user with a file chooser. By selecting a file, the user implicitly grants the application access to the file. Or he can cancel the file selection and implicitly deny the application’s request.

Don’t be annoying

Nothing is worse than having to answer the same question over and over. Portals make use of a database to record previous decisions and avoid asking repeatedly for the same thing.

Practice

The database used by portals is called the permission store. The permission store is organized in tables, with a table for each portal that needs one. It has a D-Bus api, but it is more convenient to explore it using the recently added flatpak commands:

flatpak permission-list
flatpak permission-list devices
flatpak permission-list desktop-used-apps video/webm

The first command will list all permissions in all tables, the second will show the content of the “devices” table, and the last one will show just the row for video/webm in the “desktop-used-apps” table.

There are also commands that deal with permissions on a per-application basis.

flatpak permission-show org.gnome.Epiphany
flatpak permission-reset org.gnome.Epiphany

The first command will show all the permissions that apply to the application, the second will remove all permissions for the application.

And more…

The most important table in the permission store is the “documents” table, where the documents portal stores information about host files that have been exported for applications. The documents portal makes the exported files available via a fuse filesystem at

/run/user/1000/doc

A useful subdirectory here is by-app, where the exported files are visible on a per-application basis (when setting up a sandbox, flatpak makes only this part of the document store available inside the sandbox).

It is instructive to browse this filesystem, but flatpak also has a dedicated set of commands for exploring the contents of the documents portal.

flatpak document-list
flatpak document-list org.gnome.Epiphany

The first command lists all exported files, the second shows only the files that are exported for an individual application.

flatpak document-info $HOME/example.pdf

This command shows information about a file that is exported in the document portal, such as which applications have access to it, and what they are allowed to do.

Lastly, there are document-export and document-unexport commands that allow you to add or remove files from the document portal.
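For example (the --app option and the file path are just for illustration):

flatpak document-export --app=org.gnome.Epiphany $HOME/example.pdf
flatpak document-unexport $HOME/example.pdf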

Summary

If you want to explore how portals work, or just need to double-check which files an app has access to, flatpak has tools that let you do so conveniently.