Fedora desktop Planet

High resolution wheel scrolling in the desktop stack

Posted by Peter Hutterer on April 04, 2020 04:00 AM

This is a follow up from the kernel support for high-resolution wheel scrolling which you totally forgot about because it's already more than a year in the past and seriously, who has the attention span these days to remember this. Anyway, I finally found time and motivation to pick this up again and I started lining up the pieces like cans, for it only to be shot down by the commentary of strangers on the internet. The Wayland merge request lists the various pieces (libinput, wayland, weston, mutter, gtk and Xwayland) but for the impatient there's also a Fedora 32 COPR. For all you weirdos inexplicably not running the latest Fedora, well, you'll have to compile this yourself, just like I did.

Let's recap: in v5.0 the kernel added new axes REL_WHEEL_HI_RES and REL_HWHEEL_HI_RES for all devices. On devices that actually support high-resolution wheel scrolling (Logitech and Microsoft mice, primarily) you'll get multiple hires events before the now-legacy REL_WHEEL events. On all other devices those two are in sync.
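To make this concrete (with made-up values): the hi-res axes count in fractions of 120, where 120 equals one wheel click. A wheel that reports four times per click would thus send something like this, the legacy event tagging along in the last frame:

EV_REL REL_WHEEL_HI_RES 30
EV_SYN SYN_REPORT 0
EV_REL REL_WHEEL_HI_RES 30
EV_SYN SYN_REPORT 0
EV_REL REL_WHEEL_HI_RES 30
EV_SYN SYN_REPORT 0
EV_REL REL_WHEEL_HI_RES 30
EV_REL REL_WHEEL 1
EV_SYN SYN_REPORT 0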

Integrating this into the userspace stack was a bit of a mess at first, but I think the solution is good enough, even if it has a rather verbose explanation on how to handle it. The actual patches to integrate ended up being relatively simple. So let's see why it's a bit weird:

When Wayland started, back in WhoahReallyThatLongAgo, scrolling was specified as the wl_pointer.axis event with a value in pixels. This works fine for touchpads, not so much for wheels. The early versions of Weston decreed that one wheel click was 10 pixels [1] and, perhaps surprisingly, the world kept on turning. When libinput was forked from Weston an early change was that wheel events would have two values - degrees of movement and click count ("discrete steps"). The wayland protocol was expanded to include the discrete steps as wl_pointer.axis_discrete as well. Then backwards compatibility reared its ugly head and Mutter, Weston, GTK all basically said: one discrete step equals 10 pixels so we multiply the discrete value by 10 and, perhaps surprisingly, the world kept on turning.

This worked out well enough for a few years but with high resolution wheels we ran into a problem. Discrete steps are integers, so we can't send partial values. And the protocol is defined in a way that any tweaking of the behaviour would result in broken clients which, perhaps surprisingly, is a Bad Thing. This led to the current proposal of separate events: LIBINPUT_EVENT_POINTER_AXIS_WHEEL for libinput and, for Wayland, the wl_pointer.axis_v120 event linked to above. These events are (like the kernel events) a parallel event stream to the previous events and effectively replace the LIBINPUT_EVENT_POINTER_AXIS and Wayland wl_pointer.axis/axis_discrete pair for wheel events (not so for touchpad or button scrolling though).

The compositor side of things is relatively simple: take the events from libinput and pass the hires ones as v120 events and the lowres ones as v120 events with a value of zero. The client side takes the v120 events and uses them over wl_pointer.axis/axis_discrete unless one is zero in which case you can discard all axis events in that wl_pointer.frame. Since most client implementations already have the support for smooth scrolling (because, well, touchpads do exist) it's relatively simple to integrate - the new events just feed into the smooth scrolling code. And since you already have to do wheel emulation for that (because, well, old clients exist) wheel emulation is handled easily too.
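In rough pseudo-code (a sketch of the logic above, not any particular toolkit's actual code), the client side looks like this:

on_pointer_frame(frame):
    if frame has axis_v120 events:
        if any axis_v120 value is 0:
            # low-res wheel relayed by a v120-aware compositor:
            # discard all axis events in this frame, the hi-res
            # frames already cover this movement
            return
        # hi-res wheel: 120 units equal one click, feed it into
        # the existing smooth-scrolling path
        scroll_smoothly(frame.axis_v120 / 120)
    else:
        # touchpad or button scrolling, or a pre-v120 compositor:
        # keep using wl_pointer.axis/axis_discrete
        handle_axis_and_axis_discrete(frame)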

All that to provide buttery smooth [2] wheel scrolling. Or not, if your hardware doesn't support it. In which case, well, live with the warm fuzzy feeling that someone else has a better user experience now. Or soon, anyway.

[1] with, I suspect, the scientific measurement of "yeah, that seems about alright"
[2] like butter out of a fridge, so still chunky but at least less so than before

PAM testing using pam_wrapper and dbusmock

Posted by Bastien Nocera on April 01, 2020 04:53 PM
On the road to libfprint and fprintd 2.0, we've been fixing some long-standing bugs, including one that required porting our PAM module from dbus-glib to sd-bus, systemd's D-Bus library implementation.

As you can imagine, I have confidence in my ability to write bug-free code at the first attempt, but the foresight to know that this code will be buggy if it's not tested (and to know there's probably a bug in the tests if they run successfully the first time around). So we will have to test that PAM module, thoroughly, before and after the port.

Replacing fprintd

First, to make it easier to run and instrument, we needed to replace fprintd itself. For this, we used dbusmock, which is both a convenience Python library and a way to write instrumentable D-Bus services, and wrote a template. There are a number of existing templates for a lot of session and system services, in case you want to test the integration of your code with NetworkManager, low-memory-monitor, or any number of other services.

We then used this to write tests for the command-line utilities, so we can both test our new template and test the command-line utilities themselves.
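To give an idea of the shape of such a test - a sketch only, where the template path, its (empty) parameters and the expected output are stand-ins rather than the real fprintd template's interface:

import subprocess
import unittest

import dbusmock


class TestFprintdUtils(dbusmock.DBusTestCase):
    @classmethod
    def setUpClass(cls):
        # fprintd lives on the system bus, so mock that one
        cls.start_system_bus()
        cls.dbus_con = cls.get_dbus(system_bus=True)

    def setUp(self):
        # spawn the mock fprintd from a local template file
        self.p_mock, self.obj_fprintd = self.spawn_server_template(
            'fprintd.py', {}, stdout=subprocess.PIPE)

    def tearDown(self):
        self.p_mock.terminate()
        self.p_mock.wait()

    def test_list(self):
        # exercise the real command-line utility against the mock
        out = subprocess.check_output(['fprintd-list', 'testuser'],
                                      universal_newlines=True)
        self.assertIn('testuser', out)


if __name__ == '__main__':
    unittest.main()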

Replacing gdm

Now that we've got a way to replace fprintd, and with it the physical fingerprint reader, we should write some tests for the (old) PAM module. That also means standing in for the services that drive it: sudo, gdm, or the other login authentication services.

Co-workers Andreas Schneider and Jakub Hrozek worked on pam_wrapper, an LD_PRELOAD library to mock the PAM library, and Python helpers to write simple PAM services. This LWN article explains how to test PAM applications and PAM modules.
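For a feel of the shape of such a test, here's a sketch using the pypamtest helpers; the user, service name and password are placeholders, and the test has to run with pam_wrapper preloaded (LD_PRELOAD=libpam_wrapper.so PAM_WRAPPER=1, with PAM_WRAPPER_SERVICE_DIR pointing at a test PAM service file):

import pypamtest  # the Python helpers shipped with pam_wrapper

# one test case: a plain authentication attempt
tc = pypamtest.TestCase(pypamtest.PAMTEST_AUTHENTICATE)

# runs the conversation against the PAM service file found in
# PAM_WRAPPER_SERVICE_DIR; raises an exception if it fails
pypamtest.run_pamtest("testuser", "test_service", [tc], ["secret_password"])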

After fixing a few bugs in pam_wrapper, and combining it with the fprintd dbusmock work above, we could wrap and test the fprintd PAM module like never before.

Porting to sd-bus

Finally, porting the PAM module to sd-bus was pretty trivial: a loop of 1) writing tests that work against the old PAM module, 2) porting a section of the code (like the fingerprint reader enumeration, or the timeout support), and 3) testing against the new sd-bus based code. The result was no regressions that we could detect.

Conclusion

Both dbusmock and pam_wrapper are useful tools in your arsenal to write tests, and given the (fairly) easy-to-use CI in GNOME and FreeDesktop.org's GitLab instances, it would be a shame not to.

You might also be interested in umockdev, to mock a number of device types, and mocklibc (which, combined with dbusmock, powers polkit's unattended CI).

Sandboxing WebKitGTK Apps

Posted by Michael Catanzaro on March 31, 2020 03:56 PM

When you connect to a Wi-Fi network, that network might block your access to the wider internet until you’ve signed into the network’s captive portal page. An untrusted network can disrupt your connection at any time by blocking secure requests and replacing the content of insecure requests with its login page. (Of course this can be done on wired networks as well, but in practice it mainly happens on Wi-Fi.) To detect a captive portal, NetworkManager sends a request to a special test address (e.g. http://fedoraproject.org/static/hotspot.txt) and checks to see whether the content has been replaced. If so, GNOME Shell will open a little WebKitGTK browser window to display http://nmcheck.gnome.org, which, due to the captive portal, will be hijacked by your hotel or airport or whatever to display the portal login page. Rephrased in security lingo: an untrusted network may cause GNOME Shell to load arbitrary web content whenever it wants. If that doesn’t immediately sound dangerous to you, let’s ask me from four years ago why that might be bad:

Web engines are full of security vulnerabilities, like buffer overflows and use-after-frees. The details don’t matter; what’s important is that skilled attackers can turn these vulnerabilities into exploits, using carefully-crafted HTML to gain total control of your user account on your computer (or your phone). They can then install malware, read all the files in your home directory, use your computer in a botnet to attack websites, and do basically whatever they want with it.

If the web engine is sandboxed, then a second type of attack, called a sandbox escape, is needed. This makes it dramatically more difficult to exploit vulnerabilities.

The captive portal helper will pop up and load arbitrary web content without user interaction, so there’s nothing you as a user could possibly do about it. This makes it a tempting target for attackers, so we want to ensure that users are safe in the absence of a sandbox escape. Accordingly, beginning with GNOME 3.36, the captive portal helper is now sandboxed.

How did we do it? With basically one line of code (plus a check to ensure the WebKitGTK version is new enough). To sandbox any WebKitGTK app, just call webkit_web_context_set_sandbox_enabled(). Ta-da, your application is now magically secure!
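A minimal sketch of that one line, with a compile-time guard since the API only appeared in WebKitGTK 2.26:

#include <webkit2/webkit2.h>

static void
enable_sandbox (WebKitWebContext *context)
{
#if WEBKIT_CHECK_VERSION (2, 26, 0)
    /* must be called before the first web process is spawned,
     * i.e. before any web view has been created */
    webkit_web_context_set_sandbox_enabled (context, TRUE);
#endif
}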

No, really, that’s all you need to do. So if it’s that simple, why isn’t the sandbox enabled by default? It can break applications that use WebKitWebExtension to run custom code in the sandboxed web process, so you’ll need to test to ensure that your application still works properly after enabling the sandbox. (The WebKitGTK sandbox will become mandatory in the future when porting applications to GTK 4. That’s thinking far ahead, though, because GTK 4 isn’t supported yet at all.) You may need to use webkit_web_context_add_path_to_sandbox() to give your web extension access to directories that would otherwise be blocked by the sandbox.

The sandbox is critically important for web browsers and email clients, which are constantly displaying untrusted web content. But really, every app should enable it. Fix your apps! Then thank Patrick Griffis from Igalia for developing WebKitGTK’s sandbox, and the bubblewrap, Flatpak, and xdg-desktop-portal developers for providing the groundwork that makes it all possible.

Initial release of Jcat

Posted by Richard Hughes on March 23, 2020 12:32 PM

Today I released the first official tarball of Jcat, version 0.1.0. I’ve started the process to get the package into Fedora as it will almost certainly be a hard requirement in the next major version of fwupd.

Since I announced Jcat a few weeks ago, I’ve had a lot of positive feedback about the general concept and, surprisingly, one hardware vendor even suggested they might start self-signing their firmware before uploading to the LVFS (which is great!). More LVFS announcements coming soon I promise…

The LVFS has been including Jcat files in archives and generating them for metadata for about three weeks now, and we’ve had no issues reported. Once the package is available in Fedora 32 I’ll merge the fwupd pull request to make it a hard dep. All you other distro package maintainers, please go do your packaging thing!

If anyone finds any oddities or weird behavior, please file an issue. I’m not expecting to make API breaks now, but will if we find a design bug. Most of the code is imported from fwupd, and so I’m pretty comfortable with the general design. Comments welcome.

It's templates all the way down

Posted by Peter Hutterer on March 20, 2020 11:33 AM

Benjamin Tissoires and I have been busy anthophila and working on the freedesktop CI templates. This post is primarily of interest if you're working on GitLab, specifically if your repo is hosted on gitlab.freedesktop.org. If either of those applies, prepare to be distracted from the current pandemic, otherwise maybe just prepare to be entertained. I'll do my best to be less miserable than the news.

We all know that CI/CD really helps with finding bugs early. If you don't know that yet, insert a jedi handwave before the previous sentence and now you do. GitLab is the git forge now used by freedesktop.org and it comes with a built-in CI system. I'm leaving out the difficult bits such as actually setting the thing up because this is obviously all handled by Heinzelmännchen and just readily available, hooray. I'm also going to assume that you roughly know how to write GitLab CI jobs or, failing that, at least know how to read YAML without screaming. So for this post, we start with the basic problem that your .gitlab-ci.yml is getting unwieldy, repetitive or generally just kinda sucks to maintain. Which is roughly where libinput and libevdev were a while back, and which is what caused Benjamin to start the ci-templates.

Now, what do we want? (other than a COVID-19 cure) Reproducible tests, possibly on different distributions, with the same base system across tests. For my repos the goal was basically "test on the common distributions to catch certain bugs early". [1] For Mesa, the requirement is closer to "have a fixed set of images that 'never' change so tests are reproducible". Both goals have much in common.

Your first venture into CI will look like this:


myjob:
  image: fedora:31
  before_script:
    - dnf update -y
    - dnf install -y onepackage twopackage threepackage floor
  script:
    - meson builddir && ninja -C builddir test
So, in short: take a Fedora 31 docker image, update it [2], install the required packages and then run the actual test part - meson and ninja. Easy.

This works fine but it takes approximately forever because dnf update is slow and you're potentially pulling down gigs of packages on every test run. Which is fun, but less so when you have 10 different jobs and they all do that. So let's call this step 1 and pretend we're more advanced than that. Step 2 is where you start building an image you re-use, steps 3 to N are the bits where you learn more than you want to know about docker, podman, skopeo and how many typos you can put into a YAML file. So, ad break, and we jump right to the part where enlightenment is just around the corner or wherever enlightenment lurks these days.

Using the CI Templates

Here's the .gitlab-ci.yml to build a Fedora 31 image with ci-templates and run the test on that image:


include:
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/fedora.yml'

variables:
  # project name of the upstream repo
  FDO_UPSTREAM_REPO: someproject/name

stages:
  - prep
  - test

myimage:
  extends: .fdo.container-build@fedora
  stage: prep
  variables:
    FDO_DISTRIBUTION_VERSION: '31'
    FDO_DISTRIBUTION_PACKAGES: 'onepackage twopackage threepackage floor'
    FDO_DISTRIBUTION_TAG: '2020-03-20.0'

myjob:
  extends: .fdo.distribution-image@fedora
  stage: test
  script:
    - meson builddir && ninja -C builddir test
  variables:
    FDO_DISTRIBUTION_VERSION: '31'
    FDO_DISTRIBUTION_TAG: '2020-03-20.0'
Now, you guessed correctly that the .fdo and FDO_ prefixes are used by the templates. There is a bunch of stuff hidden here. Basically, this will:
  • check if the image exists in your personal project's registry and use that, but if not
  • check if the image exists in the given upstream project's registry and use that, but if not
  • create a Fedora 31 image with the given packages installed and push it with the tag to the registry
  • use that image (whether newly created or pre-existing) and run the tests on it
There are a few more details too, but that's roughly the summary of it. For existing tags, the myimage job effectively becomes a noop and the myjob job will re-use the image. The image will be in your registry so you can podman run it locally to reproduce a bug.

To build a new image, simply change the tag. Either because you want newer packages or you need extra (or fewer) packages. And the nice thing here: you will build a new image as part of your merge request and run the CI against that new image. But upstream and every other MR will keep using the old image - right up until your MR is merged, at which point every (future) MR will use that new updated image.

Want to build a Debian Stretch image? Replace Fedora and 31 with debian and stretch. Same for Ubuntu, CentOS, Alpine and Arch, though for the latter two you don't need a version number.

Templating the templates

"But, but, Peter, I want to test on eleventy different distribution like you do" I hear you say. Well, fear not, for this is where the ci-fairy comes in. How about we *gasp* generate the .gitlab-ci.yml file from a base configuration? That can't possibly be a bad idea, so let's do that! First, we save our configuration into the .gitlab-ci/config.yml:


distributions:
  - name: fedora
    tag: 12345
    version: 30
  - name: ubuntu
    tag: abcde
    version: '19.10'
  # and so on, and so forth

packages:
  - curl
  - wget
  - gcc
There is no specific requirement on the structure of the config file, ci-fairy simply loads it and passes it to Jinja2. Your template could thus look like this .gitlab-ci/ci.template file:

include:
{% for d in distributions %}
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/{{d.name}}.yml'
{% endfor %}

stages:
  - prep
  - test

{% for d in distributions %}

.{{d.name}}.{{d.version}}:
  variables:
    FDO_DISTRIBUTION_VERSION: '{{d.version}}'
    FDO_DISTRIBUTION_TAG: '{{d.tag}}'

myimage.{{d.name}}.{{d.version}}:
  extends:
    - .fdo.container-build@{{d.name}}
    - .{{d.name}}.{{d.version}}
  stage: prep
  variables:
    FDO_DISTRIBUTION_PACKAGES: "{{' '.join(packages)}}"

myjob.{{d.name}}.{{d.version}}:
  extends:
    - .fdo.distribution-image@{{d.name}}
    - .{{d.name}}.{{d.version}}
  stage: test
  script:
    - meson builddir && ninja -C builddir
{% endfor %}
And to locally generate our .gitlab-ci.yml, all we need to do is

$ pip3 install git+http://gitlab.freedesktop.org/freedesktop/ci-templates
$ cd path/to/project
$ ci-fairy generate-template
$ ci-fairy lint # checks the resulting YAML for syntax errors
$ git commit .gitlab-ci.yml
And, for reference, the file we generated here looks like this:

include:
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/fedora.yml'
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/ubuntu.yml'

stages:
  - prep
  - test

.fedora.30:
  variables:
    FDO_DISTRIBUTION_VERSION: '30'
    FDO_DISTRIBUTION_TAG: '12345'

myimage.fedora.30:
  extends:
    - .fdo.container-build@fedora
    - .fedora.30
  stage: prep
  variables:
    FDO_DISTRIBUTION_PACKAGES: "curl wget gcc"

myjob.fedora.30:
  extends:
    - .fdo.distribution-image@fedora
    - .fedora.30
  stage: test
  script:
    - meson builddir && ninja -C builddir

.ubuntu.19.10:
  variables:
    FDO_DISTRIBUTION_VERSION: '19.10'
    FDO_DISTRIBUTION_TAG: 'abcde'

myimage.ubuntu.19.10:
  extends:
    - .fdo.container-build@ubuntu
    - .ubuntu.19.10
  stage: prep
  variables:
    FDO_DISTRIBUTION_PACKAGES: "curl wget gcc"

myjob.ubuntu.19.10:
  extends:
    - .fdo.distribution-image@ubuntu
    - .ubuntu.19.10
  stage: test
  script:
    - meson builddir && ninja -C builddir
Aside from the templating, a new thing here is the (e.g.) .fedora.30 template that we extend from. This is an easy way to avoid having to set things like the distribution version and the tag multiple times. And a few things of note: the tag is job-specific (not distribution-specific). So you could have two Fedora 30 images with two different tags. This is also just an example I typed out; a real-world .gitlab-ci.yml will look more complex and different. So only rely on the above to get an idea of what's possible.

A word for non-gitlab.freedesktop.org users: You can use the remote: include directive to use the templates from elsewhere. ci-fairy isn't tied to freedesktop.org either but you'll have to provide more flags to get what you want instead of relying on the default behaviours.
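For example, with the same placeholder ref as above and assuming GitLab's raw-file URL layout, such an include looks like:

include:
  - remote: 'https://gitlab.freedesktop.org/freedesktop/ci-templates/-/raw/123456deadbeef/templates/fedora.yml'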

The documentation for CI Templates has more, go and peruse my pretties.

[1] For months the CI was basically just a build test because I couldn't run the test suite in a container
[2] Updating isn't always required but sooner or later you run into a dependency issue if you don't

Wayland/X11: How to run Firefox in mixed environment

Posted by Martin Stransky on March 16, 2020 12:47 PM


A mixed X11/Wayland environment is a source of great annoyance when it comes to running Firefox on Wayland alongside other X11 applications, like terminals, mail clients etc. Some X11 applications set the GDK_BACKEND variable to x11, which effectively breaks Wayland applications and causes Firefox to start in X11 mode. That breaks the Firefox remote protocol, which Firefox uses to search for running instances and reuse them.

When a Firefox instance is already running on Wayland and you launch the X11 version, you get the infamous “Firefox is already running” dialog due to the locked Firefox profile.

Firefox 74 ships a new MOZ_DBUS_REMOTE env variable that forces Firefox on X11 to use the D-Bus remote protocol, just as the Wayland version does. So when the X11 Firefox is launched after the Wayland one with the same profile, the already running Wayland instance is reused to open the link instead of greeting you with the “Close Firefox” dialog.

All you need to do is put this line in your ~/.bashrc file and restart your shell:

export MOZ_DBUS_REMOTE=1

Epiphany 3.36 and WebKitGTK 2.28

Posted by Michael Catanzaro on March 11, 2020 03:00 PM

So, what’s new in Epiphany 3.36?

PDF.js

Once upon a time, beginning with GNOME 3.14, Epiphany supported displaying PDF documents via the Evince NPAPI browser plugin developed by Carlos Garcia Campos. Unfortunately, because NPAPI plugins have to use X11-specific APIs to draw web content, this didn’t suffice for very long. When GNOME switched to Wayland by default in GNOME 3.24 (yes, that was three years ago!), this functionality was left behind. Using an NPAPI plugin also meant the code was inherently unsandboxable and tied to a deprecated technology. Epiphany disabled support for NPAPI plugins by default in Epiphany 3.30, hiding the functionality behind a hidden setting, which has now finally been removed for Epiphany 3.36, killing off NPAPI for good.

Jan-Michael Brummer, who comaintains Epiphany with me, tried bringing back PDF support for Epiphany 3.34 using libevince, but eventually we decided to give up on this approach due to difficulty solving some user experience issues. Also, the rendering occurred in the unsandboxed UI process, which was again not good for security.

But PDF support is now back in Epiphany 3.36, and much better than before! Thanks to Jan-Michael, Epiphany now supports displaying PDFs using the amazing PDF.js. We are thankful for Mozilla’s work in developing PDF.js and open sourcing it for us to use. Viewing PDFs in Epiphany using PDF.js is more convenient than downloading them and opening them in Evince, and because the PDF is rendered in the sandboxed web process, using web technologies rather than poppler, it’s also approximately one bazillion times more secure.

<figure aria-describedby="caption-attachment-8741" class="wp-caption aligncenter" id="attachment_8741" style="width: 1232px">Screenshot of Epiphany displaying a PDF document<figcaption class="wp-caption-text" id="caption-attachment-8741">Look, it’s a PDF!</figcaption></figure>

One limitation of PDF.js is that it does not support forms. If you need to fill out PDF forms, you’ll need to download the PDF and open it in Evince, just as you would if using Firefox.

Dark Mode

Thanks to Carlos Garcia, it should finally be possible to use Epiphany with dark GTK themes. WebKitGTK has historically rendered HTML elements using the GTK theme, which was not good for users of dark themes: many websites broke badly, usually due to dark text being drawn on dark backgrounds, or various other problems with unexpected dark widgets. Since WebKitGTK 2.28, WebKit will try to manually change to a light GTK theme when it thinks a dark theme is in use, then use the light theme to render web content. (This work has actually been backported to WebKitGTK 2.26.4, so you don’t need to upgrade to WebKitGTK 2.28 to benefit, but the work landed very recently and we haven’t blogged about it yet.) Thanks to Cassidy James from elementary for providing example pages for testing dark mode behavior.

<figure aria-describedby="caption-attachment-8792" class="wp-caption aligncenter" id="attachment_8792" style="width: 1920px">Screenshot demonstrating broken dark mode support<figcaption class="wp-caption-text" id="caption-attachment-8792">Broken dark mode support prior to WebKitGTK 2.26.4. Notice that the first two pages use dark color schemes when light color schemes are expected, and the dark blue links are hard to read over the dark gray background. Also notice that the text in the second image is unreadable.</figcaption></figure> <figure aria-describedby="caption-attachment-8795" class="wp-caption aligncenter" id="attachment_8795" style="width: 1920px">Screenshot demonstrating fixed dark mode support in WebKitGTK 2.26.4<figcaption class="wp-caption-text" id="caption-attachment-8795">Since WebKitGTK 2.26.4, dark mode works as it does in most other browsers. Websites that don’t support dark mode are light, and websites that do support dark mode are dark. Widgets themed using GTK are always light.</figcaption></figure>

Since Carlos had already added support for the prefers-color-scheme media query last year, this now gets us up to dark mode parity with most browsers, except, notably, Safari. Unlike other browsers, Safari allows websites to opt-in to rendering dark system widgets, like WebKitGTK used to do before these changes. Whether to support this in WebKitGTK remains to-be-determined.
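For reference, and as plain standard CSS rather than anything WebKit-specific, a website opts into dark mode with a media query like this:

@media (prefers-color-scheme: dark) {
    /* applied only when the user asked for a dark color scheme */
    body {
        background-color: #1d1d1d;
        color: #eeeeec;
    }
}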

Process Swap on Navigation (PSON)

PSON, which debuted in Safari 13, is a major change in WebKit’s process model. PSON is the first component of site isolation, which Chrome has supported for some time, and which Firefox is currently working towards. If you care about web security, you should care a lot about site isolation, because the web browser community has arrived at a consensus that this is the best way to mitigate speculative execution attacks.

Nowadays, all modern web browsers use separate, sandboxed helper processes to render web content, ensuring that the main user interface process, which is unsandboxed, does not touch untrusted web content. Prior to 3.36, Epiphany already used a separate web process to display each browser tab (except for “related views,” where one tab opens another and gains scripting ability over the opened tab, subject to the Same Origin Policy). But in Epiphany 3.36, we now also have a separate web process per website. Each tab will swap between different web processes when navigating between different websites, to prevent any one web process from loading content from different websites.

To make these process swap navigations fast, a pool of prewarmed processes is used to hide the startup cost of launching a new process by ensuring the new process exists before it’s needed; otherwise, the overhead of launching a new web process to perform the navigation would become noticeable. And suspended processes live on after they’re no longer in use because they may be needed for back/forward navigations, which use WebKit’s page cache when possible. (In the page cache, pages are kept in memory indefinitely, to make back/forward navigations fast.)

Due to internal refactoring, PSON previously necessitated some API breakage in WebKitGTK 2.26 that affected Evolution and Geary: WebKitGTK 2.26 deprecated WebKit’s single web process model and required that all applications use one web process per web view, which Evolution and Geary were not, at the time, prepared to handle. We tried hard to avoid this, because we hate to make behavioral changes that break applications, but in this case we decided it was unavoidable. That was the status quo in 2.26, without PSON, which we disabled just before releasing 2.26 in order to limit application breakage to just Evolution and Geary. Now, in WebKitGTK 2.28, PSON is finally available for applications to use on an opt-in basis. (It will become mandatory in the future, for GTK 4 applications.) Epiphany 3.36 opts in. To make this work, Carlos Garcia designed new WebKitGTK APIs for cross-process communication, and used them to replace the private D-Bus server that Epiphany previously used for this purpose.

WebKit still has a long way to go to fully implement site isolation, but PSON is a major step down that road. Thanks to Brady Eidson and Chris Dumez from Apple for making this work, and to Carlos Garcia for handling most of the breakage (there was a lot). As with any major intrusive change of such magnitude, regressions are inevitable, so don’t hesitate to report issues on WebKit Bugzilla.

highlight.js

Once upon a time, WebKit had its own implementation for viewing page source, but this was removed from WebKit way back in 2014, in WebKitGTK 2.6. Ever since, Epiphany would open your default text editor, usually gedit, to display page source. Suffice it to say that this was not a very satisfactory solution.

I finally managed to implement view source mode at the Epiphany level for Epiphany 3.30, but I had trouble making syntax highlighting work. I tried using various open source syntax highlighting libraries, but most are designed to highlight small amounts of code, not large web pages. The libraries I tried were not fast enough, so I gave up on syntax highlighting at the time.

Thanks to Jan-Michael, Epiphany 3.36 supports syntax highlighting using highlight.js, so we finally have view source mode working fully properly once again. It works much better than my failed attempts with different JS libraries. Please thank the highlight.js developers for maintaining this library, and for making it open source.

<figure aria-describedby="caption-attachment-8744" class="wp-caption aligncenter" id="attachment_8744" style="width: 1232px">Screenshot displaying Epiphany's view source mode<figcaption class="wp-caption-text" id="caption-attachment-8744">Colors!</figcaption></figure>

Service Workers

Service workers are now available in WebKitGTK 2.28. Our friends at Apple had already implemented service worker support a couple years ago for Safari 11, but we were pretty slow in bringing this functionality to Linux. Finally, WebKitGTK should now be up to par with Safari in this regard.

Cookies!

Patrick Griffis has updated libsoup and WebKitGTK to support SameSite cookies. He’s also tightened up our cookie policy by implementing strict secure cookies, which prevents http:// pages from setting secure cookies (as they could overwrite secure cookies set by https:// pages).
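As a refresher (standard HTTP, nothing WebKit-specific), a site opts in with cookie attributes like these:

Set-Cookie: sessionid=38afes7a8; Secure; HttpOnly; SameSite=Lax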

Adaptive Design

As usual, there are more adaptive design improvements throughout the browser, to provide a better user experience on the Librem 5. There’s still more work to be done here, but Epiphany continues to provide the best user experience of any Linux browser at small screen sizes. Thanks to Adrien Plazas and Jan-Michael for their continued work on this.

<figure aria-describedby="caption-attachment-8909" class="wp-caption aligncenter" id="attachment_8909" style="width: 445px">Screenshot showing Epiphany running in mobile mode at small window size.<figcaption class="wp-caption-text" id="caption-attachment-8909">As before, simply resize your browser window to see Epiphany dynamically transition between desktop mode and mobile mode.</figcaption></figure>

elementary OS

With help from Alexander Mikhaylenko, we’ve also upstreamed many elementary OS design changes, which will be used when running under the Pantheon desktop (and not impact users on other desktops), so that the elementary developers don’t need to maintain their customizations as separate patches anymore. This will eliminate a few elementary-specific bugs, including some keyboard shortcuts that were previously broken only in elementary, and some odd tab bar behavior. Although Epiphany still doesn’t feel quite as native as an app designed just for elementary OS, it’s getting closer.

Epiphany 3.34

I failed to blog about Epiphany 3.34 when I released it last September. Hopefully you have updated to 3.34 already, and are already enjoying the two big features from this release: the new adblocker, and the bubblewrap sandbox.

The new adblocker is based on WebKit Content Blockers, which was developed by Apple several years ago. Adrian Perez developed new WebKitGTK API to expose this functionality, changed Epiphany to use it, and deleted Epiphany’s older resource-hungry adblocker that was originally copied from Midori. Previously, Epiphany kept a large GHashMap of compiled regexes in every web process, consuming a very significant amount of RAM for each process. It also took time to compile these regexes when launching each new web process. Now, the adblock filters are instead compiled into an efficient bytecode format that gets mmapped between all web processes to avoid excessive resource use. The bytecode is interpreted by WebKit itself, rather than by Epiphany’s web process extension (which Epiphany uses to execute custom code in WebKit’s web process), for greatly improved performance.

Lastly, Epiphany 3.34 enabled Patrick’s bubblewrap sandbox, which was added in WebKitGTK 2.26. Bubblewrap is an amazing sandboxing tool, already used effectively by flatpak and rpm-ostree, and I’m very pleased with Patrick’s decision to use it for WebKit as well. Because enabling the sandbox can break applications, it is currently opt-in for GTK 3 apps (but will become mandatory for GTK 4 apps). If your application uses WebKitGTK, you really need to take some time to enable this sandbox using webkit_web_context_set_sandbox_enabled(). The sandbox has introduced a couple of regressions that we didn’t notice until too late; notably, printing no longer works, which, half a year later, we still haven’t managed to fix. (I’ll try to get to it soon.)

OK, this concludes your 3.36 and 3.34 updates. Onward to 3.38!

WebGL and gfx acceleration on Wayland

Posted by Martin Stransky on March 03, 2020 09:42 AM
<figure aria-describedby="caption-attachment-105" class="wp-caption alignnone" data-shortcode="caption" id="attachment_105" style="width: 690px">fishes<figcaption class="wp-caption-text" id="caption-attachment-105">WebGL on Wayland running in full speed.</figcaption></figure>

Firefox on Linux has suffered from poor WebGL performance for a long, long time. That’s caused by the lack of general acceleration support on Linux: there are always broken gfx drivers on X11, various hacks and different standards, closed source drivers and so on. Long story short – doing gfx acceleration seriously on Linux has been a PITA. For instance Chrome (which supports gfx acceleration on Linux/X11) shows a long list of active exceptions and workarounds on its chrome://gpu/ page.

It’s also the reason why Firefox never enabled it by default although it also implements gfx acceleration – Mozilla does not have the resources to spend too much time on every broken gfx card / driver.

Fortunately the situation has changed with Wayland. Working gfx acceleration is a sort of prerequisite to even start a decent Wayland compositor like Mutter or Plasma, so when Firefox is launched on Wayland we can pretty much expect a working GL environment. dmabuf is also widely supported by Wayland compositors, so we finally have all the pieces together to build a fully accelerated browser on Linux that is equal to its Windows siblings.

Firefox supports two acceleration modes – WebRender and the GL compositor. WebRender is the new one and it’s superior at web content rendering. The GL compositor is the older one, less advanced, but still faster in some scenarios where bits are heavily shifted from one place to another – video playback and WebGL.

Both WebRender and the GL compositor have a dmabuf backend, which means the textures they use can be created directly on the GPU and shared without copying between the browser processes. Such GPU memory can at the same time be mapped as an EGL framebuffer, so we can render WebGL frames directly to GPU memory, hand them from the WebGL process to the chrome process and render them as a texture on a web page.

All those pieces are tied together in recent nightlies, where we finally have full WebGL support on Wayland; it will ship as Firefox 75. If you run Fedora/GNOME you can try it yourself. Just grab the latest nightly from Mozilla, enable HW acceleration, set widget.wayland-dmabuf-webgl.enabled to true in about:config, restart the browser and open your favorite WebGL application like maps.google.com or the WebGL samples.
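One caveat, echoing the mixed-environment post above: make sure the nightly actually starts as a native Wayland client, e.g. (assuming you're in a Wayland session):

$ MOZ_ENABLE_WAYLAND=1 ./firefox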

Introducing Jcat

Posted by Richard Hughes on February 28, 2020 03:59 PM

In 2015 I wrote a bit about Microsoft Catalog files and threatened to invent a new specification if nobody else stepped up to the plate. Microsoft have still not documented the format, have seemingly broken the on-disk file format at least once, and there’s still no easy way we can write these files in Linux. Yesterday was a long day, and I think I have a prototype I want to share with the world now. Let’s go back a bit first: what is a catalog file anyway?

A catalog file is a container format that can hold any number of signatures for external files. The data files themselves are not contained in the catalog, only the signatures. Detached signatures are something we’re familiar with in Linux, typically in the form of .asc files for GPG and .p7b files for PKCS-7 protocols. Detached files are something we also generate on the LVFS and consume in fwupd and this is the reason you can see more than just the README.txt, firmware.bin and firmware.metainfo.xml in the cabinet archive:

The bug we’re trying to fix is that we wanted to sign the metainfo file too, so we can be sure that it came from the LVFS rather than having been modified by anyone capable of extracting and creating a new cabinet archive. This would have added firmware.metainfo.xml.asc and firmware.metainfo.xml.p7b to the archive, and at some point you have to wonder just how many extensions we can nest before the world descends into chaos. It also meant that we would have to add an extra n_firmware_files + n_metainfo_files detached signatures to each archive for every extra signing mechanism we add.

The other problem with a single detached signature per engine per file is that only one entity can sign the firmware. At the moment it’s just the LVFS, but what if Dell wanted to sign the firmware with a detached signature too, saying “this firmware is 100% from Dell”? Would we have com.dell.firmware.metainfo.xml.p7b too? This all got very complicated and basically forced me to create a new project to make it all much simpler.

Jcat is a gzipped JSON file of detached signatures. Because it’s gzipped it’s easy to compress and decompress in basically any language, and because it’s JSON it’s dead simple to parse and generate in any framework. There is a little overhead of some metadata (e.g. signing ID, creation time, etc.) but it’s all the kind of thing you can just edit in vim if you need to. There’s also support for storing binary stuff like DER certificates (base64 to the rescue…), but if possible I’d like it to be all readable in a text editor. The jcat command line tool can import existing detached signatures into the Jcat file, and can also verify the existing .jcat file against all the files in a directory or archive. You can include multiple signatures for the same file (using the AppStream ID as the key) and of course sign multiple files using all the cryptographic engines you need. There’s also rudimentary support for actually creating signatures in the jcat command line client too, although it’s WIP for the GNUTLS engine and completely missing for GPGME at the moment.
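To give a flavour, a decompressed Jcat file is JSON along these lines. Treat this as an illustrative mock-up rather than the authoritative schema - the field names and values below are approximations, and the Data value is a stand-in:

{
  "JcatVersionMajor": 0,
  "JcatVersionMinor": 1,
  "Items": [
    {
      "Id": "firmware.metainfo.xml",
      "Blobs": [
        {
          "Kind": "pkcs7",
          "Timestamp": 1582905600,
          "Data": "<base64-encoded detached signature>"
        }
      ]
    }
  ]
}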

This new thing also lets us fix another glaring issue in fwupd. Some companies can’t use PKCS-7, and some can’t use GPG for equally bad and nonsensical reasons – at the moment you need to specify the remote keyring when enabling a remote, as we need to know whether to download the metadata.xml.gz.asc or the .p7b version. Using a .jcat file allows you to not care, and just download one detached thing that can be used no matter how you’ve compiled your system. By adding SHA-256 as an additional not-to-be-used-for-trust engine, Jcat also lets you verify the download of your metadata and cabinet files even when you don’t have GPG or PKCS-7 available, which I know at least one company does on an IOT project. Jcat allows us to move the scary cryptographic verification code out of fwupd and makes the update-your-firmware codebase easier to maintain without worrying about potential landmines.

I’ve got a wip/hughsie/jcat branch on fwupd, and the same for lvfs-website although none of it has had any kind of peer review. Feedback on the initial release of libjcat most welcome from anyone familiar with writing libraries with GIO and Glib.

I suck at naming, but Jcat is supposed to be “JSON Catalog”. My daughters also have lots of Jellycat toys scattered around the house too, and the name seemed not to be taken by any other projects so far. If anyone knows of a project already using the name, please let me know. Feedback very welcome.

Enable Git Commit Message Syntax Highlighting in Vim on Fedora

Posted by Michael Catanzaro on February 27, 2020 10:44 PM

Were you looking forward to reading an exciting blog post about substantive technical issues affecting GNOME or the Linux desktop community? Sorry, not today.

When setting up new machines, I’m often frustrated by lack of syntax highlighting for git commit messages in vim. On my main workstation, vim uses comforting yellow letters for the first line of my commit message to let me know I’m good on line length, or red background to let me know my first line is too long, and after the first line it automatically inserts a new line break whenever I’ve typed past 72 characters. It’s pretty nice. I can never remember how I get it working in the end, and I spent too long today trying to figure it out yet again. Eventually I realized there was another difference besides the missing syntax highlighting: I couldn’t see the current line or column number, and I couldn’t see the mode indicator either. Now you might be able to guess my mistake: git was not using /usr/bin/vim at all! Because Fedora doesn’t have a default $EDITOR, git defaults to using /usr/bin/vi, which is basically sad trap vim. Solution:

$ git config --global core.editor vim

You also have to install the vim-enhanced package to get /usr/bin/vim, but that’s a lot harder to forget to do.
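Alternatively, set a default editor for everything at once; git falls back to $VISUAL and then $EDITOR when GIT_EDITOR and core.editor are unset:

$ echo 'export EDITOR=vim' >> ~/.bashrc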

You’re welcome, Internet!

A tale of missing touches

Posted by Peter Hutterer on February 20, 2020 07:39 AM

libinput 1.15.1 had a new feature: it matched the expected touch count with the one actually seen as opposed to the one advertised by the kernel. That is good news for ALPS devices whose kernel driver lies about their capabilities because these days who doesn't. However, in some cases that feature had the side-effect of reducing the touch count to zero - meaning libinput would ignore any touch. This caused a slight UX degradation.

After a bit of debugging and/or cursing, the issue was identified as a libevdev issue, specifically - the way libevdev replays events after a SYN_DROPPED event. And after several days of fixing things, adding stuff to the CI and adding meson support for libevdev so the CI can actually run a few useful things, it's time for a blog post to brain-dump and possibly entertain the occasional reader such as you are. Congratulations, I guess.

The Linux kernel's evdev protocol is a serial protocol where all events have a type, a code and a value. Events are grouped by EV_SYN.SYN_REPORT events, so the event type is EV_SYN (0), the event code is SYN_REPORT (also 0). The value is usually (but not always), you guessed it, zero. A SYN_REPORT signals that the current event sequence (also called a "frame") is to be interpreted as one hardware event [0]. In the simplest case, two hardware events from a mouse could look like this:


EV_REL REL_X 1
EV_SYN SYN_REPORT 0
EV_REL REL_X 1
EV_REL REL_Y 1
EV_SYN SYN_REPORT 0
While we have five evdev events here, those represent one hardware event with an x movement of 1 and a second hardware event with a diagonal movement by 1/1. Glorious, we all understand evdev now (if not, read this and immediately afterwards this, although that second post will be rather reinforced by this post).

Life as software developer would be quite trivial but our universe hates us and we need an extra event code called SYN_DROPPED. This event is used by the kernel when events from the device come in faster than you're reading them. This shouldn't happen given that most input devices scan out at the casual rate of every 7ms or slower and we're not exactly running on carrier pigeons here. But your compositor has been a busy bee rendering all these browser windows containing kitten videos and thus completely neglected to check whether you've moved the finger on the touchpad recently. So the kernel sort-of clears the current event buffer and positions a shiny steaming SYN_DROPPED in there to notify the compositor of its wrongdoing. [1]

Now, we could assume that every evdev client (libinput, every Xorg driver, ...) knows how to handle SYN_DROPPED events correctly but we're self-aware enough that we don't. So SYN_DROPPED handling is wrapped via libevdev, in a way that lets the clients use almost exactly the same processing paths they use for normal events. libevdev gives you a notification that a SYN_DROPPED occurred, then you fetch the events one-by-one until libevdev tells you that you have the complete current state of the device, and back to kittens you go. In pseudo-code, your input stack's event loop works like this:


while (user_wants_kittens):
    event = libevdev_get_event()

    if event is a SYN_DROPPED:
        while (libevdev_is_still_synchronizing):
            event = libevdev_get_event()
            process_event(event)
    else:
        process_event(event)
Now, this works great for keys where you get the required events to release or press new keys. This works great for relative axes because meh, who cares [2]. This works great for absolute axes because you just get the current state of the device and done. This works great for touch because, no wait, that bit is awful.

You see, the multi-touch protocol is ... special. It uses the absolute axes, but it also multiplexes over those axes via the slot protocol. A normal two-touch event looks like this:


EV_ABS ABS_MT_SLOT 0
EV_ABS ABS_MT_POSITION_X 123
EV_ABS ABS_MT_SLOT 1
EV_ABS ABS_MT_POSITION_X 456
EV_ABS ABS_MT_POSITION_Y 789
EV_ABS ABS_X 123
EV_SYN SYN_REPORT 0
The first two evdev events are slot 0 (first touch [3]), the second two evdev events are slot 1 (second touch [3]). Both touches update their X position but the second touch also updates its Y position. But for single-touch emulation we also get the normal absolute axis event [3]. Which is equivalent to the first touch [3] and can be ignored if you're handling the MT axes [3] (I'm getting a lot of mileage out of that footnote). And because things aren't confusing enough: events within an evdev frame are position-independent except the ABS_MT axes which need to be processed in sequence. So that ABS_X events could be anywhere within that frame, but the ABS_MT axes need to be grouped by slot.

About that single-touch emulation... We also have a single-touch multi-touch protocol via EV_KEY. For devices that can only track N fingers but can detect N+M fingers, we have a set of BTN_TOOL defines. Two fingers down sets BTN_TOOL_DOUBLETAP, three fingers down sets BTN_TOOL_TRIPLETAP, etc. Those are just a bitfield though, so no position data is available. And it tops out at BTN_TOOL_QUINTTAP but then again, that's a good maximum backed by a lot of statistical samples from users' hands. On many devices, we have to combine that single-touch MT protocol with the real MT protocol. Synaptics touchpads on PS/2 only support 2 finger positions but detect up to 5 touches otherwise [4]. And remember the ALPS devices? They say they have 4 slots but may only send data for two or three, so we have to detect this at runtime and switch to the BTN_TOOL bits for some touches.
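To make that concrete, here is a hand-written frame (values made up) for the moment a third finger lands on a touchpad that can only track two - the two tracked touches have coordinates, the third finger exists only in the BTN_TOOL bitfield:

EV_ABS ABS_MT_SLOT 0
EV_ABS ABS_MT_POSITION_X 123
EV_ABS ABS_MT_POSITION_Y 456
EV_ABS ABS_MT_SLOT 1
EV_ABS ABS_MT_POSITION_X 678
EV_ABS ABS_MT_POSITION_Y 890
EV_KEY BTN_TOOL_DOUBLETAP 0
EV_KEY BTN_TOOL_TRIPLETAP 1
EV_SYN SYN_REPORT 0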

So anyway, now that we unfortunately all understand the MT protocol(s), let's look at that libevdev bug. libevdev checks the slot states after SYN_DROPPED to detect whether any touch has stopped or started during SYN_DROPPED. It also detects whether a touch has changed, i.e. the user lifted the finger(s) and put the finger(s) down again while SYN_DROPPED was happening. For those touches it generates the events to stop the original touch, then events to start the new touch. This needs to be done over two event frames, i.e. with a SYN_REPORT in between [5]. But the implementation ended up splitting those changes - any touch that changed was terminated in the first event frame, any touch that outright stopped was terminated in the second event frame. That in itself wasn't the problem yet, the problem was that libevdev didn't emulate the single-touch multi-touch protocol with those emulated frames. So we ended up with event frames where slots would terminate but the single-touch protocol didn't update until a frame later.

This doesn't matter for most users. Both protocols were still correct-enough in their own bubble, only once you start mixing protocols was where it all started getting wonky. libinput does this because it has to, too many devices out there only track two fingers. So if you want three-finger tapping and pinch gestures, you need to handle both protocols. Despite this we didn't notice until we added the quirk for ALPS devices. Because now libinput sometimes noticed that after a SYN_DROPPED there were no fingers on the touchpad (because they all stopped/changed) but the BTN_TOOL bits were still on so clearly we have a touchpad that cannot track all fingers it detects - in this case zero. [6]

So to recap: libinput's auto-adjustment of the touch count for buggy touchpad devices failed thanks to libevdev's buggy workaround of the device sync. The device sync we need because we can't rely on userspace handling touches correctly across SYN_DROPPED. An event which only gets triggered because the compositor is too buggy to read input events in time. I don't know how to describe it exactly, but what I can see all the way down are definitely not turtles.

And the sad thing about it: if we didn't try to correct for the firmware and accepted that gestures are just broken on ALPS devices because the kernel driver is lying to us, none of the above would have mattered. Likewise, the old xorg synaptics driver won't be affected by this because it doesn't handle multitouch properly anyway, so it doesn't need to care about these discrepancies. Or, in other words and much like real life: the better you try to be, the worse it all gets.

And as the take-home lesson: do upgrade to libinput 1.15.2 and do upgrade to libevdev 1.9.0 when it's out. Your kittens won't care but at least that way it won't make me feel like I've done all this work in vain.

[0] Unless the SYN_REPORT value is nonzero but let's not confuse everyone more than necessary
[1] A SYN_DROPPED is per userspace client, so a debugging tool reading from the same event node may not see that event unless it too is busy with feline renderings.
[2] yes, you'll get pointer jumps because event data is missing but since you've been staring at those bloody cats anyway, you probably didn't even notice
[3] usually, but not always
[4] on those devices, identifying a 3-finger pinch gesture only works if you put the fingers down in the correct order
[5] historical reasons: in theory a touch could change directly but most userspace can't handle it and it's too much effort to add now
[6] libinput 1.15.2 leaves you with 1 finger in that case and that's good enough until libevdev is released

What usage restrictions can we place in a free software license?

Posted by Matthew Garrett on February 20, 2020 12:45 AM
Growing awareness of the wider social and political impact of software development has led to efforts to write licenses that prevent software being used to engage in acts that are seen as socially harmful, with the Hippocratic License being perhaps the most discussed example (although the JSON license's requirement that the software be used for good, not evil, is arguably an earlier version of the theme). The problem with these licenses is that they're pretty much universally considered to fall outside the definition of free software or open source licenses due to their restrictions on use, and there's a whole bunch of people who have very strong feelings that this is a very important thing. There's also the more fundamental underlying point that it's hard to write a license like this where everyone agrees on whether a specific thing is bad or not (eg, while many people working on a project may feel that it's reasonable to prohibit the software being used to support drone strikes, others may feel that the project shouldn't have a position on the use of the software to support drone strikes and some may even feel that some people should be the victims of drone strikes). This is, it turns out, all quite complicated.

But there is something that many (but not all) people in the free software community agree on - certain restrictions are legitimate if they ultimately provide more freedom. Traditionally this was limited to restrictions on distribution (eg, the GPL requires that your recipient be able to obtain corresponding source code, and for GPLv3 must also be able to obtain the necessary signing keys to be able to replace it in covered devices), but more recently there's been some restrictions that don't require distribution. The best known is probably the clause in the Affero GPL (or AGPL) that requires that users interacting with covered code over a network be able to download the source code, but the Cryptographic Autonomy License (recently approved as an Open Source license) goes further and requires that users be able to obtain their data in order to self-host an equivalent instance.

We can construct examples of where these prevent certain fields of endeavour, but the tradeoff has been deemed worth it - the benefits to user freedom that these licenses provide is greater than the corresponding cost to what you can do. How far can that tradeoff be pushed? So, here's a thought experiment. What if we write a license that's something like the following:

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. All permissions granted by this license must be passed on to all recipients of modified or unmodified versions of this work
2. This work may not be used in any way that impairs any individual's ability to exercise the permissions granted by this license, whether or not they have received a copy of the covered work


This feels like the logical extreme of the argument. Any way you could use the covered work that would restrict someone else's ability to do the same is prohibited. This means that, for example, you couldn't use the software to implement a DRM mechanism that the user couldn't replace (along the lines of GPLv3's anti-Tivoisation clause), but it would also mean that you couldn't use the software to kill someone with a drone (doing so would impair their ability to make use of the software). The net effect is along the lines of the Hippocratic license, but it's framed in a way that is focused on user freedom.

To be clear, I don't think this is a good license - it has a bunch of unfortunate consequences like it being impossible to use covered code in self-defence if doing so would impair your attacker's ability to use the software. I'm not advocating this as a solution to anything. But I am interested in seeing whether the perception of the argument changes when we refocus it on user freedom as opposed to an independent ethical goal.

Thoughts?

Edit:

Rich Felker on Twitter had an interesting thought - if clause 2 above is replaced with:

2. Your rights under this license terminate if you impair any individual's ability to exercise the permissions granted by this license, even if the covered work is not used to do so

how does that change things? My gut feeling is that covering actions that are unrelated to the use of the software might be a reach too far, but it gets away from the idea that it's your use of the software that triggers the clause.


User-specific XKB configuration - part 1

Posted by Peter Hutterer on February 07, 2020 05:42 AM

The xkeyboard-config project is the repository for all XKB descriptions, or "keyboard layouts" as the layman would say. But languages are weird and thus xkeyboard-config contains an obscene number of different layouts. And of course there are additional layouts that are more experimental than common [1].

The fault, as usual, lies with us (the pronoun, not the layout). XKB is weird and it's flexible to the point of driving even bananas bananas but due to some historic accidents it's largely non-editable. All XKB files are installed in system folders and we all know the 11th commandment was "thou shalt not edit things in /usr/share". But, luckily, that is all about to change. Or rather: it has changed as of libxkbcommon 0.10.0, released Jan 20 2020.

xkeyboard-config provides two types of files. The ones that actually set up your keyboard layout and the ones that allow you to keep sane while doing so, despite your best efforts to the contrary.

KcCGST

Let's look at the first set of files. XKB uses "keycodes, compat, geometry, symbols, types", conveniently if unpronounceably called KcCGST. Keycodes map your "physical" scancode to an internal code-name. For example, your key with the digit 1 on it is AE01 (alphabetic key, row E from bottom, key number 1 from left). Then you map those keycodes into symbols (1 and !). This happens based on the key's type which defines the combination of modifiers to produce the symbols [2]. Compat is largely magic weird stuff (locking modifiers, pointer control) and geometry would let you draw a pretty picture of your keyboard if it was defined for your keyboard, which it won't be.

To see the full keyboard layout simply run xkbcomp -xkb $DISPLAY - and marvel. xkeyboard-config keeps all these parts so your X server or Wayland compositor can load it at runtime depending on your layout.

RMLVO

But when it comes to actual layout selection we like our users enough that we don't make them handle KcCGST but rather provide them with RMLVO instead - "rules, models, layouts, variants, options". You select layout "us" and something converts this into the right components to actually load. Run setxkbmap -layout us -print to see this happening.

"layouts" is what you'd usually associate with a country (except politics is still a thing, so more weirdness here) and "variants" are variations of those. Layout "us" gives you QWERTY and "fr" gives you AZERTY but the "us(dvorak)" variant gives you whatever heresy dvorak applies to those physical keys. And of course, things don't stop there - options are tack-on thingies that do stuff. Like remapping caps lock to compose so you're less capable of shouting at me. Come to think of it, it should really be enabled by default for that reason. You can combine multiple options largely at-will. "models" are largely obsolete (except where they aren't) thanks to the Linux kernel evdev interface which makes all keyboard look the same. But they used to be a thing and maybe one day they'll make a comeback like bell bottom jeans. Disliked by everyone but some weirdos insist on using them.

Rules

Finally, we have rules and thus come to the core of the matter of this post. Rules are magic files that tell the various tools how to go from RMLVO to KcCGST. It's a weird format but it's quite understandable, just open /usr/share/X11/xkb/rules/evdev and have a looksie. It'll make you the popular kid at the next frat party.

Many many moons ago, before the Y2K bug was even in its larval stage, the idea was that you could configure all of those because every UNIX tool had to be more flexible than your yoga teacher. I'm unsure to what extent this was actually ever the case but around 2007-ish the old keyboard driver got deprecated and the evdev driver made its grand entrance. And one side-effect of that was that things broke. evdev uses different keycodes, so all those users that copy-pasted unnecessary XKB configuration into their xorg.conf now had broken keys because they were applying the wrong rules. After whacking enough moles that we got in trouble with the RSPCA we started hardcoding the "evdev" ruleset everywhere. The xorg.conf option "XKBRules" became a noop and thus stopped breaking users' setups.

Except that it also stopped users from deploying their own rules files - something that probably didn't really matter anyway. This had some unintended side-effects though. First, to have a working custom XKB layout you basically had to get it merged upstream. Yes, you could edit the files locally but they'd just be overwritten next time you update the packages. Second, getting rid of hardcoded things is hard so we're stuck with the evdev ruleset for the foreseeable future. This was the situation until, well, now.

User-specific rules and layouts

The new libxkbcommon release changes two things: it prepends $XDG_CONFIG_HOME/xkb/ to the lookup path for XKB rules (and other files). So any file in that path will be picked up before the system paths are searched. This makes it possible to have KcCGST files in your home directory and actually use them. This was somewhat possible before by passing the right flags to the various tools but now it's on by default - at least where libxkbcommon is used (Wayland).

Secondly, rules files now support an include statement. This means you can set your own rules and include system rules. Because everything is hardcoded to evdev this effectively means your new rule file will be $XDG_CONFIG_HOME/xkb/rules/evdev and have at least one line: ! include %S/evdev. If you do just that, you get the evdev ruleset from the system installation path. And any lines you add before or after that line will be loaded. Have a look at the git commit for the details but the summary is that you'll have a rules file that looks like this:


$ cat $XDG_CONFIG_HOME/xkb/rules/evdev
! option = symbols
custom:foo = +custom(foo)
custom:bar = +custom(baz)

! include %S/evdev
This file will define the option->symbol mappings as above and then include the system-provided evdev rules file, i.e. it'll behave like before with those two added. To get those to do something, you need to have the actual symbols files:

$ cat $XDG_CONFIG_HOME/xkb/symbols/custom
partial alphanumeric_keys
xkb_symbols "foo" {
    key <TLDE> { [ VoidSymbol ] };
};

partial alphanumeric_keys
xkb_symbols "baz" {
    key <AB01> { [ k, K ] };
};

And voila, you can now use the XKB option "custom:foo" and/or "custom:bar" to remap your tilde or Z key. The rest is left to the reader as an exercise in creativity.
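For the programmatically-minded: a libxkbcommon client requests those options when compiling a keymap, roughly like this (a sketch - the RMLVO values here are just examples). Since libxkbcommon 0.10.0 a context created like the one below picks up $XDG_CONFIG_HOME/xkb automatically:

#include <xkbcommon/xkbcommon.h>

struct xkb_context *ctx = xkb_context_new(XKB_CONTEXT_NO_FLAGS);
struct xkb_rule_names names = {
    .rules = "evdev",
    .model = "pc105",
    .layout = "us",
    .variant = "",
    .options = "custom:foo,custom:bar",
};
struct xkb_keymap *keymap =
    xkb_keymap_new_from_names(ctx, &names, XKB_KEYMAP_COMPILE_NO_FLAGS);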

Remaining work

The libxkbcommon change was only the first part of the full feature. One remaining part is to have libxkbcommon actually resolve XDG_CONFIG_HOME when running in gnome-shell, which doesn't work right now thanks to secure_getenv() always returning NULL. That's an issue with gnome-shell in particular, thanks to the rt-scheduler feature, already enabled by default on Fedora.

The second, and harder, part is to make the new options appear in the various graphical configuration tools. xkeyboard-config ships an XML file [3] that lists every possible combination with some human-readable description for it. This XML file is used by the various tools directly but none of those tools support XML's xi:include statements. So we'd either have to update all those tools to extend the parsing accordingly or, most likely the smarter long-term solution, write a wrapper library that provides a stable API to get at the same info. That way we can update the include paths under the hood without having to update every tool. Of course this requires every tool to update to the new library first, so, well, chicken, egg, usual problem. Anyway, we'll get there eventually.

[1] For example, I suspect a meetup of Icelandic dvorak users doesn't qualify for a group discount.
[2] Each key has "levels" with one symbol each and modifiers that switch between those levels. Most keys have two levels - normal and shift. But there's a key type for EIGHT_LEVEL_ALPHABETIC_LEVEL_FIVE_LOCK and you can cry or laugh and both reactions are appropriate.
[3] ask your grandparents about that, it's basically JSON for old people

Let’s Learn Spelling!

Posted by Michael Catanzaro on February 01, 2020 09:49 PM

Were you looking forward to reading an exciting blog post about substantive technical issues affecting GNOME or the Linux desktop community? Sorry, not today.

GNOME

It used to be an acronym, so it’s all uppercase. Write “GNOME,” never “Gnome.” Please stop writing “Gnome.”

Would it help if you imagine an adorable little garden gnome dying each time you get it wrong?

If you’re lazy and hate capital letters, or for technical contexts like package or project names, then all-lowercase “gnome” might be appropriate, but “Gnome” certainly never is.

Red Hat

This one’s not that hard. Why are some people writing “RedHat” without any space? It doesn’t make sense. Red Hat. Easy!

SUSE and openSUSE

S.u.S.E. and SuSE are both older spellings for the company currently called SUSE. Apparently at some point in the past they realized that the lowercase u was stupid and caused readers' eyes to bleed. Can we please let it die?

Similarly, openSUSE is spelled “openSUSE,” not “OpenSUSE.” Do not capitalize the o, even if it’s the first word in a sentence. Do not write “openSuSE” or “OpenSuSE” (which people somehow manage to do even when they’re not trolling) or anything at all other than “openSUSE.” I know this is probably too much to ask, but once you get the hang of it, it’s not so hard.

elementary OS

I don’t often see this one messed up. If you can write elementary OS, you can probably write openSUSE properly too! They’re basically the same structure, right? All lowercase, then all caps. I have faith in you, dear reader! Don’t let me down!

GTK and WebKitGTK

We removed the + from the end of both of these, because it was awful. You’re welcome!

Again, all lowercase is probably OK in technical contexts. “gtk-webkit” is not. WebKitGTK.

Avoiding gaps in IOMMU protection at boot

Posted by Matthew Garrett on January 28, 2020 10:48 PM
When you save a large file to disk or upload a large texture to your graphics card, you probably don't want your CPU to sit there spending an extended period of time copying data between system memory and the relevant peripheral - it could be doing something more useful instead. As a result, most hardware that deals with large quantities of data is capable of Direct Memory Access (or DMA). DMA-capable devices are able to access system memory directly without the aid of the CPU - the CPU simply tells the device which region of memory to copy and then leaves it to get on with things. However, we also need to get data back to system memory, so DMA is bidirectional. This means that DMA-capable devices are able to read and write directly to system memory.

As long as devices are entirely under the control of the OS, this seems fine. However, this isn't always true - there may be bugs, the device may be passed through to a guest VM (and so no longer under the control of the host OS) or the device may be running firmware that makes it actively malicious. The third is an important point here - while we usually think of DMA as something that has to be set up by the OS, at a technical level the transactions are initiated by the device. A device that's running hostile firmware is entirely capable of choosing what and where to DMA.

Most reasonably recent hardware includes an IOMMU to handle this. The CPU's MMU exists to define which regions of memory a process can read or write - the IOMMU does the same but for external IO devices. An operating system that knows how to use the IOMMU can allocate specific regions of memory that a device can DMA to or from, and any attempt to access memory outside those regions will fail. This was originally intended to handle passing devices through to guests (the host can protect itself by restricting any DMA to memory belonging to the guest - if the guest tries to read or write to memory belonging to the host, the attempt will fail), but is just as relevant to preventing malicious devices from extracting secrets from your OS or even modifying the runtime state of the OS.

But setting things up in the OS isn't sufficient. If an attacker is able to trigger arbitrary DMA before the OS has started then they can tamper with the system firmware or your bootloader and modify the kernel before it even starts running. So ideally you want your firmware to set up the IOMMU before it even enables any external devices, and newer firmware should actually do this automatically. It sounds like the problem is solved.

Except there's a problem. Not all operating systems know how to program the IOMMU, and if a naive OS fails to remove the IOMMU mappings and asks a device to DMA to an address that the IOMMU doesn't grant access to then things are likely to explode messily. EFI has an explicit transition between the boot environment and the runtime environment triggered when the OS or bootloader calls ExitBootServices(). Various EFI components have registered callbacks that are triggered at this point, and the IOMMU driver will (in general) then tear down the IOMMU mappings before passing control to the OS. If the OS is IOMMU aware it'll then program new mappings, but there's a brief window where the IOMMU protection is missing - and a sufficiently malicious device could take advantage of that.

The ideal solution would be a protocol that allowed the OS to indicate to the firmware that it supported this functionality and request that the firmware not remove it, but in the absence of such a protocol we're left with non-ideal solutions. One is to prevent devices from being able to DMA in the first place, which means the absence of any IOMMU restrictions is largely irrelevant. Every PCI device has a busmaster bit - if the busmaster bit is disabled, the device shouldn't start any DMA transactions. Clearing that seems like a straightforward approach. Unfortunately this bit is under the control of the device itself, so a malicious device can just ignore this and do DMA anyway. Fortunately, PCI bridges and PCIe root ports should only forward DMA transactions if their busmaster bit is set. If we clear that then any devices downstream of the bridge or port shouldn't be able to DMA, no matter how malicious they are. Linux will only re-enable the bit after it's done IOMMU setup, so we should then be in a much more secure state - we still need to trust that our motherboard chipset isn't malicious, but we don't need to trust individual third party PCI devices.
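To sketch the mechanics (kernel-style, for illustration only - the actual change runs in the EFI stub, which goes through the PCI I/O protocol instead):

#include <linux/pci.h>

/* clearing the busmaster bit on a bridge or root port stops everything
 * downstream of it from initiating DMA */
static void disable_downstream_dma(struct pci_dev *bridge)
{
        u16 cmd;

        pci_read_config_word(bridge, PCI_COMMAND, &cmd);
        pci_write_config_word(bridge, PCI_COMMAND, cmd & ~PCI_COMMAND_MASTER);
}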

The actual patch adding support for this just got merged. My original version did nothing other than clear the bits on bridge devices, but this did have the potential for breaking devices that were still carrying out DMA at the moment this code ran. Ard modified it to call the driver shutdown code for each device behind a bridge before disabling DMA on the bridge, which in theory makes this safe but does still depend on the firmware drivers behaving correctly. As a result it's not enabled by default - you can either turn it on in kernel config or pass the efi=disable_early_pci_dma kernel command line argument.

In combination with firmware that does the right thing, this should ensure that Linux systems can be protected against malicious PCI devices throughout the entire boot process.


libinput's timer offset errors

Posted by Peter Hutterer on January 28, 2020 02:34 AM

Let's say you have a friend (this wouldn't happen to you, of course, just that friend) who is staring at their system logs and wondering why they are full of messages similar to this:


libinput error: client bug: timer event5 debounce short: offset negative (-7ms)
And the question is of course - what is going on here and why hasn't this been fixed yet. Now, the libinput documentation explains this already but it's always worthwhile to fire out a blog post into the void in the hope someone reads it.

libinput uses a specific model to communicate with the Wayland compositor (or the X server). There is a single epoll file descriptor and that fd will trigger whenever something happens that's of interest to libinput. When that fd triggers, the compositor is expected to call libinput_dispatch() which is the main "do stuff" function of libinput.

The actual trigger doesn't matter, it could be an event from a device but it could be something else. The caller doesn't have to care. All that matters is that there is code like this:


if (libinput_fd_triggered_in_select)
    libinput_dispatch();
And then libinput will do the right thing. Whether you also want events from libinput is almost orthogonal to this.
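For illustration, here's a minimal sketch of that model - one epoll fd, libinput_dispatch() on every wakeup - with error handling and real event processing elided:

#include <sys/epoll.h>
#include <libinput.h>

static void run_dispatch_loop(struct libinput *li)
{
        struct epoll_event ev = { .events = EPOLLIN };
        int epollfd = epoll_create1(0);

        epoll_ctl(epollfd, EPOLL_CTL_ADD, libinput_get_fd(li), &ev);
        for (;;) {
                struct epoll_event triggered;
                struct libinput_event *event;

                if (epoll_wait(epollfd, &triggered, 1, -1) < 1)
                        continue;
                /* a device event, a timeout, ... - we don't need to care */
                libinput_dispatch(li);
                while ((event = libinput_get_event(li)) != NULL)
                        libinput_event_destroy(event); /* real code handles these */
        }
}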

libinput uses timerfd internally so any timeouts also trigger the epoll fd. Timeouts are scheduled based on the event's time stamp, so if you get an event with timestamp T, a timeout of 180ms will be scheduled for time T + 180ms. So the process looks something like this:


T(0): kernel button event
T(0): libinput_dispatch(): schedule timeout for T(0+180)
...
T(180): epoll fd triggers
T(180): libinput_dispatch(): process timeout
...
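Under the hood, the scheduling step looks roughly like this (a sketch with made-up names, not libinput's actual internals):

#include <stdint.h>
#include <sys/timerfd.h>

/* tfd was created with timerfd_create(CLOCK_MONOTONIC, 0) and added to
 * the same epoll fd that the device fds are on */
static void schedule_timeout(int tfd, uint64_t event_time_ns)
{
        uint64_t expiry = event_time_ns + 180 * 1000000ULL; /* T + 180ms */
        struct itimerspec its = {
                .it_value.tv_sec = expiry / 1000000000ULL,
                .it_value.tv_nsec = expiry % 1000000000ULL,
        };

        /* absolute time derived from the event timestamp, not from "now" */
        timerfd_settime(tfd, TFD_TIMER_ABSTIME, &its, NULL);
}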
This scheme generally works fine. Even with some delays we usually don't need to worry about the timeouts; they still trigger as expected. But some of the timeouts are "short", as in 8ms short. And this is where these warnings may trigger.

Let's say your compositor is busy doing some rendering. The epoll fd triggers with a button event but the compositor is too busy to handle it immediately. Instead, it finishes whatever it's doing and only then calls libinput_dispatch():


T(0): kernel button event
...
T(12): libinput_dispatch(): schedule timeout for T(0+8)

libinput error: client bug: timer event5 debounce short: offset negative (-4ms)
libinput will still use the event's timestamp instead of the wall clock time, so the scheduled timeout is no longer in the future. And that is when the error message gets printed. This isn't a libinput bug, it's always a bug in the compositor. gnome-shell in particular is still struggling with this, and while great strides have been made to make it more responsive, it can still happen.

The error message may seem cryptic, but it provides a bunch of useful information: event5 is your event node, "debounce short" is the timer name so we know where we got stuck. And the 4ms gives us an indication of how much we got delayed.

And for the record: the other end of this issue - a delayed libinput_dispatch() after a timeout should have triggered - is handled quietly by libinput. For example, if you have a physical event queued and a timeout expiry, we will process the earlier one first to make sure the sequences are handled correctly.

Hunting UEFI Implants

Posted by Richard Hughes on January 27, 2020 06:37 PM

Last week I spent 3 days training on how to detect UEFI firmware implants. The training was run by Alex Matrosov via Hardwear.io and was a comprehensive deep-dive into UEFI firmware internals so that we could hunt for known and unknown implants. I’d 100% recommend this kind of training, it was excellent. Although I understood the general concepts of the protection mechanisms like SMM, HP Sure Start and Intel BIOSGuard before doing the training, it was really good to understand how the technologies really worked, with real world examples of where hardware vendors were getting the implementation wrong – giving the bad guys full control of your hardware. The training was superb, and Alex used lots of hands-on lab sessions to avoid PowerPoint overload. My fellow students were a mixture of security professionals and employees from various government departments from all over the world. We talked, a lot.

My personal conclusion quite simply is that we’re failing as an industry. In the pursuit of reducing S3 resume time from 2s to 0.5s we introduce issues like the S3 bootscript vulnerability. With the goal to boot as quickly as possible, we only check the bare minimum certificate chain, allowing additional malicious DXEs to be added to an image. OEMs are choosing inexpensive EC hardware from sketchy vendors that acts as the root of trust while also emulating hardware designed 30 years ago and sharing the system SPI chip. By trying to re-use existing power management primitives like SMM as a security boundary, the leaky abstractions fail us. Each layer in the security stack assumes that the layer below it is implemented correctly, and so all it takes is one driver with SMM or CSME access to not check a memory address in a struct correctly and everything on top (e.g. BootGuard, ASLR, SELinux, etc.) is broken. Coreboot isn’t the panacea here either, as to get that to run you need to turn off various protections like BootGuard, and some techniques like Sure Start mean that Coreboot just isn’t a viable option. The industry seems invested in EDK2, for better or worse. This shouldn’t just be important to the few people buying stuff from Purism – 10,000 laptops are being sold on Amazon for every laptop sold by vendors that care about this stuff.

Most of the easy-to-exploit issues are just bugs with IBV or ODM-provided code, some of which can be fixed with a firmware update. Worse still, if you allow your “assumed secure” laptop out of sight then all bets are off with security. About a quarter of people at the UEFI training had their “travel laptop” tampered with at some point – with screws missing after “customs inspections” or with tamper seals broken after leaving a laptop in a hotel room. You really don’t need to remove the screws to image a hard drive these days. But, let’s back away from the state-sponsored attacker to reality for a minute.

The brutal truth is that security costs money. Vendors have to choose between saving 10 cents on a bill-of-materials by sharing a SPI chip (so ~$10K over a single batch), or correctly implementing BIOSGuard. What I think the LVFS now needs to do is provide some easy-to-understand market information to people buying hardware. We already know a huge amount of information about the device from signed reports and from analyzing the firmware binaries. What we’re not doing very well is explaining it to the user in a way they can actually understand. I didn’t understand the nuances between BIOSGuard and BootGuard until a few days ago, and I’ve been doing this stuff for years.

What I propose we do is assign some core protections some weight, and then verify and document how each vendor is configuring each model. For instance, I might say that for my dad’s laptop any hardware “SEC1” and above is fine as he’s only using it for Facebook or YouTube and it needs to be inexpensive. For my personal laptop I would be happy to restrict my choice of models to anything SEC3 and above. If you’re working as a journalist under some corrupt government, or are a security researcher, only SEC4 and above would be suitable. The reality is that SEC4 is going to be several hundred dollars more expensive than some cheap imported no-name hardware that doesn’t even conform to SEC1.

Of course, we’ll need to expand the tests we do in fwupd to detect implementation errors, and to verify that the model that we’ve purchased does indeed match the SEC level advertised by the LVFS. I’m talking to a few different people on how to do this securely. What I do know is that it will involve a reboot to get some of the data that we can’t even get in kernel mode with SecureBoot turned on. The chipsec report gets us some of the way there, but it’s just too complicated for end users and won’t work with SB turned on.

My proposal would be as follows:

  • SEC1: SecureBoot, BIOS_WE, BLE, SMM_BWP, and updates on the LVFS, with no existing detectable SMM issues (like ThinkPwn for example)
  • SEC2: PRx set correctly, if not using BootGuard or BIOSGuard, with PCR0 attestation data
  • SEC3: BootGuard enabled, with the EC controller requiring signed images
  • SEC4: Intel BIOSGuard or HP SureStart
  • SEC5: Hardware attestation like Apple T2 or Google Titan

SEC1 is a really low bar. Anything not unsetting BIOS_WE can be flashed at runtime trivially. If you’re traveling with a laptop for work you really ought to be specifying at least a SEC3 level of protection.

Of course, some vendors might not care at all about security for some models. A “gaming laptop” with a flashing RGB keyboard backlight is really designed for playing games as fast as possible, and the fact that the BIOS is unlocked might be a good thing as then the user can flash a custom unsigned BIOS with a CounterStrike-themed vendor image. I don’t think we ought to make vendors feel guilty about not even hitting SEC1. Perhaps we could let the consumer vote with their wallet and make the ecosystem more secure. I’m not sold on the SECx name either, it’s not very catchy. I suck at naming stuff. Comments welcome.

Verifying your system state in a secure and private way

Posted by Matthew Garrett on January 20, 2020 12:53 PM
Most modern PCs have a Trusted Platform Module (TPM) and firmware that, together, support something called Trusted Boot. In Trusted Boot, each component in the boot chain generates a series of measurements of the next component of the boot process and relevant configuration. These measurements are pushed to the TPM where they're combined with the existing values stored in a series of Platform Configuration Registers (PCRs) in such a way that the final PCR value depends on both the value and the order of the measurements it's given. If any measurements change, the final PCR value changes.
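The combining operation is a simple hash chain, which is why the order matters; replaying an event log (more on that below) is just a matter of applying it to every entry in turn. A sketch of the idea using OpenSSL (the TPM does this internally, per PCR bank):

#include <string.h>
#include <openssl/sha.h>

/* PCR_new = SHA256(PCR_old || measurement) */
static void pcr_extend(unsigned char pcr[SHA256_DIGEST_LENGTH],
                       const unsigned char measurement[SHA256_DIGEST_LENGTH])
{
        unsigned char buf[2 * SHA256_DIGEST_LENGTH];

        memcpy(buf, pcr, SHA256_DIGEST_LENGTH);
        memcpy(buf + SHA256_DIGEST_LENGTH, measurement, SHA256_DIGEST_LENGTH);
        SHA256(buf, sizeof(buf), pcr);
}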

Windows takes advantage of this with its Bitlocker disk encryption technology. The disk encryption key is stored in the TPM along with a policy that tells it to release it only if a specific set of PCR values is correct. By default, the TPM will release the encryption key automatically if the PCR values match and the system will just transparently boot. If someone tampers with the boot process or configuration, the PCR values will no longer match and boot will halt to allow the user to provide the disk key in some other way.

Unfortunately the TPM keeps no record of how it got to a specific state. If the PCR values don't match, that's all we know - the TPM is unable to tell us what changed to result in this breakage. Fortunately, the system firmware maintains an event log as we go along. Each measurement that's pushed to the TPM is accompanied by a new entry in the event log, containing not only the hash that was pushed to the TPM but also metadata that tells us what was measured and why. Since the algorithm the TPM uses to calculate the hash values is known, we can replay the same values from the event log and verify that we end up with the same final value that's in the TPM. We can then examine the event log to see what changed.

Unfortunately, the event log is stored in unprotected system RAM. In order to be able to trust it we need to compare the values in the event log (which can be tampered with) with the values in the TPM (which are much harder to tamper with). Unfortunately if someone has tampered with the event log then they could also have tampered with the bits of the OS that are doing that comparison. Put simply, if the machine is in a potentially untrustworthy state, we can't trust that machine to tell us anything about itself.

This is solved using a procedure called Remote Attestation. The TPM can be asked to provide a digital signature of the PCR values, and this can be passed to a remote system along with the event log. That remote system can then examine the event log, make sure it corresponds to the signed PCR values and make a security decision based on the contents of the event log rather than just on the final PCR values. This makes the system significantly more flexible and aids diagnostics. Unfortunately, it also means you need a remote server and an internet connection, and then some way for that remote server to tell you whether it thinks your system is trustworthy, and also some way to believe that the remote server itself is trustworthy, and all of this is, well, not ideal if you're not an enterprise.

Last week I gave a talk at linux.conf.au on one way around this. Basically, remote attestation places no constraints on the network protocol in use - while the implementations that exist all do this over IP, there's no requirement for them to do so. So I wrote an implementation that runs over Bluetooth, in theory allowing you to use your phone to serve as the remote agent. If you trust your phone, you can use it as a tool for determining if you should trust your laptop.

I've pushed some code that demos this. The current implementation does nothing other than tell you whether UEFI Secure Boot was enabled or not, and it's also not currently running on a phone. The phone bit of this is pretty straightforward to fix, but the rest is somewhat harder.

The big issue we face is that we frequently don't know what event log values we should be seeing. The first few values are produced by the system firmware and there's no standardised way to publish the expected values. The Linux Vendor Firmware Service has support for publishing these values, so for some systems we can get hold of this. But then you get to measurements of your bootloader and kernel, and those change every time you do an update. Ideally we'd have tooling for Linux distributions to publish known good values for each package version and for that to be common across distributions. This would allow tools to download metadata and verify that measurements correspond to legitimate builds from the distribution in question.

This does still leave the problem of the initramfs. Since initramfs files are usually generated locally, and depend on the locally installed versions of tools at the point they're built, we end up with no good way to precalculate those values. I proposed a possible solution to this a while back, but have done absolutely nothing to help make that happen. I suck. The right way to do this may actually just be to turn initramfs images into pre-built artifacts and figure out the config at runtime (dracut actually supports a bunch of this already), so I'm going to spend a while playing with that.

If we can pull these pieces together then we can get to a place where you can boot your laptop and then, before typing any authentication details, have your phone compare each component in the boot process to expected values. Assistance in all of this extremely gratefully received.


Fedora Firefox team at 2019

Posted by Martin Stransky on January 07, 2020 02:43 PM

I think the last year was the strongest one in the whole history of the Fedora Firefox team. We have always contributed to Mozilla, but in 2019 we finished some major outstanding projects upstream and also shipped them in Fedora.

The first finished project I’d like to mention is the system titlebar being disabled by default on GNOME. The Firefox UI on Linux finally matches Windows/macOS and provides a similar user experience. We also implemented various tweaks like styled and HiDPI titlebar button rendering and left/right button placement.

A rather small change in terms of code, but with high impact, was GCC optimization with PGO/LTO. In cooperation with Jakub Jelinek and the SUSE guys we managed to match and even slightly outperform the default Mozilla Firefox binaries, which are built with clang. I’m going to post more accurate numbers in a follow-up post; some were already published by a Czech Linux magazine.

The Firefox GNOME search provider is another small but useful feature we introduced last year. It’s not integrated upstream yet because it needs to be updated for an upcoming async history lookup API on the Firefox side, but we ship it as a tech preview to get more user feedback.

And then there’s our biggest project so far – Firefox with a native Wayland backend. Fedora 31 ships it by default for GNOME, which closes the initial development phase; we can now focus on polishing, bug fixing and adding more features. It also extends the GTK2 to GTK3 transition. Many people from inside and outside of Mozilla helped with it, and some of them are brand new contributors to Firefox, which is awesome.

The Wayland backend is going to get more and more features in the future. We’re investigating the possible advantages of a DMA-BUF backend, which can be used for HW-accelerated video playback or direct WebGL rendering. We need to address the missing Xvfb on Wayland so we can run tests and build Firefox with PGO/LTO there. We’re also going to look at other Wayland compositors like Plasma and Sway to make sure Firefox works fine there – so many challenges and a lot of fun are waiting for fearless fox hackers! 😉

Wifi deauthentication attacks and home security

Posted by Matthew Garrett on December 27, 2019 03:26 AM
I live in a large apartment complex (it's literally a city block big), so I spend a disproportionate amount of time walking down corridors. Recently one of my neighbours installed a Ring wireless doorbell. By default these are motion activated (and the process for disabling motion detection is far from obvious), and if the owner subscribes to an appropriate plan these recordings are stored in the cloud. I'm not super enthusiastic about the idea of having my conversations recorded while I'm walking past someone's door, so I decided to look into the security of these devices.

One visit to Amazon later and I had a refurbished Ring Video Doorbell 2™ sitting on my desk. Tearing it down revealed it uses a TI SoC that's optimised for this sort of application, linked to a DSP that presumably does stuff like motion detection. The device spends most of its time in a sleep state where it generates no network activity, so on any wakeup it has to reassociate with the wireless network and start streaming data.

So we have a device that's silent and undetectable until it starts recording you, which isn't a great place to start from. But fortunately wifi has a few, uh, interesting design choices that mean we can still do something. The first is that even on an encrypted network, the packet headers are unencrypted and contain the address of the access point and whichever device is communicating. This means that it's possible to just dump whatever traffic is floating past and build up a collection of device addresses. Address ranges are allocated by the IEEE, so it's possible to map the addresses you see to manufacturers and get some idea of what's actually on the network[1] even if you can't see what they're actually transmitting. The second is that various management frames aren't encrypted, and so can be faked even if you don't have the network credentials.

The most interesting one here is the deauthentication frame that access points can use to tell clients that they're no longer welcome. These can be sent for a variety of reasons, including resource exhaustion or authentication failure. And, by default, they're entirely unprotected. Anyone can inject such a frame into your network and cause clients to believe they're no longer authorised to use the network, at which point they'll have to go through a new authentication cycle - and while they're doing that, they're not able to send any other packets.
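For reference, the frame itself is tiny - roughly the following illustrative struct (a real injection tool also prepends a radiotap header and fills in the sequence number):

#include <stdint.h>

struct deauth_frame {
        uint16_t frame_control; /* 0x00c0: management frame, deauth subtype */
        uint16_t duration;
        uint8_t  dest[6];       /* the client being kicked off */
        uint8_t  src[6];        /* spoofed address of the access point */
        uint8_t  bssid[6];
        uint16_t seq_ctrl;
        uint16_t reason;        /* e.g. 7: class 3 frame from nonassociated STA */
} __attribute__((packed));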

So, the attack is to simply monitor the network for any devices that fall into the address range you want to target, and then immediately start shooting deauthentication frames at them once you see one. I hacked airodump-ng to ignore all clients that didn't look like a Ring, and then pasted in code from aireplay-ng to send deauthentication packets once it saw one. The problem here is that wifi cards can only be tuned to one frequency at a time, so unless you know the channel your potential target is on, you need to keep jumping between frequencies while looking for a target - and that means a target can potentially shoot off a notification while you're looking at other frequencies.

But even with that proviso, this seems to work reasonably reliably. I can hit the button on my Ring, see it show up in my hacked up code and see my phone receive no push notification. Even if it does get a notification, the doorbell is no longer accessible by the time I respond.

There are a couple of ways to avoid this attack. The first is to use 802.11w, which protects management frames. A lot of hardware supports this, but it's generally disabled by default. The second is to just ignore deauthentication frames in the first place, which is a spec violation - but also you're already building a device that exists to record strangers engaging in a range of legal activities, so paying attention to social norms is clearly not a priority in any case.

Finally, none of this is even slightly new. A presentation from Def Con in 2016 covered this, demonstrating that Nest cameras could be blocked in the same way. The industry doesn't seem to have learned from this.

[1] The Ring Video Doorbell 2 just uses addresses from TI's range rather than anything Ring specific, unfortunately


More on Flatpak updates

Posted by Matthias Clasen on December 20, 2019 03:33 AM

The last time I talked about flatpak updates, I explained how flatpak apps can detect that a newer version has been installed, and restart themselves. That is great, and may almost be good enough when you have automatic updates. But that is not always the case.

Thankfully, we can do better. Since 1.5, Flatpak has a portal API that lets applications monitor for updates, and request updating themselves.

Here is how this looks when it is all put together:

(video: https://blogs.gnome.org/mclasen/files/2019/12/update-monitor.webm)

In the terminal, I’m building a new version of the portal test app, and update my (local) repository. The flatpak portal is noticing that the update appeared (I’m running it with a short poll timeout here, instead of the usual 30 minutes), and sends out a D-Bus signal to the application, which requests to be updated, and then restarts itself.

Using the portal API directly is not very convenient, since you have to listen to D-Bus signals and whatnot. Therefore, we now have a library called libportal, which is providing simple async wrappers for most portals. That is what the portal test app in the demo is using, and you should be using it too in your applications.

The first stable release of libportal will appear very soon, with Flatpak 1.6, and then it will find its way into runtimes.

Update: Since this is a portal, users are in control of what apps are allowed to do. If you don’t want an application to update itself, you can put an end to it with

flatpak permission-set flatpak updates $APPID no

Use ‘ask’ instead of ‘no’ to get a confirmation dialog. The permission-set command is new in flatpak 1.6.

GMemoryMonitor (low-memory-monitor, 2nd phase)

Posted by Bastien Nocera on December 17, 2019 11:53 PM
TL;DR

Use GMemoryMonitor in glib 2.63.3 and newer in your applications to lower overall memory usage, and detect low memory conditions.

low-memory-monitor

To start with, let's come back to low-memory-monitor, announced at the end of August.

It's not really a “low memory monitor”. I know, the name is deceiving, but it actually monitors memory pressure stalls - how hard it is for the kernel to allocate memory when applications need it. The harder it is, the longer the kernel takes to allocate memory, usually because it needs to move memory around to make room for a big allocation, when an application starts up for example, or prepares an in-memory buffer for saving.
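The raw pressure-stall data is exported by the kernel under /proc/pressure/ (the poll-based triggers the daemon relies on arrived in Linux 5.2), and you can eyeball it yourself:

#include <stdio.h>

int main(void)
{
        char line[256];
        FILE *f = fopen("/proc/pressure/memory", "r");

        if (f == NULL)
                return 1;
        /* prints e.g. "some avg10=0.00 avg60=0.00 avg300=0.00 total=12345" */
        while (fgets(line, sizeof(line), f) != NULL)
                fputs(line, stdout);
        fclose(f);
        return 0;
}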

It is not a daemon that will kill programs on low memory. It's not a user-space out-of-memory killer, and does not take those policy decisions. It can however be configured to ask the kernel to do that. The kernel doesn't really know what it's doing though, and user-space isn't helping either, so best disable that for now...

As listed in low-memory-monitor's README (and in the announcement post), there were a number of similar projects around, but none that would offer everything we needed, e.g.:
  • Has a D-Bus interface to propagate low memory conditions
  • Requires Linux 5.2's kernel memory pressure stalls information (Android's lowmemorykiller daemon has loads of code to get the same information from the kernel for older versions, and it really is quite a lot of code)
  • Written in a compiled language to save on startup/memory usage costs (around 500 lines of C code, as counted by sloccount)
  • Built-in policy, based upon values used in Android and Endless OS
GMemoryMonitor

Next up, in our effort to limit memory usage, we'll need some help from applications. That's where GMemoryMonitor comes in. It's simple enough: listen to the low-memory-warning signal and, when you receive one, free some image thumbnails or index caches, or dump some data to disk.

The signal also gives you a “warning level”, with 255 being when low-memory-monitor would trigger the kernel's OOM killer, and lower values different levels of “try to be a good citizen”.
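In code, it might look like this minimal sketch (free_some_caches() standing in for whatever your application can throw away):

#include <gio/gio.h>

static void
free_some_caches (void)
{
  /* application-specific: drop thumbnails, index caches, ... */
}

static void
on_low_memory_warning (GMemoryMonitor             *monitor,
                       GMemoryMonitorWarningLevel  level,
                       gpointer                    user_data)
{
  if (level >= G_MEMORY_MONITOR_WARNING_LEVEL_MEDIUM)
    free_some_caches ();
}

int
main (void)
{
  GMemoryMonitor *monitor = g_memory_monitor_dup_default ();

  g_signal_connect (monitor, "low-memory-warning",
                    G_CALLBACK (on_low_memory_warning), NULL);
  g_main_loop_run (g_main_loop_new (NULL, FALSE));
  return 0;
}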

The more astute amongst you will have noticed that low-memory-monitor runs as root, on the system bus, and wonder how those new fangled (5 years old today!) sandboxed applications would receive those signals. Fear not! Support for a portal version of GMemoryMonitor landed in xdg-desktop-portal on the same day as in glib. Everything tied together with installed tests that use the real xdg-desktop-portal to test the portal and unsandboxed versions.

How about an OOM killer?

By using memory pressure stall information, we receive information about the state of the kernel before getting into swapping that'd cause the machine to become unusable. This also means that, as our threshold for keeping everything ticking is low, if we were to kill high memory consumers, we'd get a butter smooth desktop, but, based on my personal experience, your browser and your mail client would take it in turns disappearing from your desktop in a way that you wouldn't even notice.

We'll definitely need to think about our next step in application state management, and changing our running applications paradigm.

Distributions should definitely disable the OOM killer for now, and possibly try their hand at upstreaming some systemd OOMPolicy and OOMScoreAdjust options for system daemons.

Conclusion

Creating low-memory-monitor was easy enough; getting everything else in place was decidedly more complicated. In addition to requiring changes to glib, xdg-desktop-portal and python-dbusmock, it also required a lot of work on the glib CI to save me from having to write integration tests in C that would have required a lot of scaffolding. So thanks to all involved, in particular Philip Withnall for his patience reviewing my changes.

Dual-GPU support follow-up: NVIDIA driver support

Posted by Bastien Nocera on December 13, 2019 04:15 PM
If you remember, back in 2016, I did the work to get a “Launch on Discrete GPU” menu item added to applications in gnome-shell.

This cycle I worked on adding support for the NVIDIA proprietary driver, so that the menu item shows up, and the right environment variables are used to launch applications on that device.

Tested with another unsupported device...


Behind the scenes

There were a number of problems with the old detection code in switcheroo-control:
- it required the graphics card to use vga_switcheroo in the kernel, which the NVIDIA driver didn't do
- it couldn't support more than 2 GPUs
- and it didn't really actually know which GPU was going to be the “main” one

And, on top of all that, gnome-shell expected the Mesa OpenGL stack to be used, so it only knew the right environment variables to do that, and only for one secondary GPU.

So we've extended switcheroo-control and its API to do all this.

(As a side note, commenters asked me about the KDE support, and how it would integrate, and it turns out that KDE's code just checks for the presence of a file in /sys, which is only present when vga_switcheroo is used. So I would encourage KDE to adopt the switcheroo-control D-Bus API for this)

Closing

All this will be available in Fedora 32, using GNOME 3.36 and switcheroo-control 2.0. We might backport this to Fedora 31 after it's been tested, and if there is enough interest.

Improving the security model of the LVFS

Posted by Richard Hughes on December 11, 2019 08:34 AM

There are lots of layers of security in the LVFS and fwupd design, including restricted account modes, 2FA, and server-side AppStream namespaces. The most powerful one is the so-called vendor-id, which the vendors cannot assign themselves; it is assigned by me when creating the vendor account on the LVFS. The way this works is that all firmware from the vendor is tagged with a vendor-id string like USB:0x056A, which in this case matches the USB consortium-assigned vendor ID. Client side, the vendor-id from the signed metadata is checked against the physical device and the firmware is updated only if the ID matches. This ensures that malicious or careless users on the LVFS can never ship firmware updates for other vendors’ hardware. About 90% of the vendors on the LVFS are locked down with this mechanism.

Some vendors have to have IDs that they don’t actually own; a good example here is a DFU device like the 8bitdo controllers. In runtime mode they use the USB-assigned 8bitdo VID, but in bootloader mode they use a generic VID which is assigned to the chip supplier, as they are using the reference bootloader. This is obviously fine, and both vendor IDs are assigned to 8bitdo on the LVFS for this reason. Another example is where Lenovo is responsible for updating Lenovo-specific NVMe firmware, but where the NVMe drive’s vendor ID isn’t always Lenovo’s.

Where this breaks down a little more is for hardware devices that don’t have a built-in assigned vendor mapping. There are three plugins which are causing minor headaches:

  • Redfish — there’s seemingly no PCI vendor code for the enumerated devices themselves
  • ATA — the ATA/ATAPI-5 specification bizarrely makes no mention of any kind of vendor ID in the IDENTIFY block
  • UEFI — the ESRT table frustratingly just lists the version number and the GUID of devices, but no actual sysfs link to each

All the other plugins can be handled in a sane way, mostly automatically as the vast majority derive from either FuUsbDevice or FuUdevDevice.

As UEFI UpdateCapsule updates seem to be the 2nd most popular way to distribute firmware updates we probably ought to think of a sane way of limiting firmware updates to the existing BIOS vendor. We could query the DMI data, so that for instance Lenovo is only able to update Lenovo hardware — but we have to use a made-up pseudo-vendor-id of DMI:Lenovo. Maybe this isn’t so bad. Perhaps the vendor ID isn’t so useful with UEFI Update Capsule as the capsules themselves have to be signed by the firmware vendor before they’ll actually be run.

Anyway, to the point of this blog post: until recently fwupd would refuse to apply an update if the metadata contained a vendor-id but the device had not set one. This situation might now happen if, for instance, a vendor previously had no vendor-id because the device traditionally had no PCI or USB VID, but in newer versions of fwupd the device actually has a virtual ID, and so the vendor could be locked down on the LVFS. The fix here is to ignore the metadata vendor-id if there’s no device vendor-id, rather than failing the update.
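In other words, the client-side logic is now roughly this (a simplified sketch, not fwupd’s actual engine code):

#include <glib.h>

static gboolean
vendor_id_check_ok (const gchar *device_vendor_id,   /* e.g. "USB:0x056A" */
                    const gchar *metadata_vendor_id) /* from signed metadata */
{
        /* no device vendor-id: ignore the metadata one rather than fail */
        if (device_vendor_id == NULL)
                return TRUE;
        return g_strcmp0 (device_vendor_id, metadata_vendor_id) == 0;
}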

Most people should be running fwupd 1.3.x, which is the latest and greatest branch of fwupd. I appreciate some LTS distros can’t rebase to a newer minor version, and so for old versions of fwupd I’ve backported the fix. These are the fixes you want if you’re running 0.9.x, 1.0.x, 1.1.x or 1.2.x.

I’ll make the vendor-id a hard requirement for all vendors in about 6 months time, so if you maintain a distro packaged version of fwupd you have that much time before some updates will stop working. If anyone has comments or concerns, please let me know.

OSFC 2019 – Introducing the Linux Vendor Firmware Service

Posted by Richard Hughes on December 04, 2019 12:00 PM

A few months ago I gave a talk at OSFC.io titled Introducing the Linux Vendor Firmware Service.

If you have a few minutes, it’s a really useful high-level view of the entire architecture, along with a few quick dives into some of the useful things the LVFS can do. Questions and comments welcome!

coverity scan

Posted by Caolán McNamara on November 23, 2019 08:26 PM
When we made C++17 a requirement for LibreOffice at the end of 2018, the version of coverity provided by scan.coverity.com no longer worked for us. In July 2019 a newer version of the coverity tooling became available which supported C++17, and analysis resumed.

Prior to losing coverity support we had a defect density (i.e. defects per 1,000 lines of code) of 0; on its return this had inflated to 0.06, due to both new defects introduced during the down period and old defects newly detected by the additional checks introduced in the new version.

Today we're finally back to 0.

Growing the fwupd ecosystem

Posted by Richard Hughes on November 19, 2019 12:26 PM

Yesterday I wrote a blog about what hardware vendors need to provide so I can write them a fwupd plugin. A few people contacted me telling me that I should make it more generic, as I shouldn’t be the central point of failure in this whole ecosystem. The sensible thing, of course, is growing the “community” instead, and building up a set of (paid) consultants that can help the OEMs and ODMs, only getting me involved to review pull requests or for general advice. This would certainly reduce my current feeling of working at 100% and trying to avoid burnout.

As a first step, I’ve created an official page that will list any consulting companies that I feel are suitable to recommend for help with fwupd and the LVFS. The hardware vendors would love to throw money at this stuff, so they don’t have to care about upstream project release schedules and dealing with a grumpy maintainer like me. I’ve pinged the usual awesome people like Igalia, and hopefully more companies will be added to this list during the next few days.

If you do want your open-source consultancy to be added, please email me a two paragraph corporate-friendly blurb I can include on that new page, also with a link I can use for the “more details” button. If you’re someone I’ve not worked with before, you should be in a position to explain the difference between a capsule update and a DFU update, and be able to tell me what a version format is. I don’t want to be listing companies that don’t understand what fwupd actually is :)

Google and fwupd sitting in a tree

Posted by Richard Hughes on November 18, 2019 03:41 PM

I’ve been told by several sources (but not by Google directly, heh) that from Christmas onwards the “Designed for ChromeBook” sticker requires hardware vendors to use fwupd rather than random non-free binaries. This does make a lot of sense for Google, as the firmware flash tools I’ve seen the source for are often decades old, contain layers upon layers of abstractions, have dubious input sanitisation and are quite horrible to use. Many are setuid, which doesn’t make me sleep well at night, and I suspect the security team at Google feels the same. Most vendor binaries are built for a specific ODM hardware device, and all but one of them don’t use any kind of source control or formal review process.

The requirement from Google has caused mild panic among silicon suppliers and ODMs, as they’re having to actually interact with an open source upstream project and a slightly grumpy maintainer that wants to know lots of details about hardware that doesn’t implement one of the dozens of existing protocols that fwupd supports. These are companies that have never had to deal with working with “outside” people to develop software, and it probably comes as quite a shock to the system. To avoid repeating myself these are my basic rules when adding support for a device with a custom protocol in fwupd:

  • I can give you advice on how to write the plugin if you give me the specifications without signing an NDA, and/or the existing code under an LGPLv2+ license. From experience, we’ll probably not end up using any of your old code in fwupd but the error defines and function names might be similar, and I don’t want anyone to get “tainted” from looking at non-free code, so it’s safest all round if we have some reference code marked with the right license that actually compiles on Fedora 31. Yes, I know asking the legal team about releasing previously-nonfree code with a GPLish licence is difficult.
  • If you are running Linux, and want our help to debug or test your new plugin, you need to be running Fedora 30 or 31. If you run Ubuntu you’ll need to use the snap version of fwupd, and I can’t help you with random Ubuntu questions or interactions between the snap version and the distro version. I know your customer might be running Debian Stable or Ubuntu LTS, but that’s not what I’m paid to support. If you do use Fedora 29+ or RHEL 7+ you can also use the nice COPR I provide with git snapshots of master.
  • Please reflect the topology of your device. If writes have to go through another interface, passthru or IC, please give us access to documentation about that device too. I’m fed up having to reverse engineer protocols from looking at the “wrong side” of the client source code. If the passthru is implemented by a different vendor, they’ll need to work on the same terms as this.
  • If you want to design and write all of the plugin yourself, that’s awesome, but please follow the existing style and don’t try to wrap your existing code base with the fwupd plugin API. If your device has three logical children with different version numbers or firmware formats, we want to see three devices in fwupdmgr. If you want to restrict the child devices to a parent vendor, that’s fine, we now support that in fwupd and on the LVFS. If you’re adding custom InstanceIDs, these have to be documented in the README.md file.
  • If you’re using an nonstandard firmware format (as in, not DFU, Intel HEX or Motorola SREC) then you’ll need to write a firmware parser that’s going to be valgrind’ed and fuzzed. We will need all the header/footer documentation so we can verify the parser and add some small redistributable fuzz targets. If the blob is being passed to the hardware without parsing, you still might need to know the format of the header so that the plugin can do a sanity check that the firmware is suitable for the hardware, and that any internal CRC is actually correct. All the firmware parsers have to be paranoid and written defensively, because it’s me that looks bad on LWN if CVEs get issued.
  • If you want me to help with the plugin, I’m probably going to ask for test hardware, and two different versions of the firmware that can actually be flashed to the hardware you sent. A bare PCB is fine, but if you send me something please let me know so I can give you my personal address rather than have to collect it from a Red Hat office. If you send me hardware, ensure you also include a power supply that’s going to work in the UK, e.g. 240V. If you want it back, you’ll also need to provide me with a UPS/DHL collection sticker.
  • You do need to think about how to present your device version number, e.g. is 0x12345678 meant to be presented as “12.34.5678” or “18.52.86.120”? The LVFS really cares if this is correct, and users want to see the “same” version numbers as on the OEM web-page (see the sketch after this list).
  • You also need to know if the device is fully functional during the update, or if it operates in a degraded or bootloader mode. We also need to know what happens if flashing fails, e.g. is the device a brick, or is there some kind of A/B partition that makes a flash failure harmless? If the device is a brick, how can it be recovered without an RMA?
  • After the update is complete fwupd needs to “restart” the device so that the new firmware version can be verified, so there needs to be some kind of command the device understands – we can ask the user to reboot or re-plug the device if this is the only way to do this, although in 2019 we can really do better than that.
  • If you’re sharing a huge LGPLv2+ lump of code, we need access to someone who actually understands it, preferably the person that wrote it in the first place. Typically the code is uncommented and a recipe for a headache so being able to ask a human questions is invaluable. For this, either IRC, email or even just communicating via a shared Google doc (more common than you would believe…) is fine. I can’t discuss this stuff on Telegram, Hangouts or WhatsApp, sorry.
  • Once a plugin exists in fwupd and is upstream, we will expect pull requests to add either more VID/PIDs, #defines or to add variations to the protocol for new versions of the hardware. I’m going to be grumpy if I just get sent a random email with demands about backporting all the VID/PIDs to Debian stable. I have zero control on when Debian backports anything, and very little influence on when Ubuntu does a SRU. I have a lot of influence on when various Fedora releases get a new fwupd, and when RHEL gets backports for new hardware support.
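To make the version-format bullet concrete, here are two readings of the same raw 0x12345678 (a sketch; fwupd has version-format handling for exactly this reason):

#include <glib.h>

int main (void)
{
        guint32 v = 0x12345678;

        /* bytes as decimal quads: 18.52.86.120 */
        g_print ("%u.%u.%u.%u\n", v >> 24, (v >> 16) & 0xff, (v >> 8) & 0xff, v & 0xff);
        /* hex nibbles read off directly: 12.34.5678 */
        g_print ("%x.%x.%x\n", v >> 24, (v >> 16) & 0xff, v & 0xffff);
        return 0;
}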

Now, if all this makes me sound like a grumpy upstream maintainer then I apologize. I’m currently working with about half a dozen silicon suppliers who all failed some or all of the above bullets. I’m multiplexing myself with about a dozen companies right now, and supporting fwupd isn’t actually my entire job at Red Hat. I’m certainly not going to agree to “signing off a timetable” for each vendor as none of the vendors actually pay me to do anything…

Given interest in fwupd has exploded in the last year or so, I wanted to post something like this rather than have a 10-email back and forth about my expectations with each vendor. Some OEMs and even ODMs are now hiring developers with Linux experience, and I’m happy to work with them as fwupd becomes more important. I’ve already helped quite a few developers at random vendors get up to speed with fwupd and would be happy to help more. As the importance of fwupd and the LVFS grows more and more, vendors will need to hire developers who can build, extend and support their hardware. As fwupd grows, I’ll be asking vendors to do more of the work, as “get upstream to do it” doesn’t scale.

Extending proprietary PC embedded controller firmware

Posted by Matthew Garrett on November 18, 2019 08:19 AM
I'm still playing with my X210, a device that just keeps coming up with new ways to teach me things. I'm now running Coreboot full time, so the majority of the runtime platform firmware is free software. Unfortunately, the firmware that's running on the embedded controller (a separate chip that's awake even when the rest of the system is asleep and which handles stuff like fan control, battery charging, transitioning into different power states and so on) is proprietary and the manufacturer of the chip won't release data sheets for it. This was disappointing, because the stock EC firmware is kind of annoying (there's no hysteresis on the fan control, so it hits a threshold, speeds up, drops below the threshold, turns off, and repeats every few seconds - also, a bunch of the Thinkpad hotkeys don't do anything) and it would be nice to be able to improve it.

A few months ago someone posted a bunch of fixes, a Ghidra project and a kernel patch that lets you overwrite the EC's code at runtime for purposes of experimentation. This seemed promising. Some amount of playing later and I'd produced a patch that generated keyboard scancodes for all the missing hotkeys, and I could then use udev to map those scancodes to the keycodes that the thinkpad_acpi driver would generate. I finally had a hotkey to tell me how much battery I had left.

But something else included in that post was a list of the GPIO mappings on the EC. A whole bunch of hardware on the board is connected to the EC in ways that allow it to control them, including things like disabling the backlight or switching the wifi card to airplane mode. Unfortunately the ACPI spec doesn't cover how to control GPIO lines attached to the embedded controller - the only real way we have to communicate is via a set of registers that the EC firmware interprets and does stuff with.

One of those registers in the vendor firmware for the X210 looked promising, with individual bits that looked like radio control. Unfortunately writing to them does nothing - the EC firmware simply stashes the written value in RAM and returns it on read, without parsing the bits in any way. Doing anything more with them was going to involve modifying the embedded controller code.

Thankfully the EC has 64K of firmware and is only using about 40K of that, so there's plenty of room to add new code. The problem was generating the code in the first place and then getting it called. The EC is based on the CR16C architecture, which binutils supported until 10 days ago. To be fair it didn't appear to actually work, and binutils still has support for the more generic version of the CR16 family, so I built a cross assembler, wrote some assembly and came up with something that Ghidra was willing to parse except for one thing.

As mentioned previously, the existing firmware code responded to writes to this register by saving it to its RAM. My plan was to stick my new code in unused space at the end of the firmware, including code that duplicated the firmware's existing functionality. I could then replace the existing code that stored the register value with code that branched to my code, did whatever I wanted and then branched back to the original code. I hacked together some assembly that did the right thing in the most brute force way possible, but while Ghidra was happy with most of the code it wasn't happy with the instruction that branched from the original code to the new code, or the instruction at the end that returned to the original code. The branch instruction differs from a jump instruction in that it gives a relative offset rather than an absolute address, which means that branching to nearby code can be encoded in fewer bytes than going further. I was specifying the longest branch encoding possible in my assembly (the :l suffix in the CR16 assembly requests the long encoding), but the linker was rewriting that to a shorter one. Ghidra was interpreting the shorter branch as a negative offset, and it wasn't clear to me whether this was a binutils bug or a Ghidra bug. I ended up just hacking that code out of binutils so it generated code that Ghidra was happy with and got on with life.

Writing values directly to that EC register showed that it worked, which meant I could add an ACPI device that exposed the functionality to the OS. My goal here is to produce a standard Coreboot radio control device that other Coreboot platforms can implement, and then just write a single driver that exposes it. I wrote one for Linux that seems to work.

In summary: closed-source code is more annoying to improve, but that doesn't mean it's impossible. Also, strange Russians on forums make everything easier.


Native GTK Dialogs in LibreOffice

Posted by Caolán McNamara on October 31, 2019 08:01 PM

LibreOffice Native GTK Dialog Status

The LibreOffice UI was traditionally implemented with its own VCL toolkit which via theming emulated the host desktop toolkit.

Then we migrated the file format the dialogs were described in to the GtkBuilder file format, though they were still implemented with VCL widgetry, extended with additional GTK-alike layout widgets.

Then we migrated the translation format to gettext .mo files, which added the plural-form translation support we had lacked.

Then we incrementally migrated the code driving the dialogs to a new API with two implementations, one for VCL widgetry and one for GTK.


Over the last few major releases the GTK version of LibreOffice has increasingly used true GTK dialogs and fewer VCL dialogs, and in master, as of this week, there are now no direct uses of the VCL dialog APIs.

There are still some non-dialog utility windows and other elements to port over, but dialogs are complete.

LibreOffice has a lot of UI. There are 1029 XML UI definition files in master: 480 define a GtkDialog and a further 75 define a GtkMessageDialog. The remainder of the files typically describe a single page of a Notebook, Assistant or Sidebar, often appearing in multiple dialogs.

Here are some gifs of a small set of the dialogs from master under Fedora 31, taken under Wayland with Peek, showing some of the stock animations of the default GTK 3.24 Adwaita theme.

The Writer Character dialog

Notebook, Color Selector MenuButton, and ToggleButton animations

The Calc Page dialog

SpinButtons and legacy Preview widgets hosted in a native dialog


The Writer Paragraph dialog

"Double Decker" Notebook and Scale widgets

The Writer AutoCorrect dialog

Smooth scrolling of huge Emoji autocorrect list

Chart 3D View dialog

Amusingly Over-engineered custom lighting direction widget


The Options dialog

TreeView, Overlay ScrollBar, fade in animation of CheckButtons


GNOME, and Free Software Is Under Attack

Posted by Richard Hughes on October 22, 2019 01:34 PM

A month ago, GNOME was hit by a patent troll. We’re fighting, but we need money to fund the legal defense and counterclaim. I just donated, and if you use or develop free software you should too.

Letting Birds scooters fly free

Posted by Matthew Garrett on October 18, 2019 11:44 AM
(Note: These issues were disclosed to Bird, and they tell me that fixes have rolled out. I haven't independently verified)

Bird produce a range of rental scooters that are available in multiple markets. With the exception of the Bird Zero[1], all their scooters share a common control board described in FCC filings. The board contains three primary components - a Nordic NRF52 Bluetooth controller, an STM32 SoC and a Quectel EC21-V modem. The Bluetooth and modem are both attached to the STM32 over serial and have no direct control over the rest of the scooter. The STM32 is tied to the scooter's engine control unit and lights, and also receives input from the throttle (and, on some scooters, the brakes).

The pads labeled TP7-TP11 near the underside of the STM32 and the pads labeled TP1-TP5 near the underside of the NRF52 provide Serial Wire Debug, although confusingly the data and clock pins are the opposite way around between the STM and the NRF. Hooking this up via an STLink and using OpenOCD allows dumping of the firmware from both chips, which is where the fun begins. Running strings over the firmware from the STM32 revealed "Set mode to Free Drive Mode". Challenge accepted.

Working back from the code that printed that, it was clear that commands could be delivered to the STM from the Bluetooth controller. The Nordic NRF52 parts are an interesting design - like the STM, they have an ARM Cortex-M microcontroller core. Their firmware is split into two halves, one the low level Bluetooth code and the other application code. They provide an SDK for writing the application code, and working through Ghidra made it clear that the majority of the application firmware on this chip was just SDK code. That made it easier to find the actual functionality, which was just listening for writes to a specific BLE attribute and then hitting a switch statement depending on what was sent. Most of these commands just got passed over the wire to the STM, so it seemed simple enough to just send the "Free drive mode" command to the Bluetooth controller, have it pass that on to the STM and win. Obviously, though, things weren't so easy.

It turned out that passing most of the interesting commands on to the STM was conditional on a variable being set, and the code path that hit that variable had some impressively complicated looking code. Fortunately, I got lucky - the code referenced a bunch of data, and searching for some of the values in that data revealed that they were the AES S-box values. Enabling the full set of commands required you to send an encrypted command to the scooter, which would then decrypt it and verify that the cleartext contained a specific value. Implementing this would be straightforward as long as I knew the key.

Most AES keys are 128 bits, or 16 bytes. Digging through the code revealed 8 bytes worth of key fairly quickly, but the other 8 bytes were less obvious. I finally figured out that 4 more bytes were the value of another Bluetooth variable which could be simply read out by a client. The final 4 bytes were more confusing, because all the evidence made no sense. It looked like it came from passing the scooter serial number to atoi(), which converts an ASCII representation of a number to an integer. But this seemed wrong, because atoi() stops at the first non-numeric value and the scooter serial numbers all started with a letter[2]. It turned out that I was overthinking it and for the vast majority of scooters in the fleet, this section of the key was always "0".
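
As a sketch, the key assembly described above looks something like this (the function, the argument names and the byte order are my reading of the events, not verified offsets):

#include <stdint.h>
#include <string.h>
#include <stdlib.h>

static void build_key(uint8_t key[16],
                      const uint8_t fixed[8],     /* bytes found in the firmware */
                      const uint8_t ble_attr[4],  /* readable Bluetooth attribute */
                      const char *serial)         /* printed on the scooter */
{
    /* atoi() stops at the first non-digit, so "B12345" becomes 0 */
    uint32_t tail = (uint32_t)atoi(serial);

    memcpy(key, fixed, 8);
    memcpy(key + 8, ble_attr, 4);
    memcpy(key + 12, &tail, 4);  /* in practice this is usually just 0 */
}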

At that point I had everything I needed to write a simple app to unlock the scooters, and it worked! For about 2 minutes, at which point the network would notice that the scooter was unlocked when it should be locked and send a lock command to force-disable the scooter again. Ah well.

So, what else could I do? The next thing I tried was just modifying some STM firmware and flashing it onto a board. It still booted, indicating that there was no sort of verified boot process. Remember what I mentioned about the throttle being hooked through the STM32's analogue to digital converters[3]? A bit of hacking later and I had a board that would appear to work normally, but about a minute after starting the ride would cut the throttle. Alternative options are left as an exercise for the reader.

Finally, there was the component I hadn't really looked at yet. The Quectel modem actually contains its own application processor that runs Linux, making it significantly more powerful than any of the chips actually running the scooter application[4]. The STM communicates with the modem over serial, sending it an AT command asking it to make an SSL connection to a remote endpoint. It then uses further AT commands to send data over this SSL connection, allowing it to talk to the internet without having any sort of IP stack. Figuring out just what was going over this connection was made slightly difficult by virtue of all the debug functionality having been ripped out of the STM's firmware, so in the end I took a more brute force approach - I identified the address of the function that sends data to the modem, hooked up OpenOCD to the SWD pins on the STM, ran OpenOCD's gdb stub, attached gdb, set a breakpoint for that function and then dumped the arguments being passed to that function. A couple of minutes later and I had a full transaction between the scooter and the remote.

The scooter authenticates against the remote endpoint by sending its serial number and IMEI. You need to send both, but the IMEI didn't seem to need to be associated with the serial number at all. New connections seemed to take precedence over existing connections, so it would be simple to just pretend to be every scooter and hijack all the connections, resulting in scooter unlock commands being sent to you rather than to the scooter or allowing someone to send fake GPS data and make it impossible for users to find scooters.

In summary: Secrets that are stored on hardware that attackers can run arbitrary code on probably aren't secret, not having verified boot on safety critical components isn't ideal, devices should have meaningful cryptographic identity when authenticating against a remote endpoint.

Bird responded quickly to my reports, accepted my 90 day disclosure period and didn't threaten to sue me at any point in the process, so good work Bird.

(Hey scooter companies I will absolutely accept gifts of interesting hardware in return for a cursory security audit)

[1] And some very early M365 scooters
[2] The M365 scooters that Bird originally deployed did have numeric serial numbers, but they were 6 characters of type code followed by a / followed by the actual serial number - the number of type codes was very constrained and atoi() would terminate at the / so this was still not a large keyspace
[3] Interestingly, Lime made a different design choice here and plumb the controls directly through to the engine control unit without the application processor having any involvement
[4] Lime run their entire software stack on the modem's application processor, but because of [3] they don't have any realtime requirements so this is more straightforward


libinput and tablet pad keys

Posted by Peter Hutterer on October 17, 2019 11:23 PM

Upcoming in libinput 1.15 is a small feature to support Wacom tablets a tiny bit better. If you look at the higher-end devices in Wacom's range, e.g. the Cintiq 27QHD, you'll notice that at the top right of the device are three hardware buttons with icons. Those buttons are intended to open the config panel, the on-screen display or the virtual keyboard. They've been around for a few years and supported in the kernel for a few releases. But in userspace, the events from those keys were ignored, cast out into the wild before eventually running out of electrons and succumbing to misery. Well, that's all changing now with a new interface being added to libinput to forward those events.

Step back a second and let's look at the tablet interfaces. We have one for tablet tools (styli) and one for tablet pads. In the latter, we have events for rings, strips and buttons. The buttons are numerically ordered, so button 1 is simply button 1 with no special meaning. Anything more specific needs to be handled by the compositor/client side which is responsible for assigning e.g. keyboard shortcuts to those buttons.

The special keys however are different, they have a specific function indicated by the icon on the key itself. So libinput 1.15 adds a new event type for tablet pad keys. The events look quite similar to the button events but they have a linux/input-event-codes.h specific button code that indicates what they are. So the compositor can start the OSD, or control panel, or whatever directly without any further configuration required.
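
A compositor-side sketch of consuming the new events (the libinput calls match the 1.15 API; the key codes are the sort of codes those Wacom keys map to, and the switch body is obviously illustrative):

#include <libinput.h>
#include <linux/input-event-codes.h>

/* called for events of type LIBINPUT_EVENT_TABLET_PAD_KEY */
static void handle_pad_key(struct libinput_event *ev)
{
    struct libinput_event_tablet_pad *pad =
        libinput_event_get_tablet_pad_event(ev);

    if (libinput_event_tablet_pad_get_key_state(pad) !=
        LIBINPUT_KEY_STATE_PRESSED)
        return;

    switch (libinput_event_tablet_pad_get_key(pad)) {
    case KEY_BUTTONCONFIG:       /* open the pad/button config panel */
        break;
    case KEY_ONSCREEN_KEYBOARD:  /* show the virtual keyboard */
        break;
    case KEY_CONTROLPANEL:       /* open the control panel */
        break;
    default:
        break;
    }
}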

This interface hasn't been merged yet; it's waiting for the Linux kernel 5.4 release, which has a few kernel-level fixes for those keys.

libinput and button scrolling locks

Posted by Peter Hutterer on October 17, 2019 10:56 PM

For a few years now, libinput has provided button scrolling. Holding a designated button down and moving the device up/down or left/right creates the matching scroll events. We enable this behaviour by default on some devices (e.g. trackpoints) but it's available on mice and some other devices. Users can change the button that triggers it, e.g. assign it to the right button. There are of course a couple of special corner cases to make sure you can still click that button normally but as I said, all this has been available for quite some time now.

New in libinput 1.15 is the button lock feature. The button lock removes the need to hold the button down while scrolling. When the button lock is enabled, a single button click (i.e. press and release) of that button holds that button logically down for scrolling and any subsequent movement by the device is translated to scroll events. A second button click releases that button lock and the device goes back to normal movement. That's basically it, though there are some extra checks to make sure the button can still be used for normal clicking (you will need to double-click for a single logical click now though).
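
For compositor writers, enabling it looks roughly like this (these are the libinput 1.15 config calls; error handling is omitted):

#include <libinput.h>
#include <linux/input-event-codes.h>

static void enable_button_scroll_lock(struct libinput_device *device)
{
    /* scroll while the (logically held) right button is down */
    libinput_device_config_scroll_set_method(device,
            LIBINPUT_CONFIG_SCROLL_ON_BUTTON_DOWN);
    libinput_device_config_scroll_set_button(device, BTN_RIGHT);

    /* one click latches the button for scrolling, a second click releases it */
    libinput_device_config_scroll_set_button_lock(device,
            LIBINPUT_CONFIG_SCROLL_BUTTON_LOCK_ENABLED);
}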

This is primarily an accessibility feature and is likely to find its way into the GUI tools under the accessibility heading.

Riddle me this

Posted by Benjamin Otte on October 17, 2019 10:46 PM

Found this today while playing around, thought people might enjoy this riddle.

$> cat test.c
typedef int foo;
int main()
{
  foo foo = 1;
  return (foo) +0;
}
$> gcc -Wall -o test test.c && ./test && echo $?

What does this print?

  1. 0
  2. 1
  3. Some compilation warnings, then 0.
  4. Some compilation warnings, then 1.
  5. It doesn’t compile.

I’ll put an answer in the comments.

libinput's bus factor is 1

Posted by Peter Hutterer on October 16, 2019 05:56 AM

A few weeks back, I was at XDC and gave a talk about various current and past input stack developments (well, a subset thereof anyway). One of the slides pointed out libinput's bus factor and I'll use this blog to make this a bit more widely known.

If you don't know what the bus factor is, Wikipedia defines it as:

The "bus factor" is the minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel.
libinput has a bus factor of 1.

Let's arbitrarily pick the 1.9.0 release (roughly 2 years ago) and look at the numbers: of the ~1200 commits since 1.9.0, just under 990 were done by me. In those 2 years we had 76 contributors in total, but only 24 of them have more than one commit and only 6 have more than 5 commits. The numbers don't really change much even if we go all the way back to 1.0.0 in 2015. These numbers do not include the non-development work: release maintenance for new releases and point releases, reviewing CI failures [1], writing documentation (including the stuff on this blog), testing and bug triage. Right now, this is effectively all done by one person.

This is... less than ideal. At this point libinput is more-or-less the only input stack we have [2] and all major distributions rely on it. It drives mice, touchpads, tablets, keyboards, touchscreens, trackballs, etc. so basically everything except joysticks.

Anyway, I'm largely writing this blog post in the hope that someone gets motivated enough to dive into this. Right now, if you get 50 patches into libinput you get the coveted second-from-the-top spot, with all the fame and fortune that entails (i.e. little to none, but hey, underdogs are big in popular culture). Short of that, any help with building an actual community would be appreciated too.

Either way, lest it be said that no-one saw it coming, let's ring the alarm bells now before it's too late. Ding ding!

[1] Only as of a few days ago can we run the test suite as part of the CI infrastructure, thanks to Benjamin Tissoires. Previously it was run on my laptop and virtually nowhere else.
[2] fyi, xf86-input-evdev: 5 patches in the same timeframe, xf86-input-synaptics: 6 patches (but only 3 actual changes) so let's not pretend those drivers are well-maintained.

Investigating the security of Lime scooters

Posted by Matthew Garrett on October 04, 2019 06:04 AM
(Note: to be clear, this vulnerability does not exist in the current version of the software on these scooters. Also, this is not the topic of my Kawaiicon talk.)

I've been looking at the security of the Lime e-scooters. These caught my attention because:
(1) There's a whole bunch of them outside my building, and
(2) I can see them via Bluetooth from my sofa
which, given that I'm extremely lazy, made them more attractive targets than something that would actually require me to leave my home. I did some digging. Limes run Linux and have a single running app that's responsible for scooter management. They have an internal debug port that exposes USB and which, until this happened, ran adb (as root!) over this USB. As a result, there's a fair amount of information available in various places, which made it easier to start figuring out how they work.

The obvious attack surface is Bluetooth (Limes have wifi, but only appear to use it to upload lists of nearby wifi networks, presumably for geolocation if they can't get a GPS fix). Each Lime broadcasts its name as Lime-12345678 where 12345678 is 8 digits of hex. They implement Bluetooth Low Energy and expose a custom service with various attributes. One of these attributes (0x35 on at least some of them) sends Bluetooth traffic to the application processor, which then parses it. This is where things get a little more interesting. The app has a core event loop that can take commands from multiple sources and then makes a decision about which component to dispatch them to. Each command is of the following form:

AT+type,password,time,sequence,data$

where type is one of ATH, QRY, CMD or DBG. The password is a TOTP derived from the IMEI of the scooter, the time is simply the current date and time of day, the sequence is a monotonically increasing counter and the data is a blob of JSON. The command is terminated with a $ sign. The code is fairly agnostic about where the command came from, which means that you can send the same commands over Bluetooth as you can over the cellular network that the Limes are connected to. Since locking and unlocking is triggered by one of these commands being sent over the network, it ought to be possible to do the same by pushing a command over Bluetooth.
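
For illustration, building such a command is just string formatting (all the field values here are fabricated placeholders):

#include <stdio.h>

int main(void)
{
    char cmd[256];

    snprintf(cmd, sizeof(cmd), "AT+%s,%s,%s,%u,%s$",
             "CMD",                   /* type: ATH, QRY, CMD or DBG */
             "123456",                /* TOTP derived from the IMEI */
             "2019-10-04 12:00:00",   /* current date and time */
             42u,                     /* monotonically increasing sequence */
             "{\"cmd\":\"status\"}"); /* JSON payload */
    puts(cmd);
    return 0;
}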

Unfortunately for nefarious individuals, all commands sent over Bluetooth are ignored until an authentication step is performed. The code I looked at had two ways of performing authentication - you could send an authentication token that was derived from the scooter's IMEI and the current time and some other stuff, or you could send a token that was just an HMAC of the IMEI and a static secret. Doing the latter was more appealing, both because it's simpler and because doing so flipped the scooter into manufacturing mode at which point all other command validation was also disabled (bye bye having to generate a TOTP). But how do we get the IMEI? There are actually two approaches:

1) Read it off the sticker that's on the side of the scooter (obvious, uninteresting)
2) Take advantage of how the scooter's Bluetooth name is generated

Remember the 8 digits of hex I mentioned earlier? They're generated by taking the IMEI, encrypting it using DES and a static key (0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88), discarding the first 4 bytes of the output and turning the last 4 bytes into 8 digits of hex. Since we're discarding information, there's no way to immediately reverse the process - but IMEIs for a given manufacturer are all allocated from the same range, so we can just take the entire possible IMEI space for the modem chipset Lime use, encrypt all of them and end up with a mapping of name to IMEI (it turns out this doesn't guarantee that the mapping is unique - for around 0.01%, the same name maps to two different IMEIs). So we now have enough information to generate an authentication token that we can send over Bluetooth, which disables all further authentication and enables us to send further commands to disconnect the scooter from the network (so we can't be tracked) and then unlock and enable the scooter.
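
A sketch of that name derivation, using OpenSSL's venerable single-DES API (how the 15-digit IMEI is packed into the 8-byte DES block isn't spelled out above, so imei_block is hand-waved here):

#include <stdio.h>
#include <openssl/des.h>

int main(void)
{
    DES_cblock key = { 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88 };
    DES_cblock imei_block = { 0 };  /* somehow derived from the IMEI */
    DES_cblock out;
    DES_key_schedule ks;

    DES_set_key_unchecked(&key, &ks);
    DES_ecb_encrypt(&imei_block, &out, &ks, DES_ENCRYPT);

    /* discard the first 4 bytes, hex-encode the last 4 */
    printf("Lime-%02X%02X%02X%02X\n", out[4], out[5], out[6], out[7]);
    return 0;
}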

(Note: these are actual crimes)

This all seemed very exciting, but then a shock twist occurred - earlier this year, Lime updated their authentication method and now there's actual asymmetric cryptography involved and you'd need to engage in rather more actual crimes to obtain the key material necessary to authenticate over Bluetooth, and all of this research becomes much less interesting other than as an example of how other companies probably shouldn't do it.

In any case, congratulations to Lime on actually implementing security!


Some Flatpak updates

Posted by Matthias Clasen on October 03, 2019 11:43 AM

Flatpak development is not standing still. Here is a quick summary of recent and coming changes.

Better extensions

In 1.4.2, Flatpak gained the ability to use extra-data for extensions. This mechanism has been around for applications for a long time, but it is a new feature for extensions.

The 19.08 version of the freedesktop runtime uses it for its new org.freedesktop.Platform.openh264 extension, which uses the Cisco openh264 builds.

Since we are taking the ‘run everywhere’ aspect of Flatpak seriously, we’ve backported this feature from the 1.4 branch to older stable branches and released 1.2.4 and 1.0.9, so even users on very stable distributions can enjoy this new feature.

Future plans

We’ve quietly started to work on Flatpak 1.6, which should be out before the end of the year.

On the roadmap for this release, we have:

  • Support for masking updates and pinning apps. This gives users more control over what updates Flatpak installs, without having to answer questions every time.
  • Parental controls. This optional feature uses libmalcontent to implement policies about what applications users can install and run, based on OARS content ratings.
  • Disk space checks. This is an ongoing effort to improve the accuracy of our disk- and download-size handling and to handle low disk space situations more gracefully.
  • Infrastructure for purchases/donations. This is still a bit of a research topic.

You can follow the discussion around these features, the flatpak roadmap and general flatpak topics on the flatpak mailing list.

Coming soon to portals

Things are happening on the portal side too. Some of these have already landed, and will appear in a release soon.

Secrets

We have a secrets portal now. It works by providing a master secret to the sandboxed app, which is then used to store the application’s secrets in an encrypted file inside the sandbox. The master secret is stored in the session keyring.

This is nice in that applications don’t leave their secrets behind in the keyring when they are uninstalled, and the application secrets are safe from others.

The backend for this portal will be provided by gnome-keyring, and libsecret will automatically use it inside a sandbox. Backend implementations for other environments are more than welcome.
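
To make that concrete, here is a minimal sketch of storing a secret with the stock libsecret API (the schema name and attributes are invented); the point is that this exact code transparently goes through the secret portal when run inside a sandbox:

#include <libsecret/secret.h>

static const SecretSchema example_schema = {
    "org.example.Password", SECRET_SCHEMA_NONE,
    {
        { "username", SECRET_SCHEMA_ATTRIBUTE_STRING },
        { NULL, 0 },
    }
};

int main(void)
{
    GError *error = NULL;

    secret_password_store_sync(&example_schema, SECRET_COLLECTION_DEFAULT,
                               "Example password", "hunter2",
                               NULL, &error,
                               "username", "alice",
                               NULL);
    if (error != NULL) {
        g_printerr("failed to store secret: %s\n", error->message);
        g_error_free(error);
        return 1;
    }
    return 0;
}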

The secret portal is the work of Daiki Ueno, who gave a talk about it at Guadec.

Self-updates

The Flatpak commandline and tools like Discover or the Elementary app store do a fine job of handling updates for Flatpak apps and runtimes.

But the reality is that self-updating is a popular feature for applications, so we added an update portal that lets them do this in a clean way, with proper integration in the Flatpak machinery.

Background

The background portal monitors applications that are running in the background (without open windows). It gives apps a way to request permission to run in the background, and it notifies users when apps are trying to do so sneakily without permission. The portal also lets applications request to be started automatically when the user logs in.

To implement this, the portal needs information from the compositor about open windows, and which applications they belong to. Currently this is implemented for gnome-shell; other backends are more than welcome.

Window sharing

The screencast portal now lets you select individual windows, in addition to screens, if the application asks for this.

For now, the portal identifies windows by the application icon and window title. We are looking to improve this by using thumbnails.

Backgrounds

We will add a small bit of desktop integration with a portal for setting desktop wallpapers.

A portal library

In the ideal case, portal functionality is used transparently by existing desktop libraries without the need for apps to do anything special. Examples for this are GtkFileChooserNative using the file chooser portal, or libsecret using the new secret portal.

But for some portals, there is no natural library API, and in these cases, doing the portal interaction with D-Bus calls can be a bit cumbersome.

Therefore, we are working on a libportal library that will provide GIO-style async APIs for portal requests.

Open for contribution

If you want to get involved with Flatpak development, or are just curious, check out the flatpak project on GitHub, chime in on the Flatpak mailing list, or find us on IRC in #flatpak on freenode.

Do we need to rethink what free software is?

Posted by Matthew Garrett on September 27, 2019 05:47 PM
Licensing has always been a fundamental tool in achieving free software's goals, with copyleft licenses deliberately taking advantage of copyright to ensure that all further recipients of software are in a position to exercise free software's four essential freedoms. Recently we've seen people raising two very different concerns around existing licenses and proposing new types of license as remedies, and while both are (at present) incompatible with our existing concepts of what free software is, they both raise genuine issues that the community should seriously consider.

The first is the rise in licenses that attempt to restrict business models based around providing software as a service. If users can pay Amazon to provide a hosted version of a piece of software, there's little incentive for them to pay the authors of that software. This has led to various projects adopting license terms such as the Commons Clause that effectively make it nonviable to provide such a service, forcing providers to pay for a commercial use license instead.

In general the entities pushing for these licenses are VC backed companies[1] who are themselves benefiting from free software written by volunteers that they give nothing back to, so I have very little sympathy. But it does raise a larger issue - how do we ensure that production of free software isn't just a mechanism for the transformation of unpaid labour into corporate profit? I'm fortunate enough to be paid to write free software, but many projects of immense infrastructural importance are simultaneously fundamental to multiple business models and also chronically underfunded. In an era where people are becoming increasingly vocal about wealth and power disparity, this obvious unfairness will result in people attempting to find mechanisms to impose some degree of balance - and given the degree to which copyleft licenses prevented certain abuses of the commons, it's likely that people will attempt to do so using licenses.

At the same time, people are spending more time considering some of the other ethical outcomes of free software. Copyleft ensures that you can share your code with your neighbour without your neighbour being able to deny the same freedom to others, but it does nothing to prevent your neighbour using your code to deny other fundamental, non-software, freedoms. As governments make more and more use of technology to perform acts of mass surveillance, detention, and even genocide, software authors may feel legitimately appalled at the idea that they are helping enable this by allowing their software to be used for any purpose. The JSON license includes a requirement that "The Software shall be used for Good, not Evil", but the lack of any meaningful clarity around what "Good" and "Evil" actually mean makes it hard to determine whether it achieved its aims.

The definition of free software includes the assertion that it must be possible to use the software for any purpose. But if it is possible to use software in such a way that others lose their freedom to exercise those rights, is this really the standard we should be holding? Again, it's unsurprising that people will attempt to solve this problem through licensing, even if in doing so they no longer meet the current definition of free software.

I don't have solutions for these problems, and I don't know for sure that it's possible to solve them without causing more harm than good in the process. But in the absence of these issues being discussed within the free software community, we risk free software being splintered - on one side, with companies imposing increasingly draconian licensing terms in an attempt to prop up their business models, and on the other side, with people deciding that protecting people's freedom to life, liberty and the pursuit of happiness is more important than protecting their freedom to use software to deny those freedoms to others.

As stewards of the free software definition, the Free Software Foundation should be taking the lead in ensuring that these issues are discussed. The priority of the board right now should be to restructure itself to ensure that it can legitimately claim to represent the community and play the leadership role it's been failing to in recent years, otherwise the opportunity will be lost and much of the activist energy that underpins free software will be spent elsewhere.

If free software is going to maintain relevance, it needs to continue to explain how it interacts with contemporary social issues. If any organisation is going to claim to lead the community, it needs to be doing that.

[1] Plus one VC firm itself - Bain Capital, an investment firm notorious for investing in companies, extracting as much value as possible and then allowing the companies to go bankrupt


Synaptics CX Audio Support

Posted by Richard Hughes on September 25, 2019 04:02 PM

A couple of weeks ago, Synaptics (who now own Conexant) sent me 22,000+ lines of LGPLv2+ licensed C++ that was capable of updating the firmware of all the CXxxxx audio devices that exist in various laptops and peripherals. Most of last week was spent reading the code, and refactoring it to be a CX audio plugin in fwupd. There were a few things I could do to reduce the code size considerably:

  • Use the abstractions shared with all the other plugins, e.g. SREC file format processing, data chunking and low level USB HID
  • Drop support for hardware families which are no longer supported and not likely to receive updates
  • Remove the layers of abstractions and the macros-of-macros-of-macros so common with a codebase age measured in decades
  • Use helper objects in GLib and GObject rather than having to create everything from scratch

So, after all that we got down to a 1377 line fwupd plugin which is a 16x code reduction. It’s broadly comparable in functionality to the 22,000 line code drop but only works in fwupd as a plugin rather than as a standalone updater. To add support for new hardware to the plugin all we have to do is add an entry to the quirk file, which tells us which CX family the specific USB VID/PID is using. The rest is auto-detected.
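
For the curious, a quirk entry looks roughly like this (the VID/PID below is invented and the key names are from memory, so treat it as a sketch rather than documentation):

[DeviceInstanceId=USB\VID_1234&PID_5678]
Plugin = synaptics_cxaudio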

I can’t tell you the OEM or the hardware all this work is being driven by, but eagle-eyed readers will work it out :) In some cases you might see an extra device appear in fwupdmgr get-devices if you’re running the soon-to-be-released fwupd 1.3.2 and hopefully we can get firmware updates which use this new device on the LVFS some time this year.

Epiphany Technology Preview Users: Action Required

Posted by Michael Catanzaro on September 18, 2019 02:19 PM

Epiphany Technology Preview has moved from https://sdk.gnome.org to https://nightly.gnome.org. The old Epiphany Technology Preview is now end-of-life. Action is required to update. If you installed Epiphany Technology Preview prior to a couple minutes ago, uninstall it using GNOME Software and then reinstall using this new flatpakref.

Apologies for this disruption.

The main benefit to end users is that you’ll no longer need separate remotes for nightly runtimes and nightly applications, because everything is now hosted in one repo. See Abderrahim’s announcement for full details on why this transition is occurring.

It's time to talk about post-RMS Free Software

Posted by Matthew Garrett on September 14, 2019 11:57 AM
Richard Stallman has once again managed to demonstrate incredible insensitivity[1]. There's an argument that in a pure technical universe this is irrelevant and we should instead only consider what he does in free software[2], but free software isn't a purely technical topic - the GNU Manifesto is nakedly political, and while free software may result in better technical outcomes it is fundamentally focused on individual freedom and will compromise on technical excellence if otherwise the result would be any compromise on those freedoms. And in a political movement, there is no way that we can ignore the behaviour and beliefs of that movement's leader. Stallman is driving away our natural allies. It's inappropriate for him to continue as the figurehead for free software.

But I'm not calling for Stallman to be replaced. If the history of social movements has taught us anything, it's that tying a movement to a single individual is a recipe for disaster. The FSF needs a president, but there's no need for that person to be a leader - instead, we need to foster an environment where any member of the community can feel empowered to speak up about the importance of free software. A decentralised movement about returning freedoms to individuals can't also be about elevating a single individual to near-magical status. Heroes will always end up letting us down. We fix that by removing the need for heroes in the first place, not attempting to find increasingly perfect heroes.

Stallman was never going to save us. We need to take responsibility for saving ourselves. Let's talk about how we do that.

[1] There will doubtless be people who will leap to his defense with the assertion that he's neurodivergent and all of these cases are consequences of that.

(A) I am unaware of a formal diagnosis of that, and I am unqualified to make one myself. I suspect that basically everyone making that argument is similarly unqualified.
(B) I've spent a lot of time working with him to help him understand why various positions he holds are harmful. I've reached the conclusion that it's not that he's unable to understand, he's just unwilling to change his mind.

[2] This argument is, obviously, bullshit


GNOME Firmware 3.34.0 Release

Posted by Richard Hughes on September 13, 2019 01:12 PM

This morning I tagged the newest fwupd release, 1.3.1. There are a lot of new things in this release and a whole lot of polishing, so I encourage you to read the release notes if this kind of thing interests you.

Anyway, to the point of this post. With the new fwupd 1.3.1 you can now build just the libfwupd library, which makes it easy to build GNOME Firmware (old name: gnome-firmware-updater) on Flathub. I tagged the first official release 3.34.0 to celebrate the recent GNOME release, and to indicate that it’s ready for use by end users. I guess it’s important to note this is just a random app hacked together by 3 engineers and not something lovingly designed by the official design team. All UX mistakes are my own :)

GNOME Firmware is designed to be a not-installed-by-default power-user tool to investigate, upgrade, downgrade and re-install firmware. GNOME Software will continue to be used for updates as before. Vendor helpdesks can ask users to install GNOME Firmware rather than getting them to look at command line output.

We need to polish up GNOME Firmware going forwards, and add the last few features we need. If this interests you, please send email and I’ll explain what needs doing. We also need translations, although that can perhaps wait until GNOME Firmware moves to GNOME proper, rather than just being a repo in my personal GitLab. If anyone does want to translate it before then, please open merge requests, and be sure to file issues if any of the strings are difficult to translate or ambiguous. Please also file issues (or even better merge requests!) if it doesn’t build or work for you.

If you just want to try out a new application, it takes 10 seconds to install it from Flathub.

Unit-testing static functions in C

Posted by Peter Hutterer on September 12, 2019 04:21 AM

An annoying thing about C code is that there are plenty of functions that cannot be unit-tested by some external framework - specifically anything declared as static. Any larger code-base will end up with hundreds of those functions, many of which are short and reasonably self-contained but complex enough that you can't trust them by looks alone. But since they're static I can't access them from the outside (and "outside" is defined as "not in the same file" here).

The approach I've chosen in the past is to move the more hairy ones into separate files or at least declare them normally. That works but is annoying for some cases, especially those that really only get called once. In case you're wondering whether you have at least one such function in your source tree: yes, the bit that parses your commandline arguments is almost certainly complicated and not tested.

Anyway, this week I've finally found the right combination of hacks to make testing static functions easy, and it's:

  • #include the source file in your test code.
  • Mock any helper functions you'd need to trick the called functions
  • Instruct the linker to ignore unresolved symbols
And boom, you can write test cases to only test a single file within your source tree. And without any modifications to the source code itself.
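
For example, a test file following that recipe might look like this (example.c, parse_flag() and lookup_option() are all hypothetical names):

/* test-example.c */
#include "example.c"   /* pulls the static functions straight into this file */

/* mock a helper that example.c calls; the linker flags do the rest */
int lookup_option(const char *name)
{
    return 42;
}

int main(void)
{
    /* parse_flag() is static in example.c but visible here */
    return parse_flag("--verbose") == 1 ? 0 : 1;
}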

A more detailed writeup is available in this github repo.

For the impatient, the meson snippet for a fictional source file example.c would look like this:


test('test-example',
     executable('test-example',
                'example.c', 'test-example.c',
                dependencies: [dep_ext_library],
                link_args: ['-Wl,--unresolved-symbols=ignore-all',
                            '-Wl,-zmuldefs',
                            '-no-pie'],
                install: false),
)

There is no restriction on which test suite you can use. I've started adding a few test cases based on this approach to libinput and so far it's working well. If you have a better approach or improvements, I'm all ears.

Please welcome Acer to the LVFS

Posted by Richard Hughes on September 11, 2019 11:44 AM

Acer has now officially joined the LVFS, promoting the Aspire A315 firmware to stable.

Acer has been testing the LVFS for some time and now all the legal and technical checks have been completed. Other models will follow soon!

WebKit Vulnerabilities Facilitate Human Rights Abuses

Posted by Michael Catanzaro on September 08, 2019 05:32 PM

Chinese state actors have recently abused vulnerabilities in the JavaScriptCore component of WebKit to hack the personal computing devices of Uighur Muslims in the Xinjiang region of China. Mass digital surveillance is a key component of China’s ongoing brutal human rights crackdown in the region.

This has resulted in a public relations drama that is largely a distraction to the issue at hand. Whatever big-company PR departments have to say on the matter, I have no doubt that the developers working on WebKit recognize the severity of this incident and are grateful to Project Zero, which reported these vulnerabilities and has previously provided numerous other high-quality private vulnerability reports. (Many other organizations deserve credit for similar reports, especially Trend Micro’s Zero Day Initiative.)

WebKit as a project will need to reassess certain software development practices that may have facilitated the abuse of these vulnerabilities. The practice of committing security fixes to open source long in advance of corresponding Safari releases may need to be reconsidered.

Sadly, Uighurs should assume their personal computing devices have been compromised by state-sponsored attackers, and that their private communications are not private. Even if not compromised in this particular incident, similar successful attacks are overwhelmingly likely in the future.

Realizing that I’m not Super Human: Part 1

Posted by Richard Hughes on August 28, 2019 12:44 PM

Most of the content on this blog is technical in nature, as is my twitter feed. I wanted to step to one side, and talk a bit about one of the little things I’ve learned about my body: I’m not super human any more.

I’m one of those people that have been really lucky with my general physical and mental health over the years. I used to play a lot of rugby and got the odd injury, but nothing a long hot bath couldn’t fix. Modulo catching the flu a few years ago I don’t really get ill very much.

About this time last year I began to get a small amount of back pain when sitting for a long time, or when walking around for over an hour or so. This was the first warning. Over the next few months this got worse to the point it was now an electrical tingling all down one leg whenever I did “too much” walking or playing with the kids. I self-diagnosed this as some kind of sciatica and didn’t pay too much attention to it. This was the second warning sign. After my back finally “went pop” a couple of times in one week leaving me unable to walk properly at all, I finally went to a private physiotherapist and asked for some advice. Luckily for me this was all covered as part of my Red Hat compensation package and I didn’t have to pay a thing, which I know really isn’t the case if you’re paying for healthcare yourself.

The Physio did quite a lot of tests and then announced that my posture was, put bluntly, total crap. There was no magic pill nor any special sports massage to make it better, but everything could be fixed with a little bit of hard work. I had to make some immediate changes: my comfy armchair was out, a standing desk was in. 10 hours sitting in a chair coding was bad, hourly breaks were enforced. I was given some exercises to do every day (which I did) and after about 6 weeks of visits I was discharged as the tingling had gone and the back pain was much less. The physio suggested I do a weekly Pilates class to further improve my posture and to keep everything where it should be.

This was waaaay outside my comfort zone, as I’d never done any kind of exercise or group class before. I went to a group class and immediately realized I was at least two orders of magnitude less capable than everyone else. I could barely touch my knees when they could all touch the floor. The instructor was really kind and showed me all the positions and things to do and not do, but I still felt a bit weird in a class of mostly middle aged women dragging them all down to my level. I asked the instructor if he did 1:1 classes and he said yes; Since then I’ve been doing a 1 hour Pilates class every other week and, against all odds, I’m actually quite enjoying it now. My posture is much better; when I run I feel less like I’m flopping about and now have a stable “core” of muscle holding me all together. I can throw my children around in the park, and not worry about discs in my back bulging to the point of rupture. My breathing and concentration has improved, and if anything I guess I’m slightly more productive with hourly breaks.

Talking to other men, it seems quite a few people also do Pilates, but for some reason are a bit embarrassed to admit it to other people. I suppose I was initially too, but not now. My wife does Yoga, and I guess to me Pilates feels like a more physical Yoga without all the spiritual stuff mixed in. I’m not quite a card-carrying evangelist, but I really would recommend you try Pilates if you sit at a desk all day hunched over an editor all day, like I used to. Doing 1:1 classes is expensive (about £80/month) but it is 100% worth it with the results I’ve had so far.

So, the conclusion: I’m not Super Human any more, but that’s okay. If you’ve read this far – shoulders back, chin up, and get back to coding. If you’re interested, want an awesome instructor and you live in West London, give Ash a call.

GNOME Firmware Updater

Posted by Richard Hughes on August 28, 2019 09:37 AM

A few months ago, Dell asked if I’d like to co-mentor an intern over the summer. The task was to create a GTK “power user” application for managing firmware. The idea being that someone like Dell support could ask the user to run a little application and then read back firmware versions or downgrade to an older firmware version rather than getting them to use the command line. GNOME and KDE software centers deliberately show a “simple” view of firmware, only showing devices when updates are pending.

In June I was introduced to Andrew Schwenn, who was our intern for the summer. This blog isn’t about Andrew, but I will say he did amazingly well and was soon up to speed filing excellent pull requests even with a grumpy anally-retentive maintainer like me. Andrew has finished his internship now, but I wouldn’t be surprised if we work again with him in the future. Most of the work so far is from Andrew, so I can’t claim too much credit here.

GNOME Firmware Updater was designed in the style of a GNOME Control Center panel, and all the code is written in a way to make a port very simple indeed if that’s what we actually want. At the moment it’s a separate project and binary, as we’re still prototyping the UI and working out what kind of UX we want from a power user tool. It’s mostly complete and a few weeks away from its first release. When it does get an official release, I’ll be sure to upload it to Flathub to make it easy for the world to install. If this sounds interesting to you, the code is here. I don’t have a huge amount of time to dedicate to this power user tool, but please open pull requests or issues if there’s something you’d like to see fixed.

Tuhi - an application to support Wacom SmartPad devices

Posted by Peter Hutterer on August 26, 2019 11:19 AM

Sounds like déjà vu? Right, I posted a post with an almost identical title 18 months ago or so. This is about Tuhi 0.2, new and remodeled and completely different to that. Sort-of.

Tuhi is an application that supports the Wacom SmartPad devices - Bamboo Spark, Bamboo Slate, Bamboo Folio and Intuos Pro. The Bamboo range are digital notepads. They come with a real pen, you draw normally on the pad and use Bluetooth LE and Wacom's Inkspace application later to sync the files to disk. The Intuos Pro is the same but it's designed as a "normal" tablet with the paper mode available as well.

18 months ago, Benjamin Tissoires and I wrote Tuhi as a DBus session daemon. Tuhi would download the drawings from the device and make them available as JSON files over DBus to be converted to SVG or some other format by ... "clients". We wrote a simple commandline tool to debug Tuhi but no GUI, largely in the hope that maybe someone would be interested in doing that. Fast forward to now and that hasn't happened but I had some spare cycles over the last weeks so I present to you: Tuhi 0.2, now with a GTK GUI:

It's basic but also because it shouldn't do much more than just downloading the drawings and allowing you to save them. This is not an editing UI, it's effectively a file manager for the drawings on the tablet. And since by design those drawings get deleted as you download them, there isn't even much to that (don't worry, Tuhi doesn't really delete files, you can recover almost everything).

Under the hood there were some internal changes too but I suspect they'll be boring to most. The more interesting bits are reworks so we can test the conversions a lot better now and - worst case - recover files if Tuhi crashes. It is largely reverse-engineered after all.

On that note I would like to also extend my thanks to Wacom who have provided us with some of the specs for the protocol (under NDA, we cannot share these with the community, sorry). These specs helped tremendously understanding the protocol bits that were confusing at best and unknown at worst. There are still some corners in the protocol that we don't know but for the most recent generation (i.e. Intuos Pro) we should have correct parsing of the protocol.

And many thanks to Jakub Steiner for the fancy logo.

And, as of a few minutes ago, Tuhi is available as a flatpak from flathub.org. For the foreseeable future it is the best way to install Tuhi.

low-memory-monitor: new project announcement

Posted by Bastien Nocera on August 21, 2019 10:57 AM
I'll soon be flying to Greece for GUADEC but wanted to mention one of the things I worked on the past couple of weeks: the low-memory-monitor project is off the ground, though not production-ready.

low-memory-monitor, as its name implies, monitors the amount of free physical memory on the system and will shoot off signals to interested user-space applications, usually session managers or sandboxing helpers, when that memory runs low, making it possible for applications to shrink their memory footprints before it's too late either to recover a usable system or to avoid taking a performance hit.

It's similar to Android's lowmemorykiller daemon, Facebook's oomd and Endless' psi-monitor, amongst others.

Finally, a GLib helper and a Flatpak portal are planned to make it easier for applications to use, with an API similar to iOS' or Android's.
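
For a taste of what the application side might look like, here's a sketch assuming the GMemoryMonitor API that this work eventually fed into GLib (2.64); the API names are real GLib, the handler logic is illustrative only:

#include <gio/gio.h>

static void
on_low_memory (GMemoryMonitor             *monitor,
               GMemoryMonitorWarningLevel  level,
               gpointer                    user_data)
{
  /* drop caches, flush pools, etc. - the higher the level,
   * the more aggressively the app should shed memory */
  g_print ("low memory warning, level %d\n", level);
}

int
main (void)
{
  GMemoryMonitor *monitor = g_memory_monitor_dup_default ();
  GMainLoop *loop = g_main_loop_new (NULL, FALSE);

  g_signal_connect (monitor, "low-memory-warning",
                    G_CALLBACK (on_low_memory), NULL);
  g_main_loop_run (loop);

  g_main_loop_unref (loop);
  g_object_unref (monitor);
  return 0;
}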

Combined with work in Fedora to use zswap and remove the use of disk-backed swap, this should make most workstation uses more responsive and enjoyable.