Fedora desktop Planet

Fedora Atomic Workstation for development

Posted by Matthias Clasen on February 16, 2018 11:47 PM

I’m frequently building GTK+. Since I am using Fedora Atomic Workstation now, I have to figure out how to do GTK+ development in this new environment. GTK+ may be a good example of the big middle ground of things that are not desktop applications, but also not part of the OS itself.

Last week I figured out how to use a buildah container to build release tarballs for GNOME modules, and I actually used that setup to produce a GTK+ release as well.

But for fixing bugs and other development, I generally need to run test cases and demo apps, like the venerable gtk-demo. Running these outside the container does not work, since the GTK+ libraries I built are linked against libraries that are installed inside the container and not present on the host, such as libvulkan. I could of course resort to package layering to install them on the host, but that would miss the point of using Atomic Workstation.

The alternative is running the demo apps inside the container, which should work – it’s the same filesystem they were built in. But they can’t talk to the compositor, since the Wayland socket is on the outside: /run/user/1000/wayland-0. I tried to work around this by making the socket visible in the container, but my knowledge of container tools and buildah is too limited to make it work. My apps still complain about not being able to open a display connection.
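
For the record, the kind of invocation I was experimenting with looked roughly like this (it did not work for me, and the exact flags may need adjusting; the container name follows the buildah examples further down this page, and gtk4-demo stands in for whatever test app you built):

# Did not work for me: bind the host Wayland socket into the container
# and point the app at it (paths assume uid 1000)
buildah run -v /run/user/1000:/run/user/1000 fedora-working-container \
    env XDG_RUNTIME_DIR=/run/user/1000 WAYLAND_DISPLAY=wayland-0 gtk4-demo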

What now? I decided that while GTK+ is not a desktop application, I can treat my test apps like one and write a flatpak manifest for them. This way, I can use GNOME Builder’s awesome flatpak support to build and run them, like I already did for GNOME Recipes.

Here is a minimal flatpak manifest that works:

{
  "id" : "org.gtk.gtk-demo",
  "runtime" : "org.gnome.Sdk",
  "runtime-version" : "master",
  "sdk" : "org.gnome.Sdk",
  "command" : "gtk4-demo",
  "finish-args" : [
    "--socket=wayland"
  ],
  "modules" : [
   {
     "name" : "graphene",
     "buildsystem" : "meson",
     "builddir" : true,
     "sources" : [
       {
         "type" : "git",
         "url" : "https://github.com/ebassi/graphene.git"
       }
     ]
   },
   {
     "name" : "gtk+",
     "buildsystem" : "meson",
     "builddir" : true,
     "sources" : [
       {
         "type" : "git",
         "url" : "https://gitlab.gnome.org/GNOME/gtk.git"
       }
     ]
   }
 ]
}

After placing this JSON file in the top-level directory of my GTK+ checkout, it appears as a new build configuration in GNOME Builder:

If you look closely, you’ll notice that I added another manifest, for gtk4-widget-factory. You can have multiple manifests in your tree, and GNOME Builder will let you switch between them in the Build Preferences.

After all this preparation, I can now hit the play button and have my demo app run right from inside GNOME Builder. Note that the application is running inside a flatpak sandbox, using the runtime that was specified in the Build Preferences, so it is cleanly separated from the OS. And I can easily build and run against different runtimes, to test compatibility with older GNOME releases.
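
The same manifest also works outside of Builder with the flatpak-builder command-line tool; here is a minimal sketch, assuming the org.gnome.Sdk//master runtime is already installed:

# Build all modules from the manifest into ./app, then run the demo
# inside the resulting sandbox
flatpak-builder app org.gtk.gtk-demo.json
flatpak-builder --run app org.gtk.gtk-demo.json gtk4-demo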

This may be the final push that makes me switch to GNOME Builder for day-to-day development on Fedora Atomic Workstation: It just works!

LVFS will block old versions of fwupd for some firmware

Posted by Richard Hughes on February 16, 2018 11:59 AM

The ability to restrict firmware to specific versions of fwupd and to the existing firmware version was added to fwupd in version 0.8.0. This functionality was added so that you could prevent the firmware from being deployed if the upgrade was going to fail, either because:

  • The old version of fwupd did not support the new hardware quirks
  • The upgraded-from firmware had broken upgrade functionality

The former is solved by updating fwupd, the latter by following the vendor procedure to manually flash the hardware, e.g. using a DediProg to flash the EEPROM directly. Requiring a specific fwupd version is used by the Logitech Unifying receiver update, for example, and requiring a previous minimum firmware version is used by one (soon to be two…) laptop OEM at the moment.

Although fwupd 0.8.0 was released over a year ago, it seems people are still downloading firmware with older fwupd versions. 98% of the downloads from the LVFS are initiated from gnome-software, with the other 2% of people using the fwupdmgr command line or manually downloading the .cab file from the LVFS in a browser.

At the moment, fwupd is being updated in Ubuntu xenial to 0.8.3, but it is still stuck at the long-obsolete 0.7.4 in Debian stable. Fedora, of course, is 100% up to date with 1.0.5 in F27 and 0.9.6 in F26 and F25. Even RHEL 7.4 has 0.8.2, and RHEL 7.5 will have 1.0.1.

Detecting the fwupd version also gets slightly more complicated, as the user agent only gives us the ‘client version’ rather than the ‘fwupd version’ in most software. This means we have to use the minimum fwupd version required by the client when deciding whether it is safe to provide the file. GNOME Software 3.26.0 was the first version to depend on fwupd ≥ 0.8.0, so anything newer than that is safe. This poses a slight problem: Ubuntu will be shipping an old gnome-software 3.20.x alongside a new-enough fwupd 0.8.x, and so will be blacklisted for any firmware that requires a specific fwupd version. Which includes the Logitech security update…

The user agent we get from gnome-software is gnome-software/3.20.1, so we can’t do anything very clever. I’m obviously erring on the side of not bricking a tiny amount of laptop hardware rather than making a lot of Logitech hardware secure on Ubuntu 16.04, given that the next LTS, 18.04, is out on April 26th anyway. This means people might start getting a ‘detected fwupd version too old’ message on the console if they try updating on 16.04.

A workaround for xenial users might be for someone at Canonical to include this patch in the gnome-software package, which changes the user agent to gnome-software/3.20.1 fwupd/0.8.3, so that I can add a workaround in the LVFS download code to parse that. Comments welcome.
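
For illustration, the kind of check the LVFS could do on such a user agent might look like this (a shell sketch rather than the actual LVFS Python code; the user agent format is the one proposed above):

# Illustrative only: pick the fwupd version out of a patched user agent
UA="gnome-software/3.20.1 fwupd/0.8.3"
case "$UA" in
  *" fwupd/"*) echo "client has fwupd ${UA##* fwupd/}" ;;
  *) echo "no fwupd token; fall back to mapping the client version" ;;
esac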

Fedora Atomic Workstation: Building flatpaks

Posted by Matthias Clasen on February 14, 2018 11:10 PM

In order to use my new Atomic Workstation for real, I need to be able to build things locally, including creating flatpaks.

One of the best tools for the job (building flatpaks) is GNOME Builder. I had already installed the stable build from flathub, but Christian told me that the nightly build is way better for flatpak building, so I went to install it from here.
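
For reference, installing the nightly Builder alongside the stable one boils down to something like this (the remote URL is an assumption on my part; check the nightly install instructions for the canonical one):

# Add the GNOME nightly flatpak remote (URL assumed; verify first)
flatpak remote-add --user --if-not-exists gnome-nightly \
    https://nightly.gnome.org/gnome-nightly.flatpakrepo
flatpak install --user gnome-nightly org.gnome.Builder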

Getting GNOME Builder

This highlights one of the nice aspects of flatpak: it is fundamentally decentralized. While flathub serves as a convenient one-stop-shop for many apps, it is entirely possible to have other remotes. Flathub is not privileged at all.

It is also perfectly possible to have both the stable gnome-builder from flathub and the nightly installed at the same time.

The only limitation is that only one of them gets to be presented as ‘the’ GNOME Builder by the desktop, since they use the same app id. You can switch between the installed versions of an application using the flatpak CLI:

flatpak make-current --user org.gnome.Builder master

Building flatpaks

Now on to building flatpaks! Naturally, my test case is GNOME Recipes. I have a git checkout of it, so I proceeded to open it in GNOME Builder, started a build … and it failed, with a somewhat cryptic error message about chdir() failing :-(

After quite a bit of head-scratching and debugging, we determined that this happens because flatpak does its builds in a sandbox as well, and replaces /var with its own bind mount to do so. This creates a bit of confusion with the /home -> /var/home symlink that is part of the Atomic Workstation image. We are still trying to determine the best fix for this; you can follow along in this issue.

Since I am going to travel soon, I can’t wait for the official fix, so I came up with a workaround: remove the /home -> /var/home symlink, create a regular /home directory in its place, and change /etc/fstab to mount my home partition there instead of /var/home. One reason why this is ugly is that I am modifying the supposedly immutable OS image. How? By removing the immutable attribute with chattr -i /. Another reason why it is ugly is that this has to be repeated every time a new image gets installed (regardless of whether it arrives via an update or via package layering).
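
Spelled out, the workaround amounts to something like this (run at your own risk; the home partition device in the fstab line is just an example):

# WARNING: modifies the supposedly immutable image; must be repeated
# after every new deployment
sudo chattr -i /     # drop the immutable attribute from the OS root
sudo rm /home        # remove the /home -> /var/home symlink
sudo mkdir /home     # create a regular directory in its place
# then edit /etc/fstab to mount the home partition on /home instead
# of /var/home, e.g. (device name is an example):
#   /dev/mapper/fedora-home  /home  ext4  defaults  1 2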

But with this workaround in place, there is no longer a symlink to trip up flatpak, and my build succeeds. Once it is built, I can run the Recipes flatpak with one click on the play button in Builder.

Neat! I am almost ready to take Atomic Workstation on the road.

fwupd now tells you about known issues

Posted by Richard Hughes on February 13, 2018 02:57 PM

After a week of being sick and not doing much, I’m showing the results of a day-or-so of hacking:

So far, most of that should be familiar to anyone who’s followed my previous blog posts. But wait, what’s that about a known issue?

That one little URL for the user to click on is the result of a rule engine being added to the LVFS. Of course, firmware updates shouldn’t ever fail, but in the real world they do: because distros don’t create /boot/efi correctly (cough, Arch Linux), because some people are running old versions of efivar or a broken git snapshot of libfwupdate, or because a vendor firmware updater doesn’t work with secure boot turned on (urgh). Of all the failures logged on the LVFS, 95% fall into about 3 or 4 different failure causes, and if we know hundreds of people are hitting an issue we already understand, we can provide them with some help.

So, how does this work? If you’re a user you don’t see any of this, you just download the metadata and firmware semi-automatically and get on with your life. If you’re a blessed hardware vendor on the LVFS (i.e. you can QA the updates into the stable branch) you can also create and view the rules for firmware owned by just your vendor group:

This new functionality will be deployed to the LVFS during the next downtime window. Comments welcome.

Fedora Atomic Workstation: What about apps?

Posted by Matthias Clasen on February 12, 2018 12:10 AM

I recently switched my main system to Fedora Atomic Workstation, and described my initial experience here. But I am going to travel soon, so I won’t have much time to fiddle with my laptop, and need to get things into working order.

Connections

One thing I needed to investigate was getting my VPN connections over from the old system. After a bit of consideration, I decided that it was easiest to just copy the relevant files from the old installation – /etc is not part of the immutable OS image, so this works just fine. I booted back into the old system and:

cp /etc/NetworkManager/system-connections/* /ostree/boot-1/.../etc/NetworkManager/system-connections

Note that you may also have to copy certificates over in a similar way.

Applications

But the bigger task for getting this system into working order is, of course, getting the applications back.

I could of course just use rpm-ostree’s layering and install Fedora rpms for many apps. But that would mean sliding back into the old world where applications are part of the OS, where dependencies force the OS and the apps to be updated together, and so on. Since I want to explore the proper workflows and advantages of the Atomic model, I’ll instead try to install the apps separately from the OS, as flatpaks.

Currently, the best way to get flatpaks is to use flathub, so I went back there to see if I could find all I need. flathub currently hosts a bit more than 200 applications. That may not seem like much compared to the Android Play Store, but let’s see what’s actually there:

With Telegram, Spotify, Gimp, LibreOffice and some others, I found most of what I frequently need. And Skype, Slack, Inkscape, Blender and others are there too. Not bad.

Browsers

But what about web browsers? Firefox is included in the Atomic Workstation image. To make it play media, you have to do the same things as on the traditional workstation – find an ffmpeg package and use rpm layering to make it part of the image.
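
A sketch of that layering step, assuming you have a repository that ships ffmpeg enabled (such as RPM Fusion) and that ffmpeg-libs is the package you need:

# Layer the codec package onto the OS image (package name assumed)
# and reboot into the new deployment
rpm-ostree install ffmpeg-libs
systemctl reboot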

Chrome is unfortunately hard to package as a flatpak, since its own sandboxing technology conflicts with the sandboxing that is applied by flatpak. There is of course a Chrome rpm, but it installs into /opt, which is not currently supported by rpm-ostree. See this issue for more details and possible solutions.

Beyond the major browsers, there are some other choices available on flathub, such as GNOME Web or Eolie. These browsers use gstreamer for multimedia support, so they will pick up codecs that are available on the host system via the gstreamer runtime extension.

Next steps

The trip I’m going on is for a hackfest that will focus on an application (GNOME Recipes, in fact), so I will need a well-working setup for building flatpaks locally.

I’ll try out how well GNOME Builder handles this task on an Atomic System.

Razer doesn’t care about Linux

Posted by Richard Hughes on February 11, 2018 10:44 PM

tl;dr: Don’t buy hardware from Razer and expect firmware updates to fix security problems on Linux.

Razer is a vendor that makes high-end gaming hardware, including laptops, keyboards and mice. I opened a ticket with Razer a few days ago asking them if they wanted to support the LVFS project by uploading firmware and sharing the firmware update protocol used. I offered to upstream any example code they could share under a free license, or to write the code from scratch given enough specifications to do so. This is something I’ve done for other vendors, and it doesn’t take long, as most vendor firmware updaters do the same kind of thing; there are only so many ways to send a few kb of data to USB devices. The fwupd project provides high-level code for accessing USB devices, so yet-another-update-protocol is no big deal. I explained all about the LVFS, and the benefits it provides to a userbase that is normally happy to vote with their wallet to get hardware that’s supported on the OS of their choice.

I just received this note on the ticket, which was escalated appropriately:

I have discussed your offer with the dedicated team and we are thankful for your enthusiasm and for your good idea.
I am afraid I have also to let you know that at this moment in time our support for software is only focused on Windows and Mac.

The CEO of Razer, Min-Liang Tan, recently said: “We’re inviting all Linux enthusiasts to weigh in at the new Linux Corner on Insider to post feedback, suggestions and ideas on how we can make it the best notebook in the world that supports Linux.” If this is true, and more than just a sound-bite, then supporting the LVFS for firmware updates on the Razer Blade to solve security problems like Meltdown and Spectre ought to be a priority.

Certainly, if peripheral updates or system firmware UpdateCapsule are not supportable on Linux, it would be good to correct well-read articles like that one, as they make it sound like Razer is interested in Linux users, when the reality seems somewhat less optimistic. I’ve updated the vendor list with this information to avoid other people asking or filing tickets. Disappointing, but I’ll hopefully have some happier news soon about a different vendor.

First steps with Fedora Atomic Workstation

Posted by Matthias Clasen on February 09, 2018 01:30 AM

There’s been a lot of attention on the Fedora Atomic Workstation project recently, with several presentations at devconf (Kalev Lember, Colin Walters, Jonathan Lebon) and fosdem (Sanja Bonic), blog posts and other docs.

I’ve played with the Atomic Workstation before, but it was always in a VM. That is a low-risk way to try it out, but the downside is that you can jump back to your ‘normal’ system at the first problem… which, naturally, I did. The recent attention inspired me to try again.

This time, I wanted to try it for real and get some actual work done on the Atomic side. So this morning, I set out to convert my main system to Atomic Workstation. The goal I’ve set myself for today was to create a gnome-font-viewer release tarball using a container-based workflow.

There are two ways to install Atomic Workstation. You can either download an .iso and install from scratch, or you can convert an existing system. I chose the second option, following these instructions.  By and large, the instructions were accurate and led me to a successful installation. A few notes:

  • You need ~10G of free space on your root filesystem
  • I got server connection errors several times – just restarting the ostree pull command will eventually let it complete
  • The instructions recommend copying grub.cfg from /boot/loader to /boot/grub2/, but that only works for the current tree – if you install updates or add a layer to your ostree image, you have to repeat it. An easier solution is to create a symlink instead, as sketched below.
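
The symlink fix from the last note could look like this (paths as given in the instructions; double-check them against your layout):

# Let grub always pick up the current tree's config
sudo ln -sf ../loader/grub.cfg /boot/grub2/grub.cfg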

After a moment of fear, I decided to reboot, and found myself inside the Atomic Workstation – it just worked. After opening a terminal and finding my git checkouts, I felt a little helpless – none of git, gitg, gcc (or many of the other developer tools I’m used to) are around. What now?

Thankfully, Firefox was available, so I went to http://flathub.org and installed gitg as a flatpak, with a single click.

For the other developer tools, remember that my goal was to use a container-based workflow, so my next step was to install buildah, which is a tool to work with containers without the need for docker. Installing the buildah rpm on Atomic Workstation feels a bit like a magic trick – after all, isn’t this an immutable image-based OS?

What happens when you run

rpm-ostree install buildah

is that rpm-ostree composes a new image by layering the rpm on top of the existing image. As expected, I had to reboot into the new image to see the newly installed tool.
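
After the reboot, the layered package shows up in the deployment list:

# The new deployment lists buildah under LayeredPackages
rpm-ostree status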

Next, I tried to figure out some of the basics of working with buildah – here is a brief introduction to buildah that I found helpful. After creating and starting a Fedora-based container with

buildah from fedora
buildah run fedora-working-container bash

I could use dnf to install git, gcc and a few other things in the container. So far, so good. But in order to make a gnome-font-viewer release, there is still one thing missing: I need access to my git checkout inside the container. After some browsing around, I came up with this command:

buildah run -v /srv:/srv:rslave fedora-working-container bash

which should make /srv from the host system appear inside the container. And… I was stuck – trying to enumerate the contents of /srv in the container was giving me permission errors, despite running as root.

Eventually, it dawned on me that SELinux is to blame… The command

sudo chcon -R -h -t container_file_t /srv

is needed to make things work as expected. Alternatively, you could set SELinux to permissive mode.
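
For completeness, switching SELinux to permissive mode is a one-liner (it only lasts until reboot, and running permissive is not a great permanent answer):

# Temporarily disable SELinux enforcement; 'sudo setenforce 1' restores it
sudo setenforce 0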

From here on, things were pretty straightforward. I additionally needed to make my ssh keys available so I could push my commits from inside the container, and I needed a series of dnf commands to make enough build dependencies and tools available:

dnf install  git
dnf install meson
dnf install gtk3-devel
...

But eventually,

meson . build
ninja -Cbuild
ninja -Cbuild dist

worked and produced this tarball – success!

So, when you try gnome-font-viewer 3.27.90 remember: it was produced in a container.

The first steps are always the hardest. I expect things to get easier as I learn more about this way of working.

builders

Posted by Benjamin Otte on February 03, 2018 12:53 PM

An idiom that has shown up in GTK4 development is the idea of immutable objects and builders. The idea behind an immutable object is that you can be sure it doesn’t change under you: you don’t need to track changes, you can expose it in your API without having to fear that users of the API will change the object under you, you can use it as a key when caching, and, last but not least, you can pass it into multiple threads without requiring synchronization.
Examples of immutable objects in GTK4 are GdkCursor, GdkTexture, GdkContentFormats and GskRenderNode. An example outside of GTK would be GBytes.

Sometimes these objects are easy to create using a series of constructors, but oftentimes these objects are more complex. And because those objects are immutable, we can’t just provide setters for the various properties. Instead, we provide builder objects. They are short-lived objects whose only purpose is to manage the construction of an immutable object.
Here’s an example of how that would look:

Sandwich *
make_me_a_sandwich (void)
{
  SandwichBuilder *builder;

  /* create builder */
  builder = sandwich_builder_new ();
  /* setup the object to create */
  sandwich_builder_set_bread (builder, white_bread);
  sandwich_builder_add_ingredient (builder, cheese);
  sandwich_builder_add_ingredient (builder, ham);
  sandwich_builder_set_toasted (builder, TRUE);
  /* free the builder, create the object and return it */
  return sandwich_builder_free_to_sandwich (builder);
}

This approach works well in C, but does not work when trying to make the builder accessible to bindings. Bindings do no explicit memory management, so they want to write code like this:

def make_me_a_sandwich():
    # create builder
    builder = SandwichBuilder ()
    # setup the object to create
    builder.set_bread (white_bread)
    builder.add_ingredient (cheese)
    builder.add_ingredient (ham)
    builder.set_toasted (True)
    # create the object and return it
    return builder.to_sandwich ()

We spent the hackfest arguing about how to create a C API that works for both of those use cases, and the consensus so far has been to turn builders into refcounted boxed types that provide both of the above APIs – but advertise the C-specific parts only to C and the binding-specific parts only to bindings. So the C header for the above example would look like this:

SandwichBuilder *sandwich_builder_new (void);
/* (skip) in bindings */
Sandwich *sandwich_builder_free_to_sandwich (SandwichBuilder *builder) G_GNUC_WARN_UNUSED_RESULT;

/* to be used by bindings only */
GType sandwich_builder_get_type (void) G_GNUC_CONST;
SandwichBuilder *sandwich_builder_ref (SandwichBuilder *builder);
void sandwich_builder_unref (SandwichBuilder *builder);
Sandwich *sandwich_builder_to_sandwich (SandwichBuilder *builder);

/* shared API */
void sandwich_builder_set_bread (SandwichBuilder *builder, Bread *bread);
void sandwich_builder_add_ingredient (SandwichBuilder *builder, Ingredient *ingredient);
void sandwich_builder_set_toasted (SandwichBuilder *builder, gboolean should_toast);

/* and in the .c file: */
G_DEFINE_BOXED_TYPE (SandwichBuilder, sandwich_builder, sandwich_builder_ref, sandwich_builder_unref)

And now I’m off to review all our builder APIs so they conform to this idea.

GTK+ hackfest, day 2

Posted by Matthias Clasen on February 03, 2018 08:28 AM

The second day of the GTK+ hackfest in Brussels started with an hour of patch review. We then went through scattered items from the agenda and collected answers to some questions.

We were lucky to have some dear friends join us for part of the day.  Allison came by for an extended GLib bug review session with Philip, and Adrien discussed strategies for dealing with small and changing form factors with us.  Allison, Adrien: thanks for coming!

The bulk of the day was taken up by a line-by-line review of our GTK+ 4 task list. Even though some new items appeared, it got a bit shorter, and many of the outstanding items are much clearer now.

Our (jumbled and unedited) notes from the day are here.

The day ended with a nice dinner that was graciously sponsored by Purism. Thanks!

Decisions, decisions

We discussed too many things during these two days for a concise summary of every result, but here are some of the highlights:

  • gitlab migration: We want to migrate the GTK+ git repository as soon as possible. The bug migration needs preparation and will follow later
  • GTK+ 4 roadmap: We are aiming for an initial release in the fall of this year. We’ll reevaluate this target date at GUADEC

GTK+ hackfest, day 1

Posted by Matthias Clasen on February 01, 2018 10:21 PM

A number of GTK+ developers met today in Brussels for a 2-day hackfest ahead of FOSDEM. Sadly, a few friends who we’d have loved to see couldn’t make it, but we still had enough of the core team together for a productive meeting.

We decided that it would be a good idea to start the day with ‘short overviews of rewritten subsystems’, to get everybody on the same page. The quick overviews turned out to take most of the day, but it was intermixed with a lot of very productive discussions and decisions.

My (jumbled and unedited) notes from the day are here.

[Photo: Benjamin explaining the GTK+ clipboard on an actual clipboard]

At the end of the day, we found a nice Vietnamese restaurant around the corner from the venue, and Shaun came by for food and beer.

I hope that day 2 of this short hackfest will be similarly productive!

Thanks to GNOME for sponsoring this event.

Firmware Telemetry for Vendors

Posted by Richard Hughes on February 01, 2018 12:10 PM

We’ve shipped nearly 1.2 MILLION firmware updates out to Linux users since we started the LVFS project.

I found out this nugget of information using a new LVFS vendor feature, soon to be deployed: Telemetry. This builds on the previously discussed success/failure reporting and adds a single page where a vendor can get statistics about each bit of hardware. It will become more useful once more people are running the latest fwupd and volunteering to share their update history, but it is interesting even now.

No new batches of ColorHug2

Posted by Richard Hughes on February 01, 2018 09:48 AM

I was informed by AMS (the manufacturer of the XYZ sensor that’s the core of the CH2 device) that the AS73210 (aka MTCSiCF) and the MTI08D are end-of-life products. The replacement sensor the vendor offers is the AS73211, which of course is more expensive and electrically incompatible with the AS73210.

The somewhat-related new AS7261 sensor does look interesting, as it somewhat crosses the void between a colorimeter and something that can take non-emissive readings, but it’s a completely different sensor to the one in the ColorHug2, and mechanically different from the now-abandoned ColorHug+. I’m also feeling twice burned buying specialist components from single-source suppliers.

Being parents to a 16-week-old baby doesn’t put Ania and me in a position where I can go through the various phases of testing, prototypes, test batch, production batch etc. for a device refresh like we did for the ColorHug -> ColorHug2. I’m hoping I can get a chance to play with some more kinds of sensors from different vendors, although that’s not going to happen before I start getting my free time back. At the moment I have about 50 fully completed ColorHug2 devices in boxes ready to be sold.

In the true spirit of OpenHardware and free enterprise, if anyone does want to help with the design of a new ColorHug device, I’m open to ideas. ColorHug was really just a hobby that got out of control, and I’d love for someone else to have the thrill and excitement of building a nano-company from scratch. Taking me out of the equation completely, I’d be equally happy referring people who want to buy a ColorHug upgrade or replacement to a different project, if the new product met with my approval :)

So, 50 ColorHugs should last about 3 months before stock runs out, but I know a few people are using these devices on production lines and in other sorts of industrial control — if that sounds familiar and you’d like to buy a spare device, now is the time to do so. Of course, I’ll continue supporting all 3162 existing devices well into the future. I hope to be back building OpenHardware soon, hopefully with a new and improved ColorHug3.

tuhi - a daemon to support Wacom SmartPad devices

Posted by Peter Hutterer on January 30, 2018 08:59 AM

For the last few weeks, Benjamin Tissoires and I have been working on a new project: Tuhi [1], a daemon to connect to and download data from Wacom SmartPad devices like the Bamboo Spark, Bamboo Slate and, eventually, the Bamboo Folio and the Intuos Pro Paper devices. These devices are not traditional graphics tablets plugged into a computer but rather smart notepads where the user's offline drawing is saved as stroke data in vector format and later synchronised with the host computer over Bluetooth. There it can be converted to SVG, integrated into the applications, etc. Wacom's application for this is Inkspace.

There is no official Linux support for these devices. Benjamin and I started looking at the protocol dumps last year and, luckily, they're not completely indecipherable and reverse-engineering them was relatively straightforward. Now it is a few weeks later and we have something that is usable (if a bit rough) and provides the foundation for supporting these devices properly on the Linux desktop. The repository is available on github at https://github.com/tuhiproject/tuhi/.

The main core is a DBus session daemon written in Python. That daemon connects to the devices and exposes them over a custom DBus API. That API is relatively simple: it supports methods to search for devices, pair devices, listen for data from devices and, finally, fetch the data. It has some basic extras built in, like temporary storage of the drawing data so it survives daemon restarts. But otherwise it's a three-way mapper between the Bluez device, the serial controller we talk to on the device, and the Tuhi DBus API presented to the clients. One such client is the little commandline tool that comes with tuhi: tuhi-kete [2]. Here's a short example:


$> ./tools/tuhi-kete.py
Tuhi shell control
tuhi> search on
INFO: Pairable device: E2:43:03:67:0E:01 - Bamboo Spark
tuhi> pair E2:43:03:67:0E:01
INFO: E2:43:03:67:0E:01 - Bamboo Spark: Press button on device now
INFO: E2:43:03:67:0E:01 - Bamboo Spark: Pairing successful
tuhi> listen E2:43:03:67:0E:01
INFO: E2:43:03:67:0E:01 - Bamboo Spark: drawings available: 1516853586, 1516859506, [...]
tuhi> list
E2:43:03:67:0E:01 - Bamboo Spark
tuhi> info E2:43:03:67:0E:01
E2:43:03:67:0E:01 - Bamboo Spark
Available drawings:
* 1516853586: drawn on the 2018-01-25 at 14:13
* 1516859506: drawn on the 2018-01-25 at 15:51
* 1516860008: drawn on the 2018-01-25 at 16:00
* 1517189792: drawn on the 2018-01-29 at 11:36
tuhi> fetch E2:43:03:67:0E:01 1516853586
INFO: Bamboo Spark: saved file "Bamboo Spark-2018-01-25-14-13.svg"

I won't go into the details because most of it should be obvious, and this is purely a debugging client, not one we expect real users to use. Plus, everything is still changing quite quickly at this point.

The next step is to get a proper GUI application working. As usual with any GUI-related matter, we'd really appreciate some help :)

The project is young, and relying on reverse-engineered protocols means there are still a few rough edges. Right now, the Bamboo Spark and Slate are supported, because those are the devices we have access to. The Folio should work; it looks like it's a re-packaged Slate. Intuos Pro Paper support is still pending; we don't have access to such a device at this point. If you're interested in testing or helping out, come on over to the github site and get started!

[1] tuhi: Maori for "writing, script"
[2] kete: Maori for "kit"

GCab and CVE-2018-5345

Posted by Richard Hughes on January 23, 2018 01:22 PM

tl;dr: Update GCab from your distributor.

Longer version: Just before Christmas I found a likely exploitable bug in the libgcab library. Various security teams have been busy with slightly more important issues, and so it’s taken a lot longer than usual for it to be verified and assigned a CVE. The issue I found was that libgcab attempted to read a large chunk into a small buffer, overwriting lots of interesting things past the end of the buffer. ASLR and SELinux save us in nearly all cases, so it’s not the end of the world. Almost a textbook C buffer overflow (rust, yada, whatever), so it was easy to fix.

Some key points:

  • This only affects libgcab, not cabarchive or libarchive
  • All gcab versions less than 0.8 are affected
  • Anything that links to gcab is affected, so gnome-software, appstream-glib and fwupd at least
  • Once you install the fixed gcab you need to restart anything that’s using it, e.g. fwupd
  • There is no silly branded name for this bug
  • The GCab project is incredibly well written, and I’ve been hugely impressed with the code quality
  • You can test if your GCab has been fixed by attempting to decompress this file; if the program crashes, you need to update (see the sketch below)
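
A sketch of that test, assuming your gcab build ships the command-line tool with an extract option (the flag spelling is my assumption; check gcab --help):

# 'crash-test.cab' stands in for the file linked above
gcab -x crash-test.cab && echo "no crash; your gcab is probably fixed"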

With Marc-André’s blessing, I’ve released version v0.8 of gcab with this fix. I’ve also released v1.0, which has this fix (and many more nice API additions), switches the build system to Meson, and cleans up a lot of leaks using g_autoptr(). If you’re choosing a version to update to, the answer is probably 1.0, unless you’re building for something more sedate like RHEL 5 or 6. You can get the Fedora 27 packages here, or they’ll be on the mirrors tomorrow.

Privacy expectations and the connected home

Posted by Matthew Garrett on January 17, 2018 09:45 PM
Traditionally, devices that were tied to logins tended to indicate that in some way - turn on someone's xbox and it'll show you their account name, run Netflix and it'll ask which profile you want to use. The increasing prevalence of smart devices in the home changes that, in ways that may not be immediately obvious to the majority of people. You can configure a Philips Hue with wall-mounted dimmers, meaning that someone unfamiliar with the system may not recognise that it's a smart lighting system at all. Without any actively malicious intent, you end up with a situation where the account holder is able to infer whether someone is home without that person necessarily having any idea that that's possible. A visitor who uses an Amazon Echo is not necessarily going to know that it's tied to somebody's Amazon account, and even if they do they may not know that the log (and recorded audio!) of all interactions is available to the account holder. And someone grabbing an egg out of your fridge is almost certainly not going to think that your smart egg tray will trigger an immediate notification on the account owner's phone that they need to buy new eggs.

Things get even more complicated when there's multiple account support. Google Home supports multiple users on a single device, using voice recognition to determine which queries should be associated with which account. But the account that was used to initially configure the device remains as the fallback, with unrecognised voices ending up logged to it. If a voice is misidentified, the query may end up being logged to an unexpected account.

There's some interesting questions about consent and expectations of privacy here. If someone sets up a smart device in their home then at some point they'll agree to the manufacturer's privacy policy. But if someone else makes use of the system (by pressing a lightswitch, making a spoken query or, uh, picking up an egg), have they consented? Who has the social obligation to explain to them that the information they're producing may be stored elsewhere and visible to someone else? If I use an Echo in a hotel room, who has access to the Amazon account it's associated with? How do you explain to a teenager that there's a chance that when they asked their Home for contact details for an abortion clinic, it ended up in their parent's activity log? Who's going to be the first person divorced for claiming that they were vegan but having been the only person home when an egg was taken out of the fridge?

To be clear, I'm not arguing against the design choices involved in the implementation of these devices. In many cases it's hard to see how the desired functionality could be implemented without this sort of issue arising. But we're gradually shifting to a place where the data we generate is not only available to corporations who probably don't care about us as individuals, it's also becoming available to people who own the more private spaces we inhabit. We have social norms against bugging our houseguests, but we have no social norms that require us to explain to them that there'll be a record of every light that they turn on or off. This feels like it's going to end badly.

(Thanks to Nikki Everett for conversations that inspired this post)

(Disclaimer: while I work for Google, I am not involved in any of the products or teams described in this post, and my opinions are my own rather than those of my employer)

Phoning home after updating firmware?

Posted by Richard Hughes on January 10, 2018 03:42 PM

Somebody made a proposal on the fwupd mailing list that the machine running fwupd should “phone home” to the LVFS with success or failure after the firmware update has been attempted.

This would let the hardware vendor that uploaded the firmware know about problems straight away, rather than waiting for thousands of frustrated users to file bugs. The report needs to contain something that identifies the machine and a boolean, plus, in the event of an error, enough debug information to actually be useful. It would obviously involve sending the user’s IP address to the server too.

I ran a poll on my Google+ page, and this was the result:

So, a significant minority of people felt it stepped over the line of privacy vs. pragmatism. This told me I couldn’t just forge onward with automated collection, and this blog entry outlines what we’ve done for the 1.0.4 release. I hope this proposal is acceptable to even the most paranoid of users.

The fwupd daemon now stores the result of each attempted update in a local SQLite database. In the event a firmware update has been attempted, we now ask the user if they would like to upload this information to the LVFS. In GNOME this would just be a slider in the Privacy panel of the control center, and I’ll leave it to the distros to decide whether this slider should be on or off by default. If you’re using the fwupdmgr tool, this is what it shows:

$ fwupdmgr report-history
Target:                  https://the-lvfs-server/lvfs/firmware/report
Payload:                 {
                           "ReportVersion" : 1,
                           "MachineId" : "9c43dd393922b7edc16cb4d9a36ac01e66abc532db4a4c081f911f43faa89337",
                           "DistroId" : "fedora",
                           "DistroVersion" : "27",
                           "DistroVariant" : "workstation",
                           "Reports" : [
                             {
                               "DeviceId" : "da145204b296610b0239a4a365f7f96a9423d513",
                               "Checksum" : "d0d33e760ab6eeed6f11b9f9bd7e83820b29e970",
                               "UpdateState" : 2,
                               "Guid" : "77d843f7-682c-57e8-8e29-584f5b4f52a1",
                               "FwupdVersion" : "1.0.4",
                               "Plugin" : "unifying",
                               "Version" : "RQR12.05_B0028",
                               "VersionNew" : "RQR12.07_B0029",
                               "Flags" : 674,
                               "Created" : 1515507267,
                               "Modified" : 1515507956
                             }
                           ]
                         }
Proceed with upload? [Y|n]: 

Using this new information that the user volunteers, we can display a new line in the LVFS web-console:

Which expands out to the report below:

This means vendors using the LVFS know first of all how many downloads they have, and also the number of success and failures. This allows us to offer the same kind of staged deployment that Microsoft Update does, where you can limit the number of updated machines to 10,000/day or automatically pause the specific firmware deployment if > 1% of the reports come back with failures.

Some key points:

  • We don’t share the IP address with the vendor; in fact it’s not even saved in the MySQL database
  • The MachineId is a salted hash of your actual /etc/machine-id
  • The LVFS doesn’t store reports for firmware that it did not sign itself, i.e. locally built firmware archives will be ignored and not logged
  • You can disable the reporting functionality in all applications by editing /etc/fwupd/remotes.d/*.conf (see the sketch below)
  • We have an official GDPR document too — we’ll probably link to that from the Privacy panel in GNOME
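
As a sketch, disabling the uploads for the LVFS remote could look like this (the ReportURI key name is my assumption from the 1.0.4-era config; check your installed file first):

# Blank the report destination so nothing gets uploaded
sudo sed -i 's/^ReportURI=.*/ReportURI=/' /etc/fwupd/remotes.d/lvfs.conf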

Comments welcome.

libevdev-python - a python wrapper for libevdev

Posted by Peter Hutterer on January 07, 2018 11:57 PM

Last year, just before the holidays, Benjamin Tissoires and I worked on a 'new' project - libevdev-python. This is, unsurprisingly, a Python wrapper to libevdev. It's not exactly new, since we took the git tree from 2016 when I was working on it the first time round, but this time we whipped it into better shape. Now it's at the point where I think it has the API it should have: pythonic and very easy to use, but still with libevdev as the actual workhorse in the background. It's available via pip3 and should be packaged for your favourite distributions soonish.

Who is this for? Basically anyone who needs to work with the evdev protocol. While C is still a thing, there are many use-cases where Python is a much more sensible choice. The libevdev-python documentation on ReadTheDocs provides a few examples which I'll copy here, just so you get a quick overview. The first example shows how to open a device and then continuously loop through all events, searching for button events:


import sys

import libevdev

fd = open('/dev/input/event0', 'rb')
d = libevdev.Device(fd)
if not d.has(libevdev.EV_KEY.BTN_LEFT):
    print('This does not look like a mouse device')
    sys.exit(0)

# Loop indefinitely while pulling the currently available events off
# the file descriptor
while True:
    for e in d.events():
        if not e.matches(libevdev.EV_KEY):
            continue

        if e.matches(libevdev.EV_KEY.BTN_LEFT):
            print('Left button event')
        elif e.matches(libevdev.EV_KEY.BTN_RIGHT):
            print('Right button event')

The second example shows how to create a virtual uinput device and send events through that device:

import libevdev
from libevdev import InputEvent  # InputEvent is used for the event list below

d = libevdev.Device()
d.name = 'some test device'
d.enable(libevdev.EV_REL.REL_X)
d.enable(libevdev.EV_REL.REL_Y)
d.enable(libevdev.EV_KEY.BTN_LEFT)
d.enable(libevdev.EV_KEY.BTN_MIDDLE)
d.enable(libevdev.EV_KEY.BTN_RIGHT)

uinput = d.create_uinput_device()
print('new uinput test device at {}'.format(uinput.devnode))
events = [InputEvent(libevdev.EV_REL.REL_X, 1),
          InputEvent(libevdev.EV_REL.REL_Y, 1),
          InputEvent(libevdev.EV_SYN.SYN_REPORT, 0)]
uinput.send_events(events)

And finally, if you have a textual or binary representation of events, the evbit function helps to convert it to something useful:

>>> import libevdev
>>> print(libevdev.evbit(0))
EV_SYN:0
>>> print(libevdev.evbit(2))
EV_REL:2
>>> print(libevdev.evbit(3, 4))
ABS_RY:4
>>> print(libevdev.evbit('EV_ABS'))
EV_ABS:3
>>> print(libevdev.evbit('EV_ABS', 'ABS_X'))
ABS_X:0
>>> print(libevdev.evbit('ABS_X'))
ABS_X:0

The latter is particularly helpful if you have a script that needs to analyse event sequences and look for protocol bugs (or hw/fw issues).

More explanations and details are available in the libevdev-python documentation. That doc also answers the question why libevdev-python exists when there's already a python-evdev package. The code is up on github.

More fun with fonts

Posted by Matthias Clasen on January 04, 2018 01:04 AM

Just before Christmas, I spent some time in New York to continue font work with Behdad that we had begun earlier this year.

As you may remember from my last post on fonts, our goal was to support OpenType font variations. The Linux text rendering stack has multiple components: freetype, fontconfig, harfbuzz, cairo, pango. Achieving our goal required a number of features and fixes in all these components.

Getting all the required changes in place is a bit time-consuming, but the results are finally starting to come together. If you use the master branches of freetype, fontconfig, harfbuzz, cairo, pango and GTK+, you can try this out today.

Warm-up

But beyond variations, we want to improve font support in general. To start off, we fixed a few bugs in the color Emoji support in cairo and GTK+.

Polish

Next came some small improvements to the font chooser, such as a cleaner look for the font list, type-to-search, and maintaining the sensitivity of the select button:

[Video: https://blogs.gnome.org/mclasen/files/2018/01/Screencast-from-01-03-2018-035205-PM.webm]

Features

I also spent some time on OpenType features, and on making them accessible to users. When I first added feature support to Pango, I wrote a GTK+ demo that shows them in action, but without a ready-made GTK+ dialog, basically no applications have picked this up.

Time to change this! After some experimentation, I came up with what I think is an acceptable UI for customizing features of a font:

[Video: https://blogs.gnome.org/mclasen/files/2018/01/Screencast-from-01-03-2018-063819-PM-3.webm]

It is still somewhat limited since we only show features that are supported by the selected font and make sense for entire documents or paragraphs of text.  Many OpenType features can really only be selected for smaller ranges of text, such as fractions or subscripts. Support for those may come at a later time.

Part of the necessary plumbing for making this work nicely was to implement the font-feature-settings CSS property, which brings GTK+ closer to full support for level 3 of the CSS font module. For theme authors, this means that all OpenType font features are accessible from CSS.

One thing to point out here is that font feature settings are not part of the PangoFont object, but get specified via attributes (or markup, if you like). For the font chooser, this means that we’ve had to add new API to return the selected features: gtk_font_chooser_get_font_features(). Applications need to apply the returned features to their text by wrapping them in a PangoAttribute.

Variations

Once we had this ‘tweak page’ added to the font chooser, it was the natural place to expose variations as well, so that is what we did next. Remember that variations define a number of ‘axes’ for the font, along which the characteristics of the font can be continuously changed. In UI terms, this means that we add sliders similar to the one we already have for the font size:

[Video: https://blogs.gnome.org/mclasen/files/2018/01/Screencast-from-01-03-2018-065307-PM.webm]

Again, fully supporting variations meant implementing the corresponding font-variation-settings CSS property (yes, there is a level 4 of the CSS fonts module). This will enable some fun experiments, such as animating font changes:

[Video: https://blogs.gnome.org/mclasen/files/2018/01/Screencast-from-01-03-2018-070645-PM.webm]

All of this work would be hard to do without some debugging and exploration tools. gtk-demo already contained the Font Features example. During the week in New York, I made it handle variations as well, and polished it in various ways.

To reflect that it is no longer just about font features, it is now called Font Explorer. One fun thing I added is a combined weight-width plane, so you can now explore your fonts in 2 dimensions:

[Video: https://blogs.gnome.org/mclasen/files/2018/01/Screencast-from-01-03-2018-074638-PM.webm]

What’s next

As always, there is more work to do. Here is an unsorted list of ideas for next steps:

  • Backport the font chooser improvements to GTK+ 3. Some new API is involved, so we’ll have to see about it.
  • Add pango support for variable families. The current font chooser code uses freetype and harfbuzz APIs to find out about OpenType features and variations. It would be nice to have some API in pango for this.
  • Improve font filtering. It would be nice to support filtering by language or script in the font chooser. I have code for this, but it needs some more pango API to perform acceptably.
  • Better visualization for features. It would be nice to highlight the parts of a string that are affected by certain features. harfbuzz does not currently provide this information though.
  • More elaborate feature support. For example, it would be nice to have a way to enable character-level features such as fractions or superscripts.
  • Support for glyph selection. Several OpenType features provide (possibly multiple) alternative glyphs, with the expectation that the user will be presented with a choice. harfbuzz does not have convenient API for implementing this.
  • Add useful font metadata to fontconfig, such as ‘Is this a serif, sans-serif or handwriting font?’ and use it to offer better filtering
  • Implement @font-face rules in CSS and use them to make customized fonts first-class objects.

Help with any of this is more than welcome!

When should behaviour outside a community have consequences inside it?

Posted by Matthew Garrett on December 21, 2017 10:09 AM
Free software communities don't exist in a vacuum. They're made up of people who are also members of other communities, people who have other interests and engage in other activities. Sometimes these people engage in behaviour outside the community that may be perceived as negatively impacting communities that they're a part of, but most communities have no guidelines for determining whether behaviour outside the community should have any consequences within the community. This post isn't an attempt to provide those guidelines, but aims to provide some things that community leaders should think about when the issue is raised.

Some things to consider

Did the behaviour violate the law?

This seems like an obvious bar, but it turns out to be a pretty bad one. For a start, many things that are commonly accepted behaviour in various communities may be illegal (eg, reverse engineering work may contravene a strict reading of US copyright law), and taking this to an extreme would result in expelling anyone who's ever broken a speed limit. On the flipside, refusing to act unless someone broke the law is also a bad threshold - much behaviour that communities consider unacceptable may be entirely legal.

There's also the problem of determining whether a law was actually broken. The criminal justice system is (correctly) biased to an extent in favour of the defendant - removing someone's rights in society should require meeting a high burden of proof. However, this is not the threshold that most communities hold themselves to in determining whether to continue permitting an individual to associate with them. An incident that does not result in a finding of criminal guilt (either through an explicit finding or a failure to prosecute the case in the first place) should not be ignored by communities for that reason.

Did the behaviour violate your community norms?

There's plenty of behaviour that may be acceptable within other segments of society but unacceptable within your community (eg, lobbying for the use of proprietary software is considered entirely reasonable in most places, but rather less so at an FSF event). If someone can be trusted to segregate their behaviour appropriately then this may not be a problem, but that's probably not sufficient in all cases. For instance, if someone acts entirely reasonably within your community but engages in lengthy anti-semitic screeds on 4chan, it's legitimate to question whether permitting them to continue being part of your community serves your community's best interests.

Did the behaviour violate the norms of the community in which it occurred?

Of course, the converse is also true - there's behaviour that may be acceptable within your community but unacceptable in another community. It's easy to write off someone acting in a way that contravenes the standards of another community but wouldn't violate your expected behavioural standards - after all, if it wouldn't breach your standards, what grounds do you have for taking action?

But you need to consider that if someone consciously contravenes the behavioural standards of a community they've chosen to participate in, they may be willing to do the same in your community. If pushing boundaries is a frequent trait then it may not be too long until you discover that they're also pushing your boundaries.

Why do you care?

A community's code of conduct can be looked at in two ways - as a list of behaviours that will be punished if they occur, or as a list of behaviours that are unlikely to occur within that community. The former is probably the primary consideration when a community adopts a CoC, but the latter is how many people considering joining a community will think about it.

If your community includes individuals that are known to have engaged in behaviour that would violate your community standards, potential members or contributors may not trust that your CoC will function as adequate protection. A community that contains people known to have engaged in sexual harassment in other settings is unlikely to be seen as hugely welcoming, even if they haven't (as far as you know!) done so within your community. The way your members behave outside your community is going to be seen as saying something about your community, and that needs to be taken into account.

A second (and perhaps less obvious) aspect is that membership of some higher profile communities may be seen as lending general legitimacy to someone, and they may play off that to legitimise behaviour or views that would be seen as abhorrent by the community as a whole. If someone's anti-semitic views (for example) are seen as having more relevance because of their membership of your community, it's reasonable to think about whether keeping them in your community serves the best interests of your community.

Conclusion

I've said things like "considered" or "taken into account" a bunch here, and that's for a good reason - I don't know what the thresholds should be for any of these things, and there doesn't seem to be even a rough consensus in the wider community. We've seen cases in which communities have acted based on behaviour outside their community (eg, Debian removing Jacob Appelbaum after it was revealed that he'd sexually assaulted multiple people), but there's been no real effort to build a meaningful decision making framework around that.

As a result, communities struggle to make consistent decisions. It's unreasonable to expect individual communities to solve these problems on their own, but that doesn't mean we can ignore them. It's time to start coming up with a real set of best practices.

More Bluetooth (and gaming) features

Posted by Bastien Nocera on December 15, 2017 03:57 PM
In the midst of post-release bug fixing, we've also added a fair number of new features to our stack. As usual, new features span a number of different components, so integrators will have to be careful picking up all the components when, well, integrating.

PS3 clones joypads support

Do you have a PlayStation 3 joypad that feels just a little bit "off"? You can't find the Sony logo anywhere on it? The figures on the face buttons look like barbed wire? And if it were a YouTube video, it would say "No copyright intended"?


Bingo. When plugged in via USB, those devices advertise themselves as SHANWAN or Gasia, and implement the bare minimum to work when plugged into a PlayStation 3 console. But as a Linux computer would behave slightly differently, we need to fix a couple of things.

The first fix was simple, but necessary to be able to do any work: disable the rumble motor that starts as soon as you plug the pad in via USB.

Once that's done, we could work around the fact that the device isn't Bluetooth compliant, and hard-code the HID service it's supposed to offer.

Bluetooth LE Battery reporting

Bluetooth Low Energy is the new-fangled (7-year-old) protocol for low-throughput devices, from single coin-cell powered sensors to input devices. What's great is that there's finally a standardised way for devices to export their battery status. I've added support for this in BlueZ, which UPower then picks up for desktop integration goodness.
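
Once such a device is connected, the battery level shows up as a D-Bus property on the BlueZ device object. A quick way to peek at it, assuming a BlueZ new enough to expose the interface (the device path below is just an example):

# org.bluez.Battery1 exposes a Percentage property for each connected device
busctl get-property org.bluez /org/bluez/hci0/dev_AA_BB_CC_DD_EE_FF \
    org.bluez.Battery1 Percentage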

There are a number of Bluetooth LE joypads available for pickup, including a few that should be firmware upgradeable. Look for "Bluetooth 4" as well as "Bluetooth LE" when doing your holiday shopping.

gnome-bluetooth work

Finally, this is the boring part. Benjamin and I reworked code that's internal to gnome-bluetooth, as used in the Settings panel as well as the Shell, to make it use modern facilities like GDBusObjectManager. The overall effect is less code that is also less brittle and more reactive when Bluetooth adapters come and go, such as when using airplane mode.

Apart from the kernel patch mentioned above (you'll know if you need it :), those features have been integrated in UPower 0.99.7 and in the upcoming BlueZ 5.48. And they will of course be available in Fedora, both in rawhide and as updates to Fedora 27 as soon as the releases have been done and built.

GG!

The Intel ME vulnerabilities are a big deal for some people, harmless for most

Posted by Matthew Garrett on December 14, 2017 01:31 AM
(Note: all discussion here is based on publicly disclosed information, and I am not speaking on behalf of my employers)

I wrote about the potential impact of the most recent Intel ME vulnerabilities a couple of weeks ago. The details of the vulnerability were released last week, and it's not absolutely the worst case scenario but it's still pretty bad. The short version is that one of the (signed) pieces of early bringup code for the ME reads an unsigned file from flash and parses it. Providing a malformed file could result in a buffer overflow, and a moderately complicated exploit chain could be built that allowed the ME's exploit mitigation features to be bypassed, resulting in arbitrary code execution on the ME.

Getting this file into flash in the first place is the difficult bit. The ME region shouldn't be writable at OS runtime, so the most practical way for an attacker to achieve this is to physically disassemble the machine and directly reprogram it. The AMT management interface may provide a vector for a remote attacker to achieve this - for this to be possible, AMT must be enabled and provisioned and the attacker must have valid credentials[1]. Most systems don't have provisioned AMT, so most users don't have to worry about this.

Overall, for most end users there's little to worry about here. But the story changes for corporate users or high value targets who rely on TPM-backed disk encryption. The way the TPM protects access to the disk encryption key is to insist that a series of "measurements" are correct before giving the OS access to the disk encryption key. The first of these measurements is obtained through the ME hashing the first chunk of the system firmware and passing that to the TPM, with the firmware then hashing each component in turn and storing those in the TPM as well. If someone compromises a later point of the chain then the previous step will generate a different measurement, preventing the TPM from releasing the secret.
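
To make the chaining concrete: extending a measurement register means hashing the old value together with the new measurement, so every later value depends on every earlier one. Here's a rough shell simulation of the idea, purely illustrative (firmware-chunk.bin is a stand-in file; real TPMs do this internally):

# simulate a SHA-256 PCR extend: new = SHA256(old || measurement)
pcr=$(printf '0%.0s' $(seq 1 64))                  # registers start at all zeroes
m=$(sha256sum firmware-chunk.bin | cut -d' ' -f1)  # measure one boot component
pcr=$(printf '%s%s' "$pcr" "$m" | xxd -r -p | sha256sum | cut -d' ' -f1)
echo "$pcr"   # differs if any earlier measurement differed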

However, if the first step in the chain can be compromised, all these guarantees vanish. And since the first step in the chain relies on the ME to be running uncompromised code, this vulnerability allows that to be circumvented. The attacker's malicious code can be used to pass the "good" hash to the TPM even if the rest of the firmware has been tampered with. This allows a sufficiently skilled attacker to extract the disk encryption key and read the contents of the disk[2].

In addition, TPMs can be used to perform something called "remote attestation". This allows the TPM to provide a signed copy of the recorded measurements to a remote service, allowing that service to make a policy decision around whether or not to grant access to a resource. Enterprises using remote attestation to verify that systems are appropriately patched (eg) before they allow them access to sensitive material can no longer depend on those results being accurate.

Things are even worse for people relying on Intel's Platform Trust Technology (PTT), which is an implementation of a TPM that runs on the ME itself. Since this vulnerability allows full access to the ME, an attacker can obtain all the private key material held in the PTT implementation and, effectively, adopt the machine's cryptographic identity. This allows them to impersonate the system with arbitrary measurements whenever they want to. This basically renders PTT worthless from an enterprise perspective - unless you've maintained physical control of a machine for its entire lifetime, you have no way of knowing whether it's had its private keys extracted and so you have no way of knowing whether the attestation attempt is coming from the machine or from an attacker pretending to be that machine.

Bootguard, the component of the ME that's responsible for measuring the firmware into the TPM, is also responsible for verifying that the firmware has an appropriate cryptographic signature. Since that can be bypassed, an attacker can reflash modified firmware that can do pretty much anything. Yes, that probably means you can use this vulnerability to install Coreboot on a system locked down using Bootguard.

(An aside: The Titan security chips used in Google Cloud Platform sit between the chipset and the flash and verify the flash before permitting anything to start reading from it. If an attacker tampers with the ME firmware, Titan should detect that and prevent the system from booting. However, I'm not involved in the Titan project and don't know exactly how this works, so don't take my word for this)

Intel have published an update that fixes the vulnerability, but it's pretty pointless - there's apparently no rollback protection in the affected 11.x MEs, so while the attacker is modifying your flash to insert the payload they can just downgrade your ME firmware to a vulnerable version. Version 12 will reportedly include optional rollback protection, which is little comfort to anyone who has current hardware. Basically, anyone whose threat model depends on the low-level security of their Intel system is probably going to have to buy new hardware.

This is a big deal for enterprises and any individuals who may be targeted by skilled attackers who have physical access to their hardware, and entirely irrelevant for almost anybody else. If you don't know that you should be worried, you shouldn't be.

[1] Although admins should bear in mind that any system that hasn't been patched against CVE-2017-5689 considers an empty authentication cookie to be a valid credential

[2] TPMs are not intended to be strongly tamper resistant, so an attacker could also just remove the TPM, decap it and (with some effort) extract the key that way. This is somewhat more time consuming than just reflashing the firmware, so the ME vulnerability still amounts to a change in attack practicality.

CSR devices now supported in fwupd

Posted by Richard Hughes on December 11, 2017 12:42 PM

On Friday I added support for yet another variant of DFU. This variant is called “driverless DFU” and is used only by BlueCore chips from Cambridge Silicon Radio (now owned by Qualcomm). “Driverless” just means that it's DFU-like and routed over HID, but it's otherwise an unremarkable protocol. CSR is a huge ODM that makes most of the Bluetooth audio chips in vendor hardware. The hardware vendor can enable or disable features on the CSR microcontroller depending on licensing options (for instance echo cancellation), and there's even a little virtual machine to do simple vendor-specific things. All the CSR chips are updatable in-field, and most vendors issue updates to fix sound quality issues or to add support for new protocols or devices.

The BlueCore CSR chips are used everywhere. If you have “wireless” speakers or headphones that use Bluetooth there is a high probability that they're using a CSR chip inside. This makes the addition of CSR support to fwupd a big deal, as it opens up access to a lot of vendors. It's a lot easier to say “just upload firmware” than “you have to write code”, so I think it was useful to have done this work.

The vendor working with me on this feature has been the awesome AIAIAI who make some very nice modular headphones. A few minutes ago we uploaded the H05 v1.5 firmware to the LVFS testing stream and v1.6 will be coming soon with even more bug fixes. To update the AIAIAI H05 firmware you just need to connect the USB cable and press and hold the top and bottom buttons on the headband until the LED goes out. You can then update the firmware using fwupdmgr update or just using GNOME Software. The big caveat is that you have to be running fwupd >= 1.0.3 which isn’t scheduled to be released until after Christmas.
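
For reference, once you are on a new enough fwupd, the command-line flow should look roughly like this:

# fetch the latest metadata from the LVFS, then check for and apply updates
fwupdmgr refresh
fwupdmgr get-updates
fwupdmgr update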

I’ve contacted some more vendors I suspect are using the CSR chips. These include:

  • Jarre Technologies
  • RIVA Audio
  • Avantree
  • Zebra
  • Fugoo
  • Bowers&Wilkins
  • Plantronics
  • BeoPlay
  • JBL

If you know of any other “wireless speaker” companies that have issued at least one firmware update to users, please let me know in a comment here or in an email. I will follow up all suggestions and put the status on the Naughty&Nice vendorlist so please check that before suggesting a company. It would also be really useful to know the contact details (e.g. the web-form URL, or the email address) and also the model name of the device that might be updatable, although I’m happy to google myself if required. Thanks as always to Red Hat for allowing me to work on this stuff.

OARS Gets a New Home

Posted by Richard Hughes on December 07, 2017 09:50 AM

The Open Age Ratings Service is a simple website that lets you generate some content rating XML for your upstream AppData file.

In the last few months it’s gone from being hardly used to being used multiple times an hour, probably due to the requirement that applications on Flathub need it as part of the review process. After some complaints, I’ve added a ton more explanation to each question and made it easier to use. In particular if you specify that you’re creating metadata for a “non-game” then 80% of the questions get hidden from view.

As part of the relaunch, we now have a proper issue tracker and we've already pushed out some minor (API compatible) enhancements which will become OARS v1.1. These include several cultural sensitivity questions such as:

  • Homosexuality
  • Prostitution
  • Adultery
  • Desecration
  • Slavery
  • Violence towards places of worship

The cultural sensitivity questions are work in progress. If you have any other ideas, or comments, please let me know. Also, before I get internetted-to-death, this is just for advisory purposes, not for filtering. Thanks.

UTC and Anywhere on Earth support

Posted by Bastien Nocera on December 06, 2017 04:06 PM
A quick post to tell you that we finally added UTC support to Clocks' and the Shell's World Clocks section. And if you're into it, there's also Anywhere on Earth support.

You will need to have git master versions of libgweather (our cities and timezones database), and gnome-clocks. This feature will land in GNOME 3.28.



Many thanks to Giovanni for coming up with an API he was happy with after I attempted a couple of iterations on one. Enjoy!

Update: As expected, a bug crept in. Thanks to Colin Guthrie for spotting the error in the "Anywhere on Earth" timezone. See this section for the fun we have to deal with.

ColorHug Plus Update

Posted by Richard Hughes on December 04, 2017 04:28 PM

Here’s an update for people waiting for news on the ColorHug+ spectrophotometer, and perhaps not the update that you were hoping for. Three things have recently happened, and each of them makes producing the ColorHug+ even harder than it was before:

  1. A few weeks ago I became a father again. Producing the ColorHug and ColorHug2 devices takes a significant amount of time, brain, muscle and love, and I'm still struggling to divide my time between being a modern hands-on dad and a full-time job at Red Hat. ColorHug was (and still is) a hobby that got a little out of control, and not something that brings in any significant amount of money. A person spending £300 on a complex device is going to expect at least some level of support, even when I've had no sleep and only have half a brain on a Saturday morning.
  2. Brexit has made the GBP plunge in value over the last 12 months, which in theory should be good as it will encourage exports. What's slightly different for me is that 80% of the components for each device are purchased in USD and EUR, and the prices of the remaining ones bought in GBP have risen in line with the currency's fall. I have no idea what a post-Brexit Britain looks like, but I think it's a prudent choice not to “risk” £20k on an investment I'd essentially hope to break even on long term, for fun.
  3. The sensor for the ColorHug+ was going to be based on the bare-chip SPARK from OceanOptics. I've spent a long time working out all the quirks of the sensor, making it work with a UV and wideband illuminant and working out all the packaging questions. The sensor was always going to be expensive (it alone was more than half of the RRP, even when buying a massive batch) but last month I got an email saying the sensor was going to be discontinued and would no longer be available. This is figuratively, and also quite literally, back to the drawing board.

I’ve included some photos above to show I’ve not been full of hot air for the last year or so, and to remind everyone that the PCB, 3D light guide model and client software are all in the various ColorHug git repos if you want to have a go at building one yourself (although, buy the sensor quickly…). I’ll still continue selling ColorHug2 devices, and supporting all the existing hardware but this might be the end of the line for ColorHug spectrometer. I’ll keep my eye on all the trade magazines for any new sensor that is inexpensive, reliable and accurate enough for ICC profiles, so all this might just be resurrected in the future, but for the short term this is all on ice. If you want a device right now the X-Rite i1Studio is probably the best of the bunch, although it is sold by Pantone with an RRP of £450. Fair warning: Pantone and free software are not exactly bedfellows, although it does work with ArgyllCMS using a reverse engineered userspace driver that might void your warranty.

I’ll update the website at some point this evening, I’m not sure whether to just post all this or remove the ColorHug+ page completely. Perhaps a sad announcement, but perhaps not one that’s too unexpected considering the lack of updates in the last few months. Sorry to disappoint everybody.

An OpenHardware 1-port Hub?

Posted by Richard Hughes on November 29, 2017 11:29 AM

I’ve spent the last couple of evenings designing an OpenHardware USB 2.0 1-port hub tentatively called the ColorHub (although, better ideas certainly welcome). Back a bit: What’s the point in a 1-port hub?

The finished device is internally a Cypress 2-port hub, with a PIC microcontroller hard-wired onto one “fixed” port. The microcontroller can control the hub and the USB power of the other “removable” port, so you can simulate an unplug, replug or hub reset using a few simple commands. This also allows you to forcefully reset hardware that's not responding, and to add hardware tests for enumeration and device removal. With the device it is trivial to write a script that replugs a device 5000 times over one evening, or reconnects a USB device that's not responding for whatever reason. The smart hub also reports when USB devices are connected to the downstream port, and even when they have not enumerated correctly. There could be commands to query that status and also optionally wait until those things have happened.
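
As a sketch of what such an overnight soak test could look like (note: the colorhub command and its verbs are hypothetical, invented here for illustration):

# hypothetical helper and verbs, purely illustrative
for i in $(seq 1 5000); do
    colorhub replug              # power-cycle the removable port
    colorhub wait-enumerated 5   # fail if the device doesn't enumerate within 5s
done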

The other killer feature for me is that the microcontroller has lots of spare analog and digital IO, and with two included solid-state MOSFET relays you can wire up two physical switches so that no user interaction is required. This means you can test hardware that has these kind of requirements:

  • Remove USB plug
  • Press and hold buttons A&B
  • Insert USB plug
  • Release buttons A&B

It would be fairly trivial to wire up the microcontroller ADC to get a rough power consumption figure, or to set some custom hub descriptors; it would be completely open and “hackable” like the ColorHug.

I’ve made just one prototype and am using it quite nicely in the fwupd self tests, but talking to others yesterday this seems the kind of device that would be useful for other people doing similar QA activities. I need to build another 2 for the other devices requiring manual button-presses in the fwupd hardware cardboard-box-tests and it’s exactly the same price to order 50 tiny PCBs as 5.

The dangerous question: Would anyone else be interested in purchasing this kind of thing? The price would be in the £50-60 range, so certainly not cheap, but this is really the cost of ultra-small batches of moderately complicated electronics these days. If you’re interested, send me an email (richard_at_hughsie_dot_com) and depending on demand I’ll either design some nice custom PCBs or just hack together two more prototypes for my own use. Please also tell me if something like this already exists: if so I can save some time and just buy something that someone else has built. Comments welcome.

Potential impact of the Intel ME vulnerability

Posted by Matthew Garrett on November 27, 2017 10:17 PM
(Note: this is my personal opinion based on public knowledge around this issue. I have no knowledge of any non-public details of these vulnerabilities, and this should not be interpreted as the position or opinion of my employer)

Intel's Management Engine (ME) is a small coprocessor built into the majority of Intel CPU chipsets[0]. Older versions were based on the ARC architecture[1] running an embedded realtime operating system, but from version 11 onwards they've been small x86 cores running Minix. The precise capabilities of the ME have not been publicly disclosed, but it is at minimum capable of interacting with the network[2], display[3], USB, input devices and system flash. In other words, software running on the ME is capable of doing a lot, without requiring any OS permission in the process.

Back in May, Intel announced a vulnerability in the Advanced Management Technology (AMT) that runs on the ME. AMT offers functionality like providing a remote console to the system (so IT support can connect to your system and interact with it as if they were physically present), remote disk support (so IT support can reinstall your machine over the network) and various other bits of system management. The vulnerability meant that it was possible to log into systems with enabled AMT with an empty authentication token, making it possible to log in without knowing the configured password.

This vulnerability was less serious than it could have been for a couple of reasons - the first is that "consumer"[4] systems don't ship with AMT, and the second is that AMT is almost always disabled (Shodan found only a few thousand systems on the public internet with AMT enabled, out of many millions of laptops). I wrote more about it here at the time.

How does this compare to the newly announced vulnerabilities? Good question. Two of the announced vulnerabilities are in AMT. The previous AMT vulnerability allowed you to bypass authentication, but restricted you to doing what AMT was designed to let you do. While AMT gives an authenticated user a great deal of power, it's also designed with some degree of privacy protection in mind - for instance, when the remote console is enabled, an animated warning border is drawn on the user's screen to alert them.

This vulnerability is different in that it allows an authenticated attacker to execute arbitrary code within the AMT process. This means that the attacker shouldn't have any capabilities that AMT doesn't, but it's unclear where various aspects of the privacy protection are implemented - for instance, if the warning border is implemented in AMT rather than in hardware, an attacker could duplicate that functionality without drawing the warning. If the USB storage emulation for remote booting is implemented as a generic USB passthrough, the attacker could pretend to be an arbitrary USB device and potentially exploit the operating system through bugs in USB device drivers. Unfortunately we don't currently know.

Note that this exploit still requires two things - first, AMT has to be enabled, and second, the attacker has to be able to log into AMT. If the attacker has physical access to your system and you don't have a BIOS password set, they will be able to enable it - however, if AMT isn't enabled and the attacker isn't physically present, you're probably safe. But if AMT is enabled and you haven't patched the previous vulnerability, the attacker will be able to access AMT over the network without a password and then proceed with the exploit. This is bad, so you should probably (1) ensure that you've updated your BIOS and (2) ensure that AMT is disabled unless you have a really good reason to use it.
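
If you're not sure whether AMT is reachable on a given machine, one crude check is to probe the well-known AMT ports from another host on the network (the address below is just an example):

# AMT's web interface listens on TCP ports 16992 (HTTP) and 16993 (HTTPS)
nmap -p 16992,16993 192.168.1.42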

The AMT vulnerability applies to a wide range of versions, everything from version 6 (which shipped around 2008) onwards. The other vulnerability that Intel describe is restricted to version 11 of the ME, which only applies to much more recent systems. This vulnerability allows an attacker to execute arbitrary code on the ME, which means they can do literally anything the ME is able to do. This probably also means that they are able to interfere with any other code running on the ME. While AMT has been the most frequently discussed part of this, various other Intel technologies are tied to ME functionality.

Intel's Platform Trust Technology (PTT) is a software implementation of a Trusted Platform Module (TPM) that runs on the ME. TPMs are intended to protect access to secrets and encryption keys and record the state of the system as it boots, making it possible to determine whether a system has had part of its boot process modified and denying access to the secrets as a result. The most common usage of TPMs is to protect disk encryption keys - Microsoft Bitlocker defaults to storing its encryption key in the TPM, automatically unlocking the drive if the boot process is unmodified. In addition, TPMs support something called Remote Attestation (I wrote about that here), which allows the TPM to provide a signed copy of information about what the system booted to a remote site. This can be used for various purposes, such as not allowing a compute node to join a cloud unless it's booted the correct version of the OS and is running the latest firmware version. Remote Attestation depends on the TPM having a unique cryptographic identity that is tied to the TPM and inaccessible to the OS.

PTT allows manufacturers to simply license some additional code from Intel and run it on the ME rather than having to pay for an additional chip on the system motherboard. This seems great, but if an attacker is able to run code on the ME then they potentially have the ability to tamper with PTT, which means they can obtain access to disk encryption secrets and circumvent Bitlocker. It also means that they can tamper with Remote Attestation, "attesting" that the system booted a set of software that it didn't or copying the keys to another system and allowing that to impersonate the first. This is, uh, bad.

Intel also recently announced Intel Online Connect, a mechanism for providing the functionality of security keys directly in the operating system. Components of this are run on the ME in order to avoid scenarios where a compromised OS could be used to steal the identity secrets - if the ME is compromised, this may make it possible for an attacker to obtain those secrets and duplicate the keys.

It's also not entirely clear how much of Intel's Secure Guard Extensions (SGX) functionality depends on the ME. The ME does appear to be required for SGX Remote Attestation (which allows an application using SGX to prove to a remote site that it's the SGX app rather than something pretending to be it), and again if those secrets can be extracted from a compromised ME it may be possible to compromise some of the security assumptions around SGX. Again, it's not clear how serious this is because it's not publicly documented.

Various other things also run on the ME, including stuff like video DRM (ensuring that high resolution video streams can't be intercepted by the OS). It may be possible to obtain encryption keys from a compromised ME that allow things like Netflix streams to be decoded and dumped. From a user privacy or security perspective, these things seem less serious.

The big problem at the moment is that we have no idea what the actual process of compromise is. Intel state that it requires local access, but don't describe what kind. Local access in this case could simply require the ability to send commands to the ME (possible on any system that has the ME drivers installed), could require direct hardware access to the exposed ME (which would require either kernel access or the ability to install a custom driver) or even the ability to modify system flash (possible only if the attacker has physical access and enough time and skill to take the system apart and modify the flash contents with an SPI programmer). The other thing we don't know is whether it's possible for an attacker to modify the system such that the ME is persistently compromised or whether it needs to be re-compromised every time the ME reboots. Note that even the latter is more serious than you might think - the ME may only be rebooted if the system loses power completely, so even a "temporary" compromise could affect a system for a long period of time.

It's also almost impossible to determine if a system is compromised. If the ME is compromised then it's probably possible for it to roll back any firmware updates but still report that it's been updated, giving admins a false sense of security. The only way to determine for sure would be to dump the system flash and compare it to a known good image. This is impractical to do at scale.

So, overall, given what we know right now it's hard to say how serious this is in terms of real world impact. It's unlikely that this is the kind of vulnerability that would be used to attack individual end users - anyone able to compromise a system like this could just backdoor your browser instead with much less effort, and that already gives them your banking details. The people who have the most to worry about here are potential targets of skilled attackers, which means activists, dissidents and companies with interesting personal or business data. It's hard to make strong recommendations about what to do here without more insight into what the vulnerability actually is, and we may not know that until this presentation next month.

Summary: Worst case here is terrible, but unlikely to be relevant to the vast majority of users.

[0] Earlier versions of the ME were built into the motherboard chipset, but as portions of that were incorporated onto the CPU package the ME followed. Edit: Apparently I was wrong and it's still on the chipset
[1] A descendent of the SuperFX chip used in Super Nintendo cartridges such as Starfox, because why not
[2] Without any OS involvement for wired ethernet, and also for wireless networks while still in the system firmware; wireless access requires OS support once the OS drivers have loaded
[3] Assuming you're using integrated Intel graphics
[4] "Consumer" is a bit of a misnomer here - "enterprise" laptops like Thinkpads ship with AMT, but are often bought by consumers.

LVFS Article On OpenSource.com

Posted by Richard Hughes on November 20, 2017 12:04 PM

I wrote an article on the LVFS for OpenSource.com. If you’re interested in an overview of how firmware updates work in Linux, and a little history it might be an interesting read.

Eben Moglen is no longer a friend of the free software community

Posted by Matthew Garrett on November 13, 2017 05:42 PM
(Note: While the majority of the events described below occurred while I was a member of the board of directors of the Free Software Foundation, I am no longer. This is my personal position and should not be interpreted as the opinion of any other organisation or company I have been affiliated with in any way)

Eben Moglen has done an amazing amount of work for the free software community, serving on the board of the Free Software Foundation and acting as its general counsel for many years, leading the drafting of GPLv3 and giving many forceful speeches on the importance of free software. However, his recent behaviour demonstrates that he is no longer willing to work with other members of the community, and we should reciprocate that.

In early 2016, the FSF board became aware that Eben was briefing clients on an interpretation of the GPL that was incompatible with that held by the FSF. He later released this position publicly with little coordination with the FSF, which was used by Canonical to justify their shipping ZFS in a GPL-violating way. He had provided similar advice to Debian, who were confused about the apparent conflict between the FSF's position and Eben's.

This situation was obviously problematic - Eben is clearly free to provide whatever legal opinion he holds to his clients, but his very public association with the FSF caused many people to assume that these positions were held by the FSF and the FSF were forced into the position of publicly stating that they disagreed with legal positions held by their general counsel. Attempts to mediate this failed, and Eben refused to commit to working with the FSF on avoiding this sort of situation in future[1].

Around the same time, Eben made legal threats towards another project with ties to FSF. These threats were based on a license interpretation that ran contrary to how free software licenses had been interpreted by the community for decades, and was made without any prior discussion with the FSF (2017-12-11 update: page 126 of this document includes the email in which Eben asserts that the Software Freedom Conservancy is engaging in plagiarism by making use of appropriately credited material released under a Creative Commons license). This, in conjunction with his behaviour over the ZFS issue, led to him stepping down as the FSF's general counsel.

Throughout this period, Eben disparaged FSF staff and other free software community members in various semi-public settings. In doing so he harmed the credibility of many people who have devoted significant portions of their lives to aiding the free software community. At Libreplanet earlier this year he made direct threats against an attendee - this was reported as a violation of the conference's anti-harassment policy.

Eben has acted against the best interests of an organisation he publicly represented. He has threatened organisations and individuals who work to further free software. His actions are no longer to the benefit of the free software community and the free software community should cease associating with him.

[1] Contrary to the claim provided here, Bradley was not involved in this process.

(Edit to add: various people have asked for more details of some of the accusations here. Eben is influential in many areas, and publicising details without the direct consent of his victims may put them at professional risk. I'm aware that this reduces my credibility, and it's entirely reasonable for people to choose not to believe me as a result. I will add that I said much of this several months ago, so I'm not making stuff up in response to recent events)

gtk3 + broadway + libreoffice

Posted by Caolán McNamara on November 09, 2017 09:08 PM
Out of the box in Fedora 26 I see that our gtk3 version of LibreOffice mostly works under broadway so here's libreoffice displaying through firefox. Toolbar is toast, but dialogs and menus work.


# start the broadway display server as display :5 (it listens on port 8080 + display number, so 8085 here)
broadwayd :5 &
# point a browser at the broadway server
firefox http://127.0.0.1:8085 &
# launch LibreOffice using the broadway GDK backend
GDK_BACKEND=broadway BROADWAY_DISPLAY=:5 soffice --nologo &

Hardware CI Tests in fwupd

Posted by Richard Hughes on November 07, 2017 03:47 PM

Near the end of the process of getting a vendor onto the LVFS I normally ask them to send me hardware for the tests. Once we've got a pretty good idea that the hardware update process is going to work with fwupd (i.e. they're not insisting on some statically linked ELF to be run…) and when they've got legal approval to upload the firmware to the LVFS (without an eye-wateringly long EULA) we start thinking about how to test the hardware. Once we say “Product Foo from Vendor Bar is supported in Linux” we better make damn sure it doesn't regress when something in the kernel changes or when someone refactors a plugin to support a different variant of a protocol.

To make this task a little more manageable, we have a little python script that helps automate testing of the devices that can be persuaded to enter DFU mode by themselves. To avoid chaos, I also have a little cardboard tray under a little HP Microserver with two 10-port USB hubs, with everything organised. Who knew paper-craft would be such an important skill at Red Hat…

As the astute might notice, much of the hardware is a bare PCB. I don’t actually need the complete device for testing, and much of the donated hardware is actually a user return or with a cosmetic defect, or even just a pre-release PCB without the actual hardware attached. This is fine, and actually preferable to the entire device – I only have a small office!

As much of the hardware needs special handling to put it in update mode we can't 100% automate this task, and sometimes it really is just me sitting in front of the laptop pressing and holding buttons for 30 minutes before uploading a tarball, but it sure is comforting to know that firmware updates are tested like this. As usual, thanks should be directed to Red Hat for letting me work on this kind of stuff, they really are a marvelous company to work for.

Quirks in fwupd as key files

Posted by Richard Hughes on November 02, 2017 07:25 PM

In my previous blog post I hinted that you just have to add one line to a data file to add support for new AVR32 microcontrollers; this blog entry gives a few more details.

A few minutes ago I merged a PR that moves the database of supported and quirked devices out of the C code and into runtime-loaded files. When fwupd is installed in long-term support distros it's very hard to backport new versions as new hardware is released. The idea with this functionality is that the end user can drop an additional (or replacement) file into a .d directory with a simple format and the hardware will magically start working. This assumes no new quirks are required, as that would obviously need code changes, but it allows us to get most existing devices working in an easy way without the user compiling anything.

The quirk files themselves are simple key files and are documented in the fwupd gtk-doc documentation.
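
As a purely illustrative sketch of the shape of such a drop-in (the path, group and key names below are invented; check the gtk-doc documentation for the real ones):

# /etc/fwupd/quirks.d/my-device.quirk -- hypothetical example, invented names
[USB\VID_273F&PID_1001]
Plugin = dfu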

AVR32 devices in fwupd

Posted by Richard Hughes on October 31, 2017 10:13 PM

Over 10 years ago the dfu-programmer project was forked into dfu-utils as the former didn’t actually work at all well with generic devices supporting vanilla 1.0 and 1.1 specification-compliant DFU. It was then adapted to also support the STM variant of DFU (standards FTW). One feature that dfu-programmer did have, which dfu-util never seemed to acquire was support for the AVR variant of DFU (very different from STM DFU, but doing basically the same things). This meant if you wanted to program AVR parts you had to use the long-obsolete tool rather than the slightly less-unmaintained newer tool.

Today I merged a PR in fwupd that adds support for flashing AVR32 devices from Atmel. These are the same chips found in some Arduino prototype boards, and are also the core of many thousands of professional devices like the Nitrokey. You can already program this kind of hardware in Linux, using clunky commands like:

# dfu-programmer at32uc3a3256s erase
# dfu-programmer at32uc3a3256s flash --suppress-bootloader-mem foo.ihx
# dfu-programmer at32uc3a3256s launch

The crazy long chip identifier is specified manually for each command, as the bootloader VID/PID isn't always unique for each chip type. For fwupd we need to be able to program hardware without any user input, and without any chance of the wrong chip identifier bricking the hardware. This is possible to do as the chip itself knows its own device ID, but for some reason Atmel makes it super difficult to autodetect the hardware, as it has not published a table of all the processor types it has produced. I'll cover in a future blog post how we do this mapping in fwupd, but at least for hardware like the Nitrokey you can now use the little dfu-tool helper executable shipped in fwupd to do:

# dfu-tool write foo.ihx

Or, for normal people, you can soon just click the Update button in GNOME Software which uses the DFU plugin in fwupd to apply the update. It’s so easy, and safe.

If you manufacture an AVR32 device that uses the Atmel bootloader (not the Arduino one), and you're interested in making fwupd work with your hardware, it's likely you just have to add one line to a data file. If the output of dfu-tool list already shows a Chip ID along with can-download|can-upload then there's no excuse at all, as it should just work. There is a lot of hardware using the AT32UC3, so I'm hopeful that spending the time on the AVR support means more vendors can join the LVFS project.

What is libwacom? A library to describe graphics tablets

Posted by Peter Hutterer on October 26, 2017 11:29 PM

libwacom has been around since 2011 now but I'm still getting the odd question or surprise at what libwacom does, is, or should be. So here's a short summary:

libwacom only provides descriptions

libwacom is a library that provides tablet descriptions but no actual tablet event handling functionality. Simply said, it's a library that provides access to a bunch of text files. Graphics tablets are complex and to integrate them well we usually need to know more about them than the information the kernel reports. If you need to know whether the tablet is a standalone one (Wacom Intuos series) or a built-in one (Wacom Cintiq series), libwacom will tell you that. You need to know how many LEDs and mode groups a tablet has? libwacom will tell you that. You need an SVG to draw a representation of the tablet's button layout? libwacom will give you that. You need to know which stylus is compatible with your tablet? libwacom knows about that too.

But that's all it does. You cannot feed events to libwacom, and it will not initialise the device for you. It just provides static device descriptions.

libwacom does not make your tablet work

If your tablet isn't working or the buttons aren't handled correctly, or the stylus is moving the wrong way, libwacom won't be able to help with that. As said above, it merely provides extra information about the device but is otherwise completely ignorant of the actual tablet.

libwacom handles any tablet

Sure, it's named after Wacom tablets because that's where the majority of effort goes (not least because Wacom employs Linux developers!). But the description format is independent of the brand so you can add non-Wacom tablets to it too.

Caveat: many of the cheap non-Wacom tablets re-use USB ids so two completely different devices would have the same USB ID, making a static device description useless.

Who uses libwacom?

Right now, the two most prevalent users of libwacom are GNOME and libinput. GNOME's control center and mutter use libwacom for tablet-to-screen mappings as well as to show the various stylus capabilities. And it uses the SVG to draw an overlay for pad buttons. libinput uses it to associate the LEDs on the pad with the right buttons and to initialise the stylus tools axes correctly. The kernel always exposes all possible axes on the event node but not all styli have all axes. With libwacom, we can initialise the stylus tool based on the correct information.
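
If you just want to see what libwacom knows about the tablets connected to your own system, there's a small tool shipped alongside the library that should do the trick:

# dumps the libwacom descriptions for all locally connected tablets
libwacom-list-local-devices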

Resources

So now I expect you to say something like "Oh wow, I'm like totally excited about libwacom now and I want to know more and get involved!". Well, fear not, there is more information and links to the repos in the wiki.

Jabra joins the LVFS

Posted by Richard Hughes on October 25, 2017 09:37 AM

Some great news: the Jabra Speak devices are now supported using fwupd, and firmware files have just been uploaded to the LVFS.

You can now update the firmware just by clicking on a button in GNOME Software when using fwupd >= 1.0.0. Working with Jabra to add the required DFU quirks to fwupd and to get legal clearance to upload the firmware has been a pleasure. Their hardware is well designed and works really well in Linux (with the latest firmware), and they’ve been really helpful providing all the specifications we needed to get the firmware upgrade working reliably. We’ll hopefully be adding some different Jabra devices in the coming months to the LVFS too.

More vendor announcements coming soon too.

Attending and Speaking at GNOME.Asia 2017 Summit

Posted by Lennart Poettering on October 23, 2017 10:00 PM

The GNOME.Asia Summit 2017 organizers invited me to speak at their conference in Chongqing/China, and it was an excellent event! Here's my brief report:

Because we arrived one day early in Chongqing, my GNOME friends Sri, Matthias, Jonathan, David and I started our journey with an excursion to the Dazu Rock Carvings, a short bus trip from Chongqing, and an excellent (and sometimes quite surprising) sight. I mean, where else can you see a centuries-old buddha with 1000+ hands, holding a Nexus 5 cell phone? Here's proof:

The GNOME.Asia schedule was excellent, with various good talks, including some about Flatpak, Endless OS, rpm-ostree, Blockchains and more. My own talk was about The Path to a Fully Protected GNOME Desktop OS Image (Slides available here). In the hallway track I did my best to advocate casync to whoever was willing to listen, and I think enough were ;-). As we all know attending conferences is at least as much about the hallway track as about the talks, and GNOME.Asia was a fantastic way to meet the Chinese GNOME and Open Source communities.

The day after the conference the GNOME.Asia organizers arranged a Chongqing day trip. A particular highlight was the ubiquitous hot pot, sometimes with the local speciality: fresh pig brain.

Here some random photos from the trip: sights, food, social event and more.

I'd like to thank the GNOME Foundation for funding my trip to GNOME.Asia. And that's all for now. But let me close with an old chinese wisdom:

   The Trials Of A Long Journey Always Feeling, Civilized Travel Pass Reputation.

All Systems Go! 2017 Videos Online!

Posted by Lennart Poettering on October 23, 2017 10:00 PM

For those living under a rock, the videos from everybody's favourite Userspace Linux Conference All Systems Go! 2017 are now available online.

All videos

The videos for my own two talks are available here:

Synchronizing Images with casync (Slides)

Containers without a Container Manager, with systemd (Slides)

Of course, this is the stellar work of the CCC VOC folks, who are hard to beat when it comes to videotaping of community conferences.

Discrepancy Report #107743

Posted by Caolán McNamara on October 23, 2017 08:21 PM
A short little article from 1996 about a bug in the shuttle's starboard manipulator arm position display.

Spoiler: “A half-dozen pages of forms detail [the error] ... the most remarkable thing about the error and its paper trail. ‘There is no starboard manipulator arm’”

Shaking the tin for LVFS: Asking for donations!

Posted by Richard Hughes on October 16, 2017 03:50 PM

tl;dr: If you feel like you want to donate to the LVFS, you can now do so here.

Nearly 100 million files are downloaded from the LVFS every month, the majority being metadata to know what updates are available. Although each metadata file is very small it still adds up to over 1TB in transferred bytes per month. Amazon has kindly given the LVFS a 2000 USD per year open source grant which more than covers the hosting costs and any test EC2 instances. I really appreciate the donation from Amazon as it allows us to continue to grow, both with the number of Linux clients connecting every hour, and with the number of firmware files hosted. Before the grant sometimes Red Hat would pay the bandwidth bill, and other times it was just paid out my own pocket, so the grant does mean a lot to me. Amazon seemed very friendly towards this kind of open source shared infrastructure, so kudos to them for that.

At the moment the secure part of the LVFS is hosted in a dedicated Scaleway instance, so any additional donations would be spent on paying this small bill and perhaps more importantly buying some (2nd hand?) hardware to include as part of our release-time QA checks.

I already test fwupd with about a dozen pieces of hardware, but I’d feel a lot more comfortable testing different classes of device with updates on the LVFS.

One thing I’ve found that also works well is taking a chance and buying a popular device we know is upgradable and adding support for the specific quirks it has to fwupd. This is an easy way to get karma from a previously Linux-unfriendly vendor before we start discussing uploading firmware updates to the LVFS. Hardware on my wanting-to-buy list includes a wireless network card, a fingerprint scanner and SSDs from a couple of different vendors.

If you’d like to donate towards hardware, please donate via LiberaPay or ask me for PayPal/BACS details. Even if you donate €0.01 per week it would make a difference. Thanks!

IP Accounting and Access Lists with systemd

Posted by Lennart Poettering on October 08, 2017 10:00 PM

TL;DR: systemd now can do per-service IP traffic accounting, as well as access control for IP address ranges.

Last Friday we released systemd 235. I already blogged about its Dynamic User feature in detail, but there's one more piece of new functionality that I think deserves special attention: IP accounting and access control.

Before v235 systemd already provided per-unit resource management hooks for a number of different kinds of resources: consumed CPU time, disk I/O, memory usage and number of tasks. With v235 another kind of resource can be controlled per-unit with systemd: network traffic (specifically IP).

Three new unit file settings have been added in this context:

  1. IPAccounting= is a boolean setting. If enabled for a unit, all IP traffic sent and received by processes associated with it is counted both in terms of bytes and of packets.

  2. IPAddressDeny= takes an IP address prefix (that means: an IP address with a network mask). All traffic from and to this address will be prohibited for processes of the service.

  3. IPAddressAllow= is the matching positive counterpart to IPAddressDeny=. All traffic matching this IP address/network mask combination will be allowed, even if otherwise listed in IPAddressDeny=.

The three options are thin wrappers around kernel functionality introduced with Linux 4.11: the control group eBPF hooks. The actual work is done by the kernel, systemd just provides a number of new settings to configure this facet of it. Note that cgroup/eBPF is unrelated to classic Linux firewalling, i.e. NetFilter/iptables. It's up to you whether you use one or the other, or both in combination (or of course neither).

IP Accounting

Let's have a closer look at the IP accounting logic mentioned above. Let's write a simple unit /etc/systemd/system/ip-accounting-test.service:

[Service]
ExecStart=/usr/bin/ping 8.8.8.8
IPAccounting=yes

This simple unit invokes the ping(8) command to send a series of ICMP/IP ping packets to the IP address 8.8.8.8 (which is the Google DNS server IP; we use it for testing here, since it's easy to remember, reachable everywhere and known to react to ICMP pings; any other IP address responding to pings would be fine to use, too). The IPAccounting= option is used to turn on IP accounting for the unit.

Let's start this service after writing the file. Let's then have a look at the status output of systemctl:

# systemctl daemon-reload
# systemctl start ip-accounting-test
# systemctl status ip-accounting-test
● ip-accounting-test.service
   Loaded: loaded (/etc/systemd/system/ip-accounting-test.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2017-10-09 18:05:47 CEST; 1s ago
 Main PID: 32152 (ping)
       IP: 168B in, 168B out
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/ip-accounting-test.service
           └─32152 /usr/bin/ping 8.8.8.8

Okt 09 18:05:47 sigma systemd[1]: Started ip-accounting-test.service.
Okt 09 18:05:47 sigma ping[32152]: PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
Okt 09 18:05:47 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=1 ttl=59 time=29.2 ms
Okt 09 18:05:48 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=2 ttl=59 time=28.0 ms

This shows the ping command running — it's currently at its second ping cycle as we can see in the logs at the end of the output. More interesting however is the IP: line further up showing the current IP byte counters. It currently shows 168 bytes have been received, and 168 bytes have been sent. That the two counters are at the same value is not surprising: ICMP ping requests and responses are supposed to have the same size. Note that this line is shown only if IPAccounting= is turned on for the service, as only then this data is collected.

Let's wait a bit, and invoke systemctl status again:

# systemctl status ip-accounting-test
● ip-accounting-test.service
   Loaded: loaded (/etc/systemd/system/ip-accounting-test.service; static; vendor preset: disabled)
   Active: active (running) since Mon 2017-10-09 18:05:47 CEST; 4min 28s ago
 Main PID: 32152 (ping)
       IP: 22.2K in, 22.2K out
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/ip-accounting-test.service
           └─32152 /usr/bin/ping 8.8.8.8

Okt 09 18:10:07 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=260 ttl=59 time=27.7 ms
Okt 09 18:10:08 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=261 ttl=59 time=28.0 ms
Okt 09 18:10:09 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=262 ttl=59 time=33.8 ms
Okt 09 18:10:10 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=263 ttl=59 time=48.9 ms
Okt 09 18:10:11 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=264 ttl=59 time=27.2 ms
Okt 09 18:10:12 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=265 ttl=59 time=27.0 ms
Okt 09 18:10:13 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=266 ttl=59 time=26.8 ms
Okt 09 18:10:14 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=267 ttl=59 time=27.4 ms
Okt 09 18:10:15 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=268 ttl=59 time=29.7 ms
Okt 09 18:10:16 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=269 ttl=59 time=27.6 ms

As we can see, after 269 pings the counters are much higher: at 22K.

Note that while systemctl status shows only the byte counters, packet counters are kept as well. Use the low-level systemctl show command to query the current raw values of the in and out packet and byte counters:

# systemctl show ip-accounting-test -p IPIngressBytes -p IPIngressPackets -p IPEgressBytes -p IPEgressPackets
IPIngressBytes=37776
IPIngressPackets=449
IPEgressBytes=37776
IPEgressPackets=449

Of course, the same information is also available via the D-Bus APIs. If you want to process this data further consider talking proper D-Bus, rather than scraping the output of systemctl show.
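
For example, with busctl the same counters can be read as proper D-Bus properties on the unit's Service interface (the object path is simply the escaped unit name):

# busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1/unit/ip_2daccounting_2dtest_2eservice org.freedesktop.systemd1.Service IPIngressBytes
t 37776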

Now, let's stop the service again:

# systemctl stop ip-accounting-test

When a service with such accounting turned on terminates, a log line about all its consumed resources is written to the logs. Let's check with journalctl:

# journalctl -u ip-accounting-test -n 5
-- Logs begin at Thu 2016-08-18 23:09:37 CEST, end at Mon 2017-10-09 18:17:02 CEST. --
Okt 09 18:15:50 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=603 ttl=59 time=26.9 ms
Okt 09 18:15:51 sigma ping[32152]: 64 bytes from 8.8.8.8: icmp_seq=604 ttl=59 time=27.2 ms
Okt 09 18:15:52 sigma systemd[1]: Stopping ip-accounting-test.service...
Okt 09 18:15:52 sigma systemd[1]: Stopped ip-accounting-test.service.
Okt 09 18:15:52 sigma systemd[1]: ip-accounting-test.service: Received 49.5K IP traffic, sent 49.5K IP traffic

The last line shown is the interesting one, that shows the accounting data. It's actually a structured log message, and among its metadata fields it contains the more comprehensive raw data:

# journalctl -u ip-accounting-test -n 1 -o verbose
-- Logs begin at Thu 2016-08-18 23:09:37 CEST, end at Mon 2017-10-09 18:18:50 CEST. --
Mon 2017-10-09 18:15:52.649028 CEST [s=89a2cc877fdf4dafb2269a7631afedad;i=14d7;b=4c7e7adcba0c45b69d612857270716d3;m=137592e75e;t=55b1f81298605;x=c3c9b57b28c9490e]
    PRIORITY=6
    _BOOT_ID=4c7e7adcba0c45b69d612857270716d3
    _MACHINE_ID=e87bfd866aea4ae4b761aff06c9c3cb3
    _HOSTNAME=sigma
    SYSLOG_FACILITY=3
    SYSLOG_IDENTIFIER=systemd
    _UID=0
    _GID=0
    _TRANSPORT=journal
    _PID=1
    _COMM=systemd
    _EXE=/usr/lib/systemd/systemd
    _CAP_EFFECTIVE=3fffffffff
    _SYSTEMD_CGROUP=/init.scope
    _SYSTEMD_UNIT=init.scope
    _SYSTEMD_SLICE=-.slice
    CODE_FILE=../src/core/unit.c
    _CMDLINE=/usr/lib/systemd/systemd --switched-root --system --deserialize 25
    _SELINUX_CONTEXT=system_u:system_r:init_t:s0
    UNIT=ip-accounting-test.service
    CODE_LINE=2115
    CODE_FUNC=unit_log_resources
    MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0
    INVOCATION_ID=98a6e756fa9d421d8dfc82b6df06a9c3
    IP_METRIC_INGRESS_BYTES=50880
    IP_METRIC_INGRESS_PACKETS=605
    IP_METRIC_EGRESS_BYTES=50880
    IP_METRIC_EGRESS_PACKETS=605
    MESSAGE=ip-accounting-test.service: Received 49.6K IP traffic, sent 49.6K IP traffic
    _SOURCE_REALTIME_TIMESTAMP=1507565752649028

The interesting fields of this log message are of course IP_METRIC_INGRESS_BYTES=, IP_METRIC_INGRESS_PACKETS=, IP_METRIC_EGRESS_BYTES=, IP_METRIC_EGRESS_PACKETS= that show the consumed data.

The log message carries a message ID that may be used to quickly search for all such resource log messages (ae8f7b866b0347b9af31fe1c80b127c0). We can combine a search term for messages of this ID with journalctl's -u switch to quickly find out about the resource usage of any invocation of a specific service. Let's try:

# journalctl -u ip-accounting-test MESSAGE_ID=ae8f7b866b0347b9af31fe1c80b127c0
-- Logs begin at Thu 2016-08-18 23:09:37 CEST, end at Mon 2017-10-09 18:25:27 CEST. --
Okt 09 18:15:52 sigma systemd[1]: ip-accounting-test.service: Received 49.6K IP traffic, sent 49.6K IP traffic

Of course, the output above shows only one message at the moment, since we started the service only once, but a new one will appear every time you start and stop it again.

The IP accounting logic is also hooked up with systemd-run, which is useful for transiently running a command as systemd service with IP accounting turned on. Let's try it:

# systemd-run -p IPAccounting=yes --wait wget https://cfp.all-systems-go.io/en/ASG2017/public/schedule/2.pdf
Running as unit: run-u2761.service
Finished with result: success
Main processes terminated with: code=exited/status=0
Service runtime: 878ms
IP traffic received: 231.0K
IP traffic sent: 3.7K

This uses wget to download the PDF version of the 2nd day schedule of everybody's favorite Linux user-space conference All Systems Go! 2017 (BTW, have you already booked your ticket? We are very close to selling out, be quick!). The IP traffic this command generated was 231K ingress and 4K egress. In the systemd-run command line two parameters are important. First of all, we use -p IPAccounting=yes to turn on IP accounting for the transient service (as above). And secondly we use --wait to tell systemd-run to wait for the service to exit. If --wait is used, systemd-run will also show you various statistics about the service that just ran and terminated, including the IP statistics you are seeing if IP accounting has been turned on.

It's fun to combine this sort of IP accounting with interactive transient units. Let's try that:

# systemd-run -p IPAccounting=1 -t /bin/sh
Running as unit: run-u2779.service
Press ^] three times within 1s to disconnect TTY.
sh-4.4# dnf update
…
sh-4.4# dnf install firefox
…
sh-4.4# exit
Finished with result: success
Main processes terminated with: code=exited/status=0
Service runtime: 5.297s
IP traffic received: …B
IP traffic sent: …B

This uses systemd-run's --pty switch (or short: -t), which opens an interactive pseudo-TTY connection to the invoked service process, which is a Bourne shell in this case. Doing this means we have a full, comprehensive shell with job control and everything. Since the shell is running as part of a service with IP accounting turned on, all IP traffic we generate or receive will be accounted for. And as soon as we exit the shell, we'll see what it consumed. (For the sake of brevity I actually didn't paste the whole output above, but truncated core parts. Try it out for yourself, if you want to see the output in full.)

Sometimes it might make sense to turn on IP accounting for a unit that is already running. For that, use systemctl set-property foobar.service IPAccounting=yes, which will instantly turn on accounting for it. Note that it won't count retroactively though: only the traffic sent/received after the point in time you turned it on will be collected. You may turn accounting off again with the same command, setting IPAccounting=no.
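
Once enabled, you don't have to wait for the unit to stop to see the counters: the accounting data is also exported as unit properties. Here's a quick sketch (httpd.service is just a stand-in for whatever unit you care about):

# systemctl set-property httpd.service IPAccounting=yes
# systemctl show httpd.service -p IPIngressBytes -p IPEgressBytes
IPIngressBytes=…
IPEgressBytes=…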

Of course, sometimes it's interesting to collect IP accounting data for all services, and turning on IPAccounting=yes in every single unit is cumbersome. To deal with that there's a global option DefaultIPAccounting= available which can be set in /etc/systemd/system.conf.
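
As a minimal sketch, the relevant snippet in /etc/systemd/system.conf would look like this:

[Manager]
DefaultIPAccounting=yes

Note that this file is read by the service manager itself, hence a change there takes effect for newly started units only after the manager re-executes or reloads its configuration.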

IP Access Lists

So much about IP accounting. Let's now have a look at IP access control with systemd 235. As mentioned above, the two new unit file settings IPAddressAllow= and IPAddressDeny= may be used for that. They operate in the following way:

  1. If the source address of an incoming packet or the destination address of an outgoing packet matches one of the IP addresses/network masks in the relevant unit's IPAddressAllow= setting then it will be allowed to go through.

  2. Otherwise, if a packet matches an IPAddressDeny= entry configured for the service it is dropped.

  3. If the packet matches neither of the above it is allowed to go through.

Or in other words, IPAddressDeny= implements a blacklist, but IPAddressAllow= takes precedence.

Let's try that out. Let's modify our last example above in order to get a transient service running an interactive shell which has such an access list set:

# systemd-run -p IPAddressDeny=any -p IPAddressAllow=8.8.8.8 -p IPAddressAllow=127.0.0.0/8 -t /bin/sh
Running as unit: run-u2850.service
Press ^] three times within 1s to disconnect TTY.
sh-4.4# ping 8.8.8.8 -c1
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=59 time=27.9 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 27.957/27.957/27.957/0.000 ms
sh-4.4# ping 8.8.4.4 -c1
PING 8.8.4.4 (8.8.4.4) 56(84) bytes of data.
ping: sendmsg: Operation not permitted
^C
--- 8.8.4.4 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
sh-4.4# ping 127.0.0.2 -c1
PING 127.0.0.2 (127.0.0.2) 56(84) bytes of data.
64 bytes from 127.0.0.2: icmp_seq=1 ttl=64 time=0.116 ms

--- 127.0.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.116/0.116/0.116/0.000 ms
sh-4.4# exit

The access list we set up uses IPAddressDeny=any in order to define an IP white-list: all traffic will be prohibited for the session, except for what is explicitly white-listed. In this command line, we white-listed two address prefixes: 8.8.8.8 (with no explicit network mask, which means the mask with all bits turned on is implied, i.e. /32), and 127.0.0.0/8. Thus, the service can communicate with Google's DNS server and everything on the local loop-back, but nothing else. The commands run in this interactive shell show this: first we try pinging 8.8.8.8, which happily responds. Then, we try to ping 8.8.4.4 (that's Google's other DNS server, but excluded from this white-list), and as we see it is immediately refused with an Operation not permitted error. As a last step we ping 127.0.0.2 (which is on the local loop-back), and we see it works fine again, as expected.

In the example above we used IPAddressDeny=any. The any identifier is a shortcut for writing 0.0.0.0/0 ::/0, i.e. it's a shortcut for everything, on both IPv4 and IPv6. A number of other such shortcuts exist. For example, instead of spelling out 127.0.0.0/8 we could also have used the more descriptive shortcut localhost which is expanded to 127.0.0.0/8 ::1/128, i.e. everything on the local loopback device, on both IPv4 and IPv6.
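
To make the expansion explicit, the following pairs of assignments are hence equivalent:

IPAddressDeny=any
IPAddressDeny=0.0.0.0/0 ::/0

IPAddressAllow=localhost
IPAddressAllow=127.0.0.0/8 ::1/128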

Being able to configure IP access lists individually for each unit is pretty nice already. However, typically one wants to configure this comprehensively, not just for individual units, but for a set of units in one go or even the system as a whole. In systemd, that's possible by making use of .slice units (for those who don't know systemd that well, slice units are a concept for organizing services in a hierarchical tree for the purpose of resource management): the IP access list in effect for a unit is the combination of the individual IP access lists configured for the unit itself and those of all slice units it is contained in.

By default, system services are assigned to system.slice, which in turn is a child of the root slice -.slice. Either of these two slice units is hence suitable for locking down all system services at once. If an access list is configured on system.slice it will only apply to system services; if configured on -.slice it will apply to all user processes of the system, including all user session processes (which are by default assigned to user.slice, another child of -.slice) in addition to the system services.

Let's make use of this:

# systemctl set-property system.slice IPAddressDeny=any IPAddressAllow=localhost
# systemctl set-property apache.service IPAddressAllow=10.0.0.0/8

The two commands above are a very powerful way to first turn off all IP communication for all system services (with the exception of loop-back traffic), followed by an explicit white-listing of 10.0.0.0/8 (which could refer to the local company network, you get the idea) but only for the Apache service.

Use-cases

After playing around a bit with this, let's talk about use-cases. Here are a few ideas:

  1. The IP access list logic can in many ways provide a more modern replacement for the venerable TCP Wrapper, but unlike it, it applies to all IP sockets of a service unconditionally, and requires no explicit support in any way in the service's code: no patching required. On the other hand, TCP Wrappers have a number of features this scheme cannot cover; most importantly, systemd's IP access lists operate solely on the level of IP addresses and network masks, and there is no way to configure access by DNS name (though quite frankly, that is a very dubious feature anyway, as doing networking — unsecured networking even — in order to restrict networking sounds quite questionable, at least to me).

  2. It can also replace (or augment) some facets of IP firewalling, i.e. Linux NetFilter/iptables. Right now, systemd's access lists are of course a lot more minimal than NetFilter, but they have one major benefit: they understand the service concept, and thus are a lot more context-aware than NetFilter. Classic firewalls, such as NetFilter, derive most service context from the IP port number alone, but we live in a world where IP port numbers are a lot more dynamic than they used to be. As one example, a BitTorrent client or server may use any IP port it likes for its file transfer, and writing IP firewalling rules matching that precisely is hence hard. With systemd's IP access lists, implementing this is easy: just set the list for your BitTorrent service unit, and all is good.

    Let me stress though that you should be careful when comparing NetFilter with systemd's IP address list logic, it's really like comparing apples and oranges: to start with, the IP address list logic has a clearly local focus, it only knows what a local service is and manages access of it. NetFilter on the other hand may run on border gateways, at a point where the traffic flowing through is pure IP, carrying no information about a systemd unit concept or anything like that.

  3. It's a simple way to lock down distribution/vendor supplied system services by default. For example, if you ship a service that you know never needs to access the network, then simply set IPAddressDeny=any (possibly combined with IPAddressAllow=localhost) for it, and it will live in a very tight networking sand-box it cannot escape from. systemd itself makes use of this for a number of its services by default now. For example, the logging service systemd-journald.service, the login manager systemd-logind or the core-dump processing unit systemd-coredump@.service all have such a rule set out-of-the-box, because we know that none of these services should be able to access the network, under any circumstances.
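
    As a sketch, such a lock-down could be shipped as a simple drop-in (the unit name here is of course hypothetical):

    # /etc/systemd/system/foobard.service.d/lockdown.conf
    [Service]
    IPAddressDeny=any
    IPAddressAllow=localhost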

  4. Because the IP access list logic can be combined with transient units, it can be used to quickly and effectively sandbox arbitrary commands, and even include them in shell pipelines and such. For example, let's say we don't trust our curl implementation (maybe it got modified locally by a hacker, and phones home?), but want to use it anyway to download the slides of my most recent casync talk in order to print them, but want to make sure it doesn't connect anywhere except where we tell it to (and to make this even more fun, let's minimize privileges further, by setting DynamicUser=yes):

    # systemd-resolve 0pointer.de
    0pointer.de: 85.214.157.71
                 2a01:238:43ed:c300:10c3:bcf3:3266:da74
    -- Information acquired via protocol DNS in 2.8ms.
    -- Data is authenticated: no
    # systemd-run --pipe -p IPAddressDeny=any \
                         -p IPAddressAllow=85.214.157.71 \
                         -p IPAddressAllow=2a01:238:43ed:c300:10c3:bcf3:3266:da74 \
                         -p DynamicUser=yes \
                         curl http://0pointer.de/public/casync-kinvolk2017.pdf | lp
    

So much about use-cases. This is by no means a comprehensive list of what you can do with it, after all both IP accounting and IP access lists are very generic concepts. But I do hope the above inspires your imagination.

What does that mean for packagers?

IP accounting and IP access control are primarily concepts for the local administrator. However, as suggested above, it's a very good idea to ship services that by design have no network-facing functionality with an access list of IPAddressDeny=any (and possibly IPAddressAllow=localhost), in order to improve the out-of-the-box security of our systems.

An option for security-minded distributions might be a more radical approach: ship the system with -.slice or system.slice configured to IPAddressDeny=any by default, and ask the administrator to punch holes into that for each network facing service with systemctl set-property … IPAddressAllow=…. But of course, that's only an option for distributions willing to break compatibility with what was before.

Notes

A couple of additional notes:

  1. IP accounting and access lists may be mixed with socket activation. In this case, it's a good idea to configure access lists and accounting for both the socket unit that activates and the service unit that is activated, as both units maintain fully separate settings. Note that IP accounting and access lists configured on the socket unit apply to all sockets created on behalf of that unit, and even if these sockets are passed on to the activated services, they will still remain in effect and belong to the socket unit. This also means that IP traffic done on such sockets will be accounted to the socket unit, not the service unit. The fact that IP access lists are maintained separately for the kernel sockets created on behalf of the socket unit and for the kernel sockets created by the service code itself enables some interesting uses. For example, it's possible to set a relatively open access list on the socket unit, but a very restrictive access list on the service unit, thus making the sockets configured through the socket unit the only way in and out of the service.
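
    Here's a minimal sketch of that pattern, with hypothetical unit names, port and addresses: the socket unit admits the local company network, while the service unit itself admits nothing, making the activated sockets the only way in and out:

    # frontend.socket
    [Socket]
    ListenStream=6000
    IPAddressDeny=any
    IPAddressAllow=10.0.0.0/8

    # frontend.service
    [Service]
    ExecStart=/usr/bin/frontendd
    IPAddressDeny=any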

  2. systemd's IP accounting and access lists apply to IP sockets only, not to sockets of any other address families. That also means that AF_PACKET (i.e. raw) sockets are not covered. This means it's a good idea to combine IP access lists with RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 in order to lock this down.
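
    In unit file terms, such a combined lock-down might look like this:

    [Service]
    IPAddressDeny=any
    IPAddressAllow=localhost
    RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6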

  3. You may wonder if the per-unit resource log message and systemd-run --wait may also show you details about other types of resources consumed by a service. The answer is yes: if you turn on CPUAccounting= for a service, you'll also see a summary of consumed CPU time in the log message and the command output. And we are planning to hook up IOAccounting= the same way too, soon.

  4. Note that IP accounting and access lists aren't entirely free. systemd inserts an eBPF program into the IP pipeline to make this functionality work. However, eBPF execution has been optimized for speed in the last kernel versions already, and given that it currently is in the focus of interest to many I'd expect it to be optimized even further, so that the cost for enabling these features will be negligible, if it isn't already.

  5. IP accounting is currently not recursive. That means you cannot use a slice unit to join the accounting of multiple units into one. This is something we definitely want to add, but requires some more kernel work first.

  6. You might wonder how the PrivateNetwork= setting relates to IPAddressDeny=any. Superficially they have similar effects: they make the network unavailable to services. However, looking more closely there are a number of differences. PrivateNetwork= is implemented using Linux network name-spaces. As such it entirely detaches all networking of a service from the host, including non-IP networking. It does so by creating a private little environment the service lives in where communication with itself is still allowed though. In addition, using the JoinsNamespaceOf= dependency additional services may be added to the same environment, thus permitting communication with each other but not with anything outside of this group. IPAddressAllow= and IPAddressDeny= are much less invasive. First of all they apply to IP networking only, and can match against specific IP addresses. A service running with PrivateNetwork= turned off but IPAddressDeny=any turned on may enumerate the network interfaces and their configured IP addresses even though it cannot actually do any IP communication. On the other hand if you turn on PrivateNetwork= all network interfaces besides lo disappear. Long story short: depending on your use-case one, the other, both or neither might be suitable for sand-boxing of your service. If possible I'd always turn on both, for best security, and that's what we do for all of systemd's own long-running services.
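
    For a service that should do no networking whatsoever, turning on both hence boils down to two extra lines in its unit file (a minimal sketch):

    [Service]
    PrivateNetwork=yes
    IPAddressDeny=any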

And that's all for now. Have fun with per-unit IP accounting and access lists!

Dynamic Users with systemd

Posted by Lennart Poettering on October 05, 2017 10:00 PM

TL;DR: you may now configure systemd to dynamically allocate a UNIX user ID for service processes when it starts them and release it when it stops them. It's pretty secure, mixes well with transient services, socket activated services and service templating.

Today we released systemd 235. Among other improvements this greatly extends the dynamic user logic of systemd. Dynamic users are a powerful but little known concept, supported in its basic form since systemd 232. With this blog story I hope to make it a bit better known.

The UNIX user concept is the most basic and well-understood security concept in POSIX operating systems. It is UNIX/POSIX' primary security concept, the one everybody can agree on, and most security concepts that came after it (such as process capabilities, SELinux and other MACs, user name-spaces, …) in some form or another build on it, extend it or at least interface with it. If you build a Linux kernel with all security features turned off, the user concept is pretty much the one you'll still retain.

Originally, the user concept was introduced to make multi-user systems a reality, i.e. systems enabling multiple human users to share the same system at the same time, cleanly separating their resources and protecting them from each other. The majority of today's UNIX systems don't really use the user concept like that anymore though. Most of today's systems probably have only one actual human user (or even less!), but their user databases (/etc/passwd) list a good number more entries than that. Today, the majority of UNIX users in most environments are system users, i.e. users that are not the technical representation of a human sitting in front of a PC anymore, but the security identity a system service — an executable program — runs as. Even though traditional, simultaneous multi-user systems slowly became less relevant, their ground-breaking basic concept became the cornerstone of UNIX security. The OS is nowadays partitioned into isolated services — and each service runs as its own system user, and thus within its own, minimal security context.

The people behind the Android OS realized the relevance of the UNIX user concept as the primary security concept on UNIX, and took its use even further: on Android not only system services take benefit of the UNIX user concept, but each UI app gets its own, individual user identity too — thus neatly separating app resources from each other, and protecting app processes from each other, too.

Back in the more traditional Linux world things are a bit less advanced in this area. Even though users are the quintessential UNIX security concept, allocation and management of system users is still a pretty limited, raw and static affair. In most cases, RPM or DEB package installation scripts allocate a fixed number of (usually one) system users when you install the package of a service that wants to take benefit of the user concept, and from that point on the system user remains allocated on the system and is never deallocated again, even if the package is later removed again. Most Linux distributions limit the number of system users to 1000 (which isn't particularly a lot). Allocating a system user is hence expensive: the number of available users is limited, and there's no defined way to dispose of them after use. If you make use of system users too liberally, you are very likely to run out of them sooner rather than later.

You may wonder why system users are generally not deallocated when the package that registered them is uninstalled from a system (at least on most distributions). The reason for that is one relevant property of the user concept (you might even want to call this a design flaw): user IDs are sticky to files (and other objects such as IPC objects). If a service running as a specific system user creates a file at some location, and is then terminated and its package and user removed, then the created file still belongs to the numeric ID ("UID") the system user originally got assigned. When the next system user is allocated and — due to ID recycling — happens to get assigned the same numeric ID, then it will also gain access to the file, and that's generally considered a problem, given that the file belonged to a potentially very different service once upon a time, and likely should not be readable or changeable by anything coming after it. Distributions hence tend to avoid UID recycling which means system users remain registered forever on a system after they have been allocated once.

The above is a description of the status quo ante. Let's now focus on what systemd's dynamic user concept brings to the table, to improve the situation.

Introducing Dynamic Users

With systemd dynamic users we hope to make it easier and cheaper to allocate system users on-the-fly, thus substantially increasing the possible uses of this core UNIX security concept.

If you write a systemd service unit file, you may enable the dynamic user logic for it by setting the DynamicUser= option in its [Service] section to yes. If you do, a system user is dynamically allocated the instant the service binary is invoked, and released again when the service terminates. The user is automatically allocated from the UID range 61184–65519, by looking for a so far unused UID.

Now you may wonder, how does this concept deal with the sticky user issue discussed above? In order to counter the problem, two strategies easily come to mind:

  1. Prohibit the service from creating any files/directories or IPC objects

  2. Automatically remove the files/directories or IPC objects the service created when it shuts down.

In systemd we implemented both strategies, but for different parts of the execution environment. Specifically:

  1. Setting DynamicUser=yes implies ProtectSystem=strict and ProtectHome=read-only. These sand-boxing options turn off write access to pretty much the whole OS directory tree, with a few relevant exceptions, such as the API file systems /proc, /sys and so on, as well as /tmp and /var/tmp. (BTW: setting these two options on your regular services that do not use DynamicUser= is a good idea too, as it drastically reduces the exposure of the system to exploited services.)

  2. Setting DynamicUser=yes implies PrivateTmp=yes. This option sets up /tmp and /var/tmp for the service in a way that it gets its own, disconnected version of these directories, that are not shared by other services, and whose life-cycle is bound to the service's own life-cycle. Thus if the service goes down, the user is removed and all its temporary files and directories with it. (BTW: as above, consider setting this option for your regular services that do not use DynamicUser= too, it's a great way to lock things down security-wise.)

  3. Setting DynamicUser=yes implies RemoveIPC=yes. This option ensures that when the service goes down all SysV and POSIX IPC objects (shared memory, message queues, semaphores) owned by the service's user are removed. Thus, the life-cycle of the IPC objects is bound to the life-cycle of the dynamic user and service, too. (BTW: yes, here too, consider using this in your regular services, too!)

With these four settings in effect, services with dynamic users are nicely sand-boxed. They cannot create files or directories, except in /tmp and /var/tmp, where they will be removed automatically when the service shuts down, as will any IPC objects created. Sticky ownership of files/directories and IPC objects is hence dealt with effectively.

The RuntimeDirectory= option may be used to open up the sandbox a bit to external programs. If you set it to a directory name of your choice, it will be created below /run when the service is started, and removed in its entirety when it is terminated. The ownership of the directory is assigned to the service's dynamic user. This way, a dynamic user service can expose API interfaces (AF_UNIX sockets, …) to other services at a well-defined place and again bind the life-cycle of it to the service's own run-time. Example: set RuntimeDirectory=foobar in your service, and watch how a directory /run/foobar appears at the moment you start the service, and disappears the moment you stop it again. (BTW: Much like the other settings discussed above, RuntimeDirectory= may be used outside of the DynamicUser= context too, and is a nice way to run any service with a properly owned, life-cycle-managed run-time directory.)
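
Here's what the example just mentioned looks like as a minimal unit file sketch (the daemon binary is hypothetical):

[Service]
ExecStart=/usr/bin/foobard
DynamicUser=yes
RuntimeDirectory=foobar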

Persistent Data

Of course, a service running in such an environment (although already very useful for many cases!) has a major limitation: it cannot leave persistent data around that it can reuse on a later run. As pretty much the whole OS directory tree is read-only to it, there's simply no place it could put the data that survives from one service invocation to the next.

With systemd 235 this limitation is removed: there are now three new settings, StateDirectory=, LogsDirectory= and CacheDirectory=. In many ways they operate like RuntimeDirectory=, but create sub-directories below /var/lib, /var/log and /var/cache, respectively. There's one major difference beyond that however: directories created that way are persistent, they will survive the run-time cycle of a service, and thus may be used to store data that is supposed to stay around between invocations of the service.
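
As a sketch, a unit making use of all three new settings (with a hypothetical daemon) could look like this:

[Service]
ExecStart=/usr/bin/mydaemond
DynamicUser=yes
StateDirectory=mydaemond
CacheDirectory=mydaemond
LogsDirectory=mydaemond

This gives the service persistent, properly owned directories below /var/lib, /var/cache and /var/log, all surviving from one invocation to the next.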

Of course, the obvious question to ask now is: how do these three settings deal with the sticky file ownership problem?

For that we lifted a concept from container managers. Container managers have a very similar problem: each container and the host typically end up using a very similar set of numeric UIDs, and unless user name-spacing is deployed this means that host users might be able to access the data of specific containers that also have a user by the same numeric UID assigned, even though it actually refers to a very different identity in a different context. (Actually, it's even worse than just getting access: due to the existence of setuid file bits, access might translate to privilege elevation.) The way container managers protect the container images from the host (and from each other to some level) is by placing the container trees below a boundary directory, with very restrictive access modes and ownership (0700 and root:root or so). A host user hence cannot take advantage of the files/directories of a container user of the same UID inside of a local container tree, simply because the boundary directory makes it impossible to even reference files in it. After all on UNIX, in order to get access to a specific path you need access to every single component of it.

How is that applied to dynamic user services? Let's say StateDirectory=foobar is set for a service that has DynamicUser= turned off. The instant the service is started, /var/lib/foobar is created as state directory, owned by the service's user and remains in existence when the service is stopped. If the same service now is run with DynamicUser= turned on, the implementation is slightly altered. Instead of a directory /var/lib/foobar a symbolic link by the same path is created (owned by root), pointing to /var/lib/private/foobar (the latter being owned by the service's dynamic user). The /var/lib/private directory is created as boundary directory: it's owned by root:root, and has a restrictive access mode of 0700. Both the symlink and the service's state directory will survive the service's life-cycle: the state directory remains, and continues to be owned by the now disposed dynamic UID — however it is protected from other host users (and other services which might get the same dynamic UID assigned due to UID recycling) by the boundary directory.

The obvious question to ask now is: but if the boundary directory prohibits access to the directory from unprivileged processes, how can the service itself, which runs under its own dynamic UID, access it anyway? This is achieved by invoking the service process in a slightly modified mount name-space: it will see most of the file hierarchy the same way as everything else on the system (modulo /tmp and /var/tmp as mentioned above), except for /var/lib/private, which is over-mounted with a read-only tmpfs file system instance, with a slightly more liberal access mode permitting the service read access. Inside of this tmpfs file system instance another mount is placed: a bind mount to the host's real /var/lib/private/foobar directory, onto the same name. Putting this together, this means that superficially everything looks the same and is available at the same place on the host and from inside the service, but two important changes have been made: the /var/lib/private boundary directory lost its restrictive character inside the service, and has been emptied of the state directories of any other service, thus making the protection complete. Note that the symlink /var/lib/foobar hides the fact that the boundary directory is used (making it little more than an implementation detail), as the directory is available this way under the same name as it would be if DynamicUser= was not used. Long story short: for the daemon, and viewed from the host, the indirection through /var/lib/private is mostly transparent.

This logic of course raises another question: what happens to the state directory if a dynamic user service is started with a state directory configured, gets UID X assigned on this first invocation, then terminates and is restarted and now gets UID Y assigned on the second invocation, with X ≠ Y? On the second invocation the directory — and all the files and directories below it — will still be owned by the original UID X so how could the second instance running as Y access it? Our way out is simple: systemd will recursively change the ownership of the directory and everything contained within it to UID Y before invoking the service's executable.

Of course, such recursive ownership changing (chown()ing) of whole directory trees can become expensive (though according to my experiences, IRL and for most services it's much cheaper than you might think), hence in order to optimize behavior in this regard, the allocation of dynamic UIDs has been tweaked in two ways to avoid the necessity to do this expensive operation in most cases: firstly, when a dynamic UID is allocated for a service an allocation loop is employed that starts out with a UID hashed from the service's name. This means a service by the same name is likely to always use the same numeric UID. That means that a stable service name translates into a stable dynamic UID, and that means recursive file ownership adjustments can be skipped (of course, after validation). Secondly, if the configured state directory already exists, and is owned by a suitable currently unused dynamic UID, it's preferably used above everything else, thus maximizing the chance we can avoid the chown()ing. (That all said, ultimately we have to face it, the currently available UID space of 4K+ is very small still, and conflicts are pretty likely sooner or later, thus a chown()ing has to be expected every now and then when this feature is used extensively).

Note that CacheDirectory= and LogsDirectory= work very similarly to StateDirectory=. The only difference is that they manage directories below the /var/cache and /var/log directories, and their boundary directories hence are /var/cache/private and /var/log/private, respectively.

Examples

So, after all this introduction, let's have a look how this all can be put together. Here's a trivial example:

# cat > /etc/systemd/system/dynamic-user-test.service <<EOF
[Service]
ExecStart=/usr/bin/sleep 4711
DynamicUser=yes
EOF
# systemctl daemon-reload
# systemctl start dynamic-user-test
# systemctl status dynamic-user-test
● dynamic-user-test.service
   Loaded: loaded (/etc/systemd/system/dynamic-user-test.service; static; vendor preset: disabled)
   Active: active (running) since Fri 2017-10-06 13:12:25 CEST; 3s ago
 Main PID: 2967 (sleep)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/dynamic-user-test.service
           └─2967 /usr/bin/sleep 4711

Okt 06 13:12:25 sigma systemd[1]: Started dynamic-user-test.service.
# ps -e -o pid,comm,user | grep 2967
 2967 sleep           dynamic-user-test
# id dynamic-user-test
uid=64642(dynamic-user-test) gid=64642(dynamic-user-test) groups=64642(dynamic-user-test)
# systemctl stop dynamic-user-test
# id dynamic-user-test
id: ‘dynamic-user-test’: no such user

In this example, we create a unit file with DynamicUser= turned on, start it, check if it's running correctly, have a look at the service process' user (which is named like the service; systemd does this automatically if the service name is suitable as user name, and you didn't configure any user name to use explicitly), stop the service and verify that the user ceased to exist too.

That's already pretty cool. Let's step it up a notch, by doing the same in an interactive transient service (for those who don't know systemd well: a transient service is a service that is defined and started dynamically at run-time, for example via the systemd-run command from the shell. Think: run a service without having to write a unit file first):

# systemd-run --pty --property=DynamicUser=yes --property=StateDirectory=wuff /bin/sh
Running as unit: run-u15750.service
Press ^] three times within 1s to disconnect TTY.
sh-4.4$ id
uid=63122(run-u15750) gid=63122(run-u15750) groups=63122(run-u15750) context=system_u:system_r:initrc_t:s0
sh-4.4$ ls -al /var/lib/private/
total 0
drwxr-xr-x. 3 root       root        60  6. Okt 13:21 .
drwxr-xr-x. 1 root       root       852  6. Okt 13:21 ..
drwxr-xr-x. 1 run-u15750 run-u15750   8  6. Okt 13:22 wuff
sh-4.4$ ls -ld /var/lib/wuff
lrwxrwxrwx. 1 root root 12  6. Okt 13:21 /var/lib/wuff -> private/wuff
sh-4.4$ ls -ld /var/lib/wuff/
drwxr-xr-x. 1 run-u15750 run-u15750 0  6. Okt 13:21 /var/lib/wuff/
sh-4.4$ echo hello > /var/lib/wuff/test
sh-4.4$ exit
exit
# id run-u15750
id: ‘run-u15750’: no such user
# ls -al /var/lib/private
total 0
drwx------. 1 root  root   66  6. Okt 13:21 .
drwxr-xr-x. 1 root  root  852  6. Okt 13:21 ..
drwxr-xr-x. 1 63122 63122   8  6. Okt 13:22 wuff
# ls -ld /var/lib/wuff
lrwxrwxrwx. 1 root root 12  6. Okt 13:21 /var/lib/wuff -> private/wuff
# ls -ld /var/lib/wuff/
drwxr-xr-x. 1 63122 63122 8  6. Okt 13:22 /var/lib/wuff/
# cat /var/lib/wuff/test
hello

The above invokes an interactive shell as transient service run-u15750.service (systemd-run picked that name automatically, since we didn't specify anything explicitly) with a dynamic user whose name is derived automatically from the service name. Because StateDirectory=wuff is used, a persistent state directory for the service is made available as /var/lib/wuff. In the interactive shell running inside the service, the ls commands show the /var/lib/private boundary directory and its contents, as well as the symlink that is placed for the service. Finally, before exiting the shell, a file is created in the state directory. Back in the original command shell we check if the user is still allocated: it is not, of course, since the service ceased to exist when we exited the shell and with it the dynamic user associated with it. From the host we check the state directory of the service, with similar commands as we did from inside of it. We see that things are set up pretty much the same way in both cases, except for two things: first of all the user/group of the files is now shown as raw numeric UIDs instead of the user/group names derived from the unit name. That's because the user ceased to exist at this point, and "ls" shows the raw UID for files owned by users that don't exist. Secondly, the access mode of the boundary directory is different: when we look at it from outside of the service it is not readable by anyone but root, while when we looked at it from inside we saw it being world-readable.

Now, let's see how things look if we start another transient service, reusing the state directory from the first invocation:

# systemd-run --pty --property=DynamicUser=yes --property=StateDirectory=wuff /bin/sh
Running as unit: run-u16087.service
Press ^] three times within 1s to disconnect TTY.
sh-4.4$ cat /var/lib/wuff/test
hello
sh-4.4$ ls -al /var/lib/wuff/
total 4
drwxr-xr-x. 1 run-u16087 run-u16087  8  6. Okt 13:22 .
drwxr-xr-x. 3 root       root       60  6. Okt 15:42 ..
-rw-r--r--. 1 run-u16087 run-u16087  6  6. Okt 13:22 test
sh-4.4$ id
uid=63122(run-u16087) gid=63122(run-u16087) groups=63122(run-u16087) context=system_u:system_r:initrc_t:s0
sh-4.4$ exit
exit

Here, systemd-run picked a different auto-generated unit name, but the used dynamic UID is still the same, as it was read from the pre-existing state directory, and was otherwise unused. As we can see the test file we generated earlier is accessible and still contains the data we left in there. Do note that the user name is different this time (as it is derived from the unit name, which is different), but the UID it is assigned to is the same one as on the first invocation. We can thus see that the mentioned optimization of the UID allocation logic (i.e. that we start the allocation loop from the UID owner of any existing state directory) took effect, so that no recursive chown()ing was required.

And that's the end of our example, which hopefully illustrated a bit how this concept and implementation works.

Use-cases

Now that we had a look at how to enable this logic for a unit and how it is implemented, let's discuss where this actually could be useful in real life.

  • One major benefit of dynamic user IDs is that running a privilege-separated service leaves no artifacts in the system. A system user is allocated and made use of, but it is discarded automatically in a safe and secure way after use, in a fashion that is safe for later recycling. Thus, quickly invoking a short-lived service for processing some job can be protected properly through a user ID without having to pre-allocate it and without this draining the available UID pool any longer than necessary.

  • In many cases, starting a service no longer requires package-specific preparation. Or in other words, quite often useradd/mkdir/chown/chmod invocations in "post-inst" package scripts, as well as sysusers.d and tmpfiles.d drop-ins become unnecessary, as the DynamicUser= and StateDirectory=/CacheDirectory=/LogsDirectory= logic can do the necessary work automatically, on-demand and with a well-defined life-cycle.

  • By combining dynamic user IDs with the transient unit concept, new creative ways of sand-boxing are made available. For example, let's say you don't trust the correct implementation of the sort command. You can now lock it into a simple, robust, dynamic UID sandbox with a simple systemd-run and still integrate it into a shell pipeline like any other command. Here's an example, showcasing a shell pipeline whose middle element runs as an on-the-fly allocated UID, that is released when the pipeline ends.

    # cat some-file.txt | systemd-run --pipe --property=DynamicUser=1 sort -u | grep -i foobar > some-other-file.txt
    
  • By combining dynamic user IDs with the systemd templating logic it is now possible to do much more fine-grained and fully automatic UID management. For example, let's say you have a template unit file /etc/systemd/system/foobard@.service:

    [Service]
    ExecStart=/usr/bin/myfoobarserviced
    DynamicUser=1
    StateDirectory=foobar/%i
    

    Now, let's say you want to start one instance of this service for each of your customers. All you need to do now for that is:

    # systemctl enable foobard@customerxyz.service --now
    

    And you are done. (Invoke this as many times as you like, each time replacing customerxyz by some customer identifier, you get the idea.)

  • By combining dynamic user IDs with socket activation you may easily implement a system where each incoming connection is served by a process instance running as a different, fresh, newly allocated UID within its own sandbox. Here's an example waldo.socket:

    [Socket]
    ListenStream=2048
    Accept=yes
    

    With a matching waldo@.service:

    [Service]
    ExecStart=-/usr/bin/myservicebinary
    DynamicUser=yes
    

    With the two unit files above, systemd will listen on TCP/IP port 2048, and for each incoming connection invoke a fresh instance of waldo@.service, each time utilizing a different, new, dynamically allocated UID, neatly isolated from any other instance.

  • Dynamic user IDs combine very well with state-less systems, i.e. systems that come up with an unpopulated /etc and /var. A service using dynamic user IDs and the StateDirectory=, CacheDirectory=, LogsDirectory= and RuntimeDirectory= concepts will implicitly allocate the users and directories it needs for running, right at the moment where it needs it.

Dynamic users are a very generic concept, hence a multitude of other uses are thinkable; the list above is just supposed to trigger your imagination.

What does this mean for you as a packager?

I am pretty sure that a large number of services shipped with today's distributions could benefit from using DynamicUser= and StateDirectory= (and related settings). It often allows removal of post-inst packaging scripts altogether, as well as any sysusers.d and tmpfiles.d drop-ins by unifying the needed declarations in the unit file itself. Hence, as a packager please consider switching your unit files over. That said, there are a number of conditions where DynamicUser= and StateDirectory= (and friends) cannot or should not be used. To name a few:

  1. Services that need to write to files outside of /run/<package>, /var/lib/<package>, /var/cache/<package>, /var/log/<package>, /var/tmp, /tmp, /dev/shm are generally incompatible with this scheme. This rules out daemons that upgrade the system as one example, as that involves writing to /usr.

  2. Services that maintain a herd of processes with different user IDs. Some SMTP services are like this. If your service has such a super-server design, UID management needs to be done by the super-server itself, which rules out systemd doing its dynamic UID magic for it.

  3. Services which run as root (obviously…) or are otherwise privileged.

  4. Services that need to live in the same mount name-space as the host system (for example, because they want to establish mount points visible system-wide). As mentioned DynamicUser= implies ProtectSystem=, PrivateTmp= and related options, which all require the service to run in its own mount name-space.

  5. Your focus is older distributions, i.e. distributions that do not have systemd 232 (for DynamicUser=) or systemd 235 (for StateDirectory= and friends) yet.

  6. If your distribution's packaging guides don't allow it. Consult your packaging guides, and possibly start a discussion on your distribution's mailing list about this.

Notes

A couple of additional, random notes about the implementation and use of these features:

  1. Do note that allocating or deallocating a dynamic user leaves /etc/passwd untouched. A dynamic user is added into the user database through the glibc NSS module nss-systemd, and this information never hits the disk.

  2. On traditional UNIX systems it was the job of the daemon process itself to drop privileges, while the DynamicUser= concept is designed around the service manager (i.e. systemd) being responsible for that. That said, since v235 there's a way to marry DynamicUser= and such services which want to drop privileges on their own. For that, turn on DynamicUser= and set User= to the user name the service wants to setuid() to. This has the effect that systemd will allocate the dynamic user under the specified name when the service is started. Then, prefix the command line you specify in ExecStart= with a single ! character. If you do, the user is allocated for the service, but the daemon binary is invoked as root instead of the allocated user, under the assumption that the daemon changes its UID on its own the right way. Note that after registration the user will show up instantly in the user database, and is hence resolvable like any other by the daemon process. Example: ExecStart=!/usr/bin/mydaemond
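
    As a unit file sketch (the daemon path is hypothetical), the whole arrangement looks like this:

    [Service]
    DynamicUser=yes
    User=mydaemon
    # systemd allocates the dynamic user "mydaemon", but invokes the binary as root;
    # the daemon is expected to setuid() to "mydaemon" on its own
    ExecStart=!/usr/bin/mydaemond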

  3. You may wonder why systemd uses the UID range 61184–65519 for its dynamic user allocations (side note: in hexadecimal this reads as 0xEF00–0xFFEF). That's because distributions (specifically Fedora) tend to allocate regular users from below the 60000 range, and we don't want to step into that. We also want to stay away from 65535 and a bit around it, as some of these UIDs have special meanings (65535 is often used as special value for "invalid" or "no" UID, as it is identical to the 16bit value -1; 65534 is generally mapped to the "nobody" user, and is where some kernel subsystems map unmappable UIDs). Finally, we want to stay within the 16bit range. In a user name-spacing world each container tends to have much less than the full 32bit UID range available that Linux kernels theoretically provide. Everybody apparently can agree that a container should at least cover the 16bit range though — already to include a nobody user. (And quite frankly, I am pretty sure assigning 64K UIDs per container is nicely systematic, as the higher 16bit of the 32bit UID values this way become a container ID, while the lower 16bit become the logical UID within each container, if you still follow what I am babbling here…). And before you ask: no this range cannot be changed right now, it's compiled in. We might change that eventually however.

  4. You might wonder what happens if you already used UIDs from the 61184–65519 range on your system for other purposes. systemd should handle that mostly fine, as long as that usage is properly registered in the user database: when allocating a dynamic user we pick a UID, see if it is currently used somehow, and if yes pick a different one, until we find a free one. Whether a UID is used right now or not is checked through NSS calls. Moreover the IPC object lists are checked to see if there are any objects owned by the UID we are about to pick. This means systemd will avoid using UIDs you have assigned otherwise. Note however that this of course makes the pool of available UIDs smaller, and in the worst cases this means that allocating a dynamic user might fail because there simply are no unused UIDs in the range.

  5. If not specified otherwise the name for a dynamically allocated user is derived from the service name. Not everything that's valid in a service name is valid in a user-name however, and in some cases a randomized name is used instead to deal with this. Often it makes sense to pick the user names to register explicitly. For that use User= and choose whatever you like.

  6. If you pick a user name with User= and combine it with DynamicUser= and the user already exists statically it will be used for the service and the dynamic user logic is automatically disabled. This permits automatic up- and downgrades between static and dynamic UIDs. For example, it provides a nice way to move a system from static to dynamic UIDs in a compatible way: as long as you select the same User= value before and after switching DynamicUser= on, the service will continue to use the statically allocated user if it exists, and only operates in the dynamic mode if it does not. This is useful for other cases as well, for example to adapt a service that normally would use a dynamic user to concepts that require statically assigned UIDs, for example to marry classic UID-based file system quota with such services.

  7. systemd always allocates a pair of dynamic UID and GID at the same time, with the same numeric ID.

  8. If the Linux kernel had a "shiftfs" or similar functionality, i.e. a way to mount an existing directory to a second place, but map the exposed UIDs/GIDs in some way configurable at mount time, this would be excellent for the implementation of StateDirectory= in conjunction with DynamicUser=. It would make the recursive chown()ing step unnecessary, as the host version of the state directory could simply be mounted into the service's mount name-space, with a shift applied that maps the directory's owner to the service's UID/GID. But I don't have high hopes in this regard, as all work being done in this area appears to be bound to user name-spacing — which is a concept not used here (and I guess one could say user name-spacing is probably more a source of problems than a solution to one, but you are welcome to disagree on that).

And that's all for now. Enjoy your dynamic users!

All Systems Go! 2017 Schedule Published

Posted by Lennart Poettering on September 26, 2017 10:00 PM

The All Systems Go! 2017 schedule has been published!

I am happy to announce that we have published the All Systems Go! 2017 schedule! We are very happy with the large number and the quality of the submissions we got, and the resulting schedule is exceptionally strong.

Without further ado:

Here's the schedule for the first day (Saturday, 21st of October).

And here's the schedule for the second day (Sunday, 22nd of October).

Here are a couple of keywords from the topics of the talks: 1password, azure, bluetooth, build systems, casync, cgroups, cilium, cockpit, containers, ebpf, flatpak, habitat, IoT, kubernetes, landlock, meson, OCI, rkt, rust, secureboot, skydive, systemd, testing, tor, varlink, virtualization, wifi, and more.

Our speakers are from all across the industry: Chef, CoreOS, Covalent, Facebook, Google, Intel, Kinvolk, Microsoft, Mozilla, Pantheon, Pengutronix, Red Hat, SUSE and more.

For further information about All Systems Go! visit our conference web site.

Make sure to buy your ticket for All Systems Go! 2017 now! A limited number of tickets are left at this point, so make sure you get yours before we are all sold out! Find all details here.

See you in Berlin!

libinput and the HUION PenTablet devices

Posted by Peter Hutterer on September 21, 2017 04:52 AM

HUION PenTablet devices are graphics tablet devices aimed at artists. These tablets tend to aim for the lower end of the market; driver support is often somewhere between meh and disappointing. The DIGImend project used to take care of them, but with that out of the picture, the bugs bubble up to userspace more often.

The most common bug at the moment is a lack of proximity events. On pen devices like graphics tablets, we expect a BTN_TOOL_PEN event whenever the pen goes in or out of the detectable range of the tablet ('proximity'). On most devices, proximity does not imply touching the surface (that's BTN_TOUCH or a pressure-based threshold); on anything that's not built into a screen, proximity without touching the surface is required to position the cursor correctly. libinput relies on proximity events to provide the correct tool state, which again is relied upon by compositors and clients.

The broken HUION devices only send BTN_TOOL_PEN once whenever the pen first goes into proximity and then never again until the device is disconnected. To make things more fun, HUION re-uses USB ids, so we cannot even reliably detect the broken devices and do the usual approach to hardware-quirking. So far, libinput support for HUION devices has thus been spotty. The good news is that libinput git master (and thus libinput 1.9) will have a fix for this. The one thing we can rely on is that tablets keep sending events at the device's scanout frequency. So in libinput we now add a timeout to these tablets and assume proximity-out has happened once it expires. libinput fakes a proximity out event and waits for the next event from the tablet - at which point we'll fake a proximity in before processing the events. This is enabled on all HUION devices now (re-using USB IDs, remember?) but not on any other device.

One down, many more broken devices to go. Yay.

Bluetooth on Fedora: joypads and (more) security

Posted by Bastien Nocera on September 20, 2017 01:31 PM
It's been a while since I posted about Fedora-specific Bluetooth enhancements, and even longer since I posted about PlayStation controller support.

Let's start with the nice feature.

Dual-Shock 3 and 4 support

We've had support for Dual-Shock 3 (aka Sixaxis, aka PlayStation 3 controllers) for a long while, but I've added a long-standing patchset to the Fedora packages that changes the way devices are set up.

The old way was: plug in your joypad via USB, disconnect it, and press the "P" button on the pad. At this point, and since GNOME 3.12, you would have needed the Bluetooth Settings panel opened for a question to pop up about whether the joypad can connect.

This is broken in a number of ways. If you were trying to just charge the joypad, then it would forget its original "console" and you would need to plug it in again. If you didn't have the Bluetooth panel opened when trying to use it wirelessly, then it just wouldn't have worked.

Setup is now simpler. Open the Bluetooth panel, plug in your device, and answer the question. You just want to charge it? Dismiss the query, or simply don't open the Bluetooth panel, it'll work dandily and won't overwrite the joypad's settings.

And finally, we also made sure that it works with PlayStation 4 controllers.

Note that the PlayStation 4 controller has a button combination that allows it to be visible and pairable, except that if the device trying to connect with it doesn't behave in a particular way (probably the same way the 25€ RRP USB adapter does), it just wouldn't work. And it didn't work for me on a number of different devices.

Cable pairing for the win!

And the boring stuff

Hey, do you know what happened last week? There was a security problem in a package that I glance at sideways sometimes! Yes. Again.

A good way to minimise the damage caused by problems like this one is to lock the program down. In much the same way that you'd want to restrict thumbnailers, or even end-user applications, we can forbid certain functionality from being available when launched via systemd.

We've finally done this in recent fprintd and iio-sensor-proxy upstream releases, as well as for bluez in Fedora Rawhide. If testing goes well, we will integrate this in Fedora 27.

Fun with fonts

Posted by Matthias Clasen on September 16, 2017 06:58 PM

I had the opportunity to spend some time in Montreal last week to meet with some lovely font designers and typophiles around the ATypI conference.

At the conference, variable fonts celebrated their first birthday. This is a new feature in OpenType 1.8 – but really, it is a very old feature, previously known under names like multiple master fonts.

The idea is simple: A single font file can provide not just the shapes for the glyphs of a single font family, but also axes along which these shapes can be varied to generate multiple variations of the underlying design. An infinite number, really. Additionally, fonts may pick out certain variations and give them a name.

A lot has to happen to realize this simple idea. If you want to get a glimpse at what is going on behind the scenes, you can look at the OpenType spec.

A while ago, Behdad and I agreed that we want to have font variations available in the Linux text rendering stack. So we used the opportunity of meeting in Montreal to work on it. It is a little involved, since there are several layers of libraries that all need to know about these features before we can show anything: freetype, harfbuzz, cairo, fontconfig, pango, gtk.

freetype and harfbuzz are more or less ready with APIs like FT_Get_MM_Var or hb_font_set_variations that let us access and control the font variations. So we concentrated on the remaining pieces.

As the conference comes to a close today, it is time to present how far we got.

Video: https://blogs.gnome.org/mclasen/files/2017/09/vf-axes.webm

This video is showing a font with several axes in the Font Features example in gtk-demo. As you can see, the font changes in real time as the axes get modified in the UI. It is worth pointing out that the minimum, maximum and default values for the axes, as well as their names, are all provided by the font.

[Video: https://blogs.gnome.org/mclasen/files/2017/09/vf-named-styles.webm]

This video is showing the named variations (called Instances here) that are provided by the font. Selecting one of them makes the font change in real time and also updates the axis sliders below.

Eventually, it would be nice if this was available in the font chooser, so users can take advantage of it without having to wait for specific support in applications.

[Video: https://blogs.gnome.org/mclasen/files/2017/09/vf-picker.webm]

This video shows a quick prototype of how that could look. With all these new font features coming in, now may be a good time to have a hackfest around improving the font chooser.

One frustrating aspect of working on advanced font features is that it is hard to know whether the fonts on your system have any of these fancy features, beyond just being a bag of glyphs. Therefore, I also spent a bit of time on making this information available in the font viewer.

And that's it!

Our patches for cairo, fontconfig, pango, gtk and gnome-font-viewer are currently under review on various mailing lists, bugs and branches, but they should make their way into releases in due course, so that you can have more fun with fonts too!

Flickerless Gtk3 OpenGL Transitions

Posted by Caolán McNamara on September 11, 2017 08:13 PM
While I got OpenGL transitions working under Gtk3 at the end of last year, basically matching the Gtk2/generic OpenGL quality, the transition into and out of the OpenGL sequence wasn't very satisfying. And with access to HiDPI it was clearly even worse, with an unscaled image momentarily appearing before the correct one.

So here's the before and after of the improvements that landed on upstream master today: just screen recordings made with the built-in ctrl+shift+alt+t shortcut under gnome3, positioned side by side and clipped roughly together in pitivi.

[Video: https://www.youtube.com/embed/Z8cFbIO3ZUU]

All Systems Go! 2017 CfP Closes Soon!

Posted by Lennart Poettering on August 29, 2017 10:00 PM

The All Systems Go! 2017 Call for Participation is closing on September 3rd!

Please make sure to get your presentation proposals for All Systems Go! 2017 in now! The CfP closes on Sunday!

In case you haven't heard about All Systems Go! yet, here's a quick reminder what kind of conference it is, and why you should attend and speak there:

All Systems Go! is an Open Source community conference focused on the projects and technologies at the foundation of modern Linux systems — specifically low-level user-space technologies. Its goal is to provide a friendly and collaborative gathering place for individuals and communities working to push these technologies forward. All Systems Go! 2017 takes place in Berlin, Germany on October 21st+22nd. All Systems Go! is a 2-day event with 2-3 talks happening in parallel. Full presentation slots are 30-45 minutes in length and lightning talk slots are 5-10 minutes.

In particular, we are looking for sessions including, but not limited to, the following topics:

  • Low-level container executors and infrastructure
  • IoT and embedded OS infrastructure
  • OS, container, IoT image delivery and updating
  • Building Linux devices and applications
  • Low-level desktop technologies
  • Networking
  • System and service management
  • Tracing and performance measuring
  • IPC and RPC systems
  • Security and Sandboxing

While our focus is definitely more on the user-space side of things, talks about kernel projects are welcome too, as long as they have a clear and direct relevance for user-space.

To submit your proposal now please visit our CFP submission web site.

For further information about All Systems Go! visit our conference web site.

systemd.conf will not take place this year, in favour of All Systems Go!. All Systems Go! welcomes all projects that contribute to Linux user space, which, of course, includes systemd. Thus, anything you think was appropriate for submission to systemd.conf is also fitting for All Systems Go!

Post-GUADEC distractions

Posted by Matthias Clasen on August 18, 2017 09:25 PM

Like everybody else, I had a great time at GUADEC this year.

One of the things that made me happy is that I could convince Behdad to come, and we had a chance to finally wrap up a story that has been going on for much too long: Support for color Emoji in the GTK+ stack and in GNOME.

Behdad has been involved in the standardization process around the various formats for color glyphs in fonts since the very beginning. In 2013, he posted some prototype work for color glyph support in cairo.

This was clearly not meant for inclusion; he was looking for assistance turning it into a mergeable patch. Unfortunately, nobody picked this up until I gave it a try in 2016. But my patch was not quite right, and things stalled again.

We finally picked it up this year. I produced a better cairo patch, which we reviewed, fixed and merged during the unconference days at GUADEC. Behdad also wrote and merged the necessary changes for fontconfig, so we can have an “emoji” font family, and made pango automatically choose that font when it finds Emoji.

After GUADEC, I worked on the input side in GTK+. As a first result, it is now possible to use Control-Shift-e to select Emoji by name or code.

[Video: https://blogs.gnome.org/mclasen/files/2017/08/c-s-e.webm]

This is a bit of an easter egg though, and only covers a few Emoji like ❤. The full list of supported names is here.

A more prominent way to enter Emoji is clearly needed, so I set out to implement the design we have for an Emoji chooser. The result looks like this:

As you can see, it supports variation selectors for skin tones, and lets you search by name. The clickable icon has to be enabled with a show-emoji-icon property on GtkEntry, but there is a context menu item that brings up the Emoji chooser, regardless.

I am reasonably happy with it, and it will be available both in GTK+ 3.92 and in GTK+ 3.22.19. We are bending the API stability rules a little bit here, to allow the new property for enabling the icon.
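
If you want to try this from application code, enabling the icon is a one-liner. Here is a minimal sketch (the application id is made up):

/* build with e.g.
 *   cc demo.c $(pkg-config --cflags --libs gtk+-3.0)
 */
#include <gtk/gtk.h>

static void
activate (GtkApplication *app, gpointer user_data)
{
  GtkWidget *window = gtk_application_window_new (app);
  GtkWidget *entry = gtk_entry_new ();

  /* the clickable Emoji icon is opt-in; the context menu item is
   * available regardless */
  g_object_set (entry, "show-emoji-icon", TRUE, NULL);

  gtk_container_add (GTK_CONTAINER (window), entry);
  gtk_widget_show_all (window);
}

int
main (int argc, char **argv)
{
  GtkApplication *app;
  int status;

  app = gtk_application_new ("org.example.EmojiTest",
                             G_APPLICATION_FLAGS_NONE);
  g_signal_connect (app, "activate", G_CALLBACK (activate), NULL);
  status = g_application_run (G_APPLICATION (app), argc, argv);
  g_object_unref (app);
  return status;
}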

Working on this dialog gave me plenty of opportunity to play with Emoji in GTK+ entries, and it became apparent that some things were not quite right. Some Emoji would sometimes just not appear. This took me quite a while to debug, since I was hunting for a rendering issue, when in the end it turned out to be insufficient support for variation selectors in pango.

Another issue that turned up was that pango sometimes placed the text caret in the middle of an Emoji, and Backspace deleted them piecemeal, one character at a time, instead of all at once. This required fixes in pango’s implementation of the Unicode segmentation rules (TR29). Thankfully, Peng Wu had already done much of the work for this; I just fixed the remaining corner cases to handle all Emoji correctly, including skin tone variations and flags.

So, what’s still missing? I’m thinking of adding optional support for completion of Emoji names like :grin: directly in the entry, like this:

[Video: https://blogs.gnome.org/mclasen/files/2017/08/emoji-completion.webm]

But this code still needs some refinement before it is ready to land. It also overlaps a bit with traditional input method functionality, and I am still pondering the best way to resolve that.

To try out color Emoji, you can either wait for GNOME 3.26, which will be released in September, or you can get:

  • cairo from git master
  • fontconfig from git master
  • pango 1.40.9 or .10
  • GTK+ from the gtk-3-22 branch
  • a suitable Emoji font, such as EmojiOne or Noto Color Emoji

It was fun to work on this, I hope you enjoy using it! ❤

Shipping PKCS7 signed metadata and firmware

Posted by Richard Hughes on August 18, 2017 04:28 PM

Over the last few days I’ve merged in the PKCS7 support into fwupd as an optional feature. I’ve done this for a few reasons:

  • Some distributors of fwupd were disabling the GPG code as it’s GPLv3, and I didn’t feel comfortable saying “just use no signatures”.
  • Trusted vendors want to ship testing versions of firmware directly to users without first uploading to the LVFS.
  • Some firmware is inherently internal use only and needs to be signed using existing cryptographic hardware.
  • The gpgme code scares me.

Did you know GPGME is a library based around screen scraping the output of the gpg2 binary? When you perform an action using the libgpgme APIs you’re literally injecting a string into a pipe and waiting for it to return. You can’t even use libgcrypt (the thing that gpg2 uses) directly as it’s way too low level and doesn’t have any sane abstractions or helpers to read or write packaged data. I don’t want to learn LISP S-Expressions (yes, really) and manually deal with packing data just to do vanilla X509 crypto.

Although the LVFS instance only signs files and metadata with GPG at the moment, I’ve added the missing bits into python-gnutls so it could become possible in the future. If this is accepted then I think it would be fine to support both GPG and PKCS7 on the server.

One of the temptations for X509 signing would be to get a certificate from an existing CA and then sign the firmware with that. From my point of view that would be bad, as it would cause any firmware signed by any certificate in my system trust store to be marked as valid, when really all I want to do is check for a specific certificate (or a few) that I know will be providing certified working firmware. Although I could achieve this to some degree with certificate pinning, it’s not so easy if there is a hierarchical trust relationship or anything more complicated than a simple 1:1 relationship.

So that this is possible, I’ve created an LVFS CA certificate, and also a server certificate for the specific instance I’m running on OpenShift. I’ve signed the instance certificate with the CA certificate and am creating detached signatures with an embedded (signed-by-the-CA) server certificate. This seems to work well, and means we can issue other certificates (or CRLs) if the server ever moves or the trust is compromised in some way.
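
For the curious, the same kind of detached PKCS7 signature can be produced with GnuTLS’s certtool. This is only a hedged sketch: it assumes a recent GnuTLS, the file names are invented, and the exact invocations used on the LVFS may well differ.

# generate a signing key; issuing the matching server certificate
# from the CA is omitted here
$ certtool --generate-privkey --outfile server.key

# create a detached PKCS7 signature for a firmware archive
$ certtool --p7-detached-sign --load-privkey server.key \
    --load-certificate server.pem --infile firmware.cab \
    --outfile firmware.cab.p7b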

So, tl;dr (this should have been at the top of this page…): if you see a /etc/pki/fwupd/LVFS-CA.pem appear on your system in the next release, you can relax. Comments, especially from crypto experts, are welcome. Thanks!

Forward only binary patching

Posted by Richard Hughes on August 10, 2017 10:37 AM

A couple of weeks ago I added some new functionality to dfu-tool, which is shipped in fwupd. The dfu-tool utility (via libdfu) now has the ability to forward-patch binary files, somewhat like bsdiff does. To do this it compares the old firmware with the new firmware, finding blocks of data that are different and storing the new content and the offset in a .dfup file. The reason for storing the new content rather than a binary diff (like bsdiff) is that you can remove non-free and non-redistributable code without actually including it in the diff file (which you might be doing if you’re neutering/removing the Intel Management Engine). This does make reversing the binary patch process impossible, but that isn’t a huge problem if we keep the old file around for downgrades.
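
To illustrate the chunking idea, here is a rough, self-contained sketch. It is my own approximation, not the actual libdfu code, which also records the image checksums and handles the DFU container format.

/* usage: ./a.out old.bin new.bin */
#include <stdio.h>
#include <stdlib.h>

static unsigned char *
read_file (const char *fn, size_t *sz)
{
  FILE *f = fopen (fn, "rb");
  unsigned char *buf;
  if (f == NULL)
    return NULL;
  fseek (f, 0, SEEK_END);
  *sz = (size_t) ftell (f);
  rewind (f);
  buf = malloc (*sz);
  if (buf == NULL || fread (buf, 1, *sz, f) != *sz)
    {
      fclose (f);
      free (buf);
      return NULL;
    }
  fclose (f);
  return buf;
}

int
main (int argc, char **argv)
{
  unsigned char *old, *new;
  size_t old_sz, new_sz, i = 0;

  if (argc < 3)
    return 1;
  old = read_file (argv[1], &old_sz);
  new = read_file (argv[2], &new_sz);
  if (old == NULL || new == NULL)
    return 1;

  printf ("binary going from: %zu to %zu\n", old_sz, new_sz);
  while (i < new_sz)
    {
      size_t start;
      /* skip over bytes that are identical in both images */
      if (i < old_sz && old[i] == new[i])
        {
          i++;
          continue;
        }
      /* collect a run of differing (or brand new) bytes; a real
       * implementation would write new + start into the .dfup file */
      start = i;
      while (i < new_sz && (i >= old_sz || old[i] != new[i]))
        i++;
      printf ("add chunk @0x%04zx (len %zu)\n", start, i - start);
    }
  free (old);
  free (new);
  return 0;
}

The real tool in action looks like this: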

$ sha1sum ~/firmware-releases/colorhug-1.1.6.bin
955386767a0108faf104f74985ccbefcd2f6050c  ~/firmware-releases/colorhug-1.1.6.bin

$ sha1sum ~/firmware-releases/colorhug-1.1.7.bin
9b7dbb24dbcae85fbbf045e7ff401fb3f57ddf31  ~/firmware-releases/colorhug-1.1.7.bin

$ dfu-tool patch-create ~/firmware-releases/colorhug-1.1.6.bin \
    ~/firmware-releases/colorhug-1.1.7.bin colorhug-1_1_6-to-1_1_7.dfup -v
Dfu-DEBUG: binary growing from: 19200 to 19712
Dfu-DEBUG: add chunk @0x0000 (len 3)
Dfu-DEBUG: add chunk @0x0058 (len 2)
Dfu-DEBUG: add chunk @0x023a (len 19142)
Dfu-DEBUG: blob size is 19231

$ dfu-tool patch-dump colorhug-1_1_6-to-1_1_7.dfup
checksum-old: 955386767a0108faf104f74985ccbefcd2f6050c
checksum-new: 9b7dbb24dbcae85fbbf045e7ff401fb3f57ddf31
chunk #00     0x0000, length 3
chunk #01     0x0058, length 2
chunk #02     0x023a, length 19142

$ dfu-tool patch-apply ~/firmware-releases/colorhug-1.1.6.bin \
    colorhug-1_1_6-to-1_1_7.dfup new.bin -v
Dfu-DEBUG: binary growing from: 19200 to 19712
Dfu-DEBUG: applying chunk 1/3 @0x0000 (length 3)
Dfu-DEBUG: applying chunk 2/3 @0x0058 (length 2)
Dfu-DEBUG: applying chunk 3/3 @0x023a (length 19142)

$ sha1sum new.bin
9b7dbb24dbcae85fbbf045e7ff401fb3f57ddf31  new.bin

Perhaps a bad example here: the compiler changed between 1.1.6 and 1.1.7, so lots of internal offsets changed, and there are no partitions inside the image; but you get the idea. For some system firmware where only a BIOS default was changed, this can reduce the size of the download from megabytes to tens of bytes; the largest thing in the .cab then becomes the XML metadata (which also compresses rather well). Of course, in this case you can also use bsdiff if it’s already installed. I’ve not yet decided if it makes sense for fwupd to runtime-require tools like bspatch, as these could be needed by the firmware builder bubblewrap functionality, or if they could just be included as statically linked binaries in the .cab file. Comments welcome.