Fedora desktop Planet

libxmlb now a dependency of fwupd and gnome-software

Posted by Richard Hughes on October 22, 2018 07:31 AM

I’ve just released libxmlb 0.1.3, and merged the branches for fwupd and gnome-software so that it becomes a hard dependency of both projects. A few people have reviewed the libxmlb code, and Mario, Kalev and Robert reviewed the fwupd and gnome-software changes, so I’m pretty confident I’ve not broken anything too important — but more testing is very welcome. GNOME Software RSS usage is about 50% of what ships in 3.30.x and fwupd is down by 65%! If you want to ship the upcoming fwupd 1.2.0 or gnome-software 3.31.2 in your distro you’ll need to have libxmlb packaged, or be happy using a meson subproject to download libxmlb during the build of each dependent project.

Initial thoughts on MongoDB's new Server Side Public License

Posted by Matthew Garrett on October 16, 2018 10:43 PM
MongoDB just announced that they were relicensing under their new Server Side Public License. This is basically the Affero GPL except with section 13 largely replaced with new text, as follows:

If you make the functionality of the Program or a modified version available to third parties as a service, you must make the Service Source Code available via network download to everyone at no charge, under the terms of this License. Making the functionality of the Program or modified version available to third parties as a service includes, without limitation, enabling third parties to interact with the functionality of the Program or modified version remotely through a computer network, offering a service the value of which entirely or primarily derives from the value of the Program or modified version, or offering a service that accomplishes for users the primary purpose of the Software or modified version.

“Service Source Code” means the Corresponding Source for the Program or the modified version, and the Corresponding Source for all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software, all such that a user could run an instance of the service using the Service Source Code you make available.


MongoDB admit that this license is not currently open source in the sense of being approved by the Open Source Initiative, but say: "We believe that the SSPL meets the standards for an open source license and are working to have it approved by the OSI."

At the broadest level, AGPL requires you to distribute the source code to the AGPLed work[1] while the SSPL requires you to distribute the source code to everything involved in providing the service. Having a license place requirements around things that aren't derived works of the covered code is unusual but not entirely unheard of - the GPL requires you to provide build scripts even if they're not strictly derived works, and you could probably make an argument that the anti-Tivoisation provisions of GPL3 fall into this category.

A stranger point is that you're required to provide all of this under the terms of the SSPL. If you have any code in your stack that can't be released under those terms then it's literally impossible for you to comply with this license. I'm not a lawyer, so I'll leave it up to them to figure out whether this means you're now only allowed to deploy MongoDB on BSD because the license would require you to relicense Linux away from the GPL. This feels sloppy rather than deliberate, but if it is deliberate then it's a massively greater reach than any existing copyleft license.

You can definitely make arguments that this is just a maximalist copyleft license, the AGPL taken to an extreme, and that it therefore fits the open source criteria. But there's a point where something is so far from the previously accepted scenarios that it's actually something different, and should be examined as a new category rather than against the already approved categories. I suspect that this license has been written to conform to a strict reading of the Open Source Definition, and that any attempt by OSI to declare it as not being open source will receive pushback. But definitions don't exist to be weaponised against the communities that they seek to protect, and a license that has overly onerous terms should be rejected even if that means changing the definition.

In general I am strongly in favour of licenses ensuring that users have the freedom to take advantage of modifications that people have made to free software, and I'm a fan of the AGPL. But my initial feeling is that this license is a deliberate attempt to make it practically impossible to take advantage of the freedoms that the license nominally grants, and this impression is strengthened by it being something that's been announced with immediate effect rather than something that's been developed with community input. I think there's a bunch of worthwhile discussion to have about whether the AGPL is strong and clear enough to achieve its goals, but I don't think that this SSPL is the answer to that - and I lean towards thinking that it's not a good faith attempt to produce a usable open source license.

(It should go without saying that this is my personal opinion as a member of the free software community, and not that of my employer)

[1] There's some complexities around GPL3 code that's incorporated into the AGPLed work, but if it's not part of the AGPLed work then it's not covered


Firefox on Wayland update

Posted by Martin Stransky on October 09, 2018 09:38 AM

As a next step in the Wayland effort we have fresh new Firefox packages [1] with all the goodies from Firefox 63/64 (Nightly) for you. They come with better (and fixed) rendering, v-sync support, and working HiDPI. Support for hi-res displays is not perfect yet and more fixes are on the way – thanks to Jan Horak who wrote those patches.

The builds also ship a PipeWire WebRTC patch for desktop sharing created by Jan Grulich and Tomas Popela. Wayland applications are isolated from the desktop and don’t have access to other windows (as they do on X11), so PipeWire supplies the missing functionality alongside the browser sandbox.

I think the rendering is generally covered now and the browser should work smoothly with the Wayland backend. That’s also why I’m making it the default on Fedora 30 (Rawhide), with the firefox-x11 package available as an X11 fallback. Fedora 29 and earlier stay with the X11 backend by default, and Wayland is provided by the firefox-wayland package.
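For reference, switching backends comes down to installing the package named above for your release (a quick sketch; exact package availability depends on your Fedora version):

$ sudo dnf install firefox-wayland   # Fedora 29 and earlier: opt in to the Wayland backend
$ sudo dnf install firefox-x11       # Fedora 30 (Rawhide): fall back to X11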

And there’s surely some work left to make Firefox perfect on Wayland – for instance correctly placing popups on GTK 3.24, updating WebRender/EGL, fixing the KDE compositor, and so on.

[1] Fedora 27 Fedora 28 Fedora 29

Flatpak, after 1.0

Posted by Matthias Clasen on October 08, 2018 05:45 PM

Flatpak 1.0 happened a while ago, and we now have a stable base. We’re up to the 1.0.3 bug-fix release at this point, and hope to get 1.0.x adopted in all major distros before too long.

Does that mean that Flatpak is done? Far from it! We have just created a stable branch and started landing some queued-up feature work on the master branch. This includes things like:

  • Better life-cycle control with ps and kill commands (see the sketch after this list)
  • Logging and history
  • File copy-paste and DND
  • A better testsuite, including coverage reports
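As a taste of the life-cycle work mentioned in the first item, the new subcommands are expected to look roughly like this (a sketch based on the description above; the exact syntax may still change before release, and org.gnome.Todo is just an example app ID):

$ flatpak ps                    # list running flatpak instances
$ flatpak kill org.gnome.Todo   # stop a running instance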

Beyond these, we have a laundry list of things we want to work on in the near future, including

  • Using host GL drivers (possibly with libcapsule)
  • Application renaming and end-of-life migration
  • A portal for dconf/gsettings (a stopgap measure until we have D-Bus container support)
  • A portal for webcam access
  • More tests!

We are also looking at improving the scalability of the flathub infrastructure. The repository has grown to more than 400 apps, and buildbot is not really meant to be used the way we use it.

What about releases?

We have not set a strict schedule, but the consensus seems to be that we are aiming for roughly quarterly releases, with more frequent devel snapshots as needed. Looking at the calendar, that would mean we should expect a stable 1.2 release around the end of the year.

Open for contribution

One of the easiest ways to help Flatpak is to get your favorite applications on flathub, either by packaging them yourself, or by convincing the upstream to do it.

If you feel like contributing to Flatpak itself, please do! Flatpak is still a young project, and there are plenty of small to medium-size features that can be added. The tests are also a nice place to stick your toe in and see if you can improve the coverage a bit and maybe find a bug or two.

Or, if that is more your thing, we have a nice design for improving the flatpak commandline user experience that is waiting to be implemented.

Announcing the first release of libxmlb

Posted by Richard Hughes on October 04, 2018 06:36 PM

Today I did the first 0.1.0 preview release of libxmlb. We’re at the “probably API stable, but no promises” stage. This is the library I introduced a couple of weeks ago, and since then I’ve been porting both fwupd and gnome-software to use it. The former is almost complete, and nearly ready to merge, but the latter is still work in progress with a fair bit of code to write. I did manage to launch gnome-software with libxmlb yesterday, and modulo a bit of brokenness it’s both faster to start (over 800ms faster from cold boot!) and uses an amazing 90Mb less RSS at runtime. I’m planning to merge the libxmlb branch into the unstable branch of fwupd in the next few weeks, so I need volunteers to package up the new hard dep for Debian, Ubuntu and Arch.

The tarball is in the usual place – it’s a simple Meson-built library that doesn’t do anything crazy. I’ve imported and built it already for Fedora, much thanks to Kalev for the super speedy package review.

I guess I should explain how applications are expected to use this library. At its core, there are just five different kinds of objects you need to care about:

  • XbSilo – a deduplicated string pool and a read-only node tree. This is typically kept alive for the application lifetime.
  • XbNode – a “GObject-wrapped” immutable node available as a query result from XbSilo.
  • XbBuilder – a “compiler” to build the XbSilo from XbBuilderNodes or XbBuilderSources. This is typically created and destroyed at startup or when the blob needs regenerating.
  • XbBuilderNode – a mutable node that can have a parent, children, attributes and a value.
  • XbBuilderSource – a source of data for XbBuilder, e.g. a .xml.gz file or just a raw XML string.

The way most applications will use libxmlb is to create a local XbBuilder instance, add some XbBuilderSources and then “ensure” a local cache file. The “ensure” process either mmap()s the binary blob if all the file mtimes are unchanged, or compiles a blob and saves it to a new file. You can also tell the XbSilo to watch all the sources that it was built with, so that if any files change at runtime the valid property gets set to FALSE and the application can xb_builder_ensure() at a convenient time.

Once the XbBuilder has been compiled, a XbSilo pops out. With the XbSilo you can query using most common XPath expressions – I actually ended up implementing a FORTH-style stack interpreter so we can now do queries like /components/component/id[contains(upper-case(text()),'GIMP')] – I’ll say a bit more on that in a minute. Queries can limit the number of results (for speed), and are deduplicated in a sane way, so it’s really quite a simple process to achieve something that would otherwise be a lot of C code. It’s possible to directly query an attribute or text value from a node, so the silo doesn’t have to be passed around either.
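As a concrete illustration, the GIMP query above can be run against a compiled blob with the xb-tool helper (a sketch assuming an appstream.xmlb file built as shown in the earlier “Speeding up AppStream” post below):

$ xb-tool query appstream.xmlb "/components/component/id[contains(upper-case(text()),'GIMP')]"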

In the process of porting gnome-software, I had to make libxmlb thread-safe – which required some internal reorganisation. We now have a non-exported XbMachine stack interpreter, and then the XbSilo actually registers the XML-specific methods (like contains()) and functions (like ~=). These get passed some per-method user data, and also some per-query private data that is shared with the node tree – allowing things like [last()] and position()=3 to work. The function callbacks just get passed a query-specific stack, which means you can allow things like comparing “1” to 1.00f. This makes it easy to support more of XPath in the future, or to support something completely application-specific like gnome-software-search() without editing the library.

If anyone wants to do any API or code review I’d be super happy to answer any questions. Coverity and valgrind seem happy enough with all the self tests, but that’s no replacement for a human asking tricky questions. Thanks!

Speeding up AppStream: mmap’ing XML using libxmlb

Posted by Richard Hughes on September 20, 2018 09:27 AM

AppStream and the related AppData are XML formats that have been adopted by thousands of upstream projects and are being used in about a dozen different client programs. The AppStream metadata shipped in Fedora is currently a huge 13Mb XML file, which with gzip compresses down to a more reasonable 3.6Mb. AppStream is awesome; it provides translations of lots of useful data into basically all languages and includes screenshots for almost everything. GNOME Software is built around AppStream, and we even use a slightly extended version of the same XML format to ship firmware update metadata from the LVFS to fwupd.

XML does have two giant weaknesses. The first is that you have to decompress and then parse the files – which might include all the ~300 tiny AppData files as well as the distro-provided AppStream files, if you want to list installed applications not provided by the distro. Seeking lots of small files isn’t so slow on an SSD, and loading+decompressing a small file is actually quicker than loading an uncompressed larger file. The second is that parsing an XML file typically means you set up some callbacks, which then get called for every start tag, text section, then end tag – so for a 13Mb XML document that’s nested very deeply you have to do a lot of callbacks. This means you have to process the description of GIMP in every language before you can even see if Shotwell exists at all.

The typical way of parsing XML involves creating a “node tree” while parsing. This lets you treat the XML document as a Document Object Model (DOM), which allows you to navigate the tree and parse the contents in an object-oriented way. This means you typically allocate the nodes themselves on the heap, plus copies of all the string data. AsNode in libappstream-glib has a few tricks to reduce RSS usage after parsing, which include:

  • Interning common element names like description, p, ul, li
  • Freeing all the nodes, but retaining all the node data
  • Ignoring node data for languages you don’t understand
  • Reference counting the strings from the nodes into the various appstream-glib GObjects

This still has two drawbacks: we need to store in hot memory all the screenshot URLs of all the apps you’re never going to search for, and we also need to parse all the long translated description data just to find out if gimp.desktop is actually installable. Deduplicating strings at runtime takes nontrivial amounts of CPU and means we build a huge hash table that uses nearly as much RSS as we save by deduplicating.

On a modern system, parsing ~300 files takes less than a second, and the total RSS is only a few tens of Mb – which is fine, right? Except on resource-constrained machines it takes 20+ seconds to start, and 40Mb is nearly 10% of the total memory available on the system. We have exactly the same problem with fwupd, where we get one giant file from the LVFS, all of which gets stored in RSS even though you never have the hardware that it matches against. Slow starting of fwupd and gnome-software is one of the reasons they stay resident, and don’t shut down on idle and restart when required.

We can do better.

We do need to keep the source format, but that doesn’t mean we can’t create a managed cache to do some clever things. Traditionally I’ve been quite vocal against squashing structured XML data into databases like sqlite and Xapian as it’s like pushing a square peg into a round hole, and forces you to think like a database doing 10 level nested joins to query some simple thing. What we want to use is something like XPath, where you can query data using the XML structure itself.

We also want to be able to navigate the XML document as if it were a DOM, i.e. be able to jump from one node to its sibling without parsing all the great-great-great-grandchild nodes to get there. This means storing the offset to the sibling in a binary file.

If we’re creating a cache, we might as well do the string deduplication at creation time once, rather than every time we load the data. This has the added benefit that we’re converting the string data from variable-length strings that you compare using strcmp() to quarks that you can compare just by checking two integers. This is much faster, as any SAT solver will tell you. If we’re storing a string table, we can also store the NUL byte. This seems wasteful at first, but has one huge advantage – you can mmap() the string table. In fact, you can mmap the entire cache. If you order the string table in a sensible way then you store all the related data in one block (e.g. the <id> values) so that you don’t jump all over the cache invalidating almost everything just for a common query. mmap’ing the strings means you can avoid strdup()ing every string just in case; under memory pressure the kernel automatically reclaims the memory, and the next time it’s needed it is automatically loaded from disk again. It’s almost magic.

I’ve spent the last few days prototyping a library, which is called libxmlb until someone comes up with a better name. I’ve got a test branch of fwupd that I’ve ported from libappstream-glib and I’m happy to say that RSS has reduced from 3Mb (peak 3.61Mb) to 1Mb (peak 1.07Mb) and the startup time has gone from 280ms to 250ms. Unless I’ve missed something drastic I’m going to port gnome-software too, and will expect even bigger savings as the amount of XML is two orders of magnitude larger.

So, how do you use this thing? First, let’s create a baseline doing things the old way:

$ time appstream-util search gimp.desktop
real	0m0.645s
user	0m0.800s
sys	0m0.184s

To create a binary cache:

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.497s
user	0m0.453s
sys	0m0.028s

$ time xb-tool compile appstream.xmlb /usr/share/app-info/xmls/* /usr/share/appdata/* /usr/share/metainfo/*
real	0m0.016s
user	0m0.004s
sys	0m0.006s

Notice the second time it compiled nearly instantly, as none of the filename or modification timestamps of the sources changed. This is exactly what programs would do every time they are launched.

$ df -h appstream.xmlb
4.2M	appstream.xmlb

$ time xb-tool query appstream.xmlb "components/component[@type='desktop']/id[text()='firefox.desktop']"
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
RESULT: <id>firefox.desktop</id>
real	0m0.008s
user	0m0.007s
sys	0m0.002s

8ms includes the time to load the file, search for all the components that match the query and the time to export the XML. You get three results as there’s one AppData file, one entry in the distro AppStream, and an extra one shipped by Fedora to make Firefox featured in gnome-software. You can see the whole XML component of each result by appending /.. to the query. Unlike appstream-glib, libxmlb doesn’t try to merge components – which makes it much less magic, and a whole lot simpler.
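For example, dumping the whole component for each match just means running the same query with /.. appended (output omitted here):

$ xb-tool query appstream.xmlb "components/component[@type='desktop']/id[text()='firefox.desktop']/.."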

Some questions answered:

  • Why not just use a GVariant blob?: I did initially, and the cache was huge. The deeply nested structure was packed inefficiently as you have to assume everything is a hash table of a{sv}. It was also slow to load; not much faster than just parsing the XML. It also wasn’t possible to implement the zero-copy XPath queries this way.
  • Is this API and ABI stable?: Not yet, but it will be as soon as gnome-software is ported.
  • You implemented XPath in C‽: No, only a tiny subset. See the README.md

Comments welcome.

Thunderbird 60 with title bar hidden

Posted by Martin Stransky on September 18, 2018 08:58 AM

Many users like the hidden system titlebar as a Firefox feature, although it’s not finished yet. But we’re very close, and I hope to have Firefox 64 in a shape where the title bar can be disabled by default, at least on GNOME, matching the Firefox look on Windows and Mac.

Thunderbird 60 was finally released for Fedora and comes with a basic version of the feature, as introduced in Firefox 60 ESR. There’s a simple checkbox on the “Customize” page in Firefox, but Thunderbird is missing an easy switch.

To disable the title bar in Thunderbird 60, you need to go to the Edit -> Preferences menu and choose the Advanced tab. Then click Config Editor in the bottom left corner of the page, open it and look for mail.tabs.drawInTitlebar. Double click on it and your bird should be titleless 🙂

Fedora Firefox – GCC/CLANG dilemma

Posted by Martin Stransky on September 17, 2018 09:00 PM

After reading Mike’s blog post about the official Mozilla Firefox switch to LLVM Clang, I was wondering if we should also use that setup for the official Fedora Firefox binaries.

The numbers look strong, but as Honza Hubicka mentioned, Mozilla uses the pretty ancient GCC 6 to create binaries and it’s not very fair to compare it with an up-to-date LLVM Clang 6.

Also, if I’m reading the Mozilla bug correctly, PGO/LTO is not yet enabled for Linux and only plain optimized builds are used for now… which means the transition at Mozilla is not as far along as I expected.

I also went through some Phoronix tests, which indicate there’s no black-and-white situation there, although Mike claimed that LLVM Clang is generally better than GCC. But it’s possible that the Firefox codebase somehow fits LLVM Clang better than GCC.

After some consideration I think we’ll stay with GCC for now, and I’m going to compare Fedora GCC 8 builds with the Mozilla LLVM Clang ones when they are available. Neither build can use -march=native, so it should be a fair comparison. Also, Fedora should enable the PGO+LTO GCC setup to get the best from GCC.

[Update] I was wrong and PGO+LTO should now be enabled for Linux builds too. The numbers look very good and I wonder if we can match them with GCC 8! 🙂

The Commons Clause doesn't help the commons

Posted by Matthew Garrett on September 10, 2018 11:26 PM
The Commons Clause was announced recently, along with several projects moving portions of their codebase under it. It's an additional restriction intended to be applied to existing open source licenses with the effect of preventing the work from being sold[1], where the definition of being sold includes being used as a component of an online pay-for service. As described in the FAQ, this changes the effective license of the work from an open source license to a source-available license. However, the site doesn't go into a great deal of detail as to why you'd want to do that.

Fortunately one of the VCs behind this move wrote an opinion article that goes into more detail. The central argument is that Amazon make use of a great deal of open source software and integrate it into commercial products that are incredibly lucrative, but give little back to the community in return. By adopting the commons clause, Amazon will be forced to negotiate with the projects before being able to use covered versions of the software. This will, apparently, prevent behaviour that is not conducive to sustainable open-source communities.

But this is where things get somewhat confusing. The author continues:

Our view is that open-source software was never intended for cloud infrastructure companies to take and sell. That is not the original ethos of open source.

which is a pretty astonishingly unsupported argument. Open source code has been incorporated into proprietary applications without giving back to the originating community since before the term open source even existed. MIT-licensed X11 became part of not only multiple Unixes, but also a variety of proprietary commercial products for non-Unix platforms. Large portions of BSD ended up in a whole range of proprietary operating systems (including older versions of Windows). The only argument in favour of this assertion is that cloud infrastructure companies didn't exist at that point in time, so they weren't taken into consideration[2] - but no argument is made as to why cloud infrastructure companies are fundamentally different to proprietary operating system companies in this respect. Both took open source code, incorporated it into other products and sold them on without (in most cases) giving anything back.

There's one counter-argument. When companies sold products based on open source code, they distributed it. Copyleft licenses like the GPL trigger on distribution, and as a result selling products based on copyleft code meant that the community would gain access to any modifications the vendor had made - improvements could be incorporated back into the original work, and everyone benefited. Incorporating open source code into a cloud product generally doesn't count as distribution, and so the source code disclosure requirements don't trigger. So perhaps that's the distinction being made?

Well, no. The GNU Affero GPL has a clause that covers this case - if you provide a network service based on AGPLed code then you must provide the source code in a similar way to if you distributed it under a more traditional copyleft license. But the article's author goes on to say:

AGPL makes it inconvenient but does not prevent cloud infrastructure providers from engaging in the abusive behavior described above. It simply says that they must release any modifications they make while engaging in such behavior.

IE, the problem isn't that cloud providers aren't giving back code, it's that they're using the code without contributing financially. There's no difference between what cloud providers are doing now and what proprietary operating system vendors were doing 30 years ago. The argument that "open source" was never intended to permit this sort of behaviour is simply untrue. The use of permissive licenses has always allowed large companies to benefit disproportionately when compared to the authors of said code. There's nothing new to see here.

But that doesn't mean that the status quo is good - the argument for why the commons clause is required may be specious, but that doesn't mean it's bad. We've seen multiple cases of open source projects struggling to obtain the resources required to make a project sustainable, even as many large companies make significant amounts of money off that work. Does the commons clause help us here?

As hinted at in the title, the answer's no. The commons clause attempts to change the power dynamic of the author/user role, but it does so in a way that's fundamentally tied to a business model and in a way that prevents many of the things that make open source software interesting to begin with. Let's talk about some problems.

The power dynamic still doesn't favour contributors

The commons clause only really works if there's a single copyright holder - if not, selling the code requires you to get permission from multiple people. But the clause does nothing to guarantee that the people who actually write the code benefit, merely that whoever holds the copyright does. If I rewrite a large part of a covered work and that code is merged (presumably after I've signed a CLA that assigns a copyright grant to the project owners), I have no power in any negotiations with any cloud providers. There's no guarantee that the project stewards will choose to reward me in any way. I contribute to them but get nothing back in return - instead, my improved code allows the project owners to charge more and provide stronger returns for the VCs. The inequity has shifted, but individual contributors still lose out.

It discourages use of covered projects

One of the benefits of being able to use open source software is that you don't need to fill out purchase orders or start commercial negotiations before you're able to deploy. Turns out the project doesn't actually fill your needs? Revert it, and all you've lost is some development time. Adding additional barriers is going to reduce uptake of covered projects, and that does nothing to benefit the contributors.

You can no longer meaningfully fork a project

One of the strengths of open source projects is that if the original project stewards turn out to violate the trust of their community, someone can fork it and provide a reasonable alternative. But if the project is released with the commons clause, it's impossible to sell any forked versions - anyone who wishes to do so would still need the permission of the original copyright holder, and they can refuse that in order to prevent a fork from gaining any significant uptake.

It doesn't inherently benefit the commons

The entire argument here is that the cloud providers are exploiting the commons, and by forcing them to pay for a license that allows them to make use of that software the commons will benefit. But there's no obvious link between these things. Maybe extra money will result in more development work being done and the commons benefiting, but maybe extra money will instead just result in greater payout to shareholders. Forcing cloud providers to release their modifications to the wider world would be of benefit to the commons, but this is explicitly ruled out as a goal. The clause isn't inherently incompatible with this - the negotiations between a vendor and a project to obtain a license to be permitted to sell the code could include a commitment to provide patches rather than money, for instance, but the focus on money makes it clear that this wasn't the authors' priority.

What we're left with is a license condition that does nothing to benefit individual contributors or other users, and costs us the opportunity to fork projects in response to disagreements over design decisions or governance. What it does is ensure that a range of VC-backed projects are in a better position to improve their returns, without any guarantee that the commons will be left better off. It's an attempt to solve a problem that's existed since before the term "open source" was even coined, by simply layering on a business model that's also existed since before the term "open source" was even coined[3]. It's not anything new, and open source derives from an explicit rejection of this sort of business model.

That's not to say we're in a good place at the moment. It's clear that there is a giant level of power disparity between many projects and the consumers of those projects. But we're not going to fix that by simply discarding many of the benefits of open source and going back to an older way of doing things. Companies like Tidelift[4] are trying to identify ways of making this sustainable without losing the things that make open source a better way of doing software development in the first place, and that's what we should be focusing on rather than just admitting defeat to satisfy a small number of VC-backed firms that have otherwise failed to develop a sustainable business model.

[1] It is unclear how this interacts with licenses that include clauses that assert you can remove any additional restrictions that have been applied
[2] Although companies like Hotmail were making money from running open source software before the open source definition existed, so this still seems like a reach
[3] "Source available" predates my existence, let alone any existing open source licenses
[4] Disclosure: I know several people involved in Tidelift, but have no financial involvement in the company


ASG! 2018 Tickets

Posted by Lennart Poettering on September 10, 2018 10:00 PM

All Systems Go! 2018 Tickets Selling Out Quickly!

Buy your tickets for All Systems Go! 2018 soon, they are quickly selling out! The conference takes place on September 28-30, in Berlin, Germany, in a bit over two weeks.

Why should you attend? If you are interested in low-level Linux userspace, then All Systems Go! is the right conference for you. It covers all topics relevant to foundational open-source Linux technologies. For details on the covered topics see our schedule for day #1 and for day #2.

For more information please visit our conference website!

See you in Berlin!

On Flatpak dependencies

Posted by Matthias Clasen on September 07, 2018 05:50 PM

Package managers have to deal with dependencies – too many of them. Over time things have gotten complicated: there are now soft dependencies, reverse dependencies and boolean conditions. So complicated that you can probably do general computation in the dependency solver now.

Thankfully flatpak is a lot simpler: there’s apps, and there’s runtimes, and every app depends on exactly one runtime. Simple and beautiful.

Of course, that’s not the whole story. Let’s take a look.

Dependencies

The only hard dependencies in Flatpak are between an app and the runtime that it uses.

Every app uses exactly one runtime. When installing the app, Flatpak will automatically install the runtime it needs, and it will refuse to uninstall the runtime as long as there are apps using it. You can override this using the --force-remove option.
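A minimal sketch of that behaviour (the runtime ref shown here is purely an example):

$ flatpak uninstall org.gnome.Platform//3.30                  # refused while installed apps still need this runtime
$ flatpak uninstall --force-remove org.gnome.Platform//3.30   # override the dependency check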

Related refs

The common term Flatpak uses for apps and runtimes is refs. Apart from dependencies, Flatpak has a notion of related refs. These are what other packaging systems might classify as soft dependencies, and they come in various forms and shapes.

The first group are standard pieces that get split off at build time, like .Locale and .Debug refs. The first contain translations and are similar to what other packaging systems call langpacks. The second contain debug information for binaries, and are the equivalent of debuginfo packages in other systems.

The next group are extensions. Both applications and runtimes can declare extension points, and flatpak will look for refs that match those when setting up a sandbox, and mount the ones it finds inside the sandbox.

Some extensions are a bit more special, in that they are conditional on some condition of the system. For example, they might only be relevant if the system has an Nvidia GPU, or apply for a certain GTK+ theme.

Another kind of relationship exists between a runtime and its matching sdk.

What’s used?

The Flatpak uninstall command has a --unused option that makes an educated guess about what refs are no longer needed on your system. For each ref, it checks whether it is an application, or the runtime of an installed application, or related to one of these. In each of these cases, it considers the ref used and skips it. Whatever is left afterwards gets removed.
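In practice that boils down to a single command:

$ flatpak uninstall --unused   # removes runtimes and related refs that no installed app still needs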

There are some heuristics involved when looking at related refs, and we are still fine-tuning these. One change that we’ve recently made is to consider SDKs used, to make --unused usable for developers.

There is still more fine-tuning to be done. For example, Flatpak will happily use a runtime from the system installation to run an application from the user installation. But the uninstall command works only on a single installation, so it does not see these dependencies, and might remove the runtime. Thankfully, it is easy to recover, should this happen to you: just install the runtime again.

3 Million Firmware Files and Counting…

Posted by Richard Hughes on September 07, 2018 09:55 AM

In the last two years the LVFS has supplied over 3 million firmware files to end users. We now have about two dozen companies uploading firmware, of which 9 are multi-billion dollar companies.

Every month about 200,000 more devices get upgraded and from the reports so far the number of failed updates is less than 0.01% averaged over all firmware types. The number of downloads is going up month-on-month, although we’re no longer growing exponentially, thank goodness. The server load average is 0.18, and we’ve made two changes recently to scale even more for less money: signing files in a 30 minute cron job rather than immediately, and switching from Amazon to BunnyCDN.

The LVFS is mainly run by just one person (me!) and my time is sponsored by the ever-awesome Red Hat. The hardware costs, which recently included random development tools for testing the dfu and nvme plugins, and the server and bandwidth costs are being paid from charitable donations from the community. We’re even cost positive now, so I’m building up a little pot for the next server or CDN upgrade. By pretty much any metric, the LVFS is a huge success, and I’m super grateful to all the people that helped the project grow.

The LVFS does have one weakness: it has a bus factor of one. In other words, if I got splattered by a bus, the LVFS would probably cease to exist in its current form. To further grow the project, and to reduce the dependence on me, we’re going to be moving various parts of the LVFS to the Linux Foundation. This means that there’ll be sysadmins who don’t have to google basic server things, a proper community charter, and access to an actual legal team. From an OEM point of view, nothing much should change, including the most important thing: it’ll continue to be free to use for everyone. The existing server and all the content will be migrated to the LF infrastructure. From a user’s point of view, new metadata and firmware will be signed by the Linux Foundation key, rather than my key, although we haven’t decided on a date for the switch-over yet. The LF key has been trusted by fwupd for firmware since 1.0.8 and it’s trivial to backport to older branches if required.

Before anyone gets too excited and starts pointing me at all my existing bugs for my other stuff: I’ll probably still be the person “onboarding” vendors onto the LVFS, and I’m fully expecting to remain the maintainer and core contributor to the lvfs-website code itself — but I certainly should have a bit more time for GNOME Software and color stuff.

In related news, even more vendors are jumping on the LVFS. No more public announcements yet, but hopefully soon. A lot of hardware companies will only be comfortable “going public” when the new hardware currently in development is on shelves in stores. So, please raise a glass to the next 3 million downloads!

What's new in libinput 1.12

Posted by Peter Hutterer on September 04, 2018 03:34 AM

libinput 1.12 was a massive development effort (over 300 patchsets) with a bunch of new features being merged. It'll be released next week or so, so it's worth taking a step back and looking at what actually changed.

The device quirks files replace the previously used hwdb-based udev properties. I've written about this in more detail here but the gist is: we have our own .ini-style file format that can match on devices and apply the various quirks devices need. This simplifies debugging a lot: we can now reliably tell users why a quirks file applies or doesn't apply, which was historically a problem with the hwdb.

The sphinx-based documentation was merged, fixed and added to. We switched to sphinx for the docs and the result is much more user-friendly. Which was the point: it was a switch from developer-oriented documentation to user-oriented documentation. Not that documentation is ever finished.

The usual set of touchpad improvements went in, e.g. the slight motion on finger up is now ignored. We have size-based thumb detection now (useful for Apple touchpads!). And of course various quirks for better pressure ranges, etc. Tripletap on some synaptics touchpads had a tendency to cause multiple taps because of some weird event sequence. Movement in the software button now generates events, the buttons are not just a dead zone anymore. Pointer jump detection is more adaptive now and catches and discards smaller jumps that previously slipped through the cracks. A particularly quirky behaviour was seen on Dell XPS i2c touchpads that exhibit a huge pointer jump, courtesy of the trackpoint controller going to sleep and taking its time to wake up. The delay is still there but the pointer at least lands in the correct location.

We now have improved direction-locking for two-finger scrolling on touchpads. Scrolling up/down should not generate horizontal scroll events anymore as long as the movement is close enough to vertical. This feature is transparent, a diagonal or horizontal movement will immediately disable the direction lock and produce horizontal scroll events as expected.

The trackpoint acceleration has been re-done, see this post for more details and links to the background articles. I've only received one bug report for the new acceleration so it seems to work quite well now. Trackpoints that send events in bursts (e.g. bluetooth ones) are smoothened now to avoid jerky movement.

Velocity averaging was dropped to increase pointer accuracy. Previously we averaged the velocity across multiple events which makes the motion smoother on jittery devices but less accurate on good devices.

We build on FreeBSD now. Presumably this also means it works on FreeBSD :)

libinput now supports palm detection on touchscreens, at least where the ABS_MT_TOOL_TYPE evdev bit is provided.

I think that's about it. Busy days...

Realtek on the LVFS!

Posted by Richard Hughes on August 27, 2018 01:02 PM

For the last week I’ve been working with Realtek engineers adding USB3 hub firmware support to fwupd. We’re still fleshing out all the details, as we also want to update any devices attached to the hub using i2c – which is more important than it first seems. A lot of “multifunction” dongles or USB3 hubs are actually USB3 hubs with other hardware connected internally. We’re going to be working on updating the HDMI converter firmware next, probably just by dropping a quirk file and adding some standard keys to fwupd. This will let us use the same plugin for any hardware that uses the rts54xx chipset as the base.

Realtek have been really helpful and open about the hardware, which is a refreshing difference to a lot of other hardware companies. I’m hopeful we can get the new plugin in fwupd 1.1.2 although supported hardware won’t be available for a few months yet, which also means there’s no panic getting public firmware on the LVFS. It will mean we get a “works out of the box” experience when the new OEM branded dock/dongle hardware starts showing up.

About Flatpak installations

Posted by Matthias Clasen on August 26, 2018 09:04 PM

If you have tried Flatpak, you probably know that it can install apps per user or system-wide.  Installing an app system-wide has the advantage that all users on the system can use it. Installing per-user has the advantage that it doesn’t require privileges.

Most of the time, that’s more than enough choice. But there is more.

Standard locations

Before moving on, it is useful to briefly look at where Flatpak installs apps by default.

flatpak install --system flathub org.gnome.Todo

When you install an app like this, it ends up in /var/lib/flatpak. If you instead use the --user option, it ends up in ~/.local/share/flatpak. To be 100% correct, I should say $XDG_DATA_HOME/flatpak, since Flatpak does respect the XDG basedir spec.
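The per-user equivalent of the command above is simply:

$ flatpak install --user flathub org.gnome.Todo   # lands under ~/.local/share/flatpak, no privileges needed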

Anatomy of an installation

Flatpak calls the places where it installs apps installations. Every installation has a few subdirectories that are useful to know:

  • repo – this is the OSTree repository where the files for the installed apps and runtimes reside. It is a single repository, so all the apps and runtimes that are part of the same installation get the benefit of deduplication via content-addressing.
  • exports – when an app is installed, Flatpak extracts some files that need to be visible to the outside world, such as desktop files, icons and D-Bus service files, and this is where they end up.
  • appstream – a flatpak repository contains the appstream data for the apps it contains as a separate branch, and Flatpak extracts it on the client-side for consumers like KDE’s Discover or GNOME Software.
  • app, runtime – the deployed versions of apps and runtimes get checked out here. Diving deeper, you see the files of an app in app/org.gimp.GIMP/current/active/files. This directory is what gets mounted in the sandbox as /app if you run the GIMP.

Custom installations

So far, so good. But maybe you have dozens of machines, and an existing setup where /opt is shared. Wouldn’t it be nice to have a flatpak installation there, instead of duplicating it in /var on every machine? This is where custom installations come in.

You can tell Flatpak about another place to install apps by dropping a file in /etc/flatpak/installations.d/. It can be as simple as the following:

[Installation "bigleaf"]
Path=/opt/flatpak
DisplayName=bigleaf

See the flatpak-installation man page for all the details about this file.

GNOME Software currently doesn’t know about such custom installations, and you will have to adjust the shell glue in /etc/profile.d/flatpak.sh for GNOME Shell to see apps from there, but that is easy enough.

A patch to make flatpak.sh pick up custom installations automatically would be a welcome contribution!
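In the meantime, here is a hedged sketch of the kind of glue needed: it simply adds the custom installation’s exports directory (see the anatomy section above) to XDG_DATA_DIRS, using the /opt/flatpak path from the example installation file:

# make desktop files and icons exported from the "bigleaf" installation visible to the session
export XDG_DATA_DIRS="/opt/flatpak/exports/share:${XDG_DATA_DIRS:-/usr/local/share:/usr/share}"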

Apps on a stick

Flatpak has a few more tricks up its sleeve when it comes to sharing apps between machines. A pretty cool one is the recently added create-usb command. It lets you copy one (or more) apps onto a usb stick, and install them from there on another machine. While trying this out, I hit a few hurdles that I’ll briefly point out here.

To make this work, Flatpak relies on an extra piece of information about the remote, the collection ID. The collection ID is a property of the actual remote repository. Flathub has one, org.flathub.Stable.

To make use of it, we need to add it to the configuration for the remote, like this:

$ flatpak remote-modify --collection-id=org.flathub.Stable flathub
$ flatpak update

If you don’t add the collection ID to your remote configuration, you will be greeted by an error saying “Remote ‘flathub’ does not have a collection ID set”. If you omit the flatpak update, the error will say “No such branch (org.flathub.Stable, ostree-metadata) in repository”.

Another error you may hit is “fsetxattr: Operation not supported”.  I think the create-usb command is meant to work with FAT-formatted usb sticks, so this will hopefully be fixed soon. For now, just format your usb stick as EXT4.
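Formatting the stick is a one-liner (the device node below is a placeholder; double-check which one is your usb stick before running this):

$ sudo mkfs.ext4 -L Flatpak /dev/sdX1   # /dev/sdX1 stands in for your usb stick's partition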

After these preparations, we are ready for:

$ flatpak --verbose create-usb /run/media/mclasen/Flatpak org.gimp.GIMP

which will take some time to copy things to the usb stick (which happens to be mounted at /run/media/mclasen/Flatpak). When this command is done, we can inspect the contents of the OSTree repository like this:

$ flatpak remote-ls file:///run/media/mclasen/Flatpak/.ostree/repo
Ref 
org.freedesktop.Platform.Icontheme.Adwaita
org.freedesktop.Platform.VAAPI.Intel 
org.freedesktop.Platform.ffmpeg 
org.gimp.GIMP 
org.gnome.Platform

Flatpak copied not just the GIMP itself, but also runtimes and extensions that it uses. This ensures that we can install the app from the usb stick even if some of these related refs are missing on the target system.

But of course, we still need to see it work! So I uninstalled the GIMP, disabled my network, plugged the usb stick back in, and:

$ flatpak install --user flathub org.gimp.GIMP
0 metadata, 0 content objects imported; 569 B transferred in 0 seconds

flatpak install: Error updating remote metadata for 'flathub': [6] Couldn't resolve host name
Installing in user:
org.gimp.GIMP/x86_64/stable flathub 1eb97e2d4cde
permissions: ipc, network, wayland, x11
file access: /tmp, host, xdg-config/GIMP, xdg-config/gtk-3.0
dbus access: org.gtk.vfs, org.gtk.vfs.*
Is this ok [y/n]: y
Installing for user: org.gimp.GIMP/x86_64/stable from flathub
[####################] 495 metadata, 4195 content objects imported; 569 B transferred in 1 seconds
Now at 1eb97e2d4cde.

Voilà, an offline installation of a Flatpak. I left the error message in there as proof that I was actually offline :) A nice detail of the collection ID approach is that Flatpak knows that it can still update the GIMP from flathub when I’m online.

Coming soon, peer-to-peer

This post is already too long, so I’ll leave peer-to-peer and advertising Flatpak repositories on the local network via avahi for another time.

Until then, happy Flatpaking! 💓📦💓📦


Fun with SuperIO

Posted by Richard Hughes on August 23, 2018 11:44 AM

While I’m waiting to hear back from NVMe vendors (already one tentatively onboard!) I’ve started looking at “embedded controller” devices. The EC on your laptop historically used to just control the PS/2 keyboard and mouse, but now does fan control, power management, UARTs, GPIOs, LEDs, SMBUS, and various tasks the main CPU is too important to care about. Vendors issue firmware updates for this kind of device, but normally wrap up the EC update as part of the “BIOS” update as the system firmware and EC work together using various ACPI methods. Some vendors do the EC update out-of-band and so we need to teach fwupd about how to query the EC to get the model and version on that specific hardware. The Linux laptop vendor Tuxedo wants to update the EC and system firmware separately using the LVFS, and helpfully loaned me an InfinityBook Pro 13 that was immediately disassembled and connected to all kinds of exotic external programmers. On first impressions the N131WU seems quick, stable and really well designed internally — I’m sure it would get a 10/10 for repairability.

At the moment I’m just concentrating on SuperIO devices from ITE. If you’re interested what SuperIO chip(s) you have on your machine you can either use superiotool from coreboot-utils or sensors-detect from lm_sensors. If you’ve got a SuperIO device from ITE please post what signature, vendor and model machine you have in the comments and I’ll ask if I need any more information from you. I’m especially interested in vendors that use devices with the signature 0x8587, which seems to be a favourite with the Clevo reference board. Thanks!
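If you want to check what you have, both tools mentioned above can be run directly; they need root to probe the hardware:

$ sudo superiotool      # from coreboot-utils: probes and prints the detected SuperIO chip and its ID
$ sudo sensors-detect   # from lm_sensors: interactive probe that also reports SuperIO chips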

Please welcome AKiTiO to the LVFS

Posted by Richard Hughes on August 23, 2018 08:08 AM

Another week, another vendor. This time the vendor is called AKiTiO, a company that makes a large number of very nice Thunderbolt peripherals.

Over the last few weeks AKiTiO added support for the Node and Node Lite devices, and I’m sure there’ll be more in the future. It’s been a pleasure working with the engineers and getting them up to speed with uploading to the LVFS.

In other news, Lenovo also added support for the ThinkPad T460 on the LVFS, so get any updates while they’re hot. If you want to try this you’ll have to enable the lvfs-testing remote either using fwupdmgr enable-remote lvfs-testing or using the sources dialog in recent versions of GNOME Software. More Lenovo updates coming soon, and hopefully even more vendor announcements too.

Adventures with NVMe, part 2

Posted by Richard Hughes on August 21, 2018 08:01 AM

A few days ago I asked people to upload their NVMe “cns” data to the LVFS. So far, 908 people did that, and I appreciate each and every submission. I promised I’d share my results, and this is what I’ve found:

Number of vendors implementing the slot 1 read-only “s1ro” factory fallback: 10 – this was way less than I hoped. Not all is lost: the number of slots in a device, “nfws”, indicates how many different versions of firmware the drive can hold, just like some wireless broadband cards. The idea is that a bad firmware flash means you can “fall back” to an old version that actually works. It was surprising how many drives didn’t have this feature because they only had one slot in total:

I also wanted to know how many firmware versions there were for a specific model (deduping by removing the capacity string in the model); the idea being that if drives with the same model string all had the same version of firmware then the vendor wasn’t supplying firmware updates at all, and might be a lost cause, or have perfect firmware. Vendors don’t usually change shipped firmware on NVMe drives for no reason, and so a vendor having multiple versions of firmware for a given model could indicate a problem or an enhancement important enough to re-run all the QA checks:

So, not all bad, but we can’t just assume that trying to flash a firmware is a safe thing to do for all drives. The next, much bigger problem was trying to identify which drives should be flashed with a specific firmware. You’d think this would be a simple problem, where the existing firmware version would be stored in the “fr” firmware revision string and the model name would be stored in the “mn” string. Alas, only Lenovo and Apple store a sane semver like 1.2.3, other vendors seem to encode the firmware revision using as-yet-unknown methods. Helpfully, the model name alone isn’t all we need to identify the firmware to flash as different drives can have different firmware for the laptop OEM without changing the mn or fr. For this I think we need to look into the elusive “vs” vendor-defined block which was the reason I was asking for the binary dump of the CNS rather than the nvme -H or nvme -o json output. The vendor block isn’t formally defined as part of the NVMe specification and the ODM (and maybe the OEM?) can use this however they want.

Only 137 out of the supplied ~650 NVMe CNS blobs contained vendor data. SK hynix drives contain an interesting-looking string of something like KX0WMJ6KS0760T6G01H0, but I have no idea how to parse that. Seagate has simply 2002. Liteon has a string like TW01345GLOH006BN05SXA04. Some Samsung drives have things like KR0N5WKK0184166K007HB0 and CN08Y4V9SSX0087702TSA0 – the same format as Toshiba CN08D5HTTBE006BEC2K1A0 but it’s weird that the blob is all ASCII – I was somewhat hoping for a packed GUID in the sea of NULs. They do have some common sub-sections, so if you know what these are please let me know!

I’ve built a fwupd plugin that should be able to update firmware on NVMe drives, but it’s 100% untested. I’m going to use the leftover donation money for the LVFS to buy various types of NVMe hardware that I can flash with different firmware images and not cry if all the data gets wiped or the device gets bricked. I’ve already emailed my contact at Samsung and fingers crossed something nice happens. I’ll do the same with Toshiba and Lenovo next week. I’ll also update this blog post next week with the latest numbers, so if you upload your data now it’s still useful.

NVMe Firmware: I Need Your Data

Posted by Richard Hughes on August 17, 2018 02:45 PM

In a recent Google Plus post I asked what kind of hardware was most interesting to be focusing on next. UEFI updating is now working well with a large number of vendors, and the LVFS “onboarding” process is well established now. On that topic we’ll hopefully have some more announcements soon. Anyway, back to the topic in hand: The overwhelming result from the poll was that people wanted NVMe hardware supported, so that you can trivially update the firmware of your SSD. Firmware updates for SSDs are important, as most either address data consistency issues or provide nice performance fixes.

Unfortunately there needs to be some plumbing put in place first, so don’t expect anything awesome very fast. The NVMe ecosystem is pretty new, and things like “what version number firmware am I running now” and “is this firmware OEM firmware or retail firmware” are still queried using vendor-specific extensions. I only have two devices to test with (Lenovo P50 and Dell XPS 13) and so I’m asking for some help with data collection. Primarily I’m trying to find out what NVMe hardware people are actually using, so I can approach the most popular vendors first (via the existing OEMs). I’m also going to be looking at the firmware revision string that each vendor sets to find quirks we need — for instance, Toshiba encodes MODEL VENDOR, and everyone else specifies VENDOR MODEL. Some drives contain the vendor data with a GUID, some don’t; I have no idea of the relative numbers or how many different formats there are. I’d also like to know how many firmware slots the average SSD has, and the percentage of drives that have a protected slot 1 firmware. This all lets us work out how safe it would be to attempt a new firmware update on specific hardware — the very last thing we want to do is brick an expensive new NVMe SSD with all your data on it.

So, what would I like you to do? You don’t need to reboot or unmount any filesystems or anything like that. Just:

  1. Install nvme-cli (e.g. dnf install nvme-cli, or build it from source)
  2. Run the following command:
    sudo nvme id-ctrl --raw-binary /dev/nvme0 > /tmp/id-ctrl
    
  3. If that worked, run the following command:
    curl -F type=nvme \
        -F "machine_id="`cat /etc/machine-id` \
        -F file=@/tmp/id-ctrl \
        https://staging.fwupd.org/lvfs/upload_hwinfo

If you’re not sure whether you have an NVMe drive you can check with the nvme command above. The command isn’t doing anything with the firmware; it’s just asking the NVMe drive to report what it knows about itself. It should be 100% safe; the kernel already made the same request at system startup.

We are sending your random machine ID to ensure we don’t record duplicate submissions — if that makes you unhappy for some reason just choose some other 32-character hex string. The binary file created by nvme contains the encoded model number and serial number of your drive; if this makes you uneasy please don’t send the file.
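
If you want to eyeball what gets encoded before uploading, nvme-cli can also print the identify data as text; a quick sketch (assuming your drive is nvme0) that picks out the model, serial and firmware revision fields:

    # human-readable identify data; mn = model, sn = serial, fr = firmware revision
    sudo nvme id-ctrl /dev/nvme0 | grep -E '^(mn|sn|fr) '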

Many thanks, and needless to say I’ll be posting some stats here when I’ve got enough submissions to be statistically valid.

libinput's "new" trackpoint acceleration method

Posted by Peter Hutterer on August 16, 2018 04:47 AM

This is mostly a request for testing, because I've received zero feedback on the patches that I merged a month ago and libinput 1.12 is due to be out. No comments so far on the RC1 and RC2 either, so... well, maybe this gets a bit broader attention so we can address some things before the release. One can hope.

Required reading for this article: Observations on trackpoint input data and X server pointer acceleration analysis - part 5.

As the blog posts linked above explain, the trackpoint input data is difficult and largely arbitrary between different devices. The previous pointer acceleration in libinput relied on a fixed reporting rate, an assumption that doesn't hold at low speeds, so the new acceleration method switches back to velocity-based acceleration. That is, we convert the input deltas to a speed, then apply the acceleration curve to that. It's not speed, it's pressure, but it doesn't really matter unless you're a stickler for technicalities.

Because basically every trackpoint has different random data ranges not linked to anything easily measurable, libinput's device quirks now support a magic multiplier to scale the trackpoint range into something resembling a sane range. This is basically what we did before with the systemd POINTINGSTICK_CONST_ACCEL property except that we're handling this in libinput now (which is where acceleration is handled, so it kinda makes sense to move it here). There is no good conversion from the previous trackpoint range property to the new multiplier because the range didn't really have any relation to the physical input users expected.

So what does this mean for you? Test the libinput RCs or, better, libinput from master (because it's stable anyway), or from the Fedora COPR and check if the trackpoint works. If not, check the Trackpoint Configuration page and follow the instructions there.
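
If you want to see what libinput actually makes of the trackpoint while testing, the debug tools are the quickest way. For example (the event node is just an example, pick yours from the device list):

# find the trackpoint's event node, then watch events and deltas as libinput sees them
sudo libinput list-devices
sudo libinput debug-events --device /dev/input/event17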

How the 60-evdev.hwdb works

Posted by Peter Hutterer on August 09, 2018 02:17 AM

libinput made a design decision early on to use physical reference points wherever possible. So your virtual buttons are X mm high/across, the pointer movement is calculated in mm, etc. Unfortunately this exposed us to a large range of devices that don't bother to provide that information or just give us the wrong information to begin with. Patching the kernel for every device is not feasible, so in 2015 the 60-evdev.hwdb was born and it has seen steady updates since. Many a libinput bug was fixed by just correcting the device's axis ranges or resolution. To take the magic out of the 60-evdev.hwdb, here's a blog post for your perusal, appreciation or, failing that, shaking a fist at. Note that the below is caller-agnostic; it doesn't matter what userspace stack you use to process your input events.

There are four parts that come together to fix devices: a kernel ioctl and a trifecta of udev rules, hwdb entries and a udev builtin.

The kernel's EVIOCSABS ioctl

It all starts with the kernel's struct input_absinfo.


struct input_absinfo {
        __s32 value;
        __s32 minimum;
        __s32 maximum;
        __s32 fuzz;
        __s32 flat;
        __s32 resolution;
};
The three values that matter right now: minimum, maximum and resolution. The "value" is just the most recent value on this axis; ignore fuzz/flat for now. The min/max values simply specify the range of values the device will give you, the resolution how many values per mm you get. Simple example: an x axis given at min 0, max 1000 at a resolution of 10 means your device is 100mm wide. There is no requirement for min to be 0, btw, and there's no clipping in the kernel so you may get values outside min/max. Anyway, your average touchpad looks like this in evemu-record:

# Event type 3 (EV_ABS)
# Event code 0 (ABS_X)
# Value 2572
# Min 1024
# Max 5112
# Fuzz 0
# Flat 0
# Resolution 41
# Event code 1 (ABS_Y)
# Value 4697
# Min 2024
# Max 4832
# Fuzz 0
# Flat 0
# Resolution 37
This is the information returned by the EVIOCGABS ioctl (EVdev IOCtl Get ABS). It is usually run once on device init by any process handling evdev device nodes.

Because plenty of devices don't announce the correct ranges or resolution, the kernel provides the EVIOCSABS ioctl (EVdev IOCtl Set ABS). This allows overwriting the in-kernel struct with new values for min/max/fuzz/flat/resolution; processes that query the device later will get the updated ranges.

udev rules, hwdb and builtins

The kernel has no notification mechanism for updated axis ranges so the ioctl must be applied before any process opens the device. This effectively means it must be applied by a udev rule. udev rules are a bit limited in what they can do, so if we need to call an ioctl, we need to run a program. And while udev rules can do matching, the hwdb is easier to edit and maintain. So the pieces we have are: a hwdb that knows what to change (and to which values), a udev program to apply the values, and a udev rule to tie those two together.

In our case the rule is 60-evdev.rules. It checks the 60-evdev.hwdb for matching entries [1], then invokes the udev-builtin-keyboard if any matching entries are found. That builtin parses the udev properties assigned by the hwdb and converts them into EVIOCSABS ioctl calls. These three pieces need to agree on each other's formats - the udev rule and hwdb agree on the matches and the hwdb and the builtin agree on the property names and value format.

By itself, the hwdb has no specific format beyond this:


some-match-that-identifies-a-device
 PROPERTY_NAME=value
 OTHER_NAME=othervalue
But since we want to match for specific use-cases, our udev rule assembles several specific match lines. Have a look at 60-evdev.rules again; the last rule in there assembles a string in the form of "evdev:name:the device name:content of /sys/class/dmi/id/modalias". So your hwdb entry could look like this:

evdev:name:My Touchpad Name:dmi:*svnDellInc*
 EVDEV_ABS_00=0:1:3
If the name matches and you're on a Dell system, the device gets the EVDEV_ABS_00 property assigned. The "evdev:" prefix in the match line is merely to distinguish it from other match rules and avoid false positives. It can be anything; libinput unsurprisingly uses "libinput:" for its properties.

The last part now is understanding what EVDEV_ABS_00 means. It's a fixed string with the axis number as hex number - 0x00 is ABS_X. And the values afterwards are simply min, max, resolution, fuzz, flat, in that order. So the above example would set min/max to 0:1 and resolution to 3 (not very useful, I admit).

Trailing bits can be skipped altogether and bits that don't need overriding can be skipped as well provided the colons are in place. So the common use-case of overriding a touchpad's x/y resolution looks like this:


evdev:name:My Touchpad Name:dmi:*svnDellInc*
 EVDEV_ABS_00=::30
 EVDEV_ABS_01=::20
 EVDEV_ABS_35=::30
 EVDEV_ABS_36=::20
0x00 and 0x01 are ABS_X and ABS_Y, so we're setting those to 30 units/mm and 20 units/mm, respectively. And if the device is multitouch capable we also need to set ABS_MT_POSITION_X and ABS_MT_POSITION_Y to the same resolution values. The min/max ranges for all axes are left as-is.

The most confusing part is usually this: the hwdb is compiled into a binary database that needs updating whenever the hwdb entries change. A call to systemd-hwdb update does that job.
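
As a rough sketch of the whole edit-update-trigger cycle for local testing (the file name, match line and values below are made up, adjust them for your device):

# drop a local override into the hwdb; property lines must start with a space
sudo tee /etc/udev/hwdb.d/61-evdev-local.hwdb <<'EOF' > /dev/null
evdev:name:My Touchpad Name:dmi:*svnDellInc*
 EVDEV_ABS_00=::30
 EVDEV_ABS_01=::20
EOF

# rebuild the binary database and re-trigger the input devices
sudo systemd-hwdb update
sudo udevadm trigger --sysname-match="event*"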

So with all the pieces in place, let's see what happens when the kernel tells udev about the device:

  • The udev rule assembles a match and calls out to the hwdb
  • The hwdb applies udev properties where applicable and returns success
  • The udev rule calls the udev keyboard-builtin
  • The keyboard builtin parses the EVDEV_ABS_xx properties and issues an EVIOCSABS ioctl for each axis
  • The kernel updates the in-kernel description of the device accordingly
  • The udev rule finishes and udev sends out the "device added" notification
  • The userspace process sees the "device added" and opens the device, which now has corrected values
  • Celebratory champagne corks are popping everywhere, hands are shaken, shoulders are patted in congratulations of another device saved from the tyranny of wrong axis ranges/resolutions

Once you understand how the various bits fit together it should be quite easy to understand what happens. Then the remainder is just adding hwdb entries where necessary, and the touchpad-edge-detector tool is useful for figuring out the right values.
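
For the record, touchpad-edge-detector (from libevdev's tool set) just wants the physical size of the touchpad and its event node, and prints the ranges it sees so you can work out the resolution. Something like:

# a touchpad that measures 100mm x 55mm (size and event node are examples)
sudo touchpad-edge-detector 100x55 /dev/input/event4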

[1] Not technically correct, the udev rule merely calls the hwdb builtin which searches through all hwdb entries. It doesn't matter which file the entries are in.

GNOME Software and automatic updates

Posted by Richard Hughes on August 08, 2018 07:20 PM

For GNOME 3.30 we’ve enabled something that people have been asking for since at least the birth of the gnome-software project: automatically installing updates.

This of course comes with some caveats. Since it’s still not safe to auto-update packages (trust me, I triaged the hundreds of bugs), we will restrict automatic updates to Flatpaks. Although we do automatically download things like firmware updates, ostree content, and package updates by default, they’re still deployed manually like before. I guess it’s important to say that the auto-update of Flatpaks is optional and can easily be turned off in the GUI, and that you’ll be notified when applications have been auto-updated and need restarting.

Another common complaint with gnome-software was that it didn’t show the same list of updates as command line tools like dnf. The internal refactoring required for auto-deploying updates also allows us to show updates that are available but not yet downloaded. We’ll still try to auto-download them ahead of time if possible, but we no longer hide them until the download has completed. This does mean that “new” updates could take some time to download in the updates panel before either the firmware update is performed or the offline update is scheduled.

This also means we can add some additional UI controlling whether updates should be downloaded and deployed automatically. This doesn’t override the existing logic regarding metered connections or available battery power, but does give the user some more control without resorting to gsettings invocations on the command line.
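
For reference, the new toggle is backed by GSettings, so you can still flip it from a script if you want; something like this should work, although do double-check the key names on your version:

# list the gnome-software settings, then turn automatic downloading off
gsettings list-recursively org.gnome.software
gsettings set org.gnome.software download-updates false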

Also notable for GNOME 3.30 is that we’ve switched to the new libflatpak transaction API, which both simplifies the flatpak plugin considerably and means we install the same runtimes and extensions as the flatpak CLI. This was another common source of frustration, as anyone trying to install from a flatpakref with RuntimeRepo set will testify.

With these changes we’ve also bumped the plugin interface version, so if you have out-of-tree plugins they’ll need recompiling before they work again. After a little more polish, the new GNOME Software 3.29.90 will soon be available in Fedora Rawhide, and will thus be available in Fedora 29. If 3.30 is as popular as I think it might be, we might even backport gnome-software 3.30.1 into Fedora 28 like we did for 3.28 and Fedora 27 all those moons ago.

Comments welcome.

Please welcome Lenovo to the LVFS

Posted by Richard Hughes on August 06, 2018 08:45 AM

I’d like to formally welcome Lenovo to the LVFS. For the last few months Peter Jones and I have been working with partners of Lenovo and the ThinkPad, ThinkStation and ThinkCentre groups inside Lenovo to get automatic firmware updates working across a huge number of different models of hardware.

Obviously, this is a big deal. Tens of thousands of people are likely to be offered a firmware update in the next few weeks, and hundreds of thousands over the next few months. Understandably we’re not just flipping a switch and opening the floodgates, so if you’ve not seen anything appear in fwupdmgr update or in GNOME Software don’t panic. Over the next couple of weeks we’ll be moving a lot of different models from the various testing and embargoed remotes to the stable remote, and so the list of supported hardware will grow. That said, we’ll only be supporting UEFI hardware produced fairly recently, so there’s no point looking for updates on your beloved T61. I also can’t comment on what other Lenovo branded hardware is going to be supported in the future as I myself don’t know.

Bringing Lenovo to the LVFS has been a lot of work. It needed changes to the low level fwupdate library, fwupd, and even the LVFS admin portal itself for various vendor-defined reasons. We’ve been working in semi-secret for a long time, and I’m sure it’s been frustrating to all involved not being able to speak openly about the grand plan. I do think Lenovo should be applauded for the work done so far due to the enormity of the task, rather than chastised about coming to the party a little late. If anyone from HP is reading this, you’re now officially late.

We’re still debugging a few remaining issues, and also working on making the update metadata better quality, so please don’t judge Lenovo (or me!) too harshly if there are initial niggles with the update process. Updating the firmware is slightly odd in that it sometimes needs to reboot a few times with some scary-sounding beeps, and on some hardware the first UEFI update you do might look less than beautiful. If you want to do the firmware update on Lenovo hardware, you’ll have a lot more success with newer versions of fwupd and fwupdate, although we should do a fairly good job of not offering the update if it’s not going to work. All our testing has been done with a fully updated Fedora 28 workstation. It of course works with SecureBoot turned on, but if you’ve enabled the BootOrder lock manually you’ll need to turn that off first.
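
If you want to check whether your machine is in the first wave, the usual fwupd workflow is all you need; roughly:

# refresh metadata from the LVFS, then list devices and any pending updates
fwupdmgr refresh
fwupdmgr get-devices
fwupdmgr get-updates
# and, if an update is offered, apply it
fwupdmgr update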

I’d like to personally thank all the Lenovo engineers and managers I’ve worked with over the last few months. All my time has been sponsored by Red Hat, and they rightfully deserve love too.

Flatpak portal experiments

Posted by Matthias Clasen on August 04, 2018 03:19 AM

One of the signs that a piece of software is reaching a mature state is its ability to serve use cases that nobody had anticipated when it was started. I’ve recently had this experience with Flatpak.

We have been discussing some possible new directions for the GTK+ file chooser. And it occurred to me that it might be convenient to use the file chooser portal as a way to experiment with different file choosers without having to change either GTK+ itself or the applications.

To verify this idea, I wrote a quick portal implementation that uses the venerable GTK+ 2 file chooser.

Here is Corebird (a GTK+ 3 application) using the GTK+ 2 file chooser to select an image.

On Flatpak updates

Posted by Matthias Clasen on August 02, 2018 05:06 PM

Maybe you remember times when updating your system was risky business – your web browser might crash or start to behave funny because the update pulled data files or fonts out from underneath the running process, leading to fireworks or, more likely, crashes.

Flatpak updates on the other hand are 100% safe. You can call

 flatpak update

and the running instances of your applications are not affected in any way. Flatpak keeps existing deployments around until the last user is gone. If you quit the application and restart it, you will get the updated version, though.

This is very nice, and works just fine. But maybe we can do even better?

Improving the system

It would be great if the system was aware of the running instances, and offered to restart them to take advantage of the new version that is now available. There is a good chance that GNOME Software will gain this feature before too long.

But for now, it does not have it.

Do it yourself

Many apps, in particular those that are not native to the Linux distro world, expect to update themselves, and we have had requests to enable this functionality in flatpak. We do think that updating software is a system responsibility that should be controlled by global policies and be under the user’s control, so we haven’t quite followed the request.

But Flatpak 1.0 does have an API that is useful in this context, the “Flatpak portal”. It has a Spawn method that allows applications to launch a process in a new sandbox.

Spawn (IN  ay    cwd_path,
       IN  aay   argv,
       IN  a{uh} fds,
       IN  a{ss} envs,
       IN  u     flags,
       IN  a{sv} options,
       OUT u     pid)

There are several use cases for this, from sandboxing thumbnailers (which create thumbnails for possibly untrusted content files) to sandboxing web browser tabs individually. The use case we are interested in here is restarting the latest version of the app itself.
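
If you are curious what this looks like on the bus, you can introspect the portal from inside a sandbox; flatpak also ships a flatpak-spawn utility built on top of this method. For example:

# the Flatpak portal lives on the session bus under this name and object path
gdbus introspect --session \
      --dest org.freedesktop.portal.Flatpak \
      --object-path /org/freedesktop/portal/Flatpak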

One complication that I’ve run into when trying this out is the “unique application” pattern that is built into GApplication and similar application classes: Since there is already an owner for the application ID on the session bus, my newly spawned version will just back off and exit. Which is clearly not what I intended in this case.

Make it stop

The workaround I came up with is not very pretty, but functional. It requires several parts.

First, I need a “quit” action exported on the session bus. The newly spawned version will activate this action of the running instance to convince it to go away. Thankfully, my example app already had this action, for the Quit item in the app menu.

I don’t want this to happen unconditionally, but only if I am spawning a new version. To achieve this, I made my app only activate “quit” if the --replace option is present, and add that option to the commandline that I pass to the “Spawn” call.

The code for this part is less pretty than it could be, since GApplication gets in the way a bit. I have to manually check for the --replace option and do the “quit” D-Bus call by hand.

Doing the “quit” call synchronously is not quite enough to avoid a race condition between the running instance dropping the bus name and my new instance attempting to take it. Therefore, I explicitly wait for the bus name to become unowned before entering g_application_run().

(Screencast: https://blogs.gnome.org/mclasen/files/2018/08/Screencast-from-08-02-2018-124710-PM.webm)

But it all works fine. To test it, I exported a “restart” action and added it to the app menu.

Tell me about it

But who can remember to open the app menu and click “Restart”? That is just too cumbersome. Thankfully, flatpak has a solution for this: when you update an app that is running, it creates a marker file named

/app/.updated

inside the sandbox for each running instance.

That makes it very easy for the app to find out when it has been updated, by just monitoring this file. Once the file appears, it can pop up a dialog that offers the user to restart the newer version of the app. A good quality implementation of this will of course save and restore the state when doing this.
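
A trivial way to see the marker from a shell inside an already-running sandbox (one that was started before the update landed):

# the marker only appears in instances that were already running when the update arrived
test -e /app/.updated && echo "a newer version of this app has been deployed"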

(Screencast: https://blogs.gnome.org/mclasen/files/2018/08/Screencast-from-08-02-2018-125142-PM.webm)

Voilà, updates made easy!

You can find the working example in the portal-test repository.

A Fedora COPR for libinput git master

Posted by Peter Hutterer on August 01, 2018 01:07 AM

To make testing libinput git master easier, I set up a whot/libinput-git Fedora COPR yesterday. This repo gets the push triggers directly from GitLab so it will rebuild with whatever is currently on git master.

To use the COPR, simply run:


sudo dnf copr enable whot/libinput-git
sudo dnf upgrade libinput
This will give you the libinput package from git. It'll have a date/time/git sha based NVR, e.g. libinput-1.11.901-201807310551git22faa97.fc28.x86_64. Easy to spot at least.

To revert back to the regular Fedora package run:


sudo dnf copr disable whot/libinput-git
sudo dnf distro-sync "libinput-*"

Disclaimer: This is an automated build so not every package is tested. I'm running git master exclusively (from a ninja install) and I don't push to master unless the test suite succeeds. So the risk of ending up with a broken system is low.

On that note: if you are maintaining a similar repo for other distributions and would like me to add a push trigger in GitLab for automatic rebuilds, let me know.

Porting Coreboot to the 51NB X210

Posted by Matthew Garrett on July 31, 2018 05:28 AM
The X210 is a strange machine. A set of Chinese enthusiasts developed a series of motherboards that slot into old Thinkpad chassis, providing significantly more up-to-date hardware. The X210 has a Kabylake CPU, supports up to 32GB of RAM, has an NVMe-capable M.2 slot and has eDP support - and it fits into an X200 or X201 chassis, which means it also comes with a classic Thinkpad keyboard. We ordered some from a Facebook page (a process that involved wiring a large chunk of money to a Chinese bank which wasn't at all stressful), and a couple of weeks later they arrived. Once I'd put mine together I had a quad-core i7-8550U with 16GB of RAM, a 512GB NVMe drive and a 1920x1200 display. I'd transplanted over the drive from my XPS13, so I was running stock Fedora for most of this development process.

The other fun thing about it is that none of the firmware flashing protection is enabled, including Intel Boot Guard. This means running a custom firmware image is possible, and what would a ridiculous custom Thinkpad be without ridiculous custom firmware? A shadow of its potential, that's what. So, I read the Coreboot[1] motherboard porting guide and set to.

My life was made a great deal easier by the existence of a port for the Purism Librem 13v2. This is a Skylake system, and Skylake and Kabylake are very similar platforms. So, the first job was to just copy that into a new directory and start from there. The first step was to update the Inteltool utility so it understood the chipset - this commit shows what was necessary there. It's mostly just adding new PCI IDs, but it also needed some adjustment to account for the GPIO allocation being different on mobile parts when compared to desktop ones. One thing that bit me - Inteltool relies on being able to mmap() arbitrary bits of physical address space, and the kernel doesn't allow that if CONFIG_STRICT_DEVMEM is enabled. I had to disable that first.

The GPIO pins got dropped into gpio.h. I ended up just pushing the raw values into there rather than parsing them back into more semantically meaningful definitions, partly because I don't understand what these things do that well and largely because I'm lazy. Once that was done, on to the next step.

High Definition Audio devices (or HDA) have a standard interface, but the codecs attached to the HDA device vary - both in terms of their own configuration, and in terms of dealing with how the board designer may have laid things out. Thankfully the existing configuration could be copied from /sys/class/sound/card0/hwC0D0/init_pin_configs[2] and then hda_verb.h could be updated.

One more piece of hardware-specific configuration is the Video BIOS Table, or VBT. This contains information used by the graphics drivers (firmware or OS-level) to configure the display correctly, and again is somewhat system-specific. This can be grabbed from /sys/kernel/debug/dri/0/i915_vbt.
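
In shell terms, grabbing both blobs from the running vendor system looks roughly like this (paths as above; the VBT needs root and a mounted debugfs):

# HDA codec pin configuration, for hda_verb.h
cat /sys/class/sound/card0/hwC0D0/init_pin_configs

# Video BIOS Table
sudo cat /sys/kernel/debug/dri/0/i915_vbt > data.vbt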

A lot of the remaining platform-specific configuration has been split out into board-specific config files, and this also needed updating. Most stuff was the same, but I confirmed the GPE and genx_dec register values by using Inteltool to dump them from the vendor system and copy them over. lspci -t gave me the bus topology and told me which PCIe root ports were in use, and lsusb -t gave me port numbers for USB. That let me update the root port and USB tables.

The final code update required was to tell the OS how to communicate with the embedded controller. Various ACPI functions are actually handled by this autonomous device, but it's still necessary for the OS to know how to obtain information from it. This involves writing some ACPI code, but that's largely a matter of cutting and pasting from the vendor firmware - the EC layout depends on the EC firmware rather than the system firmware, and we weren't planning on changing the EC firmware in any way. Using ifdtool told me that the vendor firmware image wasn't using the EC region of the flash, so my assumption was that the EC had its own firmware stored somewhere else. I was ready to flash.

The first attempt involved isis' machine, using their Beaglebone Black as a flashing device - the lack of protection in the firmware meant we ought to be able to get away with using flashrom directly on the host SPI controller, but using an external flasher meant we stood a better chance of being able to recover if something went wrong. We flashed, plugged in the power and… nothing. Literally. The power LED didn't turn on. The machine was very, very dead.

Things like managing battery charging and status indicators are up to the EC, and the complete absence of anything going on here meant that the EC wasn't running. The most likely reason for that was that the system flash did contain the EC's firmware even though the descriptor said it didn't, and now the system was very unhappy. Worse, the flash wouldn't speak to us any more - the power supply from the Beaglebone to the flash chip was sufficient to power up the EC, and the EC was then holding onto the SPI bus desperately trying to read its firmware. Bother. This was made rather more embarrassing because isis had explicitly raised concern about flashing an image that didn't contain any EC firmware, and now I'd killed their laptop.

After some digging I was able to find EC firmware for a related 51NB system, and looking at that gave me a bunch of strings that seemed reasonably identifiable. Looking at the original vendor ROM showed very similar code located at offset 0x00200000 into the image, so I added a small tool to inject the EC firmware (basing it on an existing tool that does something similar for the EC in some HP laptops). I now had an image that I was reasonably confident would get further, but we couldn't flash it. Next step seemed like it was going to involve desoldering the flash from the board, which is a colossal pain. Time to sleep on the problem.

The next morning we were able to borrow a Dediprog SPI flasher. These are much faster than doing SPI over GPIO lines, and also support running the flash at different voltages. At 3.5V the behaviour was the same as we'd seen the previous night - nothing. According to the datasheet, the flash required at least 2.7V to run, but flashrom listed 1.8V as the next lower voltage so we tried that. And, amazingly, it worked - not reliably, but sufficiently. Our hypothesis is that the chip is marginally able to run at that voltage, but that the EC isn't - we were no longer powering the EC up, so we could communicate with the flash. After a couple of attempts we were able to write enough that we had EC firmware on there, at which point we could shift back to flashing at 3.5V because the EC was leaving the flash alone.

So, we flashed again. And, amazingly, we ended up staring at a UEFI shell prompt[3]. USB wasn't working, and nor was the onboard keyboard, but we had graphics and were executing actual firmware code. I was able to get USB working fairly quickly - it turns out that Linux numbers USB ports from 1 and the FSP numbers them from 0, and fixing that up gave us working USB. We were able to boot Linux! Except there were a whole bunch of errors complaining about EC timeouts, and also we only had half the RAM we should.

After some discussion on the Coreboot IRC channel, we figured out the RAM issue - the Librem13 only has one DIMM slot. The FSP expects to be given a set of i2c addresses to probe, one for each DIMM socket. It is then able to read back the DIMM configuration and configure the memory controller appropriately. Running i2cdetect against the system SMBus gave us a range of devices, including one at 0x50 and one at 0x52. The detected DIMM was at 0x50, which made 0x52 seem like a reasonable bet - and grepping the tree showed that several other systems used 0x52 as the address for their second socket. Adding that to the list of addresses and passing it to the FSP gave us all our RAM.

So, now we just had to deal with the EC. One thing we noticed was that if we flashed the vendor firmware, ran it, flashed Coreboot and then rebooted without cutting the power, the EC worked. This strongly suggested that there was some setup code happening in the vendor firmware that configured the EC appropriately, and if we duplicated that it would probably work. Unfortunately, figuring out what that code was was difficult. I ended up dumping the PCI device configuration for the vendor firmware and for Coreboot in case that would give us any clues, but the only thing that seemed relevant at all was that the LPC controller was configured to pass io ports 0x4e and 0x4f to the LPC bus with the vendor firmware, but not with Coreboot. Unfortunately the EC was supposed to be listening on 0x62 and 0x66, so this wasn't the problem.

I ended up solving this by using UEFITool to extract all the code from the vendor firmware, and then disassembled every object and grepped them for port io. x86 systems have two separate io buses - memory and port IO. Port IO is well suited to simple devices that don't need a lot of bandwidth, and the EC is definitely one of these - there's no way to talk to it other than using port IO, so any configuration was almost certainly happening that way. I found a whole bunch of stuff that touched the EC, but was clearly depending on it already having been enabled. I found a wide range of cases where port IO was being used for early PCI configuration. And, finally, I found some code that reconfigured the LPC bridge to route 0x4e and 0x4f to the LPC bus (explaining the configuration change I'd seen earlier), and then wrote a bunch of values to those addresses. I mimicked those, and suddenly the EC started responding.

It turns out that the writes that made this work weren't terribly magic. PCs used to have a SuperIO chip that provided most of the legacy port functionality, including the floppy drive controller and parallel and serial ports. Individual components (called logical devices, or LDNs) could be enabled and disabled using a sequence of writes that was fairly consistent between vendors. Someone on the Coreboot IRC channel recognised that the writes that enabled the EC were simply using that protocol to enable a series of LDNs, which apparently correspond to things like "Working EC" and "Working keyboard". And with that, we were done.

Coreboot doesn't currently have ACPI support for the latest Intel graphics chipsets, so right now my image doesn't have working backlight control. Backlight control also turned out to be interesting. Most modern Intel systems handle the backlight via registers in the GPU, but the X210 uses the embedded controller (possibly because it supports both LVDS and eDP panels). This means that adding a simple display stub is sufficient - all we have to do on a backlight set request is store the value in the EC, and it does the rest.

Other than that, everything seems to work (although there's probably a bunch of power management optimisation to do). I started this process knowing almost nothing about Coreboot, but thanks to the help of people on IRC I was able to get things working in about two days of work[4] and now have firmware that's about as custom as my laptop.

[1] Why not Libreboot? Because modern Intel SoCs haven't had their memory initialisation code reverse engineered, so the only way to boot them is to use the proprietary Intel Firmware Support Package.
[2] Card 0, device 0
[3] After a few false starts - it turns out that the initial memory training can take a surprisingly long time, and we kept giving up before that had happened
[4] Spread over 5 or so days of real time

libinput now has ReadTheDocs-style documentation

Posted by Peter Hutterer on July 30, 2018 04:16 AM

libinput's documentation started out as doxygen of the developer API - they were the main target 4 years ago. Over time, more and more extra documentation was added and now most of it is aimed at users (for self-debugging and troubleshooting or just to explain concepts and features). Unfortunately, with doxygen this all ends up in the "Related Pages". The developer API documentation itself became a less important part; by now all the major compositors have libinput support and it doesn't change much. So while it needs to be there, most of the traffic goes to the user documentation (I think; it's not like I'm running stats).

Something more suited for prose-style docs was needed. I prefer the RTD look so last week I converted most of the libinput documentation into RST format and it's now built with sphinx and the RTD theme. Same URL as before: http://wayland.freedesktop.org/libinput/doc/latest/.

The biggest difference is that the Developer API Documentation (still doxygen) is now at http://wayland.freedesktop.org/libinput/doc/latest/api/, (i.e. add /api/ to the link). If you're programming against libinput's API (e.g. because you're writing a compositor), that's where you need to go.

It's still basically the same content as before, I'll be tidying things up and adding to it over the next few weeks. Hopefully without breaking existing links. There is probably detritus from the doxygen → rst change floating around, I'll be fixing that too. If you want to help out please don't hesitate, I'll do my best to be quick to review any merge requests.

ASG! 2018 CfP Closes TODAY

Posted by Lennart Poettering on July 29, 2018 10:00 PM

The All Systems Go! 2018 Call for Participation Closes TODAY!

The Call for Participation (CFP) for All Systems Go! 2018 will close TODAY, on 30th of July! We’d like to invite you to submit your proposals for consideration to the CFP submission site quickly!

ASG image

All Systems Go! is everybody's favourite low-level Userspace Linux conference, taking place in Berlin, Germany on September 28-30, 2018.

For more information please visit our conference website!

The Ascendance of nftables

Posted by Dan Williams on July 27, 2018 07:20 PM
(Figure: The Sun sets on iptables; image by fdecomite, CC BY 2.0)

iptables is the default Linux firewall and packet manipulation tool. If you’ve ever been responsible for a Linux machine (aside from an Android phone perhaps) then you’ve had to touch iptables. It works, but that’s about the best thing anyone can say about it.

At Red Hat we’ve been working hard to replace iptables with its successor: nftables. Which has actually been around for years but for various reasons was unable to completely replace iptables.  Until now.

What’s Wrong With iptables?

iptables is slow. It processes rules linearly which was fine in the days of 10/100Mbit ethernet. But we can do better, and nftables does; it uses maps and concatenations to touch packets as little as possible for a given action.

Most of nftables’ intelligence is in the userland tools rather than the kernel, reducing the possibility for downtime due to kernel bugs. iptables puts most of its logic in the kernel and you can guess where that leads.

When adding or updating even a single rule, iptables must read the entire existing table from the kernel, make the change, and send the whole thing back. iptables also requires locking workarounds to prevent parallel processes from stomping on each other or returning errors. Updating an entire table requires some synchronization across all CPUs meaning the more CPUs you have, the longer it takes. These issues cause problems in container orchestration systems (like OpenShift and Kubernetes) where 100,000 rules and 15 second iptables-restore runs are not uncommon. nftables can update one or many rules without touching any of the others.

iptables requires duplicate rules for IPv4 and IPv6 packets and for multiple actions, which just makes the performance and maintenance problems worse. nftables allows the same rule to apply to both IPv4 and IPv6 and supports multiple actions in the same rule, keeping your ruleset small and simple.

If you’ve ever had to log or debug iptables, you know how awful that can be. nftables allows logging and other actions in the same rule, saving you time, effort, and cirrhosis of the liver. It also provides the “nft monitor trace” command to watch how rules apply to live packets.
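
As a small sketch of both points (not a real firewall policy): a single rule in the inet family matches IPv4 and IPv6 alike, and logs and accepts in one go:

# the inet family covers IPv4 and IPv6 with one table
sudo nft add table inet filter
sudo nft add chain inet filter input '{ type filter hook input priority 0; }'
# one rule, two actions: log the packet, then accept it
sudo nft add rule inet filter input tcp dport 22 log accept
sudo nft list ruleset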

nftables also uses the same netlink API infrastructure as other modern kernel systems like /sbin/ip, the Wi-Fi stack, and others, so it’s easier to use in other programs without resorting to command-line parsing and execing random binaries.

Finally, nftables has integrated set support with consistent syntax rather than requiring a separate tool like ipset.

What about eBPF?

You might have heard that eBPF will replace everything and give everyone a unicorn. It might, if/when it gets enhancements for accountability, traceability, debuggability, auditability, and broad driver support for XDP. But nftables has been around for years and has most (all?) of these things today.

nftables Everywhere

I’d like to highlight the great work by members of my team to bring nftables over the finish line:

  • Phil Sutter is almost done with compat versions of arptables and ebtables and has been adding testcases everywhere. He also added a JSON interface to libnftables (much like /sbin/ip) for easier programmatic use which firewalld will use in the near future.
  • Eric Garver updated firewalld (the default firewall manager on Fedora, RHEL, and other distros) to use nftables by default. This change alone will seamlessly flip the nftables switch for countless users. It’s a huge deal.
  • Florian Westphal figured out how to make nftables and iptables NAT coexist in the kernel. He also fixed up the iptables compat commands and handles the upstream releases to make sure we can actually use this stuff.
  • And of course the upstream netfilter community!

Thanks iptables; it’s been a nice ride. But nftables is better.

 

Why it's not a good idea to handle evdev directly

Posted by Peter Hutterer on July 25, 2018 02:34 AM

Gather round children, it's story time. Especially for you children who lurk on /r/linux and think you may learn something there. Today, I'll tell you a horror story. The one where we convert kernel input events into touchpad events, with the subtle subtitle of "friends don't let friends handle evdev events".

The question put forward is "why do we need libinput at all", when, as frequently suggested on the usual websites, it's sufficient to just read evdev data and there's really no need for libinput. That is of course true. You can use evdev events from the kernel directly. Did you know that the events the kernel gives you are absolute coordinates? And that not all touchpads have buttons? Or that some touchpads have specific event sequences that need to be filtered? No? Well, boy, are you in for a few surprises! Anyway, let's go and handle evdev events ourselves and write our own libmyinput.

How do we know something is a touchpad? Well, we look at the exposed evdev bits. We need ABS_X, ABS_Y and BTN_TOOL_FINGER but don't want INPUT_PROP_DIRECT. If the latter bit is set then we have a touchscreen (probably). We don't actually care about buttons here, that comes later. ABS_X and ABS_Y give us device-absolute coordinates. On touch down you get the evdev frame of "a finger is down at x/y device units from the top-left". As you move around, you get the x/y coordinate updates. The data itself is exactly the same as you would get from a touchscreen, but we know it's a touchpad because we queried the other bits at startup. So your first job is to convert the absolute x/y coordinates to deltas by subtracting the previous position.
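
If you want to peek at those bits on your own hardware (purely to follow along), evemu-describe dumps the event codes and input properties a device advertises:

# the event node is an example; find the touchpad's node in /proc/bus/input/devices
sudo evemu-describe /dev/input/event4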

Touchpads have different resolutions for x and y so a delta of 10/10 does not mean it's a 45-degree movement. Better check with the resolution to convert this to physical distances to be on the safe side. Oh, btw, the axes aren't reliable. The min/max ranges and the resolutions are wrong on a large number of touchpads. Luckily systemd fixes this for you with the 60-evdev.hwdb. But I should probably note that hwdb only exists because of libinput... Either way, you don't have to care about it because the road's already paved. You're welcome.

Oh wait, you do have to care a little because there are touchpads (e.g. HP Stream 11, ZBook Studio G3, ...) where bits are missing or wrong. So you better write a device database that tells you when you have to correct the evdev bits. You could implement this as a config option but that's just saying "I know what's wrong here, I know how to fix it but I'm still going to make you google for it and edit a local configuration file to make it work". You could treat your users this way, but you really shouldn't.

As you're happily processing your deltas, you notice that on some touchpads you get motion before you touch the touchpad. Ooops, we need a way to tell whether a finger is down. Luckily the kernel gives you BTN_TOUCH for that event, so you switch your implementation to only calculate deltas when BTN_TOUCH is set. But then you realise that is effectively a hardcoded threshold in the kernel and does not match a lot of devices. Some devices require too-hard finger pressure to trigger BTN_TOUCH, others send it on super-light pressure or even while hovering. After grinding some enamel away you find that many touchpads give you ABS_PRESSURE. Awesome, let's make touches pressure-based instead. Let's use a threshold, no, I mean a device-specific threshold (because if two touchpads would be the same the universe will stop doing whatever a universe does, I clearly haven't thought this through). Luckily we already have the device database so we just add the thresholds there.

Oh, if you want this to run on an Apple touchpad better implement touch size handling (ABS_MT_TOUCH_MAJOR/ABS_MT_TOUCH_MINOR). These axes give you the size of the touching ellipse which is great. Except that the value is just an arbitrary number range that has no relation to physical properties, so better update your database so you can add those thresholds.

Ok, now we have single-finger handling in our libnotinput. Let's add some sophisticated touchpad features like button clicks. Buttons are easy, the kernel gives us BTN_LEFT and BTN_RIGHT and, if you're lucky, BTN_MIDDLE. Unless you have a clickpad of course in which case you only ever get BTN_LEFT because the whole touchpad can be depressed (much like you, if you continue writing your own evdev handling). Those clickpads are in the majority of laptops these days, so we have to deal with them. The two approaches we have are "software button areas" and "clickfinger". The former detects where your finger is when you push the touchpad down - if it's in the bottom right corner we convert the kernel's BTN_LEFT to a BTN_RIGHT and pass that on. Decide how big the buttons will be (note: some touchpads that need software buttons are only 50mm high, others exceed 100mm height). Whatever size you choose, it's an invisible line on the touchpad. Do you know yet how you will handle a finger that moves from outside the button area into the button area before the click? Or the other way round? Maybe add this to your todo list for fixing later.

Maybe "clickfinger" is easier? It counts how many fingers are on the touchpad when clicking (1 finger == left click, 2 fingers == right click, 3 fingers == middle click). Much easier, except that so far we only handle one finger. The easy fix is to use BTN_TOOL_DOUBLETAP and BTN_TOOL_TRIPLETAP which are bitflags that tell you when a second/third finger are down. Add that to your libthisisnotlibinput. Coincidentally, users often click with their thumb while moving. So you have one finger moving the pointer, then a thumb click. Two fingers down but the user doesn't perceive it as such, this should be a left click. Oops, we don't actually know where the second finger is.

Let's switch our libstillnotlibinput to use ABS_MT_POSITION_X and ABS_MT_POSITION_Y because that gives us per-finger position information (once you understand how the kernel's MT protocol slots work). And when I say "switch" of course I meant "add" because there are still touchpads in use that don't support multitouch so you get to keep both implementations. There are also a bunch of touchpads that can give you the position of two fingers but not of the third. Wipe that tear away and pencil that into your todo list. I haven't mentioned semi-mt devices yet that will give you multitouch position data for two fingers but it won't track them correctly - the first touch position is always the top/left of the bounding box, the second touch is always the bottom/right of the bounding box. Do the right thing for our libwhathaveidone and just pretend semi-mt devices are single-touch touchpads. libinput (the real one) does the same because my sanity is something to be cherished.

Oh, on another note, some touchpads don't have any buttons (some Wacom tablets are large touchpads). Add that to your todo list. You wanted middle buttons to work? Few touchpads have a middle button (clickpads never do anyway). Better write a middle button emulation system that generates BTN_MIDDLE when both buttons are pressed. Or when a finger is on the left and another finger is on the right software button. Or when a finger is in a virtual middle button area. All these need to be present because if not, you get dissed by users for not implementing their favourite interaction method.

So we're several paragraphs in and so far we have: finger tracking and some button handling. And a bunch of things on the todo list. We haven't even started with other fancy features like edge scrolling, two-finger scrolling, pinch/swipe gestures or thumb and palm detection. Oh, and you're not yet handling any other devices like graphics tablets which are a world of their own. If you think all the other features and devices are any less of a mess... well, an Austrian comedian once said (paraphrased): "optimism is just a fancy word for ignorance".

All this is just handling features that users have come to expect. Examples for non-features that you'll have to implement: on some Lenovo series (*50 and newer) you will get a pointer jump after a series of events that only have pressure information. You'll have to detect and discard that jump. The HP Pavilion DM4 touchpad has random jumps in the slot data. Synaptics PS/2 touchpads may 'randomly' end touches and restart them on the next event frame 10ms later. If you don't handle that you'll get ghost taps. And so on and so forth.

So as you, happily or less so, continue writing your libthisismoreworkthanexpected you'll eventually come to realise that you're just reimplementing libinput. Congratulations or condolences, whichever applies.

libinput's raison d'etre is that it deals with all the mess above so that compositor authors can be blissfully unaware of all this. That's the reason why all the major/general-purpose compositors have switched to libinput. That's the reason most distributions now use libinput with the X server (through the xf86-input-libinput driver). libinput has made some design decisions that you may disagree with but honestly, that's life. Deal with it. It doesn't even do all I want and I wrote >90% of it. Suggesting that you can just handle evdev directly is like suggesting you can use GPS coordinates directly to navigate. Sure you can, but there's a reason why people instead use a Tom Tom or Google Maps.

ASG! 2018 CfP Closes Soon

Posted by Lennart Poettering on July 22, 2018 10:00 PM

The All Systems Go! 2018 Call for Participation Closes in One Week!

The Call for Participation (CFP) for All Systems Go! 2018 will close in one week, on 30th of July! We’d like to invite you to submit your proposals for consideration to the CFP submission site quickly!

ASG image

Notification of acceptance and non-acceptance will go out within 7 days of the closing of the CFP.

All topics relevant to foundational open-source Linux technologies are welcome. In particular, however, we are looking for proposals including, but not limited to, the following topics:

  • Low-level container executors and infrastructure
  • IoT and embedded OS infrastructure
  • BPF and eBPF filtering
  • OS, container, IoT image delivery and updating
  • Building Linux devices and applications
  • Low-level desktop technologies
  • Networking
  • System and service management
  • Tracing and performance measuring
  • IPC and RPC systems
  • Security and Sandboxing

While our focus is definitely more on the user-space side of things, talks about kernel projects are welcome, as long as they have a clear and direct relevance for user-space.

For more information please visit our conference website!

Flatpak – a look behind the portal

Posted by Matthias Clasen on July 19, 2018 04:58 PM

Flatpak allows sandboxed apps to interact with the rest of the system via portals. Portals are simply D-Bus services that are designed to be safe to expose to untrusted apps.

Principles

There are several principles that have guided the design of the existing portals.

 Keep the user in control

To achieve this, most portals will show a dialog to let the user accept or deny the application’s request. This is not a hard rule — in some cases, a dialog is just not practical.

Avoid yes/no questions

Direct questions about permissions tend to be dismissed without much thought, since they get in the way of the task at hand. Therefore, portals avoid this kind of question whenever possible and instead just let the user get on with the task.

For example, when an app is requesting to open a file on the host, we just present the user with a file chooser. By selecting a file, the user implicitly grants the application access to the file. Or he can cancel the file selection and implicitly deny the application’s request.

Don’t be annoying

Nothing is worse than having to answer the same question over and over. Portals make use of a database to record previous decisions and avoid asking repeatedly for the same thing.

Practice

The database used by portals is called the permission store. The permission store is organized in tables, with a table for each portal that needs one. It has a D-Bus api, but it is more convenient to explore it using the recently added flatpak commands:

flatpak permission-list
flatpak permission-list devices
flatpak permission-list desktop-used-apps video/webm

The first command will list all permissions in all tables, the second will show the content of the “devices” table, and the last one will show just the row for video/webm in the “desktop-used-apps” table.

There are also commands that deal with permissions on a per-application basis.

flatpak permission-show org.gnome.Epiphany
flatpak permission-reset org.gnome.Epiphany

The first command will show all the permissions that apply to the application, the second will remove all permissions for the application.

And more…

The most important table in the permission store is the “documents” table, where the documents portal stores information about host files that have been exported for applications. The documents portal makes the exported files available via a fuse filesystem at

/run/user/1000/doc

A useful subdirectory here is by-app, where the exported files are visible on a per-application basis (when setting up a sandbox, flatpak makes only this part of the document store available inside the sandbox).
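
A quick look from the host side (substitute your own uid):

# per-application view of the exported documents
ls /run/user/$(id -u)/doc/by-app/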

It is instructive to browse this filesystem, but flatpak also has a dedicated set of commands for exploring the contents of the documents portal.

flatpak document-list
flatpak document-list org.gnome.Epiphany

The first command lists all exported files, the second shows only the files that are exported for an individual application.

flatpak document-info $HOME/example.pdf

This command shows information about a file that is exported in the document portal, such as which applications have access to it, and what they are allowed to do.

Lastly, there are document-export and document-unexport commands that let you add or remove files from the document portal.

Summary

If you want to explore how portals work, or just need to double-check which files an app has access to, flatpak has tools that let you do so conveniently.

The Flatpak BoF at Guadec

Posted by Matthias Clasen on July 14, 2018 03:54 PM

Here is a quick summary of the Flatpak BoF that happened last week at Guadec.

1.0 approaching fast

We started by going over the list of outstanding 1.0 items. It is a very short list, and they should all be included in an upcoming 0.99.3 release.

  • Alex wants to add information about renaming to desktop files
  • Owen will hold his OCI appstream work for 1.1 at this point
  • James is interested in getting more information from snapd to portal backends, but this also does not need to block 1.0, and can be pulled into a 1.1 portals release
  • Matthias will review the open portal issues and make sure everything is in good shape for a 1.0 portal release

1.0 preparation

Alex will do a 0.99.3 release with all outstanding changes for 1.0 (Update: this release has happened by now). Matthias will work with Allan and Bastien on the press release and other materials. Nick is very interested in having information about runtime availability, lifetime and stability easily available on the website for 1.0.

We agreed to remove the ‘beta’ label from the flathub website.

Post 1.0 plans

There was a suggestion that we should have an autostart portal. This request spawned a bigger discussion of application life-cycle control, background apps and services. We need to come up with a design for these intertwined topics before adding portals for it.

After 1.0, Alex wants to focus on tests and ci for a while. One idea in this area is to have a scriptable test app that can make portal requests.

Automatic migration on renames or EOL is on Endless’ wishlist.

Exporting repositories in local networks is a feature that Endless has, but it may end up upstream in ostree instead of flatpak.

Everybody agreed that GNOME Software should merge apps from different sources in a better way.

For runtimes, the GNOME release team aims to have the GNOME runtime built using buildstream, on top of the freedesktop 1.8 runtime. This may or may not happen in time for GNOME 3.30.

meson fails with "ERROR: Native dependency 'foo' not found" - and how to fix it

Posted by Peter Hutterer on July 08, 2018 11:46 PM

A common error when building from source is something like the error below:


meson.build:50:0: ERROR: Native dependency 'foo' not found
or a similar error

meson.build:63:0: ERROR: Invalid version of dependency, need 'foo' ['>= 1.1.0'] found '1.0.0'.
Seeing that can be quite discouraging, but luckily, in many cases it's not too difficult to fix. As usual, there are many ways to get to a successful result; I'll describe what I consider the simplest.

What does it mean? Dependencies are simply libraries or tools that meson needs to build the project. Usually these are declared like this in meson.build:


dep_foo = dependency('foo', version: '>= 1.1.0')
In human words: "we need the development headers for library foo (or 'libfoo') of version 1.1.0 or later". meson uses the pkg-config tool in the background to resolve that request. If we require package foo, pkg-config searches for a file foo.pc in the following directories:
  • /usr/lib/pkgconfig,
  • /usr/lib64/pkgconfig,
  • /usr/share/pkgconfig,
  • /usr/local/lib/pkgconfig,
  • /usr/local/share/pkgconfig
The error message simply means pkg-config couldn't find the file and you need to install the matching package from your distribution or from source.
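
If you want to double-check what pkg-config itself sees before re-running meson, you can query it directly (foo is a placeholder name here, as above):

$> pkg-config --variable pc_path pkg-config
$> pkg-config --exists foo; echo $?
$> pkg-config --modversion foo

The first command prints pkg-config's built-in search path, the second prints 0 if foo.pc was found, and the third prints the version that pkg-config resolved.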

An important note here: in most cases we need the development headers of said library; installing just the library itself is not sufficient. After all, we're trying to build against it, not merely run against it.

What package provides the foo.pc file?

In many cases the package you need is the development variant of the library's package name. Try foo-devel (Fedora, RHEL, SuSE, ...) or foo-dev (Debian, Ubuntu, ...). yum and dnf provide a great shortcut to install any pkg-config dependency:


$> dnf install "pkgconfig(foo)"

$> yum install "pkgconfig(foo)"
will automatically search and install the right package, including its dependencies.
apt-get requires a bit more effort:

$> apt-get install apt-file
$> apt-file update
$> apt-file search --package-only foo.pc
foo-dev
$> apt-get install foo-dev
For those running Arch and pacman, the sequence is:

$> pacman -S pkgfile
$> pkgfile -u
$> pkgfile foo.pc
extra/foo
$> pacman -S extra/foo
Once that's done you can re-run meson and see if all dependencies have been met. If more packages are missing, follow the same process for the next file.

Any users of other distributions - let me know how to do this on yours and I'll update the post.

My version is wrong!

It's not uncommon to see the following error after installing the right package:


meson.build:63:0: ERROR: Invalid version of dependency, need 'foo' ['>= 1.1.0'] found '1.0.0'.
Now you're stuck and you have a problem. What this means is that the package version your distribution provides is not new enough to build your software. This is where the simple solutions end and it all gets a bit more complicated - with more potential errors. Unless you are willing to go into the deep end, I recommend moving on and accepting that you can't have the newest bits on an older distribution. Because now you have to build the dependencies from source and that may then require you to build their dependencies from source and before you know it you've built 30 packages. If you're willing, read on, otherwise - sorry, you won't be able to run your software today.

Manually installing dependencies

Now you're in the deep end, so be aware that you may see more complicated errors in the process. First of all you need to figure out where to get the source from. I'll now use cairo as an example instead of foo so you see actual data. On rpm-based distributions like Fedora run dnf or yum:


$> dnf info cairo-devel # or yum info cairo-devel
Loaded plugins: auto-update-debuginfo, langpacks
Installed Packages
Name : cairo-devel
Arch : x86_64
Version : 1.13.1
Release : 0.1.git337ab1f.fc20
Size : 2.4 M
Repo : installed
From repo : fedora
Summary : Development files for cairo
URL : http://cairographics.org
License : LGPLv2 or MPLv1.1
Description : Cairo is a 2D graphics library designed to provide high-quality
: display and print output.
:
: This package contains libraries, header files and developer
: documentation needed for developing software which uses the cairo
: graphics library.
The important field here is the URL line - go to that URL and you'll find the source tarballs. That should be true for most projects but you may need to google for the package name and hope. Search for the tarball with the right version number and download it. On Debian and related distributions, cairo is provided by the libcairo2-dev package. Run apt-cache show on that package:

$> apt-cache show libcairo2-dev
Package: libcairo2-dev
Source: cairo
Version: 1.12.2-3
Installed-Size: 2766
Maintainer: Dave Beckett <dajobe>
Architecture: amd64
Provides: libcairo-dev
Depends: libcairo2 (= 1.12.2-3), libcairo-gobject2 (= 1.12.2-3),[...]
Suggests: libcairo2-doc
Description-en: Development files for the Cairo 2D graphics library
Cairo is a multi-platform library providing anti-aliased
vector-based rendering for multiple target backends.
.
This package contains the development libraries, header files needed by
programs that want to compile with Cairo.
Homepage: http://cairographics.org/
Description-md5: 07fe86d11452aa2efc887db335b46f58
Tag: devel::library, role::devel-lib, uitoolkit::gtk
Section: libdevel
Priority: optional
Filename: pool/main/c/cairo/libcairo2-dev_1.12.2-3_amd64.deb
Size: 1160286
MD5sum: e29852ae8e8e5510b00b13dbc201ce66
SHA1: 2ed3534d02c01b8d10b13748c3a02820d10962cf
SHA256: a6099cfbcc6bd891e347dd9abc57b7f137e0fd619deaff39606fd58f0cc60d27
In this case it's the Homepage line that matters, but the process of downloading tarballs is the same as above. For Arch users, the interesting line is URL as well:

$> pacman -Si cairo
Repository : extra
Name : cairo
Version : 1.12.16-1
Description : Cairo vector graphics library
Architecture : x86_64
URL : http://cairographics.org/
Licenses : LGPL MPL
....

Now to the complicated bit: In most cases, you shouldn't install the new version over the system version because you may break other things. You're better off installing the dependency into a custom folder ("prefix") and point pkg-config to it. So let's say you downloaded the cairo tarball, now you need to run:


$> mkdir $HOME/dependencies/
$> tar xf cairo-someversion.tar.xz
$> cd cairo-someversion
$> autoreconf -ivf
$> ./configure --prefix=$HOME/dependencies
$> make && make install
$> export PKG_CONFIG_PATH=$HOME/dependencies/lib/pkgconfig:$HOME/dependencies/share/pkgconfig
# now go back to original project and run meson again
So you create a directory called dependencies and install cairo there. This will install cairo.pc as $HOME/dependencies/lib/pkgconfig/cairo.pc. Now all you need to do is tell pkg-config that you want it to look there as well - so you set PKG_CONFIG_PATH. If you re-run meson in the original project, pkg-config will find the new version and meson should succeed. If you have multiple packages that all require a newer version, install them into the same path and you only need to set PKG_CONFIG_PATH once. Remember you need to set PKG_CONFIG_PATH in the same shell as you are running meson from.

In the case of dependencies that use meson, you replace autotools and make with meson and ninja:


$> mkdir $HOME/dependencies/
$> tar xf foo-someversion.tar.xz
$> cd foo-someversion
$> meson builddir -Dprefix=$HOME/dependencies
$> ninja -C builddir install
$> export PKG_CONFIG_PATH=$HOME/dependencies/lib/pkgconfig:$HOME/dependencies/share/pkgconfig
# now go back to original project and run meson again

If you keep seeing the version error the most common problem is that PKG_CONFIG_PATH isn't set in your shell, or doesn't point to the new cairo.pc file. A simple way to check is:


$> pkg-config --modversion cairo
1.13.1
Is the version number the one you installed or the system one? If it is the system one, you have a typo in PKG_CONFIG_PATH, just re-set it. If it still doesn't work do this:

$> cat $HOME/dependencies/lib/pkgconfig/cairo.pc
prefix=/usr
exec_prefix=/usr
libdir=/usr/lib64
includedir=/usr/include

Name: cairo
Description: Multi-platform 2D graphics library
Version: 1.13.1

Requires.private: gobject-2.0 glib-2.0 >= 2.14 [...]
Libs: -L${libdir} -lcairo
Libs.private: -lz -lz -lGL
Cflags: -I${includedir}/cairo
If the Version field matches what pkg-config returns, then you're set. If not, keep adjusting PKG_CONFIG_PATH until it works. There is a rare case where the Version field in the installed library doesn't match what the tarball said. That's a defective tarball and you should report this to the project, but don't worry, this hardly ever happens. In almost all cases, the cause is simply PKG_CONFIG_PATH not being set correctly. Keep trying :)

Let's assume you've managed to build the dependencies and want to run the newly built project. The only problem is: because you built against a newer library than the one on your system, you need to point it to use the new libraries.


$> export LD_LIBRARY_PATH=$HOME/dependencies/lib
and now you can, in the same shell, run your project.

Good luck!

Flatpak, making contribution easy

Posted by Matthias Clasen on July 07, 2018 04:09 PM

One vision that i’ve talked abut in the past is that moving to flatpak could make it much easier to contribute to applications.

Fast-forward 3 years, and the vision is (almost) here!

Every application on flathub has a Sources extension that you can install just like anything else from a flatpak repo:

flatpak install flathub org.seul.pingus.Sources

This extension contains a flatpak manifest which lists the exact revisions of all the sources that went into the build. This lets you reproduce the build — if you can find the manifest!

Assuming you install the sources per-user, the manifest is here (using org.seul.pingus as an example):

$HOME/.local/share/flatpak/runtime/org.seul.pingus.Sources/x86_64/stable/active/files/manifest/org.seul.pingus.json

And you can build it like this:

flatpak-builder build org.seul.pingus.json

I said the vision is almost, but not quite there. Here is why: gnome-builder also has a way to kickstart a project from a manifest:

gnome-builder --manifest org.seul.pingus.json

But sadly, this currently crashes. I filed an issue for it, and it will hopefully work very soon. Next step, flip-to-hack!

Affiliated Vendors on the LVFS

Posted by Richard Hughes on July 02, 2018 04:33 PM

We’re just about to deploy another feature to the LVFS that might be interesting to some of you. First, some nomenclature:

OEM: Original Equipment Manufacturer, the user-known company name on the outside of the device, e.g. Sony, Panasonic, etc
ODM: Original Design Manufacturer, typically making parts for one or more OEMs, e.g. Foxconn, Compal

There are some OEMs where the ODM is the entity responsible for uploading the firmware to the LVFS. The per-device QA is typically done by the OEM, rather than the ODM, although it can be both. Before today we didn’t have a good story about how to handle this other than having a “fake” oem_odm@oem.com user account that was shared by all users at the ODM. A shared fake account isn’t a good design from a security or privacy point of view and so we needed something better.

The LVFS administrator can now mark one vendor as an “affiliate” of another. This gives the ODM permission to upload firmware that is “owned” by the OEM on the LVFS, and that appears in the OEM embargo metadata. The OEM QA team is also able to edit the update description, move the firmware to testing and stable (or delete it entirely) as required. The ODM vendor account also doesn’t have to appear in the search results or the vendor table, making it hidden to all users except OEMs.

This also means that if an ODM like Foxconn builds firmware for two different OEMs, they have to specify which vendor should “own” the firmware at upload time. This is achieved with a simple selection widget on the upload page, but will only be shown if affiliations have been set up. The ODM is able to manage their user accounts directly, either using local accounts with passwords, or ODM-specific OAuth which is the preferred choice as it means there is only one place to manage credentials.

If anyone needs more information, please just email me or leave a comment below. Thanks!

fwupdate is {nearly} dead; long live fwupd

Posted by Richard Hughes on July 02, 2018 03:58 PM

If the title confuses you, you’re not the only one that’s been confused by the fwupdate and fwupd project names. The latter used the shared library of the former to schedule UEFI updates, with the former also providing the fwup.efi secure-boot signed binary that actually runs the capsule update for the latter.

In Fedora the only user of libfwupdate was fwupd and the fwupdate command line tool itself. It makes complete sense to absorb the redundant libfwupdate library interface into the uefi plugin in fwupd. Benefits I can see include:

  • fwupd and fwupdate are very similar names; a lot of ODMs and OEMs have been confused, especially the ones not so Linux savvy.
  • fwupd already depends on efivar for other things, and so there are no additional deps in fwupd.
  • Removal of an artificial library interface, with all the soname and package-induced pain. No matter how small, maintaining any project is a significant use of resources.
  • The CI and translation hooks are already in place for fwupd, and we can use the merging of projects as a chance to write lots of low-level tests for all the various hooks into the system.
  • We don’t need to check for features or versions in fwupd, we can just develop the feature (e.g. the BGRT localised background image) all in one branch without #ifdefs everywhere.
  • We can do cleverer things whilst running as a daemon, for instance uploading the fwup.efi to the ESP as required rather than installing it as part of the distro package.
The last point is important; several distros don’t allow packages to install files on the ESP and this was blocking fwupdate being used by them. Also, 95% of the failures reported to the LVFS are from Arch Linux users who didn’t set up the ESP correctly as the wiki says. With this new code we can likely reduce the reported error rate by several orders of magnitude.

Note, fwupd doesn’t actually obsolete fwupdate, as the latter might still be useful if you’re testing capsule updates on something super-embedded that doesn’t ship GLib or D-Bus. We do ship a D-Bus-less fwupdate-compatible command line in /usr/libexec/fwupd/fwupdate if you’re using the old CLI from a shell script. We’re all planning to work on the new integrated fwupd version, but I’m sure there’ll be some sharing of fixes between the projects as libfwupdate is shipped in a lot of LTS releases like RHEL 7.

All of this new goodness is available in fwupd git master, which will be the new 1.1.0 release probably available next week. The 1_0_X branch (which depends on libfwupdate) will be maintained for a long time, and is probably the better choice to ship in LTS releases at the moment. Any distros that ship the new 1.1.x fwupd versions will need to ensure that the fwup.efi files are signed properly if they want SecureBoot to work; in most cases just copying over the commands from the fwupdate package is all that is required. I’ll be updating Fedora Rawhide with the new package as soon as it’s released.

Comments welcome.

Flatpak in detail, part 3

Posted by Matthias Clasen on July 02, 2018 08:16 AM

The previous post in this series looked at runtimes and filesystem organization. Here, we’ll take a look at the flatpak sandbox and explore how the world looks to a running flatpak app.

The sandbox, a view from the inside

Flatpak uses container technologies to set up a sandbox for the apps it runs. The name container brings to mind an image of a vessel with solid walls, which is a bit misleading.

In reality, a container is just a plain old process which talks directly to the kernel and other processes on the system. It is the job of the container management system, in this case flatpak, to launch the process with reduced privileges to limit the harm it can do to the rest of the system. To do this, flatpak uses the same kernel features that tools like docker use: namespaces, cgroups, seccomp.

Processes

One namespace that is used by flatpak is the pid namespace, which hides processes outside the sandbox from the sandboxed app.  If you run ps inside the sandbox, you’ll see something like this:

$ ps
PID TTY TIME CMD
1 ? 00:00:00 bwrap
2 ? 00:00:00 sh
3 ? 00:00:00 ps

One thing you’ll notice is that the app itself (the sh process, in this case) ends up with a PID of 2. Just something to keep in mind when getting your app ready for running in a sandbox: PIDs can’t be used to identify the app to the rest of the system anymore. This has caused minor problems, e.g. with the _NET_WM_PID property that some window managers look at to decide about window decorations.

Filesystems

Another namespace that flatpak makes extensive use of is the mount namespace. It is used to create a customized view of the filesystem for the sandboxed app.

Running mount inside the sandbox will show you all the details, but it’s a bit much to show here, so we’ll just stick to the most important parts:

  • /app – this is where the app’s files are mounted. It is a typical Unix filesystem with /app/share, /app/bin/, /app/lib and so on. When building an app for flatpak packaging, it typically gets configured with --prefix=/app for this reason.
  • /usr – the runtime files are placed here.
  • /run/host/ – various parts of the host filesystem that are safe to expose are mounted here, for example fonts (in /run/host/fonts) and icons (in /run/host/share/icons).
  • /usr/share/runtime/locale/ – the Locale extension of the runtime is mounted here.
  • Other extensions are mounted in their pre-determined places.
  • $HOME/.var/app/$APPID is made available to the app as well. This is the place in the home directory where flatpaked apps can store persistent data. The XDG_DATA_HOME, XDG_CONFIG_HOME and XDG_CACHE_HOME environment variables are set up to point to subdirectories in this place, and respecting these variables is a good way for an app to just work in a flatpak (a quick way to inspect this from inside the sandbox is shown below).
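
If you want to poke around in a sandbox yourself, flatpak can drop you into a shell inside it (org.gnome.Epiphany is just an example app here; any installed app will do):

$ flatpak run --command=sh org.gnome.Epiphany
$ echo $XDG_DATA_HOME    # points at ~/.var/app/org.gnome.Epiphany/data
$ ls /app /usr /run/host # the mounts described above

From that shell, the ps and mount output mentioned earlier is easy to reproduce.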

D-Bus

A common way for Linux apps to communicate with services and apps in the user session is via D-Bus, and most flatpak sandboxes allow the app to talk on the session bus (it can be turned off with a setting in the metadata). But we can’t just let the app talk to any other process on the bus - that would be a big risk. Therefore, flatpak sets up a filtering proxy that only allows the app to talk to certain names (this is again set up in the metadata).
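
For reference, this bus policy lives in the app’s metadata file, in roughly this form (the names below are made-up examples; for an installed app, flatpak info --show-metadata shows the real thing):

[Session Bus Policy]
org.freedesktop.Notifications=talk
org.gnome.SettingsDaemon=talk

Each entry grants the app permission to talk to the named bus name; anything not listed is blocked by the filtering proxy.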

The exceptions to this filtering are portals – these are D-Bus interfaces specifically designed to be safe to expose, and apps are always allowed to talk to them. Portals offer APIs for a number of common needs, such as

  • org.freedesktop.portal.FileChooser – Open a file
  • org.freedesktop.portal.OpenURI – Show a file in an app
  • org.freedesktop.portal.Print – Print a file
  • org.freedesktop.portal.Notification – Show a notification

Most of the time, portals will present a dialog to keep the user in control of what system access the sandboxed app gets.

Toolkits like GTK+ can transparently use portals in some cases when they detect that the app is inside a sandbox, which greatly reduces the effort in using portals.

Metadata

But how does GTK+ find out that it is being used inside a sandbox?

It looks for a file called .flatpak-info which flatpak places in the filesystem root of every sandbox. This file is not just a marker, it contains some useful information about the details of the sandbox setup, and is worth looking at. Some apps show information from here in their about dialog.
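
A minimal check from a shell inside the sandbox could look like this (the name= key is what current flatpak versions write into the [Application] section of the file; treat the exact keys as version-dependent):

$ test -f /.flatpak-info && echo "running inside a flatpak sandbox"
$ grep '^name=' /.flatpak-info   # the application ID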

Summary

Flatpak sets up a customized sandbox for the apps it runs. It tries hard to preserve an environment that typical Linux desktop apps expect, including XDG locations and the session bus.

Trying out GTK+ 4

Posted by Matthias Clasen on June 29, 2018 04:41 PM

I was asked today if there is already a Flatpak runtime that includes GTK+ 3.94. A very natural question. GTK+ 4 and flatpak are both cool, so of course you want to try them together.

But the answer is: No, GTK+ 4 is not included in a runtime yet.

It is still a bit early for that – the master branch of the GNOME runtime includes development versions of libraries, but those are libraries that are following the GNOME release schedule, where we can expect them to have a stable release in time for GNOME 3.30 in September. That is not the case for GTK+ 4.

However, that does not mean you cannot take your first steps with GTK+ 3.94 and flatpak right now. Flatpak is a flexible system and the bundling capabilities are very well-suited for this kind of scenario.

Just add GTK+ to the list of bundled modules in your flatpak manifest:

...
 "modules": [
   {
     "name": "gtk+",
     "buildsystem": "meson",
     "builddir": true,
     "config-opts": [
       "--prefix="/app",
       "--libdir="lib",
       "-Ddemos=false",
       "-Dbuild-tests=false"
     ],
     "sources": [
       "type": "archive",
       "url": "https://download.gnome.org/sources/gtk+/3.94/gtk+-3.94.0.tar.xz"
     ]
   },
   {
     "name": "your-app",
     ...
   }
 ]
...

If you are more adventurous, you can also build the GTK+ master branch from git, to see our latest progress towards GTK+ 4 as it happens.

To see a complete example of a flatpak manifest bundling GTK+ and its dependencies, you can look at corebird.

We have the beginnings of a GTK+ 3 → 4 migration guide in the GTK+ documentation. If you give it a try, please let us know what works and what is missing.

One way to do so is to file an issue. But you are also welcome to come by the GTK+ BoF at Guadec in Almeria and tell us in person.

X server pointer acceleration analysis - part 5

Posted by Peter Hutterer on June 26, 2018 11:15 PM

This post is part of a series: Part 1, Part 2, Part 3, Part 4, Part 5.

In this post I'll describe the X server pointer acceleration for trackpoints. You will need to read Observations on trackpoint input data first to make sense of this post.

As described in that linked post, trackpoint input data varies wildly. Combined with the options we have in the server to configure everything, this makes the post a bit pointless, as almost every single behaviour can be changed.

The linked post also describes the three subjective pressure ranges: no real physical pressure, some physical pressure, and serious pressure. The line between the first two ranges is roughly where the trackpoint sends deltas at the maximum reporting rate (100Hz) but with a value of 1. Below that pressure, the intervals increase but the delta remains at 1. Above that pressure, the interval remains constant at 10ms but the deltas increase. I've used the default kernel trackpoint sensitivity of 128 for any data listed here. Here is the visualisation of how deltas and intervals change again.

The default pointer acceleration profile in the X server is the simple profile. We know this from the earlier posts, it has a double-plateau shape. On a trackpoint mm/s doesn't make sense here so let's look at it in units/ms instead. A unit is simply a device-specific measurement of distance/pressure/tilt/whatever - it all depends on the device. On trackpoints that is (mostly) sideways pressure or tilt. On mice and touchpads we can convert units to mm based on their resolution. On trackpoints, we don't have a physical reference and we thus have to deal with it in units. The obvious problem here is that 1 unit on one device does not equal 1 unit on another device. And for configurable trackpoints, the definition of a unit changes as the sensitivity changes. And that's after the kernel already mangles it (if it does, it doesn't for all devices). So here's a box of asterisks, please sprinkle it liberally.

The smallest delta the kernel can send is 1. At a hardware report rate of 100Hz, continuous pressure to the smallest detected threshold thus generates 1 unit every 10 milliseconds or 0.1 units/ms. If I push uncomfortably hard, I can get deltas of around 10 units every 10ms or 1 unit/ms. In other words, we better zoom in here. Let's look at the meaningful range of this curve.

On my trackpoint, below 0.1 units/ms means virtually no pressure (pressure range one). Pressure range two is 0.1 to 0.4, approximately. Beyond that is pressure range three but that is also the range that becomes pointless quickly - I simply wouldn't want to press this hard in normal operation. 1 unit per ms (10 units per report) is very high pressure. This means the pointer acceleration curve is actually defined for the usable range with only outliers hitting the maximum acceleration. For mice this curve was effectively a constant acceleration for all but slow movements (see here). However, any configuration can change this curve to a point where none of the above applies.

Back to the minimum constant movement of 0.1 units/ms. That one effectively matches the start of the 'no accel' plateau. Anything below that will be decelerated, i.e. a delta of 1 unit will result in a pointer delta of less than 1 pixel. In other words, anything up to where you have to apply real pressure is decelerated.

The constant factor plateau goes all the way to 0.4 units/ms. Then there's the buggy jump to a factor of ~1.5, followed by a smooth curve to 0.8 units/ms where the factor maxes out. A bit of testing here suggests that 0.4 units/ms is in the upper limits of the second pressure range mentioned above. Going past 0.6 or 0.7 is definitely well within the third pressure range where things get uncomfortable quickly. This means that the acceleration bug is actually sitting right in the highest interesting range. Apparently no-one has noticed for 10 years.

But what does it matter? Well, probably not even that much. The only interesting bit I can see here is that we have deceleration for most low-pressure movements and a constant acceleration of 1 for most realistic movements. I very much doubt that the range above 0.4 really matters.

But hey, this is just the default configuration. It is affected when someone changes the speed slider in GNOME, or when someone changes the sensitivity at the sysfs level. Other trackpoints won't have the exact same behaviour. Any analysis is thrown out of the window as soon as someone changes the sysfs sensitivity or increases the acceleration threshold.

Let's talk sysfs - if we increase my trackpoint sensitivity to 200, the deltas coming from the trackpoint change. First, the pressure required to give me a constant stream of events often gives me deltas of size 2 or 3. So we're half-way into the no acceleration plateau here. Higher pressures easily give me deltas of size 10 or 1 unit per ms, the edge of the image above.

I wish I could analyse this any further but realistically, the only takeaway here is that any change in configuration options results in some version of trial-and-error by the user until the trackpoint moves as they want to. But without knowing all those options, we just cannot know what exactly is happening.

However, what this is useful for is comparing it to libinput. libinput got a custom trackpoint acceleration function in 1.8, designed around the hardware delta range. The idea was that you (or someone) measures the trackpoint device's range once, if it's outside of the assumed default ranges we add a hwdb entry and voila, it scales back to the right ranges and that device is fixed for good.

Except - this doesn't work. libinput scales into the delta range and calculates the factor from that, but it doesn't take the time stamps into account. It works on the assumption that trackpoint deltas arrive at a constant frequency with a varying delta. That is simply not the case, and the dynamic range of the trackpoint is so small that any acceleration of the deltas results in jerky movement.

This is of course fixable, we can just convert the deltas into a speed and then apply the acceleration curve based on that. So that's the next task, if you're interested in that, subscribe yourself to this issue.

Walkthrough for Portable Services

Posted by Lennart Poettering on June 26, 2018 10:00 PM

Portable Services with systemd v239

systemd v239 contains a great number of new features. One of them is first class support for Portable Services. In this blog story I'd like to shed some light on what they are and why they might be interesting for your application.

What are "Portable Services"?

The "Portable Service" concept takes inspiration from classic chroot() environments as well as container management and brings a number of their features to more regular system service management.

While the definition of what a "container" really is is hotly debated, I figure people can generally agree that the "container" concept primarily provides two major features:

  1. Resource bundling: a container generally brings its own file system tree along, bundling any shared libraries and other resources it might need along with the main service executables.

  2. Isolation and sand-boxing: a container operates in a name-spaced environment that is relatively detached from the host. Besides living in its own file system namespace it usually also has its own user database, process tree and so on. Access from the container to the host is limited with various security technologies.

Of these two concepts the first one is also what traditional UNIX chroot() environments are about.

Both resource bundling and isolation/sand-boxing are concepts systemd has implemented to varying degrees for a longer time. Specifically, RootDirectory= and RootImage= have been around for a long time, and so have been the various sand-boxing features systemd provides. The Portable Services concept builds on that, putting these features together in a new, integrated way to make them more accessible and usable.

OK, so what precisely is a "Portable Service"?

Much like a container image, a portable service on disk can be just a directory tree that contains service executables and all their dependencies, in a hierarchy resembling the normal Linux directory hierarchy. A portable service can also be a raw disk image, containing a file system containing such a tree (which can be mounted via a loop-back block device), or multiple file systems (in which case they need to follow the Discoverable Partitions Specification and be located within a GPT partition table). Regardless of whether the portable service on disk is a simple directory tree or a raw disk image, let's call this concept the portable service image.

Such images can be generated with any tool typically used for the purpose of installing OSes inside some directory, for example dnf --installroot= or debootstrap. There are very few requirements made on these trees, except the following two:

  1. The tree should carry systemd unit files for relevant services in them.

  2. The tree should carry /usr/lib/os-release (or /etc/os-release) OS release information.

Of course, as you might notice, OS trees generated from any of today's big distributions generally qualify for these two requirements without any further modification, as pretty much all of them adopted /usr/lib/os-release and tend to ship their major services with systemd unit files.
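
For example, a bare-bones Fedora-based tree could be thrown together roughly like this (the foobar package is made up, and the exact dnf options you want will differ; all that matters is that the result contains a unit file and an os-release file):

# mkdir -p /var/lib/portables/foobar_1
# dnf --installroot=/var/lib/portables/foobar_1 --releasever=28 install systemd foobar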

A portable service image generated like this can be "attached" or "detached" from a host:

  1. "Attaching" an image to a host is done through the new portablectl attach command. This command dissects the image, reading the os-release information, and searching for unit files in them. It then copies relevant unit files out of the images and into /etc/systemd/system/. After that it augments any copied service unit files in two ways: a drop-in adding a RootDirectory= or RootImage= line is added in so that even though the unit files are now available on the host when started they run the referenced binaries from the image. It also symlinks in a second drop-in which is called a "profile", which is supposed to carry additional security settings to enforce on the attached services, to ensure the right amount of sand-boxing.

  2. "Detaching" an image from the host is done through portable detach. It reverses the steps above: the unit files copied out are removed again, and so are the two drop-in files generated for them.

While a portable service is attached its relevant unit files are made available on the host like any others: they will appear in systemctl list-unit-files, you can enable and disable them, you can start them and stop them. You can extend them with systemctl edit. You can introspect them. You can apply resource management to them like to any other service, and you can process their logs like any other service and so on. That's because they really are native systemd services, except that they have a 'twist', if you so will: they have tougher security by default and store their resources in a root directory or image.

And that's already the essence of what Portable Services are.

A couple of interesting points:

  1. Even though the focus is on shipping service unit files in portable service images, you can actually ship timer units, socket units, target units, path units in portable services too. This means you can very naturally do time, socket and path based activation. It's also entirely fine to ship multiple service units in the same image, in case you have more complex applications.

  2. This concept introduces zero new metadata. Unit files are an existing concept, as are os-release files, and — in case you opt for raw disk images — GPT partition tables are already established too. This also means existing tools to generate images can be reused for building portable service images to a large degree as no completely new artifact types need to be generated.

  3. Because the Portable Service concept introduces zero new metadata and just builds on existing security and resource bundling features of systemd it's implemented in a set of distinct tools, relatively disconnected from the rest of systemd. Specifically, the main user-facing command is portablectl, and the actual operations are implemented in systemd-portabled.service. If you so will, portable services are a true add-on to systemd, just making a specific work-flow nicer to use than with the basic operations systemd otherwise provides. Also note that systemd-portabled provides bus APIs accessible to any program that wants to interface with it, portablectl is just one tool that happens to be shipped along with systemd.

  4. Since Portable Services are a feature we only added very recently we wanted to keep some freedom to make changes still. Due to that we decided to install the portablectl command into /usr/lib/systemd/ for now, so that it does not appear in $PATH by default. This means, for now you have to invoke it with a full path: /usr/lib/systemd/portablectl. We expect to move it into /usr/bin/ very soon though, and make it a fully supported interface of systemd.

  5. You may wonder which unit files contained in a portable service image are the ones considered "relevant" and are actually copied out by the portablectl attach operation. Currently, this is derived from the image name. Let's say you have an image stored in a directory /var/lib/portables/foobar_4711/ (or alternatively in a raw image /var/lib/portables/foobar_4711.raw). In that case the unit files copied out match the pattern foobar*.service, foobar*.socket, foobar*.target, foobar*.path, foobar*.timer.

  6. The Portable Services concept does not define any specific method how images get on the deployment machines, that's entirely up to administrators. You can just scp them there, or wget them. You could even package them as RPMs and then deploy them with dnf if you feel adventurous.

  7. Portable service images can reside in any directory you like. However, if you place them in /var/lib/portables/ then portablectl will find them easily and can show you a list of images you can attach and suchlike.

  8. Attaching a portable service image can be done persistently, so that it remains attached on subsequent boots (which is the default), or it can be attached only until the next reboot, by passing --runtime to portablectl.

  9. Because portable service images are ultimately just regular OS images, it's natural and easy to build a single image that can be used in three different ways:

    1. It can be attached to any host as a portable service image.

    2. It can be booted as OS container, for example in a container manager like systemd-nspawn.

    3. It can be booted as host system, for example on bare metal or in a VM manager.

    Of course, to qualify for the latter two the image needs to contain more than just the service binaries, the os-release file and the unit files. To be bootable in an OS container manager such as systemd-nspawn, the image needs to contain an init system of some form, for example systemd. To be bootable on bare metal or as a VM it also needs a boot loader of some form, for example systemd-boot.

Profiles

In the previous section the "profile" concept was briefly mentioned. Since they are a major feature of the Portable Services concept, they deserve some focus. A "profile" is ultimately just a pre-defined drop-in file for unit files that are attached to a host. They are supposed to mostly contain sand-boxing and security settings, but may actually contain any other settings, too. When a portable service is attached a suitable profile has to be selected. If none is selected explicitly, the default profile called default is used. systemd ships with four different profiles out of the box:

  1. The default profile provides a medium level of security. It contains settings to drop capabilities, enforce system call filters, restrict many kernel interfaces and mount various file systems read-only.

  2. The strict profile is similar to the default profile, but generally uses the most restrictive sand-boxing settings. For example networking is turned off and access to AF_NETLINK sockets is prohibited.

  3. The trusted profile is the least strict of them all. In fact it makes almost no restrictions at all. A service run with this profile has basically full access to the host system.

  4. The nonetwork profile is mostly identical to default, but also turns off network access.

Note that the profile is selected at the time the portable service image is attached, and it applies to all service files attached, in case multiple are shipped in the same image. Thus, the sand-boxing restrictions to enforce are selected by the administrator attaching the image and not the image vendor.

Additional profiles can be defined easily by the administrator, if needed. We might also add additional profiles sooner or later to be shipped with systemd out of the box.
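
To give a rough idea, a custom profile is just another drop-in with unit settings in it, something along these lines (the settings are an illustration only, and the directory portablectl searches for admin-defined profiles may vary between systemd versions; the default profile shipped in /usr/lib/systemd/portable/profile/default/service.conf is a good template to start from):

[Service]
# tighten things beyond the default profile
PrivateNetwork=yes
ProtectHome=yes
ProtectSystem=strict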

What's the use-case for this? If I have containers, why should I bother?

Portable Services are primarily intended to cover use-cases where code should feel more like "extensions" to the host system rather than live in disconnected, separate worlds. The profile concept is supposed to be tunable to the exact right amount of integration or isolation needed for an application.

In the container world the concept of "super-privileged containers" has been touted a lot, i.e. containers that run with full privileges. It's precisely that use-case that portable services are intended for: extensions to the host OS, that default to isolation, but can optionally get as much access to the host as needed, and can naturally benefit from the full functionality of the host. The concept should hence be useful for all kinds of low-level system software that isn't shipped with the OS itself but needs varying degrees of integration with it. Besides servers and appliances this should be particularly interesting for IoT and embedded devices.

Because portable services are just a relatively small extension to the way system services are otherwise managed, they can be treated like regular services for almost all use-cases: they will appear alongside regular services in all tools that can introspect systemd unit data, and can be managed the same way when it comes to logging, resource management, runtime life-cycles and so on.

Portable services are a very generic concept. While the original use-case is OS extensions, it's of course entirely up to you and other users to use them in a suitable way of your choice.

Walkthrough

Let's have a look how this all can be used. We'll start with building a portable service image from scratch, before we attach, enable and start it on a host.

Building a Portable Service image

As mentioned, you can use any tool you like that can create OS trees or raw images for building Portable Service images, for example debootstrap or dnf --installroot=. For this example walkthrough run we'll use mkosi, which is ultimately just a fancy wrapper around dnf and debootstrap but makes a number of things particularly easy when repetitively building images from source trees.

I have pushed everything necessary to reproduce this walkthrough locally to a GitHub repository. Let's check it out:

$ git clone https://github.com/systemd/portable-walkthrough.git

Let's have a look in the repository:

  1. First of all, walkthroughd.c is the main source file of our little service. To keep things simple it's written in C, but it could be in any language of your choice. The daemon as implemented won't do much: it just starts up and waits for SIGTERM, at which point it will shut down. It's ultimately useless, but hopefully illustrates how this all fits together. The C code has no dependencies besides libc.

  2. walkthroughd.service is a systemd unit file that starts our little daemon. It's a simple service, hence the unit file is trivial.

  3. Makefile is a short make build script to build the daemon binary. It's pretty trivial, too: it just takes the C file and builds a binary from it. It can also install the daemon. It places the binary in /usr/local/lib/walkthroughd/walkthroughd (why not in /usr/local/bin? because it's not a user-facing binary but a system service binary), and its unit file in /usr/local/lib/systemd/walkthroughd.service. If you want to test the daemon on the host, you can simply run make and then ./walkthroughd to check that everything works.

  4. mkosi.default is a file that tells mkosi how to build the image. We opt for a Fedora-based image here (but we might as well have used Debian, or any other supported distribution). We need no particular packages during runtime (after all we only depend on libc), but during the build phase we need gcc and make, hence these are the only packages we list in BuildPackages=.

  5. mkosi.build is a shell script that is invoked during mkosi's build logic. All it does is invoke make and make install to build and install our little daemon, and afterwards it extends the distribution-supplied /etc/os-release file with an additional field that describes our portable service a bit.

Let's now use this to build the portable service image. For that we use the mkosi tool. It's sufficient to invoke it without parameters to build the first image: it will automatically discover mkosi.default and mkosi.build which tell it what to do. (Note that if you work on a project like this for a longer time, mkosi -if is probably the better command to use, as it speeds up building substantially by using an incremental build mode). mkosi will download the necessary RPMs, and put them all together. It will build our little daemon inside the image and after all that's done it will output the resulting image: walkthroughd_1.raw.

Because we opted to build a GPT raw disk image in mkosi.default this file is actually a raw disk image containing a GPT partition table. You can use fdisk -l walkthroughd_1.raw to enumerate the partition table. You can also use systemd-nspawn -i walkthroughd_1.raw to explore the image quickly if you need.

Using the Portable Service Image

Now that we have a portable service image, let's see how we can attach, enable and start the service included within it.

First, let's attach the image:

# /usr/lib/systemd/portablectl attach ./walkthroughd_1.raw
(Matching unit files with prefix 'walkthroughd'.)
Created directory /etc/systemd/system/walkthroughd.service.d.
Written /etc/systemd/system/walkthroughd.service.d/20-portable.conf.
Created symlink /etc/systemd/system/walkthroughd.service.d/10-profile.conf → /usr/lib/systemd/portable/profile/default/service.conf.
Copied /etc/systemd/system/walkthroughd.service.
Created symlink /etc/portables/walkthroughd_1.raw → /home/lennart/projects/portable-walkthrough/walkthroughd_1.raw.

The command will show you exactly what it has been doing: it just copied the main service file out, and added the two drop-ins, as expected.

Let's see if the unit is now available on the host, just like a regular unit, as promised:

# systemctl status walkthroughd.service
● walkthroughd.service - A simple example service
   Loaded: loaded (/etc/systemd/system/walkthroughd.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/walkthroughd.service.d
           └─10-profile.conf, 20-portable.conf
   Active: inactive (dead)

Nice, it worked. We see that the unit file is available and that systemd correctly discovered the two drop-ins. The unit is neither enabled nor started, however. Yes, attaching a portable service image doesn't imply enabling or starting. It just means the unit files contained in the image are made available to the host. It's up to the administrator to then enable them (so that they are automatically started when needed, for example at boot), and/or start them (in case they shall run right-away).

Let's now enable and start the service in one step:

# systemctl enable --now walkthroughd.service
Created symlink /etc/systemd/system/multi-user.target.wants/walkthroughd.service → /etc/systemd/system/walkthroughd.service.

Let's check if it's running:

# systemctl status walkthroughd.service
● walkthroughd.service - A simple example service
   Loaded: loaded (/etc/systemd/system/walkthroughd.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/walkthroughd.service.d
           └─10-profile.conf, 20-portable.conf
   Active: active (running) since Wed 2018-06-27 17:55:30 CEST; 4s ago
 Main PID: 45003 (walkthroughd)
    Tasks: 1 (limit: 4915)
   Memory: 4.3M
   CGroup: /system.slice/walkthroughd.service
           └─45003 /usr/local/lib/walkthroughd/walkthroughd

Jun 27 17:55:30 sigma walkthroughd[45003]: Initializing.

Perfect! We can see that the service is now enabled and running. The daemon is running as PID 45003.

Now that we verified that all is good, let's stop, disable and detach the service again:

# systemctl disable --now walkthroughd.service
Removed /etc/systemd/system/multi-user.target.wants/walkthroughd.service.
# /usr/lib/systemd/portablectl detach ./walkthroughd_1.raw
Removed /etc/systemd/system/walkthroughd.service.
Removed /etc/systemd/system/walkthroughd.service.d/10-profile.conf.
Removed /etc/systemd/system/walkthroughd.service.d/20-portable.conf.
Removed /etc/systemd/system/walkthroughd.service.d.
Removed /etc/portables/walkthroughd_1.raw.

And finally, let's see that it's really gone:

# systemctl status walkthroughd
Unit walkthroughd.service could not be found.

Perfect! It worked!

I hope the above gets you started with Portable Services. If you have further questions, please contact our mailing list.

Further Reading

A more low-level document explaining details is shipped along with systemd.

There are also relevant manual pages: portablectl(1) and systemd-portabled(8).

For further information about mkosi see its homepage.

Thomson 8-bit computers, a history

Posted by Bastien Nocera on June 22, 2018 03:06 PM
In March 1986, my dad was in the market for a Thomson TO7/70. I have the circled classified ads in “Téo” issue 1 to prove that :)



TO7/70 with its chiclet keyboard and optical pen, courtesy of MO5.com

The “Plan Informatique pour Tous” was in full swing, and Thomson were supplying schools with micro-computers. My dad, as a primary school teacher, needed to know how to operate those computers, and eventually teach them to kids.

The first thing he showed us when he got the computer, on the living room TV, was a game called “Panic” or “Panique” where you controlled a missile, protecting a town from flying saucers that flew across the screen from either side, faster and faster as the game went on. I still haven't been able to locate this game again.

A couple of years later, the TO7/70 was replaced by a TO9, with a floppy disk, and my dad used that computer to write an educational program about top-down additions, as part of a training program run by the teachers' schools (“Écoles Normales”, renamed to “IUFM” in 1990).

After months of nagging, and some spring cleaning, he found the listings of his educational software, which I've liberated, with his permission. I'm currently still working out how to generate floppy disks that are usable directly in emulators. But here's an early screenshot.


Later on, my dad got an IBM PC compatible, an Olivetti PC/1, on which I'd play a clone of Asteroids for hours, but that's another story. The TO9 got passed down to me, and after spending a full summer doing planning for my hot-dog and chips van business (I was 10 or 11, and I had weird hobbies already), and entering every game from the “102 Programmes pour...” series of books, the TO9 got put to the side at Christmas, replaced by a Sega Master System, using that same handy SCART connector on the Thomson monitor.

But how does this concern you? Well, I've worked with RetroManCave on a Minitel episode not too long ago, and he agreed to do a history of the Thomson micro-computers. I did a fair bit of the research and fact-checking, as well as some needed repairs to the (prototype!) hardware I managed to find for the occasion. The result is this first look at the history of Thomson.

(Embedded video: https://www.youtube.com/watch?v=UPxQR2r6iVA)


Finally, if you fancy diving into the Thomson computers, there will be an episode coming shortly about the MO5E hardware, and some games worth running on it, on the same YouTube channel.

I'm currently working on bringing the “TeoTO8D” emulator to Flathub, for Linux users. When that's ready, grab some games from the DCMOTO archival site, and have some fun!

I'll also be posting some nitty-gritty details about Thomson repairs on my Micro Repairs Twitter feed for the more technically inclined among you.

LibreOffice color selector as GTK widgets

Posted by Caolán McNamara on June 22, 2018 02:21 PM
Here's what the native GTK widget mode for the color picker looks like at the moment under Wayland: a GtkMenuButton displaying a color preview of the currently selected color, and a GtkPopover containing the color selection widgetry.
(Embedded video: https://www.youtube.com/watch?v=ZiisWMArKDw)
Under X the GtkPopover is constrained to its parent's size, so there we fall back to using a GtkWindow to contain the widgetry.

Flatpak in detail, part 2

Posted by Matthias Clasen on June 20, 2018 03:44 AM

The first post in this series looked at runtimes and extensions. Here, we’ll look at how flatpak keeps the applications and runtimes on your system organized, with installations, repositories, branches, commits and deployments.

Installations and repositories

An installation is a place on your filesystem where flatpak can install apps and runtimes. By default, flatpak has a system-wide installation in /var/lib/flatpak, and a user installation in $HOME/.local/share/flatpak.

It is possible to define additional system-wide installations by placing a key file in /etc/flatpak/installations.d. For example, this can be used to keep apps on a portable drive.
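
A minimal sketch of such a key file (the installation name, path and keys are illustrative; flatpak-installation(5) documents the full format):

# /etc/flatpak/installations.d/portable.conf
[Installation "portable"]
Path=/media/portable/flatpak
DisplayName=Portable drive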

Part of the data that flatpak keeps for each installation is a list of remotes. A remote is a reference to an ostree repository that is available somewhere on the network.

Each installation also has its own local ostree repository (for example, the system-wide installation has its repo in /var/lib/flatpak/repo). You can explore the contents of these repositories using the ostree utility;

$ ostree --repo=$HOME/.local/share/flatpak/repo/ refs
gnome-nightly:appstream/x86_64
flathub:runtime/org.freedesktop.Platform.Locale/x86_64/1.6
flathub:app/de.wolfvollprecht.UberWriter/x86_64/stable
...

Branches and versions

Similar to git, ostree organizes the data in a repository in commits, which are grouped in branches. Commits are identified by a hash and branches are identified by a name.

While ostree does not care about the format of a branch name, flatpak uses branch names of the form $KIND/$ID/$ARCH/$BRANCH to uniquely identify branches.

Here are some examples:

app/org.inkscape.Inkscape/x86_64/stable
runtime/org.gnome.Platform/x86_64/master

Most of the  time, it is clear from the context if an app or runtime is being named, and only one architecture is relevant. For this case, flatpak allows a shorthand notation for branch names omitting the $KIND and $ARCH parts: $ID//$BRANCH.

In this notation, the above examples shrink to:

org.inkscape.Inkscape//stable
org.gnome.Platform//master

Deployments

Installing an app or runtime really consists of two steps: first, flatpak caches that data in the local repo of the installation, and then it deploys it, which means it creates a check-out of the branch from the local repo. The check-outs are organized in a folder structure that reflects the branch name organization.

For example, Inkscape will be checked-out in $HOME/.local/share/flatpak/app/org.inkscape.Inkscape/x86_64/stable/$COMMIT, where $COMMIT is the hash of the commit that is being deployed.

It is possible to have multiple commits from the same branch deployed, but one of them is considered active and will be used by default. Flatpak maintains a symlink in the check-out directory that points at the active commit.

It is also possible to have multiple branches of an app or runtime deployed at the same time; the directory structure of checkouts is designed to allow that. One of the branches is considered current. Flatpak maintains a symlink at the toplevel of the checkout that points at the current checkout.

Flatpak can run an app from any deployed commit, regardless of whether it is active or current or not. To run a particular commit, you can use the --commit option of flatpak run.
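
For example, something along these lines (reusing the Inkscape ref from above; flatpak info prints, among other things, the active commit hash to substitute for COMMIT):

$ flatpak info org.inkscape.Inkscape
$ flatpak run --commit=COMMIT org.inkscape.Inkscape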

The relevance of being active and current is that flatpak exports some data (in particular, desktop files) from the active commit of the current branch, by symlinking it into ~/.local/share/flatpak/exports,  where for example gnome-shell will find it and allow you to run the app from the overview.

Note: Even though it is perfectly ok to have multiple versions of the same app installed, running more than one at the same time will typically not work, since the different versions will claim the app ID as their unique bus name on the session bus. A way around this limitation is to explicitly give one of the versions a different ID, for example, by appending a “.nightly” suffix.

Application data

One last aspect of filesystem organization to mention here is that every app that is run with flatpak gets some filesystem space to use for permanent storage. This space is in $HOME/.var/app/$ID, and it has subdirectories called cache, config and data. At runtime, flatpak sets the XDG_CACHE_HOME, XDG_CONFIG_HOME and XDG_DATA_HOME environment variables to point at these directories.

For example, the persistent data from the inkscape flatpak can be found in $HOME/.var/app/org.inkscape.Inkscape.

Summary

Flatpak installations may look a bit intimidating with their deep directory tree, but they have a well-defined structure and this post hopefully helps to explain the various components.

Flatpak in detail

Posted by Matthias Clasen on June 13, 2018 09:19 AM

At this point, Flatpak is a mature system for deploying and running desktop applications. It has accumulated quite some sophistication over time, which can make it appear more complicated than it is.

In this post, I’ll try to look in depth at some of the core concepts behind Flatpak, namely runtimes and extensions.

In the beginning: bundles

At its very core, the idea behind Flatpak is to bundle applications with their dependencies, and ship them as a self-contained unit.  There are good reasons that bundling is attractive for application developers:

  • There is a much bigger chance that the app will run on an arbitrary end-user system, which may have different versions of libraries, different themes, or a different kernel
  • You are not relying on all the different update mechanisms and policies of Linux distributions
  • Distribution updates to your dependencies will not break your app behind your back
  • You can test the same code that your users run

Best of both worlds: runtimes

In the age-old debate between bundlers and packagers, there are good arguments on both sides. The usual arguments against bundling are:

  • Code duplication. If a library gets hit by a security issue you have to fix it in all the apps that bundle it
  • Wastefulness. If every app ships an entire library stack, this blows up the required bandwidth for downloads and the required disk space for installing them

With this in mind, Flatpak early on introduced the concept of runtimes. The idea behind runtimes is that many desktop applications use a deep library stack, but it is often a similar set of libraries. Therefore, it makes sense to take these common library stacks and distribute them separately as “GNOME runtime” or “KDE runtime”, and have apps declare in their metadata which runtime they need.
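
To make that concrete, the declaration ends up in the app's metadata file, roughly like this (a sketch; the runtime name and version are only examples):

[Application]
name=org.inkscape.Inkscape
runtime=org.gnome.Platform/x86_64/3.28
sdk=org.gnome.Sdk/x86_64/3.28
command=inkscape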

It then becomes the responsibility of the flatpak tooling to assemble the application's filesystem tree with the runtime's filesystem tree when it creates the sandbox environment that the app runs in.

To avoid conflicts, Flatpak requires that the application's filesystem tree is rooted in /app, while runtimes have a traditional /usr tree.

Splitting off runtimes preserves most of the benefits that I outlined for bundles, while greatly reducing code duplication and letting us update libraries independently of applications.

Of course, it also brings back some of the risks of modularity: updating the libraries independently carries, once again, the risk of breaking the applications that use the runtime. So the team maintaining a runtime has to be very careful to avoid introducing problematic changes or incompatibilities.

Going further: extensions

As I said, shipping runtimes separately saves a lot of bandwidth, since the runtime has to be downloaded only once for all the applications that share it. But a runtime is still a pretty massive download, and contains a lot of things that may not be useful most of the time or are just optional.

A good example of this is translations. It is not uncommon for desktop apps to be translated into 50 locales. But the average user will only ever use a single one of these. In traditional packaging, this is sometimes addressed by breaking translations out as “lang packs” that can be installed separately.

Another example is debug information. You don’t need symbols and other debug information unless you encounter a crash and want to submit a meaningful stacktrace. In traditional packaging, this is addressed by splitting off “debuginfo” packages that can be installed when needed.

Flatpak provides a mechanism  to address these use cases.  Runtimes (and applications too) can declare extension points, which are designated locations in their filesystem tree where additional runtimes can be mounted. These additional runtimes are called extensions. When constructing a sandbox for running an app, flatpak tooling will look for matching extensions and mount them at the right place.
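
A runtime declares such an extension point in its metadata with a section along these lines (a sketch with made-up names and version; the keys shown illustrate the format rather than a complete list):

[Extension org.example.Platform.GL]
directory=lib/GL
version=1.4
subdirectories=true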

Flatpak is not a generic solution, but tailored towards the use case of desktop applications, and it tries to do the right thing out of the box: flatpak-builder automatically breaks out .Locale and .Debug extensions when building apps or runtimes, and when installing things, flatpak installs the matching .Locale extension. But it goes beyond that and only installs the subset of it that is relevant for the current locale, thereby recreating the space-saving effect of lang packs.

Extensions: infinite variations

The extension mechanism is flexible enough to cover not just locales and debuginfo, but all sorts of other optional components that applications might need. To give just some examples:

  • OpenGL drivers that match the GPU
  • Other hardware-specific APIs like VA-API
  • Media codecs
  • Widget themes

All of these can be provided as extensions. Flatpak has the smarts built-in to know whether a given OpenGL driver extension matches the hardware or whether a given theme extension matches the current desktop theme, and it will automatically install and use matching extensions.

At last: the host OS

The examples in the previous paragraph are realized as extensions because the shared objects or theme components need to match the runtime they are used with.

But some things just don’t change very much over time, and don’t need exact matching against the runtime to be used by applications. Examples in this category are fonts, icons or certificates.

Flatpak makes these components from the host OS available in the sandbox, by mounting them below /run/host/ in the sandbox, and appending /run/host/share to the XDG_DATA_DIRS environment variable.
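
You can see this from inside any sandbox by starting a shell in it, for example (using the Inkscape app from earlier; the exact contents depend on your host and flatpak version):

$ flatpak run --command=sh org.inkscape.Inkscape
$ echo $XDG_DATA_DIRS    # includes /run/host/share
$ ls /run/host           # fonts, certificates, … shared from the host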

Summary

Flatpak does a lot of hard work behind the scenes to ensure that the apps it runs find an environment that looks similar to a traditional Linux desktop, by combining the application, its runtime, optional extensions and host components.

The Flatpak documentation provides more information about working with Flatpak as an application developer.

libinput and its device quirks files

Posted by Peter Hutterer on June 13, 2018 06:16 AM

This post does not describe a configuration system. If that's all you care about, read this post here and go be angry at someone else. Anyway, with that out of the way let's get started.

For a long time, libinput has supported model quirks (first added in Apr 2015). These model quirks are bitflags applied to some devices so we can enable special behaviours in the code. Model flags can be very specific ("this is a Lenovo x230 Touchpad") or generic ("This is a trackball") and it just depends on what the specific behaviour is that we need. The x230 touchpad for example has a custom pointer acceleration but trackballs are marked so they get some config options mice don't have/need.

In addition to model tags we also have custom attributes. These are free-form and provide information that we cannot get from the kernel. These too can be specific ("this model needs a pressure threshold of N") or generic ("bluetooth keyboards are external keyboards").

Overall, it's a good system. Most users never have to care that we even have this. The whole point is that any device-specific quirks need to be merged only once for each model, then everyone with the same device gets to benefit on the next update.

Originally quirks were hardcoded but this required rebuilding libinput for any changes. So we moved this to utilise the udev hwdb. For the trivial work of fetching udev properties we got a lot of flexibility in how we can match against devices. For example, an entry may look like this:


libinput:name:*AlpsPS/2 ALPS GlidePoint:dmi:*svnDellInc.:pnLatitudeE6220:*
LIBINPUT_ATTR_PRESSURE_RANGE=100:90
The above uses a name match and the dmi modalias match to apply a property for the touchpad on the Dell Latitude E6220. The exact match format is defined by a bunch of udev rules that ship as part of libinput.

Using the udev hwdb made the quirk storage a plaintext file that can be updated independently of libinput, including local overrides for testing things before merging them upstream. Having said that, it's definitely not public API and can change even between stable branch updates as properties are renamed or rescoped to fit the behaviour more accurately. For example, a model-specific tag may be renamed to a behaviour-specific tag as we find more devices affected by the same issue.

The main issue with the quirks now is that we keep accumulating more and more of them and I'm starting to hit limits with the udev hwdb match behaviour. The hwdb is great for single matches but not so great for cascading matches where one match may overwrite another match. The hwdb match system is largely implementation-defined so it's not always predictable which match rule wins out in the end.

Second, debugging the udev hwdb is not at all trivial. It's a bit like git - once you're used to it it's just fine but until then the air turns yellow with all the swearing being excreted by the unsuspecting user.

So long story short, libinput 1.12 will replace the hwdb model quirks database with a set of .ini files. The model quirks will be installed in /usr/share/libinput/ or whatever prefix your distribution prefers instead. It's a bunch of files with fairly simplistic instructions, each [section] has a set of MatchFoo=Bar directives and the ModelFoo=bar or AttrFoo=bar tags. See this file for an example. If all MatchFoo directives apply to a device, the Model and Attr tags are applied. Matching works in inter- and intra-file sequential order so the last section in a file overrides the first section of that file and the highest-sorting file overrides the lowest-sorting file. Otherwise the tags are accumulated, so if two files match on the same device with different tags, both tags are applied. So far, so unexciting.
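
For illustration, a section in one of those files looks something like this (the match values and attribute here are made up; look at the files shipped with libinput for real entries):

[Illustrative Touchpad Pressure Override]
MatchUdevType=touchpad
MatchName=*SynPS/2 Synaptics TouchPad
MatchDMIModalias=dmi:*svnLENOVO:*pvrThinkPadT440s*
AttrPressureRange=150:120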

Sometimes it's necessary to install a temporary local quirk until upstream libinput is updated or the distribution updates its package. For this, the /etc/libinput/local-overrides.quirks file is read in as well (if it exists). Note though that the config files are considered internal API, so any local overrides may stop working on the next libinput update. Should've upstreamed that quirk, eh?

These files give us the same functionality as the hwdb - we can drop in extra files without recompiling. They're more human-readable than a hwdb match and it's a lot easier to add extra match conditions to it. And we can extend the file format at will. But the biggest advantage is that we can quite easily write debugging tools to figure out why something works or doesn't work. The libinput list-quirks tool shows what tags apply to a device and using the --verbose flag shows you all the files and sections and how they apply or don't apply to your device.
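
The invocation is roughly this (the event node number will differ on your machine, and you may need root to access the device):

$ sudo libinput list-quirks /dev/input/event7
$ sudo libinput list-quirks --verbose /dev/input/event7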

As usual, the libinput documentation has details.

Fingerprint reader support, the second coming

Posted by Bastien Nocera on June 12, 2018 06:00 PM
Fingerprint readers are more and more common on Windows laptops, and hardware makers would really like to not have to make a separate SKU without the fingerprint reader just for Linux, if that fingerprint reader is unsupported there.

The original makers of those fingerprint readers just need to send patches to the libfprint Bugzilla, I hear you say, and the problem's solved!

But it turns out it's pretty difficult to write those new drivers, and those patches, without an insight on how the internals of libfprint work, and what all those internal, undocumented APIs mean.

Most of the drivers already present in libfprint are the results of reverse engineering, which means that none of them is a best-of-breed example of a driver, with all the unknown values and magic numbers.

Let's try to fix all this!

Step 1: fail faster

When you're writing a driver, the last thing you want is to have to wait for your compilation to fail. We ported libfprint to meson and shaved off a significant amount of time from a successful compilation. We also reduced the number of places where new drivers need to be declared to be added to the compilation.

Step 2: make it clearer

While doxygen is nice because it requires very little scaffolding to generate API documentation, the output is also not up to the level we expect. We ported the documentation to gtk-doc, which has a more readable page layout, easy support for cross-references, and gives us more control over how introductory paragraphs are laid out. See the before and after for yourselves.

Step 3: fail elsewhere

You created your patch locally, tested it out, and it's ready to go! But you don't know about git-bz, and you ended up attaching a patch file which you uploaded. Except you uploaded the wrong patch. Or the patch with the right name but from the wrong directory. Or you know git-bz but used the wrong commit id and uploaded another unrelated patch. This is all a bit too much.

We migrated our bugs and repository for both libfprint and fprintd to Freedesktop.org's GitLab. Merge Requests are automatically built, discussions are easier to follow!

Step 4: show it to me

Now that we have spiffy documentation, and unified bugs, patches and sources under one roof, we need to modernise our website. We used GitLab's CI/CD integration to generate our website from sources, including creating API documentation and listing supported devices from git master, to reduce the need to search the sources for that information.

Step 5: simplify

This process has started, but isn't finished yet. We're slowly splitting up the internal API between "internal internal" (what the library uses to work internally) and "internal for drivers" which we eventually hope to document to make writing drivers easier. This is partially done, but will need a lot more work in the coming months.

TL;DR: We migrated libfprint to meson, gtk-doc, GitLab, added a CI, and are writing docs for driver authors, everything's on the website!

Observations on trackpoint input data

Posted by Peter Hutterer on June 07, 2018 06:03 AM

This time we talk trackpoints. Or pointing sticks, or whatever else you want to call that thing between the GHB keys. If you don't have one and you've never seen one, prepare to be amazed. [1]

Trackpoints are tiny joysticks that react to pressure [2], convert that pressure into relative x/y events and pass that on to whoever is interested in it. The harder you push, the higher the deltas. This is where the simple and obvious stops and it gets difficult. But then again, if it was that easy I wouldn't write this post, you wouldn't have anything to read, so somehow everyone wins. Whoop-dee-doo.

All the data and measurements below refer to my trackpoint, a Lenovo T440s. It may not apply to any other trackpoints, including those on different laptop models or even on the same laptop model with different firmware versions. I've written the below with a lot of cringing and handwringing. I want to write data that is irrefutable, but the universe is against me and what the universe wants, the universe gets. Approximately every second sentence below has a footnote of "actual results may vary". Feel free to re-create the data on your device though.

Measuring trackpoint range is highly subjective, so you'll have to trust me when I describe how specific speeds/pressure ranges feel. There are three ranges of pressure on my trackpoint (sort-of):

  • Pressure range one: When resting the finger on the trackpoint I don't really need to apply noticeable pressure to make the trackpoint send events. Just moving the finger on the trackpoint makes it send events, albeit sporadically.
  • Pressure range two: Going beyond range one requires applying real pressure and feels to me like we're getting into RSI territory. Not a problem for short periods, but definitely not something I'd want all the time. It's the pressure I'd use to cross the screen.
  • Pressure range three: I have to push hard. I definitely wouldn't want to do this during everyday interaction and it just feels wrong anyway. This pressure range is for testing maximum deltas, not one you would want to use otherwise.
The first and second ranges are easier to delineate than the second and third, because going from almost no pressure to some real pressure is easy. Going from some pressure to too much pressure is blurrier; there is some overlap between the second and third range. Either way, keep these ranges in mind as I'll be using them in the explanations below.

Ok, so with the physical conditions explained, let's look at what we have to worry about in software:

  • It is impossible to provide a constant input to a trackpoint if you're a puny human. Without a robotic setup you just cannot apply constant pressure so any measurements have some error. You also get to enjoy a feedback loop - pressure influences pointer motion but that pointer motion influences how much pressure you inadvertently apply. This makes any comparison filled with errors. I don't know if I'm applying the same pressure on the two devices I'm testing, I don't know if a user I'm asking to test something uses constant/the same/the right pressure.
  • Not all trackpoints are created equal. Some trackpoints (mostly in Lenovos) have configurable sensitivity - 256 levels of it. [3] So one trackpoint measured does not equal another trackpoint unless you keep track of the firmware-set sensitivity. Those trackpoints also have other toggles. More importantly and AFAIK, this type of trackpoint also has a built-in acceleration curve. [4] Other trackpoints (ALPS) just have a fixed sensitivity; I have no idea whether those have a built-in acceleration curve or merely have a linear-ish pressure->delta mapping.

    Due to some design choices we did years ago, systemd increases the sensitivity on some devices (the POINTINGSTICK_SENSITIVITY property). So even on a vanilla install, you can't actually rely on the trackpoint being set to the manufacturer default. This was in an attempt to make trackpoints behave more consistently, systemd had the hwdb and it seemed like the right place to put device-specific quirks. In hindsight, it was the wrong design choice.
  • Deltas are ... unreliable. At high sensitivity and high pressures you might get a sequence of [7, 7, 14, 8, 3, 7]. At lower pressure you get the deltas at seemingly random intervals. This could be because it's hard to keep exact constant pressure, or it could be a hardware issue.
  • evdev has been the default driver for almost a decade and before that it was the mouse driver for a long time. So the kernel will "Divide 4 since trackpoint's speed is too fast" [sic] for some trackpoints. Or by 8. Or not at all. In other words, the kernel adjusts for what the default user space is and userspace is based on what the kernel provides. On the newest ALPS trackpoints the kernel has stopped doing any in-kernel scaling (good!) but that means that the deltas are out by a factor of 8 now.
  • Trackpoints don't always have the same pressure ranges for x/y. AFAICT the y range is usually a bit less than the x range on many or most trackpoints. A bit weird because the finger position would suggest that strong vertical pressure is easier to apply than sideways pressure.
  • (Some? All?) Trackpoints have built-in calibration procedures to find and set their own center-point. Without that you'll get the trackpoint eventually being ever so slightly off center over time, causing a mouse pointer that just wanders off the screen, possibly into the woods, without the obligatory red cape and basket full of whatever grandma eats when she's sick.

    So the calibration is required but can be triggered accidentally by the user: If you push with the same pressure into the same direction for 2-5 seconds (depending on $THINGS) you trigger the calibration procedure and the current position becomes the new center point. When you release, the cursor wanders off for a few seconds until the calibration sets things straight again. If you ever see the cursor buzz off in a fixed direction or walking backwards for a centimetre or two you've triggered that calibration. The only way to avoid this is to make sure the pointer acceleration mechanism allows you to reach any target within 2 seconds and/or never forces you to apply constant pressure for more than 2 seconds. Now there's a challenge...

Ok. If you've been paying attention instead of hoping for a TLDR that's more elusive than Godot, we're now aware of the various drawbacks of collecting data from a trackpoint. Let's go and look at data. Sensitivity is set to the kernel default of 128 in sysfs, the default reporting rate is 100Hz. All observations are YMMV and whatnot, especially the latter.

Trackpoint deltas are in integers but the dynamic range of delta values is tiny. You mostly get 1 or 2 and it requires quite a fair bit of pressure to get up to 5 or more. At low pressure you get deltas of 1, but less frequently. Visualised, the relationship between deltas and the interval between deltas is like this:

At low pressure, we get deltas of 1 but high intervals. As the pressure increases, the interval between events shrinks until at some point the interval between events matches the reporting rate (100Hz/10ms). Increasing the pressure further now increases the deltas while the intervals remain at the reporting rate. For example, here's an event sequence at low pressure:

E: 63796.187226 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +20ms
E: 63796.227912 0002 0001 0001 # EV_REL / REL_Y 1
E: 63796.227912 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +40ms
E: 63796.277549 0002 0000 -001 # EV_REL / REL_X -1
E: 63796.277549 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +50ms
E: 63796.436793 0002 0000 -001 # EV_REL / REL_X -1
E: 63796.436793 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +159ms
E: 63796.546114 0002 0001 0001 # EV_REL / REL_Y 1
E: 63796.546114 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +110ms
E: 63796.606765 0002 0000 -001 # EV_REL / REL_X -1
E: 63796.606765 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +60ms
E: 63796.786510 0002 0000 -001 # EV_REL / REL_X -1
E: 63796.786510 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +180ms
E: 63796.885943 0002 0001 0001 # EV_REL / REL_Y 1
E: 63796.885943 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +99ms
E: 63796.956703 0002 0000 -001 # EV_REL / REL_X -1
E: 63796.956703 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +71ms
This was me pressing lightly but with perceived constant pressure and the time stamps between events go from 20ms to 180ms. Remember what I said above about unreliable deltas? Yeah, that.

Here's an event sequence from a trackpoint at a pressure that triggers almost constant reporting:


E: 72743.926045 0002 0000 -001 # EV_REL / REL_X -1
E: 72743.926045 0002 0001 -001 # EV_REL / REL_Y -1
E: 72743.926045 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +10ms
E: 72743.939414 0002 0000 -001 # EV_REL / REL_X -1
E: 72743.939414 0002 0001 -001 # EV_REL / REL_Y -1
E: 72743.939414 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +13ms
E: 72743.949159 0002 0000 -002 # EV_REL / REL_X -2
E: 72743.949159 0002 0001 -002 # EV_REL / REL_Y -2
E: 72743.949159 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +10ms
E: 72743.956340 0002 0000 -001 # EV_REL / REL_X -1
E: 72743.956340 0002 0001 -001 # EV_REL / REL_Y -1
E: 72743.956340 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +7ms
E: 72743.978602 0002 0000 -001 # EV_REL / REL_X -1
E: 72743.978602 0002 0001 -001 # EV_REL / REL_Y -1
E: 72743.978602 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +22ms
E: 72743.989368 0002 0000 -001 # EV_REL / REL_X -1
E: 72743.989368 0002 0001 -001 # EV_REL / REL_Y -1
E: 72743.989368 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +11ms
E: 72743.999342 0002 0000 -001 # EV_REL / REL_X -1
E: 72743.999342 0002 0001 -001 # EV_REL / REL_Y -1
E: 72743.999342 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +10ms
E: 72744.009154 0002 0000 -001 # EV_REL / REL_X -1
E: 72744.009154 0002 0001 -001 # EV_REL / REL_Y -1
E: 72744.009154 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +10ms
E: 72744.018965 0002 0000 -002 # EV_REL / REL_X -2
E: 72744.018965 0002 0001 -003 # EV_REL / REL_Y -3
E: 72744.018965 0000 0000 0000 # ------------ SYN_REPORT (0) ---------- +9ms
Note how there is an event in there with 22ms? Maintaining constant pressure is hard. You can re-create the above recordings by running evemu-record.
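
Roughly like this, assuming the trackpoint is event5 on your machine (it probably isn't; running evemu-record without arguments lists the available devices):

$ sudo evemu-record /dev/input/event5 > trackpoint.events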

Pressing hard I get deltas up to maybe 5. That's staying within the second pressure range outlined above, I can force higher deltas but what's the point. So the dynamic range for deltas alone is terrible - we have a grand total of 5 values across the comfortable range.

Changing the sensitivity setting higher than the default will send higher deltas, including deltas greater than 1 before reaching the report rate. Setting it to lower than the default (does anyone do that?) sends smaller deltas. But doing so means changing the hardware properties, similar to how some gaming mice can switch dpi on the fly.
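
For reference, on my T440s this setting is exposed in sysfs via the serio device; the exact path varies between machines, so treat the following as a sketch only:

$ cat /sys/devices/platform/i8042/serio1/serio2/sensitivity
128
$ echo 200 | sudo tee /sys/devices/platform/i8042/serio1/serio2/sensitivity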

I leave you with a fun thought exercise in correlation vs. causation: your trackpoint uses PS/2, your touchpad probably uses PS/2. Your trackpoint has a reporting rate of 100Hz but when you touch the touchpad half the bandwidth is used by the touchpad. So your trackpoint sends half the events when you have the palm resting on the touchpad. From my observations, the deltas don't double in size. In other words, your trackpoint just slows down to roughly half the speed. I can reduce the reporting rate to approximately a third by putting two or more fingers onto the touchpad. Trackpoints haven't changed that much over the years but touchpads have. So the takeaway is: 10 years ago touchpads were smaller and trackpoints were faster. Simply because you could use them without touching the touchpad. Mind blown (if true, measuring these things is hard...)

Well, that was fun, wasn't it. I'm glad you stayed that long, because I did and it'd feel lonely otherwise. In the next post I'll outline the pointer acceleration curves for trackpoints and what we're going to do about that. Besides despairing, that is.

[1] I doubt you will be, but it always pays to be prepared.
[2] In this post I'm using "pressure" here as side-ways pressure, not downwards pressure. Some trackpoints can handle downwards pressure and modify the acceleration based on it (or expect userland to do so).
[3] Not that this number is always correct, the Lenovo CompactKeyboard USB with Trackpoint has a default sensitivity of 5 - any laptop trackpoint would be unusable at that low value (their default is 128).
[4] I honestly don't know this for sure but ages ago I found a hw spec document that actually detailed the process. Search for "TrackPoint System Version 4.0 Engineering Specification", page 43 "2.6.2 DIGITAL TRANSFER FUNCTION"

Updating Wacom Firmware In Linux

Posted by Richard Hughes on June 06, 2018 10:11 AM

I’ve been working with Wacom engineers for a few months now, adding support for the custom update protocol used in various tablet devices they build. The new wacomhid plugin will be included in the soon-to-be released fwupd 1.0.8 and will allow you to safely update the bluetooth, touch and main firmware of devices that support the HID protocol. Wacom is planning a new device that will ship with LVFS support out-of-the-box.

[Photo: My retail device now has a 0.05″ SWI debugging header installed…]

In other news, we now build both flatpak and snap versions of the standalone fwupdtool tool that can be used to update all kinds of hardware on distributions that won’t (or can’t) easily update the system version of fwupd. This lets you easily, for example, install the Logitech unifying security updates when running older versions of RHEL using flatpak and update the Dell Thunderbolt controller on Ubuntu 16.04 using snapd. Neither bundle installs the daemon or fwupdmgr by design, and both require running as root (and outside the sandbox) for obvious reasons. I’ll upload the flatpak to flathub when fwupd and all the deps have had stable releases. Until then, my flatpak bundle is available here.
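
Once one of those bundles (or a system fwupd) is installed, the standalone tool is driven directly from the command line, roughly like this (the .cab file name is just a placeholder):

$ sudo fwupdtool get-devices
$ sudo fwupdtool install some-firmware-update.cab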

Working with the Wacom engineers has been a pleasure, and the hardware is designed really well. The next graphics tablet you buy can now be 100% supported in Linux. More announcements soon.