Fedora desktop Planet

Verifying your system state in a secure and private way

Posted by Matthew Garrett on January 20, 2020 12:53 PM
Most modern PCs have a Trusted Platform Module (TPM) and firmware that, together, support something called Trusted Boot. In Trusted Boot, each component in the boot chain generates a series of measurements of the next component of the boot process and relevant configuration. These measurements are pushed to the TPM where they're combined with the existing values stored in a series of Platform Configuration Registers (PCRs) in such a way that the final PCR value depends on both the value and the order of the measurements it's given. If any measurements change, the final PCR value changes.

Windows takes advantage of this with its Bitlocker disk encryption technology. The disk encryption key is stored in the TPM along with a policy that tells it to release it only if a specific set of PCR values is correct. By default, the TPM will release the encryption key automatically if the PCR values match and the system will just transparently boot. If someone tampers with the boot process or configuration, the PCR values will no longer match and boot will halt to allow the user to provide the disk key in some other way.

Unfortunately the TPM keeps no record of how it got to a specific state. If the PCR values don't match, that's all we know - the TPM is unable to tell us what changed to result in this breakage. Fortunately, the system firmware maintains an event log as we go along. Each measurement that's pushed to the TPM is accompanied by a new entry in the event log, containing not only the hash that was pushed to the TPM but also metadata that tells us what was measured and why. Since the algorithm the TPM uses to calculate the hash values is known, we can replay the same values from the event log and verify that we end up with the same final value that's in the TPM. We can then examine the event log to see what changed.
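
To make the replay step concrete, here's a minimal sketch of extending a simulated PCR from a list of event digests, assuming a SHA-256 PCR bank and using OpenSSL for hashing. Parsing the actual TCG event log format is left out, and the two digests below are made-up placeholders.

/* Minimal PCR replay sketch, assuming a SHA-256 bank. Build with:
 *   gcc pcr-replay.c -lcrypto
 * Real code would parse the TCG event log to get the per-event digests;
 * the two digests below are placeholders. */
#include <stdio.h>
#include <string.h>
#include <openssl/sha.h>

/* PCR extend: new = SHA256(old || measurement) */
static void pcr_extend(unsigned char pcr[SHA256_DIGEST_LENGTH],
                       const unsigned char digest[SHA256_DIGEST_LENGTH])
{
        unsigned char buf[SHA256_DIGEST_LENGTH * 2];

        memcpy(buf, pcr, SHA256_DIGEST_LENGTH);
        memcpy(buf + SHA256_DIGEST_LENGTH, digest, SHA256_DIGEST_LENGTH);
        SHA256(buf, sizeof(buf), pcr);
}

int main(void)
{
        unsigned char pcr[SHA256_DIGEST_LENGTH] = { 0 };   /* PCRs start zeroed */
        unsigned char events[2][SHA256_DIGEST_LENGTH] = { { 0x01 }, { 0x02 } };

        for (int i = 0; i < 2; i++)
                pcr_extend(pcr, events[i]);

        /* compare this against the (signed) PCR value read from the TPM */
        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
                printf("%02x", pcr[i]);
        printf("\n");
        return 0;
}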

Unfortunately, the event log is stored in unprotected system RAM. In order to be able to trust it we need to compare the values in the event log (which can be tampered with) with the values in the TPM (which are much harder to tamper with). Unfortunately if someone has tampered with the event log then they could also have tampered with the bits of the OS that are doing that comparison. Put simply, if the machine is in a potentially untrustworthy state, we can't trust that machine to tell us anything about itself.

This is solved using a procedure called Remote Attestation. The TPM can be asked to provide a digital signature of the PCR values, and this can be passed to a remote system along with the event log. That remote system can then examine the event log, make sure it corresponds to the signed PCR values and make a security decision based on the contents of the event log rather than just on the final PCR values. This makes the system significantly more flexible and aids diagnostics. Unfortunately, it also means you need a remote server and an internet connection, and then some way for that remote server to tell you whether it thinks your system is trustworthy, and also some way to believe that the remote server is trustworthy, and all of this is, well, not ideal if you're not an enterprise.

Last week I gave a talk at linux.conf.au on one way around this. Basically, remote attestation places no constraints on the network protocol in use - while the implementations that exist all do this over IP, there's no requirement for them to do so. So I wrote an implementation that runs over Bluetooth, in theory allowing you to use your phone to serve as the remote agent. If you trust your phone, you can use it as a tool for determining if you should trust your laptop.

I've pushed some code that demos this. The current implementation does nothing other than tell you whether UEFI Secure Boot was enabled or not, and it's also not currently running on a phone. The phone bit of this is pretty straightforward to fix, but the rest is somewhat harder.

The big issue we face is that we frequently don't know what event log values we should be seeing. The first few values are produced by the system firmware and there's no standardised way to publish the expected values. The Linux Vendor Firmware Service has support for publishing these values, so for some systems we can get hold of this. But then you get to measurements of your bootloader and kernel, and those change every time you do an update. Ideally we'd have tooling for Linux distributions to publish known good values for each package version and for that to be common across distributions. This would allow tools to download metadata and verify that measurements correspond to legitimate builds from the distribution in question.

This does still leave the problem of the initramfs. Since initramfs files are usually generated locally, and depend on the locally installed versions of tools at the point they're built, we end up with no good way to precalculate those values. I proposed a possible solution to this a while back, but have done absolutely nothing to help make that happen. I suck. The right way to do this may actually just be to turn initramfs images into pre-built artifacts and figure out the config at runtime (dracut actually supports a bunch of this already), so I'm going to spend a while playing with that.

If we can pull these pieces together then we can get to a place where you can boot your laptop and then, before typing any authentication details, have your phone compare each component in the boot process to expected values. Assistance in all of this extremely gratefully received.


Fedora Firefox team at 2019

Posted by Martin Stransky on January 07, 2020 02:43 PM

I think the last year was the strongest one in the whole Fedora Firefox team history. We have always contributed to Mozilla, but in 2019 we finished some major outstanding projects upstream and also shipped them in Fedora.

The first finished project I’d like to mention is the system titlebar being disabled by default on Gnome. The Firefox UI on Linux finally matches Windows/MacOS and provides a similar user experience. We also implemented various tweaks like styled and HiDPI titlebar button rendering and left/right button placement.

A rather small change in terms of code, but one with a big impact, was the gcc optimization with PGO/LTO. In cooperation with Jakub Jelinek and the SuSE guys we managed to match and even slightly outperform the default Mozilla Firefox binaries, which are built with clang. I’m going to post more accurate numbers in a follow-up post; some were already published by a Czech Linux magazine.

The Firefox Gnome search provider is another small but useful feature we introduced last year. It’s not integrated upstream yet because it needs an update for an upcoming async history lookup API on the Firefox side, but we ship it as a tech preview to get more user feedback.

And then there’s our biggest project so far – Firefox with a native Wayland backend. Fedora 31 ships it by default for Gnome, which closes the initial developer phase, and we can focus on polishing, bug fixing and adding more features now. It also extends the Gtk2 to Gtk3 transition. Many people from Mozilla and outside of it helped with it, and some of them are brand new contributors to Firefox, which is awesome.

The Wayland backend is going to get more and more features in the future. We’re investigating the possible advantages of a DMA-BUF backend, which can be used for HW accelerated video playback or direct WebGL rendering. We need to address the lack of Xvfb on Wayland in order to run tests and build Firefox with PGO/LTO there. We’re also going to look at other Wayland compositors like Plasma and Sway to make sure Firefox works fine there – so many challenges and a lot of fun are waiting for fearless fox hackers! 😉

Wifi deauthentication attacks and home security

Posted by Matthew Garrett on December 27, 2019 03:26 AM
I live in a large apartment complex (it's literally a city block big), so I spend a disproportionate amount of time walking down corridors. Recently one of my neighbours installed a Ring wireless doorbell. By default these are motion activated (and the process for disabling motion detection is far from obvious), and if the owner subscribes to an appropriate plan these recordings are stored in the cloud. I'm not super enthusiastic about the idea of having my conversations recorded while I'm walking past someone's door, so I decided to look into the security of these devices.

One visit to Amazon later and I had a refurbished Ring Video Doorbell 2™ sitting on my desk. Tearing it down revealed it uses a TI SoC that's optimised for this sort of application, linked to a DSP that presumably does stuff like motion detection. The device spends most of its time in a sleep state where it generates no network activity, so on any wakeup it has to reassociate with the wireless network and start streaming data.

So we have a device that's silent and undetectable until it starts recording you, which isn't a great place to start from. But fortunately wifi has a few, uh, interesting design choices that mean we can still do something. The first is that even on an encrypted network, the packet headers are unencrypted and contain the address of the access point and whichever device is communicating. This means that it's possible to just dump whatever traffic is floating past and build up a collection of device addresses. Address ranges are allocated by the IEEE, so it's possible to map the addresses you see to manufacturers and get some idea of what's actually on the network[1] even if you can't see what they're actually transmitting. The second is that various management frames aren't encrypted, and so can be faked even if you don't have the network credentials.

The most interesting one here is the deauthentication frame that access points can use to tell clients that they're no longer welcome. These can be sent for a variety of reasons, including resource exhaustion or authentication failure. And, by default, they're entirely unprotected. Anyone can inject such a frame into your network and cause clients to believe they're no longer authorised to use the network, at which point they'll have to go through a new authentication cycle - and while they're doing that, they're not able to send any other packets.

So, the attack is to simply monitor the network for any devices that fall into the address range you want to target, and then immediately start shooting deauthentication frames at them once you see one. I hacked airodump-ng to ignore all clients that didn't look like a Ring, and then pasted in code from aireplay-ng to send deauthentication packets once it saw one. The problem here is that wifi cards can only be tuned to one frequency at a time, so unless you know the channel your potential target is on, you need to keep jumping between frequencies while looking for a target - and that means a target can potentially shoot off a notification while you're looking at other frequencies.
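
For reference, a deauthentication frame is tiny. The sketch below just lays one out in memory and prints it; the addresses and reason code are placeholders, and actually injecting it needs a monitor-mode interface and a radiotap header, which is the part aireplay-ng takes care of.

/* Sketch of an 802.11 deauthentication frame as it appears on the wire.
 * The MAC addresses and reason code are placeholders; injection (monitor
 * mode, radiotap header) is deliberately left out. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

struct deauth_frame {
        uint8_t frame_control[2];  /* 0xc0 0x00: management / deauthentication */
        uint8_t duration[2];
        uint8_t dest[6];           /* the client being kicked off */
        uint8_t src[6];            /* spoofed as the access point */
        uint8_t bssid[6];
        uint8_t seq_ctrl[2];
        uint8_t reason[2];         /* e.g. 7: class 3 frame from nonassociated STA */
} __attribute__((packed));

int main(void)
{
        const uint8_t ap[6]     = { 0x00, 0x11, 0x22, 0x33, 0x44, 0x55 };
        const uint8_t client[6] = { 0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb };
        struct deauth_frame f = {
                .frame_control = { 0xc0, 0x00 },
                .reason        = { 0x07, 0x00 },
        };
        const uint8_t *p = (const uint8_t *)&f;

        memcpy(f.dest, client, sizeof(f.dest));
        memcpy(f.src, ap, sizeof(f.src));
        memcpy(f.bssid, ap, sizeof(f.bssid));

        for (size_t i = 0; i < sizeof(f); i++)
                printf("%02x ", p[i]);
        printf("\n");
        return 0;
}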

But even with that proviso, this seems to work reasonably reliably. I can hit the button on my Ring, see it show up in my hacked up code and see my phone receive no push notification. Even if it does get a notification, the doorbell is no longer accessible by the time I respond.

There are a couple of ways to avoid this attack. The first is to use 802.11w, which protects management frames. A lot of hardware supports this, but it's generally disabled by default. The second is to just ignore deauthentication frames in the first place, which is a spec violation - but then you're already building a device that exists to record strangers engaging in a range of legal activities, so paying attention to social norms is clearly not a priority in any case.

Finally, none of this is even slightly new. A presentation from Def Con in 2016 covered this, demonstrating that Nest cameras could be blocked in the same way. The industry doesn't seem to have learned from this.

[1] The Ring Video Doorbell 2 just uses addresses from TI's range rather than anything Ring specific, unfortunately


More on Flatpak updates

Posted by Matthias Clasen on December 20, 2019 03:33 AM

The last time I talked about flatpak updates, I explained how flatpak apps can detect that a newer version has been installed, and restart themselves. That is great, and may almost be good enough when you have automatic updates. But that is not always the case.

Thankfully, we can do better. Since 1.5, Flatpak has a portal API that lets applications monitor for updates, and request updating themselves.

Here is how this looks when it is all put together:

(Video: https://blogs.gnome.org/mclasen/files/2019/12/update-monitor.webm)

In the terminal, I’m building a new version of the portal test app and updating my (local) repository. The flatpak portal notices that the update appeared (I’m running it with a short poll timeout here, instead of the usual 30 minutes), and sends out a D-Bus signal to the application, which requests to be updated and then restarts itself.

Using the portal API directly is not very convenient, since you have to listen to D-Bus signals and whatnot. Therefore, we now have a library called libportal, which provides simple async wrappers for most portals. That is what the portal test app in the demo is using, and you should be using it too in your applications.
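
To give an idea of what “directly” means here: stripped of error handling, it boils down to a plain GDBus signal subscription like the sketch below. The bus name, interface and signal name are written from memory of the Flatpak portal API, so treat them as assumptions; libportal hides all of this behind a single async call.

/* Rough sketch of watching for the update signal with plain GDBus. The
 * portal bus/interface/signal names are assumptions and may not be exact;
 * this is the sort of boilerplate libportal wraps for you.
 * Build with: gcc monitor.c $(pkg-config --cflags --libs gio-2.0) */
#include <gio/gio.h>

static void
update_available_cb (GDBusConnection *conn,
                     const char      *sender,
                     const char      *object_path,
                     const char      *interface,
                     const char      *signal,
                     GVariant        *params,
                     gpointer         user_data)
{
  g_print ("Update available - ask the portal to install it, then restart\n");
}

int
main (void)
{
  GDBusConnection *bus = g_bus_get_sync (G_BUS_TYPE_SESSION, NULL, NULL);
  GMainLoop *loop = g_main_loop_new (NULL, FALSE);

  g_dbus_connection_signal_subscribe (bus,
                                      "org.freedesktop.portal.Flatpak",               /* assumed name */
                                      "org.freedesktop.portal.Flatpak.UpdateMonitor", /* assumed interface */
                                      "UpdateAvailable",                              /* assumed signal */
                                      NULL, NULL,
                                      G_DBUS_SIGNAL_FLAGS_NONE,
                                      update_available_cb, NULL, NULL);
  g_main_loop_run (loop);
  return 0;
}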

The first stable release of libportal will appear very soon, with Flatpak 1.6, and then it will find its way into runtimes.

Update: Since this is a portal, users are in control of what apps are allowed to do. If you don’t want an application to update itself, you can put an end to it with

flatpak permission-set flatpak updates $APPID no

Use ‘ask’ instead of ‘no’ to get a confirmation dialog. The permission-set command is new in flatpak 1.6.

GMemoryMonitor (low-memory-monitor, 2nd phase)

Posted by Bastien Nocera on December 17, 2019 11:53 PM
TL;DR

Use GMemoryMonitor in glib 2.63.3 and newer in your applications to lower overall memory usage, and detect low memory conditions.

low-memory-monitor

To start with, let's come back to low-memory-monitor, announced at the end of August.

It's not really a “low memory monitor”. I know, the name is deceiving, but it actually monitors memory pressure stalls, and how hard it is for the kernel to allocate memory when applications need it. The more memory pressure there is, the longer the kernel takes to satisfy an allocation, usually because it needs to move memory around to make room for a big one, for example when an application starts up or prepares an in-memory buffer for saving.
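
The kernel interface behind this is the pressure stall information (PSI) in /proc/pressure/memory. The snippet below just dumps it so you can see the data low-memory-monitor works from; the daemon itself sets up poll()-based triggers on that file rather than reading it once.

/* Dump the kernel's memory pressure stall information. Requires a kernel
 * built with PSI support; low-memory-monitor sets up poll() triggers on
 * this file instead of one-shot reads. */
#include <stdio.h>

int main(void)
{
        char line[256];
        FILE *f = fopen("/proc/pressure/memory", "r");

        if (!f) {
                perror("/proc/pressure/memory");
                return 1;
        }
        /* two lines, "some ..." and "full ...", each with avg10/avg60/avg300/total */
        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);
        fclose(f);
        return 0;
}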

It is not a daemon that will kill programs on low memory. It's not a user-space out-of-memory killer, and does not take those policy decisions. It can however be configured to ask the kernel to do that. The kernel doesn't really know what it's doing though, and user-space isn't helping either, so best disable that for now...

As listed in low-memory-monitor's README (and in the announcement post), there were a number of similar projects around, but none that would offer everything we needed, e.g.:
  • Has a D-Bus interface to propagate low memory conditions
  • Requires Linux 5.2's kernel memory pressure stalls information (Android's lowmemorykiller daemon has loads of code to get the same information from the kernel for older versions, and it really is quite a lot of code)
  • Written in a compiled language to save on startup/memory usage costs (around 500 lines of C code, as counted by sloccount)
  • Built-in policy, based upon values used in Android and Endless OS
GMemoryMonitor

Next up, in our effort to limit memory usage, we'll need some help from applications. That's where GMemoryMonitor comes in. It's simple enough: listen to the low-memory-warning signal and, when you receive it, free some image thumbnails or index caches, or dump some data to disk.

The signal also gives you a “warning level”, with 255 being when low-memory-monitor would trigger the kernel's OOM killer, and lower values different levels of “try to be a good citizen”.
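
A minimal consumer looks like this; what you actually do in the callback is up to your application, this sketch just prints the level.

/* Minimal GMemoryMonitor consumer; needs glib 2.63.3 or newer.
 * Build with: gcc memwatch.c $(pkg-config --cflags --libs gio-2.0) */
#include <gio/gio.h>

static void
low_memory_cb (GMemoryMonitor             *monitor,
               GMemoryMonitorWarningLevel  level,
               gpointer                    user_data)
{
  /* drop thumbnails, index caches, etc.; 255 (CRITICAL) means the kernel's
   * OOM killer is about to get involved */
  g_print ("low-memory-warning, level %d\n", level);
}

int
main (void)
{
  GMemoryMonitor *monitor = g_memory_monitor_dup_default ();
  GMainLoop *loop = g_main_loop_new (NULL, FALSE);

  g_signal_connect (monitor, "low-memory-warning",
                    G_CALLBACK (low_memory_cb), NULL);
  g_main_loop_run (loop);

  g_object_unref (monitor);
  g_main_loop_unref (loop);
  return 0;
}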

The more astute amongst you will have noticed that low-memory-monitor runs as root, on the system bus, and will wonder how those new-fangled (5 years old today!) sandboxed applications would receive those signals. Fear not! Support for a portal version of GMemoryMonitor landed in xdg-desktop-portal on the same day as in glib. Everything is tied together with installed tests that use the real xdg-desktop-portal to test the portal and unsandboxed versions.

How about an OOM killer?

By using memory pressure stall information, we receive information about the state of the kernel before getting into swapping that'd cause the machine to become unusable. This also means that, as our threshold for keeping everything ticking is low, if we were to kill high memory consumers, we'd get a butter smooth desktop, but, based on my personal experience, your browser and your mail client would take it in turns disappearing from your desktop in a way that you wouldn't even notice.

We'll definitely need to think about our next step in application state management, and changing our running applications paradigm.

Distributions should definitely disable the OOM killer for now, and possibly try their hand at upstreaming some systemd OOMPolicy and OOMScoreAdjust options for system daemons.

Conclusion

Creating low-memory-monitor was easy enough; getting everything else in place was decidedly more complicated. In addition to requiring changes to glib, xdg-desktop-portal and python-dbusmock, it also required a lot of work on the glib CI to save me from having to write integration tests in C that would have required a lot of scaffolding. So thanks to all involved, in particular Philip Withnall for his patience reviewing my changes.

Dual-GPU support follow-up: NVIDIA driver support

Posted by Bastien Nocera on December 13, 2019 04:15 PM
If you remember, back in 2016, I did the work to get a “Launch on Discrete GPU” menu item added to applications in gnome-shell.

This cycle I worked on adding support for the NVIDIA proprietary driver, so that the menu item shows up, and the right environment variables are used to launch applications on that device.

Tested with another unsupported device...


Behind the scenes

There were a number of problems with the old detection code in switcheroo-control:
- it required the graphics card to use vga_switcheroo in the kernel, which the NVIDIA driver didn't do
- it couldn't support more than 2 GPUs
- and it didn't really actually know which GPU was going to be the “main” one

And, on top of all that, gnome-shell expected the Mesa OpenGL stack to be used, so it only knew the right environment variables to do that, and only for one secondary GPU.

So we've extended switcheroo-control and its API to do all this.

(As a side note, commenters asked me about the KDE support, and how it would integrate, and it turns out that KDE's code just checks for the presence of a file in /sys, which is only present when vga_switcheroo is used. So I would encourage KDE to adopt the switcheroo-control D-Bus API for this)

Closing

All this will be available in Fedora 32, using GNOME 3.36 and switcheroo-control 2.0. We might backport this to Fedora 31 after it's been tested, and if there is enough interest.

Improving the security model of the LVFS

Posted by Richard Hughes on December 11, 2019 08:34 AM

There are lots of layers of security in the LVFS and fwupd design, including restricted account modes, 2FA, and server side AppStream namespaces. The most powerful one is the so-called vendor-id, which the vendors cannot assign themselves and which is assigned by me when creating the vendor account on the LVFS. The way this works is that all firmware from the vendor is tagged with a vendor-id string like USB:0x056A, which in this case matches the USB consortium-assigned vendor ID. Client side, the vendor-id from the signed metadata is checked against the physical device and the firmware is updated only if the ID matches. This ensures that malicious or careless users on the LVFS can never ship firmware updates for other vendors’ hardware. About 90% of the vendors on the LVFS are locked down with this mechanism.

Some vendors have to have IDs that they don’t actually own; a good example here is a DFU device like the 8bitdo controllers. In runtime mode they use the USB-assigned 8bitdo VID, but in bootloader mode they use a generic VID which is assigned to the chip supplier, as they are using the reference bootloader. This is obviously fine, and both vendor IDs are assigned to 8bitdo on the LVFS for this reason. Another example is where Lenovo is responsible for updating Lenovo-specific NVMe firmware, but where the NVMe vendor ID isn’t always Lenovo’s PCI ID.

Where this breaks down a little more is for hardware devices that don’t have a built-in assigned vendor mapping. There are three plugins which are causing minor headaches:

  • Redfish — there’s seemingly no PCI vendor code for the enumerated devices themselves
  • ATA — the ATA-ATAPI-5 specification bizarrely makes no mention of any kind of vendor ID in the IDENTIFY block
  • UEFI — the ESRT table frustratingly just lists the version number and the GUID of devices, but no actual sysfs link to each

All the other plugins can be handled in a sane way, mostly automatically as the vast majority derive from either FuUsbDevice or FuUdevDevice.

As UEFI UpdateCapsule updates seem to be the 2nd most popular way to distribute firmware updates, we probably ought to think of a sane way of limiting firmware updates to the existing BIOS vendor. We could query the DMI data, so that for instance Lenovo is only able to update Lenovo hardware — but we would have to use a made-up pseudo-vendor-id of DMI:Lenovo. Maybe this isn’t so bad. Perhaps the vendor ID isn’t so useful with UEFI UpdateCapsule, as the capsules themselves have to be signed by the firmware vendor before they’ll actually be run.

Anyway, to the point of this blog post: until recently fwupd would refuse to apply the update if the metadata contained a vendor-id but the device had not set one. This situation might now happen if, for instance, a vendor had no vendor-id because the device traditionally had no PCI or USB VID, and in newer versions of fwupd the device now has a virtual ID, so the vendor could be locked down on the LVFS. The fix here is to ignore the metadata vendor-id if there’s no device vendor-id, rather than failing to update.
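
Spelled out as code, the client-side policy now amounts to something like this; a simplified sketch, not fwupd’s actual implementation.

/* Simplified sketch of the vendor-id policy described above - not fwupd's
 * actual code. A mismatch still blocks the update, but a device without any
 * vendor-id no longer causes a failure when the metadata carries one. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool
vendor_id_check (const char *metadata_vendor_id, const char *device_vendor_id)
{
        if (metadata_vendor_id == NULL)         /* nothing to enforce */
                return true;
        if (device_vendor_id == NULL)           /* the fix: ignore, don't fail */
                return true;
        return strcmp (metadata_vendor_id, device_vendor_id) == 0;
}

int
main (void)
{
        printf ("%d\n", vendor_id_check ("USB:0x056A", "USB:0x056A"));  /* 1: match */
        printf ("%d\n", vendor_id_check ("USB:0x056A", "USB:0x1234"));  /* 0: refused */
        printf ("%d\n", vendor_id_check ("USB:0x056A", NULL));          /* 1: after the fix */
        return 0;
}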

Most people should be running fwupd 1.3.x, which is the latest and greatest branch of fwupd. I appreciate some LTS distros can’t rebase to a newer minor version, and so for old versions of fwupd I’ve backported the fix. These are the fixes you want if you’re running 0.9.x, 1.0.x, 1.1.x or 1.2.x.

I’ll make the vendor-id a hard requirement for all vendors in about 6 months time, so if you maintain a distro packaged version of fwupd you have that much time before some updates will stop working. If anyone has comments or concerns, please let me know.

OSFC 2019 – Introducing the Linux Vendor Firmware Service

Posted by Richard Hughes on December 04, 2019 12:00 PM

A few months ago I gave a talk at OSFC.io titled Introducing the Linux Vendor Firmware Service.

If you have a few minutes it’s a really useful high-level view of the entire architecture, along with a few quick dives into some of the useful things the LVFS can do. Questions and comments welcome!

coverity scan

Posted by Caolán McNamara on November 23, 2019 08:26 PM
When we made C++17 a requirement for LibreOffice at the end of 2018, the version of coverity provided by scan.coverity.com no longer worked for us. In July 2019 a newer version of the coverity tooling became available which supported C++17, and analysis resumed.

Prior to losing coverity support we had a defect density (i.e. defects per 1,000 lines of code) of 0; on its return this had inflated to 0.06, due to both new defects introduced during the down period and old defects newly detected by additional checks introduced in the new version.

Today we're finally back to 0.

Growing the fwupd ecosystem

Posted by Richard Hughes on November 19, 2019 12:26 PM

Yesterday I wrote a blog about what hardware vendors need to provide so I can write them a fwupd plugin. A few people contacted me telling me that I should make it more generic, as I shouldn’t be the central point of failure in this whole ecosystem. The sensible thing, of course, is growing the “community” instead, and building up a set of (paid) consultants that can help the OEMs and ODMs, only getting me involved to review pull requests or for general advice. This would certainly reduce my current feeling of working at 100% and trying to avoid burnout.

As a first step, I’ve created an official page that will list any consulting companies that I feel are suitable to recommend for help with fwupd and the LVFS. The hardware vendors would love to throw money at this stuff, so they don’t have to care about upstream project release schedules and dealing with a grumpy maintainer like me. I’ve pinged the usual awesome people like Igalia, and hopefully more companies will be added to this list during the next few days.

If you do want your open-source consultancy to be added, please email me a two paragraph corporate-friendly blurb I can include on that new page, also with a link I can use for the “more details” button. If you’re someone I’ve not worked with before, you should be in a position to explain the difference between a capsule update and a DFU update, and be able to tell me what a version format is. I don’t want to be listing companies that don’t understand what fwupd actually is :)

Google and fwupd sitting in a tree

Posted by Richard Hughes on November 18, 2019 03:41 PM

I’ve been told by several sources (but not by Google directly, heh) that from Christmas onwards the “Designed for ChromeBook” sticker requires hardware vendors to use fwupd rather than random non-free binaries. This does make a lot of sense for Google, as the firmware flash tools I’ve seen the source for are often decades old, contain layers upon layers of abstractions, have dubious input sanitisation and are quite horrible to use. Many are setuid, which doesn’t make me sleep well at night, and I suspect the same is true for the security team at Google. Most vendor binaries are built for the specific ODM hardware device, and all but one of them don’t use any kind of source control or formal review process.

The requirement from Google has caused mild panic among silicon suppliers and ODMs, as they’re having to actually interact with an open source upstream project and a slightly grumpy maintainer that wants to know lots of details about hardware that doesn’t implement one of the dozens of existing protocols that fwupd supports. These are companies that have never had to deal with working with “outside” people to develop software, and it probably comes as quite a shock to the system. To avoid repeating myself these are my basic rules when adding support for a device with a custom protocol in fwupd:

  • I can give you advice on how to write the plugin if you give me the specifications without signing an NDA, and/or the existing code under an LGPLv2+ license. From experience, we’ll probably not end up using any of your old code in fwupd, but the error defines and function names might be similar, and I don’t want anyone to get “tainted” from looking at non-free code, so it’s safest all round if we have some reference code marked with the right license that actually compiles on Fedora 31. Yes, I know asking the legal team about releasing previously-nonfree code with a GPLish licence is difficult.
  • If you are running Linux, and want our help to debug or test your new plugin, you need to be running Fedora 30 or 31. If you run Ubuntu you’ll need to use the snap version of fwupd, and I can’t help you with random Ubuntu questions or interactions between the snap version and the distro version. I know your customer might be running Debian Stable or Ubuntu LTS, but that’s not what I’m paid to support. If you do use Fedora 29+ or RHEL 7+ you can also use the nice COPR I provide with git snapshots of master.
  • Please reflect the topology of your device. If writes have to go through another interface, passthru or IC, please give us access to documentation about that device too. I’m fed up with having to reverse engineer protocols from looking at the “wrong side” of the client source code. If the passthru is implemented by a different vendor, they’ll need to work on the same terms as this.
  • If you want to design and write all of the plugin yourself, that’s awesome, but please follow the existing style and don’t try to wrap your existing code base with the fwupd plugin API. If your device has three logical children with different version numbers or firmware formats, we want to see three devices in fwupdmgr. If you want to restrict the child devices to a parent vendor, that’s fine, we now support that in fwupd and on the LVFS. If you’re adding custom InstanceIDs, these have to be documented in the README.md file.
  • If you’re using an nonstandard firmware format (as in, not DFU, Intel HEX or Motorola SREC) then you’ll need to write a firmware parser that’s going to be valgrind’ed and fuzzed. We will need all the header/footer documentation so we can verify the parser and add some small redistributable fuzz targets. If the blob is being passed to the hardware without parsing, you still might need to know the format of the header so that the plugin can do a sanity check that the firmware is suitable for the hardware, and that any internal CRC is actually correct. All the firmware parsers have to be paranoid and written defensively, because it’s me that looks bad on LWN if CVEs get issued.
  • If you want me to help with the plugin, I’m probably going to ask for test hardware, and two different versions of the firmware that can actually be flashed to the hardware you sent. A bare PCB is fine, but if you send me something please let me know so I can give you my personal address rather than have to collect it from a Red Hat office. If you send me hardware, ensure you also include a power supply that’s going to work in the UK, e.g. 240V. If you want it back, you’ll also need to provide me with a UPS/DHL collection sticker.
  • You do need to think about how to present your device version number, e.g. is 0x12345678 meant to be presented as “12.34.5678” or “18.52.86.120” (see the sketch after this list) – the LVFS really cares if this is correct, and users want to see the “same” version numbers as on the OEM web-page.
  • You also need to know if the device is fully functional during the update, or if it operates in a degraded or bootloader mode. We also need to know what happens if flashing fails, e.g. is the device a brick, or is there some kind of A/B partition that makes a flash failure harmless? If the device is a brick, how can it be recovered without an RMA?
  • After the update is complete fwupd needs to “restart” the device so that the new firmware version can be verified, so there needs to be some kind of command the device understands – we can ask the user to reboot or re-plug the device if this is the only way to do this, although in 2019 we can really do better than that.
  • If you’re sharing a huge LGPLv2+ lump of code, we need access to someone who actually understands it, preferably the person that wrote it in the first place. Typically the code is uncommented and a recipe for a headache so being able to ask a human questions is invaluable. For this, either IRC, email or even just communicating via a shared Google doc (more common than you would believe…) is fine. I can’t discuss this stuff on Telegram, Hangouts or WhatsApp, sorry.
  • Once a plugin exists in fwupd and is upstream, we will expect pull requests to add either more VID/PIDs, #defines or to add variations to the protocol for new versions of the hardware. I’m going to be grumpy if I just get sent a random email with demands about backporting all the VID/PIDs to Debian stable. I have zero control on when Debian backports anything, and very little influence on when Ubuntu does a SRU. I have a lot of influence on when various Fedora releases get a new fwupd, and when RHEL gets backports for new hardware support.
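
On the version-number bullet above, here's a standalone illustration of why the format matters: the same raw 0x12345678 reads completely differently depending on whether it's split as hex nibble pairs or as four bytes. This isn't fwupd's version-format code, just the arithmetic.

/* Standalone illustration of the version format problem: the same raw
 * 0x12345678 presented two different, equally plausible ways. This is not
 * fwupd's version-format code, just the arithmetic. */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
        uint32_t v = 0x12345678;

        /* hex-pair / BCD style: "12.34.5678" */
        printf("%02x.%02x.%04x\n",
               (unsigned)(v >> 24), (unsigned)((v >> 16) & 0xff),
               (unsigned)(v & 0xffff));

        /* quad style: "18.52.86.120" */
        printf("%u.%u.%u.%u\n",
               (unsigned)(v >> 24), (unsigned)((v >> 16) & 0xff),
               (unsigned)((v >> 8) & 0xff), (unsigned)(v & 0xff));
        return 0;
}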

Now, if all this makes me sound like a grumpy upstream maintainer then I apologize. I’m currently working with about half a dozen silicon suppliers who all failed some or all of the above bullets. I’m multiplexing myself with about a dozen companies right now, and supporting fwupd isn’t actually my entire job at Red Hat. I’m certainly not going to agree to “signing off a timetable” for each vendor as none of the vendors actually pay me to do anything…

Given interest in fwupd has exploded in the last year or so, I wanted to post something like this rather than have a 10-email back and forth about my expectations with each vendor. Some OEMs and even ODMs are now hiring developers with Linux experience, and I’m happy to work with them as fwupd becomes more important. I’ve already helped quite a few developers at random vendors get up to speed with fwupd and would be happy to help more. As the importance of fwupd and the LVFS grows more and more, vendors will need to hire developers who can build, extend and support their hardware. As fwupd grows, I’ll be asking vendors to do more of the work, as “get upstream to do it” doesn’t scale.

Extending proprietary PC embedded controller firmware

Posted by Matthew Garrett on November 18, 2019 08:19 AM
I'm still playing with my X210, a device that just keeps coming up with new ways to teach me things. I'm now running Coreboot full time, so the majority of the runtime platform firmware is free software. Unfortunately, the firmware that's running on the embedded controller (a separate chip that's awake even when the rest of the system is asleep and which handles stuff like fan control, battery charging, transitioning into different power states and so on) is proprietary and the manufacturer of the chip won't release data sheets for it. This was disappointing, because the stock EC firmware is kind of annoying (there's no hysteresis on the fan control, so it hits a threshold, speeds up, drops below the threshold, turns off, and repeats every few seconds - also, a bunch of the Thinkpad hotkeys don't do anything) and it would be nice to be able to improve it.

A few months ago someone posted a bunch of fixes, a Ghidra project and a kernel patch that lets you overwrite the EC's code at runtime for purposes of experimentation. This seemed promising. Some amount of playing later and I'd produced a patch that generated keyboard scancodes for all the missing hotkeys, and I could then use udev to map those scancodes to the keycodes that the thinkpad_acpi driver would generate. I finally had a hotkey to tell me how much battery I had left.

But something else included in that post was a list of the GPIO mappings on the EC. A whole bunch of hardware on the board is connected to the EC in ways that allow it to control them, including things like disabling the backlight or switching the wifi card to airplane mode. Unfortunately the ACPI spec doesn't cover how to control GPIO lines attached to the embedded controller - the only real way we have to communicate is via a set of registers that the EC firmware interprets and does stuff with.

One of those registers in the vendor firmware for the X210 looked promising, with individual bits that looked like radio control. Unfortunately writing to them does nothing - the EC firmware simply stashes that write in an address and returns it on read without parsing the bits in any way. Doing anything more with them was going to involve modifying the embedded controller code.

Thankfully the EC has 64K of firmware and is only using about 40K of that, so there's plenty of room to add new code. The problem was generating the code in the first place and then getting it called. The EC is based on the CR16C architecture, which binutils supported until 10 days ago. To be fair it didn't appear to actually work, and binutils still has support for the more generic version of the CR16 family, so I built a cross assembler, wrote some assembly and came up with something that Ghidra was willing to parse except for one thing.

As mentioned previously, the existing firmware code responded to writes to this register by saving it to its RAM. My plan was to stick my new code in unused space at the end of the firmware, including code that duplicated the firmware's existing functionality. I could then replace the existing code that stored the register value with code that branched to my code, did whatever I wanted and then branched back to the original code. I hacked together some assembly that did the right thing in the most brute force way possible, but while Ghidra was happy with most of the code it wasn't happy with the instruction that branched from the original code to the new code, or the instruction at the end that returned to the original code. The branch instruction differs from a jump instruction in that it gives a relative offset rather than an absolute address, which means that branching to nearby code can be encoded in fewer bytes than going further. I was specifying the longest jump encoding possible in my assembly (that's what the :l means), but the linker was rewriting that to a shorter one. Ghidra was interpreting the shorter branch as a negative offset, and it wasn't clear to me whether this was a binutils bug or a Ghidra bug. I ended up just hacking that code out of binutils so it generated code that Ghidra was happy with and got on with life.

Writing values directly to that EC register showed that it worked, which meant I could add an ACPI device that exposed the functionality to the OS. My goal here is to produce a standard Coreboot radio control device that other Coreboot platforms can implement, and then just write a single driver that exposes it. I wrote one for Linux that seems to work.

In summary: closed-source code is more annoying to improve, but that doesn't mean it's impossible. Also, strange Russians on forums make everything easier.


Native GTK Dialogs in LibreOffice

Posted by Caolán McNamara on October 31, 2019 08:01 PM

LibreOffice Native GTK Dialog Status

The LibreOffice UI was traditionally implemented with its own VCL toolkit which via theming emulated the host desktop toolkit.

Then we migrated the file format the dialogs were described in to the GtkBuilder file format, though they were still implemented with VCL widgetry, albeit with additional GTK-alike layout widgets.

Then we migrated the translation format to gettext .mo files, which added the plural form translation support we had lacked.

Then we incrementally migrated the code driving the dialogs to a new API with two implementations, one for VCL widgetry and one for GTK.


Over the last few major releases the GTK version of LibreOffice has increasingly had true GTK dialogs and fewer VCL dialogs, and in master, as of this week, there are now no direct uses of the VCL dialog APIs.

There are still some non-dialog utility windows and other elements to port over, but dialogs are complete.

LibreOffice has a lot of UI. There are 1029 XML UI definition files in master: 480 definitions of a GtkDialog and 75 additional GtkMessageDialog definitions. The remainder of the files typically describe a single page of a Notebook, Assistant or Sidebar, often appearing in multiple dialogs.

Here are some gifs of a small set of the dialogs from master under Fedora 31, taken under Wayland with peek, showing some of the stock animations of the default GTK 3.24 Adwaita theme.

The Writer Character dialog

Notebook, Color Selector MenuButton, and ToggleButton animations

The Calc Page dialog

SpinButtons and legacy Preview widgets hosted in a native dialog


The Writer Paragraph dialog

"Double Decker" Notebook and Scale widgets

The Writer AutoCorrect dialog

Smooth scrolling of huge Emoji autocorrect list

Chart 3D View dialog

Amusingly Over-engineered custom lighting direction widget


The Options dialog

TreeView, Overlay ScrollBar, fade in animation of CheckButtons


GNOME, and Free Software Is Under Attack

Posted by Richard Hughes on October 22, 2019 01:34 PM

A month ago, GNOME was hit by a patent troll. We’re fighting, but need some money to fund the legal defense, and counterclaim. I just donated, and if you use or develop free software you should too.

Letting Birds scooters fly free

Posted by Matthew Garrett on October 18, 2019 11:44 AM
(Note: These issues were disclosed to Bird, and they tell me that fixes have rolled out. I haven't independently verified)

Bird produce a range of rental scooters that are available in multiple markets. With the exception of the Bird Zero[1], all their scooters share a common control board described in FCC filings. The board contains three primary components - a Nordic NRF52 Bluetooth controller, an STM32 SoC and a Quectel EC21-V modem. The Bluetooth and modem are both attached to the STM32 over serial and have no direct control over the rest of the scooter. The STM32 is tied to the scooter's engine control unit and lights, and also receives input from the throttle (and, on some scooters, the brakes).

The pads labeled TP7-TP11 near the underside of the STM32 and the pads labeled TP1-TP5 near the underside of the NRF52 provide Serial Wire Debug, although confusingly the data and clock pins are the opposite way around between the STM and the NRF. Hooking this up via an STLink and using OpenOCD allows dumping of the firmware from both chips, which is where the fun begins. Running strings over the firmware from the STM32 revealed "Set mode to Free Drive Mode". Challenge accepted.

Working back from the code that printed that, it was clear that commands could be delivered to the STM from the Bluetooth controller. The Nordic NRF52 parts are an interesting design - like the STM, they have an ARM Cortex-M microcontroller core. Their firmware is split into two halves, one the low level Bluetooth code and the other application code. They provide an SDK for writing the application code, and working through Ghidra made it clear that the majority of the application firmware on this chip was just SDK code. That made it easier to find the actual functionality, which was just listening for writes to a specific BLE attribute and then hitting a switch statement depending on what was sent. Most of these commands just got passed over the wire to the STM, so it seemed simple enough to just send the "Free drive mode" command to the Bluetooth controller, have it pass that on to the STM and win. Obviously, though, things weren't so easy.

It turned out that passing most of the interesting commands on to the STM was conditional on a variable being set, and the code path that hit that variable had some impressively complicated looking code. Fortunately, I got lucky - the code referenced a bunch of data, and searching for some of the values in that data revealed that they were the AES S-box values. Enabling the full set of commands required you to send an encrypted command to the scooter, which would then decrypt it and verify that the cleartext contained a specific value. Implementing this would be straightforward as long as I knew the key.

Most AES keys are 128 bits, or 16 bytes. Digging through the code revealed 8 bytes worth of key fairly quickly, but the other 8 bytes were less obvious. I finally figured out that 4 more bytes were the value of another Bluetooth variable which could be simply read out by a client. The final 4 bytes were more confusing, because all the evidence made no sense. It looked like it came from passing the scooter serial number to atoi(), which converts an ASCII representation of a number to an integer. But this seemed wrong, because atoi() stops at the first non-numeric value and the scooter serial numbers all started with a letter[2]. It turned out that I was overthinking it and for the vast majority of scooters in the fleet, this section of the key was always "0".
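
The atoi() behaviour is easy to demonstrate: it parses digits until it hits anything else, so a serial starting with a letter yields 0 immediately. The serial number below is made up.

/* Why that chunk of the key was always "0": atoi() stops at the first
 * non-numeric character. The serial number here is made up. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        printf("%d\n", atoi("AB1234567"));  /* 0 - parsing stops at 'A' */
        printf("%d\n", atoi("123XYZ"));     /* 123 - stops at 'X' */
        return 0;
}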

At that point I had everything I needed to write a simple app to unlock the scooters, and it worked! For about 2 minutes, at which point the network would notice that the scooter was unlocked when it should be locked and send a lock command to force disable the scooter again. Ah well.

So, what else could I do? The next thing I tried was just modifying some STM firmware and flashing it onto a board. It still booted, indicating that there was no sort of verified boot process. Remember what I mentioned about the throttle being hooked through the STM32's analogue to digital converters[3]? A bit of hacking later and I had a board that would appear to work normally, but about a minute after starting the ride would cut the throttle. Alternative options are left as an exercise for the reader.

Finally, there was the component I hadn't really looked at yet. The Quectel modem actually contains its own application processor that runs Linux, making it significantly more powerful than any of the chips actually running the scooter application[4]. The STM communicates with the modem over serial, sending it an AT command asking it to make an SSL connection to a remote endpoint. It then uses further AT commands to send data over this SSL connection, allowing it to talk to the internet without having any sort of IP stack. Figuring out just what was going over this connection was made slightly difficult by virtue of all the debug functionality having been ripped out of the STM's firmware, so in the end I took a more brute force approach - I identified the address of the function that sends data to the modem, hooked up OpenOCD to the SWD pins on the STM, ran OpenOCD's gdb stub, attached gdb, set a breakpoint for that function and then dumped the arguments being passed to that function. A couple of minutes later and I had a full transaction between the scooter and the remote.

The scooter authenticates against the remote endpoint by sending its serial number and IMEI. You need to send both, but the IMEI didn't seem to need to be associated with the serial number at all. New connections seemed to take precedence over existing connections, so it would be simple to just pretend to be every scooter and hijack all the connections, resulting in scooter unlock commands being sent to you rather than to the scooter or allowing someone to send fake GPS data and make it impossible for users to find scooters.

In summary: Secrets that are stored on hardware that attackers can run arbitrary code on probably aren't secret, not having verified boot on safety critical components isn't ideal, devices should have meaningful cryptographic identity when authenticating against a remote endpoint.

Bird responded quickly to my reports, accepted my 90 day disclosure period and didn't threaten to sue me at any point in the process, so good work Bird.

(Hey scooter companies I will absolutely accept gifts of interesting hardware in return for a cursory security audit)

[1] And some very early M365 scooters
[2] The M365 scooters that Bird originally deployed did have numeric serial numbers, but they were 6 characters of type code followed by a / followed by the actual serial number - the number of type codes was very constrained and atoi() would terminate at the / so this was still not a large keyspace
[3] Interestingly, Lime made a different design choice here and plumb the controls directly through to the engine control unit without the application processor having any involvement
[4] Lime run their entire software stack on the modem's application processor, but because of [3] they don't have any realtime requirements so this is more straightforward


libinput and tablet pad keys

Posted by Peter Hutterer on October 17, 2019 11:23 PM

Upcoming in libinput 1.15 is a small feature to support Wacom tablets a tiny bit better. If you look at the higher-end devices in Wacom's range, e.g. the Cintiq 27QHD, you'll notice that at the top right of the device are three hardware buttons with icons. Those buttons are intended to open the config panel, the on-screen display or the virtual keyboard. They've been around for a few years and supported in the kernel for a few releases. But in userspace, the events from those keys were ignored, cast out into the wild before eventually running out of electrons and succumbing to misery. Well, that's all changing now with a new interface being added to libinput to forward those events.

Step back a second and let's look at the tablet interfaces. We have one for tablet tools (styli) and one for tablet pads. In the latter, we have events for rings, strips and buttons. The buttons are simply numerically ordered, so button 1 is simply button 1 with no special meaning. Anything more specific needs to be handled by the compositor/client side, which is responsible for assigning e.g. keyboard shortcuts to those buttons.

The special keys however are different, they have a specific function indicated by the icon on the key itself. So libinput 1.15 adds a new event type for tablet pad keys. The events look quite similar to the button events but they have a linux/input-event-codes.h specific button code that indicates what they are. So the compositor can start the OSD, or control panel, or whatever directly without any further configuration required.
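
A sketch of what handling the new event type might look like on the client side is below. The context setup is the usual libinput boilerplate; the pad-key-specific names (LIBINPUT_EVENT_TABLET_PAD_KEY and libinput_event_tablet_pad_get_key()) follow the naming of the existing tablet-pad API and should be treated as assumptions until the final 1.15 headers are out.

/* Sketch of consuming pad key events. The libinput/udev setup is standard;
 * the LIBINPUT_EVENT_TABLET_PAD_KEY and libinput_event_tablet_pad_get_key()
 * names are assumptions until the 1.15 headers are final.
 * Build with: gcc padkeys.c $(pkg-config --cflags --libs libinput libudev) */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>
#include <libinput.h>
#include <libudev.h>

static int open_restricted(const char *path, int flags, void *user_data)
{
        int fd = open(path, flags);
        return fd < 0 ? -1 : fd;
}

static void close_restricted(int fd, void *user_data)
{
        close(fd);
}

static const struct libinput_interface iface = {
        .open_restricted = open_restricted,
        .close_restricted = close_restricted,
};

int main(void)
{
        struct udev *udev = udev_new();
        struct libinput *li = libinput_udev_create_context(&iface, NULL, udev);
        struct pollfd fds;
        struct libinput_event *ev;

        libinput_udev_assign_seat(li, "seat0");
        fds.fd = libinput_get_fd(li);
        fds.events = POLLIN;

        while (poll(&fds, 1, -1) > 0) {
                libinput_dispatch(li);
                while ((ev = libinput_get_event(li)) != NULL) {
                        if (libinput_event_get_type(ev) == LIBINPUT_EVENT_TABLET_PAD_KEY) {
                                struct libinput_event_tablet_pad *pad =
                                        libinput_event_get_tablet_pad_event(ev);
                                /* a key code from linux/input-event-codes.h */
                                printf("pad key %u\n",
                                       libinput_event_tablet_pad_get_key(pad));
                        }
                        libinput_event_destroy(ev);
                }
        }
        libinput_unref(li);
        udev_unref(udev);
        return 0;
}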

This interface hasn't been merged yet; it's waiting for the Linux kernel 5.4 release, which has a few kernel-level fixes for those keys.

libinput and button scrolling locks

Posted by Peter Hutterer on October 17, 2019 10:56 PM

For a few years now, libinput has provided button scrolling. Holding a designated button down and moving the device up/down or left/right creates the matching scroll events. We enable this behaviour by default on some devices (e.g. trackpoints) but it's available on mice and some other devices. Users can change the button that triggers it, e.g. assign it to the right button. There are of course a couple of special corner cases to make sure you can still click that button normally but as I said, all this has been available for quite some time now.

New in libinput 1.15 is the button lock feature. The button lock removes the need to hold the button down while scrolling. When the button lock is enabled, a single button click (i.e. press and release) of that button holds that button logically down for scrolling and any subsequent movement by the device is translated to scroll events. A second button click releases that button lock and the device goes back to normal movement. That's basically it, though there are some extra checks to make sure the button can still be used for normal clicking (you will need to double-click for a single logical click now though).
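
Enabling it from client code is roughly a one-liner on top of the existing button-scrolling configuration. The fragment below is meant to be run against a device from an existing libinput context; the *_set_button_lock() call is the new 1.15 addition (name as I remember it, so double-check against the released headers), while everything around it has been available for years.

/* Fragment: enable button-lock scrolling on a device obtained from an
 * existing libinput context. The *_set_button_lock() call is the new 1.15
 * API (name from memory - verify against the released headers); the
 * surrounding scroll-method/button configuration is long-standing. */
#include <libinput.h>
#include <linux/input-event-codes.h>

static void enable_button_scroll_lock(struct libinput_device *dev)
{
        if (!(libinput_device_config_scroll_get_methods(dev) &
              LIBINPUT_CONFIG_SCROLL_ON_BUTTON_DOWN))
                return;

        libinput_device_config_scroll_set_method(dev,
                        LIBINPUT_CONFIG_SCROLL_ON_BUTTON_DOWN);
        libinput_device_config_scroll_set_button(dev, BTN_RIGHT);
        /* new in 1.15: click once to start scrolling, click again to stop */
        libinput_device_config_scroll_set_button_lock(dev,
                        LIBINPUT_CONFIG_SCROLL_BUTTON_LOCK_ENABLED);
}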

This is primarily an accessibility feature and is likely to find its way into the GUI tools under the accessibility headings.

Riddle me this

Posted by Benjamin Otte on October 17, 2019 10:46 PM

Found this today while playing around, thought people might enjoy this riddle.

$> cat test.c
typedef int foo;
int main()
{
  foo foo = 1;
  return (foo) +0;
}
$> gcc -Wall -o test test.c && ./test && echo $?

What does this print?

  1. 0
  2. 1
  3. Some compilation warnings, then 0.
  4. Some compilation warnings, then 1.
  5. It doesn’t compile.

I’ll put an answer in the comments.

libinput's bus factor is 1

Posted by Peter Hutterer on October 16, 2019 05:56 AM

A few weeks back, I was at XDC and gave a talk about various current and past input stack developments (well, a subset thereof anyway). One of the slides pointed out libinput's bus factor and I'll use this blog to make this a bit more widely known.

If you don't know what the bus factor is, Wikipedia defines it as:

The "bus factor" is the minimum number of team members that have to suddenly disappear from a project before the project stalls due to lack of knowledgeable or competent personnel.
libinput has a bus factor of 1.

Let's arbitrarily pick the 1.9.0 release (roughly 2 years ago) and look at the numbers: of the ~1200 commits since 1.9.0, just under 990 were done by me. In those 2 years we had 76 contributors in total, but only 24 of them have more than one commit and only 6 contributors have more than 5 commits. The numbers don't really change much even if we go all the way back to 1.0.0 in 2015. These numbers do not include the non-development work: release maintenance for new releases and point releases, reviewing CI failures [1], writing documentation (including the stuff on this blog), testing and bug triage. Right now, this is effectively all done by one person.

This is... less than ideal. At this point libinput is more-or-less the only input stack we have [2] and all major distributions rely on it. It drives mice, touchpads, tablets, keyboards, touchscreens, trackballs, etc. so basically everything except joysticks.

Anyway, I'm largely writing this blog post in the hope that someone gets motivated enough to dive into this. Right now, if you get 50 patches into libinput you get the coveted second-from-the-top spot, with all the fame and fortune that entails (i.e. little to none, but hey, underdogs are big in popular culture). Short of that, any help with building an actual community would be appreciated too.

Either way, lest it be said that no-one saw it coming, let's ring the alarm bells now before it's too late. Ding ding!

[1] Only as of a few days ago can we run the test suite as part of the CI infrastructure, thanks to Benjamin Tissoires. Previously it was run on my laptop and virtually nowhere else.
[2] fyi, xf86-input-evdev: 5 patches in the same timeframe, xf86-input-synaptics: 6 patches (but only 3 actual changes) so let's not pretend those drivers are well-maintained.

Investigating the security of Lime scooters

Posted by Matthew Garrett on October 04, 2019 06:04 AM
(Note: to be clear, this vulnerability does not exist in the current version of the software on these scooters. Also, this is not the topic of my Kawaiicon talk.)

I've been looking at the security of the Lime escooters. These caught my attention because:
(1) There's a whole bunch of them outside my building, and
(2) I can see them via Bluetooth from my sofa
which, given that I'm extremely lazy, made them more attractive targets than something that would actually require me to leave my home. I did some digging. Limes run Linux and have a single running app that's responsible for scooter management. They have an internal debug port that exposes USB and which, until this happened, ran adb (as root!) over this USB. As a result, there's a fair amount of information available in various places, which made it easier to start figuring out how they work.

The obvious attack surface is Bluetooth (Limes have wifi, but only appear to use it to upload lists of nearby wifi networks, presumably for geolocation if they can't get a GPS fix). Each Lime broadcasts its name as Lime-12345678 where 12345678 is 8 digits of hex. They implement Bluetooth Low Energy and expose a custom service with various attributes. One of these attributes (0x35 on at least some of them) sends Bluetooth traffic to the application processor, which then parses it. This is where things get a little more interesting. The app has a core event loop that can take commands from multiple sources and then makes a decision about which component to dispatch them to. Each command is of the following form:

AT+type,password,time,sequence,data$

where type is one of either ATH, QRY, CMD or DBG. The password is a TOTP derived from the IMEI of the scooter, the time is simply the current date and time of day, the sequence is a monotonically increasing counter and the data is a blob of JSON. The command is terminated with a $ sign. The code is fairly agnostic about where the command came from, which means that you can send the same commands over Bluetooth as you can over the cellular network that the Limes are connected to. Since locking and unlocking is triggered by one of these commands being sent over the network, it ought to be possible to do the same by pushing a command over Bluetooth.
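
Just to make the framing concrete, building one of those commands is nothing more than string formatting. Every field value below is a placeholder: the real password is the IMEI-derived TOTP, the time format is not spelled out here, and the sequence counter has to keep increasing.

/* The command framing described above, with placeholder field values: the
 * real password is a TOTP derived from the scooter's IMEI, the time format
 * is a guess, and the sequence counter must be monotonically increasing. */
#include <stdio.h>

int main(void)
{
        char cmd[256];
        const char *type = "CMD";                      /* ATH, QRY, CMD or DBG */
        const char *password = "123456";               /* placeholder TOTP */
        const char *timestamp = "2019-10-04 06:04:00"; /* placeholder time */
        unsigned sequence = 42;                        /* placeholder counter */
        const char *data = "{}";                       /* placeholder JSON blob */

        snprintf(cmd, sizeof(cmd), "AT+%s,%s,%s,%u,%s$",
                 type, password, timestamp, sequence, data);
        puts(cmd);
        return 0;
}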

Unfortunately for nefarious individuals, all commands sent over Bluetooth are ignored until an authentication step is performed. The code I looked at had two ways of performing authentication - you could send an authentication token that was derived from the scooter's IMEI and the current time and some other stuff, or you could send a token that was just an HMAC of the IMEI and a static secret. Doing the latter was more appealing, both because it's simpler and because doing so flipped the scooter into manufacturing mode at which point all other command validation was also disabled (bye bye having to generate a TOTP). But how do we get the IMEI? There's actually two approaches:

1) Read it off the sticker that's on the side of the scooter (obvious, uninteresting)
2) Take advantage of how the scooter's Bluetooth name is generated

Remember the 8 digits of hex I mentioned earlier? They're generated by taking the IMEI, encrypting it using DES and a static key (0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77, 0x88), discarding the first 4 bytes of the output and turning the last 4 bytes into 8 digits of hex. Since we're discarding information, there's no way to immediately reverse the process - but IMEIs for a given manufacturer are all allocated from the same range, so we can just take the entire possible IMEI space for the modem chipset Lime use, encrypt all of them and end up with a mapping of name to IMEI (it turns out this doesn't guarantee that the mapping is unique - for around 0.01%, the same name maps to two different IMEIs). So we now have enough information to generate an authentication token that we can send over Bluetooth, which disables all further authentication and enables us to send further commands to disconnect the scooter from the network (so we can't be tracked) and then unlock and enable the scooter.

(Note: these are actual crimes)

This all seemed very exciting, but then a shock twist occurred: earlier this year, Lime updated their authentication method, and now there's actual asymmetric cryptography involved. You'd need to engage in rather more actual crimes to obtain the key material necessary to authenticate over Bluetooth, so all of this research becomes much less interesting, other than as an example of how other companies probably shouldn't do it.

In any case, congratulations to Lime on actually implementing security!


Some Flatpak updates

Posted by Matthias Clasen on October 03, 2019 11:43 AM

Flatpak development is not standing still. Here is a quick summary of recent and coming changes.

Better extensions

In 1.4.2, Flatpak gained the ability to use extra-data for extensions. This mechanism has been around for applications for a long time, but it is a new feature for extensions.

The 19.08 version of the freedesktop runtime uses it for its new org.freedesktop.Platform.openh264 extension, which uses the Cisco openh264 builds.

Since we are taking the ‘run everywhere’ aspect of Flatpak seriously, we’ve backported this feature from the 1.4 branch to older stable branches and released 1.2.4 and 1.0.9, so even users on very stable distributions can enjoy this new feature.

Future plans

We’ve quietly started to work on Flatpak 1.6, which should be out before the end of the year.

On the roadmap for this release, we have:

  • Support for masking updates and pinning apps. This gives users more control over which updates Flatpak installs, without having to answer questions every time.
  • Parental controls. This optional feature uses libmalcontent to implement policies about what applications users can install and run, based on OARS content ratings.
  • Disk space checks. This is an ongoing effort to improve the accuracy of our disk- and download-size handling and to handle low disk space situations more gracefully.
  • Infrastructure for purchases/donations. This is still a bit of a research topic.

You can follow the discussion around these features, the flatpak roadmap and general flatpak topics on the flatpak mailing list.

Coming soon to portals

Things are happening on the portal side too. Some of these have already landed, and will appear in a release soon.

Secrets

We have a secrets portal now. It works by providing a master secret to the sandboxed app, which is then used to store the application's secrets in an encrypted file inside the sandbox. The master secret is stored in the session keyring.

This is nice in that applications don’t leave their secrets behind in the keyring when they are uninstalled, and the application secrets are safe from others.

The backend for this portal will be provided by gnome-keyring and libsecret will automatically use it inside a sandbox. Backend implementations for other environments are more than welcome.
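To show what this means for application code, here is a minimal sketch of storing a secret with libsecret; the schema name, attribute and label are made up for illustration. Inside a sandbox with the portal available, libsecret is expected to route this through the portal into the app's encrypted keyfile, while outside a sandbox it keeps using the keyring service as before:

#include <libsecret/secret.h>

/* Hypothetical schema for this example. */
static const SecretSchema example_schema = {
    "org.example.App.Password", SECRET_SCHEMA_NONE,
    {
        { "account", SECRET_SCHEMA_ATTRIBUTE_STRING },
        { NULL, 0 },
    }
};

gboolean
store_account_password (const char *account,
                        const char *password,
                        GError    **error)
{
    /* Synchronous for brevity; real code would use the async variant. */
    return secret_password_store_sync (&example_schema,
                                       SECRET_COLLECTION_DEFAULT,
                                       "Example app password",
                                       password,
                                       NULL, error,
                                       "account", account,
                                       NULL);
}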

The secret portal is the work of Daiki Ueno, who gave a talk about it at Guadec.

Self-updates

The Flatpak commandline and tools like Discover or the Elementary app store do a fine job of handling updates for Flatpak apps and runtimes.

But the reality is that self-updating is a popular feature for applications, so we added an update portal that lets them do this in a clean way, with proper integration in the Flatpak machinery.

Backgrounds 1

The background portal monitors applications that are running in the background (without open windows). It gives apps a way to request permission to run in the background, and it notifies users when apps are trying to do so sneakily without permission. The portal also lets applications request to be started automatically when the user logs in.

To implement this, the portal needs information from the compositor about open windows, and which applications they belong to. Currently, this is implemented for gnome-shell; other backends are more than welcome.

Window sharing

The screencast portal now lets you select individual windows, in addition to screens, if the application asks for this.

For now, the portal identifies windows by the application icon and window title. We are looking to improve this by using thumbnails.

Backgrounds 2

We will add a small bit of desktop integration with a portal for setting desktop wallpapers.

A portal library

In the ideal case, portal functionality is used transparently by existing desktop libraries without the need for apps to do anything special. Examples for this are GtkFileChooserNative using the file chooser portal, or libsecret using the new secret portal.

But for some portals, there is no natural library api, and in these cases, doing the portal interaction with D-Bus calls can be a bit cumbersome.

Therefore, we are working on a libportal library that will provide GIO-style async apis for portal requests.
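For comparison, here is roughly what one portal request looks like when done by hand with GDBus. The interface and option names are from my reading of the xdg-desktop-portal documentation and may not match exactly, and the Request/Response signal dance and error handling are omitted, which is exactly the boilerplate a portal library would hide:

#include <gio/gio.h>

static void
request_background (void)
{
    g_autoptr(GDBusConnection) bus =
        g_bus_get_sync (G_BUS_TYPE_SESSION, NULL, NULL);
    GVariantBuilder options;

    g_variant_builder_init (&options, G_VARIANT_TYPE_VARDICT);
    g_variant_builder_add (&options, "{sv}", "reason",
                           g_variant_new_string ("Keep downloads running"));
    g_variant_builder_add (&options, "{sv}", "autostart",
                           g_variant_new_boolean (FALSE));

    /* The reply only contains a Request object path; the real result
     * arrives later as a Response signal on that object. */
    g_autoptr(GVariant) reply =
        g_dbus_connection_call_sync (bus,
                                     "org.freedesktop.portal.Desktop",
                                     "/org/freedesktop/portal/desktop",
                                     "org.freedesktop.portal.Background",
                                     "RequestBackground",
                                     g_variant_new ("(sa{sv})", "", &options),
                                     NULL, G_DBUS_CALL_FLAGS_NONE,
                                     -1, NULL, NULL);
}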

Open for contribution

If you want to get involved with Flatpak development, or are just curious, check out the flatpak project on github, chime in on the Flatpak mailing list, or find us on IRC in #flatpak on freenode.

Do we need to rethink what free software is?

Posted by Matthew Garrett on September 27, 2019 05:47 PM
Licensing has always been a fundamental tool in achieving free software's goals, with copyleft licenses deliberately taking advantage of copyright to ensure that all further recipients of software are in a position to exercise free software's four essential freedoms. Recently we've seen people raising two very different concerns around existing licenses and proposing new types of license as remedies, and while both are (at present) incompatible with our existing concepts of what free software is, they both raise genuine issues that the community should seriously consider.

The first is the rise in licenses that attempt to restrict business models based around providing software as a service. If users can pay Amazon to provide a hosted version of a piece of software, there's little incentive for them to pay the authors of that software. This has led to various projects adopting license terms such as the Commons Clause that effectively make it nonviable to provide such a service, forcing providers to pay for a commercial use license instead.

In general the entities pushing for these licenses are VC backed companies[1] who are themselves benefiting from free software written by volunteers that they give nothing back to, so I have very little sympathy. But it does raise a larger issue - how do we ensure that production of free software isn't just a mechanism for the transformation of unpaid labour into corporate profit? I'm fortunate enough to be paid to write free software, but many projects of immense infrastructural importance are simultaneously fundamental to multiple business models and also chronically underfunded. In an era where people are becoming increasingly vocal about wealth and power disparity, this obvious unfairness will result in people attempting to find mechanisms to impose some degree of balance - and given the degree to which copyleft licenses prevented certain abuses of the commons, it's likely that people will attempt to do so using licenses.

At the same time, people are spending more time considering some of the other ethical outcomes of free software. Copyleft ensures that you can share your code with your neighbour without your neighbour being able to deny the same freedom to others, but it does nothing to prevent your neighbour using your code to deny other fundamental, non-software, freedoms. As governments make more and more use of technology to perform acts of mass surveillance, detention, and even genocide, software authors may feel legitimately appalled at the idea that they are helping enable this by allowing their software to be used for any purpose. The JSON license includes a requirement that "The Software shall be used for Good, not Evil", but the lack of any meaningful clarity around what "Good" and "Evil" actually mean makes it hard to determine whether it achieved its aims.

The definition of free software includes the assertion that it must be possible to use the software for any purpose. But if it is possible to use software in such a way that others lose their freedom to exercise those rights, is this really the standard we should be holding? Again, it's unsurprising that people will attempt to solve this problem through licensing, even if in doing so they no longer meet the current definition of free software.

I don't have solutions for these problems, and I don't know for sure that it's possible to solve them without causing more harm than good in the process. But in the absence of these issues being discussed within the free software community, we risk free software being splintered - on one side, with companies imposing increasingly draconian licensing terms in an attempt to prop up their business models, and on the other side, with people deciding that protecting people's freedom to life, liberty and the pursuit of happiness is more important than protecting their freedom to use software to deny those freedoms to others.

As stewards of the free software definition, the Free Software Foundation should be taking the lead in ensuring that these issues are discussed. The priority of the board right now should be to restructure itself to ensure that it can legitimately claim to represent the community and play the leadership role it's been failing to play in recent years, otherwise the opportunity will be lost and much of the activist energy that underpins free software will be spent elsewhere.

If free software is going to maintain relevance, it needs to continue to explain how it interacts with contemporary social issues. If any organisation is going to claim to lead the community, it needs to be doing that.

[1] Plus one VC firm itself - Bain Capital, an investment firm notorious for investing in companies, extracting as much value as possible and then allowing the companies to go bankrupt


Synaptics CX Audio Support

Posted by Richard Hughes on September 25, 2019 04:02 PM

A couple of weeks ago, Synaptics (who now own Conexant) sent me 22,000+ lines of LGPLv2+ licensed C++ that was capable of updating the firmware of all the CXxxxx audio devices that exist in various laptops and peripherals. Most of last week was spent reading the code, and refactoring it to be a CX audio plugin in fwupd. There were a few things I could do to reduce the code size considerably:

  • Use the abstractions shared with all the other plugins, e.g. SREC file format processing, data chunking and low level USB HID
  • Drop support for hardware families which are no longer supported and not likely to receive updates
  • Remove the layers of abstractions and the macros-of-macros-of-macros so common in a codebase whose age is measured in decades
  • Use helper objects in GLib and GObject rather than having to create everything from scratch

So, after all that we got down to a 1377 line fwupd plugin which is a 16x code reduction. It’s broadly comparable in functionality to the 22,000 line code drop but only works in fwupd as a plugin rather than as a standalone updater. To add support for new hardware to the plugin all we have to do is add an entry to the quirk file, which tells us which CX family the specific USB VID/PID is using. The rest is auto-detected.
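For illustration only, a quirk entry is just a couple of lines of ini-style configuration along these lines; the device ID is invented and the section and key names are approximate rather than copied from the shipped plugin:

[DeviceInstanceId=USB\VID_2345&PID_0001]
Plugin = synaptics_cxaudio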

I can’t tell you the OEM or the hardware all this work is being driven by, but eagle-eyed readers will work it out :) In some cases you might see an extra device appear in fwupdmgr get-devices if you’re running the soon-to-be-released fwupd 1.3.2 and hopefully we can get firmware updates which use this new device on the LVFS some time this year.

It's time to talk about post-RMS Free Software

Posted by Matthew Garrett on September 14, 2019 11:57 AM
Richard Stallman has once again managed to demonstrate incredible insensitivity[1]. There's an argument that in a pure technical universe this is irrelevant and we should instead only consider what he does in free software[2], but free software isn't a purely technical topic - the GNU Manifesto is nakedly political, and while free software may result in better technical outcomes it is fundamentally focused on individual freedom and will compromise on technical excellence if otherwise the result would be any compromise on those freedoms. And in a political movement, there is no way that we can ignore the behaviour and beliefs of that movement's leader. Stallman is driving away our natural allies. It's inappropriate for him to continue as the figurehead for free software.

But I'm not calling for Stallman to be replaced. If the history of social movements has taught us anything, it's that tying a movement to a single individual is a recipe for disaster. The FSF needs a president, but there's no need for that person to be a leader - instead, we need to foster an environment where any member of the community can feel empowered to speak up about the importance of free software. A decentralised movement about returning freedoms to individuals can't also be about elevating a single individual to near-magical status. Heroes will always end up letting us down. We fix that by removing the need for heroes in the first place, not attempting to find increasingly perfect heroes.

Stallman was never going to save us. We need to take responsibility for saving ourselves. Let's talk about how we do that.

[1] There will doubtless be people who will leap to his defense with the assertion that he's neurodivergent and all of these cases are consequences of that.

(A) I am unaware of a formal diagnosis of that, and I am unqualified to make one myself. I suspect that basically everyone making that argument is similarly unqualified.
(B) I've spent a lot of time working with him to help him understand why various positions he holds are harmful. I've reached the conclusion that it's not that he's unable to understand, he's just unwilling to change his mind.

[2] This argument is, obviously, bullshit


GNOME Firmware 3.34.0 Release

Posted by Richard Hughes on September 13, 2019 01:12 PM

This morning I tagged the newest fwupd release, 1.3.1. There are a lot of new things in this release and a whole lot of polishing, so I encourage you to read the release notes if this kind of thing interests you.

Anyway, to the point of this post. With the new fwupd 1.3.1 you can now build just the libfwupd library, which makes it easy to build GNOME Firmware (old name: gnome-firmware-updater) in Flathub. I tagged the first official release 3.34.0 to celebrate the recent GNOME release, and to indicate that it’s ready for use by end users. I guess it’s important to note this is just a random app hacked together by 3 engineers and not something lovingly designed by the official design team. All UX mistakes are my own :)

GNOME Firmware is designed to be a not-installed-by-default power-user tool to investigate, upgrade, downgrade and re-install firmware.
GNOME Software will continue to be used for updates as before. Vendor helpdesks can ask users to install GNOME Firmware rather than getting them to look at command line output.

We need to polish up GNOME Firmware going forwards, and add the last few features we need. If this interests you, please send email and I’ll explain what needs doing. We also need translations, although that can perhaps wait until GNOME Firmware moves to GNOME proper, rather than just being a repo in my personal GitLab. If anyone does want to translate it before then, please open merge requests, and be sure to file issues if any of the strings are difficult to translate or ambiguous. Please also file issues (or even better merge requests!) if it doesn’t build or work for you.

If you just want to try out a new application, it takes 10 seconds to install it from Flathub.

Unit-testing static functions in C

Posted by Peter Hutterer on September 12, 2019 04:21 AM

An annoying thing about C code is that there are plenty of functions that cannot be unit-tested by some external framework - specifically anything declared as static. Any larger code-base will end up with hundreds of those functions, many of which are short and reasonably self-contained but complex enough that you can't trust them just by looking at them. But since they're static I can't access them from the outside (and "outside" is defined as "not in the same file" here).

The approach I've chosen in the past is to move the more hairy ones into separate files or at least declare them non-static. That works but is annoying for some cases, especially those that really only get called once. In case you're wondering whether you have at least one such function in your source tree: yes, the bit that parses your commandline arguments is almost certainly complicated and not tested.

Anyway, this week I've finally found the right combination of hacks to make testing static functions easy, and it's:

  • #include the source file in your test code.
  • Mock any helper functions you'd need to trick the called functions
  • Instruct the linker to ignore unresolved symbols
And boom, you can write test cases to only test a single file within your source tree. And without any modifications to the source code itself.
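As a minimal sketch of what such a test file can look like, assuming a fictional example.c with a static parse_size() function that calls a helper fetch_config() defined elsewhere in the project (both names are made up):

/* test-example.c */
#include <assert.h>

#include "example.c"   /* pulls the static functions into this translation unit */

/* Mock the helper the static code calls; the linker flags in the meson
 * snippet below paper over the duplicate/unresolved symbols this creates. */
int
fetch_config (const char *key)
{
    return 42;
}

int
main (void)
{
    /* parse_size() is static in example.c, but visible here because we
     * included the source file directly. */
    assert (parse_size ("10M") == 10 * 1024 * 1024);
    return 0;
}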

A more detailed writeup is available in this github repo.

For the impatient, the meson snippet for a fictional source file example.c would look like this:


test('test-example',
     executable('test-example',
                'example.c', 'test-example.c',
                dependencies: [dep_ext_library],
                link_args: ['-Wl,--unresolved-symbols=ignore-all',
                            '-Wl,-zmuldefs',
                            '-no-pie'],
                install: false),
)

There is no restriction on which test suite you can use. I've started adding a few test cases based on this approach to libinput and so far it's working well. If you have a better approach or improvements, I'm all ears.

Please welcome Acer to the LVFS

Posted by Richard Hughes on September 11, 2019 11:44 AM

Acer has now officially joined the LVFS, promoting the Aspire A315 firmware to stable.

Acer has been testing the LVFS for some time and now all the legal and technical checks have been completed. Other models will follow soon!

Realizing that I’m not Super Human: Part 1

Posted by Richard Hughes on August 28, 2019 12:44 PM

Most of the content on this blog is technical in nature, as is my twitter feed. I wanted to step to one side, and talk a bit about one of the little things I’ve learned about my body: I’m not super human any more.

I’m one of those people that have been really lucky with my general physical and mental health over the years. I used to play a lot of rugby and got the odd injury, but nothing a long hot bath couldn’t fix. Modulo catching the flu a few years ago I don’t really get ill very much.

About this time last year I began to get a small amount of back pain when sitting for a long time, or when walking around for over an hour or so. This was the first warning. Over the next few months this got worse to the point it was now an electrical tingling all down one leg whenever I did “too much” walking or playing with the kids. I self-diagnosed this as some kind of sciatica and didn’t pay too much attention to it. This was the second warning sign. After my back finally “went pop” a couple of times in one week leaving me unable to walk properly at all, I finally went to a private physiotherapist and asked for some advice. Luckily for me this was all covered as part of my Red Hat compensation package and I didn’t have to pay a thing, which I know really isn’t the case if you’re paying for healthcare yourself.

The Physio did quite a lot of tests and then announced that my posture was, put bluntly, total crap. There was no magic pill nor any special sports massage to make it better, but everything could be fixed with a little bit of hard work. I had to make some immediate changes: my comfy armchair was out, a standing desk was in. 10 hours sitting in a chair coding was bad, hourly breaks were enforced. I was given some exercises to do every day (which I did) and after about 6 weeks of visits I was discharged as the tingling had gone and the back pain was much less. The physio suggested I do a weekly Pilates class to further improve my posture and to keep everything where it should be.

This was waaaay outside my comfort zone, as I’d never done any kind of exercise or group class before. I went to a group class and immediately realized I was at least two orders of magnitude less capable than everyone else. I could barely touch my knees when they could all touch the floor. The instructor was really kind and showed me all the positions and things to do and not do, but I still felt a bit weird in a class of mostly middle aged women, dragging them all down to my level. I asked the instructor if he did 1:1 classes and he said yes; since then I’ve been doing a 1-hour Pilates class every other week and, against all odds, I’m actually quite enjoying it now. My posture is much better; when I run I feel less like I’m flopping about and now have a stable “core” of muscle holding me all together. I can throw my children around in the park, and not worry about discs in my back bulging to the point of rupture. My breathing and concentration have improved, and if anything I guess I’m slightly more productive with hourly breaks.

Talking to other men, it seems quite a few people also do Pilates, but for some reason are a bit embarrassed to admit it to other people. I suppose I was initially too, but not now. My wife does Yoga, and I guess to me Pilates feels like a more physical Yoga without all the spiritual stuff mixed in. I’m not quite a card-carrying evangelist, but I really would recommend you try Pilates if you sit at a desk hunched over an editor all day, like I used to. Doing 1:1 classes is expensive (about £80/month) but it is 100% worth it with the results I’ve had so far.

So, the conclusion: I’m not Super Human any more, but that’s okay. If you’ve read this far – shoulders back, chin up, and get back to coding. If you’re interested, want an awesome instructor and you live in West London, give Ash a call.

GNOME Firmware Updater

Posted by Richard Hughes on August 28, 2019 09:37 AM

A few months ago, Dell asked if I’d like to co-mentor an intern over the summer. The task was to create a GTK “power user” application for managing firmware. The idea being that someone like Dell support could ask the user to run a little application and then read back firmware versions or downgrade to an older firmware version rather than getting them to use the command line. GNOME and KDE software centers deliberately show a “simple” view of firmware, only showing devices when updates are pending.

In June I was introduced to Andrew Schwenn, who was our intern for the summer. This blog isn’t about Andrew, but I will say he did amazingly well and was soon up to speed filing excellent pull requests even with a grumpy anally-retentive maintainer like me. Andrew has finished his internship now, but I wouldn’t be surprised if we work again with him in the future. Most of the work so far is from Andrew, so I can’t claim too much credit here.

GNOME Firmware Updater was designed in the style of a GNOME Control Center panel, and all the code is written in a way to make a port very simple indeed if that’s what we actually want. At the moment it’s a separate project and binary, as we’re still prototyping the UI and working out what kind of UX we want from a power user tool. It’s mostly complete and a few weeks away from its first release. When it does get an official release, I’ll be sure to upload it to Flathub to make it easy for the world to install. If this sounds interesting to you, the code is here. I don’t have a huge amount of time to dedicate to this power user tool, but please open pull requests or issues if there’s something you’d like to see fixed.

Tuhi - an application to support Wacom SmartPad devices

Posted by Peter Hutterer on August 26, 2019 11:19 AM

Sounds like déjà vu? Right, I posted a post with an almost identical title 18 months ago or so. This is about Tuhi 0.2, new and remodeled and completely different to that. Sort-of.

Tuhi is an application that supports the Wacom SmartPad devices - Bamboo Spark, Bamboo Slate, Bamboo Folio and Intuos Pro. The Bamboo range are digital notepads. They come with a real pen, you draw normally on the pad and use Bluetooth LE and Wacom's Inkspace application later to sync the files to disk. The Intuos Pro is the same but it's designed as a "normal" tablet with the paper mode available as well.

18 months ago, Benjamin Tissoires and I wrote Tuhi as a DBus session daemon. Tuhi would download the drawings from the device and make them available as JSON files over DBus, to be converted to SVG or some other format by ... "clients". We wrote a simple commandline tool to debug Tuhi but no GUI, largely in the hope that maybe someone would be interested in doing that. Fast forward to now and that hasn't happened, but I had some spare cycles over the last weeks so I present to you: Tuhi 0.2, now with a GTK GUI:

It's basic, but that's also because it shouldn't do much more than just download the drawings and allow you to save them. This is not an editing UI, it's effectively a file manager for the drawings on the tablet. And since by design those drawings get deleted as you download them, there isn't even much to that (don't worry, Tuhi doesn't really delete files, you can recover almost everything).

Under the hood there were some internal changes too but I suspect they'll be boring to most. The more interesting bits are reworks so we can test the conversions a lot better now and - worst case - recover files if Tuhi crashes. It is largely reverse-engineered after all.

On that note I would like to also extend my thanks to Wacom who have provided us with some of the specs for the protocol (under NDA, we cannot share these with the community, sorry). These specs helped tremendously in understanding the protocol bits that were confusing at best and unknown at worst. There are still some corners of the protocol that we don't know, but for the most recent generation (i.e. Intuos Pro) we should have correct parsing of the protocol.

And many thanks to Jakub Steiner for the fancy logo.

And, as of a few minutes ago, Tuhi is available as a flatpak from flathub.org. For the foreseeable future it is the best way to install Tuhi.

low-memory-monitor: new project announcement

Posted by Bastien Nocera on August 21, 2019 10:57 AM
I'll soon be flying to Greece for GUADEC but wanted to mention one of the things I worked on the past couple of weeks: the low-memory-monitor project is off the ground, though not production-ready.

low-memory-monitor, as its name implies, monitors the amount of free physical memory on the system and will shoot off signals to interested user-space applications, usually session managers or sandboxing helpers, when that memory runs low, making it possible for applications to shrink their memory footprints before it's too late, either to keep the system usable or to avoid taking a performance hit.

It's similar to Android's lowmemorykiller daemon, Facebook's oomd, and Endless' psi-monitor, amongst others.

Finally a GLib helper and a Flatpak portal are planned to make it easier for applications to use, with an API similar to iOS' or Android's.

Combined with work in Fedora to use zswap and remove the use of disk-backed swap, this should make most workstation uses more responsive and enjoyable.

Musings on the Microsoft Component Firmware Update (CFU) Protocol

Posted by Richard Hughes on August 15, 2019 10:23 AM

CFU is a new specification from Microsoft designed to be the one true protocol for updating hardware. No vendor seems to be shipping hardware supporting CFU (yet?), although I’ve had two peripheral vendors ask my opinion which is why I’m posting here.

CFU has a bizarre pre-download phase before sending the firmware to the microcontroller so the uC can check if the firmware is required and compatible. CFU also requires devices to be able to receive the entire new firmware while in runtime mode. The pre-download “offer” allows the uC to check any sub-components attached (e.g. other devices attached to the SoC) and forces it to do dep resolution in case sub-components have to be updated in a specific order.

Pushing the dep resolution down to the uC means the uC has to do all the version comparisons and also know all the logic with regard to protocol incompatibilities. You could be in a position where the uC firmware needs to be updated so that it “knows” about the new protocol restrictions, which are needed to update the uC and the things attached in the right order in a subsequent update. If we always update the uC to the latest, the probably-factory-default running version doesn’t know about the new restrictions.

The other issue with this is that the peripheral is unaware of the other devices in the system, so it couldn’t, for instance, install a new firmware version only for newer builds of Windows. Something that we support in fwupd is being able to restrict the peripheral device firmware to a specific SMBIOS CHID or a system firmware vendor, which lets vendors solve the “same hardware in different chassis, with custom firmware” problem. I don’t see how that could be possible using CFU unless I misunderstand the new .inf features. All the dependency resolution should be in the metadata layer (e.g. in the .inf file) rather than being pushed down to the hardware running the old firmware.

What is possibly the biggest failure I see is the doubling of flash storage required to do a runtime transfer, the extra power budget of being woken up to process the “offer” and enough bulk power to stay alive if “unplugged” during an A/B swap. Realistically it’s an extra few dollars for an ARM uC to act as a CFU “bridge” for legacy silicon and IP, which I can’t see as appealing to an ODM given they make other strange choices just to save a few cents on a BOM. I suppose the CFU “bridge” could also do firmware signing/encryption but then you still have a physical trace on the PCB with easy-to-read/write unsigned firmware. CFU could have defined a standardized way to encrypt and sign firmware, but they kinda handwave it away letting the vendors do what they think is best, and we all know how that plays out.

CFU downloads in the runtime mode, but from experience, most of the devices can transfer a few hundred Kb in less than ~200ms. Erasing flash is usually the slowest part, typically less than 2s, with writing next at ~1s, both done in the bootloader phase. I’ve not seen a single device that can do a flash-addr-swap to be able to do the A/B solution they’ve optimized for, with the exception of enterprise UEFI firmware which CFU can’t update anyway.

By far the longest process in the whole update step is the USB re-enumeration (up to twice) which we have to allow 5s (!!!) for in fwupd due to slow hubs and other buggy hardware. So, CFU doubles the flash size requirement for millions of devices to save ~5 seconds for a procedure which might be done once or twice in the device’s lifetime. It’s also not the transfer that’s the limitation, even over Bluetooth: if the dep resolution is “higher up” you only need to send the firmware to the device when it needs an update, rather than every time you scan the device.

I’m similarly unimpressed with the no-user-interaction idea where firmware updates just happen in the background, as the user really needs to know when the device is going to disappear and re-appear for 5 seconds (even CFU has to re-enumerate…) — image it happening during a presentation or as the machine is about to have the lid shut to go into S3.

so, tl;dr: Not a fan, but could support in fwupd if required.

libfprint 1.0 (and fprintd 0.9.0)

Posted by Bastien Nocera on August 08, 2019 01:53 PM
After more than a year of work libfprint 1.0 has just been released!

It contains a lot of bug fixes for a number of different drivers, which should make it a worthwhile update for any stable or unstable release of your OS.

There was a small ABI break between versions 0.8.1 and 0.8.2, which means that any dependency (really just fprintd) will need to be recompiled. That works out well, seeing as we also have a new fprintd release which fixes a number of bugs.

Benjamin Berg will take over maintenance and development of libfprint with the goal of having a version 2 in the coming months that supports more types of fingerprint readers that cannot be supported with the current API.

From my side, the next step will be some much needed modernisation for fprintd, both in terms of code as well as in the way it interacts with users.

Pango 1.44 wrap-up

Posted by Matthias Clasen on August 07, 2019 09:06 PM

In my last post discussing changes in Pango 1.44, I’ve asked for feedback. We’ve received some, thanks to everybody who reported issues!

We tried to address some of the fallout in several follow-up releases. I’ll do a 1.44.4 release with the last round of fixes before too long.

Here is a summary.

Bitmap fonts

As expected, not supporting Type 1 and BDF fonts anymore is an unwelcome change for people whose favorite fonts are in these formats.

Clearly, a robust conversion script would be a very good thing to have; people have had mixed success with fontforge-based scripts (see this issue). I hope that we can get some help from the font packager community with this.

One follow-up fix that we did here is to make sure that Pango’s font enumeration code does not return fonts in formats that we don’t support. This makes font fallback work to replace bitmap fonts, and helps to avoid ‘black box’ output.

Subpixel positioning

Font rendering is a sensitive topic; every change here is likely to upset some people (in particular those with carefully tuned font setups).

We did not help things by enabling subpixel positioning unconditionally in Pango, when it is only supported in cairo master. When used with the released cairo, this leads to unpleasantly uneven glyph placement. Even with cairo master, some cairo backends have not been updated to support subpixel positioning (e.g. win32, xcb).

To address this problem, subpixel positioning is now optional, and off by default. Use

pango_context_set_round_glyph_positions (context, FALSE)

to turn it on.

Even without subpixel positioning, there are still small differences in glyph positioning between Pango 1.43 and 1.44. These are caused by differences in glyph extent calculations between cairo and harfbuzz; see this issue for the ongoing discussion.

API changes

I was a bit overzealous in my attempt to reduce our dependency on freetype when I changed the return type of pango_fc_font_lock_face() to gpointer. This is a harmless change for the C API, but it broke some users of Pango in C++. The next release will have the old return type back.

Line spacing

Another new feature that turned out to be better off being off by default is the new line spacing. In the initial 1.44 release, it was on by default, causing line spacing UIs (e.g. in the GIMP) to stop working, which is not acceptable. It is now off by default. Call

pango_layout_set_line_spacing (layout, factor)

to enable it.

Hyphenation

We’ve received one bug report pointing out that hyphens could be confusing in some contexts, for example when breaking filenames. As a consequence, there is  now a text attribute to suppress the insertion of hyphens.

Miscellaneous bugs

Naturally, some bugs crept in; there were some crash fixes, and some hyphens got inserted in the wrong place (such as: hyphens after hyphens, or hyphens after spaces). These were easy.

One bug that took me a while to track down was lines growing higher when they are ellipsized, causing misrendering. It turned out to be a mixup with text attributes that led us to pick the wrong font for the ellipsis character. This will be fixed in the next release.

More text rendering updates

Posted by Matthias Clasen on July 27, 2019 08:53 PM

There is a Pango 1.44 release now. It contains all the changes I outlined recently. We also managed to sneak in a few features and fixes for longstanding bugs. That is the topic of this post.

Line breaking

One area for improvements in this release is line breaking.

Hyphenation

We don’t have TeX-style automatic hyphenation yet (although it may happen eventually). But at least, Pango inserts hyphens now when it breaks a line in the middle of a word (for example, at a soft hyphen character).

[Figure: Example with soft hyphens]

This is something I have wanted to do for a very long time, so I am quite happy that switching to harfbuzz for shaping on all platforms has finally enabled us to do this without too much effort.

Better line breaks

Pango follows Unicode UAX14 and UAX29 for finding word boundaries and line break opportunities.  The algorithm described in there is language-independent, but allows for language-specific tweaks. The Unicode standard calls this tailoring.

While Pango has had implementations for both the language-independent and -dependent parts before, we didn’t have them clearly separated in the API, until now.

In 1.44, we introduce a new pango_tailor_break() function which applies language-specific tweaks to a segment of text that has a uniform language. It is meant to be called after pango_default_break().
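For reference, the new function's signature looks roughly like this (argument names approximate; check the 1.44 documentation for the authoritative version):

void pango_tailor_break (const char    *text,
                         int            length,
                         PangoAnalysis *analysis,
                         int            offset,
                         PangoLogAttr  *log_attrs,
                         int            log_attrs_len);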

Line break control

Since my focus was on line-breaking already, I’ve added support for a text attribute to control line breaking. You can now say:

Don't break <span allow_break="false">here!</span>

in Pango markup, and Pango will obey.

In the hyphenation example above, the words showing possible hyphenation points (like im‧peachment) are marked up in this way.

Placement

Another area with significant changes is placement, both of lines and of individual glyphs.

Line height

Up to now, Pango has been placing the lines of a paragraph directly below each other, possibly with a fixed amount of spacing between them. While this works ok most of the time, a more typographically correct way to go about this is to control the baseline-to-baseline distance between lines.

Fonts contain a recommended value for this distance, so the first step was to make this value available with a new pango_font_metrics_get_height() API.

To make use of it, we added a new parameter to PangoLayout that tells it to place lines according to baseline-to-baseline distance. Once we had this, it was very easy to turn the parameter into a floating point number and allow things like double-spaced lines, by saying

pango_layout_set_line_spacing (layout, 2.0)
[Figure: Line spacing 1, 1.5, and 2]

You can still use the old way of spacing if you set line-spacing to 0.

Subpixel positions

Pango no longer rounds glyph positions and font metrics to integral pixel numbers. This lets consumers of the formatted glyphs (basically, implementations of PangoRenderer) decide for themselves if they want to place glyphs at subpixel positions or pixel-aligned.

[Figure: Non-integral extents]

The cairo renderer in libpangocairo will do subpixel positioning, but you need cairo master for best results. GTK master will soon have the necessary changes to take advantage of it for its GL and Vulkan renderers too.

This is likely one of the more controversial changes in this release—any change to font rendering causes strong reactions. One of the reasons for doing the release now is that it gives us enough time to make sure it works ok for all users of Pango before going out in the next round of upstream and distro releases in the fall.

Visualization

Finally, I spent some time implementing  some long-requested features around missing glyphs, and their rendering as hex boxes. These are also known as tofu (which is the origin of the name for the Noto fonts – ‘no tofu’).

Invisible space

Some fonts don’t have a glyph for the space character – after all, there is nothing to draw. In the past, Pango would sometimes draw a hex box in this case. This is entirely unnecessary – we can just leave a gap of the right size and pretend that nothing happened.  Pango 1.44 will do just that: no more hex boxes for space.

Visible space

On the other hand, sometimes you do want to see where spaces and other whitespace characters such as tabs, are. We’ve added an attribute that lets you request visible rendering of whitespace:

<span show="spaces">Some space here</span>
[Figure: Visible space]

This is implemented in the cairo backend, so you will need to use pangocairo to see it.

Special characters

In the same vein, sometimes it is helpful to see special characters such as left-to-right controls in the output.  Unicode calls these characters default-ignorable.

The show attribute also lets you make default-ignorables visible:

<span show="ignorables">Hidden treasures</span>

[Figure: Visible default-ignorable characters]

As you can see, we use nicknames for ignorables.

Font information

Pango has been shipping a simple tool called pango-list for a while. It produces a list of all the fonts Pango can find.  This can be very helpful in tracking down changes between systems that are caused by differences in the available fonts.

In 1.44, pango-list can optionally show font metrics and variation axes as well. This may be a little obscure, but it has helped me fix the CI tests for Pango.

Summary

This release contains a significant amount of change; I’ve closed a good number of ‘teenage’ bugs while working on it. Please let us know if you see problems or unexpected changes with it!

Westcoast hackfest; GTK updates

Posted by Matthias Clasen on July 21, 2019 11:51 PM

After Behdad left, Christian and I turned our attention to GtkTextView, and made some progress.

Scrolling

GtkTextView is a very old widget. It started out as a port of the tk text widget, and it has not seen a lot of architectural updates over the years. A few years ago, we added a pixel cache to it, to improve its scrolling, but on a high resolution display, it's still a lot of pixels to shovel around.

As we’ve moved widgets to GTK4’s rendering models, everybody avoided GtkTextView, so it was using the fallback cairo rendering path, even as we ported other text rendering in GTK to a new pango renderer which produces render nodes.

Until yesterday. We decided to just have a look at how hard it would be to switch the text view over to the new pango renderer. This went much more smoothly than we expected, and the new code is in master today.

[Video: Gtk 4 smooth scrolling with GPU backed textview - https://www.youtube.com/embed/zDLCJCX1kL0?feature=oembed]

So far, this is just a straight port with no optimizations (we want to look at smarter caching of render nodes for the visible range). But it is already noticeably smoother to scroll text.

The video does not really do it justice. If you want to try for yourself, the commit is here.

Blinking

After this unexpected success, we looked for another small thing we could to make text editing in GTK feel more modern: better blinking cursors.

[Video: cursor blinking in GTK4 - https://blogs.gnome.org/mclasen/files/2019/07/cursor-blinks.webm]

For the last 20 years, our cursor blinking was very simple: We turn it off, and then we turn it on again. With GTK4, it is very straightforward to do a little better, and fade the cursor in and out smoothly.

A subtle change, but it improves the experience.

Pango updates

Posted by Matthias Clasen on July 19, 2019 07:35 PM
I have recently spent some time on Pango again, in preparation for the Westcoast hackfest. Behdad is here, and we’ve made great progress on the first day.

My last Pango update laid out our plans for Pango. Today I’ll summarize the major changes that will be in the next Pango release, 1.44.

Unicode APIs

I had planned to replace PangoScript with GUnicodeScript outright, but doing so caused breakage in introspection and elsewhere. So, for now, we’ve just deprecated it and recommend that everybody should use GUnicodeScript instead. We did get a registered GType for this (and other) enumerations into GObject, so the lack of a type is no longer an obstacle.

Harfbuzz passthrough

We have added an api to get a Harfbuzz font object from a PangoFont:

hb_font_t *pango_font_get_hb_font (PangoFont *f)

This makes technologies such as OpenType features or variations available to applications without adding more Pango apis in the future.
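As a small sketch of the handoff (the HarfBuzz call is ordinary hb API, just to show that the returned object is a regular hb_font_t; error handling omitted):

#include <pango/pango.h>
#include <hb.h>

static gboolean
font_has_glyph (PangoFont *font, gunichar ch)
{
    /* The hb_font_t is owned by the PangoFont, so no unref is needed. */
    hb_font_t *hb_font = pango_font_get_hb_font (font);
    hb_codepoint_t glyph;

    return hb_font_get_nominal_glyph (hb_font, ch, &glyph);
}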

Reduced freetype dependency

Pango uses harfbuzz for getting font and glyph metrics, glyph IDs and other kinds of font information now, so we don’t need an FT_Face anymore, and pango_fc_font_lock_face() has been deprecated.

Unified shaping

We are using harfbuzz for shaping on all platforms now.  This has allowed us to drop the remaining internal uses of shape and language engines.

Unhinted rendering

Pango no longer forces glyph positions and sizes to be on integral pixel positions. This allows renderers to place glyphs on a subpixel grid. cairo master has the necessary changes to make this work.

libinput's new thumb detection code

Posted by Peter Hutterer on July 18, 2019 08:40 AM

The average user has approximately one thumb per hand. That thumb comes in handy for a number of touchpad interactions. For example, moving the cursor with the index finger and clicking a button with the thumb. On so-called Clickpads we don't have separate buttons though. The touchpad itself acts as a button and software decides whether it's a left, right, or middle click by counting fingers and/or finger locations. Hence the need for thumb detection, because you may have two fingers on the touchpad (usually right click) but if those are the index and thumb, then really, it's just a single finger click.

libinput has had some thumb detection since the early days when we were still hand-carving bits with stone tools. But it was quite simplistic, as the old documentation illustrates: two zones on the touchpad, a touch started in the lower zone was always a thumb. Where a touch started in the upper thumb area, a timeout and movement thresholds would decide whether it was a thumb. Internally, the thumb states were, Schrödinger-esque, "NO", "YES", and "MAYBE". On top of that, we also had speed-based thumb detection - where a finger was moving fast enough, a new touch would always default to being a thumb. On the grounds that you have no business dropping fingers in the middle of a fast interaction. Such a simplistic approach worked well enough for a bunch of use-cases but failed gloriously in other cases.

Thanks to Matt Mayfield's work, we now have a much more sophisticated thumb detection algorithm. The speed detection is still there but it better accounts for pinch gestures and two-finger scrolling. The exclusion zones are still there but less final about the state of the touch, a thumb can escape that "jail" and contribute to pointer motion where necessary. The new documentation has a bit of a general overview. A requirement for well-working thumb detection however is that your device has the required (device-specific) thresholds set up. So go over to the debugging thumb thresholds documentation and start figuring out your device's thresholds.

As usual, if you notice any issues with the new code please let us know, ideally before the 1.14 release.

ASG! 2019 CfP Re-Opened!

Posted by Lennart Poettering on July 14, 2019 10:00 PM

The All Systems Go! 2019 Call for Participation Re-Opened for ONE DAY!

Due to popular request we have re-opened the Call for Participation (CFP) for All Systems Go! 2019 for one day. It will close again TODAY, on 15 July 2019, midnight Central European Summer Time! If you missed the deadline so far, we’d like to invite you to submit your proposals for consideration to the CFP submission site quickly! (And yes, this is the last extension, there's not going to be any more extensions.)


All Systems Go! is everybody's favourite low-level Userspace Linux conference, taking place in Berlin, Germany on September 20-22, 2019.

For more information please visit our conference website!

Settings, in a sandbox world

Posted by Matthias Clasen on July 12, 2019 06:19 PM

GNOME applications (and others) are commonly using the GSettings API for storing their application settings.

GSettings has many nice aspects:

  • flexible data types, with GVariant
  • schemas, so others can understand your settings (e.g. dconf-editor)
  • overrides, so distros can tweak defaults they don’t like

And it has different backends, so it can be adapted to work transparently in many situations. One example for where this comes in handy is when we use a memory backend to avoid persisting any settings while running tests.
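For context, application code looks the same no matter which backend ends up storing the values; a minimal sketch (the schema id and key are invented):

#include <gio/gio.h>

static void
remember_window_width (int width)
{
    /* Which backend persists this (dconf, keyfile, memory) is decided
     * at runtime, not by the application. */
    g_autoptr(GSettings) settings = g_settings_new ("org.example.App");

    g_settings_set_int (settings, "window-width", width);
}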

The GSettings backend that is typically used for normal operation is the DConf one.

DConf

DConf features include profiles,  a stack of databases, a facility for locking down keys so they are not writable, and a single-writer design with a central service.

The DConf design is flexible and enterprisey – we have taken advantage of this when we created fleet commander to centrally manage application and desktop settings for large deployments.

But it is not a great fit for sandboxing, where we want to isolate applications from each other and from the host system.  In DConf, all settings are stored in a single database, and apps are free to read and write any keys, not just their own – plenty of potential for mischief and accidents.

Most of the apps that are available as flatpaks today are poking a ‘DConf hole’ into their sandbox to allow the GSettings code to keep talking to the dconf daemon on the session bus, and mmap the dconf database.

Here is how the DConf hole looks in the flatpak metadata file:

[Context]
filesystems=xdg-run/dconf;~/.config/dconf:ro;

[Session Bus Policy]
ca.desrt.dconf=talk

Sandboxes

Ideally, we want sandboxed apps to only have access to their own settings, and maybe readonly access to a limited set of shared settings (for things like the current font, or accessibility settings). It would also be nice if uninstalling a sandboxed app did not leave traces behind, like leftover settings  in some central database.

It might be possible to retrofit some of this into DConf. But when we looked, it did not seem easy, and would require reconsidering some of the central aspects of the DConf design. Instead of going down that road, we decided to take advantage of another GSettings backend that already exists, and stores settings in a keyfile.

Unsurprisingly, it is called the keyfile backend.

Keyfiles

The keyfile backend was originally created to facilitate the migration from GConf to GSettings, and has been a bit neglected, but we’ve given it some love and attention, and it can now function as the default GSettings backend inside sandboxes.

It provides many of the isolation aspects we want: Apps can only read and write their own settings, and the settings are in a single file, in the same place as all the application data:

~/.var/app/$APP/config/glib-2.0/settings/keyfile
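The file itself is an ordinary key file, much like the output of dconf dump: groups are settings paths and values are serialized GVariants. A made-up example:

[org/gnome/builder/editor]
show-map=true
font-name='Monospace 11'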

One of the things we added to the keyfile backend is support for locks and overrides, so that fleet commander can keep working for apps that are in flatpaks.

For shared desktop-wide settings, there is a companion Settings portal, which provides readonly access to some global settings. It is used transparently by GTK and Qt for toolkit-level settings.

What does all this mean for flatpak apps?

If your application is not yet available as a flatpak, and you want to provide one, you don’t have to do anything in particular. Things will just work. Don’t poke a hole in your sandbox for DConf, and GSettings will use the keyfile backend without any extra work on your part.

If your flatpak is currently shipping with a DConf hole, you can keep doing that for now. When you are ready for it, you should

  • Remove the DConf hole from your flatpak metadata
  • Instruct flatpak to migrate existing DConf settings, by adding a migrate-path setting to the X-DConf section in your flatpak metadata. The value of the migrate-path key is the DConf path prefix where your application’s settings are stored.

Note that this is a one-time migration; it will only happen if the keyfile does not exist. The existing settings will be left in the DConf database, so if you need to do the migration again for whatever reason, you can simply remove the keyfile.

This is how the migrate-path key looks in the metadata file:

[X-DConf]
migrate-path=/org/gnome/builder/

Closing the DConf hole is what makes GSettings use the keyfile backend, and the migrate-path key tells flatpak to migrate settings from DConf – you need both parts for a seamless transition.

There were some recent fixes to the keyfile backend code, so you want to make sure that the runtime has GLib 2.60.6, for best results.

Happy flatpaking!

Update: One of the most recent fixes in the keyfile backend was to correct under what circumstances GSettings will choose it as the default backend. If you have problems where the wrong backend is chosen, as a short-term workaround, you can override the choice with the GSETTINGS_BACKEND environment variable.

Update 2: To add the migrate-path setting with flatpak-builder, use the following option:

--metadata=X-DConf=migrate-path=/your/path/


GNOME Software in Fedora will no longer support snapd

Posted by Richard Hughes on July 12, 2019 12:51 PM

In my slightly infamous email to fedora-devel I stated that I would turn off the snapd support in the gnome-software package for Fedora 31. A lot of people agreed with the technical reasons, but failed to understand the bigger picture and asked me to explain myself.

I wanted to tell a little, fictional, story:

In 2012 the ISO institute started working on a cross-vendor petrol reference vehicle to reduce the amount of R&D different companies had to do to build and sell a modern, and safe, saloon car.

Almost immediately, Mercedes joins ISO, and starts selling the ISO car. Fiat joins in 2013, Peugeot in 2014 and General Motors finally joins in 2015 and adds support for Diesel engines. BMW, who had been trying to maintain the previous chassis they designed on their own (sold as “BMW Kar Koncept”), finally adopts the ISO car also in 2015. BMW versions of the ISO car use BMW-specific transmission oil as it doesn’t trust oil from the ISO consortium.

Mercedes looks to the future, and adds high-voltage battery support to the ISO reference car also in 2015, adding the required additional wiring and regenerative braking support. All the other members of the consortium can use their own high voltage batteries, or use the reference battery. The battery can be charged with electricity from any provider.

In 2016 BMW stops marketing the “ISO Car” like all the other vendors, and starts calling it “BMW Car” instead. At about the same time BMW adds support for hydrogen engines to the reference vehicle. All the other vendors can ship the ISO car with a hydrogen engine, but all the hydrogen must be purchased from a BMW-certified dealer. If any vendor other than BMW uses the hydrogen engines, they can’t use the BMW-specific heat shield which protects the fuel tank from exploding in the event of a collision.

In 2017 Mercedes adds traction control and power steering to the ISO reference car. It is enabled almost immediately and used by nearly all the vendors with no royalties and many customer lives are saved.

In 2018 BMW decides that actually producing vendor-specific oil for its cars is quite a lot of extra work, and tells all customers that existing transmission oil has to be thrown away, but now all customers can get free oil from the ISO consortium. The ISO consortium distributes a lot more oil, but also has to deal with a lot more customer queries about transmission failures.

In 2019 BMW builds a special cut-down ISO car, but physically removes all the petrol and electric functionality from the frame. It is rebranded as “Kar by BMW”. It then sends a private note to the chair of the ISO consortium that it’s not going to be using ISO car in 2020, and that it’s designing a completely new “Kar” that only supports hydrogen engines and does not have traction control or seatbelts. The explanation given was that BMW wanted a vehicle that was tailored specifically for hydrogen engines. Any BMW customers using petrol or electricity in their car must switch to hydrogen by 2020.

The BMW engineers that used to work on the ISO Car have been shifted to work on Kar, although they have committed to also work on the ISO Car if it’s not too much extra work. BMW still want to be officially part of the consortium and to be able to sell the ISO Car, which provides all the engine types, as an extra vehicle to customers (as some customers don’t like hydrogen engines), but they don’t want to be seen to support anything other than a hydrogen-based future. It’s also unclear whether the extra vehicle sold to customers would be the “ISO Car” or the “BMW Car”.

One ISO consortium member asks whether they should remove hydrogen engine support from the ISO car as they feel BMW is not playing fair. Another consortium member thinks that the extra functionality could just be disabled by default and any unused functionality should certainly be removed. All members of the consortium feel like BMW has pushed them too far. Mercedes stops selling the hydrogen ISO Car model, stating that it’s not safe without the heat shield and that BMW isn’t going to be supporting the ISO Car in 2020.

Bug bounties and NDAs are an option, not the standard

Posted by Matthew Garrett on July 09, 2019 09:15 PM
Zoom had a vulnerability that allowed users on MacOS to be connected to a video conference with their webcam active simply by visiting an appropriately crafted page. Zoom's response has largely been to argue that:

a) There's a setting you can toggle to disable the webcam being on by default, so this isn't a big deal,
b) When Safari added a security feature requiring that users explicitly agree to launch Zoom, this created a poor user experience and so they were justified in working around this (and so introducing the vulnerability), and,
c) The submitter asked whether Zoom would pay them for disclosing the bug, and when Zoom said they'd only do so if the submitter signed an NDA, they declined.

(a) and (b) are clearly ludicrous arguments, but (c) is the interesting one. Zoom go on to mention that they disagreed with the severity of the issue, and in the end decided not to change how their software worked. If the submitter had agreed to the terms of the NDA, then Zoom's decision that this was a low severity issue would have led to them being given a small amount of money and never being allowed to talk about the vulnerability. Since Zoom apparently have no intention of fixing it, we'd presumably never have heard about it. Users would have been less informed, and the world would have been a less secure place.

The point of bug bounties is to provide people with an additional incentive to disclose security issues to companies. But what incentive are they offering? Well, that depends on who you are. For many people, the amount of money offered by bug bounty programs is meaningful, and agreeing to sign an NDA is worth it. For others, the ability to publicly talk about the issue is worth more than whatever the bounty may award - being able to give a presentation on the vulnerability at a high profile conference may be enough to get you a significantly better paying job. Others may be unwilling to sign an NDA on principle, refusing to trust that the company will ever disclose the issue or fix the vulnerability. And finally there are people who can't sign such an NDA - they may have discovered the issue on work time, and employer policies may prohibit them doing so.

Zoom are correct that it's not unusual for bug bounty programs to require NDAs. But when they talk about this being an industry standard, they come awfully close to suggesting that the submitter did something unusual or unreasonable in rejecting their bounty terms. When someone lets you know about a vulnerability, they're giving you an opportunity to have the issue fixed before the public knows about it. They've done something they didn't need to do - they could have just publicly disclosed it immediately, causing significant damage to your reputation and potentially putting your customers at risk. They could potentially have sold the information to a third party. But they didn't - they came to you first. If you want to offer them money in order to encourage them (and others) to do the same in future, then that's great. If you want to tie strings to that money, that's a choice you can make - but there's no reason for them to agree to those strings, and if they choose not to then you don't get to complain about that afterwards. And if they make it clear at the time of submission that they intend to publicly disclose the issue after 90 days, then they're acting in accordance with widely accepted norms. If you're not able to fix an issue within 90 days, that's very much your problem.

If your bug bounty requires people sign an NDA, you should think about why. If it's so you can control disclosure and delay things beyond 90 days (and potentially never disclose at all), look at whether the amount of money you're offering for that is anywhere near commensurate with the value the submitter could otherwise gain from the information and compare that to the reputational damage you'll take from people deciding that it's not worth it and just disclosing unilaterally. And, seriously, never ask for an NDA before you're committing to a specific $ amount - it's never reasonable to ask that someone sign away their rights without knowing exactly what they're getting in return.

tl;dr - a bug bounty should only be one component of your vulnerability reporting process. You need to be prepared for people to decline any restrictions you wish to place on them, and you need to be prepared for them to disclose on the date they initially proposed. If they give you 90 days, that's entirely within industry norms. Remember that a bargain is being struck here - you offering money isn't being generous, it's you attempting to provide an incentive for people to help you improve your security. If you're asking people to give up more than you're offering in return, don't be surprised if they say no.


Creating hardware where no hardware exists

Posted by Matthew Garrett on July 07, 2019 07:46 PM
The laptop industry was still in its infancy back in 1990, but it already faced a core problem that we still face today - power and thermal management are hard, but also critical to a good user experience (and potentially to the lifespan of the hardware). This was in the days when DOS and Windows had no memory protection, so handling these problems at the OS level would have been an invitation for someone to overwrite your management code and potentially kill your laptop. The safe option was pushing all of this out to an external management controller of some sort, but vendors in the 90s were the same as vendors now and would do basically anything to avoid having to drop an extra chip on the board. Thankfully(?), Intel had a solution.

The 386SL was released in October 1990 as a low-powered mobile-optimised version of the 386. Critically, it included a feature that let vendors ensure that their power management code could run without OS interference. A small window of RAM was hidden behind the VGA memory[1] and the CPU was configured so that various events would cause it to stop executing the OS and jump to this protected region. It could then do whatever power or thermal management tasks were necessary and return control to the OS, which would be none the wiser. Intel called this System Management Mode, and we've never really recovered.

Step forward to the late 90s. USB is now a thing, but even the operating systems that support USB usually don't support it in their installers (and plenty of operating systems still didn't have USB drivers). The industry needed a transition path, and System Management Mode was there for them. By configuring the chipset to generate a System Management Interrupt (or SMI) whenever the OS tried to access the PS/2 keyboard controller, the CPU could then trap into some SMM code that knew how to talk to USB, figure out what was going on with the USB keyboard, fake up the results and pass them back to the OS. As far as the OS was concerned, it was talking to a normal keyboard controller - but in reality, the "hardware" it was talking to was entirely implemented in software on the CPU.

Since then we've seen even more stuff get crammed into SMM, which is annoying because in general it's much harder for an OS to do interesting things with hardware if the CPU occasionally stops in order to run invisible code to touch hardware resources you were planning on using, and that's even ignoring the fact that operating systems in general don't really appreciate the entire world stopping and then restarting some time later without any notification. So, overall, SMM is a pain for OS vendors.

Change of topic. When Apple moved to x86 CPUs in the mid 2000s, they faced a problem. Their hardware was basically now just a PC, and that meant people were going to try to run their OS on random PC hardware. For various reasons this was unappealing, and so Apple took advantage of the one significant difference between their platforms and generic PCs. x86 Macs have a component called the System Management Controller that (ironically) seems to do a bunch of the stuff that the 386SL was designed to do on the CPU. It runs the fans, it reports hardware information, it controls the keyboard backlight, it does all kinds of things. So Apple embedded a string in the SMC, and the OS tries to read it on boot. If it fails, so does boot[2]. Qemu has a driver that emulates enough of the SMC that you can provide that string on the command line and boot OS X in qemu, something that's documented further here.

What does this have to do with SMM? It turns out that you can configure x86 chipsets to trap into SMM on arbitrary IO port ranges, and older Macs had SMCs in IO port space[3]. After some fighting with Intel documentation[4] I had Coreboot's SMI handler responding to writes to an arbitrary IO port range. With some more fighting I was able to fake up responses to reads as well. And then I took qemu's SMC emulation driver and merged it into Coreboot's SMM code. Now, accesses to the IO port range that the SMC occupies on real hardware generate SMIs, trap into SMM on the CPU, run the emulation code, handle writes, fake up responses to reads and return control to the OS. From the OS's perspective, this is entirely invisible[5]. We've created hardware where none existed.

The tree where I'm working on this is here, and I'll see if it's possible to clean this up in a reasonable way to get it merged into mainline Coreboot. Note that this only handles the SMC - actually booting OS X involves a lot more, but that's something for another time.

[1] If the OS attempts to access this range, the chipset directs it to the video card instead of to actual RAM.
[2] It's actually more complicated than that - see here for more.
[3] IO port space is a weird x86 feature where there's an entire separate IO bus that isn't part of the memory map and which requires different instructions to access. It's low performance but also extremely simple, so hardware that has no performance requirements is often implemented using it.
[4] Some current Intel hardware has two sets of registers defined for setting up which IO ports should trap into SMM. I can't find anything that documents what the relationship between them is, but if you program the obvious ones nothing happens and if you program the ones that are hidden in the section about LPC decoding ranges things suddenly start working.
[5] Eh technically a sufficiently enthusiastic OS could notice that the time it took for the access to occur didn't match what it should on real hardware, or could look at the CPU's count of the number of SMIs that have occurred and correlate that with accesses, but good enough


Fun with the ODRS, part 2

Posted by Richard Hughes on July 05, 2019 07:58 PM

For the last few days I’ve been working on the ODRS, the review server used by GNOME Software and other open source software centers. I had to do a lot of work initially to get the codebase up to modern standards, but now it has unit tests (86% coverage!), full CI and is using the latest versions of everything. All this refactoring allowed me to add some extra new features we’ve needed for a while.

The first feature changes how we do moderation. The way the ODRS works means that any unauthenticated user can mark a review for moderation for any reason in just one click. This means that it’s no longer shown to any other user and requires a moderator to perform one of three actions:

  • Decide it’s okay, and clear the reported counter back to zero
  • Decide it’s not very good, and either modify it or delete it
  • Decide it’s spam or in any way hateful, and delete all the reviews from the submitter, adding them to the user blocklist

For the last few years it’s been mostly me deciding on the ~3k marked-for-moderation reviews with the help of Google Translate. Let me tell you, after all that my threshold for dealing with internet trolls is super low. There are already over 60 blocked users on the ODRS, although they’ll never really know they are shouting into /dev/null.

One change I’ve made here is that it now takes two “reports” of a review before it needs moderation; the logic being that a lot of reports seem accidental and a really bad review is already normally reported by multiple people in the few days after it’s been posted. The other change is that we now have a locale-specific “bad word list” that submitted reviews are checked against at submission time. If a review is flagged, the moderator has to decide on the action before it’s ever shown to other users. This has already correctly flagged 5 reviews in the couple of days since it was deployed. If you contributed to the spreadsheet with “bad words” for your country I’m very grateful. That bad word list will be available as a JSON dump on the ODRS on Monday in case it’s useful to other people. I fully expect it’ll grow and change over time.
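
For the curious, the locale-keyed check boils down to something like the following Python sketch; the word lists and function name are illustrative, not the actual ODRS code:

# Illustrative sketch only, not the real ODRS implementation.
BAD_WORDS = {
    "en": {"someword", "otherword"},
    "de": {"schimpfwort"},
}

def needs_moderation(locale, text):
    lang = locale.split("_")[0].split(".")[0].lower()  # "en_GB.UTF-8" -> "en"
    words = {w.strip(".,!?").lower() for w in text.split()}
    return bool(words & BAD_WORDS.get(lang, set()))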

The other big change is dealing with different application IDs. Over the last decade some applications have moved from “launchable-style” inkscape.desktop IDs to AppStream-style IDs like org.inkscape.Inkscape.desktop and are even reported in different forms, e.g. the Flathub-inspired org.inkscape.Inkscape and the Snappy io.snapcraft.inkscape-tIrcA87dMWthuDORCCRU0VpidK5SBVOc. Until today a review submitted against the old desktop ID wouldn’t match the Flatpak one; now it does. The same happens when we get the star ratings, which means that apps that change ID don’t start with a clean slate but instead inherit all the positivity of the old version. Of course, the usual per-request ordering and filtering is done, so older versions than the one requested might be shown lower than newer versions anyway.
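
Conceptually, the matching just derives a set of equivalent IDs for each request; a rough Python sketch (with made-up names, not the actual ODRS code) might look like this:

# Made-up sketch of deriving equivalent application IDs; the snap-style
# io.snapcraft.* form needs extra handling and is omitted here.
def equivalent_ids(app_id):
    base = app_id[:-len(".desktop")] if app_id.endswith(".desktop") else app_id
    ids = {app_id, base, base + ".desktop"}
    # reverse-DNS AppStream ID -> legacy launchable-style ID,
    # e.g. org.inkscape.Inkscape -> inkscape.desktop
    ids.add(base.rsplit(".", 1)[-1].lower() + ".desktop")
    return ids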

This is also your monthly reminder to use <provides><id>oldname.desktop</id></provides> in your metainfo.xml file if you change your desktop ID. That includes you, Flathub and Snapcraft maintainers. If you do that client side then you probably get the right reviews, assuming the software center does the right thing, but doing it server side as well makes sure you get the reviews and ratings you want in all cases.
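
In metainfo.xml terms, the relevant fragment looks something like this (using the Inkscape IDs from the example above):

<component type="desktop-application">
  <id>org.inkscape.Inkscape</id>
  <!-- other metadata -->
  <provides>
    <id>inkscape.desktop</id>
  </provides>
</component>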

If all this sounds interesting, and you’d like to know more about the ODRS development, or would like to be a moderator for your language, please join the mailing list and I’ll post there next week when I’ve made the moderator experience nicer than it is now. It’ll also be the place to request help, guidance and also ask for new features.

Initial Fun with the Open Desktop Ratings Service: Swearing!

Posted by Richard Hughes on July 03, 2019 02:30 PM

The ODRS is the service that produces ratings and reviews for gnome-software. I built the service a few years ago, and it’s been dutifully trucking on ever since. There are over 25,000 reviews, 50k votes, and over 4k different applications reviewed. Over half a million clients get application reviews every single day.

Recently it’s been showing signs of needing work, and so I’ve spent a few days converting it to Python 3, then to SQLAlchemy, and then fixing all the broken stuff that we’ve lived with for a while (e.g. no emoji support because we were not using utf8mb4…). Part of the new work will be making it easier to flag and then moderate reviews, and that needs your help. Although any unauthenticated user can report a review for any reason, some reviews should be automatically marked at submission if they contain known bad words. There is almost no reason to write a review in locale en_GB and use the word fuck and so I think marking that review as needing moderation before it’s shown to thousands of people is a sensible thing to do.

For this to work, I can’t just use a single blacklist of words, as some words are only really vulgar in some regions and some are perfectly valid words in other languages. For this reason I need the blacklist to be keyed to the submitted locale.

This is where I need your help. If you can spare 2 minutes and know a lot of dirty words in your language, please add them to this spreadsheet. Much appreciated.

Which smart bulbs should you buy (from a security perspective)

Posted by Matthew Garrett on June 30, 2019 08:10 PM
People keep asking me which smart bulbs they should buy. It's a great question! As someone who has, for some reason, ended up spending a bunch of time reverse engineering various types of lightbulb, I'm probably a reasonable person to ask. So. There are four primary communications mechanisms for bulbs: wifi, bluetooth, zigbee and zwave. There are basically zero compelling reasons to care about zwave, so I'm not going to.

Wifi


Advantages: Doesn't need an additional hub - you can just put the bulbs wherever. The bulbs can connect out to a cloud service, so you can control them even if you're not on the same network.
Disadvantages: Only works if you have wifi coverage, each bulb has to have wifi hardware and be configured appropriately.
Which should you get: If you search Amazon for "wifi bulb" you'll get a whole bunch of cheap bulbs. Don't buy any of them. They're mostly based on a custom protocol from Zengge and they're shit. Colour reproduction is bad, there's no good way to use the colour LEDs and the white LEDs simultaneously, and if you use any of the vendor apps they'll proxy your device control through a remote server with terrible authentication mechanisms. Just don't. The ones that aren't Zengge are generally based on the Tuya platform, whose security model is to have keys embedded in some incredibly obfuscated code and hope that nobody can find them. TP-Link make some reasonably competent bulbs but also use a weird custom protocol with hand-rolled security. Eufy are fine but again there's weird custom security. Lifx are the best bulbs, but have zero security on the local network - anyone on your wifi can control the bulbs. If that's something you care about then they're a bad choice, but also if that's something you care about maybe just don't let people you don't trust use your wifi.
Conclusion: If you have to use wifi, go with lifx. Their security is not meaningfully worse than anything else on the market (and they're better than many), and they're better bulbs. But you probably shouldn't go with wifi.

Bluetooth


Advantages: Doesn't need an additional hub. Doesn't need wifi coverage. Doesn't connect to the internet, so remote attack is unlikely.
Disadvantages: Only one control device at a time can connect to a bulb, so harder to share. Control device needs to be in Bluetooth range of the bulb. Doesn't connect to the internet, so you can't control your bulbs remotely.
Which should you get: Again, most Bluetooth bulbs you'll find on Amazon are shit. There's a whole bunch of weird custom protocols and the quality of the bulbs is just bad. If you're going to go with anything, go with the C by GE bulbs. Their protocol is still some AES-encrypted custom binary thing, but they use a Bluetooth controller from Telink that supports a mesh network protocol. This means that you can talk to any bulb in your network and still send commands to other bulbs - the dual advantages here are that you can communicate with bulbs that are outside the range of your control device and also that you can have as many control devices as you have bulbs. If you've bought into the Google Home ecosystem, you can associate them directly with a Home and use Google Assistant to control them remotely. GE also sell a wifi bridge - I have one, but haven't had time to review it yet, so I make no assertions about its competence. The colour bulbs are also disappointing, with much dimmer colour output than white output.

Zigbee


Advantages: Zigbee is a mesh protocol, so bulbs can forward messages to each other. The bulbs are also pretty cheap. Zigbee is a standard, so you can obtain bulbs from several vendors that will then interoperate - unfortunately there are actually two separate standards for Zigbee bulbs, and you'll sometimes find yourself with incompatibility issues there.
Disadvantages: Your phone doesn't have a Zigbee radio, so you can't communicate with the bulbs directly. You'll need a hub of some sort to bridge between IP and Zigbee. The ecosystem is kind of a mess, and you may have weird incompatibilities.
Which should you get: Pretty much every vendor that produces Zigbee bulbs also produces a hub for them. Don't get the Sengled hub - anyone on the local network can perform arbitrary unauthenticated command execution on it. I've previously recommended the Ikea Tradfri, which at the time only had local control. They've since added remote control support, and I haven't investigated that in detail. But overall, I'd go with the Philips Hue. Their colour bulbs are simply the best on the market, and their security story seems solid - performing a factory reset on the hub generates a new keypair, and adding local control users requires a physical button press on the hub to allow pairing. Using the Philips hub doesn't tie you into only using Philips bulbs, but right now the Philips bulbs tend to be as cheap (or cheaper) than anything else.

But what about


If you're into tying together all kinds of home automation stuff, then either go with Smartthings or roll your own with Home Assistant. Both are definitely more effort if you only want lighting.

My priority is software freedom


Excellent! There are various bulbs that can run the Espurna or AiLight firmwares, but you'll have to deal with flashing them yourself. You can tie that into Home Assistant and have a completely free stack. If you're ok with your bulbs being proprietary, Home Assistant can speak to most types of bulb without an additional hub (you'll need a supported Zigbee USB stick to control Zigbee bulbs), and will support the C by GE ones as soon as I figure out why my Bluetooth transmissions stop working every so often.

Conclusion


Outside niche cases, just buy a Hue. Philips have done a genuinely good job. Don't buy cheap wifi bulbs. Don't buy a Sengled hub.

(Disclaimer: I mentioned a Google product above. I am a Google employee, but do not work on anything related to Home.)


libinput and tablet proximity handling

Posted by Peter Hutterer on June 19, 2019 12:34 AM

This is merely an update on the current status quo; if you read this post in a year's time, some of the details may have changed.

libinput provides an API to handle graphics tablets, i.e. the tablets that are used by artists. The interface is based around tools, each of which can be in proximity at any time. "Proximity" simply means "in detectable range". libinput promises that any interaction is framed by a proximity in and proximity out event pair, but getting to this turned out to be complicated. libinput has seen a few changes recently here, so let's dig into those. Remember that proverb about seeing what goes into a sausage? Yeah, that.

In the kernel API, the proximity events for pens are the BTN_TOOL_PEN bit. If it's 1, we're in proximity, if it's 0, we're out of proximity. That's the theory.
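
If you want to watch these bits yourself, the third-party python-evdev module makes that easy; the device node below is an assumption, so substitute whichever event node belongs to your tablet's pen:

# Watch raw pen proximity bits with the third-party python-evdev module.
# /dev/input/event5 is an assumption - use your tablet's pen event node.
import evdev
from evdev import ecodes

dev = evdev.InputDevice("/dev/input/event5")
for event in dev.read_loop():
    if event.type == ecodes.EV_KEY and event.code == ecodes.BTN_TOOL_PEN:
        print("proximity in" if event.value else "proximity out")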

Wacom tablets (or rather the kernel driver) always reset all axes on proximity out. So libinput needs to take care not to send a 0 value to the caller, unless you want a jump to the top left corner every time you move the pen away from the tablet. Some Wacom pens have serial numbers and we use those to uniquely identify a tool. But some devices start sending proximity and axis events before we get the serial numbers, which means we can't identify the tool until several ms later. In that case we simply discard the serial. This means we cannot uniquely identify those pens but so far no-one has complained.

A bunch of tablets (HUION) don't have proximity at all. For those, we start getting events and then stop getting events, without any other information. So libinput has a timer - if we don't get events for a given time, we force a proximity out. Of course, this means we also need to force a proximity in when the next event comes in. These tablets are common enough that recently we just enabled the proximity timeout for all tablets. Easier than playing whack-a-mole, doubly so because HUION re-uses USB ids so you can't easily identify them anyway.

Some tablets (HP Spectre 13) have proximity but never send it. So they advertise the capability, just don't generate events for it. Same handling as the ones that don't have proximity at all.

Some tablets (HUION) have proximity, but only send it once per plug-in, after that it's always in proximity. Since libinput may start after the first pen interaction, this means we have to a) query the initial state of the device and b) force proximity in/out based on the timer, just like above.

Some tablets (Lenovo Flex 5) sometimes send proximity out events, but sometimes do not. So for those we have a timer and forced proximity events, but only when our last interaction didn't trigger a proximity event.

The Dell Active Pen always sends a proximity out event, but with a delay of ~200ms. That timeout is longer than the libinput timeout so we'll get a proximity out event, but only after we've already forced proximity out. We can just discard that event.

The Dell Canvas pen (identifies as "Wacom HID 4831 Pen") can have random delays of up to ~800ms in its event reporting, which would trigger forced proximity out events in libinput. Luckily it always sends proximity out events, so we could add a quirk to specifically disable the timer for this device.

The HP Envy x360 sends a proximity in for the pen, followed by a proximity in from the eraser in the next event. This is still an unresolved issue at the time of writing.

That's the current state of things; I'm sure it'll change again in a few months' time as more devices decide to be creative. They are artists' tools, after all.

The lesson to take away here: all of the above are special cases that need to be implemented but this can only be done on demand. There's no way any one person can test every single device out there and testing by vendors is often nonexistent. So if you want your device to work, don't complain on some random forum, file a bug and help with debugging and testing instead.

libinput and the Dell Canvas Totem

Posted by Peter Hutterer on June 18, 2019 11:37 PM

We're on the road to he^libinput 1.14 and last week I merged the Dell Canvas Totem support. "Wait, what?" I hear you ask, and "What is that?". Good question - but do pay more attention to random press releases. The Totem (Dell.com) is a round knob that can be placed on the Dell Canvas, which itself is a pen and touch device, not unlike the Wacom Cintiq range if you're familiar with those (if not, there's always lmgtfy).

The totem's intended use is as a secondary device - you place it on the screen while you're using the pen and up pops a radial menu. You can rotate the totem to select items, click it to select something and bang, you're smiling like a stock photo model eating lettuce. The radial menu is just an example UI; there are plenty of others. I remember reading papers about bimanual interaction with similar interfaces that dated back to the 80s, so there's a plethora to choose from. I'm sure someone at Dell has written Totem-Pong and if they have not, I really question their job priorities. The technical side is quite simple: the totem triggers a set of touches in a specific configuration, and when the firmware detects that arrangement it knows this isn't a finger but the totem.

Pen and touch we already handle well, but the totem required kernel changes and a few new interfaces in libinput. And that was the easy part, the actual UI bits will be nasty.

The kernel changes went into 4.19 and as usual you can throw noises of gratitude at Benjamin Tissoires. The new kernel API basically boils down to the ABS_MT_TOOL_TYPE axis sending MT_TOOL_DIAL whenever the totem is detected. That axis is (like others of the ABS_MT range) an odd one out. It doesn't work as an axis but rather as an enum that specifies the tool within the current slot. We already had finger, pen and palm; adding another enum value means, well, now we have a "dial". And that's largely it in terms of API - handle the MT_TOOL_DIAL and you're good to go.
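
As a quick illustration of the kernel side, you can watch for the new tool type with the third-party python-evdev module; the device node is an assumption, and older ecodes builds may not know the MT_TOOL_DIAL name yet, hence the fallback value:

# Watch for the totem appearing in a touch slot; /dev/input/event7 is an
# assumption - use your Canvas touch event node.
import evdev
from evdev import ecodes

MT_TOOL_DIAL = getattr(ecodes, "MT_TOOL_DIAL", 0x0a)  # value from linux/input.h

dev = evdev.InputDevice("/dev/input/event7")
for event in dev.read_loop():
    if event.type == ecodes.EV_ABS and event.code == ecodes.ABS_MT_TOOL_TYPE:
        if event.value == MT_TOOL_DIAL:
            print("totem detected in the current slot")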

libinput's API is only slightly more complicated. The tablet interface has a new tool type called the LIBINPUT_TABLET_TOOL_TYPE_TOTEM and a new pair of axes for the tool, the size of the touch ellipse. With that you can get the position of the totem and the size (so you know how big the radial menu needs to be). And that's basically it in regards to the API. The actual implementation was a bit more involved, especially because we needed to implement location-based touch arbitration first.

I haven't started on the Wayland protocol additions yet but I suspect they'll look the same as the libinput API (the Wayland tablet protocol is itself virtually identical to the libinput API). The really big changes will of course be in the toolkits and the applications themselves. The totem is not a device that slots into existing UI paradigms, it requires dedicated support. Whether this will be available in your favourite application is likely going to be up to you. Anyway, christmas in July [1] is coming up so now you know what to put on your wishlist.

[1] yes, that's a thing. Apparently christmas with summery temperature, nice weather, sandy beaches is so unbearable that you have to re-create it in the misery of winter. Explains everything you need to know about humans, really.

WOGUE is no friend of GNOME

Posted by Richard Hughes on June 09, 2019 08:18 PM

Alex Diavatis is the person behind the WOGUE account on YouTube. For a while he’s been posting videos about GNOME. I think the latest idea is that he’s trying to “shame” developers into working harder. Speaking as the person who is once again on the other end of his rants, it’s having the opposite effect.

We’re all doing our best, and I’m personally balancing about a dozen different plates trying to keep them all spinning. If any of the plates fall on the floor, perhaps helping with triaging bugs, fixing little niggles or just saying something positive might be a good idea. In fact, saying nothing would be better than the sarcasm and making silly videos.

Breaking apart Dell UEFI Firmware CapsuleUpdate packages

Posted by Richard Hughes on June 02, 2019 12:10 PM

When firmware is uploaded to the LVFS we perform online checks on it. For example, one of the tests is looking for known badness like embedded UTF-8/UTF-16 BEGIN RSA PRIVATE KEY strings. As part of this we use CHIPSEC (in the form of chipsec_util -n uefi decode), which searches the binary for a UEFI volume header (a simple _FVH string) and then decompresses the volumes, which we then read back as component shards. This works well on plain EDK2 firmware, and on the packages uploaded by Lenovo and HP which use the AMI and Phoenix IBVs. The nice side effect is that we can show the user what binaries have changed, as the vendor might have accidentally forgotten to mention something in the release notes.
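
That particular test is conceptually trivial; a minimal sketch (not the actual LVFS code) looks like this:

# Minimal sketch of the "embedded private key" check described above;
# not the actual LVFS implementation.
MARKER = "BEGIN RSA PRIVATE KEY"

def contains_private_key(blob):
    return MARKER.encode("utf-8") in blob or MARKER.encode("utf-16-le") in blob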

The elephants in the room were all the hundreds of archives from Dell which could not be loaded by chipsec because no volume header was detected. I spent a few hours last night adding support for these archives, and the secret is here:

  1. Decompress the firmware.cab archive into firmware.bin, disregarding the signing and metadata.
  2. If CHIPSEC fails to analyse firmware.bin, look for a > 512kB decompress-able Zlib section somewhere after the capsule header, actually in the PE binary.
  3. The decompressed blob is in PFS format, which seems to be some Dell-specific format that’s already been reverse engineered.
  4. The PFS blob is not further compressed and is in one continuous block, and so the entire PFS volume can be passed to chipsec for analysis.

The Zlib start offset seems to jump around for each release, and I’ve not found any information in the original PE file that indicates the offset. If anyone wants to give me a hint to avoid searching the multimegabyte blob for two bytes (and then testing if it’s just chance, or indeed a Zlib stream…) I would be very happy, even if you have to remain anonymous.
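
Absent a better hint, the brute-force search can be sketched like this; it is just the idea described above, not the actual LVFS code:

# Scan for a plausible Zlib header (0x78 0xda is the common "best
# compression" magic; 0x78 0x9c and 0x78 0x01 also exist) and keep
# anything that inflates to a > 512 kB blob - that blob is the PFS volume.
import zlib

def find_zlib_payload(blob, min_size=512 * 1024):
    offset = blob.find(b"\x78\xda")
    while offset != -1:
        try:
            data = zlib.decompressobj().decompress(blob[offset:])
            if len(data) >= min_size:
                return offset, data
        except zlib.error:
            pass  # just two bytes that happened to look like a header
        offset = blob.find(b"\x78\xda", offset + 1)
    return None, None

with open("firmware.bin", "rb") as f:
    offset, pfs = find_zlib_payload(f.read())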

So, to sum up:

CapsuleHeader
  PE Binary
    Zlib stream
      PFS
        FVH
          PE DXEs
          PE PEIMs
          …

I’ll see if chipsec upstream wants a patch to do this as it’s probably useful outside of the LVFS too.