Fedora summer-coding Planet

Using Rust Generics to Enforce DB Record State

Posted by William Brown on April 12, 2019 02:00 PM

Using Rust Generics to Enforce DB Record State

In a database, entries go through a lifecycle which represents what attributes they have have, db record keys, and if they have conformed to schema checking.

I’m currently working on a (private in 2019, public in july 2019) project which is a NoSQL database writting in Rust. To help us manage the correctness and lifecycle of database entries, I have been using advice from the Rust Embedded Group’s Book.

As I have mentioned in the past, state machines are a great way to design code, so let’s plot out the state machine we have for Entries:

Entry State Machine

The lifecyle is:

  • A new entry is submitted by the user for creation
  • We schema check that entry
  • If it passes schema, we commit it and assign internal ID’s
  • When we search the entry, we retrieve it by internal ID’s
  • When we modify the entry, we need to recheck it’s schema before we commit it back
  • When we delete, we just remove the entry.

This leads to a state machine of:

             (create operation)
            [ New + Invalid ] -(schema check)-> [ New + Valid ]
                                               (send to backend)
                                                      v    v-------------\
[Commited + Invalid] <-(modify operation)- [ Commited + Valid ]          |
          |                                          ^   \       (write to backend)
          \--------------(schema check)-------------/     ---------------/

This is a bit rough - The version on my whiteboard was better :)

The main observation is that we are focused only on the commitability and validty of entries - not about where they are or if the commit was a success.

Entry Structs

So to make these states work we have the following structs:

struct EntryNew;
struct EntryCommited;

struct EntryValid;
struct EntryInvalid;

struct Entry<STATE, VALID> {
    state: STATE,
    valid: VALID,
    // Other db junk goes here :)

We can then use these to establish the lifecycle with functions (similar) to this:

impl Entry<EntryNew, EntryInvalid> {
    fn new() -> Self {
        Entry {
            state: EntryNew,
            valid: EntryInvalid,


impl<STATE> Entry<STATE, EntryInvalid> {
    fn validate(self, schema: Schema) -> Result<Entry<STATE, EntryValid>, ()> {
        if schema.check(self) {
            Ok(Entry {
                state: self.state,
                valid: EntryValid,
        } else {

    fn modify(&mut self, ...) {
        // Perform any modifications on the entry you like, only works
        // on invalidated entries.

impl<STATE> Entry<STATE, EntryValid> {
    fn seal(self) -> Entry<EntryCommitted, EntryValid> {
        // Assign internal id's etc
        Entry {
            state: EntryCommited,
            valid: EntryValid,

    fn compare(&self, other: Entry<STATE, EntryValid>) -> ... {
        // Only allow compares on schema validated/normalised
        // entries, so that checks don't have to be schema aware
        // as the entries are already in a comparable state.

impl Entry<EntryCommited, EntryValid> {
    fn invalidate(self) -> Entry<EntryCommited, EntryInvalid> {
        // Invalidate an entry, to allow modifications to be performed
        // note that modifications can only be applied once an entry is created!
        Entry {
            state: self.state,
            valid: EntryInvalid,

What this allows us to do importantly is to control when we apply search terms, send entries to the backend for storage and more. Benefit is this is compile time checked, so you can never send an entry to a backend that is not schema checked, or run comparisons or searches on entries that aren’t schema checked, and you can even only modify or delete something once it’s created. For example other parts of the code now have:

impl BackendStorage {
    // Can only create if no db id's are assigned, IE it must be new.
    fn create(&self, ..., entry: Entry<EntryNew, EntryValid>) -> Result<...> {

    // Can only modify IF it has been created, and is validated.
    fn modify(&self, ..., entry: Entry<EntryCommited, EntryValid>) -> Result<...> {

    // Can only delete IF it has been created and committed.
    fn delete(&self, ..., entry: Entry<EntryCommited, EntryValid>) -> Result<...> {

impl Filter<STATE> {
    // Can only apply filters (searches) if the entry is schema checked. This has an
    // important behaviour, where we can schema normalise. Consider a case-insensitive
    // type, we can schema-normalise this on the entry, then our compare can simply
    // be a string.compare, because we assert both entries *must* have been through
    // the normalisation routines!
    fn apply_filter(&self, ..., entry: &Entry<STATE, EntryValid>) -> Result<bool, ...> {

Using this with Serde?

I have noticed that when we serialise the entry, that this causes the valid/state field to not be compiled away - because they have to be serialised, regardless of the empty content meaning the compiler can’t eliminate them.

A future cleanup will be to have a serialised DBEntry form such as the following:

struct DBEV1 {
    // entry data here

enum DBEntryVersion {

struct DBEntry {
    data: DBEntryVersion

impl From<Entry<EntryNew, EntryValid>> for DBEntry {
    fn from(e: Entry<EntryNew, EntryValid>) -> Self {
        // assign db id's, and return a serialisable entry.

impl From<Entry<EntryCommited, EntryValid>> for DBEntry {
    fn from(e: Entry<EntryCommited, EntryValid>) -> Self {
        // Just translate the entry to a serialisable form

This way we still have the zero-cost state on Entry, but we are able to move to a versioned seralised structure, and we minimise the run time cost.

Testing the Entry

To help with testing, I needed to be able to shortcut and move between anystate of the entry so I could quickly make fake entries, so I added some unsafe methods:

unsafe fn to_new_valid(self, Entry<EntryNew, EntryInvalid>) -> {
    Entry {
        state: EntryNew,
        valid: EntryValid

These allow me to setup and create small unit tests where I may not have a full backend or schema infrastructure, so I can test specific aspects of the entries and their lifecycle. It’s limited to test runs only, and marked unsafe. It’s not “technically” memory unsafe, but it’s unsafe from the view of “it could absolutely mess up your database consistency guarantees” so you have to really want it.


Using statemachines like this, really helped me to clean up my code, make stronger assertions about the correctness of what I was doing for entry lifecycles, and means that I have more faith when I and future-contributors will work on the code base that we’ll have compile time checks to ensure we are doing the right thing - to prevent data corruption and inconsistency.

Debugging MacOS bluetooth audio stutter

Posted by William Brown on April 07, 2019 02:00 PM

Debugging MacOS bluetooth audio stutter

I was noticing that audio to my bluetooth headphones from my iPhone was always flawless, but I started to noticed stutter and drops from my MBP. After exhausting some basic ideas, I was stumped.

To the duck duck go machine, and I searched for issues with bluetooth known issues. Nothing appeared.

However, I then decided to debug the issue - thankfully there was plenty of advice on this matter. Press shift + option while clicking bluetooth in the menu-bar, and then you have a debug menu. You can also open Console.app and search for “bluetooth” to see all the bluetooth related logs.

I noticed that when the audio stutter occured that the following pattern was observed.

default     11:25:45.840532 +1000   wirelessproxd   About to scan for type: 9 - rssi: -90 - payload: <00000000 00000000 00000000 00000000 00000000 0000> - mask: <00000000 00000000 00000000 00000000 00000000 0000> - peers: 0
default     11:25:45.840878 +1000   wirelessproxd   Scan options changed: YES
error       11:25:46.225839 +1000   bluetoothaudiod Error sending audio packet: 0xe00002e8
error       11:25:46.225899 +1000   bluetoothaudiod Too many outstanding packets. Drop packet of 8 frames (total drops:451 total sent:60685 percentDropped:0.737700) Outstanding:17

There was always a scan, just before the stutter initiated. So what was scanning?

I searched for the error related to packets, and there were a lot of false leads. From weird apps to dodgy headphones. In this case I could eliminate both as the headphones worked with other devices, and I don’t have many apps installed.

So I went back and thought about what macOS services could be the problem, and I found that airdrop would scan periodically for other devices to send and recieve files. Disabling Airdrop from the sharing menu in System Prefrences cleared my audio right up.

GDB autoloads for 389 DS

Posted by William Brown on April 02, 2019 01:00 PM

GDB autoloads for 389 DS

I’ve been writing a set of extensions to help debug 389-ds a bit easier. Thanks to the magic of python, writing GDB extensions is really easy.

On OpenSUSE, when you start your DS instance under GDB, all of the extensions are automatically loaded. This will help make debugging a breeze.

zypper in 389-ds gdb
gdb /usr/sbin/ns-slapd
GNU gdb (GDB; openSUSE Tumbleweed) 8.2
(gdb) ds-
ds-access-log  ds-backtrace
(gdb) set args -d 0 -D /etc/dirsrv/slapd-<instance name>
(gdb) run

All the extensions are under the ds- namespace, so they are easy to find. There are some new ones on the way, which I’ll discuss here too:


As DS is a multithreaded process, it can be really hard to find the active thread involved in a problem. So we provided a command that knows how to fold duplicated stacks, and to highlight idle threads that you can (generally) skip over.

Thread 37 (LWP 70054))
Thread 36 (LWP 70053))
Thread 35 (LWP 70052))
Thread 34 (LWP 70051))
Thread 33 (LWP 70050))
Thread 32 (LWP 70049))
Thread 31 (LWP 70048))
Thread 30 (LWP 70047))
Thread 29 (LWP 70046))
Thread 28 (LWP 70045))
Thread 27 (LWP 70044))
Thread 26 (LWP 70043))
Thread 25 (LWP 70042))
Thread 24 (LWP 70041))
Thread 23 (LWP 70040))
Thread 22 (LWP 70039))
Thread 21 (LWP 70038))
Thread 20 (LWP 70037))
Thread 19 (LWP 70036))
Thread 18 (LWP 70035))
Thread 17 (LWP 70034))
Thread 16 (LWP 70033))
Thread 15 (LWP 70032))
Thread 14 (LWP 70031))
Thread 13 (LWP 70030))
Thread 12 (LWP 70029))
Thread 11 (LWP 70028))
Thread 10 (LWP 70027))
#0  0x00007ffff65db03c in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libpthread.so.0
#1  0x00007ffff66318b0 in PR_WaitCondVar () at /usr/lib64/libnspr4.so
#2  0x00000000004220e0 in [IDLE THREAD] connection_wait_for_new_work (pb=0x608000498020, interval=4294967295) at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:970
#3  0x0000000000425a31 in connection_threadmain () at /home/william/development/389ds/ds/ldap/servers/slapd/connection.c:1536
#4  0x00007ffff6637484 in None () at /usr/lib64/libnspr4.so
#5  0x00007ffff65d4fab in start_thread () at /lib64/libpthread.so.0
#6  0x00007ffff6afc6af in clone () at /lib64/libc.so.6

This example shows that there are 17 idle threads (look at frame 2) here, that all share the same trace.


The access log is buffered before writing, so if you have a coredump, and want to see the last few events before they were written to disk, you can use this to display the content:

(gdb) ds-access-log
===== BEGIN ACCESS LOG =====
$2 = 0x7ffff3c3f800 "[03/Apr/2019:10:58:42.836246400 +1000] conn=1 fd=64 slot=64 connection from to
[03/Apr/2019:10:58:42.837199400 +1000] conn=1 op=0 BIND dn=\"\" method=128 version=3
[03/Apr/2019:10:58:42.837694800 +1000] conn=1 op=0 RESULT err=0 tag=97 nentries=0 etime=0.0001200300 dn=\"\"
[03/Apr/2019:10:58:42.838881800 +1000] conn=1 op=1 SRCH base=\"\" scope=2 filter=\"(objectClass=*)\" attrs=ALL
[03/Apr/2019:10:58:42.839107600 +1000] conn=1 op=1 RESULT err=32 tag=101 nentries=0 etime=0.0001070800
[03/Apr/2019:10:58:42.840687400 +1000] conn=1 op=2 UNBIND
[03/Apr/2019:10:58:42.840749500 +1000] conn=1 op=2 fd=64 closed - U1
", '\276' <repeats 3470 times>

At the end the line that repeats shows the log is “empty” in that segment of the buffer.


This command shows the in-memory entry. It can be common to see Slapi_Entry * pointers in the codebase, so being able to display these is really helpful to isolate what’s occuring on the entry. Your first argument should be the Slapi_Entry pointer.

(gdb) ds-entry-print ec
Display Slapi_Entry: cn=config
cn: config
objectClass: top
objectClass: extensibleObject
objectClass: nsslapdConfig
nsslapd-schemadir: /opt/dirsrv/etc/dirsrv/slapd-standalone1/schema
nsslapd-lockdir: /opt/dirsrv/var/lock/dirsrv/slapd-standalone1
nsslapd-tmpdir: /tmp
nsslapd-certdir: /opt/dirsrv/etc/dirsrv/slapd-standalone1

Programming Lessons and Methods

Posted by William Brown on February 25, 2019 01:00 PM

Programming Lessons and Methods

Everyone has their own lessons and methods that they use when they approaching programming. These are the lessons that I have learnt, which I think are the most important when it comes to design, testing and communication.

Comments and Design

Programming is the art of writing human readable code, that a machine will eventually run. Your program needs to be reviewed, discussed and parsed by another human. That means you need to write your program in a way they can understand first.

Rather than rushing into code, and hacking until it works, I find it’s great to start with comments such as:

fn data_access(search: Search) -> Type {
    // First check the search is valid
    //  * No double terms
    //  * All schema is valid

    // Retrieve our data based on the search

    // if debug, do an un-indexed assert the search matches

    // Do any need transform

    // Return the data

After that, I walk away, think about the issue, come back, maybe tweak these comments. When I eventually fill in the code inbetween, I leave all the comments in place. This really helps my future self understand what I was thinking, but it also helps other people understand too.

State Machines

State machines are a way to design and reason about the states a program can be in. They allow exhaustive represenations of all possible outcomes of a function. A simple example is a microwave door.

  /----\            /----- close ----\          /-----\
  |     \          /                 v         v      |
  |    -------------                ---------------   |
open   | Door Open |                | Door Closed |  close
  |    -------------                ---------------   |
  |    ^          ^                  /          \     |
  \---/            \------ open ----/            \----/

When the door is open, opening it again does nothing. Only when the door is open, and we close the door (and event), does the door close (a transition). Once closed, the door can not be closed any more (event does nothing). It’s when we open the door now, that a state change can occur.

There is much more to state machines than this, but they allow us as humans to reason about our designs and model our programs to have all possible outcomes considered.

Zero, One and Infinite

In mathematics there are only three numbers that matter. Zero, One and Infinite. It turns out the same is true in a computer too.

When we are making a function, we can define limits in these terms. For example:

fn thing(argument: Type)

In this case, argument is “One” thing, and must be one thing.

fn thing(argument: Option<Type>)

Now we have argument as an option, so it’s “Zero” or “One”.

fn thing(argument: Vec<Type>)

Now we have argument as vec (array), so it’s “Zero” to “Infinite”.

When we think about this, our functions have to handle these cases properly. We don’t write functions that take a vec with only two items, we write a function with two arguments where each one must exist. It’s hard to handle “two” - it’s easy to handle two cases of “one”.

It also is a good guide for how to handle data sets, assuming they could always be infinite in size (or at least any arbitrary size).

You can then apply this to tests. In a test given a function of:

fn test_me(a: Option<Type>, b: Vec<Type>)

We know we need to test permutations of:

  • a is “Zero” or “One” (Some, None)
  • b is “Zero”, “One” or “Infinite” (.len() == 0, .len() == 1, .len() > 0)

Note: Most languages don’t have an array type that is “One to Infinite”, IE non-empty. If you want this condition (at least one item), you have to assert it yourself ontop of the type system.

Correct, Simple, Fast

Finally, we can put all these above tools together and apply a general philosophy. When writing a program, first make it correct, then simpify the program, then make it fast.

If you don’t do it in this order you will hit barriers - social and technical. For example, if you make something fast, simple, correct, you will likely have issues that can be fixed without making a decrease in performance. People don’t like it when you introduce a patch that drops performance, so as a result correctness is now sacrificed. (Spectre anyone?)

If you make something too simple, you may never be able to make it correctly handle all cases that exist in your application - likely facilitating a future rewrite to make it correct.

If you do correct, fast, simple, then your program will be correct, and fast, but hard for a human to understand. Because programming is the art of communicating intent to a person sacrificing simplicity in favour of fast will make it hard to involve new people and educate and mentor them into development of your project.

  • Correct: Does it behave correctly, handle all states and inputs correctly?
  • Simple: Is it easy to comprehend and follow for a human reader?
  • Fast: Is it performant?

Meaningful 2fa on modern linux

Posted by William Brown on February 11, 2019 01:00 PM

Meaningful 2fa on modern linux

Recently I heard of someone asking the question:

“I have an AD environment connected with <product> IDM. I want to have 2fa/mfa to my linux machines for ssh, that works when the central servers are offline. What’s the best way to achieve this?”

Today I’m going to break this down - but the conclusion for the lazy is:

This is not realistically possible today: use ssh keys with ldap distribution, and mfa on the workstations, with full disk encryption.


So there are a few parts here. AD is for intents and purposes an LDAP server. The <product> is also an LDAP server, that syncs to AD. We don’t care if that’s 389-ds, freeipa or vendor solution. The results are basically the same.

Now the linux auth stack is, and will always use pam for the authentication, and nsswitch for user id lookups. Today, we assume that most people run sssd, but pam modules for different options are possible.

There are a stack of possible options, and they all have various flaws.

  • FreeIPA + 2fa
  • PAM TOTP modules
  • PAM radius to a TOTP server
  • Smartcards

FreeIPA + 2fa

Now this is the one most IDM people would throw out. The issue here is the person already has AD and a vendor product. They don’t need a third solution.

Next is the fact that FreeIPA stores the TOTP in the LDAP, which means FreeIPA has to be online for it to work. So this is eliminated by the “central servers offline” requirement.

PAM radius to TOTP server

Same as above: An extra product, and you have a source of truth that can go down.

PAM TOTP module on hosts

Okay, even if you can get this to scale, you need to send the private seed material of every TOTP device that could login to the machine, to every machine. That means any compromise, compromises every TOTP token on your network. Bad place to be in.


Are notoriously difficult to have functional, let alone with SSH. Don’t bother. (Where the Smartcard does TLS auth to the SSH server this is.)

Come on William, why are you so doom and gloom!

Lets back up for a second and think about what we we are trying to prevent by having mfa at all. We want to prevent single factor compromise from having a large impact and we want to prevent brute force attacks. (There are probably more reasons, but these are the ones I’ll focus on).

So the best answer: Use mfa on the workstation (password + totp), then use ssh keys to the hosts.

This means the target of the attack is small, and the workstation can be protected by things like full disk encryption and group policy. To sudo on the host you still need the password. This makes sudo MFA to root as you need something know, and something you have.

If you are extra conscious you can put your ssh keys on smartcards. This works on linux and osx workstations with yubikeys as I am aware. Apparently you can have ssh keys in TPM, which would give you tighter hardware binding, but I don’t know how to achieve this (yet).

To make all this better, you can distributed your ssh public keys in ldap, which means you gain the benefits of LDAP account locking/revocation, you can remove the keys instantly if they are breached, and you have very little admin overhead to configuration of this service on the linux server side. Think about how easy onboarding is if you only need to put your ssh key in one place and it works on every server! Let alone shutting down a compromised account: lock it in one place, and they are denied access to every server.

SSSD as the LDAP client on the server can also cache the passwords (hashed) and the ssh public keys, which means a disconnected client will still be able to be authenticated to.

At this point, because you have ssh key auth working, you could even deny password auth as an option in ssh altogether, eliminating an entire class of bruteforce vectors.

For bonus marks: You can use AD as the generic LDAP server that stores your SSH keys. No additional vendor products needed, you already have everything required today, for free. Everyone loves free.


If you want strong, offline capable, distributed mfa on linux servers, the only choice today is LDAP with SSH key distribution.

Want to know more? This blog contains how-tos on SSH key distribution for AD, SSH keys on smartcards, and how to configure SSSD to use SSH keys from LDAP.

Using the latest 389-ds on OpenSUSE

Posted by William Brown on January 29, 2019 01:00 PM

Using the latest 389-ds on OpenSUSE

Thanks to some help from my friend who works on OBS, I’ve finally got a good package in review for submission to tumbleweed. However, if you are impatient and want to use the “latest” and greatest 389-ds version on OpenSUSE (docker anyone?).

docker run -i -t opensuse/tumbleweed:latest
zypper ar obs://network:ldap network:ldap
zypper in 389-ds

Now, we still have an issue with “starting” from dsctl (we don’t really expect you to do it like this ….) so you have to make a tweak to defaults.inf:

vim /usr/share/dirsrv/inf/defaults.inf
# change the following to match:
with_systemd = 0

After this, you should now be able to follow our new quickstart guide on the 389-ds website.

I’ll try to keep this repo up to date as much as possible, which is great for testing and early feedback to changes!

EDIT: Updated 2019-04-03 to change repo as changes have progressed forward.

SUSE Open Build Service cheat sheet

Posted by William Brown on January 18, 2019 01:00 PM

SUSE Open Build Service cheat sheet

Part of starting at SUSE has meant that I get to learn about Open Build Service. I’ve known that the project existed for a long time but I have never had a chance to use it. So far I’m thoroughly impressed by how it works and the features it offers.

As A Consumer

The best part of OBS is that it’s trivial on OpenSUSE to consume content from it. Zypper can add projects with the command:

zypper ar obs://<project name> <repo nickname>
zypper ar obs://network:ldap network:ldap

I like to give the repo nickname (your choice) to be the same as the project name so I know what I have enabled. Once you run this you can easily consume content from OBS.

Package Management

As someone who has started to contribute to the suse 389-ds package, I’ve been slowly learning how this work flow works. OBS similar to GitHub/Lab allows a branching and request model.

On OpenSUSE you will want to use the osc tool for your workflow:

zypper in osc
# If you plan to use the "service" command
zypper in obs-service-tar obs-service-obs_scm obs-service-recompress obs-service-set_version obs-service-download_files

You can branch from an existing project to make changes with:

osc branch <project> <package>
osc branch network:ldap 389-ds

This will branch the project to my home namespace. For me this will land in “home:firstyear:branches:network:ldap”. Now I can checkout the content on to my machine to work on it.

osc co <project>
osc co home:firstyear:branches:network:ldap

This will create the folder “home:…:ldap” in the current working directory.

From here you can now work on the project. Some useful commands are:

Add new files to the project (patches, new source tarballs etc).

osc add <path to file>
osc add feature.patch
osc add new-source.tar.xz

Edit the change log of the project (I think this is used in release notes?)

osc vc

To ammend your changes, use:

osc vc -e

Build your changes locally matching the system you are on. Packages normally build on all/most OpenSUSE versions and architectures, this will build just for your local system and arch.

osc build

Make sure you clean up files you aren’t using any more with:

osc rm <filename>
# This commands removes anything untracked by osc.
osc clean

Commit your changes to the OBS server, where a complete build will be triggered:

osc commit

View the results of the last commit:

osc results

Enable people to use your branch/project as a repository. You edit the project metadata and enable repo publishing:

osc meta prj -e <name of project>
osc meta prj -e home:firstyear:branches:network:ldap

# When your editor opens, change this section to enabled (disabled by default):
  <enabled />

NOTE: In some cases if you have the package already installed, and you add the repo/update it won’t install from your repo. This is because in SUSE packages have a notion of “vendoring”. They continue to update from the same repo as they were originally installed from. So if you want to change this you use:

zypper [d]up --from <repo name>

You can then create a “request” to merge your branch changes back to the project origin. This is:

osc sr

A helpful maintainer will then review your changes. You can see this with.

osc rq show <your request id>

If you change your request, to submit again, use:

osc sr

And it will ask if you want to replace (supercede) the previous request.

I was also helped by a friend to provie a “service” configuration that allows generation of tar balls from git. It’s not always appropriate to use this, but if the repo has a “_service” file, you can regenerate the tar with:

osc service ra

So far this is as far as I have gotten with OBS, but I already appreciate how great this work flow is for package maintainers, reviewers and consumers. It’s a pleasure to work with software this well built.

As an additional piece of information, it’s a good idea to read the OBS Packaging Guidelines
to be sure that you are doing the right thing!

Structuring Rust Transactions

Posted by William Brown on January 18, 2019 01:00 PM

Structuring Rust Transactions

I’ve been working on a database-related project in Rust recently, which takes advantage of my concurrently readable datastructures. However I ran into a problem of how to structure Read/Write transaction structures that shared the reader code, and container multiple inner read/write types.

Some Constraints

To be clear, there are some constraints. A “parent” write, will only ever contain write transaction guards, and a read will only ever contain read transaction guards. This means we aren’t going to hit any deadlocks in the code. Rust can’t protect us from mis-ording locks. An additional requirement is that readers and a single write must be able to proceed simultaneously - but having a rwlock style writer or readers behaviour would still work here.

Some Background

To simplify this, imagine we have two concurrently readable datastructures. We’ll call them db_a and db_b.

struct db_a { ... }

struct db_b { ... }

Now, each of db_a and db_b has their own way to protect their inner content, but they’ll return a DBWriteGuard or DBReadGuard when we call db_a.read()/write() respectively.

impl db_a {
    pub fn read(&self) -> DBReadGuard {

    pub fn write(&self) -> DBWriteGuard {

Now we make a “parent” wrapper transaction such as:

struct server {
    a: db_a,
    b: db_b,

struct server_read {
    a: DBReadGuard,
    b: DBReadGuard,

struct server_write {
    a: DBWriteGuard,
    b: DBWriteGuard,

impl server {
    pub fn read(&self) -> server_read {
        server_read {

    pub fn write(&self) -> server_write {
        server_read {

The Problem

Now the problem is that on my server_read and server_write I want to implement a function for “search” that uses the same code. Search or a read or write should behave identically! I wanted to also avoid the use of macros as the can hide issues while stepping in a debugger like LLDB/GDB.

Often the answer with rust is “traits”, to create an interface that types adhere to. Rust also allows default trait implementations, which sounds like it could be a solution here.

pub trait server_read_trait {
    fn search(&self) -> SomeResult {
        let result_a = self.a.search(...);
        let result_b = self.b.search(...);
        SomeResult(result_a, result_b)

In this case, the issue is that &self in a trait is not aware of the fields in the struct - traits don’t define that fields must exist, so the compiler can’t assume they exist at all.

Second, the type of self.a/b is unknown to the trait - because in a read it’s a “a: DBReadGuard”, and for a write it’s “a: DBWriteGuard”.

The first problem can be solved by using a get_field type in the trait. Rust will also compile this out as an inline, so the correct thing for the type system is also the optimal thing at run time. So we’ll update this to:

pub trait server_read_trait {
    fn get_a(&self) -> ???;

    fn get_b(&self) -> ???;

    fn search(&self) -> SomeResult {
        let result_a = self.get_a().search(...); // note the change from self.a to self.get_a()
        let result_b = self.get_b().search(...);
        SomeResult(result_a, result_b)

impl server_read_trait for server_read {
    fn get_a(&self) -> &DBReadGuard {
    // get_b is similar, so ommitted

impl server_read_trait for server_write {
    fn get_a(&self) -> &DBWriteGuard {
    // get_b is similar, so ommitted

So now we have the second problem remaining: for the server_write we have DBWriteGuard, and read we have a DBReadGuard. There was a much longer experimentation process, but eventually the answer was simpler than I was expecting. Rust allows traits to have Self types that enforce trait bounds rather than a concrete type.

So provided that DBReadGuard and DBWriteGuard both implement “DBReadTrait”, then we can have the server_read_trait have a self type that enforces this. It looks something like:

pub trait DBReadTrait {
    fn search(&self) -> ...;

impl DBReadTrait for DBReadGuard {
    fn search(&self) -> ... { ... }

impl DBReadTrait for DBWriteGuard {
    fn search(&self) -> ... { ... }

pub trait server_read_trait {
    type GuardType: DBReadTrait; // Say that GuardType must implement DBReadTrait

    fn get_a(&self) -> &Self::GuardType; // implementors must return that type implementing the trait.

    fn get_b(&self) -> &Self::GuardType;

    fn search(&self) -> SomeResult {
        let result_a = self.get_a().search(...);
        let result_b = self.get_b().search(...);
        SomeResult(result_a, result_b)

impl server_read_trait for server_read {
    fn get_a(&self) -> &DBReadGuard {
    // get_b is similar, so ommitted

impl server_read_trait for server_write {
    fn get_a(&self) -> &DBWriteGuard {
    // get_b is similar, so ommitted

This works! We now have a way to write a single “search” type for our server read and write types. In my case, the DBReadTrait also uses a similar technique to define a search type shared between the DBReadGuard and DBWriteGuard.

Useful USG pro 4 commands and hints

Posted by William Brown on January 01, 2019 01:00 PM

Useful USG pro 4 commands and hints

I’ve recently changed from a FreeBSD vm as my router to a Ubiquiti PRO USG4. It’s a solid device, with many great features, and I’m really impressed at how it “just works” in many cases. So far my only disappointment is lack of documentation about the CLI, especially for debugging and auditing what is occuring in the system, and for troubleshooting steps. This post will aggregate some of my knowledge about the topic.

Current config

Show the current config with:

mca-ctrl -t dump-cfg

You can show system status with the “show” command. Pressing ? will cause the current compeletion options to be displayed. For example:

# show <?>
arp              date             dhcpv6-pd        hardware


The following commands show the DNS statistics, the DNS configuration, and allow changing the cache-size. The cache-size is measured in number of records cached, rather than KB/MB. To make this permanent, you need to apply the change to config.json in your controllers sites folder.

show dns forwarding statistics
show system name-server
set service dns forwarding cache-size 10000
clear dns forwarding cache


You can see and aggregate of system logs with

show log

Note that when you set firewall rules to “log on block” they go to dmesg, not syslog, so as a result you need to check dmesg for these.

It’s a great idea to forward your logs in the controller to a syslog server as this allows you to aggregate and see all the events occuring in a single time series (great when I was diagnosing an issue recently).


To show the system interfaces

show interfaces

To restart your pppoe dhcp6c:

release dhcpv6-pd interface pppoe0
renew dhcpv6-pd interface pppoe0

There is a current issue where the firmware will start dhcp6c on eth2 and pppoe0, but the session on eth2 blocks the pppoe0 client. As a result, you need to release on eth2, then renew of pppoe0

If you are using a dynamic prefix rather than static, you may need to reset your dhcp6c duid.

delete dhcpv6-pd duid

To restart an interface with the vyatta tools:

disconnect interface pppoe
connect interface pppoe


I have setup customised OpenVPN tunnels. To show these:

show interfaces openvpn detail

These are configured in config.json with:

# Section: config.json - interfaces - openvpn
    "vtun0": {
            "encryption": "aes256",
            # This assigns the interface to the firewall zone relevant.
            "firewall": {
                    "in": {
                            "ipv6-name": "LANv6_IN",
                            "name": "LAN_IN"
                    "local": {
                            "ipv6-name": "LANv6_LOCAL",
                            "name": "LAN_LOCAL"
                    "out": {
                            "ipv6-name": "LANv6_OUT",
                            "name": "LAN_OUT"
            "mode": "server",
            # By default, ubnt adds a number of parameters to the CLI, which
            # you can see with ps | grep openvpn
            "openvpn-option": [
                    # If you are making site to site tunnels, you need the ccd
                    # directory, with hostname for the file name and
                    # definitions such as:
                    # iroute
                    "--client-config-dir /config/auth/openvpn/ccd",
                    "--keepalive 10 60",
                    "--user nobody",
                    "--group nogroup",
                    "--proto udp",
                    "--port 1195"
            "server": {
                    "push-route": [
                    "subnet": ""
            "tls": {
                    "ca-cert-file": "/config/auth/openvpn/vps/vps-ca.crt",
                    "cert-file": "/config/auth/openvpn/vps/vps-server.crt",
                    "dh-file": "/config/auth/openvpn/dh2048.pem",
                    "key-file": "/config/auth/openvpn/vps/vps-server.key"


Net flows allow a set of connection tracking data to be sent to a remote host for aggregation and analysis. Sadly this process was mostly undocumented, bar some useful forum commentors. Here is the process that I came up with. This is how you configure it live:

set system flow-accounting interface eth3.11
set system flow-accounting netflow server port 6500
set system flow-accounting netflow version 5
set system flow-accounting netflow sampling-rate 1
set system flow-accounting netflow timeout max-active-life 1

To make this persistent:

"system": {
            "flow-accounting": {
                    "interface": [
                    "netflow": {
                            "sampling-rate": "1",
                            "version": "5",
                            "server": {
                                    "": {
                                            "port": "6500"
                            "timeout": {
                                    "max-active-life": "1"

To show the current state of your flows:

show flow-accounting

The idea of CI and Engineering

Posted by William Brown on January 01, 2019 01:00 PM

The idea of CI and Engineering

In software development I see and interesting trend and push towards continuous integration, continually testing, and testing in production. These techniques are designed to allow faster feedback on errors, use real data for application testing, and to deliver features and changes faster.

But is that really how people use software on devices? When we consider an operation like google or amazon, this always online technique may work, but what happens when we apply a continous integration and “we’ll patch it later” mindset to devices like phones or internet of things?

What happens in other disciplines?

In real engineering disciplines like aviation or construction, techniques like this don’t really work. We don’t continually build bridges, then fix them when they break or collapse. There are people who provide formal analysis of materials, their characteristics. Engineers consider careful designs, constraints, loads and situations that may occur. The structure is planned, reviewed and verified mathematically. Procedures and oversight is applied to ensure correct building of the structure. Lessons are learnt from past failures and incidents and are applied into every layer of the design and construction process. Communication between engineers and many other people is critical to the process. Concerns are always addressed and managed.

The first thing to note is that if we just built lots of scale-model bridges and continually broke them until we found their limits, this would waste many resources to do this. Bridges are carefully planned and proven.

So whats the point with software?

Today we still have a mindset that continually breaking and building is a reasonable path to follow. It’s not! It means that the only way to achieve quality is to have a large test suite (requires people and time to write), which has to be further derived from failures (and those failures can negatively affect real people), then we have to apply large amounts of electrical energy to continually run the tests. The test suites can’t even guarantee complete coverage of all situations and occurances!

This puts CI techniques out of reach of many application developers due to time and energy (translated to dollars) limits. Services like travis on github certainly helps to lower the energy requirement, but it doesn’t stop the time and test writing requirements.

No matter how many tests we have for a program, if that program is written in C or something else, we continually see faults and security/stability issues in that software.

What if we CI on … a phone?

Today we even have hardware devices that are approached as though they “test in production” is a reasonable thing. It’s not! People don’t patch, telcos don’t allow updates out to users, and those that are aware, have to do custom rom deployment. This creates an odd dichomtemy of “haves” and “haves not”, of those in technical know how who have a better experience, and the “haves not” who have to suffer potentially insecure devices. This is especially terrifying given how deeply personal phones are.

This is a reality of our world. People do not patch. They do not patch phones, laptops, network devices and more. Even enterprises will avoid patching if possible. Rather than trying to shift the entire culture of humans to “update always”, we need to write software that can cope in harsh conditions, for long term. We only need to look to software in aviation to see we can absolutely achieve this!

What should we do?

I believe that for software developers to properly become software engineers we should look to engineers in civil and aviation industries. We need to apply:

  • Regualation and ethics (Safety of people is always first)
  • Formal verification
  • Consider all software will run long term (5+ years)
  • Improve team work and collaboration on designs and development

The reality of our world is people are deploying devices (routers, networks, phones, lights, laptops more …) where they may never be updated or patched in their service life. Even I’m guilty (I have a modem that’s been unpatched for about 6 years but it’s pretty locked down …). As a result we need to rely on proof that the device can not fail at build time, rather than patch it later which may never occur! Putting formal verification first, and always considering user safety and rights first, shifts a large burden to us in terms of time. But many tools (Coq, fstar, rust …) all make formal verification more accessible to use in our industry. Verifying our software is a far stronger assertion of quality than “throw tests at it and hope it works”.

You’re crazy William, and also wrong

Am I? Looking at “critical” systems like iPhone encryption hardware, they are running the formally verified Sel4. We also heard at Kiwicon in 2018 that Microsoft and XBox are using formal verification to design their low levels of their system to prevent exploits from occuring in the first place.

Over time our industry will evolve, and it will become easier and more cost effective to formally verify than to operate and deploy CI. This doesn’t mean we don’t need tests - it means that the first line of quality should be in verification of correctness using formal techniques rather than using tests and CI to prove correct behaviour. Tests are certainly still required to assert further behavioural elements of software.

Today, if you want to do this, you should be looking at Coq and program extraction, fstar and the kremlin (project everest, a formally verified https stack), Rust (which has a subset of the safe language formally proven). I’m sure there are more, but these are the ones I know off the top of my head.


Over time our industry must evolve to put the safety of humans first. To achive this we must look to other safety driven cultures such as aviation and civil engineering. Only by learning from their strict disciplines and behaviours can we start to provide software that matches behavioural and quality expectations humans have for software.

Nextcloud and badrequest filesize incorrect

Posted by William Brown on December 30, 2018 01:00 PM

Nextcloud and badrequest filesize incorrect

My friend came to my house and was trying to share some large files with my nextcloud instance. Part way through the upload an error occurred.

"Exception":"Sabre\\DAV\\Exception\\BadRequest","Message":"expected filesize 1768906752 got 1768554496"

It turns out this error can be caused by many sources. It could be timeouts, bad requests, network packet loss, incorrect nextcloud configuration or more.

We tried uploading larger files (by a factor of 10 times) and they worked. This eliminated timeouts as a cause, and probably network loss. Being on ethernet direct to the server generally also helps to eliminate packet loss as a cause compared to say internet.

We also knew that the server must not have been misconfigured because a larger file did upload, so no file or resource limits were being hit.

This also indicated that the client was likely doing the right thing because larger and smaller files would upload correctly. The symptom now only affected a single file.

At this point I realised, what if the client and server were both victims to a lower level issue? I asked my friend to ls the file and read me the number of bytes long. It was 1768906752, as expected in nextcloud.

Then I asked him to cat that file into a new file, and to tell me the length of the new file. Cat encountered an error, but ls on the new file indeed showed a size of 1768554496. That means filesystem corruption! What could have lead to this?


Apple’s legacy filesystem (and the reason I stopped using macs) is well known for silently eating files and corrupting content. Here we had yet another case of that damage occuring, and triggering errors elsewhere.

Bisecting these issues and eliminating possibilities through a scientific method is always the best way to resolve the cause, and it may come from surprising places!

Identity ideas …

Posted by William Brown on December 20, 2018 01:00 PM

Identity ideas …

I’ve been meaning to write this post for a long time. Taking half a year away from the 389-ds team, and exploring a lot of ideas from other projects has led me to come up with some really interesting ideas about what we do well, and what we don’t. I feel like this blog could be divisive, as I really think that for our services to stay relevant we need to make changes that really change our own identity - so that we can better represent yours.

So strap in, this is going to be long …

What’s currently on the market

Right now the market for identity has two extremes. At one end we have the legacy “create your own” systems, that are build on technologies like LDAP and Kerberos. I’m thinking about things like 389 Directory Server, OpenLDAP, Active Directory, FreeIPA and more. These all happen to be constrained heavily by complexity, fragility, and administrative workload. You need to spend months to learn these and even still, you will make mistakes and there will be problems.

At the other end we have hosted “Identity as a Service” options like Azure AD and Auth0. These have very intelligently, unbound themself from legacy, and tend to offer HTTP apis, 2fa and other features that “just work”. But they are all in the cloud, and outside your control.

But there is nothing in the middle. There is no option that “just works”, supports modern standards, and is unhindered by legacy that you can self deploy with minimal administrative fuss - or years of experience.

What do I like from 389?

  • Replication

The replication system is extremely robust, and has passed many complex tests for cases of eventual consistency correctness. It’s very rare to hear of any kind of data corruption or loss within our replication system, and that’s testament to the great work of people who spent years looking at the topic.

  • Performance

We aren’t as fast as OpenLDAP is 1 vs 1 server, but our replication scalability is much higher, where in any size of MMR or read-only replica topology, we have higher horizontal scaling, nearly linear based on server additions. If you want to run a cloud scale replicated database, we scale to it (and people already do this!).

  • Stability

Our server stability is well known with administrators, and honestly is a huge selling point. We see servers that only go down when administrators are performing upgrades. Our work with sanitising tools and the careful eyes of the team has ensured our code base is reliable and solid. Having extensive tests and amazing dedicated quality engineers also goes a long way.

  • Feature rich

There are a lot of features I really like, and are really useful as an admin deploying this service. Things like memberof (which is actually a group resolution cache when you think about it …), automember, online backup, unique attribute enforcement, dereferencing, and more.

  • The team

We have a wonderful team of really smart people, all of whom are caring and want to advance the state of identity management. Not only do they want to keep up with technical changes and excellence, they are listening to and want to improve our social awareness of identity management.

Pain Points

  • C

Because DS is written in C, it’s risky and difficult to make changes. People constantly make mistakes that introduce unsafety (even myself), and worse. No amount of tooling or intelligence can take away the fact that C is just hard to use, and people need to be perfect (people are not perfect!) and today we have better tools. We can not spend our time chasing our tails on pointless issues that C creates, when we should be doing better things.

  • Everything about dynamic admin, config, and plugins is hard and can’t scale

Because we need to maintain consistency through operations from start to end but we also allow changing config, plugins, and more during the servers operation the current locking design just doesn’t scale. It’s also not 100% safe either as the values are changed by atomics, not managed by transactions. We could use copy-on-write for this, but why? Config should be managed by tools like ansible, but today our dynamic config and plugins is both a performance over head and an admin overhead because we exclude best practice tools and have to spend a large amount of time to maintain consistent data when we shouldn’t need to. Less features is less support overhead on us, and simpler to test and assert quality and correct behaviour.

  • Plugins to address shortfalls, but a bit odd.

We have all these features to address issues, but they all do it … kind of the odd way. Managed Entries creates user private groups on object creation. But the problem is “unix requires a private group” and “ldap schema doesn’t allow a user to be a group and user at the same time”. So the answer is actually to create a new objectClass that let’s a user ALSO be it’s own UPG, not “create an object that links to the user”. (Or have a client generate the group from user attributes but we shouldn’t shift responsibility to the client.)

Distributed Numeric Assignment is based on the AD rid model, but it’s all about “how can we assign a value to a user that’s unique?”. We already have a way to do this, in the UUID, so why not derive the UID/GID from the UUID. This means there is no complex inter-server communication, pooling, just simple isolated functionality.

We have lots of features that just are a bit complex, and could have been made simpler, that now we have to support, and can’t change to make them better. If we rolled a new “fixed” version, we would then have to support both because projects like FreeIPA aren’t going to just change over.

  • client tools are controlled by others and complex (sssd, openldap)

Every tool for dealing with ldap is really confusing and arcane. They all have wild (unhelpful) defaults, and generally this scares people off. I took months of work to get a working ldap server in the past. Why? It’s 2018, things need to “just work”. Our tools should “just work”. Why should I need to hand edit pam? Why do I need to set weird options in SSSD.conf? All of this makes the whole experience poor.

We are making client tools that can help (to an extent), but they are really limited to system administration and they aren’t “generic” tools for every possible configuration that exists. So at some point people will still find a limit where they have to touch ldap commands. A common request is a simple to use web portal for password resets, which today only really exists in FreeIPA, and that limits it’s application already.

  • hard to change legacy

It’s really hard to make code changes because our surface area is so broad and the many use cases means that we risk breakage every time we do. I have even broken customer deployments like this. It’s almost impossible to get away from, and that holds us back because it means we are scared to make changes because we have to support the 1 million existing work flows. To add another is more support risk.

Many deployments use legacy schema elements that holds us back, ranging from the inet types, schema that enforces a first/last name, schema that won’t express users + groups in a simple away. It’s hard to ask people to just up and migrate their data, and even if we wanted too, ldap allows too much freedom so we are more likely to break data, than migrate it correctly if we tried.

This holds us back from technical changes, and social representation changes. People are more likely to engage with a large migrational change, than an incremental change that disturbs their current workflow (IE moving from on prem to cloud, rather than invest in smaller iterative changes to make their local solutions better).

  • ACI’s are really complex

389’s access controls are good because they are in the tree and replicated, but bad because the syntax is awful, complex, and has lots of traps and complexity. Even I need to look up how to write them when I have to. This is not good for a project that has such deep security concerns, where your ACI’s can look correct but actually expose all your data to risks.

  • LDAP as a protocol is like an 90’s drug experience

LDAP may be the lingua franca of authentication, but it’s complex, hard to use and hard to write implementations for. That’s why in opensource we have a monoculture of using the openldap client libraries because no one can work out how to write a standalone library. Layer on top the complexity of the object and naming model, and we have a situation where no one wants to interact with LDAP and rather keeps it at arm length.

It’s going to be extremely hard to move forward here, because the community is so fragmented and small, and the working groups dispersed that the idea of LDAPv4 is a dream that no one should pursue, even though it’s desperately needed.

  • TLS

TLS is great. NSS databases and tools are not.


GSSAPI and Kerberos are a piece of legacy that we just can’t escape from. They are a curse almost, and one we need to break away from as it’s completely unusable (even if it what it promises is amazing). We need to do better.

That and SSO allows loads of attacks to proceed, where we actually want isolated token auth with limited access scopes …

What could we offer

  • Web application as a first class consumer.

People want web portals for their clients, and they want to be able to use web applications as the consumer of authentication. The HTTP protocols must be the first class integration point for anything in identity management today. This means using things like OAUTH/OIDC.

  • Systems security as a first class consumer.

Administrators still need to SSH to machines, and people still need their systems to have identities running on them. Having pam/nsswitch modules is a very major requirement, where those modules have to be fast, simple, and work correctly. Users should “imply” a private group, and UID/GID should by dynamic from UUID (or admins can override it).

  • 2FA/u2f/TOTP.

Multi-factor auth is here (not coming, here), and we are behind the game. We already have Apple and MS pushing for webauthn in their devices. We need to be there for these standards to work, and to support the next authentication tool after that.

  • Good RADIUS integration.

RADIUS is not going away, and is important in education providers and business networks, so RADIUS must “just work”. Importantly, this means mschapv2 which is the universal default for all clients to operate with, which means nthash.

However, we can make the nthash unlinked from your normal password, so you can then have wifi password and a seperate loging password. We could even generate an NTHash containing the TOTP token for more high security environments.

  • better data structure (flat, defined by object types).

The tree structure of LDAP is confusing, but a flatter structure is easier to manage and understand. We can use ideas from kubernetes like tags/labels which can be used to provide certain controls and filtering capabilities for searches and access profiles to apply to.

  • structured logging, with in built performance profiling.

Being able to diagnose why an operation is slow is critical and having structured logs with profiling information is key to allowing admins and developers to resolve performance issues at scale. It’s also critical to have auditing of every single change made in the system, including internal changes that occur during operations.

  • access profiles with auditing capability.

Access profiles that express what you can access, and how. Easier to audit, generate, and should be tightly linked to group membership for real RBAC style capabilities.

  • transactions by allowing batch operations.

LDAP wants to provide a transaction system over a set of operations, but that may cause performance issues on write paths. Instead, why not allow submission of batches of changes that all must occur “at the same time” or “none”. This is faster network wise, protocol wise, and simpler for a server to implement.

What’s next then …

Instead of fixing what we have, why not take the best of what we have, and offer something new in parallel? Start a new front end that speaks in an accessible way, that has modern structures, and has learnt from the lessons of the past? We can build it to standalone, or proxy from the robust core of 389 Directory Server allowing migration paths, but eschew the pain of trying to bring people to the modern world. We can offer something unique, an open source identity system that’s easy to use, fast, secure, that you can run on your terms, or in the cloud.

This parallel project seems like a good idea … I wonder what to name it …

Work around docker exec bug

Posted by William Brown on December 08, 2018 01:00 PM

Work around docker exec bug

There is currently a docker exec bug in Centos/RHEL 7 that causes errors such as:

# docker exec -i -t instance /bin/sh
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:110: decoding init error from pipe caused \"read parent: connection reset by peer\""

As a work around you can use nsenter instead:

PID=docker inspect --format {{.State.Pid}} <name of container>
nsenter --target $PID --mount --uts --ipc --net --pid /bin/sh

For more information, you can see the bugreport here.

High Available RADVD on Linux

Posted by William Brown on October 31, 2018 01:00 PM

High Available RADVD on Linux

Recently I was experimenting again with high availability router configurations so that in the cause of an outage or a failover the other router will take over and traffic is still served.

This is usually done through protocols like VRRP to allow virtual ips to exist that can be failed between. However with ipv6 one needs to still allow clients to find the router, and in the cause of a failure, the router advertisments still must continue for client renewals.

To achieve this we need two parts. A shared Link Local address, and a special RADVD configuration.

Because of howe ipv6 routers work, all traffic (even global) is still sent to your link local router. We can use an address like:


This doesn’t clash with any reserved or special ipv6 addresses, and it’s easy to remember. Because of how link local works, we can put this on many interfaces of the router (many vlans) with no conflict.

So now to the two components.


Keepalived is a VRRP implementation for linux. It has extensive documentation and sometimes uses some implementation specific language, but it works well for what it does.

Our configuration looks like:

#  /etc/keepalived/keepalived.conf
global_defs {
  vrrp_version 3

vrrp_sync_group G1 {
 group {

vrrp_instance ipv6_ens256 {
   interface ens256
   virtual_router_id 62
   priority 50
   advert_int 1.0
   virtual_ipaddress {
   garp_master_delay 1

Note that we provide both a global address and an LL address for the failover. This is important for services and DNS for the router to have the global, but you could omit this. The LL address however is critical to this configuration and must be present.

Now you can start up vrrp, and you should see one of your two linux machines pick up the address.


For RADVD to work, a feature of the 2.x series is required. Packaging this for el7 is out of scope of this post, but fedora ships the version required.

The feature is that RADVD can be configured to specify which address it advertises for the router, rather than assuming the interface LL autoconf address is the address to advertise. The configuration appears as:

# /etc/radvd.conf
interface ens256
    AdvSendAdvert on;
    MinRtrAdvInterval 30;
    MaxRtrAdvInterval 100;
    AdvRASrcAddress {
    prefix 2001:db8::/64
        AdvOnLink on;
        AdvAutonomous on;
        AdvRouterAddr off;

Note the AdvRASrcAddress parameter? This defines a priority list of address to advertise that could be available on the interface.

Now start up radvd on your two routers, and try failing over between them while you ping from your client. Remember to ping LL from a client you need something like:

ping6 fe80::1:1%en1

Where the outgoing interface of your client traffic is denoted after the ‘%’.

Happy failover routing!

IBM To Buy RedHat &amp; Dev Null Update

Posted by Mo Morsi on October 30, 2018 01:01 AM

For the record I had no idea about the acquisition when I left RedHat last spring (I was merely a peon Software Engineer during my tenure there!). But I'll just leave this one here:

Rht journey

In other news, Dev Null is going good, the product has more or less stabilized, we're hitting up conferences all over the country to promote it, and it's gaining some traction with the XRP community on social media. There's still along ways to go to see Wipple reach full fruition but I'm excited by the progress and eager to continue driving development.

That's all for now, I have a laundry list of topics which I'd like to blog about but until I get some time to do so, they will have to stay on the backburner!


Rust RwLock and Mutex Performance Oddities

Posted by William Brown on October 18, 2018 01:00 PM

Rust RwLock and Mutex Performance Oddities

Recently I have been working on Rust datastructures once again. In the process I wanted to test how my work performed compared to a standard library RwLock and Mutex. On my home laptop the RwLock was 5 times faster, the Mutex 2 times faster than my work.

So checking out my code on my workplace workstation and running my bench marks I noticed the Mutex was the same - 2 times faster. However, the RwLock was 4000 times slower.

What’s a RwLock and Mutex anyway?

In a multithreaded application, it’s important that data that needs to be shared between threads is consistent when accessed. This consistency is not just logical consistency of the data, but affects hardware consistency of the memory in cache. As a simple example, let’s examine an update to a bank account done by two threads:

acc = 10
deposit = 3
withdrawl = 5

[ Thread A ]            [ Thread B ]
acc = load_balance()    acc = load_balance()
acc = acc + deposit     acc = acc - withdrawl
store_balance(acc)      store_balance(acc)

What will the account balance be at the end? The answer is “it depends”. Because threads are working in parallel these operations could happen:

  • At the same time
  • Interleaved (various possibilities)
  • Sequentially

This isn’t very healthy for our bank account. We could lose our deposit, or have invalid data. Valid outcomes at the end are that acc could be 13, 5, 8. Only one of these is correct.

A mutex protects our data in multiple ways. It provides hardware consistency operations so that our cpus cache state is valid. It also allows only a single thread inside of the mutex at a time so we can linearise operations. Mutex comes from the word “Mutual Exclusion” after all.

So our example with a mutex now becomes:

acc = 10
deposit = 3
withdrawl = 5

[ Thread A ]            [ Thread B ]
mutex.lock()            mutex.lock()
acc = load_balance()    acc = load_balance()
acc = acc + deposit     acc = acc - withdrawl
store_balance(acc)      store_balance(acc)
mutex.unlock()          mutex.unlock()

Now only one thread will access our account at a time: The other thread will block until the mutex is released.

A RwLock is a special extension to this pattern. Where a mutex guarantees single access to the data in a read and write form, a RwLock (Read Write Lock) allows multiple read-only views OR single read and writer access. Importantly when a writer wants to access the lock, all readers must complete their work and “drain”. Once the write is complete readers can begin again. So you can imagine it as:

Time ->

T1: -- read --> x
T3:     -- read --> x                x -- read -->
T3:     -- read --> x                x -- read -->
T4:                   | -- write -- |
T5:                                  x -- read -->

Test Case for the RwLock

My test case is simple. Given a set of 12 threads, we spawn:

  • 8 readers. Take a read lock, read the value, release the read lock. If the value == target then stop the thread.
  • 4 writers. Take a write lock, read the value. Add one and write. Continue until value == target then stop.

Other conditions:

  • The test code is identical between Mutex/RwLock (beside the locking costruct)
  • –release is used for compiler optimisations
  • The test hardware is as close as possible (i7 quad core)
  • The tests are run multiple time to construct averages of the performance

The idea being that X target number of writes must occur, while many readers contend as fast as possible on the read. We are pressuring the system of choice between “many readers getting to read fast” or “writers getting priority to drain/block readers”.

On OSX given a target of 500 writes, this was able to complete in 0.01 second for the RwLock. (MBP 2011, 2.8GHz)

On Linux given a target of 500 writes, this completed in 42 seconds. This is a 4000 times difference. (i7-7700 CPU @ 3.60GHz)

All things considered the Linux machine should have an advantage - it’s a desktop processor, of a newer generation, and much faster clock speed. So why is the RwLock performance so much different on Linux?

To the source code!

Examining the Rust source code , many OS primitives come from libc. This is because they require OS support to function. RwLock is an example of this as is mutex and many more. The unix implementation for Rust consumes the pthread_rwlock primitive. This means we need to read man pages to understand the details of each.

OSX uses FreeBSD userland components, so we can assume they follow the BSD man pages. In the FreeBSD man page for pthread_rwlock_rdlock we see:


 To prevent writer starvation, writers are favored over readers.

Linux however, uses different constructs. Looking at the Linux man page:

  This is the default.  A thread may hold multiple read locks;
  that is, read locks are recursive.  According to The Single
  Unix Specification, the behavior is unspecified when a reader
  tries to place a lock, and there is no write lock but writers
  are waiting.  Giving preference to the reader, as is set by
  PTHREAD_RWLOCK_PREFER_READER_NP, implies that the reader will
  receive the requested lock, even if a writer is waiting.  As
  long as there are readers, the writer will be starved.

Reader vs Writer Preferences?

Due to the policy of a RwLock having multiple readers OR a single writer, a preference is given to one or the other. The preference basically boils down to the choice of:

  • Do you respond to write requests and have new readers block?
  • Do you favour readers but let writers block until reads are complete?

The difference is that on a read heavy workload, a write will continue to be delayed so that readers can begin and complete (up until some threshold of time). However, on a writer focused workload, you allow readers to stall so that writes can complete sooner.

On Linux, they choose a reader preference. On OSX/BSD they choose a writer preference.

Because our test is about how fast can a target of write operations complete, the writer preference of BSD/OSX causes this test to be much faster. Our readers still “read” but are giving way to writers, which completes our test sooner.

However, the linux “reader favour” policy means that our readers (designed for creating conteniton) are allowed to skip the queue and block writers. This causes our writers to starve. Because the test is only concerned with writer completion, the result is (correctly) showing our writers are heavily delayed - even though many more readers are completing.

If we were to track the number of reads that completed, I am sure we would see a large factor of difference where Linux has allow many more readers to complete than the OSX version.

Linux pthread_rwlock does allow you to change this policy (PTHREAD_RWLOCK_PREFER_WRITER_NP) but this isn’t exposed via Rust. This means today, you accept (and trust) the OS default. Rust is just unaware at compile time and run time that such a different policy exists.


Rust like any language consumes operating system primitives. Every OS implements these differently and these differences in OS policy can cause real performance differences in applications between development and production.

It’s well worth understanding the constructions used in programming languages and how they affect the performance of your application - and the decisions behind those tradeoffs.

This isn’t meant to say “don’t use RwLock in Rust on Linux”. This is meant to say “choose it when it makes sense - on read heavy loads, understanding writers will delay”. For my project (A copy on write cell) I will likely conditionally compile rwlock on osx, but mutex on linux as I require a writer favoured behaviour. There are certainly applications that will benefit from the reader priority in linux (especially if there is low writer volume and low penalty to delayed writes).

Photography - Why You Should Use JPG (not RAW)

Posted by William Brown on August 05, 2018 02:00 PM

Photography - Why You Should Use JPG (not RAW)

When I started my modern journey into photography, I simply shot in JPG. I was happy with the results, and the images I was able to produce. It was only later that I was introduced to a now good friend and he said: “You should always shoot RAW! You can edit so much more if you do.”. It’s not hard to find many ‘beginner’ videos all touting the value of RAW for post editing, and how it’s the step from beginner to serious photographer (and editor).

Today, I would like to explore why I have turned off RAW on my camera bodies for good. This is a deeply personal decision, and I hope that my experience helps you to think about your own creative choices. If you want to stay shooting RAW and editing - good on you. If this encourages you to try turning back to JPG - good on you too.

There are two primary reasons for why I turned off RAW:

  • Colour reproduction of in body JPG is better to the eye today.
  • Photography is about composing an image from what you have infront of you.

Colour is about experts (and detail)

I have always been unhappy with the colour output of my editing software when processing RAW images. As someone who is colour blind I did not know if it was just my perception, or if real issues existed. No one else complained so it must just be me right!

Eventually I stumbled on an article about how to develop real colour and extract camera film simulations for my editor. I was interested in both the ability to get true reflections of colour in my images, but also to use the film simulations in post (the black and white of my camera body is beautiful and soft, but my editor is harsh).

I spent a solid week testing and profiling both of my cameras. I quickly realised a great deal about what was occuring in my editor, but also my camera body.

The editor I have, is attempting to generalise over the entire set of sensors that a manufacturer has created. They are also attempting to create a true colour output profile, that is as reflective of reality as possible. So when I was exporting RAWs to JPG, I was seeing the differences between what my camera hardware is, vs the editors profiles. (This was particularly bad on my older body, so I suspect the RAW profiles are designed for the newer sensor).

I then created film simulations and quickly noticed the subtle changes. Blacks were blacker, but retained more fine detail with the simulation. Skin tone was softer. Exposure was more even across a variety of image types. How? RAW and my editor is meant to create the best image possible? Why is a film-simulation I have “extracted” creating better images?

As any good engineer would do I created sample images. A/B testing. I would provide the RAW processed by my editor, and a RAW processed with my film simulation. I would vary the left/right of the image, exposure, subject, and more. After about 10 tests across 5 people, only on one occasion did someone prefer the RAW from my editor.

At this point I realised that my camera manufacturer is hiring experts who build, live and breath colour technology. They have tested and examined everything about the body I have, and likely calibrated it individually in the process to make produce exact reproductions as they see in a lab. They are developing colour profiles that are not just broadly applicable, but also pleasing to look at (even if not accurate reproductions).

So how can my film simulations I extracted and built in a week, measure up to the experts? I decided to find out. I shot test images in JPG and RAW and began to provide A/B/C tests to people.

If the editor RAW was washed out compared to the RAW with my film simulation, the JPG from the body made them pale in comparison. Every detail was better, across a range of conditions. The features in my camera body are better than my editor. Noise reduction, dynamic range, sharpening, softening, colour saturation. I was holding in my hands a device that has thousands of hours of expert design, that could eclipse anything I built on a weekend for fun to achieve the same.

It was then I came to think about and realise …

Composition (and effects) is about you

Photography is a complex skill. It’s not having a fancy camera and just clicking the shutter, or zooming in. Photography is about taking that camera and putting it in a position to take a well composed image based on many rules (and exceptions) that I am still continually learning.

When you stop to look at an image you should always think “how can I compose the best image possible?”.

So why shoot in RAW? RAW is all about enabling editing in post. After you have already composed and taken the image. There are valid times and useful functions of editing. For example whitebalance correction and minor cropping in some cases. Both of these are easily conducted with JPG with no loss in quality compared to the RAW. I still commonly do both of these.

However RAW allows you to recover mistakes during composition (to a point). For example, the powerful base-curve fusion module allows dynamic range “after the fact”. You may even add high or low pass filters, or mask areas to filter and affect the colour to make things pop, or want that RAW data to make your vibrance control as perfect as possible. You may change the perspective, or even add filters and more. Maybe you want to optimise de-noise to make smooth high ISO images. There are so many options!

But all these things are you composing after. Today, many of these functions are in your camera - and better performing. So while I’m composing I can enable dynamic range for the darker elements of the frame. I can compose and add my colour saturation (or remove it). I can sharpen, soften. I can move my own body to change perspective. All at the time I am building the image in my mind, as I compose, I am able to decide on the creative effects I want to place in that image. I’m not longer just composing within a frame, but a canvas of potential effects.

To me this was an important distinction. I always found I was editing poorly-composed images in an attempt to “fix” them to something acceptable. Instead I should have been looking at how to compose them from the start to be great, using the tool in my hand - my camera.

Really, this is a decision that is yours. Do you spend more time now to make the image you want? Or do you spend it later editing to achieve what you want?


Photography is a creative process. You will have your own ideas of how that process should look, and how you want to work with it. Great! This was my experience and how I have arrived at a creative process that I am satisfied with. I hope that it provides you an alternate perspective to the generally accepted “RAW is imperative” line that many people advertise.

Dev Null Productions

Posted by Mo Morsi on July 24, 2018 01:35 PM

After my Departure from RedHat I was able to get some RnR, but quickly wanted to get a head start of my next venture. This is because I decided to put a cap on the amount of time that would be dedicated to trying to make "it" happen, and would pause at regular checkpoints to monitor progress. This is not to say I'm going to quit the endeavor at that point in the future (the timeframe of which I'm keeping private), but the intent is to drive focus and keep the ball moving forward objectively. While Omega was a great project to work on, both fun and the source of much growth and experience, I am not comfortable with the amount of time spent on it, for what was gained. All hindsight is 20/20, but every good trader knows when to cut losses.

Dev Null Productions LLC. was launched four months ago in April 2018 and we haven't looked back. Our flagship product, Wipple XRP Intelligence was launched shortly after, providing realtime access to the XRP network and high level stats and reporting. The product is under continued development and we've begun a social-media based marketing drive to promote it. Things are still early, and there is still aways to go & obstacles to overcome (not to mention the crypto-currency bear market that we've been in for the last 1/2 year), but the progress has been great, and there are many more awesome features in the queue

This Thursday, I am giving a presentation on XRP to the Syracuse Software Development Meetup, hosted at the Tech Garden, a tech incubator in Syracuse, NY. I aim to go over the XRP protocol, discussing both the history of Ripple and the technical details, as well as common use cases and gotchyas from our experiences. The event is looking very solid, and there is already a large turnout and some great momentum growing, so I'm excited to participate in it and see how it all goes. While we're still in the early phases of development, I'm hoping to drive some interest in the project, and perhaps meet collaborators who'd like to come onboard for a percentage of ultimate profits!

Be sure to stay tuned for more updates and developments, until then, keep Rippling!

Ripple moon

GSoC Fedora Happines Packets Update – Week 8th and 9th

Posted by Algogator on July 19, 2018 05:15 AM

The talk got accepted a month ago (https://pagure.io/flock/issue/55) but I didn’t know if I’d be able to attend it. I couldn’t find an appointment at the Houston consulate for July, the entire month was booked but luckily someone cancelled last week so I made a quick trip to Houston. And I got my passport back […]

The post GSoC Fedora Happines Packets Update – Week 8th and 9th appeared first on Anna Philips.

fedmsg on CentOS

Posted by Algogator on June 27, 2018 06:46 PM

Zeromq fedmsg is “a library built on ZeroMQ using the PyZMQ Python bindings”. So I thought it might help to learn a little bit more about ZeroMQ ( which is not a messaging queue). Contexts – You usually have one context per process and which manages the sockets. Socket – It can be configured and […]

The post fedmsg on CentOS appeared first on Anna Philips.

Fedora Happiness Packets on CentOS 7

Posted by Algogator on June 15, 2018 05:55 PM

I’m writing this for future me. My default OS is Ubuntu so setting up the project on CentOS was new for me. gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong –param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong –param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC […]

The post Fedora Happiness Packets on CentOS 7 appeared first on Anna Philips.

GSoC update Week 1 and 2

Posted by Algogator on June 12, 2018 07:03 PM

Week 1: Setup a dev instance of Fedora Happiness Packets on CentOS. Took some time to install all the dependencies. It was so much easier on Ubuntu. Get a dev instance from the Infra team https://pagure.io/fedora-infrastructure/issue/6690 Created my wiki user page over here Started reading and tweaking with mozilla-django-oidc to authenticate with FAS Week 2 […]

The post GSoC update Week 1 and 2 appeared first on Anna Philips.

OpenID Connect and Authenticating against Ipsilon

Posted by Algogator on June 08, 2018 02:20 PM

Step 1 – Get your credentials You need to register your app with the OP https://iddev.fedorainfracloud.org/ I used https://github.com/puiterwijk/oidc-register for that. pip install oidc-register To register you need the provider, application, and redirect(callback) URLs oidc-register https://iddev.fedorainfracloud.org/ Note: You can’t use http unless it’s with a localhost.  After you do that you should have […]

The post OpenID Connect and Authenticating against Ipsilon appeared first on Anna Philips.

GSoC 2018: Week 2

Posted by Amitosh Swain Mahapatra on May 30, 2018 06:14 AM

This is Status Report for Fedora App filled by participants on a weekly basis.

Status Report for Amitosh Swain Mahapatra (amitosh)

  • Fedora Account: amitosh
  • IRC: amitosh (found in #fedora, #fedora-dotnet, #fedora-summer-coding, #fedora-commops)
  • Fedora Wiki User Page: amitosh

This time, I am working on improving the Fedora Community App with the Fedora project. It’s been a week since we started off our coding on may 14.

The Fedora App is a central location for Fedora users and innovators to stay updated on The Fedora Project. News updates, social posts, Ask Fedora, as well as articles from Fedora Magazine are all held under this app.

Tasks Completed

Here is the summary of my work in the second week:

We now have offline capablilties in the app (#62). The app now caches the content from Fedora Magazine, FedoCal and Fedora Social. Every time we load the app, we refersh the cache from the API end points in the background. We no longer block the user from interacting with the app and the content also loads a lot faster. (#61)

It still has some rough edges, we will be addressing them in the following weeks.

There are only two hard things in Computer Science: cache invalidation and naming things.

– Phil Karlton

Two Hard Things from Martin Fowler

And in particularly JS, we shall modify it to:

There are only two hard things in Computer Science: 1> Cache invalidation
3> Async callbacks
2> Naming things.

And fortunately, RxJS provides an elegant solution to (3).

This week was particularly challenging and exciting. RxJS Observables and reactive programming patterns was really interesting to learn. Learn RxJS by @btroncone was a great resource that helped me to quickly grasp the concepts. Many thanks!

You can find the weekly report for Week 1 here.

What’s next ?

I’m working on creating unit tests for various services we use in our app and the integration tests for the different screens.

Google Summer of Code 2018 with Fedora

Posted by Algogator on May 24, 2018 03:37 PM

I got selected to work with Fedora on the Fedora Happiness Packets for GSoC 2018 😀 A shout-out to Jona and Bee for helping me with the proposal and initial PRs! About me: Hi there! My name is Anna. I go by the username Algogator on IRC and elsewhere. I study computer science at UTA. […]

The post Google Summer of Code 2018 with Fedora appeared first on Anna Philips.

GSoC 2018: Week 1

Posted by Amitosh Swain Mahapatra on May 22, 2018 05:24 PM

This is Status Report for Fedora App filled by participants on a weekly basis.

Status Report for Amitosh Swain Mahapatra (amitosh)

  • Fedora Account: amitosh
  • IRC: amitosh (found in #fedora, #fedora-dotnet, #fedora-summer-coding, #fedora-commops)
  • Fedora Wiki User Page: amitosh

This time, I am working on improving the Fedora Community App with the Fedora project. It’s been a week since we started off our coding on may 14.

The Fedora App is a central location for Fedora users and innovators to stay updated on The Fedora Project. News updates, social posts, Ask Fedora, as well as articles from Fedora Magazine are all held under this app.

Progress Report

Here is the summary of my work in the first week:

  1. We now have a system for loading configuration such as API keys, depending on the environment we are running. It also allows us to load them from external sources (#55). This would help us to remove the need to store API keys under version control (#52).
  2. The JS to TS conversion I started earlier (#4, #6) is finally complete. All of our code is now fully checked by TypeScript (TS) compiler for type safety (#54, #56). Except for JSON responses, where typing things will be a waste of time as TS does not provide run-time type safety, all other functions and services are now checked using TS interfaces. (#57).
  3. Our code is now following Angular patterns even more closely. I have standardized the Providers who use to return a callback or a Promise to return Observables. We now load network data a bit faster due to improved concurrency in the code.
  4. The documentation coverage of our code has increased. As the part of conversion, I have added TS doc comments describing the usage of various Providers, Services and Components, what the expect and what they return.
  5. The annoying white screen on launch (#16) in certain devices is now gone! (#47)
  6. After the restructuring, we no longer have any in-memory caching. I will be working on offline storage and caching implementation in this week.

What’s next ?

I am working to bring offline storage and sync to Fedora Magazine and Fedora Social sections of the app. This will both improve the usability and performance of the app. From a UX perspective, we will start syncing data rather than blocking the user from doing anything.

GSoC 2018: Kicking off the Coding

Posted by Amitosh Swain Mahapatra on May 14, 2018 03:55 PM

It’s May 14, and this is when we officially start coding for GSoC, 2018 edition. This time, I would be working on improving the Fedora Community App with the Fedora project. This marks the beginning of a journey of 3 months of coding, patching, debugging, git (mess) and the awesome discussions with my mentors and the community.

The Fedora App is a central location for Fedora users and innovators to stay updated on The Fedora Project. News updates, social posts, Ask Fedora, as well as articles from Fedora Magazine are all held under this app.

For the first month of my GSoC coding, I will be working of improving the code quality by completing the TypeScript conversion I started earlier as well as I will also be working on bring offline capabilities to Fedora Magazine reader and the Fedora Social.

Why TypeScript?

TypeScript (TS) is a super-set of JavaScript (JS) that brings optional static typing features to JS. TS is the language used by Ionic - the framework on which the Fedora Community App is built. I personally believe, in a medium to large scale project, a static type system is necessary - it especially helps to catch errors very early and allows to perform safe refactors, but I also agree that there are cases where it’s useful to fallback to dynamic typing.

TS does just this. It is implemented as a syntax extension to JavaScript which is transpiled into JS using the TS compiler. Every valid JS syntax is also valid TS syntax, so it becomes easier to port a project - even partially to use TS features, without incurring the cost of a full project rewrite.

In my earlier work in updating the source code to the latest version of Ionic, I started a conversion from JS to TS. Essential parts are already in place, but still there is a big chunk left for conversion. In my first week of coding, I will be working on this conversion.

Offline Capabilities in Fedora App

Currently, the app lacks any offline capabilities - you always require an internet connection to perform most of the actions.

My work will be to bring offline storage and sync to Fedora Magazine and Fedora Social sections of the app. This will both improve the apps usability and performance. From a UX perspective, we will start syncing data rather than blocking the user from doing anything (as it is currently). The user can continue to act on cached items, while we continue to fetch new items in the background.

Further in second month, I will be working on implementing a lightweight blog reader and caching complete offline copies of user-selected articles.

RETerm to The Terminal with a GUI

Posted by Mo Morsi on May 09, 2018 02:59 PM

When it comes to user interfaces, most (if not all) software applications can be classified into one of three categories:

  • Text Based - whether they entail one-off commands, interactive terminals (REPL), or text-based visual widgets, these saw a major rise in the 50s-80s though were usurped by GUIs in the 80s-90s
  • Graphical - GUIs, or Graphical User Interfaces, facilitate creating visual windows which the user may interact with via the mouse or keyboard. There are many different GUI frameworks available for various platforms
  • Web Based - A special type of graphical interface rendered via a web browser, many applications provide their frontend via HTML, Javascript, & CSS
Interfaces comparison

In recent years modern interface trends seem to be moving in the direction of the Web User Interfaces (WUI), with increasing numbers of apps offering their functionality primarily via HTTP. That being said GUIs and TUIs (Text User Interfaces) are still an entrenched use case for various reasons:

  • Web browsers, servers, and network access may not be available or permissable on all systems
  • Systems need mechanisms to access and interact with the underlying components, incase higher level constructs, such as graphics and network subsystems fail or are unreliable
  • Simpler text & graphical implementations can be coupled and optimized for the underlying operational environment without having to worry about portability and cross-env compatability. Clients can thus be simpler and more robust.

Finally there is a certain pleasing ascethic to simple text interfaces that you don't get with GUIs or WUIs. Of course this is a human-preference sort-of-thing, but it's often nice to return to our computational roots as we move into the future of complex gesture and voice controlled computer interactions.

Scifi terminal

When working on a recent side project (to be announced), I was exploring various concepts as to the user interface which to throw ontop of it. Because other solutions exist in the domain which I'm working in (and for other reasons), I wanted to explore something novel as far as user interaction, and decided to expirement with a text-based approach. ncurses is the goto library for this sort of thing, being available on most modern platforms, along with many widget libraries and high level wrappers


Unfortunately ncurses comes with alot of boilerplate and it made sense to seperate that from the project I intend to use this for. Thus the RETerm library was born, with the intent to provide a high level DSL to implmenent terminal interfaces and applications (... in Ruby of couse <3 !!!)

Reterm sc1

RETerm, aka the Ruby Enhanced TERMinal allows the user to incorporate high level text-based widgets into an orgnaized terminal window, with seemless standardized keyboard interactions (mouse support is on the roadmap to be added). So for example, one could define a window containing a child widget like so:

require 'reterm'
include RETerm

value = nil

init_reterm {
  win = Window.new :rows => 10,
                   :cols => 30

  slider = Components::VSlider.new
  win.component = slider
  value = slider.activate!

puts "Slider Value: #{value}"

This would result in the following interface containing a vertical slider:

Reterm sc2

RETerm ships with many built widgets including:

Text Entry

Reterm sc3

Clickable Button

Reterm sc4

Radio Switch/Rocker/Selectable List

Reterm sc5 Reterm sc6 Reterm sc7

Sliders (both horizontal and vertical)


Ascii Text (with many fonts via artii/figlet)

Reterm sc8

Images (via drawille)

Reterm sc9

RETerm is now available via rubygems. To install, simplly:

  $ gem install reterm

That's All Folks... but wait there is more!!! Afterall:

Delorian meme

For a bit of a value-add, I decided to implement a standard schema where text interfaces could be described in a JSON config file and loaded by the framework, similar to xml schemas which GTK and Android use for their interfaces. One can simply describe their interface in JSON and the framework will instantiate the corresponding text interface:

  "window" : {
    "rows"      : 10,
    "cols"      : 50,
    "border"    : true,
    "component" : {
      "type" : "Entry",
      "init" : {
        "title" : "<C>Demo",
        "label" : "Enter Text: "
Reterm sc10

To assist in generating this schema, I implemented a graphical designer, where components can be dragged and dropped into a 2D canvas to layout the interface.

That's right, you can now use a GUI based application to design a text-based interface.

Retro meme

The Designer itself can be found in the same repo as the RETerm project, loaded in the "designer/" subdir.

Reterm designer

To use if you need to install visualruby (a high level wrapper to ruby-gnome) like so:

  $ gem install visualruby

And that's it! (for real this time) This was certainly a fun side-project to a side-project (toss in a third "side-project" if you consider the designer to be its own thing!). As I to return to the project using RETerm, I aim to revisit it every so often, adding new features, widgets, etc....



Why I still choose Ruby

Posted by Mo Morsi on May 09, 2018 02:59 PM

With the plethora of languages available to developers, I wanted to do a quick follow-up post as to why given my experience in many different environments, Ruby is still the goto language for all my computational needs!

Prg mtn

While different languages offer different solutions in terms of syntax support, memory managment, runtime guarantees, and execution flows; the underlying arithmatic, logical, and I/O hardware being controlled is the same. Thus in theory, given enough time and optimization the performance differences between languages should go to 0 as computational power and capacity increases / goes to infinity (yes, yes, Moore's law and such, but lets ignore that for now).

Of course different classes of problem domains impose their own requirements,

  • real time processing depends low level optimizations that can only be done in assembly and C,
  • data crunching and process parallelization often needs minimal latency and optimized runtimes, something which you only get with compiled/static-typed languages such as C++ and Java,
  • and higher level languages such as Ruby, Python, Perl, and PHP are great for rapid development cycles and providing high level constructs where complicated algorithms can be invoked via elegant / terse means.

But given the rapid rate of hardware performance in recent years, whole classes of problems which were previously limited to 'lower-level' languages such as C and C++ are now able to be feasbily implemented in higher level languages.

Computer power


Thus we see high performance financial applications being implemented in Python, major websites with millions of users a day being implemented in Ruby and Javascript, massive data sets being crunched in R, and much more.

So putting the performance aspect of these environments aside we need to look at the syntactic nature of these languages as well as the features and tools they offer for developers. The last is the easiest to tackle as these days most notable languages come with compilers/interpreters, debuggers, task systems, test suites, documentation engines, and much more. This was not always the case though as Ruby was one of the first languages that pioneered builtin package management through rubygems, and integrated dependency solutions via gemspecs, bundler, etc. CPAN and a few other language-specific online repositories existed before, but with Ruby you got integration that was a core part of the runtime environment and community support. Ruby is still known to be on the leading front of integrated and end-to-end solutions.

Syntax differences is a much more difficult subject to objectively dicuss as much of it comes down to programmer preference, but it would be hard to object to the statement that Ruby is one of the most Object Oriented languages out there. It's not often that you can call the string conversion or type identification methods on ALL constructs, variables, constants, types, literals, primitives, etc:

  > 1.to_s
  => "1"
  > 1.class
  => Integer

Ruby also provides logical flow control constructs not seen in many other languages. For example in addition to the standard if condition then dosomething paradigm, Ruby allows the user to specify the result after the predicate, eg dosomething if condition. This simple change allows developers to express concepts in a natural manner, akin to how they would often be desrcibed between humans. In addition to this, other simple syntax conveniences include:

  • The unless keyword, simply evaluating to if not
          file = File.open("/tmp/foobar", "w")
          file.write("Hello World") unless file.exist?
  • Methods are allowed to end with ? and ! which is great for specifying immutable methods (eg. Socket.open?), mutable methods and/or methods that can thrown an exception (eg. DBRecord.save!)
  • Inclusive and exclusive ranges can be specified via parenthesis and two or three elipses. So for example:
          > (1..4).include?(4)
          => true
          > (1...4).include?(4)
          => false
  • The yield keywork makes it trivial for any method to accept and invoke a callback during the course of its lifetime
  • And much more

Expanding upon the last, blocks are a core concept in Ruby, once which the language nails right on the head. Not only can any function accept an anonymous callback block, blocks can be bound to parameters and operated on like any other data. You can check the number of parameters the callbacks accept by invoking block.arity, dynamically dispatch blocks, save them for later invokation and much more.

Due to the asynchronous nature of many software solutions (many problems can be modeled as asynchronous tasks) blocks fit into many Ruby paradigms, if not as the primary invocation mechanism, then as an optional mechanism so as to enforce various runtime guarantees:

  File.open("/tmp/foobar"){ |file|
    # do whatever with file here

  # File is guaranteed to be closed here, we didn't have to close it ourselves!

By binding block contexts, Ruby facilitates implementing tightly tailored solutions for many problem domains via DSLs. Ruby DSLs exists for web development, system orchestration, workflow management, and much more. This of course is not to mention the other frameworks, such as the massively popular Rails, as well as other widely-used technologies such as Metasploit

Finally, programming in Ruby is just fun. The language is condusive to expressing complex concepts elegantly, jives with many different programming paradigms and styles, and offers a quick prototype to production workflow that is intuitive for both novice and seasoned developers. Nothing quite scratches that itch like Ruby!

Doge ruby

Into The Unknown - My Departure from RedHat

Posted by Mo Morsi on May 09, 2018 02:59 PM

In May 2006, a young starry eyed intern walked into the large corporate lobby of RedHat's Centential Campus in Raleigh, NC, to begin what would be a 12 year journey full of ups and downs, break-throughs and setbacks, and many many memories. Flash forward to April 2018, when the "intern-turned-hardend-software-enginner" filed his resignation and ended his tenure at RedHat to venture into the risky but exciting world of self-employment / entrepreneurship... Incase you were wondering that former-intern / Software Engineer is myself, and after nearly 12 years at RedHat, I finished my last day of employment on Friday April 13th, 2018.

Overall RedHat has been a great experience, I was able to work on many ground-breaking products and technologies, with many very talented individuals from across the spectrum and globe, in a manner that facilitated maximum professional and personal growth. It wasn't all sunshine and lolipops though, there were many setbacks, including many cancelled projects and dead-ends. That being said, I felt I was always able to speak my mind without fear of reprocussion, and always strived to work on those items that mattered the most and had the furthest reaching impact.

Some (but certainly not all) of those items included:

  • The highly publicized, but now defunct, RHX project
  • The oVirt virtualization management (cloud) platform, where I was on the original development team, and helped build the first prototypes & implementation
  • The RedHat Ruby stack, which was a battle to get off the ground (given the prevalence of the Java and Python ecosystems, continuing to to this day). This is one of the items I am most proud of, we addressed the needs of both the RedHat/Fedora and Ruby communities, building the necessary bridge logic to employ Ruby and Rails solutions in many enterprise production environments. This continues to this day as the team continuously stays ontop of upstream Ruby developments and provide robust support and solutions for downstream use
  • The RedHat CloudForms projects, on which I worked on serveral interations, again including initial prototypes and standards, as well as ManageIQ integration.
  • ReFS reverse engineering and parser. The last major research topic that I explored during my tenure at RedHat, this was a very fun project where I built upon the sparse information about the filesystem internals that's out there, and was able to deduce the logic up to the point of being able to read directory lists and file contents and metadata out of a formatted filesystem. While there is plenty of work to go on this front, I'm confident that the published writeups are an excellent launching point for additional research as well as the development of formal tools to extract data from the filesystem.

My plans for the immediate future are to take a short break then file to form a LLC and explore options under that umbrella. Of particular interest are crypto-currencies, specifically Ripple. I've recently begun developing an integrated wallet, ledger and market explorer, and statistical analysis framework called Wipple which I'm planning to continue working on and if all goes according to plan, generating some revenue from. There is alot of ??? between here and there, but thats the name of the game!

Until then, I'd like to thank everyone who helped me do my thing at RedHat, from landing my initial internships and the full-time after that, to showing me the ropes, and not shooting me down when I put myself out there to work on and promote our innovation solutions and technologies!


Data &amp; Market Analysis in C++, R, and Python

Posted by Mo Morsi on May 09, 2018 02:59 PM

In recent years, since efforts on The Omega Project and The Guild sort of fizzled out, I've been exploring various areas of interest with no particular intent other than to play around with some ideas. Data & Financial Engineering was one of those domains and having spent some time diving into the subject (before once again moving on to something else altogether) I'm sharing a few findings here.

My journey down this path started not too long after the Bitcoin Barber Shop Pole was completed, and I was looking for a new project to occupy my free time (the little of it that I have). Having long since stepped down from the SIG315 board, but still renting a private office at the space, I was looking for some way to incorporate that into my next project (besides just using it as the occasional place to work). Brainstorming a bit, I settled on a data visualization idea, where data relating to any number of categories would be aggregated, geotagged, and then projected onto a virtual globe. I decided to use the Marble widget library, built ontop of the QT Framework and had great success:


The architecture behind the DataChoppa project was simple, a generic 'Data' class was implemented using smart pointers ontop of which the Facet Pattern was incorporated, allowing data to be recorded from any number of sources in a generic manner and represented via convenient high level accessors. This was all collected via synchronization and generation plugins which implement a standarized interface whose output was then fed onto a queue on which processing plugings were listening, selecting the data that they were interested in to be operated on from there. The Processors themselves could put more data onto the queue after which the whole process was repeated ad inf., allowing each plugin to satisfy one bit of data-related functionality.

Datachoppa arch

Core Generic & Data Classes

namespace DataChoppa{
  // Generic value container
  class Generic{
      Map<std::string, boost::any> values;
      Map<std::string, std::string> value_strings;

  namespace Data{
    /// Data representation using generic values
    class Data : public Generic{
        Data() = default;
        Data(const Data& data) = default;

        Data(const Generic& generic, TYPES _types, const Source* _source) :
          Generic(generic), types(_types), source(_source) {}

        bool of_type(TYPE type) const;

        Vector to_vector() const;

        TYPES types;

        const Source* source;
    }; // class Data
  }; // namespace Data
}; // namespace DataChoppa

The Process Loop

  namespace DataChoppa {
    namespace Framework{
      void Processor::process_next(){
        if(to_process.empty()) return;
        Data::Data data = to_process.first();
        Plugins::Processors::iterator plugin = plugins.begin();
        while(plugin != plugins.end()) {
          Plugins::Meta* meta = dynamic_cast<Plugins::Meta*>(*plugin);
          //LOG(debug) << "Processing " << meta->id;
          }catch(const Exceptions::Exception& e){
            LOG(warning) << "Error when processing: " << e.what()
                         << " via " << meta->id;
    }; /// namespace Framework
  }; /// namespace DataChoppa

The HTTP Plugin (abridged)

namespace DataChoppa {
  namespace Plugins{
    class HTTP : public Framework::Plugins::Syncer,
                 public Framework::Plugins::Job,
                 public Framework::Plugins::Meta {
        /// ...

        /// sync - always return data to be added to queue, even on error
        Data::Vector sync(){
          String _url = url();
          Network::HTTP::SyncRequest request(_url, request_timeout);

          for(const Network::HTTP::Header& header : headers())

          int attempted = 0;
          Network::HTTP::Response response(request);

          while(attempts == -1 || attempted &lt; attempts){


              if(attempted == attempts){
                Data::Data result = response.to_error_data();
                result.source = &source;
                return result.to_vector();

              if(attempted == attempts){
                Data::Data result = response.to_error_data();
                result.source = &source;
                return result.to_vector();

              Data::Data result = response.to_data();
              result.source = &source;
              return result.to_vector();

          /// we should never get here
          return Data::Vector();
  }; // namespace Plugins
}; // namespace DataChoppa

Overall I was pleased with the result (and perhaps I should have stopped there...). The application collected and aggregated data from many sources including RSS feeds (google news, reddit, etc), weather sources (yahoo weather, weather.com), social networks (facebook, twitter, meetup, linkedin), chat protocols (IRC, slack), financial sources, and much more. While exploring the last I discovered the world of technical analysis and began incorporating many various market indicators into a financial analysis plugin for the project.

The Market Analysis Architecture

Datachoppa extractors Datachoppa annotators

Aroon Indicator (for example)

namespace DataChoppa{
  namespace Market {
    namespace Annotators {
      class Aroon : public Annotator {
          double aroon_up(const Quote& quote, int high_offset, double range){
            return ((range-1) - high_offset) / (range-1) * 100;

          DoubleVector aroon_up(const Quotes& quotes, const Annotations::Extrema* extrema, int range){
            return quotes.collect<DoubleVector>([extrema, range](const Quote& q, int i){
                     return aroon_up(q, extrema->high_offsets[i], range);

          double aroon_down(const Quote& quote, int low_offset, double range){
            return ((range-1) - low_offset) / (range-1) * 100;

          DoubleVector aroon_down(const Quotes& quotes, const Annotations::Extrema* extrema, int range){
            return quotes.collect<DoubleVector>([extrema, range](const Quote& q, int i){
                     return aroon_down(q, extrema->low_offsets[i], range);

          AnnotationList annotate() const{
            const Quotes& quotes = market->quotes;
            if(quotes.size() < range) return AnnotationList();

            const Annotations::Extrema* extrema = aroon_extrema(market, range);
                    Annotations::Aroon* aroon = new Annotations::Aroon(range);
                                        aroon->upper = aroon_up(market->quotes, extrema, range);
                                        aroon->lower = aroon_down(market->quotes, extrema, range);
            return aroon->to_list();
      }; /// class Aroon
    }; /// namespace Annotators
  }; /// namespace Market
}; // namespace DataChoppa

The whole thing worked great, data was pulled in both real time and historical from yahoo finance (until they discontinued it... from then it was google finance), the indicators were run, and results were output. Of course, making $$$ is not as simple as just crunching numbers, and being rather naive I just tossed the results of the indicators into weighted "buckets" and backtested based on simple boolean flags based on the computed signals against threshold values. Thankfully I backtested though as the performance was horrible as losses greatly exceed profits :-(

At this point I should take a step back and note that my progress so far was the result of the availibilty of alot of great resources (we really live in the age of accelerated learning). Specifically the following are indispensible books & sites for those interested in this subject:

  • stockcharts.com - Information on any indicator can be found on this site with details on how it is computed and how it can be used
  • investopedia - Sort of the Wikipedia of investment knowledge, offers great high level insights into how market works and the financial world as it stands
  • Beyond Candlesticks - Though candlestick patterns have limited use, this is great intro to the subject, and provides a good into to reading charts.
  • Nerds on Wall Street - A great book detailing the history of computational finance. Definetly must read if you are new to the domain as it provides a concise high level history on how markets have worked the last few centuries and various computations techniques employed to Seek Alpha
  • High Probability Trading - Provides insights as to the mentality and common pitfalls when trading.
Beyond candlesticksNerds on wallstreetHigh prob trading

The last book is an excellent resource which conveys the importance of money and risk management, as well as the necessity to combine in all factors, or as many factors as you can, when making financial decisions. In the end, I feel this is the gist of it, it's not soley a matter of luck (though there is an aspect of that to this), but rather patience, discipline, balance, and most importantly focus (similar to Aikido but that's a topic for another time). There is no shorting it (unless you're talking about the assets themselves!), and if one does not have / take the necessary time to research and properly plan and out and execute strategies, they will most likely fail (as most do according to the numbers).

It was at this point that I decided to take a step back and restrategize, and having reflected and discussed it over with some acquaintances, I hedged my bets, cut my losses (tech-wise) and switched from C++ to another platform which would allow me prototype and execute ideas quicker. A good amount of time has gone into the C++ project and it worked great, but it did not make sense to continue via a slower development cycle when faster options are available (and afterall every engineer knows time is our most precious resource).

Python and R are the natural choices for this project domain, as there is extensive support in both languages for market analysis, backtesting, and execution. I have used Python at various, points in the past so it was easy to hit the ground running; R was new but by this time no language really poses a serious surprise, the best way I can describe it is spreadsheets on steroids (not exactly, as rather than spreadsheets, data frames and matrixes are the core components, but one can imagine R as being similar to the central execution environment behind Excel, Matlab, or other statistical-software).

I quickly picked up quantmod and prototyped some volatility, trend-following, momentum, and other analysis signal generators in R, plotting them using the provided charting interface. R is a great language for this sort of data manipulation, one can quickly load up structured data from CSV files or online resources, splice it and dice it, chunk it and dunk it, organize it and prioritize it, according to any arithmatic, statistical, or linear/non-linear means which they desire. Quickly loading a new 'view' on the data is as simply as a line of code, and operations can quickly be chained together at high performance.

Volatility indicator in R (consolidated)

quotes <- load_default_symbol("volatility")

quotes.atr <- ATR(quotes, n=ATR_RANGE)

quotes.atr$tr_atr_ratio <- quotes.atr$tr / quotes.atr$atr
quotes.atr$is_high      <- ifelse(quotes.atr$tr_atr_ratio > HIGH_LEVEL, TRUE, FALSE)

# Also Generate ratio of atr to close price
quotes.atr$atr_close_ratio <- quotes.atr$atr / Cl(quotes)

# Generate rising, falling, sideways indicators by calculating slope of ATR regression line
atr_lm       <- list()
atr_lm$df    <- data.frame(quotes.atr$atr, Time = index(quotes.atr))
atr_lm$model <- lm(atr ~ poly(Time, POLY_ORDER), data = atr_lm$df) # polynomial linear model

atr_lm$fit   <- fitted(atr_lm$model)
atr_lm$diff  <- diff(atr_lm$fit)
atr_lm$diff  <- as.xts(atr_lm$diff)

# Current ATR / Close Ratio
quotes.atr.abs_per <- median(quotes.atr$atr_close_ratio[!is.na(quotes.atr$atr_close_ratio)])

# plots
addTA(quotes.atr$tr, type="h")
addTA(as.xts(as.logical(quotes.atr$is_high), index(quotes.atr)), col=col1, on=1)

While it all works great, the R language itself offers very little syntactic sugar for operations not related to data-processing. While there are libraries for most common functionality found in many other execution environments, languages such as Ruby and Python, offer a "friendlier" experience to both novice and seasoned developers alike. Furthermore the process of data synchronization was a tedious step, I was looking for something that offered the flexability of DataChoppa to pull in and process live and historical data from a wide variety of sources, caching results on the fly, and using those results and analysis for subsequent operations.

This all led me to developing a series of Python libraries targeted towards providing a configurable high level view of the market. Intelligence Amplification (IA) as opposed to Artifical Intelligence (AI) if you will (see Nerds on Wall Street).

marketquery.py is a high level market querying library, which implements plugins used to resolve generic market queries for ticker time based data. One can used the interface to query for the lastest quotes or a specific range of them from a particular source, or allow the framework to select one for you.

Retrieve first 3 months of the last 5 years of GBPUSD data

  from marketquery.querier        import Querier
  from marketbase.query.builder   import QueryBuilder
  sym = "GBPUSD"
  first_3mo_of_last_5yr = (QueryBuilder().symbol(sym)
  querier = Querier()
  res     = querier.run(first_3mo_of_last_5yr)
  for query, dat in res.items():
      print(dat.raw[:1000] + (dat.raw[1000:] and '...'))

Retrieve last two month of hourly EURJPY data

  from marketquery.querier        import Querier
  from marketbase.query.builder   import QueryBuilder
  sym = "EURJPY"
  two_months_of_hourly = (QueryBuilder().symbol(sym)
  querier = Querier()
  res     = querier.run(two_months_of_hourly).raw()
  print(res[:1000] + (res[1000:] and '...'))

This provides a quick way to both lookup market data according to specific criteria, as well as cache it so that network resources are used effectively. All caching is configurable, and the user can define timeouts based on the target query, source, and/or data retrieved.

From there the next level up is the technical analysis is was trivial to whip up the tacache.py module which uses the marketquery.py interface to retrieve raw data before feeding it into TALib caching the results. The same caching mechanisms offering the same flexability is employed, if one needs to process a large data set and/or subsets multiple times in a specified period, computational resources are not wasted (important when running on a metered cloud)

Computing various technical indicators

  from marketquery.querier       import Querier
  from marketbase.query.builder  import QueryBuilder
  from tacache.runner            import TARunner
  from tacache.source            import Source
  from tacache.indicator         import Indicator
  from talib                     import SMA
  from talib                  import MACD
  res = Querier().run(QueryBuilder().symbol("AUDUSD")
  ta_runner = TARunner()
  analysis  = ta_runner.run(Indicator(SMA),
  analysis  = ta_runner.run(Indicator(MACD),
  macd, sig, hist = analysis.raw

Finally ontop of all this I wrote a2m.py, a high level querying interface consisting of modules reporting on market volatility and trends as well as other metrics; python scripts which I could quickly execute to report the current and historical market state, making used of the underlying cached query and technical analysis data, periodically invalidated to pull in new/recent live data.

Example using a2m to compute volatility

  sym = "EURUSD"
  self.resolver  = Resolver()
  self.ta_runner = TARunner()

  daily = (QueryBuilder().symbol(sym)

  hourly = (QueryBuilder().symbol(sym)

  current = (QueryBuilder().symbol(sym)

  daily_quotes   = resolver.run(daily)
  hourly_quotes  = resolver.run(hourly)
  current_quotes = resolver.run(current)

  daily_avg  = ta_runner.run(Indicator(talib.SMA, timeperiod=120),  query_result=daily_quotes).raw[-1]
  hourly_avg = ta_runner.run(Indicator(talib.SMA, timeperiod=30),  query_result=hourly_quotes).raw[-1]

  current_val    = current_quotes.raw()[-1]['Close']
  daily_percent  = current_val / daily_avg  if current_val &lt; daily_avg  else daily_avg  / current_val
  hourly_percent = current_val / hourly_avg if current_val &lt; hourly_avg else hourly_avg / current_val
Awesome to the max

I would go onto use this to execute some Forex trades, again not in an algorithmic / automated manner, but rather based on combined knowledge from fundamentals research, as well as the high level technical data, and what was the result...

Poor squidward

I jest, though I did lose a little $$$, it wasn't that much, and to be honest I feel this was due to lack of patience/discipline and other "novice" mistakes as discussed above. I did make about 1/2 of it back, and then lost interest. This all requires alot of focus and time, and I had already spent 2+ years worth of free time on this. With many other interests pulling my strings, I decided to sideline the project(s) alltogether and focus on my next crazy venture.


After some of consideration, I decided to release the R code I wrote under the MIT license. They are rather simple expirements though could be useful as a starting point for others new to the subject. As far as the Python modules and DataChoppa, I intended to eventually release them but aim to take a break first to focus on other efforts and then go back to the war room, to figure out the next stage of the strategy.

And that's that! Enough number crunching, time to go out for a hike!

Hiking meme

ReFS Part III - Back to the Resilience

Posted by Mo Morsi on May 09, 2018 02:59 PM

We've made some great headway on the ReFS filesystem anaylsis front to the point of being able to implement a rudimentary file extraction mechanism (complete with timestamps).

First a recap of the story so far:

  • ReFS, aka "The Resilient FileSystem" is a relatively new filesystem developed by Microsoft. First shipped in Windows Server 2012, it has since seen an increase in popularity and use, especially in enterprise and cloud environments.
  • Little is known about the ReFS internals outside of some sparse information provided by Microsoft. According to that, data is organized into pages of a fixed size, starting at a static position on the disk. The first round of analysis was to determine the boundaries of these top level organizational units to be able to scan the disk for high level structures.
  • Once top level structures, including the object table and root directory, were identified, each was analyzed in detail to determine potential parsable structures such as generic Attribute and Record entities as well as file and directory references.
  • The latest round of analysis consisted of diving into these entities in detail to try and deduce a mechanism which to extract file metadata and content

Before going into details, we should note this analysis is based on observations against ReFS disks generated locally, without extensive sequential cross-referencing and comparison of many large files with many changes. Also it is possible that some structures are oversimplified and/or not fully understood. That being said, this should provide a solid basis for additional analysis, getting us deep into the filesystem, and allowing us to poke and prod with isolated bits to identify their semantics.

Now onto the fun stuff!

- A ReFS filesystem can be identified with the following signature at the very start of the partition:

    00 00 00 52  65 46 53 00  00 00 00 00  00 00 00 00 ...ReFS.........
    46 53 52 53  XX XX XX XX  XX XX XX XX  XX XX XX XX FSRS

- The following Ruby code will tell you if a given offset in a given file contains a ReFS partition:

    # Point this to the file containing the disk image

    # Point this at the start of the partition containing the ReFS filesystem

    # FileSystem Signature we are looking for
    FS_SIGNATURE  = [0x00, 0x00, 0x00, 0x52, 0x65, 0x46, 0x53, 0x00] # ...ReFS.

    img = File.open(File.expand_path(DISK), 'rb')
    img.seek ADDRESS
    sig = img.read(FS_SIGNATURE.size).unpack('C*')
    puts "Disk #{sig == FS_SIGNATURE ? "contains" : "does not contain"} ReFS filesystem"

- ReFS pages are 0x4000 bytes in length

- On all inspected systems, the first page number is 0x1e (0x78000 bytes after the start of the partition containing the filesystem). This is inline w/ Microsoft documentation which states that the first metadata dir is at a fixed offset on the disk.

- Other pages contain various system, directory, and volume structures and tables as well as journaled versions of each page (shadow-written upon regular disk writes)

- The first byte of each page is its Page Number

- The first 0x30 bytes of every metadata page (dubbed the Page Header) seem to follow a certain pattern:

    byte  0: XX XX 00 00   00 00 00 00   YY 00 00 00   00 00 00 00
    byte 16: 00 00 00 00   00 00 00 00   ZZ ZZ 00 00   00 00 00 00
    byte 32: 01 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00
  • dword 0 (XX XX) is the page number which is sequential and corresponds to the 0x4000 offset of the page
  • dword 2 (YY) is the journal number or sequence number
  • dword 6 (ZZ ZZ) is the "Virtual Page Number", which is non-sequential (eg values are in no apparent order) and seem to tie related pages together.
  • dword 8 is always 01, perhaps an "allocated" flag or other

- Multiple pages may share a virtual page number (byte 24/dword 6) but usually don't appear in sequence.

- The following Ruby code will print out the pages in a ReFS partition along w/ their shadow copies:

    # Point this to the file containing the disk image
    # Point this at the start of the partition containing the ReFS filesystem
    FIRST_PAGE = 0x1e
    img = File.open(File.expand_path(DISK), 'rb')
    page_id = FIRST_PAGE
    img.seek(ADDRESS + page_id*PAGE_SIZE)
    while contents = img.read(PAGE_SIZE)
      id = contents.unpack('S').first
      if id == page_id
        pos = img.pos
        start = ADDRESS + page_id * PAGE_SIZE
        img.seek(start + PAGE_SEQ)
        seq = img.read(4).unpack("L").first
        img.seek(start + PAGE_VIRTUAL_PAGE_NUM)
        vpn = img.read(4).unpack("L").first
        print "page: "
        print "0x#{id.to_s(16).upcase}".ljust(7)
        print " @ "
        print "0x#{start.to_s(16).upcase}".ljust(10)
        print ": Seq - "
        print "0x#{seq.to_s(16).upcase}".ljust(7)
        print "/ VPN - "
        print "0x#{vpn.to_s(16).upcase}".ljust(9)
        img.seek pos
      page_id += 1

- The object table (virtual page number 0x02) associates object ids' with the pages on which they reside. Here we an AttributeList consisting of Records of key/value pairs (see below for the specifics on these data structures). We can lookup the object id of the root directory (0x600000000) to retrieve the page on which it resides:

   50 00 00 00 10 00 10 00 00 00 20 00 30 00 00 00 - total length / key & value boundries
   00 00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 - object id
   F4 0A 00 00 00 00 00 00 00 00 02 08 08 00 00 00 - page id / flags
   CE 0F 85 14 83 01 DC 39 00 00 00 00 00 00 00 00 - checksum
   08 00 00 00 08 00 00 00 04 00 00 00 00 00 00 00

^ The object table entry for the root dir, containing its page (0xAF4)

- When retrieving pages by id or virtual page number, look for the ones with the highest sequence number as those are the latest copies of the shadow-write mechanism.

- Expanding upon the previous example we can implement some logic to read and dump the object table:

    ATTR_START = 0x30
    def img
      @img ||= File.open(File.expand_path(DISK), 'rb')
    def pages
      @pages ||= begin
        _pages = {}
        page_id = FIRST_PAGE
        img.seek(ADDRESS + page_id*PAGE_SIZE)
        while contents = img.read(PAGE_SIZE)
          id = contents.unpack('S').first
          if id == page_id
            pos = img.pos
            start = ADDRESS + page_id * PAGE_SIZE
            img.seek(start + PAGE_SEQ)
            seq = img.read(4).unpack("L").first
            img.seek(start + PAGE_VIRTUAL_PAGE_NUM)
            vpn = img.read(4).unpack("L").first
            _pages[id] = {:id => id, :seq => seq, :vpn => vpn}
            img.seek pos
          page_id += 1
    def page(opts)
      if opts.key?(:id)
        return pages[opts[:id]]
      elsif opts[:vpn]
        return pages.values.select { |v|
          v[:vpn] == opts[:vpn]
        }.sort { |v1, v2| v1[:seq] <=> v2[:seq] }.last
    def obj_pages
      @obj_pages ||= begin
        obj_table = page(:vpn => 2)
        img.seek(ADDRESS + obj_table[:id] * PAGE_SIZE)
        bytes = img.read(PAGE_SIZE).unpack("C*")
        len1 = bytes[ATTR_START]
        len2 = bytes[ATTR_START+len1]
        start = ATTR_START + len1 + len2
        objs = {}
        while bytes.size > start && bytes[start] != 0
          len = bytes[start]
          id  = bytes[start+0x10..start+0x20-1].collect { |i| i.to_s(16).upcase }.reverse.join()
          tgt = bytes[start+0x20..start+0x21].collect   { |i| i.to_s(16).upcase }.reverse.join()
          objs[id] = tgt
          start += len
    obj_pages.each { |id, tgt|
      puts "Object #{id} is on page #{tgt}"

We could also implement a method to lookup a specific object's page:

    def obj_page(obj_id)

    puts page(:id => obj_page("0000006000000000").to_i(16))

This will retrieve the page containing the root directory

- Directories, from the root dir down, follow a consistent pattern. They are comprised of sequential lists of data structures whose length is given by the first word value (Attributes and Attribute Lists).

List are often prefixed with a Header Attribute defining the total length of the Attributes that follow that consititute the list. Though this is not a hard set rule as in the case where the list resides in the body of another Attribute (more on that below).

In either case, Attributes may be parsed by iterating over the bytes after the directory page header, reading and processing the first word to determine the next number of bytes to read (minus the length of the first word), and then repeating until null (0000) is encountered (being sure to process specified padding in the process)

- Various Attributes take on different semantics including references to subdirs and files as well as branches to additional pages containing more directory contents (for large directories); though not all Attributes have been identified.

The structures in a directory listing always seem to be of one of the following formats:

- Base Attribute - The simplest / base attribute consisting of a block whose length is given at the very start.

An example of a typical Attribute follows:

      a8 00 00 00  28 00 01 00  00 00 00 00  10 01 00 00  
      10 01 00 00  02 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  a9 d3 a4 c3  27 dd d2 01  
      5f a0 58 f3  27 dd d2 01  5f a0 58 f3  27 dd d2 01  
      a9 d3 a4 c3  27 dd d2 01  20 00 00 00  00 00 00 00  
      00 06 00 00  00 00 00 00  03 00 00 00  00 00 00 00  
      5c 9a 07 ac  01 00 00 00  19 00 00 00  00 00 00 00  
      00 00 01 00  00 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  00 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00  01 00 00 00  00 00 00 00  
      00 00 00 00  00 00 00 00

Here we a section of 0xA8 length containing the following four file timestamps (more on this conversion below)

       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       5f a0 58 f3  27 dd d2 01 - 2017-06-04 07:44:40
       a9 d3 a4 c3  27 dd d2 01 - 2017-06-04 07:43:20

It is safe to assume that either

  • one of the first fields in any given Attribute contains an identifier detailing how the attribute should be parsed _or_
  • the context is given by the Attribute's position in the list.
  • attributes corresponding to given meaning are referenced by address or identifier elsewhere

The following is a method which can be used to parse a given Attribute off disk, provided the img read position is set to its start:

    def read_attr
      pos = img.pos
      packed = img.read(4)
      return new if packed.nil?
      attr_len = packed.unpack('L').first
      return new if attr_len == 0

      img.seek pos
      value = img.read(attr_len)
      Attribute.new(:pos   => pos,
                    :bytes => value.unpack("C*"),
                    :len   => attr_len)

- Records - Key / Value pairs whose total length and key / value lengths are given in the first 0x20 bytes of the attribute. These are used to associated metadata sections with files whose names are recorded in the keys and contents are recorded in the value.

An example of a typical Record follows:

    40 04 00 00   10 00 1A 00   08 00 30 00   10 04 00 00   @.........0.....
    30 00 01 00   6D 00 6F 00   66 00 69 00   6C 00 65 00   0...m.o.f.i.l.e.
    31 00 2E 00   74 00 78 00   74 00 00 00   00 00 00 00   1...t.x.t.......
    A8 00 00 00   28 00 01 00   00 00 00 00   10 01 00 00   ¨...(...........
    10 01 00 00   02 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   A9 D3 A4 C3   27 DD D2 01   ........©Ó¤Ã'ÝÒ.
    5F A0 58 F3   27 DD D2 01   5F A0 58 F3   27 DD D2 01   _ Xó'ÝÒ._ Xó'ÝÒ.
    A9 D3 A4 C3   27 DD D2 01   20 00 00 00   00 00 00 00   ©Ó¤Ã'ÝÒ. .......
    00 06 00 00   00 00 00 00   03 00 00 00   00 00 00 00   ................
    5C 9A 07 AC   01 00 00 00   19 00 00 00   00 00 00 00   \..¬............
    00 00 01 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   00 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   01 00 00 00   00 00 00 00   ................
    00 00 00 00   00 00 00 00   20 00 00 00   A0 01 00 00   ........ ... ...
    D4 00 00 00   00 02 00 00   74 02 00 00   01 00 00 00   Ô.......t.......
    78 02 00 00   00 00 00 00 ...(cutoff)                   x.......

Here we see the Record parameters given by the first row:

  • total length - 4 bytes = 0x440
  • key offset - 2 bytes = 0x10
  • key length - 2 bytes = 0x1A
  • flags / identifer - 2 bytes = 0x08
  • value offset - 2 bytes = 0x30
  • value length - 2 bytes = 0x410

Naturally, the Record finishes after the value, 0x410 bytes after the value start at 0x30, or 0x440 bytes after the start of the Record (which lines up with the total length).

We also see that this Record corresponds to a file I created on disk as the key is the File Metadata flag (0x10030) followed by the filename (mofile1.txt).

Here the first attribute in the Record value is the simple attribute we discussed above, containing the file timestamps. The File Reference Attribute List Header follows (more on that below).

From observation Records w/ flag values of '0' or '8' are what we are looking for, while '4' occurs often, this almost always seems to indicate a Historical Record, or a Record that has since been replaced with another.

Since Records are prefixed with their total length, they can be thought of a subclass of Attribute. The following is a Ruby class that uses composition to dispatch record field lookup calls to values in the underlying Attribute:

    class Record
      attr_accessor :attribute

      def initialize(attribute)
        @attribute = attribute

      def key_offset
        @key_offset ||= attribute.words[2]

      def key_length
        @key_length ||= attribute.words[3]

      def flags
        @flags ||= attribute.words[4]

      def value_offset
        @value_offset ||= attribute.words[5]

      def value_length
        @value_offset ||= attribute.words[6]

      def key
        @key ||= begin
          ko, kl, vo, vl = boundries

      def value
        @value ||= begin
          ko, kl, vo, vl = boundries

      def value_pos
        attribute.pos + value_offset

      def key_pos
        attribute.pos + key_offset
    end # class Record

- AttributeList - These are more complicated but interesting. At first glance they are simple Attributes of length 0x20 but upon further inspection we consistently see it contains the length of a large block of Attributes (this length is inclusive, as it contains this first one). After parsing this Attribute, dubbed the 'List Header', we should read the remaining bytes in the List as well as the padding, before arriving at the next Attribute

   20 00 00 00   A0 01 00 00   D4 00 00 00   00 02 00 00 <- list header specifying total length (0x1A0) and padding (0xD4)
   74 02 00 00   01 00 00 00   78 02 00 00   00 00 00 00
   80 01 00 00   10 00 0E 00   08 00 20 00   60 01 00 00
   60 01 00 00   00 00 00 00   80 00 00 00   00 00 00 00
   88 00 00 00  ... (cutoff)

Here we see an Attribute of 0x20 length, that contains a reference to a larger block size (0x1A0) in its third word.

This can be confirmed by the next Attribute whose size (0x180) is the larger block size minute the length of the header (0x1A0 - 0x20). In this case the list only contains one item/child attribute.

In general a simple strategy to parse the entire case would be to:

  • Parse Attributes individually as normal
  • If we encounter a List Header Attribute, we calculate the size of the list (total length minus header length)
  • Then continue parsing Attributes, adding them to the list until the total length is completed.

It also seems that:

  • the padding that occurs after the list is given by header word number 5 (in this case 0xD4). After the list is parsed, we consistently see this many null bytes before the next Attribute begins (which is not part of & unrelated to the list).
  • the type of list is given by its 7th word; directory contents correspond to 0x200 while directory branches are indicated with 0x301

Here is a class that represents an AttributeList header attribute by encapsulating it in a similar manner to Record above:

    class AttributeListHeader
      attr_accessor :attribute

      def initialize(attr)
        @attribute = attr

      # From my observations this is always 0x20
      def len
        @len ||= attribute.dwords[0]

      def total_len
        @total_len ||= attribute.dwords[1]

      def body_len
        @body_len ||= total_len - len

      def padding
        @padding ||= attribute.dwords[2]

      def type
        @type ||= attribute.dwords[3]

      def end_pos
        @end_pos ||= attribute.dwords[4]

      def flags
        @flags ||= attribute.dwords[5]

      def next_pos
        @next_pos ||= attribute.dwords[6]

Here is a method to parse the actual Attribute List assuming the image read position is set to the beginning of the List Header

    def read_attribute_list
      header        = Header.new(read_attr)
      remaining_len = header.body_len
      orig_pos      = img.pos
      bytes         = img.read remaining_len
      img.seek orig_pos

      attributes = []

      until remaining_len == 0
        attributes    << read_attr
        remaining_len -= attributes.last.len

      img.seek orig_pos - header.len + header.end_pos

      AttributeList.new :header     => header,
                        :pos        => orig_pos,
                        :bytes      => bytes,
                        :attributes => attributes

Now we have most of what is needed to locate and parse individual files, but there are a few missing components including:

- Directory Tree Branches: These are Attribute Lists where each Attribute corresponds to a record whose value references a page which contains more directory contents.

Upon encountering an AttributeList header with flag value 0x301, we should

  • iterate over the Attributes in the list,
  • parse them as Records,
  • use the first dword in each value as the page to repeat the directory traversal process (recursively).

Additional files and subdirs found on the referenced pages should be appended to the list of current directory contents.

Note this is the (an?) implementation of the BTree structure in the ReFS filesystem described by Microsoft, as the record keys contain the tree leaf identifiers (based on file and subdirectory names).

This can be used for quick / efficient file and subdir lookup by name (see 'optimization' in 'next steps' below)

- SubDirectories: these are simply Records in the directory's Attribute List whose key contains the Directory Metadata flag (0x20030) as well as the subdir name.

The value of this Record is the corresponding object id which can be used to lookup the page containing the subdir in the object table.

A typical subdirectory Record

    70 00 00 00  10 00 12 00  00 00 28 00  48 00 00 00  
    30 00 02 00  73 00 75 00  62 00 64 00  69 00 72 00  <- here we see the key containing the flag (30 00 02 00) followed by the dir name ("subdir2")
    32 00 00 00  00 00 00 00  03 07 00 00  00 00 00 00  <- here we see the object id as the first qword in the value (0x730)
    00 00 00 00  00 00 00 00  14 69 60 05  28 dd d2 01  <- here we see the directory timestamps (more on those below)
    cc 87 ce 52  28 dd d2 01  cc 87 ce 52  28 dd d2 01  
    cc 87 ce 52  28 dd d2 01  00 00 00 00  00 00 00 00  
    00 00 00 00  00 00 00 00  00 00 00 10  00 00 00 00

- Files: like directories are Records whose key contains a flag (0x10030) followed by the filename.

The value is far more complicated though and while we've discovered some basic Attributes allowing us to pull timestamps and content from the fs, there is still more to be deduced as far as the semantics of this Record's value.

- The File Record value consists of multiple attributes, though they just appear one after each other, without a List Header. We can still parse them sequentially given that all Attributes are individually prefixed with their lengths and the File Record value length gives us the total size of the block.

- The first attribute contains 4 file timestamps at an offset given by the fifth byte of the attribute (though this position may be coincidental an the timestamps could just reside at a fixed location in this attribute).

In the first attribute example above we see the first timestamp is

       a9 d3 a4 c3  27 dd d2 01

This corresponds to the following date

        2017-06-04 07:43:20

And may be converted with the following algorithm:

          tsi = TIMESTAMP_BYTES.pack("C*").unpack("Q*").first
          Time.at(tsi / 10000000 - 11644473600)

Timestamps being in nanoseconds since the Windows Epoch Data (11644473600 = Jan 1, 1601 UTC)

- The second Attribute seems to be the Header of an Attribute List containing the 'File Reference' semantics. These are the Attributes that encapsulate the file length and content pointers.

I'm assuming this is an Attribute List so as to contain many of these types of Attributes for large files. What is not apparent are the full semantics of all of these fields.

But here is where it gets complicated, this List only contains a single attribute with a few child Attributes. This encapsulation seems to be in the same manner as the Attributes stored in the File Record value above, just a simple sequential collection without a Header.

In this single attribute (dubbed the 'File Reference Body') the first Attribute contains the length of the file while the second is the Header for yet another List, this one containing a Record whose value contains a reference to the page which the file contents actually reside.

      | ...                                  |
      | File Entry Record                    |
      | Key: 0x10030 [FileName]              |
      | Value:                               |
      | Attribute1: Timestamps               |
      | Attribute2:                          |
      |   File Reference List Header         |
      |   File Reference List Body(Record)   |
      |     Record Key: ?                    |
      |     Record Value:                    |
      |       File Length Attribute          |
      |       File Content List Header       |
      |       File Content Record(s)         |
      | Padding                              |
      | ...                                  |

While complicated each level can be parsed in a similar manner to all other Attributes & Records, just taking care to parse Attributes into their correct levels & structures.

As far as actual values,

  • the file length is always seen at a fixed offset within its attribute (0x3c) and
  • the content pointer seems to always reside in the second qword of the Record value. This pointer is simply a reference to the page which the file contents can be read verbatim.


And that's it! An example implementation of all this logic can be seen in our expiremental 'resilience' library found here:


The next steps would be to

  • expand upon the data structures above (verify that we have interpreted the existing structures correctly)
  • deduce full Attribute and Record semantics so as to be able to consistently parse files of any given length, with any given number of modifications out of the file system

And once we have done so robustly, we can start looking at optimization, possibly rolling out some expiremental production logic for ReFS filesystem support!

... Cha-ching $ £ ¥ ¢ ₣ ₩ !!!!

Extracting Formally Verified C with FStar and KreMLin

Posted by William Brown on April 29, 2018 02:00 PM

Extracting Formally Verified C with FStar and KreMLin

As software engineering has progressed, the correctness of software has become a major issue. However the tools that exist today to help us create correct programs have been lacking. Human’s continue to make mistakes that cause harm to others (even I do!).

Instead, tools have now been developed that allow the verification of programs and algorithms as correct. These have not gained widespread adoption due to the complexities of their tool chains or other social and cultural issues.

The Project Everest research has aimed to create a formally verified webserver and cryptography library. To achieve this they have developed a language called F* (FStar) and KreMLin as an extraction tool. This allows an FStar verified algorithm to be extracted to a working set of C source code - C source code that can be easily added to existing projects.

Setting up a FStar and KreMLin environment

Today there are a number of undocumented gotchas with opam - the OCaml package manager. Most of these are silent errors. I used the following steps to get to a working environment.

# It's important to have bzip2 here else opam silently fails!
dnf install -y rsync git patch opam bzip2 which gmp gmp-devel m4 \
        hg unzip pkgconfig redhat-rpm-config

opam init
# You need 4.02.3 else wasm will not compile.
opam switch create 4.02.3
opam switch 4.02.3
echo ". /home/Work/.opam/opam-init/init.sh > /dev/null 2> /dev/null || true" >> .bashrc
opam install batteries fileutils yojson ppx_deriving_yojson zarith fix pprint menhir process stdint ulex wasm

PATH "~/z3/bin:~/FStar/bin:~/kremlin:$PATH"
# You can get the "correct" z3 version from https://github.com/FStarLang/binaries/raw/master/z3-tested/z3-
unzip z3-
mv z3- z3

# You will need a "stable" release of FStar https://github.com/FStarLang/FStar/archive/stable.zip
unzip stable.zip
mv FStar-stable Fstar
cd ~/FStar
opam config exec -- make -C src/ocaml-output -j
opam config exec -- make -C ulib/ml

# You need a master branch of kremlin https://github.com/FStarLang/kremlin/archive/master.zip
cd ~
unzip master.zip
mv kremlin-master kremlin
cd kremlin
opam config exec -- make
opam config exec -- make test

Your first FStar extraction

Summer, Code and Fedora

Posted by Amitosh Swain Mahapatra on April 29, 2018 10:41 AM

I have been selected for Google Summer of Code (GSoC) 2018. I’ll be working on developing the backend for the Fedora Community App.

Google Summer of Code

Fedora Project

Fedora has an android app which lets a user browse Fedora Magazine, Ask Fedora, FedoCal etc within it. This app is build using the Ionic Framework, Angular and Cordova. Essentially it is a cross-platform hybrid app.

Project Description

In the current form, most of the functions in the app rely on an in-app browser to render content. This project aims to improve the existing Fedora App for Android for speed, utility and responsiveness, introduce a deeper native integration and make the app more personal for the user.

During the GSoC period I aim for the following deliverables:

  1. Refactored, and restructured code for the android app, providing native experience to the fullest.
  2. Deeper integration with system as well as various Fedora Infra apps.
  3. An Ionic hybrid app that is publishable to app stores like Google Play Store and F-Droid
  4. Providing an offline first experience – caching data from Fedora Magazine and Social and optionally, making some of the content accessible offline.

What would I be working on

The Fedora android app is developed using the Ionic framework and Cordova. It is a framework for developing cross-platform mobile apps, also called as hybrid apps using HTML, CSS and JavaScript.

Here are some of the features I plan to integrate this summer:

  1. Offine data for posts and calendar.
  2. Syncing system calendar with events from FedoCal.
  3. Fedora package search.
  4. Bookmarks and offline reading.
  5. FMN notifications.

About Google Summer of Code

Google Summer of Code (GSoC) is a yearly program by Google to help the open source communities to reach out to student contributors. Organisations pitch projects, and when selected, pick up university students to work on these projects or their own ideas related to the organisation’s project(s).

This is my second GSoC participation. Last summer (2017), I worked with the FOSSi Foundation for bringing code quality metrics for projects listed under LibreCores.org.

While, the summer this time is really really hot (I mean 40°C/104°F in April!), it is going to be interesting!

AD directory admins group setup

Posted by William Brown on April 25, 2018 02:00 PM

AD directory admins group setup

Recently I have been reading many of the Microsoft Active Directory best practices for security and hardening. These are great resources, and very well written. The major theme of the articles is “least privilege”, where accounts like Administrators or Domain Admins are over used and lead to further compromise.

A suggestion that is put forward by the author is to have a group that has no other permissions but to manage the directory service. This should be used to temporarily make a user an admin, then after a period of time they should be removed from the group.

This way you have no Administrators or Domain Admins, but you have an AD only group that can temporarily grant these permissions when required.

I want to explore how to create this and configure the correct access controls to enable this scheme.

Create our group

First, lets create a “Directory Admins” group which will contain our members that have the rights to modify or grant other privileges.

# /usr/local/samba/bin/samba-tool group add 'Directory Admins'
Added group Directory Admins

It’s a really good idea to add this to the “Denied RODC Password Replication Group” to limit the risk of these accounts being compromised during an attack. Additionally, you probably want to make your “admin storage” group also a member of this, but I’ll leave that to you.

# /usr/local/samba/bin/samba-tool group addmembers "Denied RODC Password Replication Group" "Directory Admins"

Now that we have this, lets add a member to it. I strongly advise you create special accounts just for the purpose of directory administration - don’t use your daily account for this!

# /usr/local/samba/bin/samba-tool user create da_william
User 'da_william' created successfully
# /usr/local/samba/bin/samba-tool group addmembers 'Directory Admins' da_william
Added members to group Directory Admins

Configure the permissions

Now we need to configure the correct dsacls to allow Directory Admins full control over directory objects. It could be possible to constrain this to only modification of the cn=builtin and cn=users container however, as directory admins might not need so much control for things like dns modification.

If you want to constrain these permissions, only apply the following to cn=builtins instead - or even just the target groups like Domain Admins.

First we need the objectSID of our Directory Admins group so we can build the ACE.

# /usr/local/samba/bin/samba-tool group show 'directory admins' --attributes=cn,objectsid
dn: CN=Directory Admins,CN=Users,DC=adt,DC=blackhats,DC=net,DC=au
cn: Directory Admins
objectSid: S-1-5-21-2488910578-3334016764-1009705076-1104

Now with this we can construct the ACE.


This permission grants:

  • RP: read property
  • WP: write property
  • LC: list child objects
  • LO: list objects
  • RC: read control

It could be possible to expand these rights: it depends if you want directory admins to be able to do “day to day” ad control jobs, or if you just use them for granting of privileges. That’s up to you. An expanded ACE might be:

# Same as Enterprise Admins

Now lets actually apply this and do a test:

# /usr/local/samba/bin/samba-tool dsacl set --sddl='(A;CI;RPWPLCLORC;;;S-1-5-21-2488910578-3334016764-1009705076-1104)' --objectdn='dc=adt,dc=blackhats,dc=net,dc=au'
# /usr/local/samba/bin/samba-tool group addmembers 'directory admins' administrator -U 'da_william%...'
Added members to group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'
# /usr/local/samba/bin/samba-tool group removemembers 'directory admins' -U 'da_william%...'
Removed members from group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'

It works!


With these steps we have created a secure account that has limited admin rights, able to temporarily promote users with privileges for administrative work - and able to remove it once the work is complete.

Understanding AD Access Control Entries

Posted by William Brown on April 19, 2018 02:00 PM

Understanding AD Access Control Entries

A few days ago I set out to work on making samba 4 my default LDAP server. In the process I was forced to learn about Active Directory Access controls. I found that while there was significant documentation around the syntax of these structures, very little existed explaining how to use them effectively.

What’s in an ACE?

If you look at the the ACL of an entry in AD you’ll see something like:


This seems very confusing and complex (and someone should write a tool to explain these … maybe me). But once you can see the structure it starts to make sense.

Most of the access controls you are viewing here are DACLs or Discrestionary Access Control Lists. These make up the majority of the output after ‘O:DAG:DAD:AI’. TODO: What does ‘O:DAG:DAD:AI’ mean completely?

After that there are many ACEs defined in SDDL or ???. The structure is as follows:


Each of these fields can take varies types. These interact to form the access control rules that allow or deny access. Thankfully, you don’t need to adjust many fields to make useful ACE entries.

MS maintains a document of these field values here.

They also maintain a list of wellknown SID values here

I want to cover some common values you may see though:


Most of the types you’ll see are “A” and “OA”. These mean the ACE allows an access by the SID.


These change the behaviour of the ACE. Common values you may want to set are CI and OI. These determine that the ACE should be inherited to child objects. As far as the MS docs say, these behave the same way.

If you see ID in this field it means the ACE has been inherited from a parent object. In this case the inherit_object_guid field will be set to the guid of the parent that set the ACE. This is great, as it allows you to backtrace the origin of access controls!


This is the important part of the ACE - it determines what access the SID has over this object. The MS docs are very comprehensive of what this does, but common values are:

  • RP: read property
  • WP: write property
  • CR: control rights
  • CC: child create (create new objects)
  • DC: delete child
  • LC: list child objects
  • LO: list objects
  • RC: read control
  • WO: write owner (change the owner of an object)
  • WD: write dac (allow writing ACE)
  • SW: self write
  • SD: standard delete
  • DT: delete tree

I’m not 100% sure of all the subtle behaviours of these, because they are not documented that well. If someone can help explain these to me, it would be great.


We will skip some fields and go straight to SID. This is the SID of the object that is allowed the rights from the rights field. This field can take a GUID of the object, or it can take a “well known” value of the SID. For example ‘AN’ means “anonymous users”, or ‘AU’ meaning authenticated users.


I won’t claim to be an AD ACE expert, but I did find the docs hard to interpret at first. Having a breakdown and explanation of the behaviour of the fields can help others, and I really want to hear from people who know more about this topic on me so that I can expand this resource to help others really understand how AD ACE’s work.

Making Samba 4 the default LDAP server

Posted by William Brown on April 17, 2018 02:00 PM

Making Samba 4 the default LDAP server

Earlier this year Andrew Bartlett set me the challenge: how could we make Samba 4 the default LDAP server in use for Linux and UNIX systems? I’ve finally decided to tackle this, and write up some simple changes we can make, and decide on some long term goals to make this a reality.

What makes a unix directory anyway?

Great question - this is such a broad topic, even I don’t know if I can single out what it means. For the purposes of this exercise I’ll treat it as “what would we need from my previous workplace”. My previous workplace had a dedicated set of 389 Directory Server machines that served lookups mainly for email routing, application authentication and more. The didn’t really process a great deal of login traffic as the majority of the workstations were Windows - thus connected to AD.

What it did show was that Linux clients and applications:

  • Want to use anonymous binds and searchs - Applications and clients are NOT domain members - they just want to do searches
  • The content of anonymous lookups should be “public safe” information. (IE nothing private)
  • LDAPS is a must for binds
  • MemberOf and group filtering is very important for access control
  • sshPublicKey and userCertificate;binary is important for 2fa/secure logins

This seems like a pretty simple list - but it’s not the model Samba 4 or AD ship with.

You’ll also want to harden a few default settings. These include:

  • Disable Guest
  • Disable 10 machine join policy

AD works under the assumption that all clients are authenticated via kerberos, and that kerberos is the primary authentication and trust provider. As a result, AD often ships with:

  • Disabled anonymous binds - All clients are domain members or service accounts
  • No anonymous content available to search
  • No LDAPS (GSSAPI is used instead)
  • no sshPublicKey or userCertificates (pkinit instead via krb)
  • Access control is much more complex topic than just “matching an ldap filter”.

As a result, it takes a bit of effort to change Samba 4 to work in a way that suits both, securely.

Isn’t anonymous binding insecure?

Let’s get this one out the way - no it’s not. In every pen test I have seen if you can get access to a domain joined machine, you probably have a good chance of taking over the domain in various ways. Domain joined systems and krb allows lateral movement and other issues that are beyond the scope of this document.

The lack of anonymous lookup is more about preventing information disclosure - security via obscurity. But it doesn’t take long to realise that this is trivially defeated (get one user account, guest account, domain member and you can search …).

As a result, in some cases it may be better to allow anonymous lookups because then you don’t have spurious service accounts, you have a clear understanding of what is and is not accessible as readable data, and you don’t need every machine on the network to be domain joined - you prevent a possible foothold of lateral movement.

So anonymous binding is just fine, as the unix world has shown for a long time. That’s why I have very few concerns about enabling it. Your safety is in the access controls for searches, not in blocking anonymous reads outright.

Installing your DC

As I run fedora, you will need to build and install samba for source so you can access the heimdal kerberos functions. Fedora’s samba 4 ships ADDC support now, but lacks some features like RODC that you may want. In the future I expect this will change though.

These documents will help guide you:


build steps

install a domain

I strongly advise you use options similar to:

/usr/local/samba/bin/samba-tool domain provision --server-role=dc --use-rfc2307 --dns-backend=SAMBA_INTERNAL --realm=SAMDOM.EXAMPLE.COM --domain=SAMDOM --adminpass=Passw0rd

Allow anonymous binds and searches

Now that you have a working domain controller, we should test you have working ldap:

/usr/local/samba/bin/samba-tool forest directory_service dsheuristics 0000002 -H ldaps://localhost --simple-bind-dn='administrator@samdom.example.com'
ldapsearch -b DC=samdom,DC=example,DC=com -H ldaps://localhost -x

You can see the domain object but nothing else. Many other blogs and sites recommend a blanket “anonymous read all” access control, but I think that’s too broad. A better approach is to add the anonymous read to only the few containers that require it.

/usr/local/samba/bin/samba-tool dsacl set --objectdn=DC=samdom,DC=example,DC=com --sddl='(A;;RPLCLORC;;;AN)' --simple-bind-dn="administrator@samdom.example.com" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Users,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="administrator@samdom.example.com" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Builtin,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="administrator@samdom.example.com" --password=Passw0rd

In AD groups and users are found in cn=users, and some groups are in cn=builtin. So we allow read to the root domain object, then we set a read on cn=users and cn=builtin that inherits to it’s child objects. The attribute policies are derived elsewhere, so we can assume that things like kerberos data and password material are safe with these simple changes.

Configuring LDAPS

This is a reasonable simple exercise. Given a ca cert, key and cert we can place these in the correct locations samba expects. By default this is the private directory. In a custom install, that’s /usr/local/samba/private/tls/, but for distros I think it’s /var/lib/samba/private. Simply replace ca.pem, cert.pem and key.pem with your files and restart.

Adding schema

To allow adding schema to samba 4 you need to reconfigure the dsdb config on the schema master. To show the current schema master you can use:

/usr/local/samba/bin/samba-tool fsmo show -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au' --password=Password1

Look for the value:

SchemaMasterRole owner: CN=NTDS Settings,CN=LDAPKDC,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au

And note the CN=ldapkdc = that’s the hostname of the current schema master.

On the schema master we need to adjust the smb.conf. The change you need to make is:

    dsdb:schema update allowed = yes

Now restart the instance and we can update the schema. The following LDIF should work if you replace ${DOMAINDN} with your namingContext. You can apply it with ldapmodify

dn: CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
cn: sshPublicKey
name: sshPublicKey
lDAPDisplayName: sshPublicKey
description: MANDATORY: OpenSSH Public key
oMSyntax: 4
isSingleValued: FALSE
searchFlags: 8

dn: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
cn: ldapPublicKey
name: ldapPublicKey
description: MANDATORY: OpenSSH LPK objectclass
lDAPDisplayName: ldapPublicKey
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: sshPublicKey

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
sudo ldapmodify -f sshpubkey.ldif -D 'administrator@adt.blackhats.net.au' -w Password1 -H ldaps://localhost
adding new entry "CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

adding new entry "CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

modifying entry "CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

To my surprise, userCertificate already exists! The reason I missed it is a subtle ad schema behaviour I missed. The ldap attribute name is stored in the lDAPDisplayName and may not be the same as the CN of the schema element. As a result, you can find this with:

ldapsearch -H ldaps://localhost -b CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au -x -D 'administrator@adt.blackhats.net.au' -W '(attributeId='

This doesn’t solve my issues: Because I am a long time user of 389-ds, that means I need some ns compat attributes. Here I add the nsUniqueId value so that I can keep some compatability.

dn: CN=nsUniqueId,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
attributeID: 2.16.840.1.113730.3.1.542
cn: nsUniqueId
name: nsUniqueId
lDAPDisplayName: nsUniqueId
description: MANDATORY: nsUniqueId compatability
oMSyntax: 4
isSingleValued: TRUE
searchFlags: 9

dn: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
governsID: 2.16.840.1.113730.3.2.334
cn: nsOrgPerson
name: nsOrgPerson
description: MANDATORY: Netscape DS compat person
lDAPDisplayName: nsOrgPerson
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: nsUniqueId

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
auxiliaryClass: nsOrgPerson

Now with this you can extend your users with the required data for SSH, certificates and maybe 389-ds compatability.

/usr/local/samba/bin/samba-tool user edit william  -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au'


Out of the box a number of the unix attributes are not indexed by Active Directory. To fix this you need to update the search flags in the schema.

Again, temporarily allow changes:

    dsdb:schema update allowed = yes

Now we need to add some indexes for common types. Note that in the nsUniqueId schema I already added the search flags. We also want to set that these values should be preserved if they become tombstones so we can recove them.

/usr/local/samba/bin/samba-tool schema attribute modify uid --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify nsUniqueId --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify uidnumber --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify gidnumber --searchflags=9
# Preserve on tombstone but don't index
/usr/local/samba/bin/samba-tool schema attribute modify x509-cert --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify sshPublicKey --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify gecos --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify loginShell --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify home-directory --searchflags=24

AD Hardening

We want to harden a few default settings that could be considered insecure. First, let’s stop “any user from being able to domain join machines”.

/usr/local/samba/bin/samba-tool domain settings account_machine_join_quota 0 -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au'

Now let’s disable the Guest account

/usr/local/samba/bin/samba-tool user disable Guest -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au'

I plan to write a more complete samba-tool extension for auditing these and more options, so stay tuned!

SSSD configuration

Now that our directory service is configured, we need to configure our clients to utilise it correctly.

Here is my SSSD configuration, that supports sshPublicKey distribution, userCertificate authentication on workstations and SID -> uid mapping. In the future I want to explore sudo rules in LDAP with AD, and maybe even HBAC rules rather than GPO.

Please refer to my other blog posts on configuration of the userCertificates and sshKey distribution.

ignore_group_members = False

# There is a bug in SSSD where this actually means "ipv6 only".
# lookup_family_order=ipv6_first
cache_credentials = True
id_provider = ldap
auth_provider = ldap
access_provider = ldap
chpass_provider = ldap
ldap_search_base = dc=blackhats,dc=net,dc=au

# This prevents an infinite referral loop.
ldap_referrals = False
ldap_id_mapping = True
ldap_schema = ad
# Rather that being in domain users group, create a user private group
# automatically on login.
# This is very important as a security setting on unix!!!
# See this bug if it doesn't work correctly.
# https://pagure.io/SSSD/sssd/issue/3723
auto_private_groups = true

ldap_uri = ldaps://ad.blackhats.net.au
ldap_tls_reqcert = demand
ldap_tls_cacert = /etc/pki/tls/certs/bh_ldap.crt

# Workstation access
ldap_access_filter = (memberOf=CN=Workstation Operators,CN=Users,DC=blackhats,DC=net,DC=au)

ldap_user_member_of = memberof
ldap_user_gecos = cn
ldap_user_uuid = objectGUID
ldap_group_uuid = objectGUID
# This is really important as it allows SSSD to respect nsAccountLock
ldap_account_expire_policy = ad
ldap_access_order = filter, expire
# Setup for ssh keys
ldap_user_ssh_public_key = sshPublicKey
# This does not require ;binary tag with AD.
ldap_user_certificate = userCertificate
# This is required for the homeDirectory to be looked up in the sssd schema
ldap_user_home_directory = homeDirectory

services = nss, pam, ssh, sudo
config_file_version = 2
certificate_verification = no_verification

domains = blackhats.net.au
homedir_substring = /home

pam_cert_auth = True







With these simple changes we can easily make samba 4 able to perform the roles of other unix focused LDAP servers. This allows stateless clients, secure ssh key authentication, certificate authentication and more.

Some future goals to improve this include:

  • Ship samba 4 with schema templates that can be used
  • Schema querying (what objectclass takes this attribute?)
  • Group editing (same as samba-tool user edit)
  • Security auditing tools
  • user/group modification commands
  • Refactor and improve the cli tools python to be api driven - move the logic from netcmd into samdb so that samdb can be an API that python can consume easier. Prevent duplication of logic.

The goal is so that an admin never has to see an LDIF ever again.

Smartcards and You - How To Make Them Work on Fedora/RHEL

Posted by William Brown on February 26, 2018 01:00 PM

Smartcards and You - How To Make Them Work on Fedora/RHEL

Smartcards are a great way to authenticate users. They have a device (something you have) and a pin (something you know). They prevent password transmission, use strong crypto and they even come in a variety of formats. From your “card” shapes to yubikeys.

So why aren’t they used more? It’s the classic issue of usability - the setup for them is undocumented, complex, and hard to discover. Today I hope to change this.

The Goal

To authenticate a user with a smartcard to a physical linux system, backed onto LDAP. The public cert in LDAP is validated, as is the chain to the CA.

You Will Need

I’ll be focusing on the yubikey because that’s what I own.

Preparing the Smartcard

First we need to make the smartcard hold our certificate. Because of a crypto issue in yubikey firmware, it’s best to generate certificates for these externally.

I’ve documented this before in another post, but for accesibility here it is again.

Create an NSS DB, and generate a certificate signing request:

certutil -d . -N -f pwdfile.txt
certutil -d . -R -a -o user.csr -f pwdfile.txt -g 4096 -Z SHA256 -v 24 \
--keyUsage digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment --nsCertType sslClient --extKeyUsage clientAuth \
-s "CN=username,O=Testing,L=example,ST=Queensland,C=AU"

Once the request is signed, and your certificate is in “user.crt”, import this to the database.

certutil -A -d . -f pwdfile.txt -i user.crt -a -n TLS -t ",,"
certutil -A -d . -f pwdfile.txt -i ca.crt -a -n TLS -t "CT,,"

Now export that as a p12 bundle for the yubikey to import.

pk12util -o user.p12 -d . -k pwdfile.txt -n TLS

Now import this to the yubikey - remember to use slot 9a this time! As well make sure you set the touch policy NOW, because you can’t change it later!

yubico-piv-tool -s9a -i user.p12 -K PKCS12 -aimport-key -aimport-certificate -k --touch-policy=always

Setting up your LDAP user

First setup your system to work with LDAP via SSSD. You’ve done that? Good! Now it’s time to get our user ready.

Take our user.crt and convert it to DER:

openssl x509 -inform PEM -outform DER -in user.crt -out user.der

Now you need to transform that into something that LDAP can understand. In the future I’ll be adding a tool to 389-ds to make this “automatic”, but for now you can use python:

>>> import base64
>>> with open('user.der', 'r') as f:
>>>    print(base64.b64encode(f.read))

That should output a long base64 string on one line. Add this to your ldap user with ldapvi:

userCertificate;binary:: <BASE64>

Note that ‘;binary’ tag has an important meaning here for certificate data, and the ‘::’ tells ldap that this is b64 encoded, so it will decode on addition.

Setting up the system

Now that you have done that, you need to teach SSSD how to intepret that attribute.

In your various SSSD sections you’ll need to make the following changes:

auth_provider = ldap
ldap_user_certificate = userCertificate;binary

# This controls OCSP checks, you probably want this enabled!
# certificate_verification = no_verification

pam_cert_auth = True

Now the TRICK is letting SSSD know to use certificates. You need to run:

sudo touch /var/lib/sss/pubconf/pam_preauth_available

With out this, SSSD won’t even try to process CCID authentication!

Add your ca.crt to the system trusted CA store for SSSD to verify:

certutil -A -d /etc/pki/nssdb -i ca.crt -n USER_CA -t "CT,,"

Add coolkey to the database so it can find smartcards:

modutil -dbdir /etc/pki/nssdb -add "coolkey" -libfile /usr/lib64/libcoolkeypk11.so

Check that SSSD can find the certs now:

# sudo /usr/libexec/sssd/p11_child --pre --nssdb=/etc/pki/nssdb
PIN for william
CAC ID Certificate

If you get no output here you are missing something! If this doesn’t work, nothing will!

Finally, you need to tweak PAM to make sure that pam_unix isn’t getting in the way. I use the following configuration.

auth        required      pam_env.so
# This skips pam_unix if the given uid is not local (IE it's from SSSD)
auth        [default=1 ignore=ignore success=ok] pam_localuser.so
auth        sufficient    pam_unix.so nullok try_first_pass
auth        requisite     pam_succeed_if.so uid >= 1000 quiet_success
auth        sufficient    pam_sss.so prompt_always ignore_unknown_user
auth        required      pam_deny.so

account     required      pam_unix.so
account     sufficient    pam_localuser.so
account     sufficient    pam_succeed_if.so uid < 1000 quiet
account     [default=bad success=ok user_unknown=ignore] pam_sss.so
account     required      pam_permit.so

password    requisite     pam_pwquality.so try_first_pass local_users_only retry=3 authtok_type=
password    sufficient    pam_unix.so sha512 shadow try_first_pass use_authtok
password    sufficient    pam_sss.so use_authtok
password    required      pam_deny.so

session     optional      pam_keyinit.so revoke
session     required      pam_limits.so
-session    optional      pam_systemd.so
session     [success=1 default=ignore] pam_succeed_if.so service in crond quiet use_uid
session     required      pam_unix.so
session     optional      pam_sss.so

That’s it! Restart SSSD, and you should be good to go.

Finally, you may find SELinux isn’t allowing authentication. This is really sad that smartcards don’t work with SELinux out of the box and I have raised a number of bugs, but check this just in case.

Happy authentication!

Using b43 firmware on Fedora Atomic Workstation

Posted by William Brown on December 22, 2017 01:00 PM

Using b43 firmware on Fedora Atomic Workstation

My Macbook Pro has a broadcom b43 wireless chipset. This is notorious for being one of the most annoying wireless adapters on linux. When you first install Fedora you don’t even see “wifi” as an option, and unless you poke around in dmesg, you won’t find how to enable b43 to work on your platform.


The b43 driver requires proprietary firmware to be loaded else the wifi chip will not run. There are a number of steps for this process found on the linux wireless page . You’ll note that one of the steps is:

export FIRMWARE_INSTALL_DIR="/lib/firmware"
sudo b43-fwcutter -w "$FIRMWARE_INSTALL_DIR" broadcom-wl-5.100.138/linux/wl_apsta.o

So we need to be able to write and extract our firmware to /usr/lib/firmware, and then reboot and out wifi works.

Fedora Atomic Workstation

Atomic WS is similar to atomic server, that it’s a read-only ostree based deployment of fedora. This comes with a number of unique challenges and quirks but for this issue:

sudo touch /usr/lib/firmware/test
/bin/touch: cannot touch '/usr/lib/firmware/test': Read-only file system

So we can’t extract our firmware!

Normally linux also supports reading from /usr/local/lib/firmware (which on atomic IS writeable …) but for some reason fedora doesn’t allow this path.

Solution: Layered RPMs

Atomic has support for “rpm layering”. Ontop of the ostree image (which is composed of rpms) you can supply a supplemental list of packages that are “installed” at rpm-ostree update time.

This way you still have an atomic base platform, with read-only behaviours, but you gain the ability to customise your system. To achive it, it must be possible to write to locations in /usr during rpm install.

New method - install rpmfusion tainted

As I have now learnt, the b43 data is provided as part of rpmfusion nonfree. To enable this, you need to access the tainted repo. I a file such as “/etc/yum.repos.d/rpmfusion-nonfree-tainted.repo” add the content:


Now, you should be able to run:

atomic host install b43-firmware

You should have a working wifi chipset!

Custom RPM - old method

This means our problem has a simple solution: Create a b43 rpm package. Note, that you can make this for yourself privately, but you can’t distribute it for legal reasons.

Get setup on atomic to build the packages:

rpm-ostree install rpm-build createrepo

RPM specfile:

%define debug_package %{nil}
Summary: Allow b43 fw to install on ostree installs due to bz1512452
Name: b43-fw
Version: 1.0.0
Release: 1
URL: http://linuxwireless.sipsolutions.net/en/users/Drivers/b43/
Group: System Environment/Kernel

BuildRequires: b43-fwcutter

Source0: http://www.lwfinger.com/b43-firmware/broadcom-wl-5.100.138.tar.bz2

Broadcom firmware for b43 chips.

%setup -q -n broadcom-wl-5.100.138


mkdir -p %{buildroot}/usr/lib/firmware
b43-fwcutter -w %{buildroot}/usr/lib/firmware linux/wl_apsta.o

%dir %{_prefix}/lib/firmware/b43

* Fri Dec 22 2017 William Brown <william at blackhats.net.au> - 1.0.0
- Initial version

Now you can put this into a folder like so:

mkdir -p ~/rpmbuild/{SPECS,SOURCES}
<editor> ~/rpmbuild/SPECS/b43-fw.spec
wget -O ~/rpmbuild/SOURCES/broadcom-wl-5.100.138.tar.bz2 http://www.lwfinger.com/b43-firmware/broadcom-wl-5.100.138.tar.bz2

We are now ready to build!

rpmbuild -bb ~/rpmbuild/SPECS/b43-fw.spec
createrepo ~/rpmbuild/RPMS/x86_64/

Finally, we can install this. Create a yum repos file:

baseurl=file:///home/<YOUR USERNAME HERE>/rpmbuild/RPMS/x86_64
rpm-ostree install b43-fw

Now reboot and enjoy wifi on your Fedora Atomic Macbook Pro!

What am I doing ?

Posted by Mayank Jha on December 08, 2017 04:31 AM

Joining club 27 because it’s cool ?

No one knows when will someone die.
My friend died, whom I loved.
Unfortunate and sad.
I could have been one.
I could die the next moment.
I am shit scared now.
What should I do ?
The memory of my friend will stay with me forever till I die.
It’s a scar which will be lifelong.
But I cherish the moments
I had with him.
The wild explorations we had.
Let’s cherish the moments we made.
Sadly you could not stay with us any longer.


Makes us realise how transient human life is.
The one final reality which we all need to face.
The one final breath.
The one final slump into slumber and nothingness.
Complete void.
All that remains is between the start and end.
He said, in the end nothing matters.
But the thing in between is ALL that is.
You came with nothing and would go with nothing.
But in the middle is where the magic called life happens


The emotion called love, which binds us.
Love from our creators.
Love from humans around us.
I don’t know whether or not I’ll end up with a girl.
I don’t know whether I’ll end up with a guy.
I know for sure, that spreading love and passion.
My nerves might never heal.
The itch might never go.
But love shall stay.

Creating yubikey SSH and TLS certificates

Posted by William Brown on November 10, 2017 01:00 PM

Creating yubikey SSH and TLS certificates

Recently yubikeys were shown to have a hardware flaw in the way the generated private keys. This affects the use of them to provide PIV identies or SSH keys.

However, you can generate the keys externally, and load them to the key to prevent this issue.


First, we’ll create a new NSS DB on an airgapped secure machine (with disk encryption or in memory storage!)

certutil -N -d . -f pwdfile.txt

Now into this, we’ll create a self-signed cert valid for 10 years.

certutil -S -f pwdfile.txt -d . -t "C,," -x -n "SSH" -g 2048 -s "cn=william,O=ssh,L=Brisbane,ST=Queensland,C=AU" -v 120

We export this now to PKCS12 for our key to import.

pk12util -o ssh.p12 -d . -k pwdfile.txt -n SSH

Next we import the key and cert to the hardware in slot 9c

yubico-piv-tool -s9c -i ssh.p12 -K PKCS12 -aimport-key -aimport-certificate -k

Finally, we can display the ssh-key from the token.

ssh-keygen -D /usr/lib64/opensc-pkcs11.so -e

Note, we can make this always used by ssh client by adding the following into .ssh/config:

PKCS11Provider /usr/lib64/opensc-pkcs11.so

TLS Identities

The process is almost identical for user certificates.

First, create the request:

certutil -d . -R -a -o user.csr -f pwdfile.txt -g 4096 -Z SHA256 -v 24 \
--keyUsage digitalSignature,nonRepudiation,keyEncipherment,dataEncipherment --nsCertType sslClient --extKeyUsage clientAuth \
-s "CN=username,O=Testing,L=example,ST=Queensland,C=AU"

Once the request is signed, we should have a user.crt back. Import that to our database:

certutil -A -d . -f pwdfile.txt -i user.crt -a -n TLS -t ",,"

Import our CA certificate also. Next export this to p12:

pk12util -o user.p12 -d . -k pwdfile.txt -n TLS

Now import this to the yubikey - remember to use slot 9a this time!

yubico-piv-tool -s9a -i user.p12 -K PKCS12 -aimport-key -aimport-certificate -k --touch-policy=always


What’s the problem with NUMA anyway?

Posted by William Brown on November 06, 2017 01:00 PM

What’s the problem with NUMA anyway?

What is NUMA?

Non-Uniform Memory Architecture is a method of seperating ram and memory management units to be associated with CPU sockets. The reason for this is performance - if multiple sockets shared a MMU, they will cause each other to block, delaying your CPU.

To improve this, each NUMA region has it’s own MMU and RAM associated. If a CPU can access it’s local MMU and RAM, this is very fast, and does not prevent another CPU from accessing it’s own. For example:

CPU 0   <-- QPI --> CPU 1
  |                   |
  v                   v
MMU 0               MMU 1
  |                   |
  v                   v
RAM 1               RAM 2

For example, on the following system, we can see 1 numa region:

# numactl --hardware
available: 1 nodes (0)
node 0 cpus: 0 1 2 3
node 0 size: 12188 MB
node 0 free: 458 MB
node distances:
node   0
  0:  10

On this system, we can see two:

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 24 25 26 27 28 29 30 31 32 33 34 35
node 0 size: 32733 MB
node 0 free: 245 MB
node 1 cpus: 12 13 14 15 16 17 18 19 20 21 22 23 36 37 38 39 40 41 42 43 44 45 46 47
node 1 size: 32767 MB
node 1 free: 22793 MB
node distances:
node   0   1
  0:  10  20
  1:  20  10

This means that on the second system there is 32GB of ram per NUMA region which is accessible, but the system has total 64GB.

The problem

The problem arises when a process running on NUMA region 0 has to access memory from another NUMA region. Because there is no direct connection between CPU 0 and RAM 1, we must communicate with our neighbour CPU 1 to do this for us. IE:

CPU 0 --> CPU 1 --> MMU 1 --> RAM 1

Not only do we pay a time delay price for the QPI communication between CPU 0 and CPU 1, but now CPU 1’s processes are waiting on the MMU 1 because we are retrieving memory on behalf of CPU 0. This is very slow (and can be seen by the node distances in the numactl –hardware output).

Today’s work around

The work around today is to limit your Directory Server instance to a single NUMA region. So for our example above, we would limit the instance to NUMA region 0 or 1, and treat the instance as though it only has access to 32GB of local memory.

It’s possible to run two instances of DS on a single server, pinning them to their own regions and using replication between them to provide synchronisation. You’ll need a load balancer to fix up the TCP port changes, or you need multiple addresses on the system for listening.

The future

In the future, we’ll be adding support for better copy-on-write techniques that allow the cores to better cache content after a QPI negotiation - but we still have to pay the transit cost. We can minimise this as much as possible, but there is no way today to avoid this penalty. To use all your hardware on a single instance, there will always be a NUMA cost somewhere.

The best solution is as above: run an instance per NUMA region, and internally provide replication for them. Perhaps we’ll support an automatic configuration of this in the future.

GSoC 2017 - Mentor Report from 389 Project

Posted by William Brown on August 23, 2017 02:00 PM

GSoC 2017 - Mentor Report from 389 Project

This year I have had the pleasure of being a mentor for the Google Summer of Code program, as part of the Fedora Project organisation. I was representing the 389 Directory Server Project and offered students the oppurtunity to work on our command line tools written in python.


From the start we have a large number of really talented students apply to the project. This was one of the hardest parts of the process was to choose a student, given that I wanted to mentor all of them. Sadly I only have so many hours in the day, so we chose Ilias, a student from Greece. What really stood out was his interest in learning about the project, and his desire to really be part of the community after the project concluded.

The project

The project was very deliberately “loose” in it’s specification. Rather than giving Ilias a fixed goal of you will implement X, Y and Z, I chose to set a “broad and vague” task. Initially I asked him to investigate a single area of the code (the MemberOf plugin). As he investigated this, he started to learn more about the server, ask questions, and open doors for himself to the next tasks of the project. As these smaller questions and self discoveries stacked up, I found myself watching Ilias start to become a really complete developer, who could be called a true part of our community.

Ilias’ work was exceptional, and he has documented it in his final report here .

Since his work is complete, he is now free to work on any task that takes his interest, and he has picked a good one! He has now started to dive deep into the server internals, looking at part of our backend internals and how we dump databases from id2entry to various output formats.

What next?

I will be participating next year - Sadly, I think the python project oppurtunities may be more limited as we have to finish many of these tasks to release our new CLI toolset. This is almost a shame as the python components are a great place to start as they ease a new contributor into the broader concepts of LDAP and the project structure as a whole.

Next year I really want to give this oppurtunity to an under-represented group in tech (female, poc, etc). I personally have been really inspired by Noriko and I hope to have the oppurtunity to pass on her lessons to another aspiring student. We need more engineers like her in the world, and I want to help create that future.

Advice for future mentors

Mentoring is not for everyone. It’s not a task which you can just send a couple of emails and be done every day.

Mentoring is a process that requires engagement with the student, and communication and the relationship is key to this. What worked well was meeting early in the project, and working out what community worked best for us. We found that email questions and responses worked (given we are on nearly opposite sides of the Earth) worked well, along with irc conversations to help fix up any other questions. It would not be uncommon for me to spend at least 1 or 2 hours a day working through emails from Ilias and discussions on IRC.

A really important aspect of this communication is how you do it. You have to balance positive communication and encouragement, along with critcism that is constructive and helpful. Empathy is a super important part of this equation.

My number one piece of advice would be that you need to create an environment where questions are encouraged and welcome. You can never be dismissive of questions. If ever you dismiss a question as “silly” or “dumb”, you will hinder a student from wanting to ask more questions. If you can’t answer the question immediately, send a response saying “hey I know this is important, but I’m really busy, I’ll answer you as soon as I can”.

Over time you can use these questions to help teach lessons for the student to make their own discoveries. For example, when Ilias would ask how something worked, I would send my response structured in the way I approached the problem. I would send back links to code, my thoughts, and how I arrived at the conclusion. This not only answered the question but gave a subtle lesson in how to research our codebase to arrive at your own solutions. After a few of these emails, I’m sure that Ilias has now become self sufficent in his research of the code base.

Another valuable skill is that overtime you can help to build confidence through these questions. To start with Ilias would ask “how to implement” something, and I would answer. Over time, he would start to provide ideas on how to implement a solution, and I would say “X is the right one”. As time went on I started to answer his question with “What do you think is the right solution and why?”. These exchanges and justifications have (I hope) helped him to become more confident in his ideas, the presentation of them, and justification of his solutions. It’s led to this excellent exchange on our mailing lists, where Ilias is discussing the solutions to a problem with the broader community, and working to a really great answer.

Final thoughts

This has been a great experience for myself and Ilias, and I really look forward to helping another student next year. I’m sure that Ilias will go on to do great things, and I’m happy to have been part of his journey.

So you want to script gdb with python …

Posted by William Brown on August 03, 2017 02:00 PM

So you want to script gdb with python …

Gdb provides a python scripting interface. However the documentation is highly technical and not at a level that is easily accessible.

This post should read as a tutorial, to help you understand the interface and work toward creating your own python debuging tools to help make gdb usage somewhat “less” painful.

The problem

I have created a problem program called “naughty”. You can find it here .

You can compile this with the following command:

gcc -g -lpthread -o naughty naughty.c

When you run this program, your screen should be filled with:

thread ...
thread ...
thread ...
thread ...
thread ...
thread ...

It looks like we have a bug! Now, we could easily see the issue if we looked at the C code, but that’s not the point here - lets try to solve this with gdb.

gdb ./naughty
(gdb) run
[New Thread 0x7fffb9792700 (LWP 14467)]
thread ...

Uh oh! We have threads being created here. We need to find the problem thread. Lets look at all the threads backtraces then.

Thread 129 (Thread 0x7fffb3786700 (LWP 14616)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 128 (Thread 0x7fffb3f87700 (LWP 14615)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 127 (Thread 0x7fffb4788700 (LWP 14614)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6


We have 129 threads! Anyone of them could be the problem. We could just read these traces forever, but that’s a waste of time. Let’s try and script this with python to make our lives a bit easier.

Python in gdb

Python in gdb works by bringing in a copy of the python and injecting a special “gdb” module into the python run time. You can only access the gdb module from within python if you are using gdb. You can not have this work from a standard interpretter session.

We can access a dynamic python runtime from within gdb by simply calling python.

(gdb) python
>print("hello world")
>hello world

The python code only runs when you press Control D.

Another way to run your script is to import them as “new gdb commands”. This is the most useful way to use python for gdb, but it does require some boilerplate to start.

import gdb

class SimpleCommand(gdb.Command):
    def __init__(self):
        # This registers our class as "simple_command"
        super(SimpleCommand, self).__init__("simple_command", gdb.COMMAND_DATA)

    def invoke(self, arg, from_tty):
        # When we call "simple_command" from gdb, this is the method
        # that will be called.
        print("Hello from simple_command!")

# This registers our class to the gdb runtime at "source" time.

We can run the command as follows:

(gdb) source debug_naughty.py
(gdb) simple_command
Hello from simple_command!

Solving the problem with python

So we need a way to find the “idle threads”. We want to fold all the threads with the same frame signature into one, so that we can view anomalies.

First, let’s make a “stackfold” command, and get it to list the current program.

class StackFold(gdb.Command):
def __init__(self):
    super(StackFold, self).__init__("stackfold", gdb.COMMAND_DATA)

def invoke(self, arg, from_tty):
    # An inferior is the 'currently running applications'. In this case we only
    # have one.
    inferiors = gdb.inferiors()
    for inferior in inferiors:


To reload this in the gdb runtime, just run “source debug_naughty.py” again. Try running this: Note that we dumped a heap of output? Python has a neat trick that dir and help can both return strings for printing. This will help us to explore gdb’s internals inside of our program.

We can see from the inferiors that we have threads available for us to interact with:

class Inferior(builtins.object)
 |  GDB inferior object
 |  threads(...)
 |      Return all the threads of this inferior.

Given we want to fold the stacks from all our threads, we probably need to look at this! So lets get one thread from this, and have a look at it’s help.

inferiors = gdb.inferiors()
for inferior in inferiors:
    thread_iter = iter(inferior.threads())
    head_thread = next(thread_iter)

Now we can run this by re-running “source” on our script, and calling stackfold again, we see help for our threads in the system.

At this point it get’s a little bit less obvious. Gdb’s python integration relates closely to how a human would interact with gdb. In order to access the content of a thread, we need to change the gdb context to access the backtrace. If we were doing this by hand it would look like this:

(gdb) thread 121
[Switching to thread 121 (Thread 0x7fffb778e700 (LWP 14608))]
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

We need to emulate this behaviour with our python calls. We can swap to the thread’s context with:

class InferiorThread(builtins.object)
 |  GDB thread object
 |  switch(...)
 |      switch ()
 |      Makes this the GDB selected thread.

Then once we are in the context, we need to take a different approach to explore the stack frames. We need to explore the “gdb” modules raw context.

inferiors = gdb.inferiors()
for inferior in inferiors:
    thread_iter = iter(inferior.threads())
    head_thread = next(thread_iter)
    # Move our gdb context to the selected thread here.

Now that we have selected our thread’s context, we can start to explore here. gdb can do a lot within the selected context - as a result, the help output from this call is really large, but it’s worth reading so you can understand what is possible to achieve. In our case we need to start to look at the stack frames.

To look through the frames we need to tell gdb to rewind to the “newest” frame (ie, frame 0). We can then step down through progressively older frames until we exhaust. From this we can print a rudimentary trace:


# Reset the gdb frame context to the "latest" frame.
# Now, work down the frames.
cur_frame = gdb.selected_frame()
while cur_frame is not None:
    # get the next frame down ....
    cur_frame = cur_frame.older()
(gdb) stackfold

Great! Now we just need some extra metadata from the thread to know what thread id it is so the user can go to the correct thread context. So lets display that too:


# These are the OS pid references.
(tpid, lwpid, tid) = head_thread.ptid
# This is the gdb thread number
gtid = head_thread.num
print("tpid %s, lwpid %s, tid %s, gtid %s" % (tpid, lwpid, tid, gtid))
# Reset the gdb frame context to the "latest" frame.
(gdb) stackfold
tpid 14485, lwpid 14616, tid 0, gtid 129

At this point we have enough information to fold identical stacks. We’ll iterate over every thread, and if we have seen the “pattern” before, we’ll just add the gdb thread id to the list. If we haven’t seen the pattern yet, we’ll add it. The final command looks like:

def invoke(self, arg, from_tty):
    # An inferior is the 'currently running applications'. In this case we only
    # have one.
    stack_maps = {}
    # This creates a dict where each element is keyed by backtrace.
    # Then each backtrace contains an array of "frames"
    inferiors = gdb.inferiors()
    for inferior in inferiors:
        for thread in inferior.threads():
            # Change to our threads context
            # Get the thread IDS
            (tpid, lwpid, tid) = thread.ptid
            gtid = thread.num
            # Take a human readable copy of the backtrace, we'll need this for display later.
            o = gdb.execute('bt', to_string=True)
            # Build the backtrace for comparison
            backtrace = []
            cur_frame = gdb.selected_frame()
            while cur_frame is not None:
                cur_frame = cur_frame.older()
            # Now we have a backtrace like ['pthread_cond_wait@@GLIBC_2.3.2', 'lazy_thread', 'start_thread', 'clone']
            # dicts can't use lists as keys because they are non-hashable, so we turn this into a string.
            # Remember, C functions can't have spaces in them ...
            s_backtrace = ' '.join(backtrace)
            # Let's see if it exists in the stack_maps
            if s_backtrace not in stack_maps:
                stack_maps[s_backtrace] = []
            # Now lets add this thread to the map.
            stack_maps[s_backtrace].append({'gtid': gtid, 'tpid' : tpid, 'bt': o} )
    # Now at this point we have a dict of traces, and each trace has a "list" of pids that match. Let's display them
    for smap in stack_maps:
        # Get our human readable form out.
        o = stack_maps[smap][0]['bt']
        for t in stack_maps[smap]:
            # For each thread we recorded
            print("Thread %s (LWP %s))" % (t['gtid'], t['tpid']))

Here is the final output.

(gdb) stackfold
Thread 129 (LWP 14485))
Thread 128 (LWP 14485))
Thread 127 (LWP 14485))
Thread 10 (LWP 14485))
Thread 9 (LWP 14485))
Thread 8 (LWP 14485))
Thread 7 (LWP 14485))
Thread 6 (LWP 14485))
Thread 5 (LWP 14485))
Thread 4 (LWP 14485))
Thread 3 (LWP 14485))
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 2 (LWP 14485))
#0  0x00007ffff78d835b in write () from /lib64/libc.so.6
#1  0x00007ffff78524fd in _IO_new_file_write () from /lib64/libc.so.6
#2  0x00007ffff7854271 in __GI__IO_do_write () from /lib64/libc.so.6
#3  0x00007ffff7854723 in __GI__IO_file_overflow () from /lib64/libc.so.6
#4  0x00007ffff7847fa2 in puts () from /lib64/libc.so.6
#5  0x00000000004007e9 in naughty_thread (arg=0x0) at naughty.c:27
#6  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#7  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 1 (LWP 14485))
#0  0x00007ffff7bbe90d in pthread_join () from /lib64/libpthread.so.0
#1  0x00000000004008d1 in main (argc=1, argv=0x7fffffffe508) at naughty.c:51

With our stackfold command we can easily see that threads 129 through 3 have the same stack, and are idle. We can see that tread 1 is the main process waiting on the threads to join, and finally we can see that thread 2 is the culprit writing to our display.

My solution

You can find my solution to this problem as a reference implementation here .

Time safety and Rust

Posted by William Brown on July 11, 2017 02:00 PM

Time safety and Rust

Recently I have had the great fortune to work on this ticket . This was an issue that stemmed from an attempt to make clock performance faster. Previously, a call to time or clock_gettime would involve a context switch an a system call (think solaris etc). On linux we have VDSO instead, so we can easily just swap to the use of raw time calls.

The problem

So what was the problem? And how did the engineers of the past try and solve it?

DS heavily relies on time. As a result, we call time() a lot in the codebase. But this would mean context switches.

So a wrapper was made called “current_time()”, which would cache a recent output of time(), and then provide that to the caller instead of making the costly context switch. So the code had the following:

static time_t   currenttime;
static int      currenttime_set = 0;

    if ( !currenttime_set ) {
        currenttime_set = 1;

    time( &currenttime );
    return( currenttime );

current_time( void )
    if ( currenttime_set ) {
        return( currenttime );
    } else {
        return( time( (time_t *)0 ));

In another thread, we would poll this every second to update the currenttime value:

void *
time_thread(void *nothing __attribute__((unused)))
    PRIntervalTime    interval;

    interval = PR_SecondsToInterval(1);

    while(!time_shutdown) {
        csngen_update_time ();


So what is the problem here

Besides the fact that we may not poll accurately (meaning we miss seconds but always advance), this is not thread safe. The reason is that CPU’s have register and buffers that may cache both stores and writes until a series of other operations (barriers + atomics) occur to flush back out to cache. This means the time polling thread could update the clock and unless the POLLING thread issues a lock or a barrier+atomic, there is no guarantee the new value of currenttime will be seen in any other thread. This means that the only way this worked was by luck, and no one noticing that time would jump about or often just be wrong.

Clearly this is a broken design, but this is C - we can do anything.

What if this was Rust?

Rust touts mulithread safety high on it’s list. So lets try and recreate this in rust.

First, the exact same way:

use std::time::{SystemTime, Duration};
use std::thread;

static mut currenttime: Option<SystemTime> = None;

fn read_thread() {
    let interval = Duration::from_secs(1);

    for x in 0..10 {
        let c_time = currenttime.unwrap();
        println!("reading time {:?}", c_time);

fn poll_thread() {
    let interval = Duration::from_secs(1);

    for x in 0..10 {
        currenttime = Some(SystemTime::now());
        println!("polling time");

fn main() {
    let poll = thread::spawn(poll_thread);
    let read = thread::spawn(read_thread);

Rust will not compile this code.

> rustc clock.rs
error[E0133]: use of mutable static requires unsafe function or block
  --> clock.rs:13:22
13 |         let c_time = currenttime.unwrap();
   |                      ^^^^^^^^^^^ use of mutable static

error[E0133]: use of mutable static requires unsafe function or block
  --> clock.rs:22:9
22 |         currenttime = Some(SystemTime::now());
   |         ^^^^^^^^^^^ use of mutable static

error: aborting due to 2 previous errors

Rust has told us that this action is unsafe, and that we shouldn’t be modifying a global static like this.

This alone is a great reason and demonstration of why we need a language like Rust instead of C - the compiler can tell us when actions are dangerous at compile time, rather than being allowed to sit in production code for years.

For bonus marks, because Rust is stricter about types than C, we don’t have issues like:

int c_time = time();

Which is a 2038 problem in the making :)

RetroFlix / PI Switch Followup

Posted by Mo Morsi on July 05, 2017 07:22 PM

I've been trying to dedicate some cycles to wrapping up the Raspberry PI entertainment center project mentioned a while back. I decided to abandon the PI Switch idea as the original controller which was purchased for it just did not work properly (or should I say only worked sporadically/intermitantly). It being a cheap device bought online, it wasn't worth the effort to debug (funny enough I can't find the device on Amazon anymore, perhaps other people were having issues...).

Not being able to find another suitable gamepad to use as the basis for a snap together portable device, I bought a Rii wireless controller (which works great out of the box!) and dropped the project (also partly due to lack of personal interest). But the previously designed wall mount works great, and after a bit of work the PI now functions as a seamless media center.

Unfortunately to get it there, a few workarounds were needed. These are listed below (in no particular order).

<style> #rpi_setup li{ margin-bottom: 10px; } </style>
  • To start off, increase your GPU memory. This we be needed to run games with any reasonable performance. This can be accomplished through the Raspberry PI configuration interface.

    Rpi setup1 Rpi setup2

    Here you can also overclock your PI if your model supports it (v3.0 does not as evident w/ the screenshot, though there are workarounds)

  • If you are having trouble w/ the PI output resolution being too large / small for your tv, try adjusting the aspect ratio on your set. Previously mine was set to "theater mode", cutting off the edges of the device output. Resetting it to normal resolved the issue.

    Rpi setup3 Rpi setup5 Rpi setup4
  • To get the Playstation SixAxis controller working via bluetooth required a few steps.
    • Unplug your playstation (since it will boot by default when the controller is activated)
    • On the PI, run
              sudo bluetoothctl
    • Start the controller and watch for a new devices in the bluetoothctl output. Make note of the device id
    • Still in the bluetoothctl command prompt, run
              trust [deviceid]
    • In the Raspberry PI bluetooth menu, click 'make discoverable' (this can also be accomplished via the bluetoothctl command prompt with the discoverable on command) Rpi setup6
    • Finally restart the controller and it should autoconnect!
  • To install recent versions of Ruby you will need to install and setup rbenv. The current version in the RPI repos is too old to be of use (of interest for RetroFlix, see below)
  • Using mednafen requires some config changes, notabley to disable opengl output and enable SDL. Specifically change the following line from
          video.driver opengl
          video.driver sdl
    Unfortunately after alot of effort, I was not able to get mupen64 working (while most games start, as confirmed by audio cues, all have black / blank screens)... so no N64 games on the PI for now ☹
  • But who needs N64 when you have Nethack! ♥‿♥(the most recent version of which works flawlessly). In addition to the small tweaks needed to compile the latest version on Linux, inorder to get the awesome Nevanda tileset working, update include/config.h to enable XPM graphics:
        -/* # define USE_XPM */ /* Disable if you do not have the XPM library */
        +#define USE_XPM  /* Disable if you do not have the XPM library */
    Once installed, edit your nh/install/games/lib/nethackdir/NetHack.ad config file (in ~ if you installed nethack there), to reference the newtileset:
        -NetHack.tile_file: x11tiles
        +NetHack.tile_file: /home/pi/Downloads/Nevanda.xpm

Finally RetroFlix received some tweaking & love. Most changes were visual optimizations and eye candy (including some nice retro fonts and colors), though a workers were also added so the actual downloads could be performed without blocking the UI. Overall it's simple and works great, a perfect portal to work on those high scores!

That's all for now, look for some more updates on the ReFS front in the near future!

indexed search performance for ds - the mystery of the and query

Posted by William Brown on June 25, 2017 02:00 PM

indexed search performance for ds - the mystery of the and query

Directory Server is heavily based on set mathematics - one of the few topics I enjoyed during university. Our filters really boil down to set queries:


This filter describes the intersection of sets of objects containing “attr=val1” and “attr=val2”.

One of the properties of sets is that operations on them are commutative - the sets to a union or intersection may be supplied in any order with the same results. As a result, these are equivalent:


In the past I noticed an odd behaviour: that the order of filter terms in an ldapsearch query would drastically change the performance of the search. For example:


The later query may significantly outperform the former - but 10% or greater. I have never understood the reason why though. I toyed with ideas of re-arranging queries in the optimise step to put the terms in a better order, but I didn’t know what factors affected this behaviour.

Over time I realised that if you put the “more specific” filters first over the general filters, you would see a performance increase.

What was going on?

Recently I was asked to investigate a full table scan issue with range queries. This led me into an exploration of our search internals, and yielded the answer to the issue above.

Inside of directory server, our indexes are maintained as “pre-baked” searches. Rather than trying to search every object to see if a filter matches, our indexes contain a list of entries that match a term. For example:

uid=mark: 1, 2
uid=william: 3
uid=noriko: 4

From each indexed term we construct an IDList, which is the set of entries matching some term.

On a complex query we would need to intersect these. So the algorithm would iteratively apply this:

t1 = (a, b)
t2 = (c, t1)
t3 = (d, t2)

In addition, the intersection would allocate a new IDList to insert the results into.

What would happen is that if your first terms were large, we would allocate large IDLists, and do many copies into it. This would also affect later filters as we would need to check large ID spaces to perform the final intersection.

In the above example, consider a, b, c all have 10,000 candidates. This would mean t1, t2 is at least 10,000 IDs, and we need to do at least 20,000 comparisons. If d were only 3 candidates, this means that we then throw away the majority of work and allocations when we get to t3 = (d, t2).

What is the fix?

We now wrap each term in an idl_set processing api. When we get the IDList from each AVA, we insert it to the idl_set. This tracks the “minimum” IDList, and begins our intersection from the smallest matching IDList. This means that we have the quickest reduction in set size, and results in the smallest possible IDList allocation for the results. In my tests I have seen up to 10% improvement on complex queries.

For the example above, this means we procees d first, to reduce t1 to the smallest possible candidate set we can.

t1 = (d, a)
t2 = (b, t1)
t3 = (c, t2)

This means to create t2, t3, we will do an allocation that is bounded by the size of d (aka 3, rather than 10,000), we only need to perform fewer queries to reach this point.

A benefit of this strategy is that it means if on the first operation we find t1 is empty set, we can return immediately because no other intersection will have an impact on the operation.

What is next?

I still have not improved union performance - this is still somewhat affected by the ordering of terms in a filter. However, I have a number of ideas related to either bitmask indexes or disjoin set structures that can be used to improve this performance.

Stay tuned ….

Glad to be a Mentor of Google Summer Code again!

Posted by Tong Hui on May 25, 2017 04:44 PM

This year I will mentoring in FedoraProject, and help Mandy Wang finish her GSoC program about “Migrate Plinth to Fedora Server” which raised by me.

While, why I proposed this idea? Plinth is developed by Freedombox which is a Debian based project. The Freedombox is aiming for building a 100% free software self-hosting web server to deploy social applications on small machines. It provides online communication tools respecting user privacy and data ownership, replacing services provided by third-parties that under surveillance. Plinth is the front-end of Freedombox, written in Python.

This idea mainly about migrate Plinth from Deb-based to RPM-based, and make it available for Fedora Server which will running on ARM machines.

The main goal of this idea is to make Plinth works fine in Fedora Server or Minimal flavor, due to Plinth write APT commands hard coded, so it is better to make it more adoptive for RPM. The secondary goal is to make a RPM package for Plinth from source and setup a repo for it, so that everyone who use Fedora could use Plinth. Honestly to say, it is not a easy-hand task for GSoC student although I marked it “Novice” in Skill Level.

Mandy who were selected for finishing this idea is super adopted in my mind, because she write a very clear graph for how she will do in this GSoC.

It will start this May 30. I believe we will finish this mission successfully!

TLS Authentication and FreeRADIUS

Posted by William Brown on May 24, 2017 02:00 PM

TLS Authentication and FreeRADIUS

In a push to try and limit the amount of passwords sent on my network, I’m changing my wireless to use TLS certificates for authentication.


Kerberos - why the world moved on

Posted by William Brown on May 22, 2017 02:00 PM

Kerberos - why the world moved on

For a long time I have tried to integrate and improve authentication technologies in my own environments. I have advocated the use of GSSAPI, IPA, AD, and others. However, the more I have learnt, the further I have seen the world moving away. I want to explore some of my personal experiences and views as to why this occured, and what we can do.