August 27, 2015

nsenter gains SELinux support
nsenter is a program that allows you to run a program within the namespaces of other processes.

This tool is often used to enter containers such as docker, systemd-nspawn or rocket.  It can be used for debugging or for scripting
tools to work inside of containers.  One problem was that the process entering the container could potentially
be attacked by processes within the container.  From an SELinux point of view, you might be injecting an unconfined_t process
into a container that is running as svirt_lxc_net_t.  We wanted a way to change the process context when it entered the container
to match the context of the process whose namespaces you are entering.

As of util-linux-2.27, nsenter now has this support.

man nsenter
...
       -Z, --follow-context
              Set the SELinux security context used for executing a new process according to already running process specified by --target PID. (The util-linux has to be compiled with SELinux support otherwise the option is unavailable.)



docker exec already did this, but the new option gives debuggers, testers, and scripters a new tool to use with namespaces and containers.
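As a rough illustration of scripting with the new option, here is a minimal Python sketch that assembles an nsenter command line. The helper name build_nsenter_command and the PID 12345 are hypothetical; the flags themselves (--target, --mount, --uts, --ipc, --net, --pid, --follow-context) are the documented nsenter options.

```python
import shlex


def build_nsenter_command(target_pid, command, follow_context=True):
    """Build an nsenter argv list that enters the namespaces of target_pid."""
    if target_pid <= 0:
        raise ValueError("target_pid must be a positive process ID")
    argv = [
        "nsenter",
        "--target", str(target_pid),
        "--mount", "--uts", "--ipc", "--net", "--pid",
    ]
    if follow_context:
        # -Z / --follow-context needs util-linux 2.27+ built with SELinux support.
        argv.append("--follow-context")
    # "--" marks the end of nsenter's own options.
    argv.append("--")
    argv.extend(command)
    return argv


if __name__ == "__main__":
    # 12345 is a placeholder PID, not a real container process.
    print(shlex.join(build_nsenter_command(12345, ["ps", "-eZ"])))
```

The resulting list can be handed to subprocess.run; running nsenter normally requires root privileges.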

August 20, 2015

Embedded systems, meet the Internet of Things

An article in Military Embedded Systems magazine discusses the evolution of embedded systems as influenced by the Internet of Things (IoT): Embedded systems, meet the Internet of Things.

The article notes: “In many ways, embedded systems are the progenitor of the Internet of Things (IoT) – and now IoT is changing key aspects of how we design and build military embedded systems. In fact, the new model for embedded systems within IoT might best be described as design, build, maintain, update, extend, and evolve.”


August 19, 2015

Secure distribution of RPM packages

This blog post looks at the final part of creating secure software: shipping it to users in a safe way. It explains how to use transport security and package signatures to achieve this goal.

yum versus rpm

There are two commonly used tools related to RPM package management, yum and rpm. (Recent Fedora versions have replaced yum with dnf, a rewrite with similar functionality.) The yum tool inspects package sources (repositories), downloads RPM packages, and makes sure that required dependencies are installed along with fresh package installations and package updates. yum uses rpm as a library to install packages. yum repositories are defined by .repo files in /etc/yum.repos.d, or by yum plugins for repository management (such as subscription-manager for Red Hat subscription management).

rpm is the low-level tool which operates on an explicit set of RPM packages. rpm provides both a set of command-line tools and a library to process RPM packages. In contrast to yum, package dependencies are checked, but violations are not resolved automatically. This means that rpm typically relies on yum to tell it exactly what to do; the recipe for a change to a package set is called a transaction.

Securing package distribution at the yum layer resembles transport layer security. The rpm security mechanism is more like end-to-end security (in fact, rpm uses OpenPGP internally, which has traditionally been used for end-to-end email protection).

Transport security with yum

Transport security is comparatively easy to implement. The web server just needs to serve the package repository metadata (repomd.xml and its descendants) over HTTPS instead of HTTP. On the client, a .repo file in /etc/yum.repos.d has to look like this:

[gnu-hello]
name=gnu-hello for Fedora $releasever
baseurl=https://download.example.com/dist/fedora/$releasever/os/
enabled=1

$releasever expands to the Fedora version at run time (like “22”). By default, end-to-end security with RPM signatures is enabled (see the next section), but we will focus on transport security first.

yum will verify the cryptographic digests contained in the metadata files, so serving the metadata over HTTPS is sufficient, but offering the .rpm files over HTTPS as well is a sensible precaution. The metadata can instruct yum to download packages from absolute, unrelated URLs, so it is necessary to inspect the metadata to make sure it does not contain such absolute “http://” URLs.

However, transport security with a third-party mirror network is quite meaningless, particularly if anyone can join the mirror network (as is the case with CentOS, Debian, Fedora, and others). Rather than attacking the HTTPS connections directly, an attacker could just become part of the mirror network. There are two fundamentally different approaches to achieve some degree of transport security.

Fedora provides a centralized, non-mirrored Fedora-run metalink service which provides a list of active mirrors and the expected cryptographic digest of the repomd.xml files. yum uses this information to select a mirror and verify that it serves the up-to-date, untampered repomd.xml. The chain of cryptographic digests is verified from there, eventually leading to verification of the .rpm file contents. This is how the long-standing Fedora bug 998 was eventually fixed.
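The first link in that chain, comparing a downloaded file against a digest obtained from a trusted source, can be sketched in a few lines of Python. This is only an illustration of the idea, not yum's implementation; the function name digest_matches and the sample data are made up for the example.

```python
import hashlib
import hmac


def digest_matches(data, expected_hex, algorithm="sha256"):
    """Return True if the digest of data equals expected_hex."""
    actual = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.lower())


if __name__ == "__main__":
    # Stand-in for a repomd.xml downloaded from an untrusted mirror.
    repomd = b"<repomd>example metadata</repomd>"
    # Stand-in for the digest received from the trusted metalink service.
    trusted = hashlib.sha256(repomd).hexdigest()
    print(digest_matches(repomd, trusted))
    print(digest_matches(repomd + b"tampered", trusted))
```

Each verified file then supplies the trusted digests for the next level of files, down to the individual .rpm packages.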

Red Hat uses a different option to distribute Red Hat Enterprise Linux and its RPM-based products: a content-distribution network, managed by a trusted third party. Furthermore, the repositories provided by Red Hat use a separate public key infrastructure which is managed by Red Hat, so breaches in the browser PKI (that is, compromises of certificate authorities or misissued individual certificates) do not affect the transport security checks yum provides. Organizations that wish to implement something similar can use the sslcacert configuration switch of yum. This is the way Red Hat Satellite 6 implements transport security as well.

Transport security has the advantage that it is straightforward to set up (it is no more difficult than enabling HTTPS). It also guards against manipulation at a lower level, and will detect tampering before data is passed to complex file format parsers such as SQLite, RPM, or the XZ decompressor. However, end-to-end security is often more desirable, and we cover that in the next section.

End-to-end security with RPM signatures

RPM package signatures can be used to implement cryptographic integrity checks for RPM packages. This approach is end-to-end in the sense that the package build infrastructure at the vendor can use an offline or half-online private key (such as one stored in a hardware security module), and the final system which consumes these packages can directly verify the signatures because they are built into the .rpm package files. Intermediates such as proxies and caches (which are sometimes used to separate production servers from the Internet) cannot tamper with these signatures. In contrast, transport security protections are weakened or lost in such an environment.

Generating RPM signatures

To add an RPM signature to a .rpm file, you need to generate a GnuPG key first, using gpg --gen-key. Let’s assume that this key has the user ID “rpmsign@example.com”. We first export the public key part to a file in a special directory; otherwise rpmsign will not be able to verify the signatures we create, as it uses the RPM database as a source of trusted signing keys (and not the user GnuPG keyring):

$ mkdir $HOME/rpm-signing-keys
$ gpg --export -a rpmsign@example.com > $HOME/rpm-signing-keys/example-com.key

The name of the directory $HOME/rpm-signing-keys does not matter, but the name of the file containing the public key must end in “.key”. On Red Hat Enterprise Linux 7, CentOS 7, and Fedora, you may have to install the rpm-sign package, which contains the rpmsign program. The rpmsign command to create the signature looks like this:

$ rpmsign -D '_gpg_name rpmsign@example.com' --addsign hello-2.10.1-1.el6.x86_64.rpm
Enter pass phrase:
Pass phrase is good.
hello-2.10.1-1.el6.x86_64.rpm:

(On success, there is no output after the file name on the last line, and the shell prompt reappears.) The file hello-2.10.1-1.el6.x86_64.rpm is overwritten in place, with a variant that contains the signature embedded into the RPM header. The presence of a signature can be checked with this command:

$ rpm -Kv -D "_keyringpath $HOME/rpm-signing-keys" hello-2.10.1-1.el6.x86_64.rpm
hello-2.10.1-1.el6.x86_64.rpm:
    Header V4 RSA/SHA1 Signature, key ID de337997: OK
    Header SHA1 digest: OK (b2be54480baf46542bcf395358aef540f596c0b1)
    V4 RSA/SHA1 Signature, key ID de337997: OK
    MD5 digest: OK (6969408a8d61c74877691457e9e297c6)

If the output of this command contains “NOKEY” lines instead, like the following, it means that the public key in the directory $HOME/rpm-signing-keys has not been loaded successfully:

hello-2.10.1-1.el6.x86_64.rpm:
    Header V4 RSA/SHA1 Signature, key ID de337997: NOKEY
    Header SHA1 digest: OK (b2be54480baf46542bcf395358aef540f596c0b1)
    V4 RSA/SHA1 Signature, key ID de337997: NOKEY
    MD5 digest: OK (6969408a8d61c74877691457e9e297c6)

Afterwards, the RPM files can be distributed as usual and served over HTTP or HTTPS, as if they were unsigned.
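Before publishing, a build script might scan the “rpm -Kv” output shown earlier for lines that are not OK. The following Python sketch assumes the output layout of the examples above (file name on the first line, one check per following line, status after the last colon); the function name signature_problems is made up for this illustration.

```python
def signature_problems(rpm_kv_output):
    """Return the lines of `rpm -Kv` output that do not report an OK status."""
    problems = []
    # The first line holds the file name, so only the following lines are checks.
    for line in rpm_kv_output.splitlines()[1:]:
        text = line.strip()
        if not text:
            continue
        # The status follows the last colon: "OK", "OK (digest)", "NOKEY", ...
        status = text.rsplit(":", 1)[-1].strip()
        if not status.startswith("OK"):
            problems.append(text)
    return problems


if __name__ == "__main__":
    sample = """hello-2.10.1-1.el6.x86_64.rpm:
    Header V4 RSA/SHA1 Signature, key ID de337997: NOKEY
    Header SHA1 digest: OK (b2be54480baf46542bcf395358aef540f596c0b1)
    V4 RSA/SHA1 Signature, key ID de337997: NOKEY
    MD5 digest: OK (6969408a8d61c74877691457e9e297c6)
"""
    for problem in signature_problems(sample):
        print(problem)
```

An empty result means every listed signature and digest was reported as OK.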

Consuming RPM signatures

To enable RPM signature checking in rpm explicitly, the yum repository file must contain a gpgcheck=1 line, as in:

[gnu-hello]
name=gnu-hello for Fedora $releasever
baseurl=https://download.example.com/dist/fedora/$releasever/os/
enabled=1
gpgcheck=1

Once signature checks are enabled in this way, package installation will fail with a NOKEY error until the signing key used by .rpm files in the repository is added to the system RPM database. This can be achieved with a command like this:

$ rpm --import https://download.example.com/keys/rpmsign.asc

The file needs to be transported over a trusted channel, hence the use of an https:// URL in the example. (It is also possible to instruct the user to download the file from a trusted web site, copy it to the target system, and import it directly from the file system.) Afterwards, package installation works as before.

After a key has been imported, it will appear in the output of the “rpm -qa” command:

$ rpm -qa | grep ^gpg-pubkey-
…
gpg-pubkey-ab0e12ef-de337997
…

More information about the key can be obtained with “rpm -qi gpg-pubkey-ab0e12ef-de337997”, and the key can be removed again using “rpm --erase gpg-pubkey-ab0e12ef-de337997”, just as if it were a regular RPM package.

Note: Package signatures are only checked by yum if the package is downloaded from a repository (which has checking enabled). This happens if the package is specified as a name or name-version-release on the yum command line. If the yum command line names a file or URL instead, or the rpm command is used, no signature check is performed in current versions of Red Hat Enterprise Linux, Fedora, or CentOS.

Issues to avoid

When publishing RPM software repositories, the following should be avoided:

  1. The recommended yum repository configuration uses baseurl lines containing http:// URLs.
  2. The recommended yum repository configuration explicitly disables RPM signature checking with gpgcheck=0.
  3. There are optional instructions to import RPM keys, but these instructions do not tell the system administrator to disable the gpgcheck=0 line in the default yum configuration provided by the independent software vendor.
  4. The recommended “rpm --import” command refers to the public key file using an http:// URL.

The first three deficiencies in particular open the system up to a straightforward man-in-the-middle attack on package downloads. An attacker can replace the repository or RPM files while they are downloaded, thus gaining the ability to execute arbitrary commands when they are installed. As outlined in the article on the PKI used by the Red Hat CDN, some enterprise networks perform TLS intercept, and HTTPS downloads will fail. This possibility is not sufficient to justify weakening package authentication for all customers, such as recommending to use http:// instead of https:// in the yum configuration. Similarly, some customers do not want to perform the extra step involving “rpm --import”, but again, this is not an excuse to disable verification for everyone, as long as RPM signatures are actually available in the repository. (Some software delivery processes make it difficult to create such end-to-end verifiable signatures.)
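A vendor could check its recommended .repo file for the first two issues automatically. Here is a minimal Python sketch using the standard configparser module; the function name audit_repo_config is made up, and the check only looks at explicit baseurl and gpgcheck lines, not at mirror lists or inherited settings.

```python
import configparser


def audit_repo_config(text):
    """Return a list of warnings for the contents of one .repo file."""
    # Interpolation is disabled so values are taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    warnings = []
    for section in parser.sections():
        repo = parser[section]
        # baseurl may list several URLs separated by whitespace.
        for url in repo.get("baseurl", "").split():
            if url.startswith("http://"):
                warnings.append(f"[{section}] baseurl uses plain HTTP: {url}")
        if repo.get("gpgcheck", "").strip() == "0":
            warnings.append(f"[{section}] gpgcheck=0 disables RPM signature checking")
    return warnings


if __name__ == "__main__":
    sample = """\
[gnu-hello]
name=gnu-hello for Fedora $releasever
baseurl=http://download.example.com/dist/fedora/$releasever/os/
enabled=1
gpgcheck=0
"""
    for warning in audit_repo_config(sample):
        print(warning)
```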

Summary

If you are creating a repository of packages, you should give your users a secure way to consume them. You can do this by following these recommendations:

  • Use https:// URLs everywhere in configuration advice regarding RPM repository setup for yum.
  • Create a signing key and use it to sign RPM packages, as outlined above.
  • Make sure RPM signature checking is enabled in the yum configuration.
  • Use an https:// URL to download the public key in the setup instructions.

We acknowledge that package signing might not be possible for everyone, but software downloads over HTTPS are straightforward to implement and should always be used.

August 15, 2015

Distributing Secrets with Custodia

My last blog post described a crypto library I created named JWCrypto. I've built this library as a building block of Custodia, a Service that helps sharing Secrets, Keys, Passwords in distributed applications like micro service architectures built on containers.

Custodia is itself a building block of a new FreeIPA feature to improve the experience of setting up replicas. In fact, Custodia at the moment is mostly plumbing for this feature, and although the plumbing is all there, it is not very usable outside of the FreeIPA project without some tinkering.

The past week I was at Flock, where I gave a presentation on the problem of distributing Secrets Securely. It is based on my work and my thinking about the general problem, and on how I applied that thinking to build a generic service which I then specialized for use by FreeIPA. If you are curious, I have posted the slides I used during my talk, and I am assured that video recordings of all the talks will soon be available online.

August 11, 2015

Tokenless Keystone

Keystone Tokens are bearer tokens, and bearer tokens are vulnerable to replay attacks. What if we wanted to get rid of them?

Keystone comes out of the early days of OpenStack. When you have a single monolith, you can have a password table. Once you start having multiple, related services, you need a shared sign on mechanism for them. While the Keystone token mechanism is not elegant, it served the need: to avoid copying the user’s password to multiple services.

Today, the tokens represent two things. First, that the user has authenticated, and second, that the user is performing an operation in a specific scope. The token has a set of roles assigned to it, and those roles are on either a project or a domain.

Cryptographic Authentication

In an enterprise that has a centralized authentication mechanism such as Kerberos or X509 Client Certificates, an application typically performs a cryptographic handshake to authenticate the user, and then performs an additional query against a centralized directory to see what groups the user belongs to in order to perform an access control check.

This leads to the first thing that could replace tokens: reuse the existing authentication mechanisms to provide access to the remote services. Putting Nova or Glance behind a web server running mod_auth_kerb (or better yet, mod_auth_gssapi, but I get ahead of myself) for Kerberos or using HTTPS with Client Certificates can both be done across the public internet now. Both mechanisms have their pros and cons. Once the user authenticates, the service could then query the role assignments for the user from Keystone instead of validating a token. The data would be the same.

Federation

What if those mechanisms are not acceptable? There is still Federation. Keystone today can serve out tokens using either OpenID connect or SAML. There is no reason these same mechanisms could not be put in front of the other services of OpenStack, with Keystone filling out the Role information either in the assertions or via Server lookup.

All of these mechanisms have a greater cost in network calls to the service endpoint, although not necessarily to the user, who may not have to make the round trips to Keystone in order to fetch a token (OK, SAML is very chatty). What other options do we have?

Signed Requests

If we don’t care about browser support, and focus on just the CLI, then a few more options open up. Keystone could become a registry for public keys, and users could authenticate by signing the requests that go to Nova. The signature of a request would only be slightly larger than the size of a Fernet token, and the user would be able to greatly reduce the number of web calls. There would be slightly larger overhead due to the asymmetric cryptography.

Unfortunately, this is not a real option from the browser;  the browser support for “naked keys” is currently not sufficient to ensure the operations will succeed.  Usage of Client X509 Certificates is still the best way to ensure cryptography from the browser, and will not necessarily support arbitrary document signing.  I expect this to change over time, but I suspect the browser support will be uneven at best for a while.

OAUTH

Both versions of OAUTH  are designed to address distributed authorization.   Without cryptographic signing, the OAUTH (1) protocol degenerates to bearer tokens.  With Cryptography, the OAUTH (2) protocol is roughly comparable to SAML; OpenID connect is based on OAUTH2, so this should be no surprise. So, while Keystone tokens could be replaced with some form of OAUTH, and it would at least be closer to a standard, it wouldn’t radically change the current approach to Keystone tokens. OAUTH either gives us what we have with Keystone tokens today or what we would have with SAML.

Basic-Auth

Keystone tokens provide minimal additional security benefits for an all-in-one deployment. Instead of putting the User ID and password into the body of the request, the user could pass them via the standard Basic-Auth mechanism with no change in the degree of security. This provides parity with how Keystone is deployed today. No outside service should be able to access the Message Broker. Calls between services should be done on internal (localhost) interfaces or domain sockets, not require passwords, and trust the authorization context as set by the caller.
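For reference, the standard Basic-Auth mechanism mentioned above simply places the base64-encoded “user:password” pair in the Authorization header. The Python sketch below shows the encoding; the helper name basic_auth_header is made up, and the credentials are the well-known example pair from the HTTP specifications.

```python
import base64


def basic_auth_header(user_id, password):
    """Return the value of an HTTP Authorization header for Basic authentication."""
    if ":" in user_id:
        # The first colon separates user ID and password, so the ID cannot contain one.
        raise ValueError("user ID must not contain a colon")
    credentials = f"{user_id}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


if __name__ == "__main__":
    print("Authorization:", basic_auth_header("Aladdin", "open sesame"))
```

Base64 is an encoding, not encryption, so the header protects the password exactly as much as a password in the request body would: it relies on TLS for confidentiality.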

One Time Passwords

One time passwords (OTPs) in conjunction with Basic Auth or some other way to carry the data to the server provide an interesting alternative. In theory, the user could pass the OTP along at the start of the request, the Horizon server would be responsible for timestamping it, and the password could then be used for the duration. This seems impractical, as we are essentially generating a new bearer token. For all-in-one deployments they would work as well as Basic-Auth.

'CVE-2015-4495 and SELinux', Or why doesn't SELinux confine Firefox?
Why don't we confine Firefox with SELinux?

That is one of the most often asked questions, especially after a new CVE like CVE-2015-4495 shows up.  This vulnerability in firefox allows a remote session to grab any files in your home directory.  If you can read the file, then firefox can read it and send it back to the website that infected your browser.

The big problem with confining desktop applications is the way the desktop has been designed.

I wrote about confining the desktop several years ago. 

As I explained then the problem is applications are allowed to communicate with each other in lots of different ways. Here are just a few.

*   X Windows.  All apps need full access to the X Server. I tried several years ago to block applications' access to the keyboard settings, in order to block keystroke logging (google xspy).  I was able to get it to work, but a lot of applications started to break.  Other access that you would want to block in X would be screen capture and access to the cut/paste buffer. But blocking these would cause too much breakage on the system.  XAce was an attempt to add MAC controls to X and is used in MLS environments, but I believe it causes too much breakage.
*   File system access.  Users expect firefox to be able to upload and download files anywhere they want on the desktop.  If I were czar of the OS, I could state that uploaded files must go into ~/Upload and downloaded files into ~/Download, but then users would want to upload photos from ~/Photos, or to create their own random directories.  Blocking access to any particular directory, including .ssh, would be difficult, since someone probably has a web based ssh session or some other tool that can use an ssh public key to authenticate.  (This is the biggest weakness described in CVE-2015-4495.)
*   Dbus communications as well as gnome shell, shared memory, Kernel Keyring, Access to the camera, and microphone ...

Everyone expects all of these to just work, so blocking these with MAC tools and SELinux is more likely to lead to "setenforce 0" than to actually add a lot of security.

Helper Applications.

One of the biggest problems with confining a browser is helper applications.  Let's imagine I ran firefox with SELinux type firefox_t.  The user clicks on a .odf file or a .doc file, the browser downloads the file and launches LibreOffice so the user can view the file.  Should LibreOffice run as LibreOffice_t or firefox_t?  If it runs as LibreOffice_t, then if the LibreOffice_t app was looking at a different document, the content might be able to subvert the process.  If I run LibreOffice as firefox_t, what happens when the user launches a document from his desktop?  It will not launch a new LibreOffice; it will just communicate with the running LibreOffice and open the document, making it accessible to firefox_t.

Confining Plugins.

For several years now we have been confining plugins with SELinux in Firefox and Chrome.  This prevents tools like flashplugin from having much access to the desktop.  But we have had to add booleans to turn off the confinement, since certain plugins end up wanting more access.

mozilla_plugin_bind_unreserved_ports --> off
mozilla_plugin_can_network_connect --> off
mozilla_plugin_use_bluejeans --> off
mozilla_plugin_use_gps --> off
mozilla_plugin_use_spice --> off
unconfined_mozilla_plugin_transition --> on
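A listing in this “name --> state” form (the format printed by getsebool) is easy to process in scripts, for example to report which booleans are switched on. The Python sketch below is only an illustration; the function name parse_booleans is made up.

```python
def parse_booleans(listing):
    """Parse 'name --> on|off' lines into a dict mapping name to True/False."""
    booleans = {}
    for line in listing.splitlines():
        name, separator, state = line.partition("-->")
        if not separator:
            continue  # skip blank or unrelated lines
        booleans[name.strip()] = state.strip() == "on"
    return booleans


if __name__ == "__main__":
    sample = """\
mozilla_plugin_can_network_connect --> off
mozilla_plugin_use_gps --> off
unconfined_mozilla_plugin_transition --> on
"""
    enabled = [name for name, on in parse_booleans(sample).items() if on]
    print(enabled)
```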


SELinux Sandbox

I did introduce the SELinux Sandbox a few years ago.

The SELinux sandbox would allow you to confine desktop applications using container technologies including SELinux.  You could run firefox, LibreOffice, evince ... in their own isolated desktops.  It is quite popular, but users must choose to use it.  It does not work by default, and it can cause unexpected breakage, for example you are not allowed to cut and paste from one window to another.

Hope on the way.

Alex Larsson is working on a new project to change the way desktop applications run, called Sandboxed Applications.

Alex explains that there are two main goals of his project.

* We want to make it possible for 3rd parties to create and distribute applications that work on multiple distributions.
* We want to run the applications with as little access as possible to the host. (For example user files or network access)

The second goal might allow us to really lock down firefox and friends in a way similar to what Android is able to do on your cell phone (SELinux/SEAndroid blocks lots of access on the web browser.)

Imagine that when a user says he wants to upload a file, he talks to the desktop rather than directly to firefox, and the desktop
hands the file to firefox.  Firefox could then be prevented from touching anything in the homedir.  Also, if a user wanted to
save a file, firefox would ask the desktop to launch the file browser, which would run in the desktop context.  When the user
selected where to save the file, the file browser would give a descriptor to firefox to write the file.
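The descriptor hand-off described above can be demonstrated with plain Unix sockets. The following Python sketch (Python 3.9 or later, Unix only) is not the Sandboxed Applications design; it only shows one side opening a file and passing the open descriptor to another side that never learns the path. The names desktop_side and app_side are made up, and both sides run in one process here for brevity.

```python
import os
import socket
import tempfile


def desktop_side(sock, path):
    """Trusted side: open the file the user picked and send only its descriptor."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        socket.send_fds(sock, [b"save"], [fd])
    finally:
        os.close(fd)  # the receiver gets its own reference to the open file


def app_side(sock, payload):
    """Confined side: receive a descriptor and write to it without knowing the path."""
    _message, fds, _flags, _address = socket.recv_fds(sock, 1024, 1)
    with os.fdopen(fds[0], "wb") as out:
        out.write(payload)


if __name__ == "__main__":
    desktop, app = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "download.bin")
        desktop_side(desktop, target)
        app_side(app, b"downloaded bytes")
        with open(target, "rb") as check:
            print(check.read())
    desktop.close()
    app.close()
```

With this pattern the confined process needs no permission to open files in the home directory at all; policy only has to allow it to use descriptors it was handed.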

Similar controls could isolate firefox from the camera, microphone, etc.

Wayland, which will eventually replace X Windows, also provides for better isolation of applications.

Needless to say, I am anxiously waiting to see what Alex and friends come up with.

The combination of Container Technology, including Namespaces and SELinux, gives us a chance at controlling the desktop.

August 06, 2015

User certificates and custom profiles with FreeIPA 4.2

The FreeIPA 4.2 release introduces some long-awaited certificate management features: user certificates and custom certificate profiles. In this blog post, we will examine the background and motivations for these features, then carry out a real-world scenario where both these features are used: user S/MIME certificates for email protection.

Custom profiles

FreeIPA uses the Dogtag Certificate System PKI for issuance of X.509 certificates. Although Dogtag ships with many certificate profiles, and could be configured with profiles for almost any conceivable use case, FreeIPA only used a single profile for the issuance of certificates to service and host principals. (The name of this profile was caIPAserviceCert, but it was hardcoded and not user-visible.)

The caIPAserviceCert profile was suitable for the standard TLS server authentication use case, but there are many use cases for which it was not suitable; especially those that require particular Key Usage or Extended Key Usage assertions or esoteric certificate extensions, to say nothing of client-oriented profiles.

It was possible (and remains possible) to use the deployed Dogtag instance directly to accomplish almost any certificate management goal, but Dogtag lacks knowledge of the FreeIPA schema so the burden of validating requests falls entirely on administrators. This runs contrary to FreeIPA’s goal of easy administration and the expectations of users.

The certprofile-import command allows new profiles to be imported into Dogtag, while certprofile-mod, certprofile-del, certprofile-show and certprofile-find do what they say on the label. Only profiles that are shipped as part of FreeIPA (at time of writing only caIPAserviceCert) or added via certprofile-import are visible to FreeIPA.

An important per-profile configuration that affects FreeIPA is the ipaCertprofileStoreIssued attribute, which is exposed on the command line as --store=BOOL. This attribute tells the cert-request command what to do with certificates issued using that profile. If TRUE, certificates are added to the target principal’s userCertificate attribute; if FALSE, the issued certificate is delivered to the client in the command result but nothing is stored in the FreeIPA directory (though the certificate is still stored in Dogtag’s database). The option to not store issued certificates is desirable in use cases that involve the issuance of many short-lived certificates.

Finally, cert-request learned the --profile-id option to specify which profile to use. It is optional and defaults to caIPAserviceCert.

User certificates

Prior to FreeIPA 4.2, certificates could only be issued for host and service principals. The same capability now exists for user principals. Although cert-request treats user principals in substantially the same way as host or service principals, there are a few important differences:

  • The subject Common Name in the certificate request must match the FreeIPA user name.
  • The subject email address (if present) must match one of the user’s email addresses.
  • All Subject Alternative Name rfc822Name values must match one of the user’s email addresses.
  • Like services and hosts, KRB5PrincipalName SAN is permitted if it matches the principal.
  • dNSName and other SAN types are prohibited.
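To make the rules above concrete, here is a small Python sketch that applies them to already-parsed request fields. It is an illustration only and not FreeIPA's code: the function name, the argument layout, and the exact-match string comparisons are all assumptions.

```python
def check_user_cert_request(user_name, user_emails, principal, common_name,
                            subject_email=None, san=()):
    """Return a list of rule violations; an empty list means the request passes.

    san is a sequence of (type, value) pairs, e.g. ("rfc822Name", "a@example.com").
    """
    errors = []
    if common_name != user_name:
        errors.append("subject Common Name does not match the user name")
    if subject_email is not None and subject_email not in user_emails:
        errors.append("subject email address is not one of the user's addresses")
    for san_type, value in san:
        if san_type == "rfc822Name":
            if value not in user_emails:
                errors.append(f"rfc822Name {value} is not one of the user's addresses")
        elif san_type == "KRB5PrincipalName":
            if value != principal:
                errors.append(f"KRB5PrincipalName {value} does not match the principal")
        else:
            errors.append(f"SAN type {san_type} is not permitted for users")
    return errors


if __name__ == "__main__":
    # The realm IPA.LOCAL is a placeholder chosen for this example.
    print(check_user_cert_request(
        "alice", ["alice@ipa.local"], "alice@IPA.LOCAL", "alice",
        san=[("rfc822Name", "alice@ipa.local"), ("dNSName", "www.ipa.local")],
    ))
```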

CA ACLs

With support for custom certificate profiles, there must be a way to control which profiles can be used for issuing certificates to which principals. For example, if there was a profile for Puppet masters, it would be sensible to restrict use of that profile to hosts that are members of some Puppet-related group. This is the purpose of CA ACLs.

CA ACLs are created with the caacl-add command. Users and groups can be added or removed with the caacl-add-user and caacl-remove-user commands. Similarly, caacl-{add,remove}-host for hosts and hostgroups, and caacl-{add,remove}-service.

If you are familiar with FreeIPA’s Host-based Access Control (HBAC) policy feature, these commands might remind you of the hbacrule commands. That is no coincidence! The hbacrule commands were my guide for implementing the caacl commands, and the same underlying machinery – libipa_hbac via pyhbac – is used by both plugins to enforce their policies.

Putting it all together

Let’s put these features to use with a realistic scenario. A certain group of users in your organisation must use S/MIME for securing their email communications. To use S/MIME, these users must be issued a certificate with emailProtection asserted in the Extended Key Usage certificate extension. Only the authorised users should be able to have such a certificate issued.

To address this scenario we will:

  1. create a new certificate profile for S/MIME certificates;
  2. create a group for S/MIME users and a CA ACL to allow members of that group access to the new profile;
  3. generate a signing request and issue a cert-request command using the new profile.

Let’s begin.

Creating an S/MIME profile

We export the default profile to use as a starting point for the S/MIME profile:

% ipa certprofile-show --out smime.cfg caIPAserviceCert

Inspecting the profile, we find the Extended Key Usage extension configuration containing the line:

policyset.serverCertSet.7.default.params.exKeyUsageOIDs=1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2

The Extended Key Usage extension is defined in RFC 5280 §4.2.1.12. The two OIDs in the default profile are for TLS WWW server authentication and TLS WWW client authentication respectively. For S/MIME, we need to assert the Email protection key usage, so we change this line to:

policyset.serverCertSet.7.default.params.exKeyUsageOIDs=1.3.6.1.5.5.7.3.4

We also remove the profileId=caIPAserviceCert line and set appropriate values for the desc and name fields. Now we can import the new profile:

% ipa certprofile-import smime --file smime.cfg \
  --desc "S/MIME certificates" --store TRUE
------------------------
Imported profile "smime"
------------------------
Profile ID: smime
Profile description: S/MIME certificates
Store issued certificates: TRUE
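The manual edits made to smime.cfg before the import could also be scripted. The Python sketch below replaces the Extended Key Usage line and drops the profileId line, using the key name and OID shown earlier; the function name make_smime_profile is made up, and the desc and name fields would still need to be adjusted separately.

```python
EKU_KEY = "policyset.serverCertSet.7.default.params.exKeyUsageOIDs"
EMAIL_PROTECTION_OID = "1.3.6.1.5.5.7.3.4"


def make_smime_profile(profile_text):
    """Return profile_text with the EKU OIDs replaced and the profileId line removed."""
    result = []
    for line in profile_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key == "profileId":
            continue  # the ID is given on the certprofile-import command line instead
        if key == EKU_KEY:
            line = f"{EKU_KEY}={EMAIL_PROTECTION_OID}"
        result.append(line)
    return "\n".join(result) + "\n"


if __name__ == "__main__":
    # A shortened stand-in for the exported caIPAserviceCert profile.
    exported = (
        "profileId=caIPAserviceCert\n"
        f"{EKU_KEY}=1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2\n"
    )
    print(make_smime_profile(exported), end="")
```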

Defining the CA ACL

We will define a new group for S/MIME users, and a CA ACL to allow users in that group access to the smime profile:

% ipa group-add smime_users
-------------------------
Added group "smime_users"
-------------------------
  Group name: smime_users
  GID: 1148600006

% ipa caacl-add smime_acl
------------------------
Added CA ACL "smime_acl"
------------------------
  ACL name: smime_acl
  Enabled: TRUE

% ipa caacl-add-user smime_acl --group smime_users
  ACL name: smime_acl
  Enabled: TRUE
  User Groups: smime_users
-------------------------
Number of members added 1
-------------------------

% ipa caacl-add-profile smime_acl --certprofile smime
  ACL name: smime_acl
  Enabled: TRUE
  Profiles: smime
  User Groups: smime_users
-------------------------
Number of members added 1
-------------------------
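The effect of the ACL just defined can be pictured with a toy model in Python. This is not how FreeIPA evaluates CA ACLs internally; the class CaAcl and the function may_issue are invented for the illustration, and alice's membership in smime_users is assumed (no command above adds her to the group).

```python
from dataclasses import dataclass, field


@dataclass
class CaAcl:
    name: str
    enabled: bool = True
    profiles: set = field(default_factory=set)
    users: set = field(default_factory=set)
    user_groups: set = field(default_factory=set)

    def permits(self, user, groups, profile):
        """True if this ACL is enabled and covers both the profile and the user."""
        if not self.enabled or profile not in self.profiles:
            return False
        return user in self.users or bool(self.user_groups & set(groups))


def may_issue(acls, user, groups, profile):
    """Allow the request if any ACL permits it; with no match it is denied."""
    return any(acl.permits(user, groups, profile) for acl in acls)


if __name__ == "__main__":
    smime_acl = CaAcl("smime_acl", profiles={"smime"}, user_groups={"smime_users"})
    # Assumed: alice is a member of smime_users, bob is not.
    print(may_issue([smime_acl], "alice", ["smime_users"], "smime"))
    print(may_issue([smime_acl], "bob", [], "smime"))
```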

Creating and issuing a cert request

Finally, we need to create a PKCS #10 certificate signing request (CSR) and issue a certificate via the cert-request command. We will do this for the user alice. Because this certificate is for email protection, Alice’s email address should be in the Subject Alternative Name (SAN) extension; we must include it in the CSR.

The following OpenSSL config file can be used to generate the certificate request:

[ req ]
prompt = no
encrypt_key = no

distinguished_name = dn
req_extensions = exts

[ dn ]
commonName = "alice"

[ exts ]
subjectAltName=email:alice@ipa.local

We create and then inspect the CSR (the genrsa step can be skipped if you already have a key):

% openssl genrsa -out key.pem 2048
Generating RSA private key, 2048 bit long modulus
.........................+++
......+++
e is 65537 (0x10001)
% openssl req -new -key key.pem -out alice.csr -config alice.conf
% openssl req -text < alice.csr
Certificate Request:
    Data:
        Version: 0 (0x0)
        Subject: CN=alice
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (1024 bit)
                Modulus:
                    00:da:62:61:b4:42:ee:bd:ff:e0:63:cb:ec:85:af:
                    5d:40:ab:59:98:cf:a2:ad:2a:2d:30:c4:73:dc:28:
                    92:45:d4:12:b2:fc:49:78:e2:03:42:d3:eb:69:4f:
                    33:d2:0c:db:22:6c:19:63:46:46:52:4c:4a:bc:93:
                    c6:1b:81:2b:8c:7b:5c:21:1d:5b:e5:5f:97:12:e3:
                    2b:d5:1f:93:99:c9:42:5e:a1:88:77:b1:4f:97:e2:
                    06:20:8b:eb:b7:0d:af:b8:7a:75:10:7a:0f:42:9b:
                    28:55:4c:e3:12:9f:2a:97:92:ab:f6:53:26:51:32:
                    88:f5:01:7f:e0:45:30:d9:51
                Exponent: 65537 (0x10001)
        Attributes:
        Requested Extensions:
            X509v3 Subject Alternative Name: 
                email:alice@ipa.local
    Signature Algorithm: sha1WithRSAEncryption
         1d:e3:dc:a8:af:6c:42:55:40:1a:88:a3:1f:c3:b7:2b:01:3a:
         8f:1f:80:b5:1c:de:80:53:f3:fc:61:91:16:03:3d:79:3a:4b:
         ee:0d:c0:09:1a:d9:d7:40:6e:05:7a:43:c1:0b:26:0c:22:0e:
         79:d1:b0:27:8d:9a:26:51:d5:1b:1b:46:e7:b5:03:97:51:ec:
         53:ae:dd:52:85:d3:48:8a:ac:cc:c0:84:61:9a:97:2e:25:1b:
         b1:f0:72:1f:73:94:3c:44:d5:12:1e:b5:b5:37:9b:57:5d:08:
         d8:52:d4:e5:52:05:17:cc:5f:28:ad:ac:0c:4c:36:dc:33:c2:
         11:6d
-----BEGIN CERTIFICATE REQUEST-----
MIIBfDCB5gIBADAQMQ4wDAYDVQQDDAVhbGljZTCBnzANBgkqhkiG9w0BAQEFAAOB
jQAwgYkCgYEA2mJhtELuvf/gY8vsha9dQKtZmM+irSotMMRz3CiSRdQSsvxJeOID
QtPraU8z0gzbImwZY0ZGUkxKvJPGG4ErjHtcIR1b5V+XEuMr1R+TmclCXqGId7FP
l+IGIIvrtw2vuHp1EHoPQpsoVUzjEp8ql5Kr9lMmUTKI9QF/4EUw2VECAwEAAaAt
MCsGCSqGSIb3DQEJDjEeMBwwGgYDVR0RBBMwEYEPYWxpY2VAaXBhLmxvY2FsMA0G
CSqGSIb3DQEBBQUAA4GBAB3j3KivbEJVQBqIox/DtysBOo8fgLUc3oBT8/xhkRYD
PXk6S+4NwAka2ddAbgV6Q8ELJgwiDnnRsCeNmiZR1RsbRue1A5dR7FOu3VKF00iK
rMzAhGGaly4lG7Hwch9zlDxE1RIetbU3m1ddCNhS1OVSBRfMXyitrAxMNtwzwhFt
-----END CERTIFICATE REQUEST-----

Observe that the common name is the user’s name alice, and that alice@ipa.local is present as an rfc822Name in the SAN extension.
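To see where the rfc822Name lives in the encoded request, we can decode one line of the PEM body by hand. This is a hand-rolled sketch for illustration only; it relies on the exact layout of this particular CSR (short single-byte lengths, the email being the first name in the extension), and real code should use a proper ASN.1 or X.509 library.

```python
import base64

# Fifth line of the PEM body shown above. A full 64-character PEM line
# decodes on its own to 48 bytes of the DER encoding.
PEM_LINE = "MCsGCSqGSIb3DQEJDjEeMBwwGgYDVR0RBBMwEYEPYWxpY2VAaXBhLmxvY2FsMA0G"
fragment = base64.b64decode(PEM_LINE)

# DER encoding of OID 2.5.29.17 (subjectAltName).
SAN_OID = bytes([0x55, 0x1D, 0x11])
pos = fragment.index(SAN_OID) + len(SAN_OID)

# The extension value is an OCTET STRING (0x04) wrapping a SEQUENCE (0x30).
assert fragment[pos] == 0x04 and fragment[pos + 2] == 0x30

# First GeneralName: tag byte, length byte, then the value itself.
tag, length = fragment[pos + 4], fragment[pos + 5]
value = fragment[pos + 6 : pos + 6 + length]

# Tag 0x81 is context-specific tag [1], which GeneralName uses for rfc822Name.
print(hex(tag), value.decode("ascii"))
```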

Now let’s request the certificate:

% ipa cert-request alice.csr --principal alice --profile-id smime
ipa: ERROR: Insufficient access: Principal 'alice' is not
  permitted to use CA '.' with profile 'smime' for certificate
  issuance.

Oops! The CA ACL policy prohibited this issuance because we forgot to add alice to the smime_users group. (The “not permitted to use CA '.'” part of the message is a reference to the upcoming sub-CAs feature.) Let’s add the user to the appropriate group and try again:

% ipa group-add-member smime_users --user alice
  Group name: smime_users
  GID: 1148600006
  Member users: alice
-------------------------
Number of members added 1
-------------------------

% ipa cert-request alice.csr --principal alice --profile-id smime
  Certificate: MIIEJzCCAw+gAwIBAgIBEDANBgkqhkiG9w0BAQsFADBBMR...
  Subject: CN=alice,O=IPA.LOCAL 201507271443
  Issuer: CN=Certificate Authority,O=IPA.LOCAL 201507271443
  Not Before: Thu Aug 06 04:09:10 2015 UTC
  Not After: Sun Aug 06 04:09:10 2017 UTC
  Fingerprint (MD5): 9f:8e:e0:a3:c6:37:e0:a4:a5:e4:6b:d9:14:66:67:dd
  Fingerprint (SHA1): 57:6e:d5:07:8f:ef:d6:ac:36:b8:75:e0:6c:d7:4f:7d:f9:6c:ab:22
  Serial number: 16
  Serial number (hex): 0x10

Success! We can see that the certificate was added to the user’s userCertificate attribute. We can also export the certificate to inspect it (parts of the certificate are elided below) or to import it into an email program:

% ipa user-show alice
  User login: alice
  First name: Alice
  Last name: Able
  Home directory: /home/alice
  Login shell: /bin/sh
  Email address: alice@ipa.local
  UID: 1148600001
  GID: 1148600001
  Certificate: MIIEJzCCAw+gAwIBAgIBEDANBgkqhkiG9w0BAQsFADBBMR...
  Account disabled: False
  Password: True
  Member of groups: smime_users, ipausers
  Kerberos keys available: True

% ipa cert-show 16 --out alice.pem >/dev/null
% openssl x509 -text < alice.pem
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 16 (0x10)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: O=IPA.LOCAL 201507271443, CN=Certificate Authority
        Validity
            Not Before: Aug  6 04:09:10 2015 GMT
            Not After : Aug  6 04:09:10 2017 GMT
        Subject: O=IPA.LOCAL 201507271443, CN=alice
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (2048 bit)
                Modulus:
                    00:e2:1b:92:06:16:f7:27:c8:59:8b:45:93:60:84:
                    ...
                    34:6f
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Authority Key Identifier: 
                keyid:CA:19:15:12:87:04:70:6E:81:7B:1D:8D:C6:4A:F6:A1:49:AA:0D:45

            Authority Information Access: 
                OCSP - URI:http://ipa-ca.ipa.local/ca/ocsp

            X509v3 Key Usage: critical
                Digital Signature, Non Repudiation, Key Encipherment, Data Encipherment
            X509v3 Extended Key Usage: 
                E-mail Protection
            X509v3 CRL Distribution Points: 

                Full Name:
                  URI:http://ipa-ca.ipa.local/ipa/crl/MasterCRL.bin
                CRL Issuer:
                  DirName: O = ipaca, CN = Certificate Authority

            X509v3 Subject Key Identifier: 
                CE:A5:E3:B0:45:23:EC:B3:13:7C:BC:05:72:42:12:AD:9B:17:11:26
            X509v3 Subject Alternative Name: 
                email:alice@ipa.local
    Signature Algorithm: sha256WithRSAEncryption
         29:6a:99:84:8e:46:dc:0e:42:3d:b2:3e:fc:3f:c4:46:dc:44:
         ...

Conclusion

The ability to define and control access to custom certificate profiles and the extension of FreeIPA’s certificate management features to user principals open the door to many use cases that were previously not supported. Although the certificate management features available in FreeIPA 4.2 are a big step forward, there are still several areas for improvement, outlined below.

First, the Dogtag certificate profile format is obtuse. Documentation will make it bearable, but documentation is no substitute for good UX. An interactive profile builder would be a complex feature to implement but we might go there. Alternatively, a public, curated, searchable (even from FreeIPA’s web UI) repository of profiles for various use cases might be a better use of resources and would allow users and customers to help each other.

Next, the ability to create and use sub-CAs is an oft-requested feature and important for many use cases. Work is ongoing to bring this to FreeIPA soon. See the Sub-CAs design page for details.

Thirdly, the FreeIPA framework currently has authority to perform all kinds of privileged operations on the Dogtag instance. This runs contrary to the framework philosophy which advocates for the framework only having the privileges of the current user, with ACIs (and CA ACLs) enforced in the backends (in this case Dogtag). Ticket #5011 was filed to address this discrepancy.

Finally, the request interface between FreeIPA and Dogtag is quite limited; the only substantive information conveyed is whatever is in the CSR. There is minimal capability for FreeIPA to convey additional data with a request, and any time we (or a user or customer) want to broaden the interface to support new kinds of data (e.g. esoteric certificate extensions containing values from custom attributes), changes would have to be made to both FreeIPA and Dogtag. This approach does not scale.

I have a vision for how to address this final point in a future version of FreeIPA. It will be the subject of future blog posts, talks and eventually – hopefully – design proposals and patches! For now, I hope you have enjoyed this introduction to some of the new certificate management capabilities in FreeIPA 4.2 and find them useful. And remember that feedback, bug reports and help with development are always appreciated!

August 05, 2015

Template for a KeystoneV3.rc

If you are moving from Keystone v2 to v3 calls, you need more variables in your environment. Here is a template for an updated keystone.rc for V3, in jinja format:

export OS_AUTH_URL=http://{{ keystone_hostname }}:5000/v3
export OS_USERNAME={{ username }}
export OS_PASSWORD={{ password }}
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
export OS_PROJECT_NAME={{ project_name }}
export OS_IDENTITY_API_VERSION=3
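If Jinja is not at hand, the same file can be produced with a few lines of Python. This is a minimal sketch: the function name render_keystone_v3_rc is hypothetical, the domain defaults to "Default" as in the template, and the example host and credentials are placeholders.

```python
import shlex


def render_keystone_v3_rc(keystone_hostname, username, password,
                          project_name, domain="Default"):
    """Return the text of a keystone.rc for the V3 API (illustrative helper)."""
    settings = [
        ("OS_AUTH_URL", f"http://{keystone_hostname}:5000/v3"),
        ("OS_USERNAME", username),
        ("OS_PASSWORD", password),
        ("OS_USER_DOMAIN_NAME", domain),
        ("OS_PROJECT_DOMAIN_NAME", domain),
        ("OS_PROJECT_NAME", project_name),
        ("OS_IDENTITY_API_VERSION", "3"),
    ]
    # shlex.quote keeps values with spaces or shell metacharacters intact.
    lines = [f"export {name}={shlex.quote(value)}" for name, value in settings]
    return "\n".join(lines) + "\n"


rc_text = render_keystone_v3_rc(
    "keystone.example.com", "demo", "s3cret pass", "demo-project"
)
print(rc_text)
```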

MVEL as an attack vector

Java-based expression languages provide significant flexibility when using middleware products such as Business Rules Management System (BRMS). This flexibility comes at a price as there are significant security concerns in their use. In this article MVEL is used in JBoss BRMS to demonstrate some of the problems. Other products might be exposed to the same risk.

MVEL is an expression language, mostly used for making basic logic available in application-specific languages and configuration files, such as XML. It’s not intended for serious object-oriented programming, just simple expressions such as “data.value == 1”. On the surface it doesn’t look like something inherently dangerous.

JBoss BRMS is a middleware product designed to implement Business Rules. The open source counterpart of JBoss BRMS is called drools. The product is intended to allow businesses (especially financial) to implement the decision logic used in their organization’s operations. The product contains a rules repository, an execution engine, and some authoring tools. The business rules themselves are written in the drools rules language. An interesting approach has been chosen for the implementation of the drools rules language. The language is compiled into MVEL for execution, and it allows the use of MVEL expressions directly, where expressions are applicable.

There is, however, an implementation detail that makes MVEL usage in middleware products a security concern. MVEL is compiled into plain Java and, as such, allows access to any Java objects and methods that are available to the hosting application. It was initially intended as an expression language that allowed simple programmatic expressions in otherwise non-programmatic configuration files, so this was never a concern: configuration files are usually editable only by the site admins anyway, so from a security perspective adding an expression in a config file is not much different from adding a call in a Java class of an application and deploying it. The same was true for BRMS up to version 5: any drools rule would be deployed as a separate file in the repository, so any code in drools rules could only be deployed by authorized personnel, usually as part of the company workflow following code review and other such procedures.

This changed in BRMS (and BPMS) 6. A new WYSIWYG tool was introduced that allowed constructing the rules graphically in a browser session, and testing them right away. Any person with rule authoring permissions (a role known as “analyst” rather than “admin”) would be able to do this. The drools rules would allow writing arbitrary MVEL expressions, which in turn allow calls to any Java classes deployed on the application server without restrictions, including the system ones. This means an analyst would be able to write System.exit() in a rule, and testing this rule would shut down the server! Basically, the graphical rule editor allowed authenticated arbitrary code execution for non-admin users.

A similar problem existed in JBoss Fuse Service Works 6. The drools engine that ships with it does not come with any graphical tool to author rules, so the rules must be deployed on the server as before, but it does come with the RTGov component, which has some MVEL interfaces exposed. Sending an RTGov request with an MVEL expression in it would again allow authenticated arbitrary code execution for any user that has RTGov permissions.

This behaviour was caught early on in the development cycle for BxMS/FSW version 6, and a fix was implemented. The fix involves running the application server with Java Security Manager (JSM) turned on, and adding extra configuration files for MVEL-only security policies. After the fix was applied, only a limited number of Java classes, those safe for use in legitimate Drools rules and RTGov interfaces, were allowed inside MVEL expressions, and the specific RCE vulnerability was considered solved.
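The idea of allowing only a known-safe subset inside expressions can be illustrated with an analogy in Python. To be clear, this is not MVEL, not JSM, and not how the actual fix works: it is a small sketch of an expression evaluator that accepts only constants, variable names, comparisons and and/or, and rejects everything else, including function calls. All names in it are made up for the example.

```python
import ast
import operator

# Comparison operators the evaluator is willing to execute.
_COMPARE = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Lt: operator.lt, ast.LtE: operator.le,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
}


def evaluate(expression, variables):
    """Evaluate a simple rule expression against a dict of variables."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body, variables)


def _eval(node, variables):
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _COMPARE.get(type(node.ops[0]))
        if op is not None:
            left = _eval(node.left, variables)
            right = _eval(node.comparators[0], variables)
            return op(left, right)
    if isinstance(node, ast.BoolOp):
        values = [bool(_eval(value, variables)) for value in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    # Calls, attribute access, imports and anything else are refused.
    raise ValueError(f"expression element not allowed: {type(node).__name__}")


allowed = evaluate("value == 1 and limit > 10", {"value": 1, "limit": 50})
try:
    evaluate("__import__('os').system('id')", {})
    blocked = False
except ValueError:
    blocked = True
print(allowed, blocked)
```

The legitimate rule evaluates normally, while the expression that tries to reach the operating system is rejected before anything runs.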

Further problems arose when the products went into testing with the fix applied and some regression tests were run. It was discovered that making the JSM-based fix the default setup for production servers was not a good idea, as it caused the servers to run slow. Very slow. Resource consumption was excessive and performance suffered dramatically. It became obvious that making the MVEL/JSM fix the default for high-performance production environments was not an option.

A solution was found after considerable consultation between Development, QE and Project Management. The following proposals were made for any company running BRMS:

  • When deploying BRMS/BPMS on a high-performance production server, it is suggested to disable JSM, but at the same time not to allow any “analyst”-role users to use these systems for rule development. It is recommended to use these servers for running the rules and applications developed separately and achieving maximum performance, while eliminating the vulnerability by disabling the whole attack vector by disallowing the rule development altogether.
  • When BRMS is deployed on development servers used by rule developers and analysts, it is suggested to run these servers with JSM enabled. Since these are not production servers, they do not require mission critical performance in processing real-time customer data, they are only used for application and rule development. As such, a little sacrifice in performance on a non mission-critical server is a fair trade-off for a tighter security model.
  • The toughest situation arises when a server is deployed in a “BRMS-as-a-service” configuration, in other words when rule development is exposed to customers over the Web (even through a VPN-protected Extranet). In this case no other choice is available but to enable complete JSM protection, and accept all the consequences of the performance hit. Without it, any customer with minimal “rule writing and testing” privileges can completely take over the server (and any other co-hosted customers’ data as well), a very undesirable result.

Similar solutions are recommended for FSW. Since only RTGov exposes the weakness, it is recommended to run RTGov as a separate server with JSM enabled. For high performance production servers, it is recommended not to install or enable the RTGov component, which eliminates the risk of exposure of MVEL-based attack vectors, making it possible to run them without JSM at full speed.

Other approaches are being considered by the development team for a new implementation of the MVEL fix in future BRMS versions. One such idea was to run a dedicated MVEL-only app server under JSM, separate from the main app server that runs all other parts of the applications, but other proposals were discussed as well. Stay tuned for more information once the decisions are made.

July 31, 2015

CIL – Part2: Module priorities

In my previous blog, I talked about CIL performance improvements. In this blog post, I would like to introduce another cool feature called module priorities. If you check the link, you can see a nice blog post published by Petr Lautrbach about this new feature.

With the new SELinux userspace, we are able to use priorities for SELinux policy modules. This means you can ship your own ipa policy module, based on the distribution policy module but with additional rules, and load it with a higher priority. There is no more need for different policy module names: the module with the higher priority wins.

# semodule --list-modules=full | grep ipa
400 ipa pp
100 ipa pp
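The "higher priority wins" rule can be sketched in a few lines of Python that read a listing like the one above. This is only an illustration of the selection rule, not of how semodule works internally; it assumes each line has the form "priority name format" and ignores any further columns.

```python
def effective_modules(listing):
    """Map each module name to the (priority, format) entry that wins."""
    winners = {}
    for line in listing.strip().splitlines():
        fields = line.split()
        priority, name, fmt = int(fields[0]), fields[1], fields[2]
        # Keep only the highest-priority entry seen for each module name.
        if name not in winners or priority > winners[name][0]:
            winners[name] = (priority, fmt)
    return winners


listing = """
400 ipa pp
100 ipa pp
"""
active = effective_modules(listing)
print(active)
```

Here the ipa module installed at priority 400 is the one that ends up in use, while the distribution copy at priority 100 stays installed but is overridden.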

Of course, you can always decide to use the distro policy module and add only your additional fixes on top. This works fine when you add just minor fixes that you are not able to get into the distro policy for some reason. You can also package this local policy, as Lukas Vrabec described in his blog.

Another way to deal with this is to ship the whole SELinux policy for your application yourself and not be a part of the distro policy at all. Yes, we do see such cases.

For example

# semodule --list-modules=full | grep docker
400 docker pp

But what are the disadvantages of this approach?

* you need to know how to write SELinux policy
* you need to maintain this policy and reflect the latest distro policy changes
* you need to do “hacks” in your policies if you need to have some interfaces for types shipped by distro policy
* you would need to get your policy upstream and check that it does not conflict with the distribution policy when the distribution merges from the same upstream

From the Fedora/RHEL point of view, dealing with policies for big projects like Cluster, Gluster, OpenShift and so on was always a problem. We tried to move these policies out of the distro policy, but it was really hard to get the rpm handling right, and then we ran into the points mentioned above.

So is there an easy way to deal with it? Yes, there is. We ship a policy for a project in our distribution policy. The project takes this policy, adds additional fixes and creates pull requests against the distribution policy; if the timelines differ, the project ships the updated policy itself. And that’s it! It can be done easily using module priorities.

For example, we have Gluster policy in Fedora by default.

# semodule --list-modules=full | grep gluster
100 glusterd pp

And now the Gluster team needs to do a new release, but it causes some SELinux issues. The Gluster folks can take the distribution policy, add additional rules and package it.

Then we will see something like

# semodule --list-modules=full | grep gluster
100 glusterd pp
400 glusterd pp

In the meantime, the Gluster folks can open pull requests with all their changes against the distribution policy and still ship the same policy themselves. The Gluster policy remains a part of the distribution policy, it can easily be upstreamed and, moreover, it can be disabled in the distribution policy by default.

# semodule --list-modules=full | grep gluster
400 gluster cil
100 glusterd pp disabled

$ matchpathcon /usr/sbin/glusterfsd
/usr/sbin/glusterfsd system_u:object_r:glusterd_exec_t:s0

This model is really fantastic and gives us answers to a lot of issues.


July 30, 2015

Using Ansible to add a NetworkManager connection

The Virtual Machine has two interfaces, but only one is connected to a network. How can I connect the second one?

To check the status of the networks with NetworkManager’s command line interface (nmcli), run:

$ sudo nmcli device
DEVICE  TYPE      STATE         CONNECTION  
eth0    ethernet  connected     System eth0 
eth1    ethernet  disconnected  --          
lo      loopback  unmanaged     --

To bring it up manually:

$ sudo nmcli connection add type ethernet ifname eth1  con-name ethernet-eth1
Connection 'ethernet-eth1' (a13aeb2c-630f-4de6-b735-760264927263) successfully added.

To automate the same thing via Ansible, we can use the command: module, but that will execute every time unless we first check whether the interface already has an IP address. If it does, we want to skip the task. We can check that using the predefined facts variables. Each interface has a variable in the form of ansible_interface (for example, ansible_eth0), which is a dictionary containing details about the interface. Here is what my host has for the interfaces:

        "ansible_eth0": {
            "active": true,
            "device": "eth0",
            "ipv4": {
                "address": "192.168.52.4",
                "netmask": "255.255.255.0",
                "network": "192.168.52.0"
            },
            "ipv6": [
                {
                    "address": "fe80::f816:3eff:fed0:510f",
                    "prefix": "64",
                    "scope": "link"
                }
            ],
            "macaddress": "fa:16:3e:d0:51:0f",
            "module": "virtio_net",
            "mtu": 1500,
            "promisc": false,
            "type": "ether"
        },
        "ansible_eth1": {
            "active": true,
            "device": "eth1",
            "macaddress": "fa:16:3e:38:31:71",
            "module": "virtio_net",
            "mtu": 1500,
            "promisc": false,
            "type": "ether"
        },

You can see that, while eth0 has an ipv4 section, eth1 has no such section. Thus, to gate the playbook task on the presence of the variable, use a when clause.

Here is the completed task:

  - name: Add second ethernet interface
    command: nmcli connection  add type ethernet ifname eth1  con-name ethernet-eth1
    when: ansible_eth1.ipv4 is not defined
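The logic of the when clause can be sketched in plain Python against a trimmed copy of the facts shown earlier. The function name needs_connection is made up for this illustration, and it only models the simple case where the interface's facts dictionary exists.

```python
# Trimmed copy of the gathered facts shown above.
facts = {
    "ansible_eth0": {
        "device": "eth0",
        "ipv4": {"address": "192.168.52.4", "netmask": "255.255.255.0"},
    },
    "ansible_eth1": {
        "device": "eth1",
        "macaddress": "fa:16:3e:38:31:71",
    },
}


def needs_connection(facts, ifname):
    """True when the interface's facts have no ipv4 section."""
    return "ipv4" not in facts[f"ansible_{ifname}"]


eth0_needs = needs_connection(facts, "eth0")
eth1_needs = needs_connection(facts, "eth1")
print(eth0_needs, eth1_needs)
```

Only eth1 lacks an ipv4 section, so only eth1 would get the nmcli command; on a second run, once the interface has an address, the task is skipped.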

Now, there is an Ansible module for NetworkManager, but it is in version 2.0 of Ansible, which is not yet released. I want this to work with the version of Ansible that I (and my team) have installed on Fedora 22. Once 2.0 comes out, many of these “one-offs” will use the core modules.

To exec or transition that is the question...
I recently received a question on writing policy via LinkedIn.

Hi, Dan -

I am working on SELinux right now and I know you are an expert on it.. I believe you can give me a help. Now in my policy, I did in myadm policy like
require { ...; type ping_exec_t; ...;class dir {...}; class file {...}; }

allow myadm_t ping_exec_t:file { execute execute_no_trans };

Seems the ping is not work, I got error
ping: icmp open socket: Permission denied

Any ideas?


My response:

When running another program there are two things that can happen:
1. You can execute the program in the current context (which is what you did).
This means that myadm_t needs to have all of the permissions of ping.

2. You can transition to the executable's domain (ping_t).

We usually use interfaces for this.

netutils_domtrans_ping(myadm_t)

netutils_exec_ping(myadm_t)


I think if you looked at your AVCs you would probably see something about myadm_t needing the net_raw capability.

sesearch -A -s ping_t -c capability
Found 1 semantic av rules:
   allow ping_t ping_t : capability { setuid net_raw } ;


net_raw access allows ping_t to create and send icmp packets.  You could add that to myadm_t, but that would allow it
to listen at a low level to network traffic, which might not be something you want.  Transitioning is probably better.

BUT...

Transitioning could cause other problems, like leaked file descriptors or bash redirection.  For example if you do a
ping > /tmp/mydata, then you might have to add rules to ping_t to be allowed to write to the label of /tmp/mydata.

It is your choice about which way to go.

I usually transition if there is a lot of access needed, but if there is only limited access needed that I deem not too risky, I
exec and add the additional access to the current domain.

CIL – Part1: Faster SELinux policy (re)build

As you probably know, we shipped new features related to SELinux policy store migration in Fedora 23. If you check the link, you can see more details about this change. You can read some technical details, benefits and examples of how to test it. In this blog series, called CIL, I would like to introduce this new feature to you and show you the benefits that CIL brings.

One of the most critical parts of SELinux usability is time-consuming SELinux operations such as policy installations or loading new policy modules. I guess you know what I am talking about. For example, you want to create your own policy module for your application and test it on your virtual machine. It means you attempt to execute

semodule -i myapps.pp

and you are waiting, waiting, waiting and waiting.

You see the same thing if you try to disable a module

semodule -d rhcs

and you are waiting, waiting, waiting and waiting.

The time depends directly on the policy language used and on the number of policy rules that need to be rebuilt whenever SELinux policy modules are managed. You can read more about policy modules and kernel policy in my previous blog.

And at this point, CIL brings really big performance improvements. Just imagine, no more “waiting waiting waiting” on a policy installation. No more “waiting waiting waiting” if you load your own policy module.

But enough words; let me show you some real numbers.

[Figure: SELinux_mange_store_time_statistics]

You can see really big differences for the chosen SELinux operations between a regular system with the old SELinux userspace without CIL and a system with the new SELinux userspace with CIL.

This means we can really talk about a ~75% speed-up for tools and applications that manage SELinux policy.

Note: These numbers come from a Fedora 23 virtual machine and all these actions require a policy rebuild.

And it is not only about SELinux tools. We also have SELinux-aware applications, for example systemd, which loads the Fedora distribution policy during the boot process. You get big improvements in the boot process as well.

CIL: systemd[1]: Successfully loaded SELinux policy in 91.886ms.
REGULAR: systemd[1]: Successfully loaded SELinux policy in 172.393ms.
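For these two boot-time figures specifically, the improvement can be worked out with a little arithmetic. The sketch below only uses the two numbers from the log lines above; the variable names are arbitrary.

```python
# Policy load times reported by systemd in the two log lines above.
cil_ms = 91.886
regular_ms = 172.393

# Fraction of the original load time that is saved with CIL.
reduction = (regular_ms - cil_ms) / regular_ms

# How many times faster the CIL-based load is.
ratio = regular_ms / cil_ms

print(f"saved {regular_ms - cil_ms:.3f} ms")
print(f"{reduction:.1%} less time, {ratio:.2f}x faster")
```

So for policy loading at boot, the CIL-based userspace takes a bit more than half the time of the regular one.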

I believe you are now really excited to test this new feature, get your own numbers and see how much faster SELinux tools like semodule and semanage are when they manipulate a policy.


July 29, 2015

Remote code execution via serialized data

Most programming languages contain features that are incredibly powerful when used correctly, but incredibly dangerous when used incorrectly. Serialization (and deserialization) is one such feature available in most modern programming languages. As mentioned in a previous article:

“Serialization is a feature of programming languages that allows the state of in-memory objects to be represented in a standard format, which can be written to disk or transmitted across a network.”

 

So why is deserialization dangerous?

Serialization and, more importantly, deserialization of data is unsafe due to the simple fact that the data being processed is trusted implicitly as being “correct.” So if you’re taking data such as program variables from an untrusted source, you’re making it possible for an attacker to control program flow. Additionally, many programming languages now support serialization of not just data (e.g. strings, arrays, etc.) but also of code objects. For example, with Python pickle() you can actually serialize user-defined classes: you can take a section of code and ship it to a remote system, where it is executed.

Of course this means that anyone with the ability to send a serialized object to such a system can now execute arbitrary code easily, with the full privileges of the program running it.
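A deliberately harmless sketch shows how little is needed. The class name Payload is made up; the mechanism it uses, the __reduce__ method, is part of the standard pickle protocol and lets an object tell pickle which callable to invoke, with which arguments, when the data is loaded.

```python
import pickle


class Payload:
    """Object whose pickled form asks the loader to call eval()."""

    def __reduce__(self):
        # Instead of the object's state, pickle records "call eval('6 * 7')".
        # A real attacker would put something far less friendly here.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The receiving side only calls loads(), yet the expression is evaluated.
result = pickle.loads(blob)
print(result)
```

The receiver never sees a Payload object at all: loading the data runs the embedded expression and returns its value, which is exactly why untrusted pickle data must never be loaded.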

Some examples of failure

Unlike many classes of security vulnerabilities you cannot really accidentally create a deserialization flaw. Unlike memory management flaws for example which can easily occur due to a single off-by-one calculation, or misuse of variable type, the only way to create a deserialization flaw is to use deserialization. Some quick examples of failure include:

CVE-2012-4406 – OpenStack Swift (an object store) used Python pickle() to store metadata in memcached (which is a simple key/value store and does not support authentication), so an attacker with access to memcached could cause arbitrary code execution on all the servers using Swift.

CVE-2013-2165 – In JBoss’s RichFaces ResourceBuilderImpl.java the classes which could be called were not restricted allowing an attacker to interact with classes that could result in arbitrary code execution.

Unfortunately, there are many more examples spanning virtually every major OS and platform vendor. Please note that virtually every modern language includes serialization that is not safe to use by default (Perl Storable, Ruby Marshal, etc.).

So how do we serialize safely?

The simplest way to serialize and deserialize data safely is to use a format that does not include support for code objects. Your best bet for serializing almost all forms of data safely in a widely supported format is JSON. And when I say widely supported I mean everything from Cobol and Fortran to Awk, Tcl and Qt. JSON supports pairs (key:value), arrays and elements and within these a wide variety of data types including strings, numbers, objects (JSON objects), arrays, true, false and null. JSON objects can contain additional JSON objects, so you can for example serialize a number of things into discrete JSON objects and then shove those into a single large JSON (using an array for example).
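As a short sketch in Python, plain data survives a round trip through JSON unchanged, while an attempt to serialize something callable is simply refused. The sample record is invented for the example.

```python
import json

# Plain data: strings, lists, booleans and null all map directly to JSON.
record = {"user": "alice", "roles": ["analyst"], "active": True, "quota": None}

wire = json.dumps(record)
restored = json.loads(wire)

# Code objects have no JSON representation, so they cannot be embedded.
try:
    json.dumps({"callback": len})
    code_rejected = False
except TypeError:
    code_rejected = True

print(restored == record, code_rejected)
```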

Legacy code

But what if you are dealing with legacy code and can’t convert to JSON? On the receiving (deserializing) end you can attempt to monkey patch the code to restrict the objects allowed in the serialized data. However, most languages do not make this very easy or safe, and a determined attacker will be able to bypass such restrictions in most cases. An excellent paper is available from BlackHat USA 2011 which covers any number of clever techniques to exploit Python pickle().

What if you need to serialize code objects?

But what if you actually need to serialize and deserialize code objects? Since it’s impossible to determine if code is safe or not, you have to trust the code you are running. One way to establish that the code has not been modified in transit, and that it comes from a trusted source, is to use code signing. Code signing is very difficult to do correctly and very easy to get wrong. For example you need to:

  1. Ensure the data is from a trusted source
  2. Ensure the data has not been modified, truncated or added to in transit
  3. Ensure that the data is not being replayed (e.g. sending valid code objects out of order can result in manipulation of the program state)
  4. Ensure that if data is blocked (e.g. blocking code that should be executed but is not, leaving the program in an inconsistent state) you can return to a known good state

These are just a few of the major concerns. Creating a trusted framework for remote code execution is outside the scope of this article, however there are a number of such frameworks.
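The first three concerns in the list above can be sketched with a message authentication code and a sequence counter. Note the assumptions: this uses a shared secret key (HMAC) rather than public-key code signing, the Receiver class and its strict in-order policy are invented for the example, and recovery from blocked messages (the fourth concern) is not covered.

```python
import hashlib
import hmac


def sign(key, sequence, payload):
    """Authenticate the payload together with its sequence number."""
    message = sequence.to_bytes(8, "big") + payload
    return hmac.new(key, message, hashlib.sha256).digest()


class Receiver:
    """Accepts each correctly signed message once, in order."""

    def __init__(self, key):
        self.key = key
        self.next_sequence = 0

    def accept(self, sequence, payload, tag):
        expected = sign(self.key, sequence, payload)
        if not hmac.compare_digest(expected, tag):
            return False  # wrong key, or data modified or truncated
        if sequence != self.next_sequence:
            return False  # replayed or out-of-order message
        self.next_sequence += 1
        return True


key = b"k" * 32  # placeholder; a real key must be random and kept secret
receiver = Receiver(key)
tag = sign(key, 0, b"step one")

first = receiver.accept(0, b"step one", tag)
replayed = receiver.accept(0, b"step one", tag)
tampered = receiver.accept(1, b"step two!", sign(key, 1, b"step two"))
print(first, replayed, tampered)
```

The first delivery is accepted, the identical replay is refused because its sequence number has already been used, and the altered payload fails the integrity check.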

Conclusion

If data must be transported in a serialized format, use JSON. At the very least this will ensure that you have access to high quality libraries for the parsing of the data, and that code cannot be directly embedded as it can with other formats such as Python pickle(). Additionally you should ideally encrypt and authenticate the data if it is sent over a network: an attacker that can manipulate program variables can almost certainly modify the program execution in a way that allows privilege escalation or other malicious behavior. Finally you should authenticate the data and prevent replay attacks (e.g. where the attacker records and re-sends a previous session’s data); chances are if you are using JSON you can simply wrap the session in TLS with an authentication layer (such as certificates or username and password or tokens).

July 23, 2015

libuser vulnerabilities

Updated 2015-07-24 @ 12:33 UTC

It was discovered that the libuser library contains two vulnerabilities which, in combination, allow unprivileged local users to gain root privileges. libuser is a library that provides read and write access to files like /etc/passwd, which constitute the system user and group database. On Red Hat Enterprise Linux it is a central system component.

What is being disclosed today?

Qualys reported two vulnerabilities:

It turns out that the CVE-2015-3246 vulnerability, by itself or in conjunction with CVE-2015-3245, can be exploited by an unprivileged local user to gain root privileges on an affected system. However, due to the way libuser works, only users who have accounts already listed in /etc/passwd can exploit this vulnerability, and the user needs to supply the account password as part of the attack. These requirements mean that exploitation by accounts listed only in LDAP (or some other NSS data source) or by system accounts without a valid password is not possible. Further analysis showed that the first vulnerability, CVE-2015-3245, is also due to a missing check in libuser. Qualys has disclosed full technical details in their security advisory posted to the oss-security mailing list.

Which system components are affected by these vulnerabilities?

libuser is a library, which means that in order to exploit it, a program which employs it must be used. Ideally, such a program has the following properties:

  1. It uses libuser.
  2. It is SUID-root.
  3. It allows putting almost arbitrary content into /etc/passwd.

Without the third item, exploitation may still be possible, but it will be much more difficult. If the program is not SUID-root, a user will not have unlimited attempts to exploit the race condition. A survey of programs processing /etc/passwd and related files presents this picture:

  • passwd is SUID-root, but it uses PAM to change the password, which has custom code to modify /etc/passwd not affected by the race condition. The account locking functionality in passwd does use libuser, but it is restricted to root.
  • chsh from util-linux is SUID-root and uses libuser to change /etc/passwd (the latter depending on how util-linux was compiled), but it has fairly strict filters controlling what users can put into these files.
  • lpasswd, lchfn, lchsh and related utilities from libuser are not SUID-root.
  • userhelper (in the usermode package) and chfn (in the util-linux package) have all three qualifications: libuser-based, SUID-root, and lack of filters.

This is why userhelper and chfn are plausible targets for exploitation, and other programs such as passwd and chsh are not.

How can these vulnerabilities be addressed?

System administrators can apply updates from their operating system vendor. Details of affected Red Hat products and security advisories are available in the knowledge base article on the Red Hat Customer Portal. This security update changes libuser to apply additional checks to the values written to the user and group files (so that injecting newlines is no longer possible), and replaces the locking and file update code to follow the same procedures as the rest of the system. The first change is sufficient to prevent newline injection with userhelper as well, which means that only libuser needs to be updated. If software updates are not available or cannot be applied, it is possible to block access to the vulnerable functionality with a PAM configuration change. System administrators can edit the files /etc/pam.d/chfn and /etc/pam.d/chsh and block access to non-root users by using pam_warn (for logging) and pam_deny:

#%PAM-1.0
auth       sufficient   pam_rootok.so
auth       required     pam_warn.so
auth       required     pam_deny.so
auth       include      system-auth
account    include      system-auth
password   include      system-auth
session    include      system-auth

This will prevent users from changing their login shells and their GECOS field. userhelper identifies itself to PAM as “chfn”, which means this change is effective for this program as well.
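The kind of check the update adds to values written to the user database can be illustrated with a short Python sketch. This is not libuser's actual code (libuser is written in C); the function names are hypothetical, and the sketch only shows why newlines and colons must be rejected in a colon-separated, line-oriented file such as /etc/passwd.

```python
def check_passwd_field(value):
    """Reject characters that would corrupt a colon-separated passwd record."""
    for bad in (":", "\n", "\r"):
        if bad in value:
            raise ValueError("illegal character in passwd field: %r" % bad)
    return value


def format_passwd_line(name, uid, gid, gecos, home, shell):
    """Build one /etc/passwd line after validating every field."""
    fields = [name, "x", str(uid), str(gid), gecos, home, shell]
    return ":".join(check_passwd_field(field) for field in fields)
```

Without such a check, a GECOS value containing a newline followed by a crafted record would add a second, attacker-controlled line to the file.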

Acknowledgements

Red Hat would like to thank Qualys for reporting these vulnerabilities.

Update (2015-07-24): Clarified that chfn is affected as well and linked to Qualys security advisory.

July 13, 2015

Skills Mastery

You go through a series of stages when learning a new skill. Let’s look at these stages, covering both the characteristics and the implications of each stage. It is helpful to understand a framework for skill levels, what level you are at – and what level the other members of your team are at.

One powerful model for this is the Dreyfus Model of Skills Acquisition. The Dreyfus Model has been used in a variety of professional settings, including nursing.

Several researchers suggest that it takes roughly 10 years and 10,000 hours of intensive effort to become an expert in a subject. This isn’t just 10 years of experience – it is 10 years of applied, concentrated, progressively more difficult study and practice of the subject. The classic “one year of experience repeated 10 times” will not lead you to mastery. They also estimate that less than 5% of people master even a single subject, much less multiple subjects.

The good news is that many of the skills necessary for achieving mastery of a subject are learned while you are working to master your first subject, and it is then easier and faster to master additional subjects.

An excellent book for understanding how you think and learn – and how to do it better – is Pragmatic Thinking and Learning by Andy Hunt. I have been heavily influenced by this book, and enthusiastically recommend it. It is worthwhile checking out Andy’s website at  www.toolshed.com.

Stages of Skills Mastery (from Wikipedia, the free encyclopedia)

In the fields of education and operations research, the Dreyfus model of skill acquisition is a model of how students acquire skills through formal instruction and practicing. The model proposes that a student passes through five distinct stages: novice, advanced beginner, competent, proficient, and expert.

In the novice stage, a person follows rules as given, without context, with no sense of responsibility beyond following the rules exactly. Competence develops when the individual develops organizing principles to quickly access the particular rules that are relevant to the specific task at hand; hence, competence is characterized by active decision making in choosing a course of action. Proficiency is shown by individuals who develop intuition to guide their decisions and devise their own rules to formulate plans. The progression is thus from rigid adherence to rules to an intuitive mode of reasoning based on tacit knowledge.

Michael Eraut summarized the five stages of increasing skill as follows:

1. Novice

  • “rigid adherence to taught rules or plans”
  • no exercise of “discretionary judgment”

2. Advanced beginner

  • limited “situational perception”
  • all aspects of work treated separately with equal importance

3. Competent

  • “coping with crowdedness” (multiple activities, accumulation of information)
  • some perception of actions in relation to goals
  • deliberate planning
  • formulates routines

4. Proficient

  • holistic view of situation
  • prioritizes importance of aspects
  • “perceives deviations from the normal pattern”
  • employs maxims for guidance, with meanings that adapt to the situation at hand

5. Expert

  • transcends reliance on rules, guidelines, and maxims
  • “intuitive grasp of situations based on deep, tacit understanding”
  • has “vision of what is possible”
  • uses “analytical approaches” in new situations or in case of problems

July 09, 2015

Getting a Keystone Token in Rust

Python is a great language, but sometimes I miss type safety. While I was always comfortable with C++, I know that the language has its warts. Lately, I’ve been taking an interest in Rust, which seems to be set to address many of the shortcomings of C++. What better way to learn it than to try and use it on problems I already know well: OpenStack and Keystone? So, I wrote my first non-trivial program, which gets a token.

Here is my src/main.rs

/*
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version. See http://www.gnu.org/licenses/ 
*/


extern crate curl;
extern crate rustc_serialize;


use curl::http;
use rustc_serialize::json;


#[derive(RustcDecodable, RustcEncodable)]
struct Domain {
    name: String
}
#[derive(RustcDecodable, RustcEncodable)]
struct User {
    name: String,
    password: String,
    domain: Domain
}

#[derive(RustcDecodable, RustcEncodable)]
struct PasswordAuth {
    user: User
}

#[derive(RustcDecodable, RustcEncodable)]
struct Project{
    domain: Domain,
    name:  String
}
#[derive(RustcDecodable, RustcEncodable)]
struct Scope{
    project : Project
}
#[derive(RustcDecodable, RustcEncodable)]
struct Identity{
    methods: Vec<String>,
    password: PasswordAuth
}

#[derive(RustcDecodable, RustcEncodable)]
struct Auth{
    identity: Identity,
    scope: Scope
    
}

#[derive(RustcDecodable, RustcEncodable)]
struct TokenRequest{
    auth: Auth,
}

fn main() {

    
    // Read the settings when the program runs; env!() would instead
    // capture the values that were set when the program was compiled.
    let os_auth_url = std::env::var("OS_AUTH_URL").unwrap();
    let os_username = std::env::var("OS_USERNAME").unwrap();
    let os_password = std::env::var("OS_PASSWORD").unwrap();
    let os_project_name =
        std::env::var("OS_PROJECT_NAME").unwrap();
    let os_project_domain_name =
        std::env::var("OS_PROJECT_DOMAIN_NAME").unwrap();
    let os_user_domain_name =
        std::env::var("OS_USER_DOMAIN_NAME").unwrap();


    let request = TokenRequest {
        auth: Auth{
            identity: Identity {
                methods: vec!["password".to_string()],
                password: PasswordAuth{
                    user: User{
                        name: os_username,
                        password: os_password,        
                        domain: Domain {
                            name: os_user_domain_name}
                    }
                }
            },
            scope: Scope {
                project:
                Project{
                    name: os_project_name,
                    domain: Domain{
                        name: os_project_domain_name
                    }
                }
            }
        }
    };
    let s = json::encode(&request).unwrap();
    let data: &str = &s;
    
    let token_request_url = os_auth_url.to_string()
        + "/auth/tokens";
    let resp = http::handle()
        .post(token_request_url, data)
        .header("Content-Type", "application/json")
        .exec().unwrap();


    let body = String::from_utf8_lossy(resp.get_body());
    
    println!("code={}; headers={:?}; body={}",
    resp.get_code(), resp.get_headers(), body);    
}


Here is the Cargo.toml file

[package]

name = "osrust"
version = "0.0.1"
authors = [ "Adam Young <adam>" ]


[dependencies.curl]
git = "https://github.com/carllerche/curl-rust"

[dependencies]
rustc-serialize = "0.3.15"

My Directory structure looks like this:

./src
./src/main.rs
./Cargo.toml

Build it with:

cargo build

It reads the values necessary to connect to Keystone from environment variables, which I have in a file like this.

export OS_AUTH_URL=http://hostname:5000/v3
export OS_USERNAME=ayoung
export OS_PASSWORD=changeme
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
export OS_PROJECT_NAME=ChangeMe

To run it:

 . ~/keystonev3.rc
 ./target/debug/osrust 

I decided to use Curl instead of rust-http, mainly due to the expectation that rust-http hasn’t dealt with Negotiate (GSSAPI, Kerberos) yet but Curl has. I want to use this approach to talk to IPA as well as to Keystone, using Kerberos for authentication. A rust-http version would not be radically different.
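For readers more at home in Python, the JSON document that the Rust structs above serialize to can be sketched as follows. The function name token_request is hypothetical; the field names and the OS_* variable names are taken from the Rust program and the rc file above. The sketch only builds the URL and the request body and does not send anything.

```python
import json


def token_request(env):
    """Build the Keystone v3 token request URL and body from OS_* settings."""
    body = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {
                    "user": {
                        "name": env["OS_USERNAME"],
                        "password": env["OS_PASSWORD"],
                        "domain": {"name": env["OS_USER_DOMAIN_NAME"]},
                    }
                },
            },
            "scope": {
                "project": {
                    "name": env["OS_PROJECT_NAME"],
                    "domain": {"name": env["OS_PROJECT_DOMAIN_NAME"]},
                }
            },
        }
    }
    return env["OS_AUTH_URL"] + "/auth/tokens", json.dumps(body)
```

Passing os.environ as env after sourcing the rc file gives the same URL and an equivalent body to the ones the Rust program posts.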

June 29, 2015

I get a SYS_PTRACE AVC when my utility runs ps, how come?
We often get random SYS_PTRACE AVCs, usually when an application is running the ps command or reading content in /proc.

https://bugzilla.redhat.com/show_bug.cgi?id=1202043

type=AVC msg=audit(1426354432.990:29008): avc:  denied  { sys_ptrace } for  pid=14391 comm="ps" capability=19  scontext=unconfined_u:unconfined_r:mozilla_plugin_t:s0-s0:c0.c1023 tcontext=unconfined_u:unconfined_r:mozilla_plugin_t:s0-s0:c0.c1023 tclass=capability permissive=0

sys_ptrace usually indicates that one process is trying to look at the memory of another process with a different UID.

man capabilities
...

       CAP_SYS_PTRACE
              *  Trace arbitrary processes using ptrace(2);
              *  apply get_robust_list(2) to arbitrary processes;
              *  transfer data to or from the memory  of  arbitrary  processes
                 using process_vm_readv(2) and process_vm_writev(2).
              *  inspect processes using kcmp(2).


These types of access should probably be dontaudited.

Running the ps command as a privileged process can cause sys_ptrace to happen.

There is special data under /proc that a privileged process would access when running the ps command. This data is almost never actually needed by the process running ps; it is used by debugging tools to see where some of the randomized memory of a process is set up.

The easiest thing for policy writers to do is to dontaudit the access.
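When triaging records like the one above, it can help to split the AVC line into its fields. The following Python sketch does that with regular expressions; parse_avc is a hypothetical helper rather than part of any SELinux tool, and it assumes the record looks like the example quoted earlier.

```python
import re


def parse_avc(line):
    """Split an AVC denial record into a dict of its main fields."""
    match = re.search(r"avc:\s+denied\s+\{([^}]*)\}\s+for\s+(.*)", line)
    if match is None:
        return None
    fields = {"permissions": match.group(1).split()}
    # The rest of the record is a list of key=value pairs; values may be quoted.
    for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', match.group(2)):
        fields[key] = value.strip('"')
    return fields
```

The third colon-separated element of scontext is the domain type of the process that was denied, which is mozilla_plugin_t in the example.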


How do files get mislabeled?
Sometimes we close bugs as CLOSED_NOT_A_BUG because a file was mislabeled, and we then tell the user to just run restorecon on the object.

But this leaves the user with the question,

How did the file get mislabeled?

They did not run the machine in permissive mode or disable SELinux, but still stuff became mislabeled. How come?

The most common cause in my experience is the mv command: when users mv files around their system, the mv command maintains the security context of the source object.

sudo mv ~/mydir/index.html /var/www/html

This ends up with a file labeled user_home_t in /var/www/html, rather than httpd_sys_content_t, and the apache process is not allowed to read it.  If you use mv -Z on newer SELinux systems, it will change the context to the default for the target directory.

Another common cause is debugging a service or running a service by hand.

This bug report is a potential example.

https://bugzilla.redhat.com/show_bug.cgi?id=1175934

Sometimes we see content under /run (/var/run) which is labeled var_run_t when it should have been labeled with something specific to the domain that created it, like apmd_var_run_t.
The most likely cause of this is that the object was created by an unconfined domain like unconfined_t.  Basically an unconfined domain creates the object based on the parent directory, which would label it as var_run_t.

I would guess that the user/admin ran the daemon directly rather than through the init script.

# /usr/bin/acpid
or
# gdb /usr/bin/acpid


When acpid created /run/acpid.socket, the object would be mislabeled.  Later, when the user runs the service through the init system, it would run with the correct type (apmd_t) and would be denied permission to delete the file.

type=AVC msg=audit(1418942223.880:4617): avc:  denied  { unlink } for  pid=24444 comm="acpid" name="acpid.socket" dev="tmpfs" ino=2550865 scontext=system_u:system_r:apmd_t:s0 tcontext=system_u:object_r:var_run_t:s0 tclass=sock_file permissive=0


Sadly there is not much we can do to prevent this type of mislabeled file from being created, and we end up having to tell the user to run restorecon.

June 23, 2015

Install FreeIPA via Ansible

No better way to learn some more details of Ansible than to automate a task I need to do on a regular basis: ipa-server-install.

My first take at installing FreeIPA (ipa in CentOS) via Ansible is pretty simple: use the shell module and run it as an Ansible ad-hoc command:

ansible ipa -i ~/.ossipee/inventory.ini -m shell -u centos --sudo -a "ipa-server-install -U -r AYOUNG -p FreeIPA4All -a FreeIPA4All --setup-dns --forwarder 192.168.52.3"

The next attempt uses an Ansible playbook. Here is install_ipa.yml

---
- hosts: ipa
  tasks:
  - command: ipa-server-install -U -r AYOUNG -p FreeIPA4All -a FreeIPA4All --setup-dns --forwarder 192.168.52.3

Executed with

 ansible-playbook -i ~/.ossipee/inventory.ini  -u centos --sudo   install_ipa.yml

While this is acceptable for a development setup, I want to improve a few things.

  • Hide the passwords used for the admin accounts.
  • Calculate the Realm from the domain (mostly a to-upper hack using a variable for both)
  • Read the resolver out of the existing resolv.conf

Just for completeness, I also did this as an Ansible module.

#!/usr/bin/python

import os
import json
import subprocess

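def iparesolver():
    # Return the first nameserver listed in resolv.conf.
    with open("/etc/resolv.conf", "r") as resolv:
        for text in resolv:
            words = text.split()
            # Blank lines split to an empty list, so check before indexing.
            if len(words) > 1 and words[0] == "nameserver":
                return words[1]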

def ipa_install_command():
    iparealm="RDO.CLOUDLAB.FREEIPA.ORG"
    install_command = ["ipa-server-install","-U","-r", iparealm,
                       "-p", "FreeIPA4All",
                       "-a", "FreeIPA4All",
                       "--setup-dns", "--forwarder", iparesolver()]
    return install_command


subprocess.call(ipa_install_command())

Executed with

ansible ipa -i ~/.ossipee/inventory.ini -m ipa_server_install  -M ./ansible -u centos --sudo

It reports a failure due to the volumes of data returned, but actually successfully installed IPA.
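Two of the improvements listed earlier, calculating the realm from the domain and reading the resolver from resolv.conf, could be sketched in Python like this. The function names are hypothetical, and the resolver function takes the file contents as a string so that it can be tried without touching /etc/resolv.conf. Hiding the passwords is not covered here.

```python
def realm_from_domain(domain):
    """Derive a Kerberos realm by upper-casing the DNS domain."""
    return domain.strip().rstrip(".").upper()


def first_nameserver(resolv_conf_text):
    """Return the first nameserver address in resolv.conf text, or None."""
    for line in resolv_conf_text.splitlines():
        words = line.split()
        # Skip blank lines, comments and other directives.
        if len(words) >= 2 and words[0] == "nameserver":
            return words[1]
    return None
```

The values returned by these helpers could then replace the hard-coded realm and forwarder in the install command.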

June 19, 2015

Resetting a Known Host for SSH

I often create and destroy a virtual machine multiple times in development. SSH records the host and key and often complains about a changed value for a given key. As I am attempting to automate more and more, I need to be able to communicate with these recreated hosts without dealing with the warning messages.

#!/bin/sh


if test "$#" -lt 1
then
    echo "usage: $0 <ipaddr> [username]"
    echo
    echo "Will remove the ipaddress from the known hosts file,"
    echo "and then make an ssh call to the host without strict,"
    echo "host checking to  repopulate it.  This is risky if you"
    echo "do not know for certain that you are talking to the"
    echo "correct host."

    exit 1
fi

IPADDR=$1

if test "$#" -eq 2
then
    USERNAME=$2
else
    USERNAME=centos
fi
  

# Double quotes are needed so that the shell expands $IPADDR before sed runs.
sed -i.bak "/^$IPADDR[ ,]/d" ~/.ssh/known_hosts
ssh -o "StrictHostKeyChecking=no" "$USERNAME@$IPADDR" hostname

June 17, 2015

Single sign-on with OpenConnect VPN server over FreeIPA

In March of 2015 the 0.10.0 version of OpenConnect VPN was released. One of its main features is the addition of MS-KKDCP support and GSSAPI authentication. Putting the acronyms aside that means that authentication in FreeIPA, which uses Kerberos, is greatly simplified for VPN users. Before explaining more, let’s first explore what the typical login process is on a VPN network.

Currently, with a VPN server/product one needs to login to the VPN server using some username-password pair, and then sign into the Kerberos realm using, again, a username-password pair. Many times, exactly the same password pair is used for both logins. That is, we have two independent secure authentication methods to login to a network, one after the other, consuming the user’s time without necessarily increasing the security level. Can things be simplified and achieve single sign on over the VPN? We believe yes, and that’s the reason we combined the two independent authentications into a single authentication instance. The user logs into the Kerberos realm once and uses the obtained credentials to login to the VPN server as well. That way, the necessary passwords are asked only once, minimizing login time and frustration.

How is that done? If the user needs to connect to the VPN in order to access the Kerberos realm, how could he perform Kerberos authentication prior to that? To answer that question we’ll first explain the protocols in use. The protocol followed by the OpenConnect VPN server is HTTPS based, hence, any authentication method available for HTTPS is available to the VPN server as well. In this particular case, we take advantage of the SPNEGO and the MS-KKDCP protocols. The former enables GSSAPI negotiation over HTTPS, thus allowing a Kerberos ticket to be used to authenticate to the server. The MS-KKDCP protocol allows an HTTPS server to behave as a proxy to a Kerberos Authentication Server, and that’s the key point which allows the user to obtain the Kerberos ticket over the VPN server protocol. Thus, the combination of the two protocols allows the OpenConnect VPN server to operate both as a proxy to the KDC and as a Kerberos-enabled service. Furthermore, the usage of HTTPS ensures that all transactions with the Kerberos server are protected using the OpenConnect server’s key, ensuring the privacy of the exchange. However, there is a catch: since the OpenConnect server is now a proxy for Kerberos messages, the Kerberos Authentication Server cannot see the real IPs of the clients, and thus cannot prevent a flood of requests which can cause denial of service. To address that, we introduced a point system to the OpenConnect VPN server for banning IP addresses when they perform more than a pre-configured number of requests.
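The idea of a point system for banning addresses can be sketched in a few lines of Python. This is only an illustration of the concept and not ocserv's implementation; the class name, the threshold, the time window and the point values are all assumptions.

```python
import time


class BanTracker:
    """Accumulate points per client IP and ban once a threshold is reached."""

    def __init__(self, max_points=80, window=300, clock=time.monotonic):
        self.max_points = max_points  # hypothetical threshold
        self.window = window          # seconds before old points expire
        self.clock = clock
        self.scores = {}              # ip -> (points, start of current window)

    def add_points(self, ip, points):
        """Charge points to an IP; return True if it should now be banned."""
        now = self.clock()
        total, start = self.scores.get(ip, (0, now))
        if now - start > self.window:
            total, start = 0, now     # the previous window has expired
        total += points
        self.scores[ip] = (total, start)
        return total >= self.max_points
```

Each proxied Kerberos request or failed login would charge some points to the client's address, so a client that floods the proxy is cut off while ordinary users never reach the threshold.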

As a consequence, with the above setup, the login process is simplified by reducing the steps required to log in to a network managed by FreeIPA. The user logs into the Kerberos Authentication Server and the VPN to the FreeIPA managed network is made available with no additional prompts.

Wouldn’t that reduce security? Isn’t it more secure to ask the user for one set of credentials to connect to the home network and a different set to access the services in it? That’s a valid concern. There can be networks where this is indeed a good design choice, but in other networks it may not be. By stacking multiple authentication methods you could end up with users trying the different credentials at the different login prompts, effectively training the less security-oriented to try the passwords they were given anywhere until one works. However, it is desirable to increase the authentication strength when coming from untrusted networks. For that, it is possible, and recommended, to configure FreeIPA to require a second factor authenticator (OTP) as part of the login process.

Another, equally important concern for the single sign-on, is to prevent re-authentication to the VPN for the whole validity time of a Kerberos key. That is, given the long lifetime of Kerberos tickets, how can we prevent a stolen laptop from being able to access the VPN? That, we address by enforcing a configurable TGT ticket lifetime limit on the VPN server. This way, VPN authentication will only occur if the user’s ticket is fresh, and the user’s password will be required otherwise.

Setting everything up

The next paragraphs move from theory to practice, and describe the minimum set of steps required to setup the OpenConnect VPN server and client with FreeIPA. At this point we assume that a FreeIPA setup is already in place and a realm name KERBEROS.REALM exists. See the Fedora FreeIPA guide for information on how to setup FreeIPA.

Server side: Fedora 22, RHEL7

The first step is to install the latest version of the 0.10.x branch of the OpenConnect VPN server (ocserv) on the server system. You can use the following command. On RHEL7 you will also need to set up the EPEL7 repository.

yum install -y ocserv

That will install the server in an unconfigured state. The server utilizes a single configuration file found in /etc/ocserv/ocserv.conf. It contains several directives documented inline. To allow authentication with Kerberos tickets as well as with a password (e.g., for clients that cannot obtain a ticket, like clients on mobile phones) you need to enable PAM as well as GSSAPI authentication with the following two lines in the configuration file.

auth = pam
enable-auth = gssapi[tgt-freshness-time=360]

The option ‘tgt-freshness-time’ is available with OpenConnect VPN server 0.10.5, and specifies, in seconds, how long after it is issued a Kerberos (TGT) ticket remains valid for VPN authentication. A user will have to reauthenticate if this time is exceeded. In effect that prevents the use of the VPN for the whole lifetime of a Kerberos ticket.

The following line will enable the MS-KKDCP proxy on ocserv. You’ll need to replace KERBEROS.REALM with your realm and KDC-IP-ADDRESS with the IP address of your KDC.

kkdcp = /KdcProxy KERBEROS.REALM tcp@KDC-IP-ADDRESS:88

Note that for PAM authentication to operate you will also need to set up /etc/pam.d/ocserv. We recommend using pam_sss (the PAM module provided by SSSD) for that, although the file can contain anything that best suits the local policy. An example of an SSSD PAM configuration is shown in the Fedora Deployment guide.

The remaining options in ocserv.conf are about the VPN network setup; the comments in the default configuration file should be self-explanatory. At minimum you’ll need to specify a range of IPs for the VPN network, the addresses of the DNS servers, and the routes to push to the clients. At this point the server can be run with the following commands.

systemctl enable ocserv
systemctl start ocserv

The status of the server can be checked using “systemctl status ocserv”.

Client side: Fedora 21, RHEL7

The first step is to install the OpenConnect VPN client, named openconnect, on the client system. The version must be 7.05 or later. On RHEL7 you will need to set up the EPEL7 repository.

yum install -y openconnect network-manager-openconnect

Set up Kerberos to use ocserv as the KDC. For that you’ll need to modify /etc/krb5.conf to contain the following:

[realms]
KERBEROS.REALM = {
    kdc = https://ocserv.example.com/KdcProxy
    http_anchors = FILE:/path-to-your/ca.pem
    admin_server = ocserv.example.com
    auth_to_local = DEFAULT
}

[domain_realm]
.kerberos.test = KERBEROS.REALM
kerberos.test = KERBEROS.REALM

Note that ocserv.example.com should be replaced with the DNS name of your server, and /path-to-your/ca.pem should be replaced by a PEM-formatted file which holds the server’s Certificate Authority. For the KDC option the server’s DNS name is preferred to an IP address to simplify server name verification for the Kerberos libraries. At this point you should be able to use kinit to authenticate and obtain a ticket from the Kerberos Authentication Server. Note however, that kinit prints very brief errors and a server certificate verification error will not be easy to debug. Ensure that the http_anchors file is in PEM format, that it contains the Certificate Authority that signed the server’s certificate, and that the server’s certificate DNS name matches the DNS name set up in the file. Note also that this approach requires the user to always use OpenConnect’s KDCProxy. To avoid that restriction, and allow the user to use the KDC directly when in the LAN, we are currently working towards auto-discovery of the KDC.

Then, at a terminal run:

$ kinit

If the command succeeds, the ticket is obtained, and at this point you will be able to setup openconnect from network manager GUI and connect to it using the Kerberos credentials. To setup a VPN via NetworkManager on the system menu, select VPN, Network Settings, and add a new Network of “CISCO AnyConnect Compatible VPN (openconnect)”. On the Gateway field, fill in the server’s DNS name, add the server’s CA certificate, and that’s all required.

To use the command line client with Kerberos the following trick is recommended. That avoids using sudo with the client and runs the openconnect client as a normal user, after having created a tun device. The reason it avoids using the openconnect client with sudo, is that sudo will prevent access to the user’s Kerberos credentials.

$ sudo ip tuntap add vpn0 mode tun user my-user-name
$ openconnect server.example.com -i vpn0

Client side: Windows

A Windows client is available for OpenConnect VPN at this web site. Its setup, similarly to NetworkManager, requires setting the server’s DNS name and its certificate. Configuring Windows for use with FreeIPA is outside the scope of this text, but more information can be found at this FreeIPA manual.

Conclusion

A single sign-on solution using FreeIPA and the OpenConnect VPN has many benefits. The core optimization of a single login prompt for the user to authorize access to network resources will save user time and frustration. It is important to note that these optimizations are possible by making VPN access part of the deployed infrastructure, rather than an afterthought deployment.  With careful planning, an OpenConnect VPN solution can provide a secure and easy solution to network authentication.

June 14, 2015

SELinux insides – Part1: Policy module store, policy modules and kernel policy.

As you probably know, we are working to get the latest SELinux userspace which supports a new location for a policy store together with CIL support to Fedora 23.

There is a Fedora feature page about this change. We talk about a policy store, policy modules and about a binary policy (or kernel policy).

“SELinux security policy is located in /etc/selinux directory together with configuration files. In Fedora, we use a modular policy. It means the policy is not one large source policy but it can be built from modules. These modules together with a base policy (contains the mandatory information) are compiled, linked and located in a policy store where can be built into a binary format and then loaded into the security server. This binary policy is located in /etc/selinux/<SELINUXTYPE>/policy/policy.29 for example.”

But how are these compiled policy modules created?

If a policy writer starts with a new policy, he creates source policy files (.te, .if, .fc). These files are compiled using checkmodule into an intermediate format .mod. This policy object file contains Type Enforcement (TE) rules together with expanded rules defined by interface files (*.if). Then semodule_package is called (with a file context file) to create a SELinux policy module .pp.

In this phase, semodule is used to manage the policy store (installing, loading, updating, removing modules) and also builds the binary policy file – policy.29 for example.
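The three steps just described can be written down as the commands they correspond to. The Python sketch below only builds the argument lists and does not run anything; the helper name is hypothetical, and it assumes a .te file that does not call interfaces, since the m4 expansion of .if files is normally handled by the policy Makefile before checkmodule runs.

```python
def module_build_commands(name):
    """Return the commands that turn name.te and name.fc into an installed module."""
    return [
        # Compile the type enforcement source into a policy object file (.mod).
        ["checkmodule", "-M", "-m", "-o", name + ".mod", name + ".te"],
        # Package the object file and the file contexts into a policy module (.pp).
        ["semodule_package", "-o", name + ".pp", "-m", name + ".mod", "-f", name + ".fc"],
        # Install the module into the policy store; this rebuilds the binary policy.
        ["semodule", "-i", name + ".pp"],
    ]
```

Each list can be passed to subprocess.check_call in order, as root for the final step, to build and install a module by hand.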

Policy module creation

How does CIL come into the game?

[Figure: policy_store_2]

As you can see, the current compiled *.pp policy modules are converted to CIL code using the /usr/libexec/selinux/hll/pp binary, and all these CIL files are then compiled into a binary policy file – policy.29. With the new userspace (v2.4), a .pp file is treated as a High Level Language (HLL) file, so we could define our own HLLs and convert those HLL files to CIL format.

Where are these policy modules located?

With SELinux installed, the following directory locations are used:

/sys/fs/selinux The SELinux filesystem.
/etc/selinux Location for SELinux configuration files and policies.
/etc/selinux/<SELINUXTYPE>/module Location for policy module store and additional configuration files.

The default location for policy modules is changed from /etc/selinux/<SELINUXTYPE>/module to /var/lib/selinux/<SELINUXTYPE>/module with the 2.4 userspace. Also the following options are added by libsepol (v2.4) with CIL support to semanage.conf.

store-root = <path>
compiler-directory = <path>
ignore-module-cache = true|false
target-platform = selinux | xen

“store-root” option can be changed from the default /var/lib/selinux to a custom location according to distribution requirements.


June 10, 2015

The hidden costs of embargoes

It’s 2015 and it’s pretty clear the Open Source way has largely won as a development model for large and small projects. But when it comes to security we still practice a less-than-open model of embargoes with minimal or, in some cases, no community involvement. With the transition to more open development tools, such as Gitorious and GitHub, it is now time for the security process to change and become more open.

The problem

In general the argument for embargoes simply consists of “we’ll fix these issues in private and release the update in a coordinated fashion in order to minimize the time an attacker knows about the issue and an update is not available”. So why not reduce the risk of security issues being exploited by attackers and simply embargo all security issues, fix them privately and coordinate vendor updates to limit the time attackers know about this issue? Well for one thing we can’t be certain that an embargoed issue is known only to the people who found and reported it, and the people working on it. By definition if one person is able to find the flaw, then a second person can find it. This exact scenario happened with the high profile OpenSSL “HeartBleed” issue: initially an engineer at Codenomicon found it, and then it was independently found by the Google security team.

Additionally, there is a mismatch between how Open Source software is built and how security flaws are handled in Open Source software. Open Source development, testing, QA and distribution of the software mostly happens in public now. Most Open Source organizations have public source code trees that anyone can view, and in many cases submit change requests to. Many projects have also grown in size, not only in code but in developers, and some now involve hundreds or even thousands of developers (OpenStack, the Linux Kernel, etc.). It is clear that the Open Source way has scaled and works well with these projects; the old-fashioned way of handling security flaws in secret, however, has not scaled as well.

Process and tools

The Open Source development method generally looks something like this: the project has a public source code repository that anyone can copy from, and specific people can commit to. The project may have some continuous integration platform like Jenkins, and then QE testing to make sure everything still works. Fewer and fewer projects have a private or hidden repository, partly because there is little benefit to doing so, and partly because many providers do not allow private repositories on the free plans that they offer. This also applies to the continuous integration and testing environments used by many projects (especially when using free services). So actually handling a security issue in secret without exposing it means that many projects cannot use their existing infrastructure and processes, but must instead email patches around privately and do builds/testing on their own workstations.

Code expertise

But let’s assume that an embargoed issue has been discovered and reported to upstream and only the original reporter and upstream know. Now we have to hope that between the reporter and upstream there is enough time and expertise to properly research the issue and fully understand it. The researcher may have found the issue through fuzzing and may only have a test case that causes a crash but no idea even what section of code is affected or how. Alternatively the researcher may know what code is affected but may not fully understand the code or how it is related to other sections of the program, in which case the issue may be more severe than they think, or perhaps it may not be as severe, or even be exploitable at all. The upstream project may also not have the time or resources to understand the code properly, as many of these people are volunteers, and projects have turnover, so the person who originally wrote the code may be long gone. In this case making the issue public means that additional people, such as vendor security teams, can also participate in looking at the issue and helping to fully understand it.

With an embargoed issue, only the researcher and upstream participate in creating the patch. The end result of this is often patches that are incomplete and do not fully address the issue. This happened with the Bash Shellshock issue (CVE-2014-6271) where the initial patch, and even subsequent patches, were incomplete resulting in several more CVEs (CVE-2014-6277, CVE-2014-6278, CVE-2014-7169). For a somewhat complete listing of such examples simply search the CVE database for “because of an incomplete fix for”.

So assuming we now have a fully understood issue and a correct patch, we actually need to patch the software and run it through QA before release. If the issue is embargoed this means you have to do so in secret. However, many Open Source projects use public or open infrastructure, or services which do not support selective privacy (a project is either public or private). Thus for many projects this means that the patching and QA cycle must happen outside of their normal infrastructure. For some projects this may not be possible; if they have a large Jenkins infrastructure, replicating it privately can be very difficult.

And finally, once we have a fully patched and tested source code release, we may still need to coordinate a release with other vendors, which has significant overhead and time constraints for all concerned. Obviously if the issue is public the entire effort spent on privately coordinating the issue is not needed and that effort can be spent on other things such as ensuring the patches are correct and that they address the flaw properly.

The Fix

The answer to the embargo question is surprisingly simple: we only embargo the issues that really matter and even then we use embargoes sparingly. Bringing in additional security experts, who would not normally be aware due to the embargo, rather than just the original researcher and the upstream project, increases the chances of the issue being properly understood and patched the first time around. Making the issue public will get more eyes on it. And finally, for the majority of lower severity issues (e.g. most XSS, temporary file vulnerabilities) attackers have little to no interest in them, so the cost of embargoes really makes no sense here. In short: why not treat most security bugs like normal bugs and get them fixed quickly and properly the first time around?

June 07, 2015

More about Jason Amerine

A mutual friend of mine and Jason’s questioned the use of the word “Whistleblower” in the Survey.  We are fairly certain it is the accurate term.  Here was the response from Bill Ruhling, Lawyer for Jason.

“The predicate to a whistleblower claim is that there was some protected communication. It need not expressly reveal a criminal act or even what you describe as a “clear wrong.” Rather, there is a wide range of communications that fall into that category. Secondly, there must be some retaliatory action taken by someone who knew or should have known about the protected communication.

Here you also have to overlay Constitutional protections appearing in the 1st Amendment which provides, inter alia, an unabridged right to petition Congress.

If you read the ongoing press coverage, you will see the narrative is much broader than simply Jason is being denied pay or even that the Army is out to get him. It details how Jason Amerine was trying to highlight a structural issue in the government that jeopardized clear military objectives (e.g., the return of Bowe Bergdahl) and placed civilian lives at risk (e.g., the six civilian hostages being held in the region). Quite simply, the snake had no head despite being intertwined with numerous agencies inside and outside of the Department of Defense. There is no way to overcome that type of structural deficiency within the “chain of command.” It required political solutions, which is the very essence of why Congress exists and why citizens’ right to petition Congress is such a fundamental aspect of the bill of rights.

There are a lot of details that I will not discuss in this forum, but ultimately it is this effort and the subsequent related processes that led to the current situation facing our classmate. What I can tell you is that Congressman Hunter recently testified on the House floor about the very real effect Jason’s efforts effectuated. Specifically, his communications with Congress led to Under-Secretary Lumpkin being appointed to be the central POC for the release of SGT Bergdahl and most recently drove the creation of legislation approved in the House to appoint a central hostage POC in the government to effectuate the release of hostages in enemy control.

Jason further epitomized the concept of duty we all learned at West Point when he chose the harder right over the easier wrong by raising to the chain of command and to the IG that certain testimony before Congress by high ranking officials was not accurate. The investigation and the subsequent retaliatory actions against Jason Amerine followed those protected communications. While the spiderweb of facts underlying this case are worthy of a law school final examination, they fit the very definition of a whistleblower claim: Jason Amerine made protected communications first to his chain of command and then to the Inspector General. When those communications became known, Jason was suspended from his duties, escorted out of the Pentagon, placed under criminal investigation, had his retirement orders revoked and now has been denied pay.

While I appreciate the desire to think critically about a topic before blindly speaking out in favor of someone, you can rest assured that issues are vetted and researched long before they reach public consumption, particularly where the basic facts are consistently being disclosed by a number of sources.”

When I asked Bill for permission to quote him, he gave it, although he did make this addition:

“The one thing that should be noted, however, is that the situation involving Jason’s pay has been rectified by the government. The chain of command explains it as being an administrative snafu related to the revocation of the retirement orders rather than being attributable to any animus towards Jason. The swiftness with which they resolved the issue speaks positively to the plausibility of their explanation.”

Let’s hope the rest of this is cleared up just as quickly.

If you wish to sign the petition, you can find it here.

June 05, 2015

Dynamic Policy and Microversions

Both Core APIs and Policy have been static for a long part of OpenStack’s lifespan.  While I’ve been working on Dynamic Policy, the Nova team has been looking to use microversions to allow the API to morph more quickly.  Are the two approaches going to interoperate, or are they going to conflict?

With a microversion, a service like Nova can make changes between commits that provide different ways to perform the same operation.  It can also provide a new operation.  Both will challenge the Policy mechanism to keep up.

Of the two, the easier one to handle is new functionality.  Let’s walk through what this process would look like:

A Nova dev needs to provide access to a new resource.  Let’s, just for grins, call it a Container.  Containers never existed in Nova before this, so all of the CRUD APIs are new.  Each of them should, thus, get a new policy rule name.  Ideally, this would be cleanly namespaced and clearly self-documenting:

“compute:create_container”

and

“compute:delete_container”

(and, yes, I am, at the moment, ignoring the fact that these really should look like the URLs that access them, as that is a whole ‘nother post entirely)

Now, the Nova team should add these rules to the default policy, right?

Well, probably…but now we are getting into conflict with Dynamic Policy.  It is based on this workflow: let’s assume that a Nova server will only execute policy that is fetched dynamically.  The changes made in the upstream default policy file then need to get uploaded to the Keystone Policy Admin API.  However, we can’t just blindly overwrite all of the rules that are uploaded.  There are two options:

  1. Each microversion gets its own policy file, and the executed policy is the sum of all the files.  The external system is then responsible for tracking which policy files have been applied to the Policy Database.
  2. When uploading a new set of policy rules to the Policy Database, allow an option that specifies “no overwrite”, which will indicate that the rule should only be added if it is not there already.  This will allow a CMS to blindly apply the default policy multiple times.

I prefer the second.
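To make the second option concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the upload_rules function, its no_overwrite flag, the rule strings, and the plain dictionary standing in for the Policy Database are illustrations only, not part of Keystone or oslo.policy.

```python
def upload_rules(policy_db, new_rules, no_overwrite=False):
    """Merge new_rules into policy_db (a dict of rule name -> rule string).

    Hypothetical helper: with no_overwrite=True, a rule is added only if
    its name is not already present, so customized rules are preserved.
    Returns the list of rule names that were actually written.
    """
    written = []
    for name, rule in new_rules.items():
        if no_overwrite and name in policy_db:
            continue  # keep whatever the deployer already has
        policy_db[name] = rule
        written.append(name)
    return written


# The deployer has already customized one rule (illustrative values).
policy_db = {"compute:create_container": "role:container_admin"}

# Upstream default policy shipped with the new packages (illustrative values).
defaults = {
    "compute:create_container": "role:member",
    "compute:delete_container": "role:member",
}

# Applying the defaults twice is harmless: the second pass writes nothing.
first = upload_rules(policy_db, defaults, no_overwrite=True)
second = upload_rules(policy_db, defaults, no_overwrite=True)
```

The point of the flag is idempotence: a configuration management system can re-apply the shipped defaults on every run without tracking what was uploaded before, and without clobbering local changes.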

Until the Nova server fetches the policy file from the Keystone server, the new API should never pass the policy check.  This could either be done via a default rule, or via oslo.policy requiring an exact match.  Again, I prefer the second approach.  Default rules are, based on what I’ve seen in the other OpenStack policy.json files, too easy to make too permissive. So, just deploying new packages won’t get you the feature enabled until you update policy, too.  This should be acceptable, as the commitment to Dynamic Policy implies continuing to maintain policy.
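The difference between the two approaches can be shown with a toy model. This is not the oslo.policy API or its rule syntax: the two check functions are hypothetical, and each rule here is simply the name of a role the caller must hold.

```python
def check_with_default(rules, rule_name, roles):
    """Fall back to the 'default' rule when rule_name is missing."""
    required = rules.get(rule_name, rules.get("default"))
    return required is not None and required in roles


def check_exact(rules, rule_name, roles):
    """Deny unless rule_name itself is present in the policy."""
    required = rules.get(rule_name)
    return required is not None and required in roles


# Policy as fetched before anyone uploaded the new container rules.
rules = {"default": "member", "compute:get_server": "member"}
caller_roles = {"member"}

# A permissive default silently opens the brand-new API...
allowed_by_default = check_with_default(
    rules, "compute:create_container", caller_roles)

# ...while exact matching keeps it closed until policy is updated.
allowed_by_exact = check_exact(
    rules, "compute:create_container", caller_roles)
```

With exact matching, a missing rule fails closed, which is the behaviour I would want for an API that policy has not caught up with yet.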

What about the case where an API changes?  It turns out microversions already give us the answer for that.  With a microversion, you can distinguish between the older and newer versions of the API in the call.  Policy needs to respect that.  It means that policy must be updated to reflect the difference. Let’s use our theoretical container resource as an example, again.  Both versions of the API are going to be active at the same time. Let’s say the original container had the fields

id, tenant_id, description

and the new one has

id, project_id, description, label

For the old policy, we want to enforce on tenant_id, but for the new one we want to enforce on both project_id and possibly label (yeah, I’ve been thinking about SELinux).

The original pre-dates microversions, so it has the policy rules “compute:create_container” and  “compute:delete_container” but what about the new ones?  Those need to specify a different rule to fetch from the policy file.  It could be based on the microversion string, or any other distinguishing factor.   oslo.policy does not dictate how to distinguish.  If we go with the microversion approach, we append the min-version to the rule in policy:

“compute:create_container:v3.14.15”

and

“compute:delete_container:v3.14.15”

I’d probably prefer something more self documenting, that captures the differences between old and new, but I suspect min-versions are the most succinct string to use.
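As a rough sketch of how microversion-aware code might pick the rule name, consider the following. The helper names and the min_versions table are hypothetical, and nothing here is dictated by oslo.policy; it only shows the min-version suffix idea.

```python
def parse_version(text):
    """Turn '3.14.15' into (3, 14, 15) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def rule_name_for(base_rule, request_version, min_versions):
    """Return the policy rule to enforce for this request.

    min_versions maps a base rule name to the list of min-versions at
    which its behaviour changed. The newest min-version that is not
    above the requested version wins; otherwise the original,
    unversioned rule applies.
    """
    requested = parse_version(request_version)
    eligible = [v for v in min_versions.get(base_rule, [])
                if parse_version(v) <= requested]
    if not eligible:
        return base_rule
    newest = max(eligible, key=parse_version)
    return f"{base_rule}:v{newest}"


min_versions = {"compute:create_container": ["3.14.15"]}

old_call = rule_name_for("compute:create_container", "3.2.0", min_versions)
new_call = rule_name_for("compute:create_container", "3.20.0", min_versions)
```

Versions are compared as tuples of integers rather than as strings, because a plain string comparison would sort “3.2.0” after “3.14.15”.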

If the old API is removed, the rule will stick around until told to go away, and cause no harm.

The code absolutely has to be microversion aware.  Policy is not checked by a middleware component that is unaware of the underlying API call, and so we don’t have the need or the ability to automatically map the microversion in.  That gets back to the issue about mapping URLs to policy that I punted on before.

As with most things in Dynamic Policy, this is not set in stone.  This is, at this point, an attempt to think through how it will work.

June 03, 2015

Jason Amerine

Which takes more courage: to lead an 11-person team deep into enemy territory, or to stand up to your own dysfunctional organization to try and fix it?  I know someone that has done both.


Jason Amerine went to Congress to try and get the hostage rescue efforts in sync between the FBI and the Army. This led to the Hunter Hostage amendment.
Jason on the Set of the Colbert Report in 2011.

The people that he was trying to rescue were aid workers in Afghanistan. People who work for organizations like DAI, where my sister works. While she has never been in harm’s way (at least, not that she’s told me), DAI has lost people in Afghanistan. This is where my worlds come together. This is not theoretical.

We owe it to our citizens abroad to bring them home if they get taken hostage, not accidentally kill them in air-strikes. I can’t help but think that Jason  is advocating this because of his own history.

Back in 2001, Jason was a Special Forces A team commander.  His was the team that brought Hamid Karzai back in to Afghanistan, in an effort to displace the Taliban.  His team was struck by friendly fire, killing two of the team members, wounding several others, including Jason.  Karzai had stepped out of their headquarters, or he would have been in the blast radius, too.  The error was not one of malice, but of poor communication.

Army Special Forces Operational Detachment Delta. Jason is in the back row, second from the right, wearing a boonie cap.

If you want to know more, you can read the book.

He is, as far as I know, the only one of my classmates to appear on the Colbert Report.

He is, as far as I know, the only member of our class to have his own action figure, too.

Jason did not let the SF community cover up the fact that they had messed up.  In doing so, he got himself drummed out of Special Forces, and, instead, went back to West Point to teach.   When I congratulated him on making Lieutenant Colonel, he told me that he knew that was as far as he would go in the Army.  He was supposed to retire this year, but his retirement is held up by the ongoing investigation into him. He is still on active duty, but the Army has stopped paying him.

Jason opened an Inspector General (IG) complaint about some of the other reports given to Congress not being true. I suspect that this is the real reason he is under investigation. Not that he went to Congress, but because he was trying to hold the people that serve in the Army to the standard.

The West Point class of 1993 has closed ranks behind him: we feel he is being punished for “choosing the harder right over the easier wrong” and not backing down. Currently, we are encouraging people to sign a White House petition of support for him. Personally, Jason is a friend. We had multiple classes together, most memorable being an English class that led to some very interesting debates. He’s one of the people I’ve looked forward to talking to at my reunions, and someone that I can unequivocally support.

You should know his story.

Emergency Security Band-Aids with Systemtap

Software security vulnerabilities are a fact of life. So are the subsequent publicity, package updates, and painful service restarts. Administrators are used to it, users bear it, and it is the default, traditional method.

On the other hand, in some circumstances the update & restart methods are unacceptable, leading to the development of online fix facilities like kpatch, where code may be surgically replaced in a running system. There is plenty of potential in these systems, but they are still at an early stage of deployment.

In this article, we present another option: a limited sort of live patching using systemtap. This tool, now a decade old, is conventionally thought of as a tracing widget, but it can do more. It can not only monitor the detailed internals of the Linux kernel and user-space programs, it can also change them – a little. It turns out to be just enough to defeat some classes of security vulnerabilities. We refer to these as security “band-aids” rather than fixes, because we expect them to be used temporarily.

Systemtap

Systemtap is a system-wide programmable probing tool introduced in 2005, and supported since RHEL 4 on all Red Hat and many other Linux distributions. (Its capabilities vary with the kernel’s. For example, user-space probing is not available in RHEL 4.) It is system-wide in the sense that it allows looking into much of the software stack: from device drivers, kernel core, system libraries, through user-applications. It is programmable because it operates based on programs in the systemtap scripting language in order to specify what operations to perform. It probes in the sense that, like a surgical instrument, it safely opens up running software so that we can peek and poke at its internals.

Systemtap’s script language is inspired by dtrace, awk, and C, and is intended to be easy to understand and compact while expressive. Here is hello-world:

probe oneshot { printf("hello world\n") }

Here is some counting of vfs I/O:

 global per_pid_io # an array
 probe kernel.function("vfs_read") 
     { per_pid_io["read",pid()] += $count }
 probe kernel.function("vfs_write")
     { per_pid_io["write",pid()] += $count }
 probe timer.s(5) { exit() } # per_pid_io will be printed

Here is a system-wide strace for non-root processes:

probe syscall.*
{
    if (uid() != 0)
        printf("%s %d %s %s\n", execname(), tid(), name, argstr)
}

Additional serious and silly samples are available on the Internet and also distributed with systemtap packages.

The meaning of a systemtap script is simple: whenever a given probed event occurs, pause that context, evaluate the statements in the probe handler atomically (safely and quickly), then resume the context. Those statements can inspect source-level state in context, trace it, or store it away for later analysis/summary.

The systemtap scripting language is well-featured. It has normal control flow statements, functions with recursion. It deals with integral and string data types, and look-up tables are available. An unusual feature for a small language, full type checking is paired with type inference, so type declarations are not necessary. There exist dozens of types of probes, like timers, calls/interiors/returns from arbitrary functions, designated tracing instrumentation in the kernel or user-space, hardware performance counter values, Java methods, and more.

Systemtap scripts are run by algorithmic translation to C, compilation to machine code, and execution within the kernel as a loadable module. The intuitive hazards of this technique are ameliorated by prolific checking throughout the process, both during translation, and within the generated C code and its runtime. Both time and space usage are limited, so experimentation is safe. There are even modes available for non-root users to inspect only their own processes.  This capability has been used for a wide range of purposes:

  • performance tuning via profiling
  • program-understanding via pinpoint tracing, dynamic call-graphs, through pretty-printing local variables line by line.

Many scripts can run independently at the same time and any can be completely stopped and removed. People have even written interactive games!

Methodology

While systemtap is normally used in a passive (read-only) capacity, it may be configured to permit active manipulations of state. When invoked in “guru mode”, a probe handler may send signals, insert delays, change variables in the probed programs, and even run arbitrary C code. Since systemtap cannot change the program code, changing data is the approach of choice for construction of security band-aids. Here are some of the steps involved, once a vulnerability has been identified in some piece of kernel or user code.

Plan

To begin, we need to look at the vulnerable code and, if available, the corrective patch to understand:

  • Is the bug mainly (a) data-processing-related or (b) algorithmic?
  • Is the bug (a) localized or (b) widespread?
  • Is the control flow to trigger the bug (a) simple or (b) complicated?
  • Are control flow paths to bypass the bug (a) available nearby (included in callers) or (b) difficult to reach?
  • Is the bug dependent mainly on (a) local data (such as function parameters) or (b) global state?
  • Is the vulnerability-triggering data accessible over (a) a broad range of the function (a function parameter or global) or only (b) a narrow window (a local variable inside a nested block)?
  • Are the bug triggering conditions (a) specific and selective or (b) complex and imprecise?
  • Are the bug triggering conditions (a) deliberate or (b) incidental in normal operation?

More (a)s than (b)s means it’s more likely that systemtap band-aids would work in the particular situation while more (b)s means it’s likely that patching the traditional way would be best.

Then we need to decide how to change the system state at the vulnerable point. One possibility is to change to a safer error state; the other is to change to a correct state.

In a type-1 band-aid, we will redirect flow of control away from the vulnerable areas. In this approach, we want to “corrupt” incoming data further in the smallest possible way necessary to short-circuit the function to bypass the vulnerable regions. This is especially appropriate if:

  • correcting the data is difficult, perhaps because it is in multiple locations, or because it needs to be temporary or no obvious corrected value can be computed
  • error handling code already exists and is accessible
  • we don’t have a clear identification of vulnerable data states and want to err on the side of error-handling
  • triggering the vulnerability is deliberate, so we don’t want to spend the effort of performing a corrected operation

A type-2 band-aid is correcting data so that the vulnerable code runs correctly. This is especially appropriate in the complementary cases from the above and if:

  • the vulnerable code and data can occur from real workloads so we would like them to succeed
  • corrected data can be practically computed from nearby state
  • natural points occur in the vulnerable code where the corrected data may be inserted
  • natural points occur in the vulnerable code where clean up code (restoring of previous state) may be inserted, if necessary

Implement

With the vulnerability-band-aid approach chosen, we need to express our intent in the systemtap scripting language. The model is simple: for each place where the state change is to be done we place a probe. In each probe handler, we detect whether the context indicates an exploit is in progress and, if so, make changes to the context. We might also need additional probes to detect and capture state from before the vulnerable section of code, for diagnostic purposes.

A minimal script form for changing state can be easily written. It demonstrates one kernel and one user-space function-entry probe, where each happens to take a parameter named p that needs to be range-limited. (The dollar sign identifies the symbol as a variable in the context of the probed program, not as a script-level temporary variable.)

probe kernel.function("foo"),
      process("/lib*/libc.so").function("bar")
{
    if ($p > 100)
        $p = 4
}

Another possible action in the probe handler is to deliver a signal to the current user-space process using the raise function. This script checks a global variable in the target program at every statement in the given source code file and line-number range, and delivers a killing blow if necessary:

probe process("/bin/foo").statement("*@src/foo.c:100-200")
{
    if (@var("a_global") > 1000)
        raise(9) # SIGKILL
}

Another possible action is logging the attempt at the systemtap process console:

    # ...
    printf("check process %s pid=%d uid=%d\n",
           execname(), pid(), uid())
    # ...

Or sending a note to the sysadmin:

    # ...
    system(sprintf("/bin/logger check process %s pid=%d uid=%d",
                   execname(), pid(), uid()))
    # ...

These and other actions may be done in any combination.

During development of a band-aid, one should start with just tracing (no band-aid countermeasures) to fine-tune the detection of the vulnerable state. If an exploit is available, run it without systemtap, with systemtap (tracing only), and with the operational band-aid. If normal workload can trigger the bug, run it with and without the same spectrum of systemtap supervision to confirm that we’re not harming that traffic.

Deploy

To run a systemtap script, we will need systemtap on a developer workstation. Systemtap has been included in RHEL 4 and later since 2006. RHEL 6 and RHEL 7 still receive rebases from new upstream releases, though capabilities vary. (Systemtap upstream is tested against a gamut of RHEL 4 through fresh kernel.org kernels, and against other distributions.)

In addition to systemtap itself, security band-aid type scripts usually require source-level debugging information for the buggy programs. That is because we need the same symbolic information about types, functions, and variables in the program as an interactive debugger like gdb does. On Red Hat/Fedora distributions this means the “-debuginfo” RPMs available on RHN and YUM. Some other distributions make them available as “-dbgsym” packages. Systemtap scripts that probe the kernel will probably need the larger “kernel-debuginfo”; those that probe user-space will probably need a corresponding “foobar-debuginfo” package. (Systemtap will tell you if it’s missing.)

Running systemtap security band-aid scripts will generally require root privileges and a “guru-mode” flag to designate permission to modify state such as:

# stap -g band_aid.stp
[...]
^C

The script will inject instrumentation into existing and future processes and continue running until it is manually interrupted, it stops itself, or error conditions arise. In case of most errors, the script will stop cleanly, print a diagnostic message, and point to a manual page with further advice. For transient or acceptable conditions command line options are available to suppress some safety checks altogether.

Systemtap scripts may be distributed to a network of homogeneous workstations in “pre-compiled” (kernel-object) form, so that a full systemtap + compiler + debuginfo installation is not necessary on the other computers. Systemtap includes automation for remote compilation and execution of the scripts. A related facility is available to create MOK-signed modules for machines running under SecureBoot.  Scripts may also be installed for automatic execution at startup via initscripts.

Some examples

Of all the security bugs for which systemtap band-aids have been published, we analyse a few below.

CVE-2013-2094, perf_swevent_enabled array out-of-bound access

This was an older bug in the kernel’s perf-event subsystem which takes a complex command struct from a syscall. The bug involved missing a range check inside the struct pointed to by the event parameter. We opted for a type-2 data-correction fix, even though a type-1 failure-induction could have worked as well.

This script demonstrates an unusual technique: embedded-C code called from the script, to adjust the erroneous value. There is a documented argument-passing API between embedded-C and script, but systemtap cannot analyze or guarantee anything about the safety of the C code. In this case, the same calculation could have been expressed within the safe scripting language, but it serves to demonstrate how a more intricate correction could be fitted.

# declaration for embedded-C code
%{
#include <linux/perf_event.h>
%}
# embedded-C function - note %{ %} bracketing
function sanitize_config:long (event:long) %{
    struct perf_event *event;
    event = (struct perf_event *) (unsigned long) STAP_ARG_event;
    event->attr.config &= INT_MAX;
%}

probe kernel.function("perf_swevent_init").call {
    sanitize_config($event)  # called with pointer
}
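To see why masking with INT_MAX is enough, it helps to work through the arithmetic. The Python sketch below models it under stated assumptions that do not come from this article: that the vulnerable code narrowed the 64-bit attr.config to a signed 32-bit int and then checked only the upper bound, and that PERF_COUNT_SW_MAX has the illustrative value used here. The helper names are ours.

```python
INT_MAX = 2**31 - 1
PERF_COUNT_SW_MAX = 9  # illustrative bound, an assumption for this sketch


def as_c_int(value):
    """Truncate to 32 bits and reinterpret as a signed C int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value > INT_MAX else value


def passes_upper_bound_only(config):
    """Model of a range check that forgets about negative indexes."""
    return as_c_int(config) < PERF_COUNT_SW_MAX


def sanitize(config):
    """Same masking as the band-aid: config &= INT_MAX."""
    return config & INT_MAX


hostile = 0xFFFFFFFF   # reads as -1 once narrowed to a signed int
benign = 1             # an ordinary small event id

index_before = as_c_int(hostile)
accepted_before = passes_upper_bound_only(hostile)
accepted_after = passes_upper_bound_only(sanitize(hostile))
```

After masking, the hostile value becomes a large positive number that the existing upper-bound check rejects, while small legitimate values pass through unchanged.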

CVE-2015-3456, “venom”

Here is a simple example from the recent VENOM bug, CVE-2015-3456, in QEMU’s floppy-drive emulation code. In this case, a buffer-overflow bug allows some user-supplied data to overwrite unrelated memory. The official upstream patch adds explicit range limiting for an invented index variable pos, in several functions. For example:

@@ -1852,10 +1852,13 @@
 static void fdctrl_handle_drive_specification_command(FDCtrl *fdctrl, int direction)
 {
     FDrive *cur_drv = get_cur_drv(fdctrl);
+    uint32_t pos;
-    if (fdctrl->fifo[fdctrl->data_pos - 1] & 0x80) {
+    pos = fdctrl->data_pos - 1;
+    pos %= FD_SECTOR_LEN;
+    if (fdctrl->fifo[pos] & 0x80) {
         /* Command parameters done */
-        if (fdctrl->fifo[fdctrl->data_pos - 1] & 0x40) {
+        if (fdctrl->fifo[pos] & 0x40) {
             fdctrl->fifo[0] = fdctrl->fifo[1];
             fdctrl->fifo[2] = 0;
             fdctrl->fifo[3] = 0;

Inspecting the original code, we see that the vulnerable index was inside a heap object at fdctrl->data_pos. A type-2 systemtap band-aid would have to adjust that value before the code runs the fifo[] dereference, and subsequently restore the previous value. This might be expressed like this:

global saved_data_pos
probe process("/usr/bin/qemu-system-*").function("fdctrl_*spec*_command").call
{
    saved_data_pos[tid()] = $fdctrl->data_pos;
    $fdctrl->data_pos = $fdctrl->data_pos % 512 # FD_SECTOR_LEN
}
probe process("/usr/bin/qemu-system-*").function("fdctrl_*spec*_command").return
{
    $fdctrl->data_pos = saved_data_pos[tid()]
    delete saved_data_pos[tid()]
}

The same work would have to be done at each of the three analogous vulnerable sites unless further detailed analysis suggests that a single common higher-level function could do the job.

However, this is probably too much work. The CVE advisory suggests that any call to this area of code is likely a deliberate exploit attempt (since modern operating systems don’t use the floppy driver). Therefore, we could opt for a type-1 band-aid, where we bypass the vulnerable  computations entirely. We find that all the vulnerable functions ultimately have a common caller, fdctrl_write.

static void fdctrl_write (void *opaque, uint32_t reg, uint32_t value)
{
    FDCtrl *fdctrl = opaque;
    [...]
    reg &= 7;
    switch (reg) {
    case FD_REG_DOR:
        fdctrl_write_dor(fdctrl, value);
        break;
        [...]
    case FD_REG_CCR:
        fdctrl_write_ccr(fdctrl, value);
        break;
    default:
        break;
    }
}

We can disarm the entire simulated floppy driver by pretending that the simulated CPU is addressing a reserved FDC register, thus falling through to the default: case. This requires just one probe. Here we’re being more conservative than necessary, overwriting only the low few bits:

probe process("/usr/bin/qemu-system-*").function("fdctrl_write")
{
    $reg = (($reg & ~7) | 6) # replace register address with 0x__6
}
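
The effect of that bit manipulation can be checked in isolation. The sketch below is JavaScript standing in for the C arithmetic; rewriteReg and selectedRegister are hypothetical names for the probe's assignment and for the `reg &= 7` statement in fdctrl_write.

```javascript
// Illustrative only: what the probe does to $reg, followed by the
// `reg &= 7` masking that fdctrl_write applies before its switch.
function rewriteReg(reg) {
  return ((reg & ~7) | 6) >>> 0; // keep the upper bits, force the low three bits to 6
}

function selectedRegister(reg) {
  return reg & 7; // the value the switch statement dispatches on
}

for (let reg = 0x3f0; reg <= 0x3f7; reg++) {
  // Every address in the range ends up selecting register 6.
  console.log(reg.toString(16), "->", selectedRegister(rewriteReg(reg)));
}
```

Because every incoming address selects register 6 after the rewrite, none of the case labels shown above that lead to the vulnerable handlers is reached.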

CVE-2015-0235, “ghost”

This recent bug in glibc involved a buffer overflow related to dynamic allocation with a miscalculated size. It affected a function that is commonly used in normal software, and the data required to determine whether the vulnerability would be triggered or not is not available in situ. Therefore, a type-1 error-inducing band-aid would not be appropriate.

However, it is a good candidate for type-2 data-correction. The script below works by incrementing the size_needed variable set around line 86 of glibc nss/digits_dots.c, so as to account for the missing sizeof (*h_alias_ptr). This makes the subsequent comparisons work and return error codes for buffer-overflow situations.

85
86 size_needed = (sizeof (*host_addr)
87             + sizeof (*h_addr_ptrs) + strlen (name) + 1);
88
89 if (buffer_size == NULL)
90   {
91     if (buflen < size_needed)
92       {
93         if (h_errnop != NULL)
94           *h_errnop = TRY_AGAIN;
95         __set_errno (ERANGE);
96         goto done;
97       }
98   }
99 else if (buffer_size != NULL && *buffer_size < size_needed)
100  {
101    char *new_buf;
102    *buffer_size = size_needed;
103    new_buf = (char *) realloc (*buffer, *buffer_size);

The script demonstrates an unusual technique. The variable in need of correction (size_needed) is deep within a particular function, so we need “statement” probes to place the correction before the bad value is used. Because of compiler optimizations, the exact line number where the probe may be placed can’t be known a priori, so we ask systemtap to try a whole range. The probe handler then protects itself against being invoked more than once (per function call) using an auxiliary flag array.

global added%
global trap = 1 # stap -G trap=0 to only trace, not fix
probe process("/lib*/libc.so.6").statement("__nss_hostname_digits_dots@*:87-102")
{
    if (! added[tid()])
    {
        added[tid()] = 1; # we only want to add once
        printf("%s[%d] BOO! size_needed=%d ", execname(), tid(),
 $size_needed)
        if (trap)
        {
            # The &@cast() business is a fancy sizeof(uintptr_t),
            # which makes this script work for both 32- and 64-bit glibc's.
            $size_needed = $size_needed + &@cast(0, "uintptr_t")[1]
            printf("ghostbusted to %d", $size_needed)
        }
        printf("\n")
    }
}

probe process("/lib*/libc.so.6").function("__nss_hostname_digits_dots").return
{
    delete added[tid()] # reset for next call
}

This type-2 band-aid allows applications to operate as though glibc were patched.
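
The size arithmetic behind the correction can be worked through with concrete numbers. The sketch below is JavaScript, not glibc code, and the byte sizes are assumptions for a 64-bit build (a 16-byte host_addr, two 8-byte pointers for h_addr_ptrs, one 8-byte pointer for h_alias_ptr); on a 32-bit build the pointer sizes would be 4.

```javascript
// Illustrative arithmetic only; every size below is an assumption
// for a 64-bit build, not a value read from glibc.
const sizes = {
  hostAddr: 16,  // assumed sizeof (*host_addr)
  hAddrPtrs: 16, // assumed sizeof (*h_addr_ptrs): two 8-byte pointers
  hAliasPtr: 8,  // assumed sizeof (*h_alias_ptr): one 8-byte pointer
};

// size_needed as computed around line 86 (ASCII host name assumed).
function sizeNeededOriginal(name) {
  return sizes.hostAddr + sizes.hAddrPtrs + name.length + 1;
}

// size_needed after the band-aid adds the missing pointer size.
function sizeNeededCorrected(name) {
  return sizeNeededOriginal(name) + sizes.hAliasPtr;
}

// Mirrors the `buflen < size_needed` test: false means ERANGE is returned.
function passesCheck(buflen, sizeNeeded) {
  return buflen >= sizeNeeded;
}

const name = "0".repeat(1000);
console.log(sizeNeededOriginal(name));                     // 1033
console.log(sizeNeededCorrected(name));                    // 1041
console.log(passesCheck(1036, sizeNeededOriginal(name)));  // true: too-small buffer accepted
console.log(passesCheck(1036, sizeNeededCorrected(name))); // false: now rejected
```

A buffer of 1036 bytes slips through the original comparison but is rejected once the missing term is added, which is the behaviour the band-aid restores.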

Conclusions

We hope you enjoyed this foray into systemtap and its unexpected application as a potential band-aid for security bugs. If you would like to learn more, read our documentation, contact our team, or just go forth and experiment. If this technology seems like a fit for your installation and situation, consult your vendor for a possible systemtap band-aid.

May 29, 2015

Fedora Security Team 90-Day Challenge to clean up vulnerabilities… an update

At the beginning of April, the Fedora Security Team (FST) started on a journey to close all critical and important CVEs in Fedora and EPEL that had originated in 2014 and before. Now that we’re two-thirds of the way through, I figured it would be a good time to see what we’ve accomplished so far.

Of the 38 CVEs (37 important and 1 critical) we originally identified: 14 have been closed, 1 is currently on QA, and 23 remain open.  The 14 closed CVEs represent around a third of all the identified CVEs.  So, not bad but also not great; there is still work to be done.

If you want to help get some of these CVEs cleaned up here’s a list of the target packages.  We need to make sure that upstream has fixed the problem and that the packagers are pushing these fixes into the repos.

  • ytnef
  • mediatomb
  • rubygem-httparty
  • rubygem-extlib
  • rubygem-crack
  • nagios
  • libmicrohttpd
  • directfb
  • nagios-plugins
  • dcmtk
  • sahana
  • opensaml-java
  • s3ql
  • tomcat
  • openstack-keystone
  • phpMemcachedAdmin

I hope to come back to you at the end of the month with a report on how all of the CVEs were fixed and who helped fix them!


May 26, 2015

Are you getting dac_override AVC message?

Some time ago, Dan Walsh wrote a blog post, “Why doesn’t SELinux give me the full path in an error message?”, about the DAC_OVERRIDE capability.

“According to SELinux By Example. DAC_OVERRIDE allows a process to ignore Discretionary Access Controls including access lists.”

In Fedora 22, we still have quite a large number of DAC_OVERRIDE rules allowed by default. You can check this using

$ sesearch -A -p dac_override -C |grep -v ^DT |wc -l
387

So the question is whether they are still needed. Most of them were added because of bad ownership of files and directories located in /var/lib, /var/log and /var/cache. But as you probably realize, this just “masks” bugs in applications and opens backdoors in the Fedora SELinux policy.

For this reason, we want to introduce a new Fedora 23 feature to remove these capabilities where it is possible.

Let’s test it on the following real example:

$ sesearch -A -s psad_t -t psad_t -c capability
Found 1 semantic av rules:
allow psad_t psad_t : capability { dac_override setgid setuid net_admin net_raw } ;

$ ls -ldZ /var/lib/psad /var/log/psad /var/run/psad /etc/psad/
drwxr-xr-x. 3 root root system_u:object_r:psad_etc_t:s0 4096 May 26 12:40 /etc/psad/
drwxr-xr-x. 2 root root system_u:object_r:psad_var_lib_t:s0 4096 May 26 12:35 /var/lib/psad
drwxr-xr-x. 4 root root system_u:object_r:psad_var_log_t:s0 4096 May 26 12:47 /var/log/psad
drwxr-xr-x. 2 root root system_u:object_r:psad_var_run_t:s0 100 May 26 12:44 /var/run/psad

$ ps -efZ |grep psad
system_u:system_r:psad_t:s0 root 25461 1 0 12:44 ? 00:00:00 /usr/bin/perl -w /usr/sbin/psad
system_u:system_r:psad_t:s0 root 25466 1 0 12:44 ? 00:00:00 /usr/sbin/psadwatchd -c /etc/psad/psad.conf

which looks correct. So is dac_override really needed for psad_t? How could I check it?

On my Fedora 23 system, I run with the following policy module:

$ cat dacoverride.cil
(typeattributeset cil_gen_require domain)
(auditallow domain self (capability (dac_override)))

It audits every use of dac_override, logging it as granted in /var/log/audit/audit.log whenever the capability is actually needed.

For example I see

type=AVC msg=audit(1432639909.704:380132): avc: granted { dac_override } for pid=28878 comm="sudo" capability=1 scontext=staff_u:staff_r:staff_sudo_t:s0-s0:c0.c1023 tcontext=staff_u:staff_r:staff_sudo_t:s0-s0:c0.c1023 tclass=capability

which is expected. But I don’t see it for psad_t when I use psad. So this is probably a bug in the policy, and dac_override should be removed for psad_t. We should also ask the psad maintainers for their agreement.

And what happens after the following ownership change?

$ ls -ldZ /var/log/psad/
drwxr-xr-x. 4 mgrepl mgrepl system_u:object_r:psad_var_log_t:s0 4096 May 26 13:53 /var/log/psad/

You get

type=AVC msg=audit(1432641212.164:380373): avc: granted { dac_override } for pid=30333 comm="psad" capability=1 scontext=system_u:system_r:psad_t:s0 tcontext=system_u:system_r:psad_t:s0 tclass=capability
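
When the audit log gets long, it helps to summarize which domains actually exercised the capability. The following is a minimal sketch in JavaScript, assuming the AVC records look like the two shown in this post; parseAvc and domainsUsingDacOverride are hypothetical helper names, not part of any SELinux tool.

```javascript
// Minimal sketch: extract the verdict, permissions, command and source
// domain from an AVC record of the shape shown in this post.
function parseAvc(line) {
  const verdict = /avc:\s+(granted|denied)\s+\{([^}]*)\}/.exec(line);
  const comm = /comm="([^"]*)"/.exec(line);
  const scontext = /scontext=(\S+)/.exec(line);
  if (!verdict || !scontext) return null; // not an AVC record we understand
  return {
    result: verdict[1],
    perms: verdict[2].trim().split(/\s+/),
    comm: comm ? comm[1] : null,
    domain: scontext[1].split(":")[2], // user:role:type:level -> type
  };
}

// Collect the domains for which dac_override was logged as granted.
function domainsUsingDacOverride(lines) {
  const domains = new Set();
  for (const line of lines) {
    const avc = parseAvc(line);
    if (avc && avc.result === "granted" && avc.perms.includes("dac_override")) {
      domains.add(avc.domain);
    }
  }
  return [...domains].sort();
}

const sample = [
  'type=AVC msg=audit(1432639909.704:380132): avc: granted { dac_override } for pid=28878 comm="sudo" capability=1 scontext=staff_u:staff_r:staff_sudo_t:s0-s0:c0.c1023 tcontext=staff_u:staff_r:staff_sudo_t:s0-s0:c0.c1023 tclass=capability',
  'type=AVC msg=audit(1432641212.164:380373): avc: granted { dac_override } for pid=30333 comm="psad" capability=1 scontext=system_u:system_r:psad_t:s0 tcontext=system_u:system_r:psad_t:s0 tclass=capability',
];
console.log(domainsUsingDacOverride(sample)); // [ 'psad_t', 'staff_sudo_t' ]
```

Domains that hold dac_override in the policy but never show up in such a summary are candidates for having the capability removed.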

 

 


May 20, 2015

How to create a new initial policy using sepolicy-generate tool?

I have a service running without its own SELinux domain and I would like to create a new initial policy for it.

How can I create a new initial policy? Is there a tool for it?

We get these questions very often, and my answer is pretty easy: yes, there is a tool which can help you with this task.

Let’s use a real example to demonstrate how to create your own initial policy for the lttng-sessiond service running on my system.

I see

$ ps -efZ |grep lttng-sessiond
system_u:system_r:unconfined_service_t:s0 root 29186 1 0 12:31 ? 00:00:00 /usr/bin/lttng-sessiond -d

unconfined_service_t tells us the lttng-sessiond service runs without SELinux confinement.

Basically, there is no problem with a service running as unconfined_service_t if this service does “everything” or is third-party software. A problem occurs if there are other services with their own SELinux domains that want to access objects created by your service.

Then you can see AVCs like

type=AVC msg=audit(1431724248.950:1003): avc: denied { getattr } for pid=768 comm="systemd-logind" path="/dev/shm/lttng-ust-wait-5" dev="tmpfs" ino=25832 scontext=system_u:system_r:systemd_logind_t:s0 tcontext=system_u:object_r:tmpfs_t:s0 tclass=file permissive=0

In that case, you want to create an SELinux policy from scratch so that objects created by your service get specific SELinux labels, and then see whether you can get proper SELinux confinement.

Let’s start.

1. You need to identify an executable file which is used to start a service. From

system_u:system_r:unconfined_service_t:s0 root 29186 1 0 12:31 ? 00:00:00 /usr/bin/lttng-sessiond -d

you can see /usr/bin/lttng-sessiond is used. Also

$ grep ExecStart /usr/lib/systemd/system/lttng-sessiond.service
ExecStart=/usr/bin/lttng-sessiond -d

is useful.

2. Run sepolicy-generate to create initial policy files.

sepolicy generate --init -n lttng /usr/bin/lttng-sessiond
Created the following files:
/home/mgrepl/Devel/RHEL/selinux-policy/lttng.te # Type Enforcement file
/home/mgrepl/Devel/RHEL/selinux-policy/lttng.if # Interface file
/home/mgrepl/Devel/RHEL/selinux-policy/lttng.fc # File Contexts file
/home/mgrepl/Devel/RHEL/selinux-policy/lttng_selinux.spec # Spec file
/home/mgrepl/Devel/RHEL/selinux-policy/lttng.sh # Setup Script

3. Run

# sh lttng.sh

4. YOU ARE DONE. CHECK YOUR RESULTS.

# ls -Z /usr/bin/lttng-sessiond
system_u:object_r:lttng_exec_t:s0 /usr/bin/lttng-sessiond
# systemctl restart lttng-sessiond
# ps -efZ |grep lttng-sessiond
system_u:system_r:lttng_t:s0 root 29850 1 0 12:50 ? 00:00:00 /usr/bin/lttng-sessiond -d
# ausearch -m avc -ts recent
... probably you see a lot of AVCs ...

Now you have created and loaded your own initial policy for your service. At this point, you can work on the AVCs, and you can ask us to help you with them.


JSON, Homoiconicity, and Database Access

During a recent review of an internal web application based on the Node.js platform, we discovered that combining JavaScript Object Notation (JSON) and database access (database query generators or object-relational mappers, ORMs) creates interesting security challenges, particularly for JavaScript programming environments.

To see why, we first have to examine traditional SQL injection.

Traditional SQL injection

Most programming languages do not track where strings and numbers come from. Looking at a string object, it is not possible to tell if the object corresponds to a string literal in the source code, or input data which was read from a network socket. Combined with certain programming practices, this lack of discrimination leads to security vulnerabilities. Early web applications relied on string concatenation to construct SQL queries before sending them to the database, using Perl constructs like this to load a row from the users table:

# WRONG: SQL injection vulnerability
$dbh->selectrow_hashref(qq{
  SELECT * FROM users WHERE users.user = '$user'
})

But if the externally supplied value for $user is "'; DROP TABLE users; --", instead of loading the user, the database may end up deleting the users table, due to SQL injection. Here’s the effective SQL statement after expansion of such a value:

  SELECT * FROM users WHERE users.user = ''; DROP TABLE users; --'

Because the provenance of strings is not tracked by the programming environment (as explained above), the SQL database driver only sees the entire query string and cannot easily reject such crafted queries.

Experience showed again and again that simply trying to avoid pasting untrusted data into query strings did not work. Too much data which looks trustworthy at first glance turns out to be under external control. This is why current guidelines recommend employing parametrized queries (sometimes also called prepared statements), where the SQL query string is (usually) a string literal, and the variable parameters are kept separate, combined only in the database driver itself (which has the necessary database-specific knowledge to perform any required quoting of the variables).
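
The difference between the two styles can be shown without a database. The sketch below is in JavaScript rather than Perl; the { text, values } shape and the "?" placeholder are assumptions for illustration (placeholder syntax varies between database drivers), and both function names are hypothetical.

```javascript
// WRONG: user data becomes part of the SQL text itself.
function concatenatedQuery(user) {
  return {
    text: "SELECT * FROM users WHERE users.user = '" + user + "'",
    values: [],
  };
}

// Parametrized: the SQL text is a fixed literal, and the variable part
// travels separately until the driver combines them.
function parametrizedQuery(user) {
  return {
    text: "SELECT * FROM users WHERE users.user = ?",
    values: [user],
  };
}

const crafted = "'; DROP TABLE users; --";
console.log(concatenatedQuery(crafted).text);
// SELECT * FROM users WHERE users.user = ''; DROP TABLE users; --'
console.log(parametrizedQuery(crafted).text);
// SELECT * FROM users WHERE users.user = ?
```

With the parametrized form, the query text is identical for every input, so a crafted value cannot change the structure of the statement.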

Homoiconicity and Query-By-Example

Query-By-Example is a way of constructing database queries based on example values. Consider a web application as an example. It might have a users table, containing columns such as user_id (a serial primary key), name, password (we assume the password is stored in the clear, although this practice is debatable), a flag that indicates if the user is an administrator, a last_login column, and several more.

We could describe a concrete row in the users table like this, using JavaScript Object Notation (JSON):

{
  "user_id": 1,
  "name": "admin",
  "password": "secret",
  "is_admin": true,
  "last_login": 1431519292
}

The query-by-example style of writing database queries takes such a row descriptor, omits some unknown parts, and treats the rest as the column values to match. We could check user name and password during a login operation like this:

{
  "name": "admin",
  "password": "secret"
}

If the database returns a row, we know that the user exists, and that the login attempt has been successful.

But we can do better. With some additional syntax, we can even express query operators. We could select the regular users who have logged in today (“1431475200” refers to midnight UTC, and "$gte" stands for “greater or equal”) with this query:

{
  "last_login": {"$gte": 1431475200},
  "is_admin": false
}

This is in fact the query syntax used by Sequelize, an object-relational mapping tool (ORM) for Node.js.

This achieves homoiconicity, which refers to a property of a programming environment where code (here: database queries) and data look very much alike, roughly speaking, and can be manipulated with similar programming language constructs. It is often hailed as a primary design achievement of the programming language Lisp. Homoiconicity makes query construction with the Sequelize toolkit particularly convenient. But it also means that there are no clear boundaries between code and data, similar to the old way of constructing SQL query strings using string concatenation, as explained above.

Getting JSON To The Database

Some server-side programming frameworks, notably Node.js, automatically decode bodies of POST requests of content type application/json into JavaScript JSON objects. In the case of Node.js, these JSON objects are indistinguishable from other such objects created by the application code. In other words, there is no marker class or other attribute that makes it possible to tell apart objects that come from inputs and objects that were created by (for example) object literals in the source.

Here is a simple example of a hypothetical login request. When Node.js processes the POST request shown first, it assigns a JavaScript object to the req.body field in exactly the same way the JavaScript code shown second does.

POST request:

POST /user/auth HTTP/1.0
Content-Type: application/json

{"name":"admin","password":"secret"}

Application code:

req.body = {
  name: "admin",
  password: "secret"
}

In a Node.js application using Sequelize, the application would first define a model User, and then use it as part of the authentication procedure, in code similar to this (for the sake of this example, we still assume the password is stored in plain text; the reason for that will become clear immediately):

User.findOne({
  where: {
    name: req.body.name,
    password: req.body.password
  }
}).then(function (user) {
  if (user) {
    // We got a user object, which means that login was successful.
    …
  } else {
    // No user object, login failure.
    …
  }
})

The query-by-example part is the object passed as where.

However, this construction has a security issue which is very difficult to fix. Suppose that the POST request looks like this instead:

POST /user/auth HTTP/1.0
Content-Type: application/json

{
  "name": {"$gte": ""},
  "password": {"$gte": ""}
}

This means that Sequelize will be invoked with this query (the values of name and password are the data that came from the POST request, but nothing visible to the Sequelize code marks them as such):

User.findOne({
  where: {
    name: {"$gte": ""},
    password: {"$gte": ""}
  }
})

Sequelize will translate this into a query similar to this one:

SELECT * FROM users WHERE name >= '' AND password >= '';

Any string is greater than or equal to the empty string, so this query will find any user in the system, regardless of the user name or password. Unless there are other constraints imposed by the application, this allows an attacker to bypass authentication.
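
To make the mechanism concrete, here is a toy query-by-example translator in JavaScript. It is a deliberately simplified sketch and not Sequelize's implementation: whereToSql, the operator table and the "?" placeholders are all assumptions for illustration. Note that the values are passed as parameters, yet the request still changes the structure of the query.

```javascript
// Toy translator: NOT Sequelize's code, just the query-by-example idea.
const OPERATORS = new Map([
  ["$gte", ">="],
  ["$gt", ">"],
  ["$lte", "<="],
  ["$lt", "<"],
]);

function whereToSql(where) {
  const clauses = [];
  const values = [];
  for (const [column, example] of Object.entries(where)) {
    if (example !== null && typeof example === "object") {
      // An object is interpreted as "operator: operand" pairs.
      for (const [op, operand] of Object.entries(example)) {
        if (!OPERATORS.has(op)) throw new Error("unknown operator: " + op);
        clauses.push(column + " " + OPERATORS.get(op) + " ?");
        values.push(operand);
      }
    } else {
      // A plain value is interpreted as an equality match.
      clauses.push(column + " = ?");
      values.push(example);
    }
  }
  return { text: clauses.join(" AND "), values };
}

const honest = JSON.parse('{"name":"admin","password":"secret"}');
const crafted = JSON.parse('{"name":{"$gte":""},"password":{"$gte":""}}');
console.log(whereToSql(honest).text);  // name = ? AND password = ?
console.log(whereToSql(crafted).text); // name >= ? AND password >= ?
```

The same application code produces an equality test for one request body and a range test for the other, because the request decides whether a value is data or an operator expression.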

What can be done about this? Unfortunately, not much. Validating POST request contents and checking that all the values passed to database queries are of the expected type (string, number or Boolean) works to mitigate individual injection issues, but the experience with SQL injection issues mentioned at the beginning of this post suggests that this is not likely to work out in practice, particularly in Node.js, where so much data is exposed as JSON objects. Another option would be to break homoiconicity, and mark in the query syntax where the query begins and data ends. Getting this right is a bit tricky. Other Node.js database frameworks do not describe query structure in terms of JSON objects at all; Knex.js and Bookshelf.js are in this category.
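
The type check mentioned above might look like the following sketch. requireScalar and loginQuery are hypothetical helpers, not part of Sequelize or Node.js, and as noted this only mitigates the individual query it guards; every other query in the application needs the same care.

```javascript
// Minimal sketch: accept only primitive values where a column value is expected.
function requireScalar(value, label) {
  const type = typeof value;
  if (type === "string" || type === "boolean" ||
      (type === "number" && Number.isFinite(value))) {
    return value;
  }
  throw new TypeError(label + " must be a string, number or Boolean");
}

// Build the where clause for the login example from a decoded request body.
function loginQuery(body) {
  return {
    where: {
      name: requireScalar(body.name, "name"),
      password: requireScalar(body.password, "password"),
    },
  };
}

console.log(loginQuery({ name: "admin", password: "secret" }));
try {
  loginQuery(JSON.parse('{"name":{"$gte":""},"password":{"$gte":""}}'));
} catch (err) {
  console.log("rejected:", err.message); // rejected: name must be a string, number or Boolean
}
```

The check rejects the operator object before it can reach the query builder, while ordinary string credentials pass through unchanged.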

Due to the prevalence of JSON, such issues are most likely to occur within Node.js applications and frameworks. However, already in July 2014, Kazuho Oku described a JSON injection issue in the SQL::Maker Perl package, discovered by his colleague Toshiharu Sugiyama.

Update (2015-05-26): After publishing this blog post, we learned that a very similar issue has also been described in the context of MongoDB: Hacking NodeJS and MongoDB.

Other fixable issues in Sequelize

Sequelize overloads the findOne method with a convenience feature for primary-key based lookup. This encourages programmers to write code like this:

User.findOne(req.body.user_id).then(function (user) {
  … // Process results.
})

This allows attackers to ship a complete query object (with the “{where: …}” wrapper) in a POST request. Even with strict query-by-example queries, this can be abused to probe the values of normally inaccessible table columns. This can be done efficiently using comparison operators (with one bit leaking per query) and binary search.
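
The binary-search idea can be simulated against an in-memory stand-in for the users table. Everything in this JavaScript sketch is hypothetical: rowExists plays the role of "did the request return a row?", the hidden value is assumed to be printable ASCII, and JavaScript string comparison stands in for the database collation.

```javascript
// Stand-in table; in the real scenario this data is hidden behind the application.
const users = [{ name: "admin", password: "secret" }];

// Hypothetical oracle: true if a row for `name` has password >= probe.
// Each call corresponds to one query and leaks one bit.
function rowExists(name, probe) {
  return users.some((u) => u.name === name && u.password >= probe);
}

function recoverColumn(name, maxLength) {
  let known = "";
  let queries = 0;
  while (known.length < maxLength) {
    let lo = 32;  // lowest printable ASCII character
    let hi = 126; // highest printable ASCII character
    queries++;
    // If even the lowest character fails, the value ends at `known`.
    if (!rowExists(name, known + String.fromCharCode(lo))) break;
    // Find the largest character c such that the value >= known + c.
    while (lo < hi) {
      const mid = Math.ceil((lo + hi) / 2);
      queries++;
      if (rowExists(name, known + String.fromCharCode(mid))) lo = mid;
      else hi = mid - 1;
    }
    known += String.fromCharCode(lo);
  }
  return { value: known, queries };
}

console.log(recoverColumn("admin", 64).value); // secret
```

Each character costs only a handful of yes/no queries, which is why a comparison operator on an otherwise inaccessible column is enough to read it out.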

But there is another issue. This construct

User.findOne({
  where: "user_id IN (SELECT user_id " +
    "FROM blocked_users WHERE unblock_time IS NULL)"
}).then(function (user) {
  … // Process results.
})

pastes the where string directly into the generated SQL query (here it is used to express something that would be difficult to do directly in Sequelize, say, because the blocked_users table is not modeled). With the “findOne(req.body.user_id)” example above, a POST request such as

POST /user/auth HTTP/1.0
Content-Type: application/json

{"user_id":{"where":"0=1; DROP TABLE users;--"}}

would result in a generated query in which everything after WHERE comes from the request:

SELECT * FROM users WHERE 0=1; DROP TABLE users;--;

(This will not work with some databases and database drivers which reject multi-statement queries. In such cases, fairly efficient information leaks can be created with sub-queries and a binary search approach.)

This is not a defect in Sequelize; it is a deliberate feature. Perhaps it would be better if this functionality were not reachable with plain JSON objects. Sequelize already supports marker objects for including literals, and a similar marker object could be used for verbatim SQL.

The Sequelize upstream developers have mitigated the first issue in version 3.0.0. A new method, findById (with an alias, findByPrimary), has been added which queries exclusively by primary keys (“{where: …}” queries are not supported). At the same time, the search-by-primary-key automation has been removed from findOne, forcing applications to choose explicitly between primary key lookup and full JSON-based query expression. This explicit choice means that the second issue (although not completely removed from version 3.0.0) is no longer directly exposed. But as expected, altering the structure of a query by introducing JSON constructs (as with the "$gte" example) is still possible, and to prevent that, applications have to check the JSON values that they put into Sequelize queries.

Conclusion

JSON-based query-by-example expressions can be an intuitive way to write database queries. However, this approach, when taken further and enhanced with operators, can lead to a reemergence of injection issues which are reminiscent of SQL injection, something these tools try to avoid by operating at a higher abstraction level. If you, as an application developer, decide to use such a tool, then you will have to make sure that data passed into queries has been properly sanitized.

May 19, 2015

Is SELinux good anti-venom?
SELinux to the Rescue 

If you have been following the news lately you might have heard of the "Venom" vulnerability.

Researchers found a bug in the QEMU process, which is used to run virtual machines on top of KVM-based Linux machines.  Red Hat, CentOS and Fedora systems were potentially vulnerable.  Updated packages have been released for all platforms to fix the problem.

But we use SELinux to prevent virtual machines from attacking other virtual machines or the host.  SELinux protection on VMs is often called sVirt.  We run all virtual machines with the svirt_t type.  We also use MCS Separation to isolate one VM from other VMs and their images on the system.

While to the best of my knowledge no one has developed an actual hack to break out of the virtualization layer, I do wonder whether or not the break out would even be allowed by SELinux. SELinux has protections against executable memory, which is usually used for buffer overflow attacks.  These are the execmem, execheap and execstack access controls.  There is a decent chance that these would have blocked the attack.

# sesearch -A -s svirt_t -t svirt_t -c process -C
Found 2 semantic av rules:
   allow svirt_t svirt_t : process { fork sigchld sigkill sigstop signull signal getsched setsched getsession getcap getattr setrlimit } ; 
DT allow svirt_t svirt_t : process { execmem execstack } ; [ virt_use_execmem ]

Examining the policy on my Fedora 22 machine, we can look at the types that a svirt_t process would be allowed to write. These are the types that SELinux would allow the process to write, if they had matching MCS labels, or s0.

# sesearch -A -s svirt_t -c file -p write -C | grep open 
   allow virt_domain qemu_var_run_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow virt_domain svirt_home_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow virt_domain svirt_tmp_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow virt_domain svirt_image_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow virt_domain svirt_tmpfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
   allow virt_domain virt_cache_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; 
DT allow virt_domain fusefs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_fusefs ]
DT allow virt_domain cifs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_samba ]
ET allow virt_domain dosfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_usb ]
DT allow virt_domain nfs_t : file { ioctl read write create getattr setattr lock append unlink link rename open } ; [ virt_use_nfs ]
ET allow virt_domain usbfs_t : file { ioctl read write getattr lock append open } ; [ virt_use_usb ]

Lines beginning with D are disabled, and are only enabled by toggling the boolean named in brackets.  I did a video showing the access available to an OpenShift process running as root on your system using the same technology.  Click here to view.

SELinux also blocks capabilities, so the qemu process, even if running as root, would only have the net_bind_service capability, which allows it to bind to ports < 1024.

# sesearch -A -s svirt_t -c capability -C
Found 1 semantic av rules:
   allow svirt_t svirt_t : capability net_bind_service ; 

Dan Berrange, creator of libvirt, sums it up nicely on the Fedora Devel list:

"While you might be able to crash the QEMU process associated with your own guest, you should not be able to escalate from there to take over the host, nor be able to compromise other guests on the same host. The attacker would need to find a second independent security flaw to let them escape SELinux in some manner, or some way to trick libvirt via its QEMU monitor connection. Nothing is guaranteed 100% foolproof, but in absence of other known bugs, sVirt provides good anti-venom for this flaw IMHO."

Did you setenforce 1?

May 13, 2015

VENOM, don’t get bitten.
CC BY-SA CrowdStrike


QEMU is a generic and open source machine emulator and virtualizer and is incorporated in some Red Hat products as a foundation and hardware emulation layer for running virtual machines under the Xen and KVM hypervisors.

CVE-2015-3456 (aka VENOM) is a security flaw in QEMU’s Floppy Disk Controller (FDC) emulation. It can be exploited by a malicious guest user with access to the FDC I/O ports by issuing specially crafted FDC commands to the controller. It can result in guest-controlled execution of arbitrary code in, and with the privileges of, the corresponding QEMU process on the host. In the worst case, this can be a guest-to-host escape with root privileges.

This issue affects all x86 and x86-64 based HVM Xen and QEMU/KVM guests, regardless of their machine type, because both PIIX and ICH9 based QEMU machine types create an ISA bridge (ICH9 via LPC) and make the FDC accessible to the guest. It is also exposed regardless of the presence of any floppy-related QEMU command line options, so even guests without a floppy disk explicitly enabled in the libvirt or Xen configuration files are affected.

We believe that code execution is possible but we have not yet seen any working reproducers that would allow this.

This flaw arises because of an unrestricted indexed write access to the fixed size FIFO memory buffer that FDC emulation layer uses to store commands and their parameters. The FIFO buffer is accessed with byte granularity (equivalent of FDC data I/O port write) and the current index is incremented afterwards. After each issued and processed command the FIFO index is reset to 0 so during normal processing the index cannot become out-of-bounds.

For certain commands (such as FD_CMD_READ_ID and FD_CMD_DRIVE_SPECIFICATION_COMMAND) though the index is either not reset for certain period of time (FD_CMD_READ_ID) or there are code paths that don’t reset the index at all (FD_CMD_DRIVE_SPECIFICATION_COMMAND), in which case the subsequent FDC data port writes result in sequential FIFO buffer memory writes that can be out-of-bounds of the allocated memory. The attacker has full control over the values that are stored and also almost fully controls the length of the write. Depending on how the FIFO buffer is defined, he might also have a little control over the index as in the case of Red Hat Enterprise Linux 5 Xen QEMU package, where the index variable is stored after the memory designated for the FIFO buffer.

Depending on the location of the FIFO memory buffer, this can either result in stack or heap overflow. For all of the Red Hat Products using QEMU the FIFO memory buffer is allocated from the heap.

Red Hat has issued security advisories to fix this flaw and instructions for applying the fix are available on the knowledge-base.

Mitigation

The sVirt and seccomp functionalities used to restrict host’s QEMU process privileges and resource access might mitigate the impact of successful exploitation of this issue.  A possible policy-based workaround is to avoid granting untrusted users administrator privileges within guests.

May 09, 2015

Setting up an RDO deployment to be Identity V3 Only

The OpenStack Identity API Version 3 provides support for many features that are not available in version 2. Much of the installer code from Devstack, Puppet Modules, and Packstack assumes that Keystone is operating with the V2 API. In the interest of hastening the conversion, I set up a deployment that is V3 only. Here is how I did it.

The order I performed these operations was:

  1. Convert Horizon
  2. Convert the Service Catalog
  3. Disable the V2 API in Keystone
  4. Convert the authtoken stanza and the Endpoint config files to use discovery

Horizon

Horizon was the simplest. To change Horizon to use the V3 API, edit the local_settings. For RDO, this file is in:
/etc/openstack-dashboard/local_settings

At the end, I added:

OPENSTACK_API_VERSIONS = {
    "identity": 3
}
OPENSTACK_KEYSTONE_MULTIDOMAIN_SUPPORT = True
OPENSTACK_KEYSTONE_DEFAULT_DOMAIN = 'Default'

You might want to make the default domain value something different, especially if you are using a domain specific backend for LDAP.

Service Catalog

Next up is migrating the Keystone service catalog. You can query the current values by using direct SQL.

mysql  --user keystone_admin --password=SECRETE   keystone -e "select interface, url from endpoint where service_id =  (select id from service where service.type = 'identity');" 

By default, the responses will have v2.0 at the end of them:

+-----------+-------------------------------+
| interface | url                           |
+-----------+-------------------------------+
| admin     | http://10.10.10.40:35357/v2.0 |
| public    | http://10.10.10.40:5000/v2.0  |
| internal  | http://10.10.10.40:5000/v2.0  |
+-----------+-------------------------------+

I used SQL to modify them. For example:

mysql  --user keystone_admin --password=SECRETE   keystone -e "update endpoint set   url  = 'http://10.10.10.40:5000/v3' where  interface ='internal' and  service_id =  (select id from service where service.type = 'identity');" 
mysql  --user keystone_admin --password=SECRETE   keystone -e "update endpoint set   url  = 'http://10.10.10.40:5000/v3' where  interface ='public' and  service_id =  (select id from service where service.type = 'identity');" 
mysql  --user keystone_admin --password=SECRETE   keystone -e "update endpoint set   url  = 'http://10.10.10.40:35357/v3' where  interface ='admin' and  service_id =  (select id from service where service.type = 'identity');" 

You cannot use the openstack CLI to perform this; attempting to change a URL fails:

$ openstack  endpoint set --interface public  --service keystone http://10.10.10.40:5000/v2.0
ERROR: openstack More than one endpoint exists with the name 'http://10.10.10.40:5000/v2.0'.

I’ll open a ticket for that.

To use the V3 API for operations, you are going to want a V3 Keystone RC file. Here is mine:

export OS_USERNAME=admin
export OS_PROJECT_NAME=admin
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_PASSWORD=SECRETE
export OS_AUTH_URL=http://$HOSTNAME:5000/v3
export OS_REGION_NAME=RegionOne
export PS1='[\u@\h \W(keystone_admin)]\$ '
export OS_IDENTITY_API_VERSION=3

Disabling V2.0

To ensure you are using V3, it is worthwhile to disable V2.0. The simplest way to do that is to modify the paste file that controls the pipelines. On an RDO system this is /etc/keystone/keystone-paste.ini. I did it by commenting out the following lines:

#[pipeline:public_api]
# The last item in this pipeline must be public_service or an equivalent
# application. It cannot be a filter.
#pipeline = sizelimit url_normalize request_id build_auth_context token_auth admin_token_auth json_body ec2_extension user_crud_extension public_service

#[pipeline:admin_api]
# The last item in this pipeline must be admin_service or an equivalent
# application. It cannot be a filter.
#pipeline = sizelimit url_normalize request_id build_auth_context token_auth admin_token_auth json_body ec2_extension s3_extension crud_extension admin_service

and I removed them from the composites:

[composite:main]
use = egg:Paste#urlmap
#/v2.0 = public_api
/v3 = api_v3
/ = public_version_api

[composite:admin]
use = egg:Paste#urlmap
#/v2.0 = admin_api
/v3 = api_v3
/ = admin_version_api

Configuring Other services

This setup was not using Neutron, so I only had to handle Nova, Glance, and Cinder. The process should be comparable for Neutron.

RDO adds configuration values under /usr/share/<service>/<service>-dist.conf that override the defaults from the Python code. For example, the Nova package has /usr/share/nova/nova-dist.conf. I commented out the following values, as they are based on old guidance for setting up authtoken and are not how the auth plugins for Keystone client should be configured:

[keystone_authtoken]
#admin_tenant_name = %SERVICE_TENANT_NAME%
#admin_user = %SERVICE_USER%
#admin_password = %SERVICE_PASSWORD%
#auth_host = 127.0.0.1
#auth_port = 35357
#auth_protocol = http
# Workaround for https://bugs.launchpad.net/nova/+bug/1154809
#auth_version = v2.0

To set the proper values, I put the following in /etc/nova/nova.conf

[keystone_authtoken]
auth_plugin = password
auth_url = http://10.10.10.40:35357
username = nova
password = SECRETE
project_name = services
user_domain_name = Default
project_domain_name = Default
# This value is not needed unless you leave /usr/share/nova/nova-dist.conf unmodified
#auth_version=v3

A big thanks to Jamie Lennox for helping me get this straight.

I made a comparable change for Glance. For Cinder, the change needs to be made in /etc/cinder/api-paste.ini, but the values are comparable:

[filter:authtoken]
paste.filter_factory = keystonemiddleware.auth_token:filter_factory
auth_plugin = password
auth_url = http://10.10.10.40:35357
username = cinder
password = SECRETE
project_name = services
user_domain_name = Default
project_domain_name = Default
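Since the same set of options has to be right in several files, a small check can catch a leftover legacy value. The following is a rough sketch using Python's standard configparser; the function name check_authtoken and the two option sets are my own grouping of the values shown above, and a real nova.conf may contain syntax that configparser does not accept, so treat it as a starting point only.

```python
import configparser

# Options from the old authtoken guidance that are commented out above.
LEGACY = {"admin_tenant_name", "admin_user", "admin_password",
          "auth_host", "auth_port", "auth_protocol"}
# Options set above for the password auth plugin.
REQUIRED = {"auth_plugin", "auth_url", "username", "password",
            "project_name", "user_domain_name", "project_domain_name"}


def check_authtoken(text, section="keystone_authtoken"):
    """Return (missing_required, legacy_present) for one config section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    present = set(parser[section]) if parser.has_section(section) else set()
    return sorted(REQUIRED - present), sorted(LEGACY & present)


sample = """
[keystone_authtoken]
auth_plugin = password
auth_url = http://10.10.10.40:35357
username = nova
password = SECRETE
project_name = services
user_domain_name = Default
project_domain_name = Default
"""
missing, legacy = check_authtoken(sample)
```

For the Cinder paste file, the same function can be pointed at the other section with section="filter:authtoken".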

You can restart services using the command openstack-service. To restart Nova, run:

sudo openstack-service restart nova

Run comparable commands for Cinder and Glance. I tested the endpoints using Horizon: for Glance, use the images page, and for Cinder, the volumes page. All other pages were controlled by Nova. Neutron would be the network administration pages. If you get errors on a page saying “cannot access”, it is a sign that the service is still attempting to do V2 API token verification. Looking in the Keystone access log verified that for me. If you see lines like:

10.10.10.40 - - [09/May/2015:03:20:23 +0000] "GET /v2.0 HTTP/1.1" 404 93 "-" "python-keystoneclient"

you know something is trying to use the V2 API.
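Scanning the access log by eye gets tedious, so here is a minimal sketch of the same check in Python. It assumes the log lines look like the one above (client address first, request line in double quotes); the function name v2_requests is hypothetical.

```python
import re

# Matches the quoted request line, e.g. "GET /v2.0 HTTP/1.1".
REQUEST = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+"')


def v2_requests(lines):
    """Yield (client, path) for every request whose path starts with /v2.0."""
    for line in lines:
        match = REQUEST.search(line)
        if match and match.group("path").startswith("/v2.0"):
            yield line.split(" ", 1)[0], match.group("path")


sample = [
    '10.10.10.40 - - [09/May/2015:03:20:23 +0000] "GET /v2.0 HTTP/1.1" 404 93 "-" "python-keystoneclient"',
    '10.10.10.40 - - [09/May/2015:03:20:24 +0000] "POST /v3/auth/tokens HTTP/1.1" 201 93 "-" "python-keystoneclient"',
]
hits = list(v2_requests(sample))
```

The second sample line is invented to show a V3 request being ignored. Any entry in hits points at a client that still needs to be reconfigured.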

May 06, 2015

Explaining Security Lingo

This post aims to clarify certain terms often used in the security community. Let’s start with the easiest one: vulnerability. A vulnerability is a flaw in a selected system that allows an attacker to compromise the security of that particular system. The consequence of such a compromise can impact the confidentiality, integrity, or availability of the attacked system (these three aspects are also the base metrics of the CVSS v2 scoring system that are used to rate vulnerabilities). ISO/IEC 27000, IETF RFC 2828, NIST, and others have very specific definitions of the term vulnerability, each differing slightly. A vulnerability’s attack vector is the actual method of using the discovered flaw to cause harm to the affected software; it can be thought of as the entry point to the system or application. A vulnerability without an attack vector is normally not assigned a CVE number.

When a vulnerability is found, an exploit can be created that makes use of this vulnerability. Exploits can be thought of as a way of utilizing one or more vulnerabilities to compromise the targeted software; they can come in the form of an executable program, or a simple set of commands or instructions. Exploits can be local, executed by a user on a system that they have access to, or remote, executed to target certain vulnerable services that are exposed over the network.

Once an exploit is available for a vulnerability, this presents a threat for the affected software and, ultimately, for the person or business operating the affected software. ISO/IEC 27000 defines a threat as “A potential cause of an incident, that may result in harm of systems and organization”. Assessing threats is a crucial part of the threat management process that should be a part of every company’s IT risk management policy. Microsoft has defined a useful threat assessment model, STRIDE, that is used to assess every threat in several categories: Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, and Elevation of privilege. Each of these categories correlates to a particular security property of the affected software; for example, if a vulnerability allows the attacker to tamper with the system (Tampering), the integrity of that system is compromised. A targeted threat is a type of threat that is specific to a particular application or system; such threats usually involve malware designed to utilize a variety of known vulnerabilities in specific applications that have a large user base, for example, Flash, WordPress, or PHP.
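The correlation between STRIDE categories and security properties is easy to keep at hand as a lookup table. The sketch below uses the pairing commonly given for STRIDE; the helper violated_properties is a made-up name for illustration.

```python
# STRIDE category -> the security property that the threat violates.
STRIDE = {
    "Spoofing": "Authentication",
    "Tampering": "Integrity",
    "Repudiation": "Non-repudiation",
    "Information disclosure": "Confidentiality",
    "Denial of service": "Availability",
    "Elevation of privilege": "Authorization",
}


def violated_properties(categories):
    """Return the sorted, de-duplicated properties for the given categories."""
    return sorted({STRIDE[category] for category in categories})


# A flaw that allows both tampering and denial of service.
affected = violated_properties(["Tampering", "Denial of service"])
```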

A related term often considered when assessing a threat is a vulnerability window. This is the time from the moment a vulnerability is published, regardless of whether an exploit exists, up to the point when a fix or a workaround is available that can be used to mitigate the vulnerability. If a vulnerability is published along with a fix, then the vulnerability window can also represent the time it takes to patch that particular vulnerability.

A zero-day vulnerability is a subclass of all vulnerabilities that is published while the affected software has no available patch that would mitigate the issue. Similarly, a zero-day exploit is an exploit that uses a vulnerability that has not yet been patched. Edit: Alternatively, the term zero-day can be used to refer to a vulnerability that has not yet been published publicly or semi-publicly (for example, on a closed mailing list). The term zero-day exploit would then refer to an exploit for an undisclosed vulnerability. The two differing definitions for the term zero-day may be influenced by the recent media attention that security issues have received. Media, maybe unknowingly, have coined the term zero-day to represent critical issues that are disclosed without being immediately patched. Nevertheless, zero-day as a term is not strictly defined and should be used with care to avoid ambiguity in communication.

Unpatched vulnerabilities can allow malicious users to conduct an attack. Attacking a system or an application is the act of using a vulnerability’s exploit to compromise the security policy of the attacked asset. Attacks can be categorized as either active, which directly affect integrity or availability of the system, or passive, which are used to compromise the confidentiality of the system without affecting the system. An example of an ongoing active attack can be a distributed denial of service attack that targets a particular website with the intention of compromising its availability.

The terminology described above is only the tip of the iceberg when it comes to the security world. IETF RFC 2828, for example, consists of 191 pages of definitions and 13 pages of references strictly relevant to IT security. However, knowing the difference between terms such as threat or exploit can be quite crucial when assessing and communicating a vulnerability within a team or a community.

May 01, 2015

Automating Kerberos Authentication

Sometimes you need unattended authentication. Sometimes you are just lazy. Whatever the reason, if a user (human or otherwise) wants to fetch a Ticket Granting Ticket (TGT) from a Kerberos Key Distribution Center (KDC) automatically, the Generic Security Services API (GSSAPI) library shipped with most recent distributions supports it.

Kerberos is based on symmetric cryptography. If a user needs to store a symmetric key in a filesystem, she uses a file format known as a key table, or keytab for short. Fetching a keytab is not a standard action, but FreeIPA has shipped with a utility to make it easier: ipa-getkeytab.

Before I attempt to get a keytab, I want to authenticate to my KDC and get a TGT manually:

$ kinit ayoung@YOUNGLOGIC.NET
Password for ayoung@YOUNGLOGIC.NET: 
[ayoung@ayoung530 tempest (master)]$ klist
Ticket cache: KEYRING:persistent:14370:krb_ccache_H4Ss9cA
Default principal: ayoung@YOUNGLOGIC.NET

Valid starting       Expires              Service principal
05/01/2015 09:07:06  05/02/2015 09:06:55  krbtgt/YOUNGLOGIC.NET@YOUNGLOGIC.NET

To fetch a keytab and store it in the user’s home directory, you can run the following command. I’ve coded it to talk to my younglogic.net KDC, so modify it for yours.

ipa-getkeytab -p $USER@YOUNGLOGIC.NET -k $HOME/client.keytab -s ipa.younglogic.net

You can get your own principal from the klist output:

export KRB_PRINCIPAL=$(klist | awk '/Default principal:/ {print $3}')

If you are running on an ipa-client enrolled machine, much of the info you need is in /etc/ipa/default.conf.

$ cat   /etc/ipa/default.conf 
#File modified by ipa-client-install

[global]
basedn = dc=younglogic,dc=net
realm = YOUNGLOGIC.NET
domain = younglogic.net
server = ipa.younglogic.net
host = rdo.younglogic.net
xmlrpc_uri = https://ipa.younglogic.net/ipa/xml
enable_ra = True

You can convert these values into environment variables with:

$(awk '/=/ {print "export IPA_" toupper($1)"="$3}' < /etc/ipa/default.conf)

Now a user could manually kinit using that keytab and the following commands:

 
$(awk '/=/ {print "export IPA_" toupper($1)"="$3}' < /etc/ipa/default.conf)
kinit -k -t $HOME/client.keytab $USER@$IPA_REALM
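For scripts written in Python rather than shell, the same conversion can be sketched with the standard configparser module. The function name ipa_environment is made up for this example, and the sample text is a shortened copy of the file shown above.

```python
import configparser


def ipa_environment(text):
    """Map each key in the [global] section to an IPA_<KEY> name."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {"IPA_" + key.upper(): value
            for key, value in parser["global"].items()}


sample = """#File modified by ipa-client-install

[global]
basedn = dc=younglogic,dc=net
realm = YOUNGLOGIC.NET
domain = younglogic.net
server = ipa.younglogic.net
"""
env = ipa_environment(sample)
principal = "ayoung@" + env["IPA_REALM"]
```

Only the first "=" on each line separates key from value, so a value such as the base DN keeps its own "=" characters.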

We can skip the kinit step by putting the keytab in a specific location. If you look in the man page for krb5.conf you can find the following section:

default_client_keytab_name
This relation specifies the name of the default keytab for obtaining client credentials. The default is FILE:/var/kerberos/krb5/user/%{euid}/client.keytab. This relation is subject to parameter expansion.

What is %{euid}? It is the numeric user ID of a user. For yourself, the value is set in $EUID. What if you need it for a different user? Use the getent command to query the name service switch database for this value:

export AYOUNG_EUID=$(getent passwd ayoung | cut -d: -f3)
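The same lookup, plus the expansion of the default keytab location, can be sketched in Python with the standard pwd module. The function names here are hypothetical, and the path template is simply the default quoted from the man page above.

```python
import os
import pwd


def default_client_keytab(uid):
    """Expand %{euid} in the default client keytab location for uid."""
    return "FILE:/var/kerberos/krb5/user/{}/client.keytab".format(uid)


def uid_for(username):
    """Look up a numeric user ID through the name service switch."""
    return pwd.getpwnam(username).pw_uid


# Location the library would check for the user running this script.
my_keytab = default_client_keytab(os.geteuid())
```

For another account, default_client_keytab(uid_for("ayoung")) gives the location that account's keytab would need.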

You need to create that directory before you can put something in it. You only want the user to be able to read or write in that directory.

sudo mkdir  /var/kerberos/krb5/user/$EUID
sudo chown $USER:$USER  /var/kerberos/krb5/user/$EUID 
chmod 700  /var/kerberos/krb5/user/$EUID

Now use that to store the keytab:

ipa-getkeytab -p $KRB_PRINCIPAL -k /var/kerberos/krb5/user/$EUID/client.keytab -s $IPA_SERVER

To test out the new keytab, run kdestroy to remove the existing TGTs, then try performing an action that would require a service ticket.

Here I show an initially cleared credential cache that gets automatically populated when I connect to a remote system via ssh.

[ayoung@ayoung530 tempest (master)]$ kdestroy -A
[ayoung@ayoung530 tempest (master)]$ klist -A
[ayoung@ayoung530 tempest (master)]$ ssh -K rdo.younglogic.net
Last login: Fri May  1 16:42:28 2015 from c-1-2-3-4.imadethisup.net
-sh-4.2$ exit
logout
Connection to rdo.younglogic.net closed.
[ayoung@ayoung530 tempest (master)]$ klist -A
Ticket cache: KEYRING:persistent:14370:krb_ccache_WotXvlm
Default principal: ayoung@YOUNGLOGIC.NET

Valid starting       Expires              Service principal
05/01/2015 12:42:46  05/02/2015 12:42:45  host/rdo.younglogic.net@YOUNGLOGIC.NET
05/01/2015 12:42:46  05/02/2015 12:42:45  host/rdo.younglogic.net@
05/01/2015 12:42:45  05/02/2015 12:42:45  krbtgt/YOUNGLOGIC.NET@YOUNGLOGIC.NET

I would not recommend doing this for normal users. But for service users that need automated access to remote services, this is the correct approach.

April 29, 2015

Creating Hierarchical Projects in Keystone

Hierarchical Multitenancy is coming. Look busy.

Until we get CLI support for creating projects with parent relationships, we have to test via curl. This has given me a chance to clean up a few little techniques on using jq and heredocs.

#!/usr/bin/bash -x
. ./keystonerc_admin

TOKEN=$( curl -si  -H "Content-type: application/json"  -d@- $OS_AUTH_URL/auth/tokens <<EOF | awk '/X-Subject-Token/ {print $2}' | tr -d '\r'
{
    "auth": {
        "identity": {
            "methods": [
                "password"
            ],
            "password": {
                "user": {
                    "domain": {
                        "name": "$OS_USER_DOMAIN_NAME"
                    },
                    "name": "admin",
                    "password": "$OS_PASSWORD"
                }
            }
        },
        "scope": {
            "project": {
                "domain": {
                    "name": "$OS_PROJECT_DOMAIN_NAME"
                },
                "name": "$OS_PROJECT_NAME"
            }
        }
    }
}
EOF
)

PARENT_PROJECT=$( curl  -H "Content-type: application/json" -H"X-Auth-Token:$TOKEN"  -d@- $OS_AUTH_URL/projects <<EOF |  jq -r '.project.id'
{
    "project": {
        "description": "parent project",
        "domain_id": "default",
        "enabled": true,
        "name": "Parent"
    }
}
EOF
)

echo $PARENT_PROJECT


curl  -H "Content-type: application/json" -H"X-Auth-Token:$TOKEN"  -d@- $OS_AUTH_URL/projects <<EOF 
{
    "project": {
        "description": "demo-project",
        "parent_project_id": "$PARENT_PROJECT",
        "domain_id": "default",
        "enabled": true,
        "name": "child"
    }
}
EOF
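One weakness of heredocs is that values are pasted into the JSON text unescaped, so a password containing a double quote would break the request body. As a sketch of an alternative, the same request bodies can be built as Python dictionaries and serialized with the standard json module. The function names are invented for this example, the field names are copied from the script above, and sending the request is left out.

```python
import json


def auth_request(username, password, user_domain, project, project_domain):
    """Build the body for POST $OS_AUTH_URL/auth/tokens."""
    return {"auth": {
        "identity": {
            "methods": ["password"],
            "password": {"user": {
                "domain": {"name": user_domain},
                "name": username,
                "password": password,
            }},
        },
        "scope": {"project": {
            "domain": {"name": project_domain},
            "name": project,
        }},
    }}


def project_request(name, description, parent=None, domain_id="default"):
    """Build the body for POST $OS_AUTH_URL/projects."""
    project = {"description": description, "domain_id": domain_id,
               "enabled": True, "name": name}
    if parent is not None:
        # Key name copied from the script above.
        project["parent_project_id"] = parent
    return {"project": project}


body = json.dumps(auth_request("admin", 'pa"ss', "Default", "admin", "Default"))
child = json.dumps(project_request("child", "demo-project", parent="abc123"))
```

The password and the parent ID in the last two lines are placeholders; json.dumps takes care of the quoting.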


Note that this uses V3 of the API. I have the following keystonerc_admin:

export OS_USERNAME=admin
export OS_USER_DOMAIN_NAME=Default
export OS_PROJECT_DOMAIN_NAME=Default
export OS_PROJECT_NAME=admin
export OS_PASSWORD=cf8dcb8aae804722
export OS_AUTH_URL=http://192.168.1.80:5000/v3/

export OS_IDENTITY_API_VERSION=3

export OS_REGION_NAME=RegionOne
export PS1='[\u@\h \W(keystone_admin)]\$ '

Container Security: Just The Good Parts

Security is usually a matter of trade-offs. Questions like: “Is X Secure?”, don’t often have direct yes or no answers. A technology can mitigate certain classes of risk even as it exacerbates others.

Containers are just such a recent technology and their security impact is complex. Although some of the common risks of containers are beginning to be understood, many of their upsides are yet to be widely recognized. To emphasize the point, this post will highlight three advantages of containers that sysadmins and DevOps can use to make installations more secure.

Example Application

To give this discussion focus, we will consider an example application: a simple imageboard application. This application allows users to create and respond in threads of anonymous image and text content. Original posters can control their posts via “tripcodes” (which are basically per-post passwords). The application consists of the following “stack”:

  • nginx to serve static content, reverse proxy the active content, act as a cache-layer, and handle SSL
  • node.js to do the heavy lifting
  • mariadb to enable persistence

The Base Case

The base-case for comparison is the complete stack being hosted on a single machine (or virtual machine). It is true that this is a simple case, but this is not a straw man. A large portion of the web is served from just such unified instances.

The Containerized Setup

The stack naturally splits into three containers:

  • container X, hosting nginx
  • container J, hosting node.js
  • container M, hosting mariadb

Additionally, three /var locations are created on the host: (1) one for static content (a blog, theming, etc.), (2) one for the actual images, and (3) one for database persistence. The node.js container will have a mount for the image-store, the mariadb container will have a mount for the database, and the nginx container will have mounts for both the image-store and static content.

Advantage #1: Isolated Upgrades

Let’s look at an example patch Tuesday under both setups.

The Base Case

The sysadmin has prepared a second staging instance for testing the latest patches from her distribution. Among the updates is a critical one for SSL that prevents a key-leak from a specially crafted handshake. After applying all updates, she starts her automatic test suite. Everything goes well until the test for tripcodes. It turns out that the node.js code uses the SSL library to hash the tripcodes for storage and the fix either changed the signature or behavior of those methods. This puts the sysadmin in a tight spot. Does she try to disable tripcodes? Hold back the upgrade?

The Contained Case

Here the sysadmin has more work to do. Instead of updating and testing a single staging instance, she will update and test each individual container, promoting them to production on a container-by-container basis. The nginx and mariadb containers succeed and she replaces them in production. Her keys are safe. As with the base case, the tripcode tests don’t succeed. Unlike the base case, the sysadmin has the option of holding back just the node.js’s SSL library, and the nature of the flaw being key-exposure at handshake means that this is not an emergency requiring her to rush developers for a fix.

The Advantage

Of course, isolated upgrades aren’t unique to containers. node.js provides them itself, in the form of npm. So—depending on code specifics—the base case sysadmin might have been able to hold back the SSL library used for tripcodes. However, containers grant all application frameworks isolated upgrades, regardless of whether they provide them themselves. Further, they easily provide them to bigger portions of the stack.

Containers also simplify isolated upgrades. Technologies like rubygems or python virtualenvs create reliance on yet another curated collection of dependencies. It’s easy for sysadmins to be in a position where they need three or more such curated collections to update before their application is safe from a given vulnerability. Container-driven isolated upgrades let sysadmins lean on single collections, such as Linux distributions. These are much more likely to have, for example, paid support or guaranteed SLAs. They also unify the dependency management to the underlying distribution’s update mechanism.

Containers can also make existing isolation mechanisms easier to manage. While the above case might have been handled via node.js’s npm mechanism, containers would have allowed the developers to deal with that complexity, simply handing an updated container to the sysadmin.

Of course, isolated upgrades are not always an advantage. In large-use environments the resource savings from shared images/memory may make it worth the additional headaches to move all applications forward in lock-step.

Advantage #2: Containers Simplify Real Isolation

“Containers do not contain.” However, what containers do well is group related processes and create natural (if undefended) trust boundaries. This, it turns out, simplifies the task of providing real containment immensely. SELinux, cgroups, iptables, and kernel capabilities have a mostly undeserved reputation of being complicated. Complemented with containers, these technologies become much simpler to leverage.

The Base Case

A sysadmin trying to lock-down their installation in the traditional case faces a daunting task. First, they must identify what processes should be allowed to do what. Does node.js as used in this application use /tmp? What kernel capabilities does mariadb need? The need to answer these questions is one of the reasons technologies such as SELinux are considered complicated. They require a deep understanding of the behavior of not just application code, but the application runtime and the underlying OS itself. The tools available to trouble-shoot these issues are often limited (e.g. strace).

Even if the sysadmin is able to nail down exactly what processes in her stack need what capabilities (kernel or otherwise) the question of how to actually bind the application by those restrictions is still a complicated one. How will the processes be transitioned to the correct SELinux context? The correct cgroup?

The Contained Case

In contrast, a sysadmin trying to secure a container has four advantages:

  1. It is trivial (and usually automatic) to transition an entire container into a particular SELinux context and/or cgroup (Docker has --security-opt, OpenShift PID-based groups, etc.).
  2. Operating system behavior need not be locked down, only the container/host relationship.
  3. The container is—usually—placed on a virtual network and/or interface (often the container runtime environment even has supplemental lock-down capabilities).
  4. Containers naturally provide for experimentation. You can easily launch a container with a varying set of kernel capabilities.

Most frameworks for launching containers do so with sensible “base” SELinux types. For example, both Docker and systemd-nspawn (when using SELinux under RHEL or Fedora) launch all containers with variations of svirt types based on previous work with libvirt. Additionally, many container launchers also borrow libvirt’s philosophy of giving each launched container unique Multi-Category Security (MCS) labels that can optionally be set by the admin. Combined with read-only mounting and the fact that an admin only needs to worry about container/host interactions, this MCS functionality can go a long way towards restricting an application’s behavior.

For this application, it is straight-forward to:

  • Label the static, image, and database stores with unique MCS labels (e.g. c1, c2, and c3).
  • Launch the nginx container with labels and binding options (i.e. :ro) appropriate for reading only the image and static stores (-v /path:/path:ro and --security-opt=label:level:s0:c1,c2 for Docker).
  • Launch the node.js container binding the image store read/write and with a label giving it only access to that store.
  • Launch the mariadb container with only the data persistence store mounted read/write and with a label giving it access only to that store.
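To make the list above concrete, here is a rough Python sketch that assembles the Docker command line for one container. The host paths, image name, and function name are invented for illustration, and the --security-opt spelling is the one quoted above; other Docker releases may spell that option differently, so check the documentation for your version.

```python
def docker_run_args(image, mounts, categories):
    """Build an argument list for launching one labeled container.

    mounts is a list of (host_path, read_only) pairs and categories is
    a list of MCS categories such as ["c1", "c2"].
    """
    args = ["docker", "run", "-d"]
    for path, read_only in mounts:
        # Bind the host path at the same path inside the container.
        args += ["-v", "{0}:{0}{1}".format(path, ":ro" if read_only else "")]
    args.append("--security-opt=label:level:s0:" + ",".join(categories))
    args.append(image)
    return args


# Hypothetical nginx container: static and image stores, both read-only.
nginx = docker_run_args(
    "example/nginx",
    [("/var/imageboard/static", True), ("/var/imageboard/images", True)],
    ["c1", "c2"],
)
```

The list can be handed to subprocess.run as-is, which avoids shell quoting problems with paths.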

Should you need to go beyond what MCS can offer, most container frameworks support launching containers with specific SELinux types. Even when working with derived or original SELinux types, containers make everything easier as you need only worry about the interactions between the container and host.

With containers, there are many tools for restricting inter-container communication. Alternatively, for all container frameworks that give each container a unique IP, iptables can also be applied directly. With iptables, for example, it is easy to:

  • Block the nginx container from speaking anything but HTTP to the node.js container and HTTPS to the outside world.
  • Block the node.js container from doing anything but speaking HTTP to the nginx container and using the database port of the mariadb container.
  • Block mariadb from doing anything but receiving requests from the node.js container on its database port.
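As a sketch of what such a rule set might look like, the following Python snippet renders iptables commands from a short allow-list. Everything specific here is an assumption: the container addresses, the node.js port 8080, and the idea that the traffic passes through the FORWARD chain all depend on the container runtime and its network setup. Port 3306 is the usual MariaDB port.

```python
# Hypothetical container addresses; real ones come from the runtime.
ADDR = {"nginx": "172.17.0.2", "node": "172.17.0.3", "mariadb": "172.17.0.4"}

# (source, destination, TCP port) connections that should be permitted.
ALLOWED = [("nginx", "node", 8080), ("node", "mariadb", 3306)]


def forward_rules(addr, allowed):
    """Render rules: allow replies, allow listed flows, drop the rest."""
    rules = ["iptables -A FORWARD -m conntrack "
             "--ctstate ESTABLISHED,RELATED -j ACCEPT"]
    for src, dst, port in allowed:
        rules.append(
            "iptables -A FORWARD -s {} -d {} -p tcp --dport {} -j ACCEPT"
            .format(addr[src], addr[dst], port))
    # Any other new connection originating from a container is dropped.
    for ip in sorted(addr.values()):
        rules.append("iptables -A FORWARD -s {} -j DROP".format(ip))
    return rules


rules = forward_rules(ADDR, ALLOWED)
```

Keeping the allow-list as data makes it easy to review which container may talk to which before any rule is applied.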

For preventing DDoS or other resource-based attacks, we can use the container launcher’s built-in tools (e.g. Docker’s ulimit options) or cgroups directly. Either way it is easy, for example, to restrict the node.js and mariadb containers to some hard resource limit (40% of RAM, 20% of CPU and so on).

Finally, container frameworks combined with unit tests are a great way for finding a restricted set of kernel capabilities with which to run an application layer. Whether the framework encourages starting with a minimal set and building up (systemd-nspawn) or with a larger set and letting you selectively drop (Docker), it’s easy to keep launching containers until you find a restricted—but workable—collection.

The isolation gained per unit of configuration in the above work is extremely high compared to “manual” SELinux/cgroup/iptables isolation. There is also much less to “go wrong” as it is much easier to understand the container/host relationship and its needs than it is to understand the process/OS relationship. Among other upsides, the above configuration: prevents a compromised nginx from altering any data on the host (including the image-store and database), prevents a compromised mariadb from altering anything other than the database, and, depending on what exact kernel capabilities are absolutely required, may go a long way towards prevention of privilege escalation.

The Advantage

While containers do not allow for any forms of isolation not already possible, in practice they make configuring isolation much simpler. They limit isolation to container/host instead of process/OS. By binding containers to virtual networks or interfaces, they simplify firewall rules. Container implementations often provide sensible SELinux or other protection defaults that can be easily extended.

The trade-off is that containers expose an additional container/container attack-surface that is not as trivial to isolate.

Advantage #3: Containers Have More Limited and Explicit Dependencies

The Base Case

Containers are meant to eliminate “works for me” problems. A common cause of “works for me” problems in traditional installations is hidden dependencies. An example is a software component depending on a common command line utility without a developer knowing it. Besides creating instability over installation types, this is a security issue. A sysadmin cannot protect against a vulnerability in a component they do not know is being used.

The flip-side of unknown dependencies and of much greater concern is extraneous or cross-over components. Components needed by one portion of the stack can actually make other components not designed with them in mind extremely dangerous. Many privilege escalation flaws involve abusing access to suid programs that, while essential to some applications, are extraneous to others.

The Contained Case

Obviously, container isolation helps prevent component dependency cross-over, but containers also help to minimize extraneous dependencies. Containers are not virtual machines. Containers do not have to boot, they do not have to support interactive usage, they are usually single user, and can be simpler than a full operating system in any number of ways. Thus containers can eschew service launchers, shells, sensitive configuration files, and other cruft that (from an application perspective) serves only as an attack surface.

Truly minimal custom containers will more or less look like just the top few layers of their RPM/Deb/pkg “pyramid” without any of the bottom layers. Even “general” purpose containers are undergoing a healthy “race to the bottom” to have as minimal a starting footprint as possible. The Docker version of RHEL 7, an operating system not exactly famous for minimalism, is itself less than 155 megs uncompressed.

The Advantage

Container isolation means that when a portion of your application stack has a dependency, that dependency’s attack surface is available only to that portion of your application. This is in stark contrast to traditional installations where attack surfaces are always additive. Exploitation almost always involves chaining multiple vulnerabilities, so this advantage may be one of containers’ most powerful.

A common security complaint regarding containers is that in many ways they are comparable to statically linked binaries. The flip side is that this puts pressure on developers and maintainers to minimize the size of these blobs, which minimizes their attack surface. Shellshock is a good example of the kind of vulnerability this mitigates. It is nearly impossible for a traditional installation to not have a highly complex shell, but many containers ship without a shell of any kind.

Beyond containers themselves this pressure has resulted in the rise of the minimal host operating system (e.g. Atomic, CoreOS, RancherOS). This has brought a reduced attack surface (and in the case of Atomic a certain degree of immutability) to the host as well as the container.

Containers Is As Containers Do

Other security advantages of containers include working well in immutable and/or stateless paradigms, good content auditability (especially compared to virtual machines), and, potentially, good verifiability. A single blog post can’t cover all of the upsides of containers, much less the upsides and downsides. Ultimately, a large part of understanding the security impact of containers is coming to terms with the fact that containers are neither degenerate virtual machines nor superior jails. They are a unique technology whose impact needs to be assessed on its own.

April 23, 2015

Fedora Security Team’s 90-day Challenge

Earlier this month the Fedora Security Team started a 90-day challenge to close all critical and important CVEs in Fedora that came out in 2014 and before.  These bugs include packages affected in both Fedora and EPEL repositories.  Since we started the process we’ve made some good progress.

Of the thirty-eight Important CVE bugs, six have been closed, three are on QA, and the rest are open.  The one critical bug, rubygems-activesupport in EPEL, still remains but may be fixed as early as this week.

Want to help?  Please join us in helping make Fedora (and EPEL) a safer place and pitch in to help close these security bugs.


April 22, 2015

Regular expressions and recommended practices

Whenever a security person comes across a vulnerability report, one of the first steps is to ensure that the reported problem is actually a vulnerability. Usually, the issue falls into well known and studied categories and this step is done rather quickly. Occasionally, however, one can come across bugs where this initial triage is a bit more problematic. This blog post is about such an issue, which will ultimately lead us to the concept of “recommended practice”.

What happened?

On July 31st 2014, Maksymilian Arciemowicz of cxsecurity reported that “C++11 [is] insecure by default.”, with upstream GCC bugs 61601 and 61582. LLVM/Clang’s libc++ didn’t dodge the bullet either, more details are available in LLVM bug 20291.

Not everybody can be bothered to go through so many links, so here is a quick summary: C++11, a new C++ standard approved in 2011, introduced support for regular expressions. Regular expressions (regexes from here on) are an amazingly powerful processing tool – but one that can become extremely complex to handle correctly. Not only can the regex itself become hideous and hard to understand, but the way the regex engine deals with it can also lead to all sorts of problems. If certain complex regexes are passed to a regex engine, the engine can quickly outgrow the available CPU and memory constraints while trying to process the expression, possibly leading to a catastrophic event, which some call ReDoS, a “regular expression denial of service”.

This is exactly what Maksymilian Arciemowicz exploits: he passes specially crafted regexes to the regex engines provided by the C++11 implementations of GCC and Clang, causing them to use a huge amount of CPU resources or even crash (e.g. due to extreme recursion, which will exhaust all the available stack space, leading to a stack-overflow).

Is it a vulnerability?

CPU exhaustion and crashes are often good indicators for a vulnerability. Additionally, the C++11 standard even suggests error return codes for the exact problems triggered, but the implementations at hand fail to catch these situations. So, this must be a vulnerability, right? Well, this is the point where opinions differ. In order to understand why, it’s necessary to introduce a new concept:

The “recommended practice” concept

“Recommended practice” is essentially a mix of common sense and dos and don’ts. A huge problem is that such practices are informal, so there’s no ultimate guide on the subject, which leaves them open to personal experience and opinion. Nevertheless, the vast majority of the programming community should know about the dangers of regular expressions; dangers just like the issues Maksymilian Arciemowicz reported in GCC/Clang. That said, passing arbitrary, unfiltered regexes from an untrusted source to the regex engine should be considered a recommended practice case of “don’t do this; it’ll blow up in your face big time”.

To further clear this up: if an application uses a perfectly reasonable, well defined regex and the application crashes because the regex engine choked when processing certain specially crafted input, it’s (most likely) a vulnerability in the regex engine. However, if the application uses a regex thought to be well defined, efficient and trusted, but it turns out to e.g. take overly long to process certain specially crafted input, while other, more efficient regexes will do the job just fine, it’s (probably) a vulnerability in the application. But if untrusted regexes are passed to the regex engine without somehow filtering them for sanity first (which is incredibly hard to do for anything but the simplest of regexes, so better to avoid it), it is violating what a lot of people believe to be recommended practice, and thus it is often not considered to be a strict vulnerability in the regex engine.

So, next time you feel inclined to pass regexes verbatim to the engine, you’ll hopefully remember that it’s not a good idea and refrain from doing so. If you have done so in the past, you should probably go ahead and fix it.
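One common way to follow this advice, sketched here in Python rather than C++11, is to treat untrusted input as literal text instead of as a pattern. The helper name `find_literal` is hypothetical, and the sketch assumes the application only needs a plain substring search for whatever the user typed.

```python
import re

def find_literal(untrusted_text, haystack):
    """Search for user-supplied text as a literal string, never as a pattern."""
    # re.escape neutralises every regex metacharacter, so the engine only
    # ever performs a plain substring search, whatever the user typed.
    pattern = re.compile(re.escape(untrusted_text))
    return pattern.findall(haystack)

# A pattern with nested quantifiers, the classic shape behind ReDoS reports,
# is harmless once escaped: it can only match those exact characters.
hostile = "(a+)+$"
print(find_literal(hostile, "aaaa (a+)+$ aaaa"))
```

If the application genuinely needs user-defined patterns, escaping is not an option, and the filtering problem described above remains.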

April 16, 2015

Creating a new Network for a dual NIC VM

I need a second network for testing a packstack deployment. Here is what I did to create it, and then to boot a new VM connected to both networks.

Once again the tables are too big for the stylesheet I am using, but I don’t want to modify the output. The view source icon gives a more readable view.

The Common client supports creating networks.

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ openstack network create ayoung-private
+-------------+--------------------------------------+
| Field       | Value                                |
+-------------+--------------------------------------+
| id          | 9f2948fa-77dd-483d-8841-f9461ee50aee |
| name        | ayoung-private                       |
| project_id  | fefb11ea894f43c0ae5c9686d2f49a9d     |
| router_type | Internal                             |
| shared      | False                                |
| state       | UP                                   |
| status      | ACTIVE                               |
| subnets     |                                      |
+-------------+--------------------------------------+
[ayoung@ayoung530 rdo-federation-setup (openstack)]$ neutron  subnet create ayoung-private 192.168.52.0/24 --name ayoung-subnet1
Invalid command u'subnet create ayoung-private 192.168.52.0/24 --name'

But it does not seem to support any of the other neutron operations…at least not at first glance. We’ll see later if that is the case, but for now, use the neutron client, which seems to support the V3 Keystone API for auth. Create a subnet:

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ neutron  subnet-create ayoung-private 192.168.52.0/24 --name ayoung-subnet1
Created a new subnet:
+-------------------+----------------------------------------------------+
| Field             | Value                                              |
+-------------------+----------------------------------------------------+
| allocation_pools  | {"start": "192.168.52.2", "end": "192.168.52.254"} |
| cidr              | 192.168.52.0/24                                    |
| dns_nameservers   |                                                    |
| enable_dhcp       | True                                               |
| gateway_ip        | 192.168.52.1                                       |
| host_routes       |                                                    |
| id                | da738ad8-8469-4aa8-ab91-448bd3878ae6               |
| ip_version        | 4                                                  |
| ipv6_address_mode |                                                    |
| ipv6_ra_mode      |                                                    |
| name              | ayoung-subnet1                                     |
| network_id        | 9f2948fa-77dd-483d-8841-f9461ee50aee               |
| tenant_id         | fefb11ea894f43c0ae5c9686d2f49a9d                   |
+-------------------+----------------------------------------------------+
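The gateway and allocation pool in the table above are consistent with taking the first usable address of the CIDR as the gateway and handing out the rest. The following is only a sketch of that arithmetic using Python's standard `ipaddress` module, not neutron's own code; the function name `default_layout` is hypothetical.

```python
import ipaddress

def default_layout(cidr):
    """Return (gateway, pool_start, pool_end) for a subnet CIDR."""
    network = ipaddress.ip_network(cidr)
    # hosts() yields the usable addresses, excluding the network
    # address (.0) and the broadcast address (.255) for a /24.
    hosts = list(network.hosts())
    # Assumption: first usable address is the gateway, the rest form the pool.
    return str(hosts[0]), str(hosts[1]), str(hosts[-1])

gateway, start, end = default_layout("192.168.52.0/24")
print(gateway, start, end)
```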

Create a router for the subnet:

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ neutron router-create ayoung-private-router
Created a new router:
+-----------------------+--------------------------------------+
| Field                 | Value                                |
+-----------------------+--------------------------------------+
| admin_state_up        | True                                 |
| external_gateway_info |                                      |
| id                    | 51ad4cf6-10de-455f-8a8d-ab9dd3c0fd78 |
| name                  | ayoung-private-router                |
| routes                |                                      |
| status                | ACTIVE                               |
| tenant_id             | fefb11ea894f43c0ae5c9686d2f49a9d     |
+-----------------------+--------------------------------------+

Now I need to find the external network and set it as the router’s gateway:

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ neutron net-list
+--------------------------------------+------------------------------+-------------------------------------------------------+
| id                                   | name                         | subnets                                               |
+--------------------------------------+------------------------------+-------------------------------------------------------+
| 63258623-1fd5-497c-b62d-e0651e03bdca | idm-v4-default               | 3227f3ea-5230-411c-89eb-b1e51298b4f9 192.168.1.0/24   |
| 9f2948fa-77dd-483d-8841-f9461ee50aee | ayoung-private               | da738ad8-8469-4aa8-ab91-448bd3878ae6 192.168.52.0/24  |
| eb94d7e2-94be-45ee-bea0-22b9b362f04f | external                     | 3a72b7bc-623e-4887-9499-de8ba280cb2f                  |
+--------------------------------------+------------------------------+-------------------------------------------------------+
[ayoung@ayoung530 rdo-federation-setup (openstack)]$ neutron router-gateway-set 51ad4cf6-10de-455f-8a8d-ab9dd3c0fd78 eb94d7e2-94be-45ee-bea0-22b9b362f04f
Set gateway for router 51ad4cf6-10de-455f-8a8d-ab9dd3c0fd78

The router needs an interface on the subnet.

[ayoung@ayoung530 rdo-federation-setup (openstack)]$  neutron router-interface-add 51ad4cf6-10de-455f-8a8d-ab9dd3c0fd78 da738ad8-8469-4aa8-ab91-448bd3878ae6
Added interface 782fdf26-e7c1-4ca7-9ec9-393df62eb11e to router 51ad4cf6-10de-455f-8a8d-ab9dd3c0fd78.

Not sure if I need to create a port, but it is worth testing out:

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ neutron port-create ayoung-private --fixed-ip ip_address=192.168.52.20
Created a new port:
+-----------------------+--------------------------------------------------------------------------------------+
| Field                 | Value                                                                                |
+-----------------------+--------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                 |
| allowed_address_pairs |                                                                                      |
| binding:vnic_type     | normal                                                                               |
| device_id             |                                                                                      |
| device_owner          |                                                                                      |
| fixed_ips             | {"subnet_id": "da738ad8-8469-4aa8-ab91-448bd3878ae6", "ip_address": "192.168.52.20"} |
| id                    | 80f302db-6c27-42a0-a1a3-45fcfe0b23fe                                                 |
| mac_address           | fa:16:3e:bf:e3:7d                                                                    |
| name                  |                                                                                      |
| network_id            | 9f2948fa-77dd-483d-8841-f9461ee50aee                                                 |
| security_groups       | 6c13abed-81cd-4a50-82fb-4dc98b4f29fd                                                 |
| status                | DOWN                                                                                 |
| tenant_id             | fefb11ea894f43c0ae5c9686d2f49a9d                                                     |
+-----------------------+--------------------------------------------------------------------------------------+

Now to create the VM. I specify the --nic param twice.

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ openstack server create   --flavor m1.medium   --image "CentOS-7-x86_64" --key-name ayoung-pubkey  --security-group default  --nic net-id=63258623-1fd5-497c-b62d-e0651e03bdca  --nic net-id=9f2948fa-77dd-483d-8841-f9461ee50aee     test2nic.cloudlab.freeipa.org
+--------------------------------------+--------------------------------------------------------+
| Field                                | Value                                                  |
+--------------------------------------+--------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                 |
| OS-EXT-AZ:availability_zone          | nova                                                   |
| OS-EXT-STS:power_state               | 0                                                      |
| OS-EXT-STS:task_state                | scheduling                                             |
| OS-EXT-STS:vm_state                  | building                                               |
| OS-SRV-USG:launched_at               | None                                                   |
| OS-SRV-USG:terminated_at             | None                                                   |
| accessIPv4                           |                                                        |
| accessIPv6                           |                                                        |
| addresses                            |                                                        |
| adminPass                            | Exb7Qw3syfDg                                           |
| config_drive                         |                                                        |
| created                              | 2015-04-16T03:35:27Z                                   |
| flavor                               | m1.medium (3)                                          |
| hostId                               |                                                        |
| id                                   | fffef6e0-fcce-4313-af7a-81f9306ef196                   |
| image                                | CentOS-7-x86_64 (38534e64-5d7b-43fa-b59c-aed7a262720d) |
| key_name                             | ayoung-pubkey                                          |
| name                                 | test2nic.cloudlab.freeipa.org                          |
| os-extended-volumes:volumes_attached | []                                                     |
| progress                             | 0                                                      |
| project_id                           | fefb11ea894f43c0ae5c9686d2f49a9d                       |
| properties                           |                                                        |
| security_groups                      | [{u'name': u'default'}]                                |
| status                               | BUILD                                                  |
| updated                              | 2015-04-16T03:35:27Z                                   |
| user_id                              | 64951f595aa444b8a3e3f92091be364d                       |
+--------------------------------------+--------------------------------------------------------+
[ayoung@ayoung530 rdo-federation-setup (openstack)]$ openstack server list
+--------------------------------------+-------------------------------------+---------+-----------------------------------------------------------------------------+
| ID                                   | Name                                | Status  | Networks                                                                    |
+--------------------------------------+-------------------------------------+---------+-----------------------------------------------------------------------------+
| 820f8563-28ae-43fb-a0ff-d4635bd6dd38 | ecp.cloudlab.freeipa.org            | SHUTOFF | idm-v4-default=192.168.1.77, 10.16.19.28                                    |
+--------------------------------------+-------------------------------------+---------+-----------------------------------------------------------------------------+

Set a Floating IP and ssh in:

[ayoung@ayoung530 rdo-federation-setup (openstack)]$ openstack  ip floating list | grep None | sort -R | head -1
| a5abf332-68dc-46c5-a4f1-188b91f8dbf8 | external | 10.16.18.225 | None           | None                                 |
[ayoung@ayoung530 rdo-federation-setup (openstack)]$ openstack ip floating add  10.16.18.225 test2nic.cloudlab.freeipa.org

echo 10.16.18.225 test2nic.cloudlab.freeipa.org | sudo tee -a /etc/hosts
10.16.18.225 test2nic.cloudlab.freeipa.org
$ ssh centos@test2nic.cloudlab.freeipa.org
The authenticity of host 'test2nic.cloudlab.freeipa.org (10.16.18.225)' can't be established.
ECDSA key fingerprint is e3:dd:1b:d6:30:f1:f5:2f:14:d7:6f:98:d6:c9:08:0c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'test2nic.cloudlab.freeipa.org,10.16.18.225' (ECDSA) to the list of known hosts.
[centos@test2nic ~]$ ifconfig eth1
eth1: flags=4098<broadcast>  mtu 1500
        ether fa:16:3e:ab:14:2e  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

Using the openstack command line interface to create a new server

I have to create a new virtual machine. I want to use the V3 API when authenticating to Keystone, which means I need to use the common client, as the keystone client is deprecated and only supports the V2.0 Identity API.

To do anything with the client, we need to set some authorization data variables. Create keystonerc with the following and source it:

export OS_AUTH_URL=http://oslab.exmapletree.com:5000/v3
export OS_PROJECT_NAME=Syrup
export OS_PROJECT_DOMAIN_NAME=Default
export OS_USER_DOMAIN_NAME=Default
export OS_USERNAME=ayoung
export OS_PASSWORD=yeahright
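Forgetting to source the file is an easy mistake, so a script that wraps the client may want to check the environment first. This is a minimal sketch that only looks for the six variable names listed above; the function name `missing_auth_vars` is hypothetical and nothing here talks to Keystone.

```python
import os

# The six variables exported by the keystonerc file above.
REQUIRED = [
    "OS_AUTH_URL",
    "OS_PROJECT_NAME",
    "OS_PROJECT_DOMAIN_NAME",
    "OS_USER_DOMAIN_NAME",
    "OS_USERNAME",
    "OS_PASSWORD",
]

def missing_auth_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED if not environ.get(name)]

missing = missing_auth_vars()
if missing:
    print("Not set (did you source keystonerc?):", ", ".join(missing))
```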

The formatting of the command output is horribly rendered here, but if you click the little white sheet of paper icon that pops up when you float your mouse cursor over the black text, you get a readable table.

Sanity Check: list servers

[ayoung@ayoung530 oslab]$ openstack server list
+--------------------------------------+-------------------------------------+---------+-----------------------------------------------------------------------------+
| ID                                   | Name                                | Status  | Networks                                                                    |
+--------------------------------------+-------------------------------------+---------+-----------------------------------------------------------------------------+
| a5f70f90-7d97-4b79-b0f0-044d8d9b4c77 | centos7.cloudlab.freeipa.org        | ACTIVE  | idm-v4-default=192.168.1.72, 10.16.19.63                                    |
| 35b116e4-fdd2-4580-bb1a-18f1f6428dd5 | mysql.cloudlab.freeipa.org          | ACTIVE  | idm-v4-default=192.168.1.70, 10.16.19.92                                    |
| a5ca7644-d703-44d7-aa95-fd107e18aefd | horizon.cloudlab.freeipa.org        | ACTIVE  | idm-v4-default=192.168.1.67, 10.16.19.24                                    |
| f7aca565-4439-4a2f-9c31-911349ce8943 | ldapqa.cloudlab.freeipa.org         | ACTIVE  | idm-v4-default=192.168.1.66, 10.16.19.100                                   |
| 2b7b5cc1-83c4-45c3-8ca3-cd4ba4b589d3 | federate.cloudlab.freeipa.org       | ACTIVE  | idm-v4-default=192.168.1.61, 10.16.18.6                                     |
| a8649175-fd18-483c-acb7-2933226fd3a6 | horizon.kerb-demo.org               | ACTIVE  | kerb-demo.org=192.168.0.5, 10.16.19.183                                     |
| 38d24fb3-0dd3-4cf0-98d6-12ea22a1d718 | openstack.kerb-demo.org             | ACTIVE  | kerb-demo.org=192.168.0.3, 10.16.19.101                                     |
| ca9a8249-1f09-4b1a-b8d4-850019b7c4e5 | ipa.kerb-demo.org                   | ACTIVE  | kerb-demo.org=192.168.0.2, 10.16.18.218                                     |
| 29d00b3b-5961-424e-b95c-9d90b3ecf9e3 | ipsilon.cloudlab.freeipa.org        | ACTIVE  | idm-v4-default=192.168.1.60, 10.16.18.207                                   |
| 028df8d8-7ce9-4f61-b36f-a080dd7c4fb8 | ipa.cloudlab.freeipa.org            | ACTIVE  | idm-v4-default=192.168.1.59, 10.16.18.31                                    |
+--------------------------------------+-------------------------------------+---------+-----------------------------------------------------------------------------+

I made pretty significant use of the help output. To show the basic help string:

openstack --help

This gives you a list of the options. To see help on a specific command, such as the server create command we are going to work towards executing, run:

openstack help server create

A Server is a resource golem composed by stitching together resources from other services. To create this golem I am going to stitch together:

  1. A flavor
  2. An image
  3. A Security Group
  4. A Private Key
  5. A network

First, to find the flavor:

[ayoung@ayoung530 oslab]$ openstack flavor list
+----+---------------------+------+------+-----------+-------+-----------+
| ID | Name                |  RAM | Disk | Ephemeral | VCPUs | Is Public |
+----+---------------------+------+------+-----------+-------+-----------+
| 1  | m1.tiny             |  512 |    1 |         0 |     1 | True      |
| 2  | m1.small            | 2048 |   20 |         0 |     1 | True      |
| 3  | m1.medium           | 4096 |   40 |         0 |     2 | True      |
| 4  | m1.large            | 8192 |   80 |         0 |     4 | True      |
| 6  | m1.xsmall           | 1024 |   10 |         0 |     1 | True      |
| 7  | m1.small.4gb        | 4096 |   20 |         0 |     1 | True      |
| 8  | m1.small.8gb        | 8192 |   20 |         0 |     1 | True      |
| 9  | oslab.4cpu.20hd.8gb | 8192 |   20 |         0 |     4 | True      |
+----+---------------------+------+------+-----------+-------+-----------+

I think this one should taste like cherry. But, since we don’t have a cherry flavor, I guess I’ll pick m1.small.4gb as that has the 4 GB RAM I need.

To find an image:

[ayoung@ayoung530 oslab]$ openstack image list
+--------------------------------------+----------------------------------------------+
| ID                                   | Name                                         |
+--------------------------------------+----------------------------------------------+
| 415162df-4bec-474f-9f3b-0a79c2ed3848 | Fedora-Cloud-Base-22_Alpha-20150305          |
| b89dc25b-6f62-4001-b979-05ac14f60e9b | rhel-guest-image-7.1-20150224.0              |
| 38534e64-5d7b-43fa-b59c-aed7a262720d | CentOS-7-x86_64                              |
| bc3c35a2-cf96-4589-ad51-a8d499708128 | Fedora-Cloud-Base-20141203-21.x86_64         |
| 9ea16df1-f178-4589-b32b-0e2e32305c61 | FreeBSD 10.1                                 |
| 6ec77e6e-7ad4-4994-937d-91003fa2d6ac | rhel-6.6-latest                              |
| e61ec961-248b-4ee6-8dfa-5d5198690cab | ubuntu-12.04-precise-cloudimg                |
| 54ba6aa9-7d20-4606-baa6-f8e45a80510c | rhel-guest-image-6.6-20141222.0              |
| bee6e762-102f-467e-95a8-4a798cb5ec75 | heat-functional-tests-image                  |
| 812e129c-6bfd-41f5-afba-6817ac6a23e5 | RHEL 6.5 20140829                            |
| f2dfff20-c403-4e53-ae30-947677a223ce | Fedora 21 20141203                           |
| 473e6f30-a3f0-485b-a5e5-3c5a1f7909a5 | RHEL 6.6 20140926                            |
| b12fe824-c98a-4af5-88a6-b1e11a511724 | centos-7-cloud                               |
| 601e162f-87b4-4fc1-a0d3-1c352f3c2988 | fedora-21-atomic                             |
| 12616509-4c4f-47a5-96b1-317a99ef6bf8 | Fedora 21 Beta                               |
| 77dcb29b-3258-4955-8ca4-a5952c157a2b | RHEL6.6                                      |
| 8550a6db-517b-47ea-82f3-ec4fd48e8c09 | centos-7-x86_64                              |
+--------------------------------------+----------------------------------------------+

Although I really want a Fedora Cloud image…I guess I’ll pick fedora-21-atomic. Close enough for Government work.

[ayoung@ayoung530 oslab]$ openstack keypair list
+---------------+-------------------------------------------------+
| Name          | Fingerprint                                     |
+---------------+-------------------------------------------------+
| ayoung-pubkey | 37:81:08:b2:0e:39:78:0e:62:fb:0b:a5:f1:d7:41:fc |
+---------------+-------------------------------------------------+

That decision is tough.

[ayoung@ayoung530 oslab]$ openstack network list
+--------------------------------------+------------------------------+--------------------------------------+
| ID                                   | Name                         | Subnets                              |
+--------------------------------------+------------------------------+--------------------------------------+
| 3b799c78-ca9d-49d0-9838-b2599cc6b8d0 | kerb-demo.org                | c889bb6b-98cd-47b8-8ba0-5f2de4fe74ee |
| 63258623-1fd5-497c-b62d-e0651e03bdca | idm-v4-default               | 3227f3ea-5230-411c-89eb-b1e51298b4f9 |
| 650fc936-cc03-472d-bc32-d56f56116761 | tester1                      |                                      |
| de4300cc-8f71-46d7-bec5-c0a4ad54954d | BROKEN                       | 6c390add-108c-40d5-88af-cb5e784a9d31 |
| eb94d7e2-94be-45ee-bea0-22b9b362f04f | external                     | 3a72b7bc-623e-4887-9499-de8ba280cb2f |
+--------------------------------------+------------------------------+--------------------------------------+

Tempted to use BROKEN, but I shall refrain. I set up idm-v4-default so I know that is good.

[ayoung@ayoung530 oslab]$ openstack security group list
+--------------------------------------+---------+-------------+
| ID                                   | Name    | Description |
+--------------------------------------+---------+-------------+
| 6c13abed-81cd-4a50-82fb-4dc98b4f29fd | default | default     |
+--------------------------------------+---------+-------------+

Another tough call. OK, with all that, we have enough information to create the server:

One note: the --nic param is where you can distinguish which network to use. That param takes a series of key/value params connected by equal signs. I figured that out from the old nova command line parameters, and would have been hopelessly lost if I hadn’t stumbled across the old how-to guide.

openstack server create   --flavor m1.medium   --image "fedora-21-atomic" --key-name ayoung-pubkey    --security-group default  --nic net-id=63258623-1fd5-497c-b62d-e0651e03bdca ayoung-test
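When the same command has to be issued from a script, it can help to assemble the argument list programmatically so that each network becomes its own --nic option. The sketch below only builds and prints the command, using the flags and values shown above; the function name `server_create_argv` is hypothetical, and actually running the result is left to the caller.

```python
import shlex

def server_create_argv(name, flavor, image, key_name, security_group, net_ids):
    """Build the argument list for `openstack server create`."""
    argv = [
        "openstack", "server", "create",
        "--flavor", flavor,
        "--image", image,
        "--key-name", key_name,
        "--security-group", security_group,
    ]
    # One --nic option per network, each in key=value form.
    for net_id in net_ids:
        argv += ["--nic", "net-id=" + net_id]
    argv.append(name)
    return argv

argv = server_create_argv(
    "ayoung-test", "m1.medium", "fedora-21-atomic", "ayoung-pubkey",
    "default", ["63258623-1fd5-497c-b62d-e0651e03bdca"],
)
# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(argv))
```

Passing two network IDs yields two --nic options, which is the form needed for a VM attached to two networks.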

In order to ssh to the machine, we need to assign it a floating IP address. To find one that is unassigned:

[ayoung@ayoung530 oslab]$ openstack  ip floating list | grep None
| 943b57ea-4e52-4d05-b665-f808a5fbd887 | external | 10.16.18.61  | None           | None                                 |
| a1f5bb26-4e47-4fe7-875e-d967678364a0 | external | 10.16.18.223 | None           | None                                 |
| a419c144-dbfd-4a42-9f5e-880526683ea0 | external | 10.16.18.235 | None           | None                                 |
| a5abf332-68dc-46c5-a4f1-188b91f8dbf8 | external | 10.16.18.225 | None           | None                                 |
| b5c21c4a-3f12-4744-a426-8d073b3be3c8 | external | 10.16.18.70  | None           | None                                 |
| b67edf85-2e54-4ad1-a014-20b7370e38ba | external | 10.16.18.170 | None           | None                                 |
| c43eb490-1910-4adf-91b6-80375904e937 | external | 10.16.18.196 | None           | None                                 |
| c44a4a56-1534-4200-a227-90de85a218eb | external | 10.16.19.28  | None           | None                                 |
| e98774f9-fe6e-4608-a85d-92a5f39ef2c8 | external | 10.16.19.182 | None           | None                                 |
| f2705313-b03f-4537-a2d8-c01ff1baaee1 | external | 10.16.18.203 | None           | None                                 |

I’ll choose one at random:

[ayoung@ayoung530 oslab]$ openstack  ip floating list | grep None | sort -R | head -1
| a419c144-dbfd-4a42-9f5e-880526683ea0 | external | 10.16.18.235 | None           | None                                 |
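The grep, sort -R and head pipeline can also be sketched in Python for use inside a larger script. The column order (ID, pool, floating IP, then two assignment columns) is an assumption read off the rows shown above, since the table header is not part of the output; the function name `unassigned_ips` and the assigned row in the sample are hypothetical.

```python
import random

def unassigned_ips(table_text):
    """Return floating IPs whose last two columns are both 'None'."""
    ips = []
    for line in table_text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Border lines and headers do not have five cells ending in None, None.
        if len(cells) == 5 and cells[3] == "None" and cells[4] == "None":
            ips.append(cells[2])
    return ips

sample = """\
| 943b57ea-4e52-4d05-b665-f808a5fbd887 | external | 10.16.18.61  | None           | None                                 |
| 00000000-0000-0000-0000-000000000000 | external | 10.16.18.99  | 192.168.1.77   | 820f8563-28ae-43fb-a0ff-d4635bd6dd38 |
| a419c144-dbfd-4a42-9f5e-880526683ea0 | external | 10.16.18.235 | None           | None                                 |
"""

candidates = unassigned_ips(sample)
# random.choice plays the role of `sort -R | head -1`.
print(random.choice(candidates))
```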

And add it to the server:

openstack ip floating add  10.16.18.235  ayoung-test

Test it out:

[ayoung@ayoung530 oslab]$ ssh root@10.16.18.235
The authenticity of host '10.16.18.235 (10.16.18.235)' can't be established.
ECDSA key fingerprint is 2a:cd:5f:37:63:ef:7f:2d:9d:83:fd:85:76:4d:03:3c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.16.18.235' (ECDSA) to the list of known hosts.
Please login as the user "fedora" rather than the user "root".

And log in:

[ayoung@ayoung530 oslab]$ ssh fedora@10.16.18.235
[fedora@ayoung-test ~]$ 

April 15, 2015

JWCrypto a python module to do crypto using JSON

Lately I needed to use some crypto in a web-like scenario, a.k.a. over HTTP(S), so I set out to look at what could be used.

Pretty quickly it became clear that the JSON Web Encryption standard proposed in the IETF JOSE Working Group would be a good fit, and that the JSON Web Signature standard would be useful too.

Once I was convinced this was the standard to use, I tried to find a Python module that implemented it, as the project I am going to use this in (ultimately FreeIPA) is Python based.

The only implementation I found initially (since then I've found other projects scattered over the web) was this Jose project on GitHub.

After a quick look I was not satisfied by three things:

  • It is not a complete implementation of the specs
  • It uses obsolete python crypto-libraries wrappers
  • It is not Python3 compatible

While the first was not a big problem as I could simply contribute the missing parts, the second is, and the third is a big minus too. I wanted to use the new Python Cryptography library as it has proper interfaces and support for modern crypto, and neatly abstracts away the underlying crypto-library bindings.

So, after looking over the specs in detail to see how much work it would entail, I decided to build a Python module to implement all relevant specs myself.

The JWCrypto project is the result of a few weeks of work, complete with documentation hosted by ReadTheDocs.

It is an almost complete implementation of the JWK, JWE, JWS and JWT specs and implements most of the algorithms defined in the JWA spec. It has been reviewed internally by a member of the Red Hat Security Team and has an extensive test suite based on the specs and the test vectors included in the JOSE WG Cookbook. It is also both Python2.7 and Python3.3 compatible!
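For readers who have not met these specs, the following standard-library sketch shows roughly what a JWS in compact serialization looks like when signed with the HS256 algorithm from JWA. It deliberately does not use JWCrypto's API and is not a substitute for it; the function names and the demo key are hypothetical, and a real application should rely on a reviewed library.

```python
import base64
import hashlib
import hmac
import json

def b64url(data):
    """Base64url-encode bytes without padding, as JOSE requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text):
    """Decode unpadded base64url text back to bytes."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_hs256(payload, key):
    """Return header.payload.signature for a JSON-serialisable payload."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts)
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(mac)

def verify_hs256(token, key):
    """Check the signature and return the payload, or raise ValueError."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("not a compact JWS")
    header_b64, payload_b64, sig_b64 = pieces
    # Refuse any algorithm other than the one we expect.
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        key, (header_b64 + "." + payload_b64).encode("ascii"), hashlib.sha256
    ).digest()
    # compare_digest avoids leaking where the first mismatch occurs.
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_b64))

demo_key = b"hypothetical shared secret"
token = sign_hs256({"sub": "ayoung"}, demo_key)
print(verify_hs256(token, demo_key))
```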

I had a lot of fun implementing it, so if you find it useful feel free to drop me a note.

April 08, 2015

Don’t judge the risk by the logo

It’s been almost a year since the OpenSSL Heartbleed vulnerability, a flaw which started a trend of the branded vulnerability, changing the way security vulnerabilities affecting open-source software are being reported and perceived. Vulnerabilities are found and fixed all the time, and just because a vulnerability gets a name and a fancy logo doesn’t mean it is of real risk to users.

So let’s take a tour through the last year of vulnerabilities, chronologically, to see what issues got branded and which issues actually mattered for Red Hat customers.

“Heartbleed” (April 2014) CVE-2014-0160

Heartbleed was an issue that affected newer versions of OpenSSL. It was a very easy to exploit flaw, with public exploits released soon after the issue was public. The exploits could be run against vulnerable public web servers resulting in a loss of information from those servers. The type of information that could be recovered varied based on a number of factors, but in some cases could include sensitive information. This flaw was widely exploited against unpatched servers.

For Red Hat Enterprise Linux, only customers running version 6.5 were affected as prior versions shipped earlier versions of OpenSSL that did not contain the flaw.

Apache Struts 1 Class Loader RCE (April 2014) CVE-2014-0114

This flaw allowed attackers to manipulate exposed ClassLoader properties on a vulnerable server, leading to remote code execution. Exploits have been published but they rely on properties that are exposed on Tomcat 8, which is not included in any supported Red Hat products. However, some Red Hat products that ship Struts 1 did expose ClassLoader properties that could potentially be exploited.

Various Red Hat products were affected and updates were made available.

OpenSSL CCS Injection (June 2014) CVE-2014-0224

After Heartbleed, a number of other OpenSSL issues got attention. CCS Injection was a flaw that could allow an attacker to decrypt secure connections. This issue is hard to exploit as it requires a man in the middle attacker who can intercept and alter network traffic in real time, and as such we’re not aware of any active exploitation of this issue.

Most Red Hat Enterprise Linux versions were affected and updates were available.

glibc heap overflow (July 2014) CVE-2014-5119

A flaw was found inside the glibc library where an attacker who is able to make an application call a specific function with a carefully crafted argument could execute arbitrary code. An exploit for 32-bit systems was published (although this exploit would not work as published against Red Hat Enterprise Linux).

Some Red Hat Enterprise Linux versions were affected, in various ways, and updates were available.

JBoss Remoting RCE (July 2014) CVE-2014-3518

A flaw was found in JBoss Remoting where a remote attacker could execute arbitrary code on a vulnerable server. A public exploit is available for this flaw.

Red Hat JBoss products were only affected by this issue if JMX remoting is enabled, which is not the default. Updates were made available.

“Poodle” (October 2014) CVE-2014-3566

Continuing with the interest in OpenSSL vulnerabilities, Poodle was a vulnerability affecting the SSLv3 protocol. Like CCS Injection, this issue is hard to exploit as it requires a man in the middle attack. We’re not aware of active exploitation of this issue.

Most Red Hat Enterprise Linux versions were affected and updates were available.

“ShellShock” (September 2014) CVE-2014-6271

The GNU Bourne Again shell (Bash) is a shell and command language interpreter used as the default shell in Red Hat Enterprise Linux. Flaws were found in Bash that could allow remote code execution in certain situations. The initial patch to correct the issue was not sufficient to block all variants of the flaw, causing distributions to produce more than one update over the course of a few days.

Exploits were written to target particular services. Later, malware circulated to exploit unpatched systems.

Most Red Hat Enterprise Linux versions were affected and updates were available.

RPM flaws (December 2014) CVE-2013-6435, CVE-2014-8118

Two flaws were found in the package manager RPM. Either could allow an attacker to modify signed RPM files in such a way that they would execute code chosen by the attacker during package installation. We know CVE-2013-6435 is exploitable, but we’re not aware of any public exploits for either issue.

Various Red Hat Enterprise Linux releases were affected and updates were available.

“Turla” malware (December 2014)

Reports surfaced of a trojan package targeting Linux, suspected as being part of an “advanced persistent threat” campaign. Our analysis showed that the trojan was not sophisticated, was easy to detect, and was unlikely to be part of such a campaign.

The trojan does not use any vulnerability to infect a system; its introduction onto a system would be via some other mechanism. Therefore it does not have a CVE name and no updates are applicable for this issue.

“Grinch” (December 2014)

An issue was reported which gained media attention, but was actually not a security vulnerability. No updates were applicable for this issue.

“Ghost” (January 2015) CVE-2015-0235

A bug was found affecting certain function calls in the glibc library. A remote attacker that is able to make an application call to an affected function could execute arbitrary code. While a proof of concept exploit is available, not many applications were found to be vulnerable in a way that would allow remote exploitation.

Red Hat Enterprise Linux versions were affected and updates were available.

“Freak” (March 2015) CVE-2015-0204

It was found that OpenSSL clients accepted EXPORT-grade (insecure) keys even when the client had not initially asked for them. This could be exploited using a man-in-the-middle attack, which could downgrade to a weak key, factor it, then decrypt communication between the client and the server. Like Poodle and CCS Injection, this issue is hard to exploit as it requires a man in the middle attack. We’re not aware of active exploitation of this issue.

Red Hat Enterprise Linux versions were affected and updates were available.

Other issues of customer interest

We can also get a rough guide of which issues are getting the most attention by looking at the number of page views on the Red Hat CVE pages. While the top views were for the issues above, also of increased interest were:

  • A kernel flaw (May 2014) CVE-2014-0196, allowing local privilege escalation. A public exploit exists for this issue but does not work as published against Red Hat Enterprise Linux.
  • “BadIRET”, a kernel flaw (December 2014) CVE-2014-9322, allowing local privilege escalation. Details on how to exploit this issue have been discussed, but we’re not aware of any public exploits for this issue.
  • A flaw in BIND (December 2014), CVE-2014-8500. A remote attacker could cause a denial of service against a BIND server being used as a recursive resolver.  Details that could be used to craft an exploit are available but we’re not aware of any public exploits for this issue.
  • Flaws in NTP (December 2014), including CVE-2014-9295. Details that could be used to craft an exploit are available.  These serious issues had a reduced impact on Red Hat Enterprise Linux.
  • A flaw in Samba (February 2015) CVE-2015-0240, where a remote attacker could potentially execute arbitrary code as root. Samba servers are likely to be internal and not exposed to the internet, limiting the attack surface. No exploits that lead to code execution are known to exist, and some analyses have shown that creation of such a working exploit is unlikely.

Conclusion

We’ve shown in this post that for the last year of vulnerabilities affecting Red Hat products the issues that matter and the issues that got branded do have an overlap, but they certainly don’t closely match. Just because an issue gets given a name, logo, and press attention does not mean it’s of increased risk. We’ve also shown there were some vulnerabilities of increased risk that did not get branded.

At Red Hat, our dedicated Product Security team analyse threats and vulnerabilities against all our products every day, and provide relevant advice and updates through the customer portal. Customers can call on this expertise to ensure that they respond quickly to address the issues that matter, while avoiding being caught up in a media whirlwind for those that don’t.

April 05, 2015

On Load Balancers and Kerberos

I've recently witnessed a lot of discussions around using load balancers and FreeIPA on the users' mailing list, and I realized there is a lot of confusion around how to use load balancers when Kerberos is used for authentication.

One of the issues is that Kerberos depends on accurate naming as server names are used to build the Service Principal Name (SPN) used to request tickets from a KDC.

When people introduce a load balancer on a network they usually assign it a new name which is used to redirect all clients to a single box that redirects traffic to multiple hosts behind the balancer.

From a transport point of view this is just fine; the box just handles packets. But from the client point of view all servers now look alike (same name). Clients have, intentionally, no idea which server they are going to hit.

This is the crux of the problem. When a client wants to authenticate using Kerberos it needs to ask the KDC for a ticket for a specific SPN. The only name available in this case is that of the load balancer, so that name is used to request a ticket.

For example, if we have three HTTP servers in a domain: uno.ipa.dom, due.ipa.dom, tre.ipa.dom; and for some reason we want to load balance them using the name all.ipa.dom then all a client can do is to go to the KDC and ask for a ticket for the SPN named: HTTP/all.ipa.dom@IPA.DOM

Now, once the client actually connects to that IP address and gets redirected to one of the servers by the load balancer, say uno.ipa.dom, it will present this server a ticket that can be utilized only if the server has the key for the SPN named HTTP/all.ipa.dom@IPA.DOM
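The mismatch can be sketched in a few lines of Python. This is only an illustration of the naming problem, not how a Kerberos library builds principals; the helper name service_principal and the set backend_keys are made up for the example.

```python
def service_principal(service, hostname, realm):
    """Build the SPN string a client would request a ticket for."""
    return f"{service}/{hostname}@{realm}"


# The client only knows the load balancer name, so that is what it asks for.
requested = service_principal("HTTP", "all.ipa.dom", "IPA.DOM")

# Hypothetical backend that only holds a key for its own hostname.
backend_keys = {service_principal("HTTP", "uno.ipa.dom", "IPA.DOM")}

# The ticket is only usable if the backend holds a key for the requested SPN.
can_use_ticket = requested in backend_keys
print(requested, "usable by uno.ipa.dom:", can_use_ticket)
```

Each of the options that follow is a different way of making that final membership check succeed.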

There are a few ways to satisfy this condition depending on what a KDC supports and what is the use case.

Use only one common Service Principal Name

One of the solutions is to create a new Service Principal in the KDC for the name HTTP/all.ipa.dom@IPA.DOM, then generate a keytab and distribute it to all servers. The servers will use no other key, and they will identify themselves with the common name. If a client tries to contact them using their individual names, authentication will fail, as the KDC will not have a principal for those names and the services themselves are configured to use only the common name, not their hostnames.

Use one key and multiple SPNs

A slightly friendlier way is to assign aliases to a single principal name, so that clients can contact the servers both with the common name and directly using the server's individual names. This is possible if the KDC can create aliases to the canonical principal name. The SPNs HTTP/uno.ipa.dom, HTTP/due.ipa.dom, HTTP/tre.ipa.dom are created as aliases of HTTP/all.ipa.dom, so when a client asks for a ticket for any of these names the same key is used to generate it.

Use multiple keys, one per name

Another way again is to assign servers multiple keys. For example the server named uno.ipa.dom will be given a keytab with keys for both HTTP/uno.ipa.dom@IPA.DOM and HTTP/all.ipa.dom@IPA.DOM, so that regardless of how the client tries to access it, the KDC will return a ticket using a key the service has access to.

It is important to note that the acceptor, in this case, must not be configured to use a specific SPN or acquire specific credentials before trying to accept a connection if using GSSAPI, otherwise the wrong key may be selected from the keytab and context establishment may fail. If no name is specified then GSSAPI can try all keys in the keytab until one succeeds in decrypting the ticket.
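A toy model may make the difference between a pinned and an unpinned acceptor easier to see. Everything here is an assumption made for illustration: an HMAC tag stands in for the encrypted ticket, a dictionary stands in for the keytab, and the function names are invented. Real GSSAPI acceptors work on encrypted tickets and do not expose this interface.

```python
import hashlib
import hmac


def make_ticket(spn, key):
    """Stand-in for a KDC-issued ticket: a tag only the key holder can verify."""
    return hmac.new(key, spn.encode(), hashlib.sha256).digest()


def accept(ticket, keytab, acceptor_name=None):
    """Return the SPN whose key verifies the ticket, or None.

    With acceptor_name set, only that keytab entry is tried.
    With acceptor_name left as None, every key in the keytab is tried.
    """
    candidates = [acceptor_name] if acceptor_name else list(keytab)
    for spn in candidates:
        key = keytab.get(spn)
        if key and hmac.compare_digest(make_ticket(spn, key), ticket):
            return spn
    return None


# uno.ipa.dom holds keys for both its own name and the common name.
keytab_uno = {
    "HTTP/uno.ipa.dom@IPA.DOM": b"key-uno",
    "HTTP/all.ipa.dom@IPA.DOM": b"key-all",
}

# The client came in through the load balancer, so the ticket is for the common name.
ticket = make_ticket("HTTP/all.ipa.dom@IPA.DOM", b"key-all")

pinned = accept(ticket, keytab_uno, acceptor_name="HTTP/uno.ipa.dom@IPA.DOM")
unpinned = accept(ticket, keytab_uno)
print("pinned acceptor:", pinned)
print("unpinned acceptor:", unpinned)
```

In this model the pinned acceptor rejects a perfectly valid ticket because it only looks at one key, which is the failure mode described above.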

Proxying authentication

One last option is to actually terminate the connection on a single server which then proxies out to the backend servers. In this case only the proxy has a keytab and the backend servers trust the proxy to set appropriate headers to identify the authenticated client principal, or set a shared session cookie that all servers have access to. In this case clients are forbidden from getting access to the backend server directly by firewalling or similar network level segregation.

Choosing a solution

Choosing which option is right depends on many factors. For example, if (some) clients need to be able to authenticate directly to the backend servers using their individual names, then using only one name, as in the first and fourth options, is clearly not possible. Whether aliases can be used depends on whether the KDC in use supports them.

More complex cases, the FreeIPA Web UI

The FreeIPA Web UI adds more complexity to the aforementioned cases. The Web UI is just a frontend to the underlying LDAP database and relies on constrained delegation to access the LDAP server, so that access control is applied by the LDAP server using the correct user credentials.

The way constrained delegation is implemented requires the server to obtain a TGT using the server keytab. What this means is that only one Service Principal Name can be used in the FreeIPA HTTP server and that name is determined before the client connects. This factor makes it particularly difficult for FreeIPA servers to be load balanced. For the HTTP server, the FreeIPA master could theoretically be manually reconfigured to use a single common name and share a keytab. This would allow clients to connect to any FreeIPA server and perform constrained delegation using the common name. However, admins wouldn't be able to connect to a specific server and change local settings. Moreover, internal operations and updates may or may not work going forward.

In short, I wouldn't recommend it until the FreeIPA project provides a way to officially access the Web UI using aliases.

A poor man's solution, if you want to offer a single name for ease of access and some sort of load balancing, could be to stand up a server at the common name with a CGI script that redirects clients randomly to one of the IPA servers.
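Such a redirecting CGI script could look roughly like the following sketch. The server names are reused from the earlier example and are placeholders, and the /ipa/ui/ path is my assumption about where the Web UI lives; adjust both for a real deployment.

```python
#!/usr/bin/env python3
import random

# Placeholder backend names; replace with the real IPA servers.
IPA_SERVERS = ["uno.ipa.dom", "due.ipa.dom", "tre.ipa.dom"]


def redirect_response(servers, path="/ipa/ui/", chooser=random.choice):
    """Return a CGI response that redirects the browser to one backend."""
    target = chooser(servers)
    return (
        "Status: 302 Found\r\n"
        f"Location: https://{target}{path}\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
        f"Redirecting to {target}\n"
    )


if __name__ == "__main__":
    # A CGI program writes its headers and body to standard output.
    print(redirect_response(IPA_SERVERS), end="")
```

Because the browser is sent to the server's individual name, Kerberos authentication then happens against that server's own SPN and no shared key is needed.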

Horizon WebSSO via SSSD

I’ve shown how to set up OpenStack Keystone Federation with SSSD. We know we can set up Horizon with Federation using SAML. Here is how to set up Web Single Sign On (WebSSO) for SSSD and Kerberos.

This is a long one, but I’m trying to include all the steps in one document. Much is a repeat of previous blog posts. However, some details have changed, and I want the explanation here to be consistent.

I’m starting with a RHEL 7.1 VM. I tend to use internal Yum repos for packages, to avoid going across the network for updates, but the general steps should work regardless of update mechanism. This is, once again, using devstack, as the bits are very fresh for the WebSSO code, and I need to work off master for several projects. We’ll work on an automated version for RDO once the packages are up to date.

I have an IPA server up and running. I want to make this VM the IPA client. Since I’m done with Nova managing my VM config values, I’ll first disable cloud-init. There are many ways to do this, but that is outside the scope of this article.

 sudo yum -y update 
 sudo yum -y groupinstall "Development Tools"
 sudo yum -y install ipa-client sssd-dbus sudo mod_lookup_identity mod_auth_kerb
 sudo yum -y erase cloud-init

Now to use network manager CLI to set some basics. To list the connections:

$ nmcli c
NAME         UUID                                  TYPE            DEVICE 
System eth0  5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03  802-3-ethernet  eth0

I want this machine to keep its host name, and to keep the DNS server I set:

$ sudo nmcli c edit 5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03

===| nmcli interactive connection editor |===

Editing existing '802-3-ethernet' connection: '5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03'

Type 'help' or '?' for available commands.
Type 'describe [<setting>.<prop>]' for detailed property description.

You may edit the following settings: connection, 802-3-ethernet (ethernet), 802-1x, ipv4, ipv6, dcb
nmcli> goto ipv4
You may edit the following properties: method, dns, dns-search, addresses, gateway, routes, route-metric, ignore-auto-routes, ignore-auto-dns, dhcp-hostname, dhcp-send-hostname, never-default, may-fail, dhcp-client-id
nmcli ipv4> set ignore-auto-dns yes
nmcli ipv4> set dns 192.168.1.59
nmcli ipv4> set dhcp-hostname horizon.cloudlab.freeipa.org
nmcli ipv4> save
Connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e03) successfully updated.
nmcli ipv4> quit

Set the hostname via

sudo vi  /etc/hostname

And let’s see what happens when I reboot.

$ sudo reboot
Connection to horizon.cloudlab.freeipa.org closed by remote host.
Connection to horizon.cloudlab.freeipa.org closed.
[ayoung@ayoung530 python-keystoneclient (review/ayoung/access_info_split)]$ ssh cloud-user@horizon.cloudlab.freeipa.org
Last login: Wed Apr  1 16:06:27 2015 from 10.10.55.194
[cloud-user@horizon ~]$ hostname
horizon.cloudlab.freeipa.org

Let’s see if we can talk to the IPA server:

$ nslookup ipa.cloudlab.freeipa.org
Server:		192.168.1.59
Address:	192.168.1.59#53

Name:	ipa.cloudlab.freeipa.org
Address: 192.168.1.59
$ sudo ipa-client-install
WARNING: ntpd time&date synchronization service will not be configured as
conflicting service (chronyd) is enabled
Use --force-ntpd option to disable it and force configuration of ntpd

Discovery was successful!
Hostname: horizon.cloudlab.freeipa.org
Realm: CLOUDLAB.FREEIPA.ORG
DNS Domain: cloudlab.freeipa.org
IPA Server: ipa.cloudlab.freeipa.org
BaseDN: dc=cloudlab,dc=freeipa,dc=org

Continue to configure the system with these values? [no]: yes

Much elided, suffice to say it succeeded. On to devstack

sudo mkdir /opt/stack
sudo chown cloud-user /opt/stack/
cd /opt/stack/
git clone https://git.openstack.org/openstack-dev/devstack
cd devstack

Now, one workaround: edit the file files/rpms/general and comment out the libyaml-devel package. The functionality is provided by a different package, and that package does not exist in RHEL7.

...
which
bc
#libyaml-devel
gettext  # used for compiling message catalogs
net-tools
java-1.7.0-openjdk-headless  # NOPRIME rhel7,f20
java-1.8.0-openjdk-headless  # NOPRIME f21,f22

Here is my local.conf file:

[[local|localrc]]
ADMIN_PASSWORD=FreeIPA4All
DATABASE_PASSWORD=$ADMIN_PASSWORD
RABBIT_PASSWORD=$ADMIN_PASSWORD
SERVICE_PASSWORD=$ADMIN_PASSWORD
SERVICE_TOKEN=$ADMIN_PASSWORD
LIBS_FROM_GIT=python-openstackclient,python-keystoneclient

I tried with Django-openstack-auth as well, but it seems devstack does not know to fetch that. I ended up cloning that repo by hand.

Run devstack:

./stack.sh

And wait.

Ok, it is done!

Then edit /etc/sssd/sssd.conf and set sssd to start the info pipe services

[sssd]
services = nss, sudo, pam, ssh, ifp

And, in the same file, let infopipe know it can respond with a subset of the LDAP values.

[ifp]
allowed_uids = apache, root, cloud-user
user_attributes = +givenname, +sn, +uid


Ah, forgot to add in a cloud-user user to IPA, as that is what devstack is set to use. I need the ipa client.

$ sudo yum install ipa-admintools
$ kinit ayoung
$ ipa user-add cloud-user --uid=1000
First name: Cloud
Last name: User

And now:

 sudo service sssd restart

Test infopipe:

sudo dbus-send --print-reply --system --dest=org.freedesktop.sssd.infopipe /org/freedesktop/sssd/infopipe org.freedesktop.sssd.infopipe.GetUserGroups string:ayoung

Returns

method return sender=:1.17 -> dest=:1.29 reply_serial=2
   array [
      string "admins"
      string "ipausers"
      string "wheel"
   ]
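If you want to check the infopipe reply from a script rather than by eye, the text printed by dbus-send can be picked apart with a small regular expression. This is just a convenience sketch that parses output like the reply above; the function name is made up, and it assumes the reply contains a single flat array of strings.

```python
import re


def parse_string_array(reply_text):
    """Extract the string items from `dbus-send --print-reply` output."""
    pattern = r'^[ \t]*string "([^"]*)"[ \t]*$'
    return re.findall(pattern, reply_text, flags=re.MULTILINE)


# Sample reply copied from the output above.
sample_reply = '''method return sender=:1.17 -> dest=:1.29 reply_serial=2
   array [
      string "admins"
      string "ipausers"
      string "wheel"
   ]
'''

groups = parse_string_array(sample_reply)
print(groups)
```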

Now to configure HTTPD. I’m not going to bother with HTTPS for this setup, as it is only proof of concept, and there is a good bit of Horizon to reset if you do HTTPS.

$ sudo yum install mod_lookup_identity mod_auth_kerb

OK…now we start treading new ground. Instead of a whole new Kerberized setup for Keystone, I’m only going to Kerberize the segment protected by Federation. That is

Create a file with the V3 and admin env vars set:

$ cat openrc.v3 
. ./openrc
export OS_AUTH_URL=http://192.168.1.67:5000/v3
export OS_IDENTITY_API_VERSION=3
export OS_USERNAME=admin

Source that and test:

$ openstack project list
+----------------------------------+--------------------+
| ID                               | Name               |
+----------------------------------+--------------------+
| 07a369b34c6f41948143f6ff75dc81a6 | alt_demo           |
| 0edb00180b3d4676baf5c39325e0639d | demo               |
| 18c3137a9a4e4266adb3b143c0d62ac3 | service            |
| 64378db96ab845dd8346ce0bcff9709d | admin              |
| 9a08ef972d7f4ef9a52190085b6b25d0 | invisible_to_admin |
+----------------------------------+--------------------+

Create the groups and mappings for Federation.

Here are the contents of mapping.json

[
     {
         "local": [
             {
                 "user": {
                     "name": "{0}",
                     "id": "{0}",
                      "domain": {"name": "Default"}
                 }
             }
         ],
         "remote": [
             {
                 "type": "REMOTE_USER"
             }
         ]
     },

     {
         "local": [
             {
                 "groups": "{0}",
                 "domain": {
                     "name": "Default"
                 }
             }
         ],
         "remote": [
             {
                 "type": "REMOTE_USER_GROUPS",
                 "blacklist": []
             }
         ]
     }

 ]

  openstack group create admins
  openstack group create ipausers
  openstack role add  --project demo --group ipausers member
  openstack identity provider create sssd
  openstack mapping create  --rules /home/cloud-user/mapping.json  kerberos_mapping
  openstack federation protocol create --identity-provider sssd --mapping kerberos_mapping kerberos
  openstack identity provider set --remote-id SSSD sssd
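To make the intent of the two mapping rules concrete, here is a rough Python model of what they ask for: the user name comes from REMOTE_USER, and the group list comes from REMOTE_USER_GROUPS. This is not Keystone's mapping engine, only an illustration. The function name is invented, and splitting on ";" is an assumption that matches the separator the Apache configuration in this post passes to LookupUserGroups.

```python
def apply_mapping(environ):
    """Toy evaluation of the two mapping rules against a request environment."""
    user = environ.get("REMOTE_USER")
    if not user:
        raise ValueError("REMOTE_USER is not set, so the first rule cannot match")
    raw_groups = environ.get("REMOTE_USER_GROUPS", "")
    # Drop empty entries so a missing variable yields an empty group list.
    groups = [name for name in raw_groups.split(";") if name]
    return {"user": user, "groups": groups}


mapped = apply_mapping({
    "REMOTE_USER": "ayoung",
    "REMOTE_USER_GROUPS": "admins;ipausers;wheel",
})
print(mapped)
```

Only group names that also exist in Keystone (admins and ipausers, created above) can carry role assignments.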

Get the Keytab

$ ipa service-add HTTP/horizon.cloudlab.freeipa.org
----------------------------------------------------------------------
Added service "HTTP/horizon.cloudlab.freeipa.org@CLOUDLAB.FREEIPA.ORG"
----------------------------------------------------------------------
  Principal: HTTP/horizon.cloudlab.freeipa.org@CLOUDLAB.FREEIPA.ORG
  Managed by: horizon.cloudlab.freeipa.org
$ ipa-getkeytab -s ipa.cloudlab.freeipa.org -k /tmp/openstack.keytab -p HTTP/horizon.cloudlab.freeipa.org
Keytab successfully retrieved and stored in: /tmp/openstack.keytab
$ sudo mv /tmp/openstack.keytab /etc/httpd/conf
$ sudo chown apache /etc/httpd/conf/openstack.keytab
$ sudo chmod 600 /etc/httpd/conf/openstack.keytab

Enable Kerberos for Keystone. In /etc/httpd/conf.d/keystone.conf

Listen 5000
Listen 35357

#enable modules for Kerberos and getting id from sssd
LoadModule lookup_identity_module modules/mod_lookup_identity.so
LoadModule auth_kerb_module modules/mod_auth_kerb.so

<VirtualHost *:5000>
    WSGIDaemonProcess keystone-public processes=5 threads=1 user=cloud-user display-name=%{GROUP} 
    WSGIProcessGroup keystone-public
    WSGIScriptAlias / /var/www/keystone/main
    WSGIApplicationGroup %{GLOBAL}
    WSGIPassAuthorization On
    <IfVersion >= 2.4>
      ErrorLogFormat "%{cu}t %M"
    </IfVersion>
    ErrorLog /var/log/httpd/keystone.log
    CustomLog /var/log/httpd/keystone_access.log combined


    #This tells WebSSO what IdP to use
    SetEnv IDP_ID SSSD
    #Protect the urls that have kerberos in their path with Kerberos.

    <location ~ "kerberos" >
    AuthType Kerberos
    AuthName "Kerberos Login"
    KrbMethodNegotiate on
    KrbMethodK5Passwd off
    KrbServiceName HTTP
    KrbAuthRealms CLOUDLAB.FREEIPA.ORG
    Krb5KeyTab /etc/httpd/conf/openstack.keytab
    KrbSaveCredentials on
    KrbLocalUserMapping on
    Require valid-user
    #  SSLRequireSSL
     LookupUserAttr mail REMOTE_USER_EMAIL " "
    LookupUserGroups REMOTE_USER_GROUPS ";"
   </location>

Make sure Keystone can handle Kerberos. In /etc/keystone/keystone.conf

[auth]
methods = external,password,token,oauth1,kerberos
kerberos = keystone.auth.plugins.mapped.Mapped

and under federation

[federation]
# I tried this first, and it worked.  It will have issues when sharing WebSSO with other mechanisms
remote_id_attribute = IDP_ID
trusted_dashboard = http://horizon.cloudlab.freeipa.org/auth/websso/
sso_callback_template = /etc/keystone/sso_callback_template.html

#this is supposed to work but does not yet.
[kerberos]
remote_id_attribute=IDP_ID

Once this is set, copy the template file for the WebSSO post response from the Keystone repo to /etc/keystone

 cp /opt/stack/keystone/etc/sso_callback_template.html /etc/keystone/
curl   --negotiate -u:   $HOSTNAME:5000/v3/OS-FEDERATION/identity_providers/sssd/protocols/kerberos/auth

Returns

{"token": {"methods": ["kerberos"], "expires_at": "2015-04-02T21:17:51.054150Z", "extras": {}, "user": {"OS-FEDERATION": {"identity_provider": {"id": "sssd"}, "protocol": {"id": "kerberos"}, "groups": [{"id": "482eb4e6a0c64348845773b506d1db77"}, {"id": "6da803796a4540d48a0aff3b3185edad"}, {"id": "f0bf681ae2e84d1580a7ff54ea49bf27"}]}, "domain": {"id": "Federated", "name": "Federated"}, "id": "ayoung", "name": "ayoung"}, "audit_ids": ["J-wAsamnQ5-NHjXRYHSAbA"], "issued_at": "2015-04-02T20:17:51.054182Z"}}
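The interesting parts of that response are the auth method, the identity provider and protocol, and the group IDs the mapping produced. A short sketch for pulling them out, using the response body shown above as input (the function name is just for this example):

```python
import json

# Response body copied from the curl call above.
response_body = (
    '{"token": {"methods": ["kerberos"], "expires_at": "2015-04-02T21:17:51.054150Z", "extras": {}, '
    '"user": {"OS-FEDERATION": {"identity_provider": {"id": "sssd"}, "protocol": {"id": "kerberos"}, '
    '"groups": [{"id": "482eb4e6a0c64348845773b506d1db77"}, {"id": "6da803796a4540d48a0aff3b3185edad"}, '
    '{"id": "f0bf681ae2e84d1580a7ff54ea49bf27"}]}, "domain": {"id": "Federated", "name": "Federated"}, '
    '"id": "ayoung", "name": "ayoung"}, "audit_ids": ["J-wAsamnQ5-NHjXRYHSAbA"], '
    '"issued_at": "2015-04-02T20:17:51.054182Z"}}'
)


def summarize_token(body):
    """Pick the federation-related fields out of an unscoped token response."""
    token = json.loads(body)["token"]
    federation = token["user"]["OS-FEDERATION"]
    return {
        "user": token["user"]["name"],
        "methods": token["methods"],
        "identity_provider": federation["identity_provider"]["id"],
        "protocol": federation["protocol"]["id"],
        "group_ids": [group["id"] for group in federation["groups"]],
    }


summary = summarize_token(response_body)
print(summary)
```

Three group IDs come back because the user is in three groups that the mapping passed through.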

And to test WebSSO

curl   --negotiate -u:   $HOSTNAME:5000/v3/auth/OS-FEDERATION/websso/kerberos?origin=http://horizon.cloudlab.freeipa.org/auth/websso/

Returns

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    
  </head>
  <body>
     
Please wait...
<noscript> </noscript>
<script type="text/javascript"> window.onload = function() { document.forms['sso'].submit(); } </script> </body> </html>

On to Horizon:

Since this is devstack, the config changes go in:
/opt/stack/horizon/openstack_dashboard/local/local_settings.py

I put all my custom settings at the bottom:

WEBSSO_ENABLED = True


WEBSSO_CHOICES = (
        ("credentials", _("Keystone Credentials")),
        ("kerberos", _("Kerberos")),
      )

WEBSSO_INITIAL_CHOICE="kerberos"


COMPRESS_OFFLINE=True

OPENSTACK_KEYSTONE_DEFAULT_ROLE="Member"

OPENSTACK_HOST="horizon.cloudlab.freeipa.org"

OPENSTACK_API_VERSIONS = {
    "identity": 3
}

OPENSTACK_KEYSTONE_URL="http://horizon.cloudlab.freeipa.org:5000/v3"

Around here is where I cloned the django-openstack-auth repo and made it available to the system using:

 cd /opt/stack/
 git clone https://git.openstack.org/openstack/django_openstack_auth
 cd django_openstack_auth
 sudo python setup.py develop
 sudo systemctl restart httpd.service

When you hit Horizon from a browser it should look like this:

Horizon Login Screen set up for Federation Showing Kerberos by Default


There was a lot of trial and error in making this work, and the cause of the error is not always clear. Some of the things that tripped me up both the first time and when trying to replicate:

  • There was one outstanding bug that needed to be fixed. I patched this inline. The fix has already merged.
  • Getting the HOSTNAME right on the host. Removing cloud-init worked for me, although I’ve been assured there are better ways to do that.
  • The Horizon config needs to use the hostnames, not the IP addresses, for Kerberos to work.
  • If you do the sssd setup before installing apache HTTPD, and the apache user does not exist, the sssd daemon won’t restart. However, if you forget to add the apache user to the sssd.conf the webserver won’t be able to read from dbus, and thus the REMOTE_USER_GROUPS env var won’t be passed to HTTPD. The error message is
    IndexError: tuple index out of range
  • The keytab needs to be owned by the apache user.
  • Horizon needs to both use the V3 API explicitly and use the AUTH_URL that ends in /v3. It might be possible to drop the /v3 and depend on discovery, but leaving the v2.0 on there will certainly break auth.
  • As I had buried in the devstack instructions: edit the file files/rpms/general and comment out the libyaml-devel package. The functionality is provided by a different package, and that package does not exist in RHEL7.

April 01, 2015

JOSE – JSON Object Signing and Encryption

Federated Identity Management has become very widespread in past years – in addition to enterprise deployments a lot of popular web services allow users to carry their identity over multiple sites. Social networking sites especially are in a good position to drive the federated identity management, as they have both critical mass of users and the incentive to become an identity provider. As the users move away from a single device to using multiple portable devices, there is a constant pressure to make the federated identity protocols simpler (with respect to complexity), more user friendly (especially for developers) and easier to implement (on wide range of devices and platforms).

Unfortunately older technologies are deeply rooted in enterprise environments and are unsuitable for Internet (Kerberos), or are based on more or less complicated data serialization (e.g. OpenID 2.0 or SAML). Canonicalization, whitespace handling and representation of binary data are among the challenges that various serialization formats face.

OpenID Connect

Another approach to the problem of serialization of structured data in context of identity management is to use already widespread and simple JSON in combination with base64url encoding of data. The advantage in context of federated identity management is obvious – while being sufficiently universal, they have almost native support in web clients. Getting this level of simplicity and interoperability is very compelling, despite these having some shortcomings in e.g. bandwidth-efficiency.

This approach is taken by an upcoming standard, OpenID Connect, the third revision of the OpenID protocol. The protocol describes a method of providing identity-based claims from an identity provider to a relying party, with the end user being authenticated by the identity provider and authorizing the request. The communication between client, relying party and identity provider follows the OAuth 2.0 protocol, unlike the previous version of the protocol, OpenID 2.0, which did not build on OAuth 2.0. Claims have the format of a JSON key-value hash, and the task of protecting their integrity, and possibly confidentiality, is addressed by the JSON Object Signing and Encryption (JOSE) standard.

JOSE

The standard provides a general approach to signing and encryption of any content, not necessarily in JSON. However, it is deliberately built on JSON and base64url to be easily usable in web applications. Also, while being used in OpenID Connect, it can be used as a building block in other protocols.

JOSE is still an upcoming standard, but final revisions should be available shortly. It consists of several upcoming RFCs:

  • JWA – JSON Web Algorithms, describes cryptographic algorithms used in JOSE
  • JWK – JSON Web Key, describes format and handling of cryptographic keys in JOSE
  • JWS – JSON Web Signature, describes producing and handling signed messages
  • JWE – JSON Web Encryption, describes producing and handling encrypted messages
  • JWT – JSON Web Token, describes representation of claims encoded in JSON and protected by JWS or JWE

JWK

JSON Web Key is a data structure representing a cryptographic key with both the cryptographic data and other attributes, such as key usage.

{ 
  "kty":"EC",
  "crv":"P-256",
  "x":"MKBCTNIcKUSDii11ySs3526iDZ8AiTo7Tu6KPAqv7D4",
  "y":"4Etl6SRW2YiLUrN5vfvVHuhp7x8PxltmWWlbbM4IFyM",
  "use":"enc",
  "kid":"1"
}

The mandatory “kty” (key type) parameter describes the cryptographic algorithm associated with the key. Depending on the key type, other parameters might be used. As shown in the example, an elliptic curve key contains the “crv” parameter identifying the curve, the “x” and “y” coordinates of the point, an optional “use” to denote the intended usage of the key, and “kid” as the key ID. The specification currently describes three key types: “EC” for Elliptic Curve, “RSA” for, well, RSA, and “oct” for octet sequence denoting the shared symmetric key.
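As a quick sanity check of the example key, the coordinates can be decoded with the Python standard library. This sketch only decodes and measures them; it does not verify that the point lies on the curve. The helper name b64url_decode is mine.

```python
import base64
import json


def b64url_decode(text):
    """Decode base64url text whose '=' padding has been stripped."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# The example key from above.
jwk = json.loads("""
{
  "kty": "EC",
  "crv": "P-256",
  "x": "MKBCTNIcKUSDii11ySs3526iDZ8AiTo7Tu6KPAqv7D4",
  "y": "4Etl6SRW2YiLUrN5vfvVHuhp7x8PxltmWWlbbM4IFyM",
  "use": "enc",
  "kid": "1"
}
""")

x = b64url_decode(jwk["x"])
y = b64url_decode(jwk["y"])

# P-256 coordinates are 256-bit integers, so each should decode to 32 bytes.
print(jwk["kty"], jwk["crv"], len(x), len(y))
```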

JWS

The JSON Web Signature standard describes the process of creating and validating a data structure representing a signed payload. As an example, take the following string as a payload:

'{"iss":"joe",
 "exp":1300819380,
 "http://example.com/is_root":true}'

Incidentally, this string contains JSON data, but this is not relevant for the signing procedure and it might as well be any data. Before signing, the payload is always converted to base64url encoding:

eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODAsDQogImh0dHA6Ly9leGFtcGxlLm
NvbS9pc19yb290Ijp0cnVlfQ
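The encoding step is easy to reproduce with the standard library. One detail is worth knowing if you try it yourself: the encoded value above corresponds to the payload with CRLF line breaks, which is visible in the “DQo” sequences in the encoded text. The helper name b64url_encode is mine.

```python
import base64


def b64url_encode(data):
    """base64url encode bytes and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# The example payload; the line breaks are CRLF followed by a single space.
payload = b'{"iss":"joe",\r\n "exp":1300819380,\r\n "http://example.com/is_root":true}'

encoded = b64url_encode(payload)

# The value shown above, joined back into one string.
expected = (
    "eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODAsDQogImh0dHA6Ly9leGFtcGxlLm"
    "NvbS9pc19yb290Ijp0cnVlfQ"
)
print(encoded == expected)
```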

Additional parameters are associated with each payload. The required parameter is “alg”, which denotes the algorithm used for generating a signature (one of the possible values is “none” for unprotected messages). The parameters are included in the final JWS in either the protected or the unprotected header. The data in the protected header is integrity protected and base64url encoded, whereas the unprotected header carries human readable associated data that is not integrity protected.

As an example, the protected header will contain the following data:

{"alg":"ES256"}

which in base64url encoding looks like this:

eyJhbGciOiJFUzI1NiJ9

The “ES256” here is the identifier for the ECDSA signature algorithm using the P-256 curve and the SHA-256 digest algorithm.

Unprotected header can contain a key id parameter:

{"kid":"e9bc097a-ce51-4036-9562-d2ade882db0d"}

The base64url encoded protected header and payload are concatenated, in that order, with ‘.’ to form the raw data, which is fed to the signature algorithm to produce the final signature.

Finally, the JWS output is serialized using either the JSON or the Compact serialization. Compact serialization is a simple concatenation of the base64url encoded protected header, payload and signature, separated by periods (‘.’). JSON serialization is a human readable JSON object, which for the example above would look like this:

{
  "payload": "eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODAsDQogImh0dHA6
              Ly9leGFtcGxlLmNvbS9pc19yb290Ijp0cnVlfQ",
  "protected":"eyJhbGciOiJFUzI1NiJ9",
  "header":
    {"kid":"e9bc097a-ce51-4036-9562-d2ade882db0d"},
  "signature":
    "DtEhU3ljbEg8L38VWAfUAqOyKAM6-Xx-F4GawxaepmXFCgfTjDxw5djxLa8IS
     lSApmWQxfKTUJqPP3-Kg6NU1Q"
}

Such a process for generating a signature is pretty straightforward, yet it still supports some advanced use-cases, such as multiple signatures with separate headers.
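The whole signing flow fits in a few lines of Python. To keep the sketch runnable with only the standard library, it swaps the ES256 algorithm from the example for HS256 (HMAC with SHA-256 and a shared key), so the signature value differs from the one shown above; the steps of building the signing input and the two serializations are the same. The function names and the demo key are invented for this illustration, and this is not a replacement for a real JOSE library.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data):
    """base64url encode bytes and strip the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload, key):
    """Sign payload bytes; return the compact form and a JSON-style dict."""
    header = json.dumps({"alg": "HS256"}, separators=(",", ":")).encode()
    protected_b64 = b64url_encode(header)
    payload_b64 = b64url_encode(payload)
    # Signing input: protected header, a period, then the payload.
    signing_input = f"{protected_b64}.{payload_b64}".encode("ascii")
    signature_b64 = b64url_encode(hmac.new(key, signing_input, hashlib.sha256).digest())
    compact = f"{protected_b64}.{payload_b64}.{signature_b64}"
    as_json = {"payload": payload_b64, "protected": protected_b64, "signature": signature_b64}
    return compact, as_json


def verify_hs256(compact, key):
    """Recompute the HMAC over the first two parts and compare signatures."""
    protected_b64, payload_b64, signature_b64 = compact.split(".")
    signing_input = f"{protected_b64}.{payload_b64}".encode("ascii")
    expected = b64url_encode(hmac.new(key, signing_input, hashlib.sha256).digest())
    return hmac.compare_digest(expected, signature_b64)


demo_key = b"not-a-real-secret"  # placeholder shared key for the example
compact, as_json = sign_hs256(b'{"iss":"joe"}', demo_key)
print(compact)
print(verify_hs256(compact, demo_key))
```

A verifier should also check that the “alg” value in the protected header is one it expects before trusting the result; that check is left out here for brevity.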

JWE

JSON Web Encryption follows the same logic as JWS with a few differences:

  • by default, a new content encryption key (CEK) should be generated for each message. This key is used to encrypt the plaintext and is attached to the final message. The public key of the recipient or a shared key is used only to encrypt the CEK (unless direct encryption is used, see below).
  • only AEAD (Authenticated Encryption with Associated Data) algorithms are defined in the standard, so users do not have to think about how to combine JWE with JWS.

Just like with JWS, the header data of a JWE object can be transmitted in an integrity protected header, an unprotected header or a per-recipient unprotected header. The final JSON serialized output then has the following structure:

{
  "protected": "<integrity-protected header contents>",
  "unprotected": <non-integrity-protected header contents>,
  "recipients": [
    {"header": <per-recipient unprotected header 1 contents>,
     "encrypted_key": "<encrypted key 1 contents>"},
     ...
    {"header": <per-recipient unprotected header N contents>,
     "encrypted_key": "<encrypted key N contents>"}],
  "aad":"<additional authenticated data contents>",
  "iv":"<initialization vector contents>",
  "ciphertext":"<ciphertext contents>",
  "tag":"<authentication tag contents>"
}

The CEK is encrypted for each recipient separately, possibly using different algorithms. This gives us the ability to encrypt a message to recipients with different keys, e.g. an RSA key, a shared symmetric key and an EC key.

The two algorithms used need to be specified as header parameters. The “alg” parameter specifies the algorithm used to protect the CEK, while the “enc” parameter specifies the algorithm used to encrypt the plaintext using the CEK as key. Note that “alg” can also have the value “dir”, which marks direct usage of the shared key for content encryption, with no separate CEK.

As an example, assume we have the RSA public key of the first recipient and share a symmetric key with the second recipient. The “alg” parameter will have the value “RSA1_5”, denoting the RSAES-PKCS1-V1_5 algorithm, for the first recipient and “A128KW”, denoting AES 128 Keywrap, for the second recipient. Each header also carries a key ID:

{"alg":"RSA1_5","kid":"2011-04-29"}

and

{"alg":"A128KW","kid":"7"}

These algorithms will be used to encrypt the content encryption key (CEK) to each of the recipients. After the CEK is generated, we use it to encrypt the plaintext with AES 128 in CBC mode with HMAC SHA 256 for integrity:

{"enc":"A128CBC-HS256"}

We can protect this information by putting it into a protected header, which, when base64url encoded, will look like this:

eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0

This data will be fed as associated data to the AEAD encryption algorithm and will therefore be protected by the final authentication tag.

Putting this all together, the resulting JWE object will look like this:
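
What exactly gets fed to the AEAD algorithm? As I read the JWE specification, it is the ASCII bytes of the encoded protected header, followed by a dot and the base64url encoded “aad” member if one is present. A small Python sketch under that reading, with helper names of my own choosing:

```python
import base64
import json
from typing import Optional


def b64url_encode(data: bytes) -> str:
    """base64url without '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """base64url decode, restoring the omitted padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def jwe_aad(protected_b64: str, aad: Optional[bytes] = None) -> bytes:
    """Build the associated data handed to the AEAD algorithm."""
    if aad is None:
        return protected_b64.encode("ascii")
    return (protected_b64 + "." + b64url_encode(aad)).encode("ascii")


protected_b64 = "eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0"
print(json.loads(b64url_decode(protected_b64)))  # the header behind the encoding
print(jwe_aad(protected_b64))
```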

{
  "protected": "eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0",
  "recipients":[
    {"header": {"alg":"RSA1_5","kid":"2011-04-29"},
     "encrypted_key":
       "UGhIOguC7IuEvf_NPVaXsGMoLOmwvc1GyqlIKOK1nN94nHPoltGRhWhw7Zx0-
        kFm1NJn8LE9XShH59_i8J0PH5ZZyNfGy2xGdULU7sHNF6Gp2vPLgNZ__deLKx
        GHZ7PcHALUzoOegEI-8E66jX2E4zyJKx-YxzZIItRzC5hlRirb6Y5Cl_p-ko3
        YvkkysZIFNPccxRU7qve1WYPxqbb2Yw8kZqa2rMWI5ng8OtvzlV7elprCbuPh
        cCdZ6XDP0_F8rkXds2vE4X-ncOIM8hAYHHi29NX0mcKiRaD0-D-ljQTP-cFPg
        wCp6X-nZZd9OHBv-B3oWh2TbqmScqXMR4gp_A"},
    {"header": {"alg":"A128KW","kid":"7"},
     "encrypted_key":
        "6KB707dM9YTIgHtLvtgWQ8mKwboJW3of9locizkDTHzBC2IlrT1oOQ"}],
  "iv": "AxY8DCtDaGlsbGljb3RoZQ",
  "ciphertext": "KDlTtXchhZTGufMYmOYGS4HffxPSUrfmqCHXaI9wOGY",
  "tag": "Mz-VPPyU4RlcuYv1IwIvzw"
}
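
A recipient processing this object has to combine the protected header, the shared unprotected header (absent here) and its own per-recipient header to learn which algorithms apply. A minimal Python sketch of that merge follows; effective_headers is a hypothetical helper, and the encrypted keys and ciphertext are left out because they are not needed for this step.

```python
import base64
import json


def b64url_decode(text: str) -> bytes:
    """base64url decode, restoring the omitted padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# Header-related members of the JWE example above; key material omitted.
jwe = {
    "protected": "eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0",
    "recipients": [
        {"header": {"alg": "RSA1_5", "kid": "2011-04-29"}},
        {"header": {"alg": "A128KW", "kid": "7"}},
    ],
}


def effective_headers(obj: dict) -> list:
    """Return one merged header per recipient.

    The merge assumes the three header sources do not share parameter names.
    """
    protected = json.loads(b64url_decode(obj["protected"])) if "protected" in obj else {}
    unprotected = obj.get("unprotected", {})
    merged = []
    for recipient in obj["recipients"]:
        merged.append({**protected, **unprotected, **recipient.get("header", {})})
    return merged


for header in effective_headers(jwe):
    print(header)
```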

JWA

JSON Web Algorithms defines algorithms and their identifiers to be used in JWS and JWE. The three parameters that specify algorithms are “alg” for JWS, “alg” and “enc” for JWE.

"enc": 
    A128CBC-HS256, A192CBC-HS384, A256CBC-HS512 (AES in CBC with HMAC), 
    A128GCM, A192GCM, A256GCM

"alg" for JWS: 
    HS256, HS384, HS512 (HMAC with SHA), 
    RS256, RS384, RS512 (RSASSA-PKCS-v1_5 with SHA), 
    ES256, ES384, ES512 (ECDSA with SHA), 
    PS256, PS384, PS512 (RSASSA-PSS with SHA for digest and MGF1)

"alg" for JWE: 
    RSA1_5, RSA-OAEP, RSA-OAEP-256, 
    A128KW, A192KW, A256KW (AES Keywrap), 
    dir (direct encryption), 
    ECDH-ES (EC Diffie Hellman Ephemeral+Static key agreement), 
    ECDH-ES+A128KW, ECDH-ES+A192KW, ECDH-ES+A256KW (with AES Keywrap), 
    A128GCMKW, A192GCMKW, A256GCMKW (AES in GCM Keywrap), 
    PBES2-HS256+A128KW, PBES2-HS384+A192KW, PBES2-HS512+A256KW 
    (PBES2 with HMAC SHA and AES keywrap)

At first glance, the wealth of choice for “alg” in JWE is balanced by just two families of options for “enc” (AES in CBC mode with HMAC, and AES in GCM mode). Thanks to “enc” and “alg” being separate, algorithms suitable for encrypting a cryptographic key and for encrypting content can be defined separately. The AES Keywrap scheme defined in RFC 3394 is a preferred way to protect a cryptographic key. The scheme uses a fixed IV value, which is checked after decryption and provides integrity protection while lengthening the encrypted key by only one 64-bit block, instead of attaching a separate IV and authentication tag. But here’s a catch: while A128KW refers to the AES Keywrap algorithm as defined in RFC 3394, the word “keywrap” in A128GCMKW is used in a more general sense as a synonym for encryption, so it denotes simple encryption of the key with AES in GCM mode.
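
The size difference between the two “keywrap” flavours can be checked against the example above. This Python sketch decodes the A128KW encrypted key of the second recipient and compares its length with the CEK. It assumes a 32 byte CEK for A128CBC-HS256 (16 bytes for HMAC plus 16 bytes for AES), and a 12 byte IV and 16 byte tag for the GCM variants, which is my reading of JWA.

```python
import base64


def b64url_decode(text: str) -> bytes:
    """base64url decode, restoring the omitted padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# "encrypted_key" of the A128KW recipient from the JWE example.
wrapped = b64url_decode("6KB707dM9YTIgHtLvtgWQ8mKwboJW3of9locizkDTHzBC2IlrT1oOQ")

CEK_BYTES = 32           # assumption: A128CBC-HS256 uses a 256-bit CEK
GCM_IV_BYTES = 12        # assumption: 96-bit IV for A128GCMKW
GCM_TAG_BYTES = 16       # assumption: 128-bit tag for A128GCMKW

keywrap_overhead = len(wrapped) - CEK_BYTES
gcm_overhead = GCM_IV_BYTES + GCM_TAG_BYTES

print("wrapped key length:", len(wrapped))
print("RFC 3394 keywrap overhead:", keywrap_overhead)
print("GCM 'keywrap' overhead (iv + tag):", gcm_overhead)
```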

JWT

While the previous parts of JOSE provide general purpose cryptographic primitives for arbitrary data, the JSON Web Token standard is more closely tied to OpenID Connect. A JWT object is simply a JSON hash with claims that is either signed with JWS or encrypted with JWE and serialized using compact serialization. Beware of a terminological quirk: when a JWT is used as plaintext in JWE or JWS, it is referred to as a nested JWT (rather than signed or encrypted).

The JWT standard defines claims, name/value pairs asserting information about a subject. The claims include

  • “iss” to identify issuer of the claim
  • “sub” identifying subject of JWT
  • “aud” (audience) identifying intended recipients
  • “exp” to mark expiration time of JWT
  • “nbf” (not before) to mark time before which JWT must be rejected
  • “iat” (issued at) to mark time when JWT was created
  • “jti” (JWT ID) as unique identifier for JWT

While the standard mandates the format of the values of these claims, all of them are optional in a valid JWT. This means applications can use any structure for a JWT that is not intended for public use, while for public JWTs a set of claims is defined and name collisions are prevented.
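
As an illustration of how the time related claims are meant to be used, here is a small Python sketch that decodes the payload from the JWS example and checks “exp” and “nbf” against a given time. The function name and the leeway parameter are my own; a real application should rely on a JOSE library and verify the signature first.

```python
import base64
import json


def b64url_decode(text: str) -> bytes:
    """base64url decode, restoring the omitted padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def time_claims_valid(claims: dict, now: int, leeway: int = 0) -> bool:
    """Check "exp" and "nbf" (seconds since the epoch); both are optional."""
    exp = claims.get("exp")
    nbf = claims.get("nbf")
    if exp is not None and now >= exp + leeway:
        return False  # token has expired
    if nbf is not None and now < nbf - leeway:
        return False  # token is not valid yet
    return True


payload_b64 = (
    "eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODAsDQogImh0dHA6Ly9leGFtcGxlLm"
    "NvbS9pc19yb290Ijp0cnVlfQ"
)
claims = json.loads(b64url_decode(payload_b64))
print(claims)
print(time_claims_valid(claims, now=claims["exp"] - 1))
print(time_claims_valid(claims, now=claims["exp"]))
```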

The good

The fact that JOSE combines JSON and base64url encoding, making it simple and web friendly, is a clear win. Although we will definitely see JOSE adopted in the web environment first, it does have the ambition to become a more general purpose standard.

The design promotes secure choices, e.g. the use of a unique CEK per message, which makes users default to secure configurations while still giving them the option to use less secure methods (“dir” for encryption, “none” in JWS). With this being a new standard, the authors seized the opportunity to define only secure algorithms. This is certainly good, but as advances in cryptography weaken the algorithms, the ability to deprecate algorithms (with backwards compatibility always being an issue) will become more important in the future.

The bad

Despite the effort to keep the standard simple, some complexities inevitably slipped through. One obvious complexity comes from the different serializations. JOSE standards support two serializations: JSON and Compact. JSON serialization is human readable and gives users more freedom, as it supports some advanced features like multiple recipients. However, since a single recipient is a much more common case, the standard also defines a variant called flattened JSON serialization. In this variant, parameters from nested fields (like “recipients”) are moved directly to the top level JSON object, making multiple recipients impossible with flattened serialization.

The compact serialization is created, according to the standard, for “space constrained environments”. The result of the serialization is a dot separated concatenation of base64url encoded segments of the original JSON object. For example, for JWS the serialization is constructed as follows:
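
The relation between the general and the flattened JSON serialization of a JWE can be sketched in Python as follows. The function flatten_jwe is hypothetical, and the encrypted key is the placeholder from the structure template earlier in the article, not real data.

```python
def flatten_jwe(jwe: dict) -> dict:
    """Turn a single-recipient general JWE into the flattened form.

    The one entry of "recipients" is dissolved and its "header" and
    "encrypted_key" members move to the top level JSON object.
    """
    recipients = jwe.get("recipients", [])
    if len(recipients) != 1:
        raise ValueError("flattened serialization supports exactly one recipient")
    flat = {name: value for name, value in jwe.items() if name != "recipients"}
    flat.update(recipients[0])
    return flat


general = {
    "protected": "eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0",
    "recipients": [
        {
            "header": {"alg": "RSA1_5", "kid": "2011-04-29"},
            "encrypted_key": "<encrypted key 1 contents>",  # placeholder
        }
    ],
    "iv": "AxY8DCtDaGlsbGljb3RoZQ",
    "ciphertext": "KDlTtXchhZTGufMYmOYGS4HffxPSUrfmqCHXaI9wOGY",
    "tag": "Mz-VPPyU4RlcuYv1IwIvzw",
}

flat = flatten_jwe(general)
print(sorted(flat))
```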

BASE64URL(UTF8(JWS Protected Header)) || '.' ||
BASE64URL(JWS Payload) || '.' ||
BASE64URL(JWS Signature)

Astute readers will notice that compact serialization further restricts the available features of JWS, e.g. it is no longer possible to include an unprotected header. The space saving comes from dropping the keys that denote the parts of the JSON object (“payload”, “signature” etc.). On the other hand, base64url encoding expands the length of cleartext data (i.e. the unprotected header), so in an extreme example a compact serialized JWS might actually be longer than a JSON serialized one, if the header contains enough data (to compensate for the inefficiency of specifying keys in the object) and is stored as an unprotected header in the JSON form. The number of dot separated sections is also important, since it is the only method of differentiating between a compact serialized JWE and JWS. Other proposed extensions, such as Key Managed JSON Web Signature (KMJWS), must take this into account.

The standard also still contains several ambiguities, e.g. JWK defines a JWK Set and states that “The member names within a JWK Set MUST be unique;” without specifying what a member name actually is.
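
Telling the two compact forms apart therefore boils down to counting segments: three for JWS (header, payload, signature) and five for JWE (header, encrypted key, IV, ciphertext, tag). A tiny Python sketch, with classify_compact being a name of my own:

```python
def classify_compact(token: str) -> str:
    """Guess the JOSE object type from the number of dot separated segments."""
    segments = token.split(".")
    if len(segments) == 3:
        return "JWS"
    if len(segments) == 5:
        return "JWE"
    raise ValueError(f"unexpected number of segments: {len(segments)}")


# Segment contents are dummy placeholders; only their count matters here.
print(classify_compact("header.payload.signature"))
print(classify_compact("header.key.iv.ciphertext.tag"))
# An empty segment still counts, e.g. the empty encrypted key with "dir".
print(classify_compact("header..iv.ciphertext.tag"))
```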

The ugly

The JOSE standard is already incorporated in the OpenID Connect standard. As the standards evolved side by side, the OpenID Connect standard is based on older revisions of JOSE. More importantly, section 15.6.1. Pre-Final IETF Specifications of OpenID Connect states:

“Implementers should be aware that this specification uses several IETF specifications that are not yet final specifications … While every effort will be made to prevent breaking changes to these specifications, should they occur, OpenID Connect implementations should continue to use the specifically referenced draft versions above in preference to the final versions …”

Compatibility issues are always the bane of cryptographic standards, and this decision to prefer pre-final revisions of the JOSE standard might force implementations to make some hard decisions.

Future

The JOSE standard seems to be quickly approaching its final revisions and we will most probably see more of it on the web. Implementations for most of the popular languages are in place, and we will see whether the decision to award the Special European Identity Award for Best Innovation for Security in the API Economy to JOSE will also stand the test of time.

March 31, 2015

Report on IoT (Internet of Things) Security

IoT (Internet of Things) devices have – and in many cases have earned! – a rather poor reputation for security. It is easy to find numerous examples of security issues in various IoT gateways and devices.

So I was expecting the worst when I had the opportunity to talk to a number of IoT vendors and to attend the IoT Day at EclipseCon. Instead, I was pleasantly surprised to discover that considerable attention is being paid to security!

  • Frameworks, infrastructure, and lessons from the mobile phone space are being applied to IoT. The mobile environment isn’t perfect, but has made considerable progress over the last few years. This is actually a pretty good starting point.
  • Code signing is being emphasized. This means that the vendor has purchased a code signing certificate from a known Certificate Authority and used it to sign their application. This ensures that the code has not been corrupted or tampered with and provides some assurance that it is coming from a known source. Not an absolute guarantee, as the Certificate Authorities aren’t perfect, but a good step.
  • Certificate based identity management, based on X.509 certificates, is increasingly popular. This provides a strong mechanism to identify systems and encrypt their communications.
  • OAuth based authentication and authorization is becoming more widely used.
  • Encrypted communications are strongly recommended. The Internet of Things should run on https!
  • Encrypted storage is recommended.

Julian Vermillard of Sierra Wireless gave a presentation at EclipseCon on 5 Elements of IoT Security. His points included:

  • Secure your hardware. Use secure storage and secure communications. Firmware and application updates should be signed.
  • “You can’t secure what you can’t update.”
    • Upgrades must be absolutely bulletproof – you can never “brick” a device!
    • Need rollback capabilities for all updates. An update may fail for many reasons, and you may need to revert to an earlier version of the code. For example, an update might not work with other software in your system.
  • Secure your communications
    • Recommends using Perfect Forward Secrecy.
    • Use public key cryptography:
      • X.509 certificates (see above discussions on X.509). Make sure you address certificate revocation.
      • Pre-Shared Keys. This is often easier to implement but weaker than a full Public Key X.509 infrastructure.
      • Whatever approach you take, make sure you can handle regular secret rotation or key rotation.
    • For low end devices look at TLS Minimal. I’m not familiar with this; it appears to be an IETF Draft.

Julian also recommended keeping server security in mind – the security of the backend service the IoT device or gateway is talking to is as important as device level security!

The challenge now is to get actual IoT manufacturers and software developers to build robust security into their devices. For industrial devices, where there is a high cost for security failures, we may be able to do this.

For consumer IoT devices you will have to vote with your wallet. If secure IoT devices sell better than insecure ones, manufacturers will provide security. If cost and time to market are everything, we will get insecure devices.


March 26, 2015

OpenStack keeps resetting my hostname

No matter what I changed, something kept setting the hostname on my vm to federate.cloudlab.freeipa.org.novalocal. Even forcing the /etc/hostname file to be uneditable did not prevent this change. Hunting this down took far too long, and here is the result of my journey.

Old Approach

A few releases ago, I had a shell script for spinning up new virtual machines that dealt with dhclient resetting values by putting overrides into /etc/dhclient.conf. Finding this file was a moving target. First it moved into

/etc/dhcp/dhclient.conf.

Then to a file inside

/etc/dhcp/dhclient.d

And so on.  The change I wanted to make was to do two things:

  1. Set the hostname explicitly and keep it that way
  2. Use my own DNS server, not the DHCP managed one

Recently, I started working on a RHEL 7.1 system running on our local cloud.  No matter what I did, I could not fix the host name.  Here are some  of the things I tried:

  1. Setting the value in /etc/hostname
  2. Running hostnamectl set-hostname federate.cloudlab.freeipa.org
  3. Using nmcli to set the properties for the connection’s ipv4 configuration
  4. Explicitly setting it in /etc/sysconfig/network-scripts/ifcfg-eth0
  5. Setting the value in /etc/hostname and making hostname immutable with chattr +i /etc/hostname

Finally, Dan Williams (dcbw) suggested I look in the journal to see what was going on with the host name.  I ran journalctl -b and did a grep for hostname.  Everything looked right until…

Mar 26 14:01:10 federate.cloudlab.freeipa.org cloud-init[1914]: [CLOUDINIT] stages.py[DEBUG]: Running module set_hostname (<module 'cloudinit.config.cc_set_hostname' from '/usr/lib/python2.7/site-packages/cloudinit...

cloud-init?

But…I thought that was only supposed to be run when the VM was first created? So, regardless of the intention, it was no longer helping me.

yum erase cloud-init

And now the hostname that I set in /etc/hostname survives a reboot. I’ll post more when I figure out why cloud-init is still running after initialization.