Fedora Security Planet

Episode 62 - All about the Equifax hack

Posted by Open Source Security Podcast on September 11, 2017 02:35 PM
Josh and Kurt talk about the Equifax breach and what it will mean for all of us.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5728237/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



Measuring security: Part 1 - Things that make money

Posted by Josh Bressers on September 11, 2017 12:48 PM
If you read my previous post on measuring security, you know I broke measuring into three categories. I have no good reason to do this other than it's something that made sense to me. There are without question better ways to split these apart, I'm sure there is even overlap, but that's not important. What actually matters is to start a discussion on measuring what we do. The first topic is about measuring security that directly adds to revenue such as a product or service.

Revenue
The concept of making money is simple enough. You take resources such as raw materials, money, and even people in some instances. Usually it's all three. You transform these resources into something new and better. The new creation is then turned into money, or revenue, for your
business. If you have a business that doesn't make more money than it spends you have a problem. If you have a business that doesn't make any money you have a disaster.

This is easy enough to understand, but let's use a grossly simplified example to make sure we're all on the same page. Let's say you're making widgets. I suppose since this is a security topic we should call them BlockWidgetChains. In our fictional universe you spend $10 on materials and people. Make sure you can track how much everything costs; you should be able to determine how much of that $10 is materials and how much is people. You then sell the BlockWidgetChain for $20. That means you spent $10 to make $20. This should make sense to anyone who understands math (or maths for you English speakers).
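The arithmetic is trivial, but being explicit about it is the whole point. A toy sketch; the $10 total cost and $20 price come from the example, while the materials/people split is an assumed number for illustration:

```python
# Toy margin math for the fictional BlockWidgetChain.
materials = 6.0   # assumed share of the $10 cost
people = 4.0      # assumed share of the $10 cost
price = 20.0

cost = materials + people
profit = price - cost
margin = profit / price

print(f"cost=${cost:.2f} profit=${profit:.2f} margin={margin:.0%}")
# → cost=$10.00 profit=$10.00 margin=50%
```

If you can't fill in numbers like these for your own product, you're in the same position as the competitor in the next paragraph.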

Now let's say you have a competitor who makes BlockChainWidgets. They're the same thing basically, but they have no idea how much it costs them to make BlockChainWidgets. They know if they charge more than $20 they can't compete because BlockWidgetChains cost $20. Their solution is to charge $20 and hope the books work out.

I've not only described the business plan for most startups but also a company that's almost certainly in trouble. You have to know how much you spend on resources. If you spend more than you're charging for the product, that's a horrible business model. Unfortunately, most of security works like this. We have no idea how much a lot of what we do costs, and we certainly don't know how much value it adds to the bottom line. In many instances we cannot track spending in a meaningful way.

Measuring security
So now we're on to the idea of measuring security in an environment where security is responsible for making money. Something like security features in a product, or maybe even a security product in some instances. This is the work that pays my bills; I've been working on product security for a very long time. If you're part of your product team (which you should be, product security doesn't belong anywhere else, more on that another day) then you understand the importance of having features that make a product profitable and useful. For example, I would say SSO (single sign-on) is a must-have in today's environment. If you don't have this feature you can't be as effective in the market. But adding and maintaining features isn't free. If you spend $30 and sell it for $20, you'd make more money just by staying in bed. Sometimes the most profitable decision is to not do something.

Go big or go home
The biggest mistake we like to make is doing too much. It's easy to scope a feature too big. At worst you end up failing completely; at best you end up with what you should have scoped in the first place. But you spend a lot more on failure before you end up where you should have been from the start.

Let's use SSO as our example here. If you were going to scope the best SSO solution in the world, your product would be using SAML, OAuth, PKI, Kerberos, Active Directory, LDAP, and whatever else you manage to think of on planning day. This example is pretty clearly over the top, but I bet a lot of new SSO systems scope SAML and OAuth at the same time. The reality is you only need one to start; you can add more later. First, having a small scope is important. It shows you want to do one thing and do it well instead of doing three things badly. There are few features that are useful in a half-finished state. Your sales team has no desire to show off a half-finished product.

How to decide
But how do we decide which feature to add? The first thing I do is look at customer feedback. Do the customers clearly prefer one over the other? Set up calls with them, go on visits. Learn what they do and how they do it. If this doesn't give you a clear answer, the next question is always "which feature would sell more product?" In the case of something like SAML vs OAuth there might not be a good answer. If you're some sort of cloud service, OAuth means you can let customers authenticate against Google and Facebook. That would probably result in more users.

If you're focused on a lot of on-prem solutions, SAML might be more used. It's even possible SSO isn't what customers are after once you start to dig. I find it's best to make a mental plan of how things should look, then make sure that's not what gets built because whatever I think of first is always wrong ;)

But how much does it cost?
Lastly, if there's no good way to show revenue for a feature, you can look at investment cost. The amount of time and money something will take to implement can really help when deciding what to do. If a feature will take years to develop, that's probably not a feature you want or need. Most industries will be very different in a few years; the expectations of today won't be the expectations of tomorrow.

For example, if SAML will take three times as long as OAuth to implement, and both features will result in the same number of sales, OAuth will have a substantially larger return on investment as it's much cheaper to implement. A feature doesn't count for anything until it's on the market; half done or in development are the same as "doesn't exist". Make sure you track time as part of your costs. Money is easy to measure, but people and time are often just as important.
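To make the comparison concrete, here's a back-of-envelope sketch. Every number in it is assumed for illustration; the only given is that SAML takes three times as long as OAuth:

```python
def roi(revenue, cost):
    """Return on investment as a ratio: (revenue - cost) / cost."""
    return (revenue - cost) / cost

dev_month = 10_000           # assumed fully loaded cost per developer-month
revenue = 120_000            # assumed sales attributable to the feature (same for both)

oauth_cost = 2 * dev_month   # assumed: OAuth takes two developer-months
saml_cost = 3 * oauth_cost   # SAML takes three times as long

print(f"OAuth ROI: {roi(revenue, oauth_cost):.1f}x")  # → OAuth ROI: 5.0x
print(f"SAML ROI:  {roi(revenue, saml_cost):.1f}x")   # → SAML ROI:  1.0x
```

Same revenue, triple the cost, one fifth the return. Swap in your own numbers; the point is that you have to have numbers.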

I really do think this is the easiest security category to measure and justify. That could be because I do it every day, but I think if you can tie actual sales back to security features you'll find yourself in a good place. Your senior leadership will think you're magic if you can show them that if they invest resources in X they will get Y. Make sure you track the metrics though. It's not enough to meet expectations; make an effort to exceed them. There's nothing leadership likes better than someone who can over-deliver on a regular basis.

I see a lot of groups that don't do any of this. They wander in circles, sometimes adding security features that don't matter, often engineering solutions that customers only need or want 10% of. I'll never forget when I first looked at actual metrics on new features and realized something we wanted to add was going to have a massive cost and generate zero additional revenue (it may have actually detracted from future product sales). That was the day I saw the power of metrics. Overnight my group became heroes for saving everyone a lot of work and headaches. Sometimes doing nothing is the most valuable action you can take.

Solutions Architect

Posted by Adam Young on September 05, 2017 03:59 PM

Today is my first day at Red Hat! Well, OK, I’ve been here a few years, but today I move from Engineering to Sales. My new role is “Specialist Solutions Architect” where that specialty is Cloud.

I have a lot to learn, and I will try to use this site to record the most important and interesting details I learn.

What are the Cloud Products? Well, according to Red Hat’s site, they are (please mentally prepend Red Hat to all of these): OpenStack Platform, OpenShift, CloudForms, Virtualization, Certificate System, and Directory Server, as well as product bundles built out of these. Of all of these, I’d guess I have the most to learn about CloudForms, as I’ve only recently started working with it. Really, though, I have a lot to learn across the board. I know that both Ansible Tower and Satellite are major integration points for managing servers at scale, and I’ll be expected to provide expertise there as well. Plus, everything builds on the other product lines: RHEL and variants, as well as the Storage and Networking solutions.

This is going to be fun. Time to dig in.

Episode 61 - Market driven security

Posted by Open Source Security Podcast on September 05, 2017 12:17 PM
Josh and Kurt talk about our lack of progress in security, economics, and how to interact with peers.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5708244/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



The father of modern security: B. F. Skinner

Posted by Josh Bressers on September 04, 2017 01:43 PM
A lot of what we call security is voodoo. Most of it actually.

What I mean by that statement is that our security process is often based on ideas that don't really work. As an industry we have built up a lot of ideas and processes that aren't actually grounded in facts and science. We don't understand why we do certain things, but we know that if we don't do those things something bad will happen! Will it really happen? I heard something will happen. I suspect the answer is no, but it's very difficult to explain this concept sometimes.

I'm going to start with some research B. F. Skinner did as my example here. The very short version is that Skinner did research on pigeons. He had a box that delivered food at random intervals. The birds developed rituals that they would do in order to have their food delivered. If a pigeon decided that spinning around would cause food to be delivered, it would continue to spin around; eventually the food would appear, reinforcing the nonsensical behavior. The pigeon believed its ritual was affecting how often the food was delivered. The reality is nothing the pigeon did affected how often food was delivered. The pigeon of course didn't know this; it only knew what it experienced.

My favorite example to use next to this pigeon experiment is the password policies of old. A long time ago someone made up some rules about what a good password should look like. A good password has letters, and numbers, and special characters, and the name of a tree in it. How often we should change a password was also part of this. Everyone knows you should change passwords as often as possible. Two or three times a day is best. The more you change it the more secure it is!

Today we've decided that all this advice was terrible. The old advice was based on voodoo; it was our ritual that kept us safe. The advice seemed fair to some people, but there were no facts backing it up. Lots of random characters seems like a good idea, but we didn't know why. Changing your password often seemed like a good idea, but we didn't know why. This wasn't much different than the pigeon spinning around to get more food. We couldn't prove it didn't work, so we kept doing it because we had to do something.

Do you know why we changed all of our password advice? We changed it because someone did the research around passwords. We found out that very long passwords using real words is substantially better than a nonsense short password. We found out that people aren't good at changing their passwords every 90 days. They end up using horrible passwords and adding a 1 to the end. We measured the effectiveness of these processes and understood they were actually doing the opposite of what we wanted them to do. Without question there are other security ideas we do today that fall into this category.
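The long-passwords result is easy to sanity-check with a rough entropy estimate. This sketch assumes a 72-symbol alphabet for the "complex" password and a 7776-word diceware-style list for the passphrase; both models are simplifications, and real human-chosen passwords are far worse than the uniform-random ideal:

```python
import math

# Bits of entropy for uniformly random choices: n * log2(alphabet size)
short = 8 * math.log2(72)          # 8 random chars from 72 symbols
passphrase = 4 * math.log2(7776)   # 4 random words from a 7776-word list

print(f"8-char complex password: {short:.1f} bits")       # ≈ 49.4 bits
print(f"4-word passphrase:       {passphrase:.1f} bits")  # ≈ 51.7 bits
```

The passphrase comes out slightly ahead even in this idealized comparison, and it's the one a human can actually remember without writing it down or appending a 1.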

Even though we have research showing this password advice was terrible we still see a lot of organizations and people who believe the old rituals are the right way to keep passwords safe. Sometimes even when you prove something to someone they can't believe it. They are so invested in their rituals that they are unable to imagine any other way of existing. A lot of security happens this way. How many of our rules and processes are based on bad ideas?

How to measure
Here's where it gets real. It's easy to pick on the password example because it's in the past. We need to focus on the present and the future. You have an organization that's full of policy, ideas, and stuff. How can we try to make a dent in what we have today? What matters? What doesn't work, and what's actually harmful?

I'm going to split everything into 3 possible categories. We'll dive deeper into each in future posts, but we'll talk about them briefly right now.

Things that make money
Number one is things that make money. This is something like a product you sell, or a website that customers use to interact with your company. Every company does something that generates revenue. Measuring things that fit into this category is really easy. You just ask "Will this make more, less, or the same amount of money?" If the answer is less, you're wasting your time. I wrote about this a bit a long time ago; the post isn't great, but the graphic I made is useful. Print it out and plot your features on it. You can probably start asking this question today without much excitement.

Cost of doing business
The next category is what I call cost of doing business. This would be things like compliance or being a part of a professional organization. Sending staff to conferences and meetings. Things that don't directly generate revenue but can have a real impact on the revenue. If you don't have PCI compliance, you can't process payments, you have no revenue, and the company won't last long. Measuring some of these is really hard. Does sending someone to Black Hat directly generate revenue? No. But it will create valuable connections and they will likely learn new things that will be a benefit down the road. I guess you could think of these as investments in future revenue.

My thoughts on how to measure this one are less mature. I think about these often, and I'll elaborate more in a future post.

Infrastructure
The last category I'm going to call "infrastructure". This one is a bit harder to grasp. It's not unlike the previous question though. In this case we ask ourselves "If I stopped doing this, what bad thing would happen?" Now I don't mean movie-plot bad things. Yeah, if you stopped using your super expensive keycard entry system, a spy from a competitor could break in and steal all your secrets using a super-encrypted Tor-enabled flash drive, but they probably won't. This is the category where you have to consider the cost of an action vs the cost of not doing an action. Not doing things will often have a cost, but doing things also has a cost.

Return on investment is the name of the game here. Nobody likes to spend money they don't have to. This is why cloud is disrupting everything. Why pay for servers you don't need when you can rent only what you do need?
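One hedged way to put numbers on "cost of doing vs cost of not doing" is an expected-loss comparison. Every figure below is made up purely for illustration; the hard part in practice is estimating the probabilities honestly:

```python
def expected_loss(probability_per_year, impact):
    """Annualized expected loss: chance of an incident times its cost."""
    return probability_per_year * impact

control_cost = 50_000                          # assumed yearly cost of the control
loss_without = expected_loss(0.05, 2_000_000)  # assumed: 5% chance of a $2M incident
loss_with = expected_loss(0.01, 2_000_000)     # assumed: control cuts that to 1%

benefit = loss_without - loss_with
print(f"expected benefit ${benefit:,.0f} vs control cost ${control_cost:,.0f}")
# → expected benefit $80,000 vs control cost $50,000
```

With these assumed numbers the control pays for itself; halve the incident probability estimate and it no longer does, which is exactly why the estimates need to be measured rather than guessed.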

I have some great stories for this category, be sure to come back when I publish this followup article.

The homework for everyone now is to just start thinking about what you do and why you do it. If you don't have a good reason, you need to change your thinking. Changing your thinking is really hard to do as a human though. Many of us like to double down on our old beliefs when presented with facts. Don't be that person, keep an open mind.

Running Keycloak in OpenShift

Posted by Fraser Tweedale on September 04, 2017 06:26 AM

At PyCon Australia in August I gave a presentation about federated and social identity. I demonstrated the concepts using Keycloak, an open source, feature-rich identity broker. Keycloak is deployed in JBoss, so I wasn’t excited about the prospect of setting up Keycloak from scratch. Fortunately there is an official Docker image for Keycloak, so with that as the starting point I took the opportunity to finally learn about OpenShift v3, too.

This post is simply a recounting of how I ran Keycloak on OpenShift. Along the way we will look at how to get the containerised Keycloak to trust a private certificate authority (CA).

One thing that is not discussed is how to get Keycloak to persist configuration and user records to a database. This was not required for my demo, but it will be important in a production deployment. Nevertheless I hope this article is a useful starting point for someone wishing to deploy Keycloak on OpenShift.

Bringing up a local OpenShift cluster

To deploy Keycloak on OpenShift, one must first have an OpenShift. OpenShift Online is Red Hat’s public PaaS offering. Although running the demo on a public PaaS was my first choice, OpenShift Online was experiencing issues at the time I was setting up my demo. So I sought a local solution. This approach would have the additional benefit of not being subject to the whims of conference networks (or, it was supposed to – but that is a story for another day!)

oc cluster up

Next I tried oc cluster up. oc is the official OpenShift client program; on Fedora, it is provided by the origin-clients package. The oc cluster up command pulls the required images and brings up an OpenShift cluster running on the system’s Docker infrastructure. The command takes no further arguments; it really is that simple! Or is it…?

% oc cluster up
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for openshift/origin:v1.5.0 image ...
   Pulling image openshift/origin:v1.5.0
   Pulled 0/3 layers, 3% complete
   ...
   Pulled 3/3 layers, 100% complete
   Extracting
   Image pull complete
-- Checking Docker daemon configuration ... FAIL
   Error: did not detect an --insecure-registry argument on the Docker daemon
   Solution:

     Ensure that the Docker daemon is running with the following argument:
        --insecure-registry 172.30.0.0/16

OK, so it is not that simple. But it got a fair way along, and (kudos to the OpenShift developers) they have provided actionable feedback about how to resolve the issue. I added --insecure-registry 172.30.0.0/16 to the OPTIONS variable in /etc/sysconfig/docker, then restarted Docker and tried again:

% oc cluster up
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... OK
-- Checking for existing OpenShift container ... OK
-- Checking for openshift/origin:v1.5.0 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ...
   Using nsenter mounter for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ...
   Using 192.168.0.160 as the server IP
-- Starting OpenShift container ...
   Creating initial OpenShift configuration
   Starting OpenShift using container 'origin'
   Waiting for API server to start listening
-- Adding default OAuthClient redirect URIs ... OK
-- Installing registry ... OK
-- Installing router ... OK
-- Importing image streams ... OK
-- Importing templates ... OK
-- Login to server ... OK
-- Creating initial project "myproject" ... OK
-- Removing temporary directory ... OK
-- Checking container networking ... OK
-- Server Information ... 
   OpenShift server started.
   The server is accessible via web console at:
       https://192.168.0.160:8443

   You are logged in as:
       User:     developer
       Password: developer

   To login as administrator:
       oc login -u system:admin

Success! Unfortunately, on my machine with several virtual networks, oc cluster up messed a bit too much with the routing tables, and when I deployed Keycloak on this cluster it was unable to communicate with my VMs. No doubt these issues could have been solved, but being short on time and with other approaches to try, I abandoned this approach.

Minishift

Minishift is a tool that launches a single-node OpenShift cluster in a VM. It supports a variety of operating systems and hypervisors. On GNU+Linux it supports KVM and VirtualBox.

First, install docker-machine and docker-machine-driver-kvm (follow the instructions at the preceding links). Unfortunately these are not yet packaged for Fedora.

Download and extract the Minishift release for your OS from https://github.com/minishift/minishift/releases.

Run minishift start:

% ./minishift start
-- Installing default add-ons ... OK
Starting local OpenShift cluster using 'kvm' hypervisor...
Downloading ISO 'https://github.com/minishift/minishift-b2d-iso/releases/download/v1.0.2/minishift-b2d.iso'

... wait a while ...

It downloads a boot2docker VM image containing the OpenShift cluster, boots the VM, and the console output then resembles the output of oc cluster up. I deduce that oc cluster up is being executed in the VM.

At this point, we’re ready to go. Before I continue, it is important to note that once you have access to an OpenShift cluster, the user experience of creating and managing applications is essentially the same. The commands in the following sections are relevant regardless of whether you are running your app on OpenShift Online, on a cluster running on your workstation, or anything in between.

Preparing the Keycloak image

The JBoss project provides official Docker images, including an official Docker image for Keycloak. This image runs fine in plain Docker but the directory permissions are not correct for running in OpenShift.

The Dockerfile for this image is found in the jboss-dockerfiles/keycloak repository on GitHub. Although they do not publish an official image for it, this repository also contains a Dockerfile for Keycloak on OpenShift! I was able to build that image myself and upload it to my Docker Hub account. The steps were as follows.

First clone the jboss-dockerfiles repo:

% git clone https://github.com/jboss-dockerfiles/keycloak docker-keycloak
Cloning into 'docker-keycloak'...
remote: Counting objects: 1132, done.
remote: Compressing objects: 100% (22/22), done.
remote: Total 1132 (delta 14), reused 17 (delta 8), pack-reused 1102
Receiving objects: 100% (1132/1132), 823.50 KiB | 158.00 KiB/s, done.
Resolving deltas: 100% (551/551), done.
Checking connectivity... done.

Next build the Docker image for OpenShift:

% docker build docker-keycloak/server-openshift
Sending build context to Docker daemon 2.048 kB
Step 1 : FROM jboss/keycloak:latest
 ---> fb3fc6a18e16
Step 2 : USER root
 ---> Running in 21b672e19722
 ---> eea91ef53702
Removing intermediate container 21b672e19722
Step 3 : RUN chown -R jboss:0 $JBOSS_HOME/standalone &&     chmod -R g+rw $JBOSS_HOME/standalone
 ---> Running in 93b7d11f89af
 ---> 910dc6c4a961
Removing intermediate container 93b7d11f89af
Step 4 : USER jboss
 ---> Running in 8b8ccba42f2a
 ---> c21eed109d12
Removing intermediate container 8b8ccba42f2a
Successfully built c21eed109d12

Finally, tag the image into the repo and push it:

% docker tag c21eed109d12 registry.hub.docker.com/frasertweedale/keycloak-openshift

% docker login -u frasertweedale registry.hub.docker.com
Password:
Login Succeeded

% docker push registry.hub.docker.com/frasertweedale/keycloak-openshift
... wait for upload ...
latest: digest: sha256:c82c3cc8e3edc05cfd1dae044c5687dc7ebd9a51aefb86a4bb1a3ebee16f341c size: 2623

Adding CA trust

For my demo, I used a local FreeIPA installation to issue TLS certificates for the Keycloak app. I was also going to carry out a scenario where I configure Keycloak to use that FreeIPA installation’s LDAP server to authenticate users. Because I wanted to use TLS everywhere (eat your own dog food!), I needed the Keycloak application to trust the CA of one of my local FreeIPA installations. This made it necessary to build another Docker image based on the keycloak-openshift image, with the appropriate CA trust built in.

The content of the Dockerfile is:

FROM frasertweedale/keycloak-openshift:latest
USER root
COPY ca.pem /etc/pki/ca-trust/source/anchors/ca.pem
RUN update-ca-trust
USER jboss

The file ca.pem contains the CA certificate to add. It must be in the same directory as the Dockerfile. The build copies the CA certificate to the appropriate location and executes update-ca-trust to ensure that applications – including Java programs – will trust the CA.

Following the docker build I tagged the new image into my hub.docker.com repository (tag: f25-ca) and pushed it. And with that, we are ready to deploy Keycloak on OpenShift.

Creating the Keycloak application in OpenShift

At this point we have a local OpenShift cluster (via Minishift) and a Keycloak image (frasertweedale/keycloak-openshift:f25-ca) to deploy. When deploying the app we need to set some environment variables:

KEYCLOAK_USER=admin

A username for the Keycloak admin account to be created

KEYCLOAK_PASSWORD=secret123

Passphrase for the admin user

PROXY_ADDRESS_FORWARDING=true

Because the application will be running behind OpenShift’s HTTP proxy, we need to tell Keycloak to use the "external" hostname when creating hyperlinks, rather than its own view of the network.

Use the oc new-app command to create and deploy the application:

% oc new-app --docker-image frasertweedale/keycloak-openshift:f25-ca \
    --env KEYCLOAK_USER=admin \
    --env KEYCLOAK_PASSWORD=secret123 \
    --env PROXY_ADDRESS_FORWARDING=true
--> Found Docker image 45e296f (4 weeks old) from Docker Hub for "frasertweedale/keycloak-openshift:f25-ca"

    * An image stream will be created as "keycloak-openshift:f25-ca" that will track this image
    * This image will be deployed in deployment config "keycloak-openshift"
    * Port 8080/tcp will be load balanced by service "keycloak-openshift"
      * Other containers can access this service through the hostname "keycloak-openshift"

--> Creating resources ...
    imagestream "keycloak-openshift" created
    deploymentconfig "keycloak-openshift" created
    service "keycloak-openshift" created
--> Success
    Run 'oc status' to view your app.

The app gets created immediately but it is not ready yet. The download of the image and deployment of the container (or pod in OpenShift / Kubernetes terminology) will proceed in the background.

After a little while (depending on how long it takes to download the ~300MB Docker image) oc status will show that the deployment is up and running:

% oc status
In project My Project (myproject) on server https://192.168.42.214:8443

svc/keycloak-openshift - 172.30.198.217:8080
  dc/keycloak-openshift deploys istag/keycloak-openshift:f25-ca 
    deployment #2 deployed 3 minutes ago - 1 pod

View details with 'oc describe <resource>/<name>' or list everything with 'oc get all'.

(In my case, the first deployment failed because the 10-minute timeout elapsed before the image download completed; hence deployment #2 in the output above.)

Creating a secure route

Now the Keycloak application is running, but we cannot reach it from outside the Keycloak project itself. In order to be able to reach it there must be a route. The oc create route command lets us create a route that uses TLS (so clients can authenticate the service). We will use the domain name keycloak.ipa.local. The public/private keypair and certificate have already been generated (how to do that is outside the scope of this article). The certificate was signed by the CA we added to the image earlier. The service name – visible in the oc status output above – is svc/keycloak-openshift.

% oc create route edge \
  --service svc/keycloak-openshift \
  --hostname keycloak.ipa.local \
  --key /home/ftweedal/scratch/keycloak.ipa.local.key \
  --cert /home/ftweedal/scratch/keycloak.ipa.local.pem
route "keycloak-openshift" created

Assuming there is a DNS entry pointing keycloak.ipa.local to the OpenShift cluster, and that the system trusts the CA that issued the certificate, we can now visit our Keycloak application:

% curl https://keycloak.ipa.local/
<!--
  ~ Copyright 2016 Red Hat, Inc. and/or its affiliates
  ~ and other contributors as indicated by the @author tags.
  ~
  ~ Licensed under the Apache License, Version 2.0 (the "License");
  ~ you may not use this file except in compliance with the License.
  ~ You may obtain a copy of the License at
  ~
  ~ http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
  -->
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>
<head>
    <meta http-equiv="refresh" content="0; url=/auth/" />
    <meta name="robots" content="noindex, nofollow">
    <script type="text/javascript">
        window.location.href = "/auth/"
    </script>
</head>
<body>
    If you are not redirected automatically, follow this <a href='/auth'>link</a>.
</body>
</html>

If you visit in a browser, you will be able to log in using the admin account credentials given in the KEYCLOAK_USER and KEYCLOAK_PASSWORD environment variables when the app was created. From there you can create and manage authentication realms, but that is beyond the scope of this article.

Conclusion

In this post I discussed how to run Keycloak in OpenShift, from bringing up an OpenShift cluster to building the Docker image and creating the application and route in OpenShift. I recounted that I found OpenShift Online unstable at the time I tried it, and that although oc cluster up did successfully bring up a cluster I had trouble getting the Docker and VM networks to talk to each other. Eventually I tried Minishift which worked well.

We saw that although there is no official Docker image for Keycloak in OpenShift, there is a Dockerfile that builds a working image. It is easy to further extend the image to add trust for private CAs.

Creating the Keycloak app in OpenShift, and adding the routes, is straightforward. There are a few important environment variables that must be set. The oc create route command was used to create a secure route to access the application from the outside.

We did not discuss how to set up Keycloak with a database for persisting configuration and user records. The deployment we created is ephemeral. This satisfied my needs for demonstration purposes but production deployments will require persistence. There are official JBoss Docker images that extend the base Keycloak image and add support for PostgreSQL, MySQL and MongoDB. I have not tried these but I’d suggest starting with one of these images if you are looking to do a production deployment. Keep in mind that these images may not include the changes that are required for deploying in OpenShift.

SE Linux for CentOS Part 3

Posted by Adam Young on September 02, 2017 01:02 AM

After the previous two days of debugging, Simo Sorce suggested that I needed to tell the OS to show all AVCs; some are hidden by default.

The problem is that not all AVCs are reported. We can disable this hiding.

First I needed to install setools:

sudo yum install setools-console

With that I could confirm that there were hidden AVCs:

sudo seinfo --stats | grep audit
Audit allow: 157 Dontaudit: 8036

I disabled the hiding of the AVCs (the default dontaudit behavior can be restored later by rebuilding the policy with sudo semodule --build):

sudo semodule --disable_dontaudit --build

And a bunch more AVCs now show up when I deploy a VM.  But… after a couple of iterations, it's obvious that the same errors keep showing up. Here's a sample:

type=AVC msg=audit(1504306852.970:3169): avc: denied { search } for pid=27918 comm="cat" name="27821" dev="proc" ino=275493 scontext=system_u:system_r:svirt_tcg_t:s0:c130,c773 tcontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c5 tclass=dir
type=AVC msg=audit(1504306852.984:3173): avc: denied { search } for pid=27929 comm="nsenter" name="27821" dev="proc" ino=275493 scontext=system_u:system_r:svirt_tcg_t:s0:c130,c773 tcontext=system_u:system_r:svirt_lxc_net_t:s0:c0,c5 tclass=dir
type=AVC msg=audit(1504306852.994:3174): avc: denied { search } for pid=27930 comm="pkill" name="498" dev="proc" ino=12227 scontext=system_u:system_r:svirt_tcg_t:s0:c130,c773 tcontext=system_u:system_r:udev_t:s0-s0:c0.c1023 tclass=dir
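With so many near-identical denials, it helps to collapse the log down to the unique (source type, target type, class, permission) tuples before reasoning about them. Here is a small throwaway helper along those lines — not part of the original workflow, just a sketch whose regex pulls out only the fields shown above:

```python
import re

# Pull the interesting fields out of an AVC denial line.
AVC_RE = re.compile(
    r"denied\s+\{ (?P<perm>[\w ]+) \}"          # permission(s) denied
    r".*scontext=\w+:\w+:(?P<stype>\w+):\S+"    # source type
    r"\s+tcontext=\w+:\w+:(?P<ttype>\w+):\S+"   # target type
    r"\s+tclass=(?P<tclass>\w+)"                # object class
)

def unique_denials(audit_text):
    """Collapse raw audit output to unique (stype, ttype, tclass, perm) tuples."""
    found = set()
    for line in audit_text.splitlines():
        m = AVC_RE.search(line)
        if m:
            found.add((m.group("stype"), m.group("ttype"),
                       m.group("tclass"), m.group("perm").strip()))
    return found
```

Feeding the three lines above through it yields just two distinct tuples (svirt_tcg_t against svirt_lxc_net_t, and against udev_t), which is a much smaller thing to stare at than the raw log.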


Looking at the policy generated by audit2allow:

#!!!! This avc is a constraint violation. You would need to modify the attributes of either the source or target types to allow this access.
#Constraint rule: 
# mlsconstrain dir { ioctl read lock search } ((h1 dom h2 -Fail-) or (t1 != { netlabel_peer_t openshift_t openshift_app_t sandbox_min_t sandbox_x_t sandbox_web_t sandbox_net_t svirt_t svirt_tcg_t svirt_lxc_net_t svirt_qemu_net_t svirt_kvm_net_t } -Fail-) ); Constraint DENIED

The line is commented out, which tells me I should not just blindly enable it.  At the bottom of the policy file I see the comment:

# Possible cause is the source user (system_u) and target user (unconfined_u) are different.
# Possible cause is the source role (system_r) and target role (unconfined_r) are different.
# Possible cause is the source level (s0:c130,c773) and target level (s0-s0:c0.c1023) are different.

Curious.
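Those three "possible cause" lines are just the two contexts compared field by field. An SELinux context is user:role:type:level, and the level portion may itself contain colons, so a naive split on ":" mangles it. A small illustration in plain Python — not an SELinux API, and the example contexts are taken from the denials above:

```python
def split_context(ctx):
    """Split an SELinux context string into (user, role, type, level).

    The level (e.g. "s0:c130,c773" or "s0-s0:c0.c1023") can itself
    contain colons, so only split on the first three separators.
    """
    user, role, type_, level = ctx.split(":", 3)
    return user, role, type_, level

src = split_context("system_u:system_r:svirt_tcg_t:s0:c130,c773")
tgt = split_context("system_u:system_r:udev_t:s0-s0:c0.c1023")

# A constraint check fails when one of these fields differs between
# source and target; here the type and the level differ.
fields = ("user", "role", "type", "level")
mismatches = [f for f, a, b in zip(fields, src, tgt) if a != b]
```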

SE Linux for CentOS Continued

Posted by Adam Young on September 02, 2017 12:52 AM

Trying to troubleshoot the issues from yesterday's SELinux errors.

Immediately after a new deploy of the manifests, I want to look at the context on the qemu file:

$ kubectl get pods libvirt-81sdh 
NAME            READY     STATUS    RESTARTS   AGE
libvirt-81sdh   2/2       Running   0          28s

Now to look at the file:

$ kubectl  exec libvirt-81sdh -c libvirtd -- ls  -lZ  /usr/local/bin/qemu-system-x86_64
-rwxrwxr-x. 1 root root system_u:object_r:unlabeled_t:s0 2814 Aug 10 00:48 /usr/local/bin/qemu-system-x86_64

Running restorecon on it, as the audit2allow output suggests:

[ayoung@drifloon kubevirt]$ kubectl  exec libvirt-81sdh -c libvirtd -- restorecon  /usr/local/bin/qemu-system-x86_64
[ayoung@drifloon kubevirt]$ kubectl  exec libvirt-81sdh -c libvirtd -- ls  -lZ  /usr/local/bin/qemu-system-x86_64
-rwxrwxr-x. 1 root root system_u:object_r:bin_t:s0 2814 Aug 10 00:48 /usr/local/bin/qemu-system-x86_64

unlabeled_t became bin_t.

Once again, attempt to deploy a vm, and see what AVCs we get:

$ kubectl apply -f cluster/vm-pxe.yaml 
vm "testvm" created
[ayoung@drifloon kubevirt]$ kubectl delete  -f cluster/vm-pxe.yaml 
vm "testvm" deleted

This is what the audit log showed:

type=AVC msg=audit(1504291091.397:2933): avc:  denied  { transition } for  pid=32273 comm="libvirtd" path="/usr/local/bin/qemu-system-x86_64" dev="dm-18" ino=31526884 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:svirt_tcg_t:s0:c322,c373 tclass=process

There were several lines like that, but they were identical except for the pid. What does audit2allow show?

#============= spc_t ==============

#!!!! The file '/usr/local/bin/qemu-system-x86_64' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/local/bin/qemu-system-x86_64
allow spc_t svirt_tcg_t:process transition;

Let's see if the additional parameters make a difference:

$ kubectl  exec libvirt-81sdh -c libvirtd -- restorecon  -R -v  /usr/local/bin/qemu-system-x86_64
$ kubectl  exec libvirt-81sdh -c libvirtd -- ls  -lZ  /usr/local/bin/qemu-system-x86_64
-rwxrwxr-x. 1 root root system_u:object_r:bin_t:s0 2814 Aug 10 00:48 /usr/local/bin/qemu-system-x86_64

The original labeling of system_u:object_r:unlabeled_t:s0 is now system_u:object_r:bin_t:s0, which is the same as it was after making the restorecon call without the additional parameters.

How about the additional line, the allow? I can apply it outside of the container by using audit2allow:

cat /tmp/audit.txt | audit2allow -a -M virt-policy
sudo semodule -i virt-policy.pp

Upon deploy, a similar error, with a different context:

type=AVC msg=audit(1504294173.446:3734): avc:  denied  { entrypoint } for  pid=6565 comm="libvirtd" path="/usr/local/bin/qemu-system-x86_64" dev="dm-18" ino=31526884 scontext=system_u:system_r:svirt_tcg_t:s0:c577,c707 tcontext=system_u:object_r:bin_t:s0 tclass=file

Running this through audit2allow generates

#============= svirt_tcg_t ==============

#!!!! WARNING: 'bin_t' is a base type.
allow svirt_tcg_t bin_t:file entrypoint;

While this is a pretty powerful rule, it might be appropriate for what we are doing with virt. Again, let's apply the policy and see what happens.

$ cat virt-policy-2.txt | audit2allow -a -M virt-policy-2
$ sudo semodule -i virt-policy-2.pp

Now a slew of errors, but different ones:

type=AVC msg=audit(1504294406.893:3797): avc:  denied  { write } for  pid=7236 comm="qemu-system-x86" path="pipe:[423417]" dev="pipefs" ino=423417 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1504294406.893:3797): avc:  denied  { write } for  pid=7236 comm="qemu-system-x86" path="pipe:[423417]" dev="pipefs" ino=423417 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:system_r:spc_t:s0 tclass=fifo_file
type=AVC msg=audit(1504294406.894:3798): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="ld.so.cache" dev="dm-18" ino=8388771 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file
type=AVC msg=audit(1504294406.894:3799): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3800): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3801): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3802): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3803): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3804): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3805): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.894:3806): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="lib64" dev="dm-18" ino=143 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.895:3807): avc:  denied  { read } for  pid=7236 comm="qemu-system-x86" name="libtinfo.so.6" dev="dm-18" ino=29360804 scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:object_r:unlabeled_t:s0 tclass=lnk_file
type=AVC msg=audit(1504294406.900:3808): avc:  denied  { sigchld } for  pid=21975 comm="docker-containe" scontext=system_u:system_r:svirt_tcg_t:s0:c550,c926 tcontext=system_u:system_r:container_runtime_t:s0 tclass=process

This process is iterative, and I had to go through it 10 times until I came up with a complete set of audit2allow-generated files. Here is the sum total of what was generated.

module virt-policy-2 1.0;

require {
	type svirt_tcg_t;
	type bin_t;
	class file entrypoint;
}

#============= svirt_tcg_t ==============

#!!!! WARNING: 'bin_t' is a base type.
allow svirt_tcg_t bin_t:file entrypoint;

module virt-policy-3 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	type spc_t;
	type container_runtime_t;
	class process sigchld;
	class lnk_file read;
	class fifo_file write;
	class file read;
}

#============= svirt_tcg_t ==============
allow svirt_tcg_t container_runtime_t:process sigchld;
allow svirt_tcg_t spc_t:fifo_file write;

#!!!! WARNING: 'unlabeled_t' is a base type.
allow svirt_tcg_t unlabeled_t:file read;
allow svirt_tcg_t unlabeled_t:lnk_file read;

module virt-policy-4 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	class file open;
}

#============= svirt_tcg_t ==============

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/etc/ld.so.cache' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /etc/ld.so.cache
allow svirt_tcg_t unlabeled_t:file open;

module virt-policy-5 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	class file getattr;
}

#============= svirt_tcg_t ==============

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/etc/ld.so.cache' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /etc/ld.so.cache
allow svirt_tcg_t unlabeled_t:file getattr;

module virt-policy-6 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	class file execute;
}

#============= svirt_tcg_t ==============

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/usr/lib64/libtinfo.so.6.0' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/lib64/libtinfo.so.6.0
allow svirt_tcg_t unlabeled_t:file execute;

module virt-policy-7 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	type spc_t;
	class fifo_file { getattr ioctl };
	class file { execute_no_trans write };
}

#============= svirt_tcg_t ==============
allow svirt_tcg_t spc_t:fifo_file { getattr ioctl };

#!!!! WARNING: 'unlabeled_t' is a base type.
allow svirt_tcg_t unlabeled_t:file { execute_no_trans write };

module virt-policy-8 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	type sysfs_t;
	class capability { setgid setuid };
	class file append;
	class filesystem getattr;
}

#============= svirt_tcg_t ==============
allow svirt_tcg_t self:capability { setgid setuid };
allow svirt_tcg_t sysfs_t:filesystem getattr;

#!!!! WARNING: 'unlabeled_t' is a base type.
allow svirt_tcg_t unlabeled_t:file append;

module virt-policy-9 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	class file ioctl;
	class dir read;
}

#============= svirt_tcg_t ==============
allow svirt_tcg_t unlabeled_t:dir read;

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/etc/sudoers' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /etc/sudoers
allow svirt_tcg_t unlabeled_t:file ioctl;

module virt-policy 1.0;

require {
	type svirt_tcg_t;
	type spc_t;
	class process transition;
}

#============= spc_t ==============

#!!!! The file '/usr/local/bin/qemu-system-x86_64' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/local/bin/qemu-system-x86_64
allow spc_t svirt_tcg_t:process transition;


module virt-policy-10 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	class capability { net_admin sys_resource };
	class file lock;
	class netlink_audit_socket create;
}

#============= svirt_tcg_t ==============
allow svirt_tcg_t self:capability { net_admin sys_resource };
allow svirt_tcg_t self:netlink_audit_socket create;

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/run/utmp' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /run/utmp
allow svirt_tcg_t unlabeled_t:file lock;
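Carrying ten separate modules around is clumsy. Since the module bodies are mostly allow rules, one option is to deduplicate the allow lines across all the generated .te files and compile a single module from them. A rough sketch of the idea — it only collects the allow lines, so the require block would still have to be rebuilt by hand (or by simply re-running the concatenated audit records through audit2allow once):

```python
import re

# A generated allow rule looks like: allow <src_type> <tgt_type>:<class> <perms>;
ALLOW_RE = re.compile(r"^allow\s+\S+\s+\S+:\S+\s+.+;$")

def merged_allow_rules(te_sources):
    """Collect the unique `allow` lines from several generated .te files."""
    rules = set()
    for text in te_sources:
        for raw in text.splitlines():
            line = raw.strip()
            if ALLOW_RE.match(line):
                rules.add(line)
    return sorted(rules)
```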

Obviously, using permissive mode would have been a shorter process. Let me restart the VM and try that. Here's what I generate after one iteration:

module kubevirt-policy 1.0;

require {
	type unlabeled_t;
	type svirt_tcg_t;
	type container_runtime_t;
	class capability audit_write;
	class unix_stream_socket connectto;
	class file entrypoint;
	class netlink_audit_socket nlmsg_relay;
}

#============= svirt_tcg_t ==============

#!!!! The file '/run/docker.sock' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /run/docker.sock
allow svirt_tcg_t container_runtime_t:unix_stream_socket connectto;
allow svirt_tcg_t self:capability audit_write;
allow svirt_tcg_t self:netlink_audit_socket nlmsg_relay;

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/usr/local/bin/qemu-system-x86_64' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/local/bin/qemu-system-x86_64
allow svirt_tcg_t unlabeled_t:file entrypoint;

And… we start into the same pattern. It takes several runs to get to a set of policies that run cleanly. It seems some of the earlier AVCs mask later ones, and running in permissive mode only reports the first of several. Needless to say, the policy for running a VM via KubeVirt is going to require some scrutiny.

And even then, the VMs still fail to deploy. Disable SELinux and they run. This mystery continues.

SELinux for Kubevirt on Centos

Posted by Adam Young on August 31, 2017 04:51 PM

Without disabling SELinux enforcement, an attempt to deploy a VM generates the following audit message:

type=AVC msg=audit(1504194626.938:877): avc: denied { transition } for pid=9574 comm="libvirtd" path="/usr/local/bin/qemu-system-x86_64" dev="dm-19" ino=31526884 scontext=system_u:system_r:spc_t:s0 tcontext=system_u:system_r:svirt_tcg_t:s0:c408,c741 tclass=process

Running this through audit2allow provides a little more visibility into the problem:

#============= spc_t ==============

#!!!! The file '/usr/local/bin/qemu-system-x86_64' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/local/bin/qemu-system-x86_64
allow spc_t svirt_tcg_t:process transition;

This is probably due to running as much of the virtualization machinery in containers. /usr/local/bin/qemu-system-x86_64 comes from inside the libvirt container. It does not exist on the base OS filesystem. Thus, just running restorecon won’t do much.

While it is tempting to make this change and hope that everything works, I've learned that SELinux is diligent enough that, if one thing fails, it is usually a few related things that are actually failing, and just dealing with the first is not enough. Setting SELinux into permissive mode and deploying a VM produces a much longer list of AVCs. Running these through audit2allow generates the following:

#============= spc_t ==============

#!!!! The file '/usr/local/bin/qemu-system-x86_64' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/local/bin/qemu-system-x86_64
allow spc_t svirt_tcg_t:process transition;

#============= svirt_lxc_net_t ==============
allow svirt_lxc_net_t svirt_tcg_t:dir { getattr open read search };
allow svirt_lxc_net_t svirt_tcg_t:file { getattr open read };

#============= svirt_tcg_t ==============
allow svirt_tcg_t cgroup_t:file { getattr open write };

#!!!! The file '/dev/ptmx' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /dev/ptmx
allow svirt_tcg_t container_devpts_t:chr_file open;

#!!!! The file '/run/docker.sock' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /run/docker.sock
allow svirt_tcg_t container_runtime_t:unix_stream_socket connectto;
allow svirt_tcg_t self:capability { audit_write dac_override net_admin setgid setuid sys_ptrace sys_resource };
allow svirt_tcg_t self:netlink_audit_socket { create nlmsg_relay };
allow svirt_tcg_t spc_t:fifo_file { getattr ioctl write };
allow svirt_tcg_t svirt_lxc_net_t:dir search;
allow svirt_tcg_t svirt_lxc_net_t:file { getattr open read };
allow svirt_tcg_t svirt_lxc_net_t:lnk_file read;
allow svirt_tcg_t sysfs_t:filesystem getattr;

#!!!! WARNING: 'unlabeled_t' is a base type.
allow svirt_tcg_t unlabeled_t:dir { add_name mounton read remove_name write };

#!!!! WARNING: 'unlabeled_t' is a base type.
#!!!! The file '/usr/local/bin/qemu-system-x86_64' is mislabeled on your system.  
#!!!! Fix with $ restorecon -R -v /usr/local/bin/qemu-system-x86_64
allow svirt_tcg_t unlabeled_t:file { append create entrypoint execute execute_no_trans getattr ioctl lock open read unlink write };
allow svirt_tcg_t unlabeled_t:lnk_file read;


While audit2allow will get you going again, it often produces a policy that is too permissive. Before blindly accepting this new policy, we need to look through it and figure out if there are better policy rules that achieve the same ends.

Many of the rules look sane on a first glance. For example,

allow svirt_tcg_t container_runtime_t:unix_stream_socket connectto;

allowing a process running as svirt_tcg_t to connect to a unix domain socket labeled container_runtime_t is obviously required to perform libvirt-based tasks from inside a container.

I suspect that the right solution involves modifying the libvirt container definition to properly set the SELinux context internally, as well as possibly adding something to the manifest for the daemonset to honor/carry forward the SELinux values.

Security ROI isn't impossible, we suck at measuring

Posted by Josh Bressers on August 30, 2017 04:11 PM
As of late I've been seeing a lot of grumbling that security return on investment (ROI) is impossible. This is of course nonsense. Understanding your ROI is one of the most important things you can do as a business leader. You have to understand if what you're doing makes sense. By the very nature of business, some of the things we do have more value than other things. Some things even have negative value. If we don't know which things are the most important, we're just doing voodoo security.

H. James Harrington once said
Measurement is the first step that leads to control and eventually to improvement. If you can’t measure something, you can’t understand it. If you can’t understand it, you can’t control it. If you can’t control it, you can’t improve it.
Anyone paying attention to the current state of security will probably shed a tear over that statement. The foundation of the statement results in this truth: we can't control or improve the state of security today. As much as we all like to talk about what's wrong with security and how to fix it, the reality is we don't really know what's broken, which of course means we have no idea how to fix anything.

Measuring security isn't impossible, it's just really hard today. It's really hard because we don't really understand what security is in most instances. Security isn't one thing, it's a lot of little things that don't really have anything to do with each other but we clump them together for some reason. We like to build teams of specialized people and call them the security team. We pretend we're responsible for a lot of little unrelated activities but we often don't have any real accountability. The reality is this isn't a great way to do something that actually works, it's a great way to have a lot of smart people fail to live up to their true potential. The best security teams in the world today aren't good at security, they're just really good at marketing themselves so everyone thinks they're good at security.

Security needs to be a part of everything, not a special team that doesn't understand what's happening outside their walls. Think for a minute what an organization would look like if we split groups up by what programming language they knew. Now you have all the python people in one corner and all the C guys in the other corner. They'll of course have a long list of reasons why they're smarter and better than the other group (we'll ignore the perl guys down in the basement). Now if there is a project that needs some C and some python they would have to go to each group and get help. Bless the soul of anyone who needs C and python working together in their project. You know this would just be a massive insane turf war with no winner. It's quite likely the project would never work because the groups wouldn't have a huge incentive to work together. I imagine you can see the problem here. You have two groups that need to work together without proper incentive to actually work together.

Security is a lot like this. Does having a special secure development group outside of the development group make sense? Why does it make sense to have a security operations group that isn't just part of IT? If you're not part of a group do you have an incentive for the group to succeed? If I can make development's life so difficult they can't possibly succeed that's development's problem, not yours. You have no incentive to be a reasonable member of the team. The reality is you're not a member of the team at all. Your incentive is to protect your own turf, not help anyone else.

I'm going to pick on Google's Project Zero for a minute here. Not because they're broken, but because they're really really good at what they do. Project Zero does research into how to break things, then they work with the project they broke to make it better. If this was part of a more traditional security thinking group, Project Zero would do research, build patches, then demand everyone use whatever it is they built and throw a tantrum if they don't. This would of course be crazy, unwelcome, and a waste of time. Project Zero has a razor focus on research. More importantly though, they work with other groups when it's time to get the final work done. Their razor focus and ability to work with others gives them a pretty clear metric they can see. How many flaws did they find? How many got fixed? How many new attack vectors did they create? This is easy to measure. Of course some groups won't work with them, but in that case they can publish their advisories and move on. There's no value in picking long horrible fights.

So here's the question you have to ask yourself. How much of what you do directly affects the group you're a part of? I don't mean things like enforcing compliance, compliance is a cost like paying for electricity, think bigger here about things that generate revenue. If you're doing a project with development, do your decisions affect them or do they affect you? If your decisions affect development you probably can't measure what you do. You can really only measure things that affect you directly. Even if you think you can measure someone else, you'll never be as good as they are. And honestly, who cares what someone else is doing, measure yourself first.

It's pretty clear we don't actually understand what we like to call "security" because we have no idea how to measure it. If we did understand it, we could measure it. According to H. James Harrington, we can't fix what we can't measure. I think given everything we've seen over the past few years, this is quite accurate. We will never fix our security problems without first measuring our security ROI.

I'll spend some time in the next few posts discussing how to measure what we do with actual examples. It's not as hard as it sounds.

Episode 60 - The official blockchain episode

Posted by Open Source Security Podcast on August 30, 2017 01:30 PM
Josh and Kurt talk about the eclipse and blockchain.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5690282/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



Helicopter security

Posted by Josh Bressers on August 28, 2017 02:39 PM
After my last post about security spending, I was thinking about how most security teams integrate into the overall business (hint: they don't). As part of this thought experiment I decided to compare traditional security to something that in modern times has come to be called helicopter parenting.

A helicopter parent is someone who won't let their kids do anything on their own. These are the people you hear about who follow their child to college, to sports practice. They yell at teachers and coaches for not respecting how special the child is. The kids are never allowed to take any risks because risk is dangerous and bad. If they climb the tree, while it could be a life altering experience, they could also fall and get hurt. Skateboarding is possibly the most dangerous thing anyone could ever do! We better make sure nothing bad can ever happen.

It's pretty well understood now that this sort of attitude is terrible for the children. They must learn to do things on their own, it's part of the development process. Taking risks and failing is an extremely useful exercise. It's not something we think about often, but you have to learn to fail. Failure is hard to learn. The children of helicopter parents do manage to learn one lesson they can use in their life, they learn to hide what they do from their parents. They get extremely good at finding ways to get around all their rules and restrictions. To a degree we all had this problem growing up. At some point we all wanted to do something our parents didn't approve of, which generally meant we did it anyway, we just didn't tell our parents. Now imagine a universe where your parents let you do NOTHING, you're going to be hiding literally everything. Nobody throughout history has ever accepted the fact that they can do nothing, they just make sure the authority doesn't know about it. Getting caught is still better than doing nothing much of the time.

This brings us to traditional security. Most security teams don't try to work with the business counterparts. Security teams often think they can just tell everyone else what to do. Have you ever heard the security team ask "what are you trying to do?" Of course not. They always just say "don't do that" or maybe "do it this way" then move on to tell the next group how to do their job. They don't try to understand what you're doing and why you are doing it. It's quite literally not their job to care what you're doing, which is part of the problem. Things like phishing tests are used to belittle, not teach (they have no value as teaching tools, but we won't discuss that today). Many of the old school security teams see their job as risk aversion, not risk management. They are helicopter security teams.

Now as we know from children, if you prevent someone from doing anything they don't become your obedient servant, they go out of their way to make sure the authority has no idea what's going on. This is basically how shadow IT became a thing. It was far easier to go around the rules than work with the existing machine. Helicopter security is worse than nothing. At least with nothing you can figure out what's going on by asking questions and getting honest answers. In a helicopter security environment information is actively hidden because truth will only get you in trouble.

Can we fix this?
I don't know the answer to this question. A lot of tech people I see (not just security) are soldiers from the last war. With the way we see cloud transforming the universe there are a lot of people who are still stuck in the past. We often hear it's hard to learn new things but it's more than that. Technology, especially security, never stands still. It used to move slow enough you could get by for a few years on old skills, but we're in the middle of disruptive change right now. If you're not constantly questioning your existing skills and way of thinking you're already behind. Some people are so far behind they will never catch up. It's human nature to double down on the status quo when you're not part of the change. Helicopter security is that doubling down.

It's far easier to fight change and hope your old skills will remain useful than it is to learn a new skill. Everything we see in IT today is basically a new skill. Today the most useful thing you can know is how to learn quickly, what you learned a few months ago could be useless today, it will probably be useless in the near future. We are actively fighting change like this in security today. We try to lump everything together and pretend we have some sort of control over it. We never really had any control, it's just a lot more obvious now than it was before. Helicopter security doesn't work, no matter how bad you want it to.

The Next Step
The single biggest thing we need to start doing is measure ourselves. Even if you don't want to learn anything new you can at least try to understand what we're doing today that actually works, which things sort of work, and of course the things that don't work at all. In the next few posts I'm going to discuss how to measure security as well as how to avoid voodoo security. It's a lot harder to justify helicopter security behavior once we understand which of our actions work and which don't.

Deploying Kubevirt on Origin Master

Posted by Adam Young on August 25, 2017 03:14 PM

Now that I have a functional OpenShift Origin built from source, I need to deploy KubeVirt on top of it.

Here are my notes. This is rough, and not production quality yet, but should get you started.

Prerequisites

As I said in that last post, in order to build KubeVirt, I had to upgrade to a later version of Go (the RPM had 1.6; now I have 1.8).

Docker

In order to build the manifests with specific versions, I overrode some config options as I described here.

Specifically, I used docker_tag=devel to make sure I didn’t accidentally download the released versions from Docker hub, as well as set the master_ip address.

To generate the docker images:

make docker

make manifests


Config Changes

In order to make the configuration changes, and have them stick:

oc cluster down

Edit the master config:

sudo vi /var/lib/origin/openshift.local.config/master/master-config.yaml

Under:

networkConfig:
  clusterNetworkCIDR: 10.128.0.0/14

add:

externalIPNetworkCIDRs: ["0.0.0.0/0"]

Then bring the cluster up with:

oc cluster up --use-existing-config --loglevel=5 --version=413eb73 --host-data-dir=/var/lib/origin/etcd/ | tee /tmp/oc.log 2>&1

Networking

Got an error showing:

Ensure that access to ports tcp/8443, udp/53 and udp/8053 is allowed on 192.168.122.233.

[ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=8443/tcp
 [sudo] password for ayoung:
 [ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=8053/udp
 success
 [ayoung@drifloon origin]$ sudo firewall-cmd --zone=public --add-port=53/udp

Those won’t persists as is, so:

[ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=8443/tcp
 success
 [ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=53/udp
 success
 [ayoung@drifloon origin]$ sudo firewall-cmd --permanent --zone=public --add-port=8053/udp
 success

Log in as admin

Redeploy the cluster and then

$ oc login -u system:admin
 Logged into "https://127.0.0.1:8443" as "system:admin" using existing credentials.

You have access to the following projects and can switch between them with 'oc project ':

default
 kube-public
 kube-system
 * myproject
 openshift
 openshift-infra

Using project "myproject".
 [ayoung@drifloon kubevirt]$ oc project kube-system
 Now using project "kube-system" on server "https://127.0.0.1:8443".

Deploying Manifests

Need Updated Manifests from this branch.

To apply the generated manifests:

$ for MANIFEST in `ls manifests/*yaml` ; do kubectl apply -f $MANIFEST ; done

message: 'No nodes are available that match all of the following predicates::
  MatchNodeSelector (1).'
reason: Unschedulable

The head node is not schedulable; mark it schedulable:

oc adm manage-node localhost --schedulable=true

And now….

for MANIFEST in `ls manifests/*yaml` ; do kubectl delete -f $MANIFEST ; done

wait a bit

for MANIFEST in `ls manifests/*yaml` ; do kubectl apply -f $MANIFEST ; done
$ kubectl get pods
 NAME READY STATUS RESTARTS AGE
 haproxy-858199412-m78n5 0/1 CrashLoopBackOff 8 16m
 kubevirt-cockpit-demo-4250553349-gm8qm 0/1 CrashLoopBackOff 224 18h
 spice-proxy-1193136539-gr7b3 0/1 CrashLoopBackOff 225 18h
 virt-api-4068750737-j7bwj 1/1 Running 0 18h
 virt-controller-3722000252-bsbsr 1/1 Running 0 18h

Why are those three crashing? Permissions.

$ kubectl logs kubevirt-cockpit-demo-4250553349-gm8qm
 cockpit-ws: /etc/cockpit/ws-certs.d/0-self-signed.cert: Failed to open file '/etc/cockpit/ws-certs.d/0-self-signed.cert': Permission denied
 kubectl logs haproxy-858199412-hqmk1
 <7>haproxy-systemd-wrapper: executing /usr/local/sbin/haproxy -p /haproxy/run/haproxy.pid -f /usr/local/etc/haproxy/haproxy.cfg -Ds
 [ALERT] 228/180509 (15) : [/usr/local/sbin/haproxy.main()] Cannot create pidfile /haproxy/run/haproxy.pid
 [ayoung@drifloon kubevirt]$ kubectl logs spice-proxy-1193136539-gr7b3
 FATAL: Unable to open configuration file: /home/proxy/squid.conf: (13) Permission denied

virt-api runs as a strange user:

1000050+ 14464 14450 0 Aug16 ? 00:00:01 /virt-api --port 8183 --spice-proxy 192.168.122.233:3128
1000050+ is, I am guessing, a UID made up by Kubernetes.

Looks like I am tripping over the fact that OpenShift security policy by default prohibits you from running as known users (thanks claytonc).

A pull request has been merged to change these in KubeVirt.

Service Accounts:

oc create serviceaccount -n kube-system kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt

modify the libvirt and virt-handler manifests like so (this is in the version from the branch above):

     spec:
+       serviceAccountName: kubevirt
       containers:
       - name: virt-handler

and

     spec:
+       serviceAccountName: kubevirt
       hostNetwork: true
       hostPID: true
       hostIPC: true
       securityContext:
         runAsUser: 0

OK, a few more notes: we need manifest changes so that the various resources end up in the kube-system namespace, and run as the appropriate kubevirt or kubevirt-admin users. See this pull request.

SecurityContextConstraints

We have to manually apply the permissions.yaml file, then add the SCCs to get the daemon pods to schedule:

oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt-admin

 

You could also run as the default service account and just run:

oc  adm policy add-scc-to-user privileged -n kube-system -z default

But that is not a good long term strategy.

 

In order to launch a VM, it turns out we need an eth1: the default network setup by the libvirt image assumes it is there. The easiest way to get one is to modify the VM to use a second network card, which requires restarting the cluster. You can also set the name of the interface in config-local.sh to the appropriate network device/connection for your system using the primary_nic value.

Disable SELinux

Try to deploy the VM with:

kubectl apply -f cluster/vm.yaml

Check the status using:

 kubectl get vms testvm -o yaml

VM failed to deploy.

Check the libvirt container log:

oc logs libvirt-ztt0w -c libvirtd 
2017-08-25 13:03:07.253+0000: 5155: error : qemuProcessReportLogError:1845 : internal error: process exited while connecting to monitor: libvirt: error : cannot execute binary /usr/local/bin/qemu-system-x86_64: Permission denied

For now, disable SELinux. In the future, we’ll need a custom SELinux policy to allow this.

 sudo setenforce 0

 

iSCSI vs PXE

Finally, the iscsi pod defined in the manifests trips over a bunch of OpenShift permissions hardening issues.  Prior to working those out, I just wanted to run a PXE bootable VM, so I copied the vm.yaml to vm-pxe.yaml and applied that.

Lesson Learned: SecurityContextConstraints can’t be in manifests.

Using the for loop for the manifests won’t work long term. We’ll need to apply the permissions.yaml file first, then run the two oc commands to add the users to the SCCs, and finally apply the rest of the manifests. Adding users to SCCs cannot be done via a manifest apply, as it has to modify an existing resource. The ServiceAccounts need to be created and added to the SCCs prior to any of the daemonset or deployment manifests, or the node selection criteria will not be met and no pods will be scheduled.
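Putting that ordering into a script might look like the following sketch. The manifest paths and service-account names come from the notes above; deploy_kubevirt is my own name, and it dry-runs by default (every command is echoed) so it can be reviewed before touching a live cluster:

```shell
# deploy_kubevirt: apply permissions first, add the service accounts to the
# privileged scc, then apply everything else. Dry run by default; set RUN=
# (empty) in the environment to actually execute against a live cluster.
deploy_kubevirt() {
  RUN=${RUN-echo}
  $RUN oc apply -f manifests/permissions.yaml
  $RUN oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt
  $RUN oc adm policy add-scc-to-user privileged -n kube-system -z kubevirt-admin
  for MANIFEST in manifests/*.yaml; do
    if [ "$MANIFEST" != manifests/permissions.yaml ]; then
      $RUN kubectl apply -f "$MANIFEST"
    fi
  done
}
deploy_kubevirt
```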

 

 

Running OpenShift Origin built from source

Posted by Adam Young on August 23, 2017 02:52 PM

Kubernetes is moving from Third Party Resources to the Aggregated API Server.  In order to work with this and continue to deploy on OpenShift Origin, we need to move from working with the shipped and stable version that is in Fedora 26 to the development version in git.  Here are my notes to get it up and running.

Process

It took a couple of tries to realize that the go build process needs a fairly bulky virtual machine. I ended up using one with 8 GB RAM and a 50 GB disk. In order to minimize churn, I also went with a CentOS 7 deployment.

Once I had a running VM here are the configuration changes I had to make.

Use nmcli to bring the eth0 connection up (nmcli c up eth0) and set ONBOOT=yes. This can also be done by editing the network config files or using older tools.

yum update -y
yum groupinstall "Development Tools"
yum install -y origin-clients

Ensure that the Docker daemon is running with the argument --insecure-registry 172.30.0.0/16 by editing the file /etc/sysconfig/docker. My OPTIONS line looks like this:

OPTIONS='--selinux-enabled --log-driver=journald --signature-verification=false --insecure-registry 172.30.0.0/16'

Follow the directions from here in order to set up the development environment.

cd $GOPATH/src/github.com/openshift/origin
hack/env hack/build-base-images.sh
OS_BUILD_ENV_PRESERVE=_output/local/releases hack/env make release

Note that the KubeVirt code I want to run on top of this requires a later version of go, so I upgraded to go1.8.3 linux/amd64 via the tarball install method.

The hash that gets generated by the build depends on when you run it. To see the images, run:

docker images

I expand the terminal to full screen, as there are lots of columns of data. Here is a subset:

REPOSITORY TAG IMAGE ID CREATED SIZE
openshift/hello-openshift 413eb73 911092241b5a 35 hours ago 5.84 MB
openshift/hello-openshift latest 911092241b5a 35 hours ago 5.84 MB
openshift/openvswitch 413eb73 c53aae019d81 35 hours ago 1.241 GB
openshift/openvswitch latest c53aae019d81 35 hours ago 1.241 GB
openshift/node 413eb73 af6135fc50c9 35 hours ago 1.239 GB
openshift/node latest af6135fc50c9 35 hours ago 1.239 GB

The tag is the second column. This is what I use in order to install. I don’t use “latest”, as that changes over time, and it might accidentally succeed by using a remote image when the local build failed.
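Since picking the tag out of that table by eye is error-prone, a small filter can do it. This sketch (build_tag is my own name) assumes the `docker images` column layout shown above and uses openshift/node as the representative image:

```shell
# build_tag: read `docker images` output on stdin and print the non-"latest"
# tag for openshift/node -- i.e. the hash tag produced by the latest build.
build_tag() {
  awk '$1 == "openshift/node" && $2 != "latest" { print $2; exit }'
}
# usage: docker images | build_tag
```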

I want to be able to edit configuration values. I also want the etcd store to persist across reboots. Thus,

sudo mkdir /var/lib/origin/etcd
sudo chown ayoung:ayoung /var/lib/origin/etcd

And then my final command line to bring up the cluster is:

oc cluster up --use-existing-config --loglevel=5 --version=413eb73 --host-data-dir=/var/lib/origin/etcd/ | tee /tmp/oc.log 2>&1

Notes:

Below are some of my troubleshooting notes. I am going to leave them in here so they show up in future searches for people who have the same problems. They are rough, and you don’t need to read them.

hack/env make release errored

[WARNING] Copying _output/local/releases from the container failed!
[WARNING] Error response from daemon: lstat /var/lib/docker/devicemapper/mnt/fb199307b2f95649066c42f55e5487c66eb3421e5407c8bd6d2f0a7058bc8cd5/rootfs/go/src/github.com/openshift/origin/_output/local/releases: no such file or directory

Tried with OS_BUILD_ENV_PRESERVE=_output/local but no difference.

Should have been
OS_BUILD_ENV_PRESERVE=_output/local/releases hack/env make release

This did not work (basename error)
export PATH="${PATH}:$( source hack/lib/init.sh; echo "${OS_OUTPUT_BINPATH}/$( os::util::host_platform )/" )"

But I was able to do
export PATH=$PATH:$PWD/_output/local/bin/linux/amd64/

and then

oc cluster up --version=latest

failed due to docker error
-- Checking Docker daemon configuration ... FAIL
Error: did not detect an --insecure-registry argument on the Docker daemon
Solution:

used https://wiki.centos.org/SpecialInterestGroup/PaaS/OpenShift-Quickstart

To fix

oc seems to be running OK now. But not using my git commit

Told to run:

Then

oc cluster up --version=8d96d48

GSoC 2017 - Mentor Report from 389 Project

Posted by William Brown on August 23, 2017 02:00 PM

GSoC 2017 - Mentor Report from 389 Project

This year I have had the pleasure of being a mentor for the Google Summer of Code program, as part of the Fedora Project organisation. I was representing the 389 Directory Server Project and offered students the opportunity to work on our command line tools written in python.

Applications

From the start we had a large number of really talented students apply to the project. One of the hardest parts of the process was choosing a single student, given that I wanted to mentor all of them. Sadly I only have so many hours in the day, so we chose Ilias, a student from Greece. What really stood out was his interest in learning about the project, and his desire to really be part of the community after the project concluded.

The project

The project was very deliberately “loose” in its specification. Rather than giving Ilias a fixed goal of “you will implement X, Y and Z”, I chose to set a “broad and vague” task. Initially I asked him to investigate a single area of the code (the MemberOf plugin). As he investigated this, he started to learn more about the server, ask questions, and open doors for himself to the next tasks of the project. As these smaller questions and self discoveries stacked up, I found myself watching Ilias start to become a really complete developer, who could be called a true part of our community.

Ilias’ work was exceptional, and he has documented it in his final report here.

Since his work is complete, he is now free to work on any task that takes his interest, and he has picked a good one! He has now started to dive deep into the server internals, looking at part of our backend internals and how we dump databases from id2entry to various output formats.

What next?

I will be participating next year. Sadly, I think the python project opportunities may be more limited, as we have to finish many of these tasks to release our new CLI toolset. This is almost a shame, as the python components are a great place to start: they ease a new contributor into the broader concepts of LDAP and the project structure as a whole.

Next year I really want to give this opportunity to an under-represented group in tech (female, POC, etc.). I personally have been really inspired by Noriko and I hope to have the opportunity to pass on her lessons to another aspiring student. We need more engineers like her in the world, and I want to help create that future.

Advice for future mentors

Mentoring is not for everyone. It’s not a task where you can just send a couple of emails a day and be done.

Mentoring is a process that requires engagement with the student, and communication and the relationship are key to this. What worked well was meeting early in the project and working out what communication worked best for us. We found that email questions and responses (given we are on nearly opposite sides of the Earth) worked well, along with IRC conversations to help clear up any other questions. It would not be uncommon for me to spend at least 1 or 2 hours a day working through emails from Ilias and discussions on IRC.

A really important aspect of this communication is how you do it. You have to balance positive communication and encouragement with criticism that is constructive and helpful. Empathy is a super important part of this equation.

My number one piece of advice would be that you need to create an environment where questions are encouraged and welcome. You can never be dismissive of questions. If ever you dismiss a question as “silly” or “dumb”, you will hinder a student from wanting to ask more questions. If you can’t answer the question immediately, send a response saying “hey I know this is important, but I’m really busy, I’ll answer you as soon as I can”.

Over time you can use these questions to teach lessons that help the student make their own discoveries. For example, when Ilias would ask how something worked, I would structure my response in the way I approached the problem. I would send back links to code, my thoughts, and how I arrived at the conclusion. This not only answered the question but gave a subtle lesson in how to research our codebase to arrive at your own solutions. After a few of these emails, I’m sure that Ilias has become self-sufficient in his research of the code base.

Another valuable skill is that over time you can help build confidence through these questions. To start with, Ilias would ask “how to implement” something, and I would answer. Over time, he would start to provide ideas on how to implement a solution, and I would say “X is the right one”. As time went on I started to answer his questions with “What do you think is the right solution and why?”. These exchanges and justifications have (I hope) helped him become more confident in his ideas, the presentation of them, and the justification of his solutions. It’s led to this excellent exchange on our mailing lists, where Ilias is discussing the solutions to a problem with the broader community, and working to a really great answer.

Final thoughts

This has been a great experience for myself and Ilias, and I really look forward to helping another student next year. I’m sure that Ilias will go on to do great things, and I’m happy to have been part of his journey.

Spend until you're secure

Posted by Josh Bressers on August 23, 2017 12:20 PM
I was watching a few Twitter conversations about purchasing security last week and had yet another conversation about security ROI. This has me thinking about what we spend money on. In many industries we can spend our way out of problems, not all problems, but a lot of problems. With security if I gave you a blank check and said "fix it", you couldn't. Our problem isn't money, it's more fundamental than that.

Spend it like you got it
First let's think about how some problems can be solved with money. If you need more electricity capacity, or more help during a busy time, or more computing power, it's really easy to add capacity. You need more compute power, you can either buy more computers or just spend $2.15 in the cloud. If you need to dig a big hole, for a publicity stunt on Black Friday, you just pay someone to dig a big hole. It's not that hard.

This doesn't always work though, if you're building a new website, you probably can't buy your way to success. If a project like this falls behind it can be very difficult to catch back up. You can however track progress which I would say is at least a reasonable alternative. You can move development to another group or hire a new consultant if the old one isn't living up to expectations.

More Security
What if we need “more” security? How can we buy our way into more security for our organization? I’d start by asking the question: can we show any actual value for our current security investment? If you stopped spending money on security tomorrow, do you know what the results would be? If you stopped buying toilet paper for your company tomorrow, you can probably understand what would happen (if you have a good facilities department, I bet they already know the answer to this).

This is a huge problem in many organizations. If you don't know what would happen if you lowered or increased your security spending you're basically doing voodoo security. You can imagine many projects and processes as having a series of inputs that can be adjusted. Things like money, time, people, computers, the list could go on. You can control these variables and have direct outcomes on the project. More people could mean you can spend less money on contractors, more computers could mean less time spent on rendering or compiling. Ideally you have a way to find the optimal levels for each of these variables resulting in not only a high return on investment, but also happier workers as they can see the results of their efforts.

We can't do this with security today because security is too broad. We often don't know what would happen if we add more staff, or more technology.

Fundamental fundamentals
So this brings us to why we can't spend our way to security. I would argue there are two real problems here. The first being "security" isn't a thing. We pretend security is an industry that means something but it's really a lot of smaller things we've clumped together in such a way that ensures we can only fail. I see security teams claim to own anything that has the word security attached to it. They claim ownership of projects and ideas, but then they don't actually take any actions because they're too busy or lack the skills to do the work. Just because you know how to do secure development doesn't automatically make you an expert at network security. If you're great at network security it doesn't mean you know anything about physical security. Security is a lot of little things, we have to start to understand what those are and how to push responsibility to respective groups. Having a special application security team that's not part of development doesn't work. You need all development teams doing things securely.

The second problem is we don't measure what we do. How many security teams tell IT they have to follow a giant list of security rules, but they have no idea what would happen if one or more of those rules were rolled back? Remember when everyone insisted we needed to use complex passwords? Now that's considered bad advice and we shouldn't make people change their passwords often. It's also a bad idea to insist they use a variety of special characters now. How many millions have been wasted on stupid password rules? The fact that we changed the rules without any fanfare means there was no actual science behind the rules in the first place. If we even tried to measure this I suspect we would have known YEARS ago that it was a terrible idea. Instead we just kept doing voodoo security. How many more of our rules do you think will end up being rolled back in the near future because they don't actually make sense?

If you're in charge of a security program the first bit of advice I'd give out is to look at everything you own and get rid of whatever you can. Your job isn't to do everything, figure out what you have to do, then do it well. One project well done is far better than 12 half finished. The next thing you need to do is figure out how much whatever you do costs, and how much benefit it creates. If you can't figure out the benefit, you can probably stop doing it today. If it costs more than it saves, you can stop that too. We must have a razor focus if we're to understand what our real problems are. Once we understand the problems we can start to solve them.

Customizing the KubeVirt Manifests

Posted by Adam Young on August 21, 2017 05:02 PM

My cloud may not look like your cloud. The contract between the application deployment and the Kubernetes installation is a set of manifest files that guide Kubernetes in selecting, naming, and exposing resources. In order to make the generation of the Manifests sane in KubeVirt, we’ve provided a little bit of build system support.

The manifest files are templatized in a jinja style. I say style because the actual template string replacement is done using simple bash scripting. Regardless of the mechanism, it should not be hard for a developer to understand what happens. I’ll assume that you have your source code checked out in $GOPATH/src/kubevirt.io/kubevirt/

The template files exist in the manifests subdirectory. Mine looks like this:

haproxy.yaml.in             squid.yaml.in            virt-manifest.yaml.in
iscsi-demo-target.yaml.in   virt-api.yaml.in         vm-resource.yaml.in
libvirt.yaml.in             virt-controller.yaml.in
migration-resource.yaml.in  virt-handler.yaml.in

The simplest way to generate a set of actual manifest files is to run make manifests

make manifests
./hack/build-manifests.sh
$ ls -l manifests/*yaml
-rw-rw-r--. 1 ayoung ayoung  672 Aug 21 10:17 manifests/haproxy.yaml
-rw-rw-r--. 1 ayoung ayoung 2384 Aug 21 10:17 manifests/iscsi-demo-target.yaml
-rw-rw-r--. 1 ayoung ayoung 1707 Aug 21 10:17 manifests/libvirt.yaml
-rw-rw-r--. 1 ayoung ayoung  256 Aug 21 10:17 manifests/migration-resource.yaml
-rw-rw-r--. 1 ayoung ayoung  709 Aug 21 10:17 manifests/squid.yaml
-rw-rw-r--. 1 ayoung ayoung  832 Aug 21 10:17 manifests/virt-api.yaml
-rw-rw-r--. 1 ayoung ayoung  987 Aug 21 10:17 manifests/virt-controller.yaml
-rw-rw-r--. 1 ayoung ayoung  954 Aug 21 10:17 manifests/virt-handler.yaml
-rw-rw-r--. 1 ayoung ayoung 1650 Aug 21 10:17 manifests/virt-manifest.yaml
-rw-rw-r--. 1 ayoung ayoung  228 Aug 21 10:17 manifests/vm-resource.yaml

Looking at the difference between, say the virt-api template and final yaml file:

$ diff -u manifests/virt-api.yaml.in manifests/virt-api.yaml
--- manifests/virt-api.yaml.in	2017-07-20 13:29:00.532916101 -0400
+++ manifests/virt-api.yaml	2017-08-21 10:17:10.533038861 -0400
@@ -7,7 +7,7 @@
     - port: 8183
       targetPort: virt-api
   externalIPs :
-    - "{{ master_ip }}"
+    - "192.168.200.2"
   selector:
     app: virt-api
 ---
@@ -23,14 +23,14 @@
     spec:
       containers:
       - name: virt-api
-        image: {{ docker_prefix }}/virt-api:{{ docker_tag }}
+        image: kubevirt/virt-api:latest
         imagePullPolicy: IfNotPresent
         command:
             - "/virt-api"
             - "--port"
             - "8183"
             - "--spice-proxy"
-            - "{{ master_ip }}:3128"
+            - "192.168.200.2:3128"
         ports:
           - containerPort: 8183
             name: "virt-api"
@@ -38,4 +38,4 @@
       securityContext:
         runAsNonRoot: true
       nodeSelector:
-        kubernetes.io/hostname: {{ primary_node_name }}
+        kubernetes.io/hostname: master

make manifests, it turns out, just calls a bash script, ./hack/build-manifests.sh. This script uses two files to determine the values to use for template string substitution. First, the defaults: hack/config-default.sh. This is where master_ip gets the value of 192.168.200.2. This file also gives priority to the $DOCKER_TAG environment variable. However, if you need to customize values further, you can create and manage them in the file hack/config-local.sh. The goal is that any of the keys from the -default file that are specified in hack/config-local.sh will use the value from the latter file. The set of keys with their defaults (as of this writing) that you can customize are:

binaries="cmd/virt-controller cmd/virt-launcher cmd/virt-handler cmd/virt-api cmd/virtctl cmd/virt-manifest"
docker_images="cmd/virt-controller cmd/virt-launcher cmd/virt-handler cmd/virt-api cmd/virt-manifest images/haproxy images/iscsi-demo-target-tgtd images/vm-killer images/libvirt-kubevirt images/spice-proxy cmd/virt-migrator cmd/registry-disk-v1alpha images/cirros-registry-disk-demo"
optional_docker_images="cmd/registry-disk-v1alpha images/fedora-atomic-registry-disk-demo"
docker_prefix=kubevirt
docker_tag=${DOCKER_TAG:-latest}
manifest_templates="`ls manifests/*.in`"
master_ip=192.168.200.2
master_port=8184
network_provider=weave
primary_nic=${primary_nic:-eth1}
primary_node_name=${primary_node_name:-master}

Not all of these are for manifest files. The docker_images key is used to select the set of Docker images to generate, in a command called from a different section of the Makefile. The network_provider is used in the Vagrant setup, and so on. However, most of the values are used in the manifest files. So, if I want to set a master IP address of 10.10.10.10, I would have a hack/config-local.sh file that looks like this:

master_ip=10.10.10.10
$  diff -u manifests/virt-api.yaml.in manifests/virt-api.yaml
--- manifests/virt-api.yaml.in	2017-07-20 13:29:00.532916101 -0400
+++ manifests/virt-api.yaml	2017-08-21 10:42:28.434742371 -0400
@@ -7,7 +7,7 @@
     - port: 8183
       targetPort: virt-api
   externalIPs :
-    - "{{ master_ip }}"
+    - "10.10.10.10"
   selector:
     app: virt-api
 ---
@@ -23,14 +23,14 @@
     spec:
       containers:
       - name: virt-api
-        image: {{ docker_prefix }}/virt-api:{{ docker_tag }}
+        image: kubevirt/virt-api:latest
         imagePullPolicy: IfNotPresent
         command:
             - "/virt-api"
             - "--port"
             - "8183"
             - "--spice-proxy"
-            - "{{ master_ip }}:3128"
+            - "10.10.10.10:3128"
         ports:
           - containerPort: 8183
             name: "virt-api"
@@ -38,4 +38,4 @@
       securityContext:
         runAsNonRoot: true
       nodeSelector:
-        kubernetes.io/hostname: {{ primary_node_name }}
+        kubernetes.io/hostname: master
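The substitution the build script performs can be sketched in a few lines of sed. This is an illustration of the mechanism only, not the actual hack/build-manifests.sh code; render_template and the fixed key list are my own names:

```shell
# render_template: replace the {{ key }} tokens used in the .yaml.in files
# with the values of the corresponding shell variables. A simplified sketch
# of what the real build script does for every template file.
render_template() {
  sed -e "s|{{ master_ip }}|${master_ip}|g" \
      -e "s|{{ docker_prefix }}|${docker_prefix}|g" \
      -e "s|{{ docker_tag }}|${docker_tag}|g" \
      -e "s|{{ primary_node_name }}|${primary_node_name}|g" \
      "$1"
}

# usage:
#   master_ip=10.10.10.10 docker_prefix=kubevirt docker_tag=latest \
#     primary_node_name=master render_template manifests/virt-api.yaml.in
```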

IoT Security for Developers [Survive IoT Part 5]

Posted by Russel Doty on August 15, 2017 10:33 PM

Previous articles focused on how to securely design and configure a system based on existing hardware, software, IoT Devices, and networks. If you are developing IoT devices, software, and systems, there is a lot more you can do to develop secure systems.

The first thing is to manage and secure communications with IoT Devices. Your software needs to be able to discover, configure, manage and communicate with IoT devices. By considering security implications when designing and implementing these functions you can make the system much more robust. The basic guideline is don’t trust any device. Have checks to verify that a device is what it claims to be, to verify device integrity, and to validate communications with the devices.

Have a special process for discovering and registering devices and restrict access to it. Do not automatically detect and register any device that pops up on the network! Have a mechanism for pairing devices with the gateway, such as a special pairing mode that must be invoked on both the device and the gateway to pair or a requirement to manually enter a device serial number or address into the gateway as part of the registration process. For industrial applications adding devices is a deliberate process – this is not a good operation to fully automate!

A solid approach to gateway and device identity is to have a certificate provisioned onto the device at the factory, by the system integrator, or at a central facility. It is even better if this certificate is backed by a HW root of trust that can’t be copied or spoofed.

Communications between the gateway and the device should be deliberately designed. Instead of a general network connection, which can be used for many purposes, consider using a specialized interface. Messaging interfaces are ideal for many IoT applications. Two of the most popular messaging interfaces are MQTT (Message Queuing Telemetry Transport) and CoAP. In addition to their many other advantages, these messaging interfaces only carry IoT data, greatly reducing their capability to be used as an attack vector.

Message based interfaces are also a good approach for connecting the IoT Gateway to backend systems. An enterprise message bus like AMQP is a powerful tool for handling asynchronous inputs from thousands of gateways, routing them, and feeding the data into backend systems. A messaging system makes the total system more reliable, more robust, and more efficient – and makes it much easier to implement large scale systems! Messaging interfaces are ideal for handling exceptions – they allow you to simply send the exception as a regular message and have it properly processed and routed by business logic on the backend.

Messaging systems are also ideal for handling unreliable networks and heavy system loads. A messaging system will queue up messages until the network is available. If a sudden burst of activity causes the network and backend systems to be overloaded the messaging system will automatically queue up the messages and then release them for processing as resources become available. Messaging systems allow you to ensure reliable message delivery, which is critical for many applications. Best of all, messaging systems are easy for a programmer to use and do the hard work of building a robust communications capability for you.

No matter what type of interface you are using it is critical to sanitize your inputs. Never just pass through information from a device – instead, check it to make sure that it is properly formatted, that it makes sense, that it does not contain a malicious payload, and that the data has not been corrupted. The overall integrity of an IoT system is greatly enhanced by ensuring the quality of the data it is operating on. Perhaps the best example of this is Little Bobby Tables from XKCD (XKCD.com):

Importance of sanitizing your input.

On a more serious level, poor input sanitization is responsible for many security issues. Programmers should assume that users can’t be trusted and all interactions are a potential attack.
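As a tiny illustration of the principle, assume a device reports a plain numeric temperature reading; anything else gets rejected before it touches the backend. This is a sketch only: the valid_reading name and the numeric-only rule are mine, and a real deployment would validate against its actual payload schema.

```shell
# valid_reading: accept only a simple unsigned decimal number; reject empty
# input and anything carrying extra characters (quotes, semicolons, SQL...).
valid_reading() {
  case "$1" in
    ''|.|*.*.*) return 1 ;;   # empty, bare dot, or more than one dot
    *[!0-9.]*)  return 1 ;;   # any character outside 0-9 and "."
    *)          return 0 ;;
  esac
}
```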


Episode 59 - The VPN Episode

Posted by Open Source Security Podcast on August 15, 2017 03:14 PM
Josh and Kurt talk about VPNs and the upcoming eclipse.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5644794/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



Installing FreeIPA with an Active Directory subordinate CA

Posted by Fraser Tweedale on August 14, 2017 06:04 AM

FreeIPA is often installed in enterprise environments for managing Unix and Linux hosts and services. Most commonly, enterprises use Microsoft Active Directory for managing users, Windows workstations and Windows servers. Often, Active Directory is deployed with Active Directory Certificate Services (AD CS) which provides a CA and certificate management capabilities. Likewise, FreeIPA includes the Dogtag CA, and when deploying FreeIPA in an enterprise using AD CS, it is often desired to make the FreeIPA CA a subordinate CA of the AD CS CA.

In this blog post I’ll explain what is required to issue an AD sub-CA, and how to do it with FreeIPA, including a step-by-step guide to configuring AD CS.

AD CS certificate template overview

AD CS has a concept of certificate templates, which define the characteristics an issued certificate shall have. The same concept exists in Dogtag and FreeIPA except that in those projects we call them certificate profiles, and the mechanism to select which template/profile to use when issuing a certificate is different.

In AD CS, the template to use is indicated by an X.509 extension in the certificate signing request (CSR). The template specifier can be one of two extensions. The first, older extension has OID 1.3.6.1.4.1.311.20.2 and allows you to specify a template by name:

CertificateTemplateName ::= SEQUENCE {
   Name            BMPString
}

(Note that some documents specify UTF8String instead of BMPString. BMPString works and is used in practice. I am not actually sure if UTF8String even works.)

The second, Version 2 template specifier extension has OID 1.3.6.1.4.1.311.21.7 and allows you to specify a template by OID and version:

CertificateTemplate ::= SEQUENCE {
    templateID              EncodedObjectID,
    templateMajorVersion    TemplateVersion,
    templateMinorVersion    TemplateVersion OPTIONAL
}

TemplateVersion ::= INTEGER (0..4294967295)

Note that some documents also show templateMajorVersion as optional, but it is actually required.

When submitting a CSR for signing, AD CS looks for these extensions in the request, and uses the extension data to select the template to use.

External CA installation in FreeIPA

FreeIPA supports installation with an externally signed CA certificate, via ipa-server-install --external-ca or (for existing CA-less installations ipa-ca-install --external-ca). The installation takes several steps. First, a key is generated and a CSR produced:

$ ipa-ca-install --external-ca

Directory Manager (existing master) password: XXXXXXXX

Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes
  [1/8]: configuring certificate server instance
The next step is to get /root/ipa.csr signed by your CA and re-run /sbin/ipa-ca-install as:
/sbin/ipa-ca-install --external-cert-file=/path/to/signed_certificate --external-cert-file=/path/to/external_ca_certificate

The installation program exits while the administrator submits the CSR to the external CA. After they receive the signed CA certificate, the administrator resumes the installation, giving the installation program the CA certificate and a chain of one or more certificates up to the root CA:

$ ipa-ca-install --external-cert-file ca.crt --external-cert-file ipa.crt
Directory Manager (existing master) password: XXXXXXXX

Configuring certificate server (pki-tomcatd). Estimated time: 3 minutes
  [1/29]: configuring certificate server instance
  ...
  [29/29]: configuring certmonger renewal for lightweight CAs
Done configuring certificate server (pki-tomcatd).

Recall, however, that if the external CA is AD CS, a CSR must bear one of the certificate template specifier extensions. There is an additional installation program option to add the template specifier:

$ ipa-ca-install --external-ca --external-ca-type=ms-cs

This adds a name-based template specifier to the CSR, with the name SubCA (this is the name of the default sub-CA template in AD CS).

Specifying an alternative AD CS template

Everything discussed so far is already part of FreeIPA. Until now, there is no way to specify a different template to use with AD CS.

I have been working on a feature that allows an alternative AD CS template to be specified. Both kinds of template specifier extension are supported, via the new --external-ca-profile installation program option:

$ ipa-ca-install --external-ca --external-ca-type=ms-cs \
  --external-ca-profile=1.3.6.1.4.1.311.21.8.8950086.10656446.2706058.12775672.480128.147.7130143.4405632:1

(Note: huge OIDs like the above are commonly used by Active Directory for installation-specific objects.)

To specify a template by name, the --external-ca-profile value should be:

--external-ca-profile=NAME

To specify a template by OID, the OID and major version must be given, and optionally the minor version too:

--external-ca-profile=OID:MAJOR[:MINOR]

Like --external-ca and --external-ca-type, the new --external-ca-profile option is available with both ipa-server-install and ipa-ca-install.

With this feature, it is now possible to specify an alternative or custom certificate template when using AD CS to sign the FreeIPA CA certificate. The feature has not yet been merged but there an open pull request. I have also made a COPR build for anyone interested in testing the feature.

The remainder of this post is a short guide to configuring Active Directory Certificate Services, defining a custom CA profile, and submitting a CSR to issue a certificate.

Renewing the certificate

FreeIPA provides the ipa-cacert-manage renew command for renewing an externally-signed CA certificate. Like installation with an externally-signed CA, this is a two-step procedure. In the first step, the command prompts Certmonger to generate a new CSR for the CA certificate, and saves the CSR so that the administrator can submit it to the external CA.

For renewing a certificate signed by AD CS, as in the installation case a template specifier extension is needed. Therefore the ipa-cacert-manage renew command has also learned the --external-ca-profile option:

# ipa-cacert-manage renew --external-ca-type ms-cs \
  --external-ca-profile MySubCA
Exporting CA certificate signing request, please wait
The next step is to get /var/lib/ipa/ca.csr signed by your CA and re-run ipa-cacert-manage as:
ipa-cacert-manage renew --external-cert-file=/path/to/signed_certificate --external-cert-file=/path/to/external_ca_certificate
The ipa-cacert-manage command was successful

The the above example the CSR that was generated will contain a version 1 template extension, using the name MySubCA. Like the installation commands, the version 2 extension is also supported.

This part of the feature requires some changes to Certmonger as well as FreeIPA. At time of writing these changes haven’t been merged. There is a Certmonger pull request and a Certmonger COPR build if you’d like to test the feature.

Appendix A: installing and configuring AD CS

Assuming an existing installation of Active Directory, AD CS installation and configuration will take 10 to 15 minutes. Open Server Manager, invoke the Add Roles and Features Wizard and select the AD CS Certification Authority role:

image

Proceed, and wait for the installation to complete…

image

After installation has finished, you will see AD CS in the Server Manager sidebar, and upon selecting it you will see a notification that Configuration required for Active Directory Certificate Services.

image

Click More…, and up will come the All Servers Task Details dialog showing that the Post-deployment Configuration action is pending. Click the action to continue:

image

Now comes the AD CS Configuration assistant, which contains several steps. Proceed past the Specify credentials to configure role services step.

In the Select Role Services to configure step, select Certification Authority then continue:

image

In the Specify the setup type of the CA step, choose Enterprise CA then continue:

image

The Specify the type of the CA step lets you choose whether the AD CS CA will be a root CA or chained to an external CA (just like how FreeIPA lets you create root or subordinate CA!) Installing AD CS as a Subordinate CA is outside the scope of this guide. Choose Root CA and continue:

image

The next step lets you Specify the type of the private key. You can use an existing private key or Create a new private key, the continue.

The Specify the cryptographic options step lets you specify the Key length and hash algorithm for the signature. Choose a key length of at least 2048 bits, and the SHA-256 digest:

image

Next, Specify the name of the CA. This sets the Subject Distinguished Name of the CA. Accept defaults and continue.

The next step is to Specify the validity period. CA certificates (especially root CAs) typically need a long validity period. Choose a value like 5 Years, then continue:

image

Accept defauts for the Specify the database locations step.

Finally, you will reach the Confirmation step, which summarises the chosen configuration options. Review the settings then Configure:

image

The configuration will take a few moments, then the Results will be displayed:

image

AD CS is now configured and you can begin issuing certificates.

Appendix B: creating a custom sub-CA certificate template

In this section we look at how to create a new certificate template for sub-CAs by duplicating an existing template, then modifying it.

To manage certificate templates, from Server Manager right-click the server and open the Certification Authority program:

image

In the sidebar tree view, right-click Certificate Templates then select Manage.

image

The Certificate Templates Console will open. The default profile for sub-CAs has the Template Display Name Subordinate Certification Authority. Right-click this template and choose Duplicate Template.

image

The new template is created and the Properties of New Template dialog appears, allowing the administrator to customise the template. You can set a new Template display name, Template name and so on:

image

You can also change various aspects of certificate issuance including which extensions will appear on the issued certificate, and the values of those extensions. In the following screenshot, we see a new Certificate Policies OID being defined for addition to certificates issued via this template:

image

Also under Extensions, you can discover the OID for this template by looking at the Certificate Template Information extension description.

Finally, having defined the new certificate template, we have to activate it for use with the AD CA. Back in the Certification Authority management window, right-click Certificate Templates and select Certificate Template to Issue:

image

This will pop up the Enable Certificate Templates dialog, containing a list of templates available for use with the CA. Select the new template and click OK. The new certificate template is now ready for use.

Appendix C: issuing a certificate

In this section we look at how to use AD CS to issue a certificate. It is assumed that the CSR to be signed exists and Active Directory can access it.

In the Certification Authority window, in the sidebar right-click the CA and select All Tasks >> Submit new request…:

image

This will bring up a file chooser dialog. Find the CSR and Open it:

image

Assuming all went well (including the CSR indicating a known certificate template), the certificate is immediately issued and the Save Certificate dialog appear, asking where to save the issued certificate.

But that's not my job!

Posted by Josh Bressers on August 13, 2017 07:45 PM
This week I've been thinking about how security people and non security people interact. Various conversations I have often end up with someone suggesting everyone needs some sort of security responsibility. My suspicion is this will never work.

First some background to think about. In any organization there are certain responsibilities everyone has. Without using security as our specific example just yet, let's consider how a typical building functions. You have people who are tasked with keeping the electricity working, the plumbing, the heating and cooling. Some people keep the building clean, some take care of the elevators. Some work in the building to accomplish some other task. If the company that inhabits the building is a bank you can imagine the huge number of tasks that take place inside.

Now here's where I want our analogy to start. If I work in a building and I see a leaking faucet. I probably would report it. If I didn't, it's likely someone else would see it. It's quite possible if I'm one of the electricians and while accessing some hard to reach place I notice a leaking pipe. That's not my job to fix it, I could tell the plumbers but they're not very nice to me, so who cares. The last time I told them about a leaking pipe they blamed me for breaking it, so I don't really have an incentive here. If I do nothing, it really won't affect me. If I tell someone, at best it doesn't affect me, but in reality I probably will get some level of blame or scrutiny.

This almost certainly makes sense to most of us. I wonder if there are organizations where reporting things like this comes with an incentive. A leaking water pipe could end up causing millions in damage before it's found. Nowhere I've ever worked ever really had an incentive to report things like this. If it's not your job, you don't really have to care, so nobody ever really cared.

Now let's think about phishing in a modern enterprise. You see everything from blaming the user who clicked the link, to laughing at them for being stupid, to even maybe firing someone for losing the company a ton of money. If a user clicks a phishing link, and suspects a problem, they have very little incentive to be proactive. It's not their job. I bet the number of clicked phish links we find out about is much much lower than the total number clicked.

I also hear security folks talking about educating the users on how all this works. Users should know how to spot phishing links! While this won't work for a variety of reasons, at the end of the day, it's not their job so why do we think they should know how to do this? Even more important, why do we think they should care?

The think I keep wondering is should this be the job of everyone or just the job of the security people? I think the quick reaction is "everyone" but my suspicion is it's not. Electricity is a great example. How many stories have you heard of office workers being electrocuted in the office? The number is really low because we've made electricity extremely safe. If we put this in the context of modern security we have a system where the office is covered in bare wires. Imagine wires hanging from the ceiling, some draped on the floor. The bathroom has sparking wires next to the sink. We lost three interns last week, those stupid interns! They should have known which wires weren't safe to accidentally touch. It's up to everyone in the office to know which wires are safe and which are dangerous!

This is of course madness, but it's modern day security. Instead of fixing the wires, we just imagine we can train everyone up on how to spot the dangerous ones.

Docker without sudo on Centos 7

Posted by Adam Young on August 09, 2017 06:28 PM

I have been geting prepped to build the OpenShift origin codebase on Centos 7.  I started from a fairly minimal VM which did not have docker or Development Tools installed.  Once I thought I had all the prerequisites, I kicked off the build and got

Cannot connect to the Docker daemon. Is the docker daemon running on this host?

This seems to be due to the fact that  the ayoung user does not have permissions to read/write on the domain socket.  /var/run/docker.sock

$ ls -la /var/run/docker.sock
srw-rw----. 1 root root 0 Aug 9 09:03 /var/run/docker.sock

Enough other stuff seems to discuss this as well.  How can we set up for non-root and non-sudo access to docker?

On my Fedora system, I have:

$ ls -la /var/run/docker.sock
srw-rw----. 1 root docker 0 Aug 7 09:01 /var/run/docker.sock

I set this up long enough ago that I do not remember if I was the one that did this, or if it was a configuration setup by some other package. The docker group has a pretty random ID:

$ getent group docker
docker:x:14372:ayoung

So I probably did that.

Back to the VM:

sudo groupadd docker
 sudo chown root:docker /var/run/docker.sock
 sudo usermod -aG docker ayoung

I exited out and logged back in:

$ groups
ayoung wheel docker

And it worked.  Will the socket stay that way?  Hmm.  After the build completes, I’ll reboot the VM and see what we have.

Yes it did.  Is there a better way to do this?  Let me know if you do.

 

Episode 58 - Backwards compatibility to the point of insanity

Posted by Open Source Security Podcast on August 09, 2017 01:21 PM
Josh and Kurt talk about MalwareTech, Debian killing off TLS 1.0 and 1.1, auto safety, HBO, and npm not typo squatting.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5624741/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



Bonding two Ethernet devices

Posted by Adam Young on August 08, 2017 10:15 PM

In my continued investigations of networking stuff, I came across the question “How do you bond two ethernet devices together?”   While I did this years ago on RHEL3, I have pretty much forgotten how, so I decided to research and relearn this.

To start, I cloned my Centos 7 VM so I had a throw-away playground.  Before booting the VM, I added a second ethernet device on the default network, parallel to the existing device.

 

Then I booted it on up.

To start with, I have two ethernet devices that have both gotten their configuration information via DHCP.  I’m going to trash those to get to a clear starting point:

First, lets do it the easy way, with THe Network Manager Text UI (nmtui).

Create brings up this screen

Next to the box that says “slaves” (that term needs to go) Select add, and select ethernet:

And take all the defaults.

Do this again for the second ethernet device as well, and the result should look like this:

And exit out of nmtui.  You can use ip addr to see the current set up, and use ping to confirm that it works.

Lets do that using the CLI.  First, cleanup.

nmcli co delete "Bond connection 1"
nmcli co delete "Ethernet connection 1"
nmcli co delete "Ethernet connection 2"

Turns out there is a tutorial on the machine already:

man nmcli-examples

Example 6. Adding a bonding master and two slave connection profiles

$ nmcli con add type bond ifname mybond0 mode active-backup
$ nmcli con add type ethernet ifname eth1 master mybond0
$ nmcli con add type ethernet ifname eth2 master mybond0

Of course, that should be eth0 and eth1, but the steps laid out work and, again, you can test with ping once it is done.

Enabling an Ethernet connection on Centos7

Posted by Adam Young on August 08, 2017 05:40 PM

I recently created a new Centos VM. When it booted, I noticed it did not have a working ethernet connection. So, I started playing with things, and got it working. Here are my notes:

To view connections

nmcli c

To bring up the connection was simple:

nmcli c eth0 up

To make it persist across boots required the autoconnect value:

nmcli c mod eth0 connection.autoconnect yes

This last one reflects the ONBOOT value in the file /etc/sysconfig/network-scripts/ifcfg-eth0

Setting it changed ONBOOT from no to yes.

Rebooting the machine and ip addr shows the connection up.

 

What's new in José v8?

Posted by Nathaniel McCallum on August 08, 2017 03:35 PM

Wait! What’s José?

José is a general purpose cryptography toolkit which uses the data formats standardized by the JOSE IETF Working Group. By analogy, José is to JOSE what GPG is to OpenPGP and OpenSSL is to X.509.

José provides both a C-language library and a command line interface and is licensed under the Apache Software License version 2.0. José v8+ is available on Fedora 26+ (dnf install jose), Red Hat Enterpirse Linux 7.4+ (yum install jose) and macOS/Homebrew (brew install jose).

Motivation

While building José and projects that use it, we kept running into the problem that we needed to do manipulation of JSON data on the command line. First, we attempted to use jq. However, we found the interface to be cumbersome. Our biggest pain point was that we often wanted simple error reporting, but jq required us to write complicated error handlers. We also anticipated that many other consumers of José would have similar needs and we wanted to provide an integrated solution.

The end result of this development need is the jose fmt command; a simple JSON stack machine. We believe that this implementation is both simpler to use and less error prone. Let’s look at how to use jose fmt.

JSON Parsing

In our first example, we will extract a single JSON string value from a JSON object:

$ echo '{"key":"value"}' | jose fmt -j- -g key -u-
value
$ echo $?
0

This is simple enough. We pass three instructions to jose fmt: -j, -g and -u. First, the -j- instruction tells jose fmt to parse JSON from standard input. The result of this operation is placed on the top of the internal stack. Second, the -g key instruction tells jose fmt to get the value of the "key" property. Like before, the result of this operation is placed on the top of the internal stack. Finally, the -u- instruction tells jose fmt to print the string on the top of the stack to standard output without quotes.

What happens if the input is malformed or we pass instructions that are invalid? In this case, jose fmt errors and the return code of the process is the number of the instruction that failed. For example:

$ jose fmt -j '{}' -g key -u-
$ echo $?
2

In this case, the input is valid JSON, so the first instruction (-j '{}') succeeds. However, the resulting value does not contain the property "key", so the second instruction (-g key) fails. Thus, the return code is 2.

JSON Modification

Let’s look at a more complex case. This time we have a nested JSON structure and we want to change the value of "c" to 3. Here’s how we do it:

$ jose fmt -j '{"a":{"b":{"c":7}}}' -g a -g b -j 3 -s c -UUUo-
{"a":{"b":{"c":3}}}

First, we parse the input (-j ...) then we successively place the nested objects on the stack (-g a and -g b). Next, we place our new value on the stack (-j 3) and set it (-s c), overwriting the previous value. Finally, we unwind the stack back to the root object (-UUU) and output the JSON to standard output (-o-).

URL-Safe Base64

Because we wanted to provide an integrated solution, jose fmt also provides native support for performing URL-Safe Base64 encoding and decoding. This makes it extremely easy to parse objects such as JWE and JWS. For example, imagine we have a JWE that looks something like this:

$ cat my.jwe
{"protected":"eyJlbmMiOiJBMjU2Q0JDLUhTNTEyIn0",...}

If we wanted to extract a value from the JWE Protected Header using jq, we’d have to extract the "protected" property, then decode it and, finally, parse it again. With jose fmt we can do this in all one step:

$ jose fmt -j my.jwe -g protected -y -o-
{"enc":"A256CBC-HS512"}
$ jose fmt -j my.jwe -g protected -y -g enc -u-
A256CBC-HS512

In this example we use the -y instruction to decode the URL-Safe Base64 and parse the resulting bytes as JSON. If this succeeds, the JSON value produced from this process pushed onto the top of the stack (just like all other operations). From here, it can be manipulated just like any other value.

Conclusion

The jose fmt is a flexible utility that can be used to parse JSON of all sorts. You can use it to read values from JSON. You can use it to transform JSON values to new forms. You can use it to modify values within nested JSON objects. You can also use it to build up JSON values gradually. Try it in your project today!

For a detailed list of all the possible instructions, see jose fmt -h.

What is minishift ssh anyway?

Posted by Adam Young on August 08, 2017 03:35 PM

The documentation says that to access a minishift-deployed VM you can use `minishift ssh` to log in, but what if you want to use other tooling (like Ansible) to get in there? How can you use standard ssh commands to connect?

First, we need to find the IP address for the host machine. Since the oc commands work, we know that openshift, and thus kubernetes, can find the machine based on configuration. Looking in ~/.kube/config We see the set of server machines with stanzas like this

- context:
 cluster: 192-168-42-239:8443
 namespace: myproject
 user: system:admin/192-168-42-239:8443
 name: myproject/192-168-42-239:8443/system:admin

 

And the current context is set as:

current-context: myproject/192-168-42-239:8443/system:admin

Which tells us which to look at.  In my case, I can confirm using:

ping 192.168.42.239

which gives me:

PING 192.168.42.239 (192.168.42.239) 56(84) bytes of data.
64 bytes from 192.168.42.239: icmp_seq=1 ttl=64 time=0.355 ms

So I know that is an active VM.

If I run the minishift ssh command, I get logged in to the vm, and I see that it is as the docker user.  So I know I am going to want to run a command like:

ssh  docker@192.168.42.239

But that prompts me for my password, so it is not using the correct pkey.

Turns out that minishift sticks the openssl generated files in

~/.minishift/machines/minishift/

And so the complete command is:

ssh -i ~/.minishift/machines/minishift/id_rsa docker@192.168.42.23

Adding External IPs for Minishift

Posted by Adam Young on August 04, 2017 07:33 PM

In the interest of simplifying the development and deployment of Kubevirt, we decided to make sure it was possible to run with minishift.  After downloading and running the minishift binary, I had a working minishift cluster.  However, in order to deploy the api-server to the cluster, I needed an external IP;  otherwise I’d get the error:

Error: service "" is invalid spec.externalIPs: Forbidden: externalIPs have been disabled

Here is how I got around this error.

I had to ssh in to the minishift vm …

 minishift ssh

and edit the config file.

sudo vi /mnt/sda1/var/lib/minishift/openshift.local.config/master/master-config.yaml

The change I needed to make looks like this:

networkConfig: 
 clusterNetworkCIDR: 10.128.0.0/14 
- externalIPNetworkCIDRs: null
+ externalIPNetworkCIDRs: ["0.0.0.0/0"]
 hostSubnetLength: 9 
 ingressIPNetworkCIDR: 172.29.0.0/16
 networkPluginName: "" 
 serviceNetworkCIDR: 172.30.0.0/16

 

and then, from my workstation, stopping and restarting minishift:

minishift stop

minishift start

 

At this point I was able to deploye the manifests.

 for MANIFEST in `ls ~/go/src/kubevirt.io/kubevirt/manifests/*yaml` ; do oc apply -f  $MANIFEST; done

There were other errors to follow, but this got beyond the external IP complaint.

So you want to script gdb with python …

Posted by William Brown on August 03, 2017 02:00 PM

So you want to script gdb with python …

Gdb provides a python scripting interface. However the documentation is highly technical and not at a level that is easily accessible.

This post should read as a tutorial, to help you understand the interface and work toward creating your own python debuging tools to help make gdb usage somewhat “less” painful.

The problem

I have created a problem program called “naughty”. You can find it here .

You can compile this with the following command:

gcc -g -lpthread -o naughty naughty.c

When you run this program, your screen should be filled with:

thread ...
thread ...
thread ...
thread ...
thread ...
thread ...

It looks like we have a bug! Now, we could easily see the issue if we looked at the C code, but that’s not the point here - lets try to solve this with gdb.

gdb ./naughty
...
(gdb) run
...
[New Thread 0x7fffb9792700 (LWP 14467)]
...
thread ...

Uh oh! We have threads being created here. We need to find the problem thread. Lets look at all the threads backtraces then.

Thread 129 (Thread 0x7fffb3786700 (LWP 14616)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 128 (Thread 0x7fffb3f87700 (LWP 14615)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 127 (Thread 0x7fffb4788700 (LWP 14614)):
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

...

We have 129 threads! Anyone of them could be the problem. We could just read these traces forever, but that’s a waste of time. Let’s try and script this with python to make our lives a bit easier.

Python in gdb

Python in gdb works by bringing in a copy of the python and injecting a special “gdb” module into the python run time. You can only access the gdb module from within python if you are using gdb. You can not have this work from a standard interpretter session.

We can access a dynamic python runtime from within gdb by simply calling python.

(gdb) python
>print("hello world")
>hello world
(gdb)

The python code only runs when you press Control D.

Another way to run your script is to import them as “new gdb commands”. This is the most useful way to use python for gdb, but it does require some boilerplate to start.

import gdb

class SimpleCommand(gdb.Command):
    def __init__(self):
        # This registers our class as "simple_command"
        super(SimpleCommand, self).__init__("simple_command", gdb.COMMAND_DATA)

    def invoke(self, arg, from_tty):
        # When we call "simple_command" from gdb, this is the method
        # that will be called.
        print("Hello from simple_command!")

# This registers our class to the gdb runtime at "source" time.
SimpleCommand()

We can run the command as follows:

(gdb) source debug_naughty.py
(gdb) simple_command
Hello from simple_command!
(gdb)

Solving the problem with python

So we need a way to find the “idle threads”. We want to fold all the threads with the same frame signature into one, so that we can view anomalies.

First, let’s make a “stackfold” command, and get it to list the current program.

class StackFold(gdb.Command):
def __init__(self):
    super(StackFold, self).__init__("stackfold", gdb.COMMAND_DATA)

def invoke(self, arg, from_tty):
    # An inferior is the 'currently running applications'. In this case we only
    # have one.
    inferiors = gdb.inferiors()
    for inferior in inferiors:
        print(inferior)
        print(dir(inferior))
        print(help(inferior))

StackFold()

To reload this in the gdb runtime, just run “source debug_naughty.py” again. Try running this: Note that we dumped a heap of output? Python has a neat trick that dir and help can both return strings for printing. This will help us to explore gdb’s internals inside of our program.

We can see from the inferiors that we have threads available for us to interact with:

class Inferior(builtins.object)
 |  GDB inferior object
...
 |  threads(...)
 |      Return all the threads of this inferior.

Given we want to fold the stacks from all our threads, we probably need to look at this! So lets get one thread from this, and have a look at it’s help.

inferiors = gdb.inferiors()
for inferior in inferiors:
    thread_iter = iter(inferior.threads())
    head_thread = next(thread_iter)
    print(help(head_thread))

Now we can run this by re-running “source” on our script, and calling stackfold again, we see help for our threads in the system.

At this point it get’s a little bit less obvious. Gdb’s python integration relates closely to how a human would interact with gdb. In order to access the content of a thread, we need to change the gdb context to access the backtrace. If we were doing this by hand it would look like this:

(gdb) thread 121
[Switching to thread 121 (Thread 0x7fffb778e700 (LWP 14608))]
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

We need to emulate this behaviour with our python calls. We can swap to the thread’s context with:

class InferiorThread(builtins.object)
 |  GDB thread object
...
 |  switch(...)
 |      switch ()
 |      Makes this the GDB selected thread.

Then, once we are in that context, we need to take a different approach to explore the stack frames: we use the raw context of the “gdb” module itself.

inferiors = gdb.inferiors()
for inferior in inferiors:
    thread_iter = iter(inferior.threads())
    head_thread = next(thread_iter)
    # Move our gdb context to the selected thread here.
    head_thread.switch()
    print(help(gdb))

Now that we have selected our thread’s context, we can start to explore it. gdb can do a lot within the selected context; as a result, the help output from this call is really large, but it’s worth reading so you can understand what is possible to achieve. In our case we need to start looking at the stack frames.

To look through the frames we need to tell gdb to rewind to the “newest” frame (ie, frame 0). We can then step down through progressively older frames until we exhaust them. From this we can print a rudimentary trace:

head_thread.switch()

# Reset the gdb frame context to the "latest" frame.
gdb.newest_frame()
# Now, work down the frames.
cur_frame = gdb.selected_frame()
while cur_frame is not None:
    print(cur_frame.name())
    # get the next frame down ....
    cur_frame = cur_frame.older()

Re-sourcing the script and running the command again prints the trace:

(gdb) stackfold
pthread_cond_wait@@GLIBC_2.3.2
lazy_thread
start_thread
clone

Great! Now we just need some extra metadata from the thread to know what its thread id is, so the user can go to the correct thread context. So let’s display that too:

head_thread.switch()

# These are the OS pid references.
(tpid, lwpid, tid) = head_thread.ptid
# This is the gdb thread number
gtid = head_thread.num
print("tpid %s, lwpid %s, tid %s, gtid %s" % (tpid, lwpid, tid, gtid))
# Reset the gdb frame context to the "latest" frame.

Running the command now reports the identifiers:

(gdb) stackfold
tpid 14485, lwpid 14616, tid 0, gtid 129

At this point we have enough information to fold identical stacks. We’ll iterate over every thread; if we have seen its stack “pattern” before, we’ll just add the gdb thread id to that pattern’s list, and if we haven’t seen the pattern yet, we’ll record the new pattern first. The final command looks like:

def invoke(self, arg, from_tty):
    # An inferior is the 'currently running applications'. In this case we only
    # have one.
    stack_maps = {}
    # This creates a dict where each element is keyed by backtrace.
    # Then each backtrace contains an array of "frames"
    #
    inferiors = gdb.inferiors()
    for inferior in inferiors:
        for thread in inferior.threads():
            # Change to our threads context
            thread.switch()
            # Get the thread IDS
            (tpid, lwpid, tid) = thread.ptid
            gtid = thread.num
            # Take a human readable copy of the backtrace, we'll need this for display later.
            o = gdb.execute('bt', to_string=True)
            # Build the backtrace for comparison
            backtrace = []
            gdb.newest_frame()
            cur_frame = gdb.selected_frame()
            while cur_frame is not None:
                backtrace.append(cur_frame.name())
                cur_frame = cur_frame.older()
            # Now we have a backtrace like ['pthread_cond_wait@@GLIBC_2.3.2', 'lazy_thread', 'start_thread', 'clone']
            # dicts can't use lists as keys because they are non-hashable, so we turn this into a string.
            # Remember, C functions can't have spaces in them ...
            s_backtrace = ' '.join(backtrace)
            # Let's see if it exists in the stack_maps
            if s_backtrace not in stack_maps:
                stack_maps[s_backtrace] = []
            # Now lets add this thread to the map.
            stack_maps[s_backtrace].append({'gtid': gtid, 'tpid' : tpid, 'bt': o} )
    # Now at this point we have a dict of traces, and each trace has a "list" of pids that match. Let's display them
    for smap in stack_maps:
        # Get our human readable form out.
        o = stack_maps[smap][0]['bt']
        for t in stack_maps[smap]:
            # For each thread we recorded
            print("Thread %s (LWP %s))" % (t['gtid'], t['tpid']))
        print(o)

Here is the final output.

(gdb) stackfold
Thread 129 (LWP 14485))
Thread 128 (LWP 14485))
Thread 127 (LWP 14485))
...
Thread 10 (LWP 14485))
Thread 9 (LWP 14485))
Thread 8 (LWP 14485))
Thread 7 (LWP 14485))
Thread 6 (LWP 14485))
Thread 5 (LWP 14485))
Thread 4 (LWP 14485))
Thread 3 (LWP 14485))
#0  0x00007ffff7bc38eb in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00000000004007bc in lazy_thread (arg=0x7fffffffdfb0) at naughty.c:19
#2  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#3  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 2 (LWP 14485))
#0  0x00007ffff78d835b in write () from /lib64/libc.so.6
#1  0x00007ffff78524fd in _IO_new_file_write () from /lib64/libc.so.6
#2  0x00007ffff7854271 in __GI__IO_do_write () from /lib64/libc.so.6
#3  0x00007ffff7854723 in __GI__IO_file_overflow () from /lib64/libc.so.6
#4  0x00007ffff7847fa2 in puts () from /lib64/libc.so.6
#5  0x00000000004007e9 in naughty_thread (arg=0x0) at naughty.c:27
#6  0x00007ffff7bbd3a9 in start_thread () from /lib64/libpthread.so.0
#7  0x00007ffff78e936f in clone () from /lib64/libc.so.6

Thread 1 (LWP 14485))
#0  0x00007ffff7bbe90d in pthread_join () from /lib64/libpthread.so.0
#1  0x00000000004008d1 in main (argc=1, argv=0x7fffffffe508) at naughty.c:51

With our stackfold command we can easily see that threads 129 through 3 have the same stack and are idle. We can see that thread 1 is the main process waiting on the threads to join, and finally we can see that thread 2 is the culprit writing to our display.
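The folding idiom at the heart of the command is plain Python, and easy to try outside gdb. A minimal sketch, using made-up traces rather than real gdb output:

```python
# Fold identical backtraces: key a dict by the joined trace and collect
# the thread ids that share it. These traces are invented for illustration.
traces = {
    1: ['pthread_join', 'main'],
    2: ['write', 'naughty_thread', 'start_thread', 'clone'],
    3: ['pthread_cond_wait', 'lazy_thread', 'start_thread', 'clone'],
    4: ['pthread_cond_wait', 'lazy_thread', 'start_thread', 'clone'],
}
stack_maps = {}
for tid, bt in traces.items():
    # Lists aren't hashable, so join the frame names into a string.
    # C function names can't contain spaces, so this is unambiguous.
    key = ' '.join(bt)
    stack_maps.setdefault(key, []).append(tid)
for key, tids in stack_maps.items():
    print(tids, '->', key)
```

Threads 3 and 4 collapse into one entry, which is exactly what happens to the 127 idle threads in the real output above.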

My solution

You can find my solution to this problem as a reference implementation here.

Jury Duty

Posted by Adam Young on August 03, 2017 02:03 AM

I spent the past six work days in a courthouse as a juror. It was a civil case involving a house repair after a burst pipe flooded it. The verdict went in at around 3 PM (Aug. 2).

There is so much you don’t know on a jury. You can only consider the evidence placed before you…and sometimes you have to forget something you learned before the witness reacts to the word “Objection.”

It was a construction case, and, despite my having grown up as the son (and sometimes employee) of a construction contractor, they chose me anyway. I don’t think that background colored my reasoning.

Based on this incomplete information, we had to award money to one or the other; doing nothing was, in effect, awarding money to the client who had not paid.

While I did not agree with the other eleven people on the jury about all of the outcomes (there were several charges both ways), I was very thankful to have all of them share the burden of making the decision. I can only imagine the burden carried by a judge in arbitration.

On the other hand, in arbitration, the judge can do research. We couldn’t. We had to even forget things we know about construction (like you postpone work on the outside to get the people back inside) if it was not presented as evidence.

I was very thankful to have my dad to talk this over with afterwards as he has fifty plus years in the construction industry. He clarified some of my assumptions (based on the incomplete information I gave him) and I think I can let go of my doubts. I can sleep soundly tonight knowing I did the best I could, and that, most likely, justice was served.

The number one thing I took away from this experience is that, with anything involving contracting, or money in general, you should get everything in writing and communicate as clearly as possible. Aside from covering you in a future lawsuit, it might help prevent that lawsuit by keeping the other person on track. Run your business such that someone else could step in, take over from you, and know exactly what you were doing, or so that you could hand everything over to a brand new contractor and they could take over. Obviously, that is a high bar to clear, but the better you do, the better for all involved.

Episode 57 - We may never see amazing security research ever again

Posted by Open Source Security Podcast on August 01, 2017 01:37 PM
Josh and Kurt talk about Black Hat and Defcon, safes, banks, voting machines, SMBv1 DoS attack, Flash, liability, and password masking.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5598524/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



For a security conference that everyone claims not to trust the wifi, there sure was a lot of wifi

Posted by Josh Bressers on July 28, 2017 02:32 PM
I attended Black Hat USA 2017. Elastic had a booth on the floor where I spent a fair bit of time, as well as meetings scattered about the conference center. It was a great time as always, but this year I had a secret with me: a Raspberry Pi that was passively collecting wifi statistics. Just certain metadata; no actual wifi data packets were captured or harmed in the making of this. I logged everything into Elasticsearch so I could build pretty visualizations in Kibana. I only captured 2.4 GHz data with one radio, so I had it jumping around channels. Obviously I missed plenty of data, but this was really just about looking for interesting patterns.

I put everything I used to make this project go into GitHub, it's really rough though, you've been warned.

I have a ton of data to mine, I'll no doubt spend a great deal of time in the future doing that, but here's the basic TL;DR picture.

pretty picture

I captured 12.6 million wifi packets. The blue bars show when I captured what, the table shows the SSIDs I saw (not all packets have SSID data), and the colored graph shows which wifi channels were seen (not all packets have channel data either). I also have packet frequencies logged, so all of that can be put together later. The two humps in the wifi data were when I was around the conference. I admit I was surprised by the volume of wifi I saw basically everywhere, even in the middle of the night from my hotel room.

Below is a graph showing the various frequencies I saw, every packet has to come in on some wireless frequency even if it doesn't have a wifi channel.



The devices seen data was also really interesting.

This chart represents every packet seen, so it’s clearly going to be a long tail. It’s no surprise that an access point sends out a lot of packets, but I didn’t expect Apple to be #1 here; I expected the top few to be access point manufacturers. It would seem Apple gear is more popular and noisy than I expected.

A more interesting graph is unique devices seen by manufacturer (as a side note, I saw 77,904 devices in total over my 3 days).


This table is far more useful, as it’s totally expected that a single access point will be very noisy. I didn’t expect Cisco to make the top 3, I admit. But this means that Apple was basically 10% of wifi devices, and after the top few we drop off pretty quickly.
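The per-manufacturer counting comes down to grouping unique MAC addresses by their three-byte OUI prefix. A toy sketch of that step (the addresses and the tiny OUI-to-vendor table here are illustrative, not taken from my capture; real lookups use the IEEE OUI registry):

```python
from collections import Counter

# Tiny illustrative OUI-to-vendor table; a real tool would load the
# IEEE OUI registry instead.
OUI_VENDORS = {'3c:22:fb': 'Apple', '00:0c:29': 'VMware'}

# Made-up captured source MAC addresses.
macs = [
    '3c:22:fb:01:02:03',
    '3c:22:fb:01:02:03',   # same device seen twice: still one unique device
    '3c:22:fb:aa:bb:cc',
    '00:0c:29:11:22:33',
]
# Deduplicate devices first, then count by the first three octets (8 chars).
unique_devices = set(macs)
counts = Counter(OUI_VENDORS.get(m[:8], 'unknown') for m in unique_devices)
print(counts.most_common())
```

Run over the 77,904 unique devices from the capture, this is what produces the manufacturer table above.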

There's a lot more interesting data in this set, I just have to spend some time finding it all. I'll also make a point to single out the data specific to business hours. Stay tuned for a far more detailed writeup.

Security by Isolating Insecurity [Survive IoT Part 4]

Posted by Russel Doty on July 25, 2017 10:14 PM

In my previous post I introduced “Goldilocks Security”, proposing three approaches to security.

Solution 1: Ignore Security

Safety in the crowd – with tens of millions of cameras out there, why would anyone pick mine? Odds are that the bad guys won’t pick yours – they will pick all of them! Automated search and penetration tools easily find millions of IP cameras. You will be lost in the crowd – the crowd of bots!

Solution 2: Secure the Cameras

For home and small business customers, a secure the camera approach simply won’t work because ease of use wins out over effective security in product design and because the camera vendors’ business model (low-cost, ease of use, and access over the Internet) all conspire against security. What’s left?

Solution 3: Isolation

If the IP cameras can’t be safely placed on the Internet, then isolate them from the Internet.

To do this, introduce an IoT Gateway between the cameras and all other systems. This IoT Gateway would have two network interfaces: one network interface dedicated to the cameras and the second network interface used to connect to the outside world. An application running on the IoT Gateway would talk to the IP cameras and then talk to the outside world (if needed). There would be no network connection between the IP cameras and anything other than the IoT Gateway application. The IoT Gateway would also be hardened and actively managed for best security.

How is this implemented?

  • Put the IP cameras on a dedicated network. This should be a separate physical network. At a minimum it should be a VLAN (Virtual LAN). There will typically be a relatively small number of IP cameras in use, so a dedicated network switch, probably with PoE, is cost effective.
    • Use static IP addresses. If the IP cameras are assigned static IP addresses, there is no need to have an IP gateway or DNS server on the network segment. This further reduces the ability of the IP cameras to get out on the network. You lose the convenience of DHCP assigned address and gain significant security.
    • You can have multiple separate networks. For example, you might have one for external cameras, one for cameras in interior public spaces, one for manufacturing space and one for labs. With this configuration, someone gaining access to the exterior network would not be able to gain access to the lab cameras.
  • Add an IoT Gateway – a computer with a network interface connected to the camera network. In the example above, the gateway would have four network interfaces – one for each camera network. The IoT Gateway would probably also be connected to the corporate network; this would require a fifth network interface. Note that you can have multiple IoT Gateways, such as one for each camera network, one for a building management system, one for other security systems, and one that connects an entire building or campus to the Internet.
  • Use a video monitoring program such as ZoneMinder or a commercial program to receive, monitor and display the video data. Such a program can monitor multiple camera feeds, analyze the video feeds for things such as motion detection, record multiple video streams, and create events and alerts. These events and alerts can do things like trigger alarms, send emails, send text messages, or trigger other business rules. Note that the video monitoring program further isolates the cameras from the Internet – the cameras talk to the video monitoring program and the video monitoring program talks to the outside world.
  • Sandbox the video monitoring program using tools like SELinux and containers. These both protect the application and protect the rest of the system from the application – even if the application is compromised, it won’t be able to attack the rest of the system.
  • Remove any unneeded services from the IoT Gateway. This is a dedicated device performing a small set of tasks. There shouldn’t be any software on the system that is not needed to perform these tasks – no development tools, no extraneous programs, no unneeded services running.
  • Run the video monitoring program with minimal privileges. This program should not require root level access.
  • Configure strong firewall settings on the IoT Gateway. Only allow required communications. For example, only allow communications with specific IP addresses or mac addresses (the IP cameras configured into the system) over specific ports using specific protocols. You can also configure the firewall to only allow specific applications access to the network port. These settings would keep anything other than authorized cameras from accessing the gateway and keep the authorized cameras from talking to anything other than the video monitoring application. This approach also protects the cameras. Anyone attempting to attack the cameras from the Internet would need to penetrate the IoT Gateway and then change settings such as the firewall and SELinux before they could get to the cameras.
  • Use strong access controls. Multi-factor authentication is a really good idea. Of course you have a separate account for each user, and assign each user the minimum privilege they need to do their job. Most of the time you don’t need to be logged in to the system – most video monitoring applications can display on the lock screen, allowing visual monitoring of the video streams without being able to change the system. For remote gateways interactive access isn’t needed at all; they simply process sensor data and send it to a remote system.
  • Other systems should be able to verify the identity of the IoT Gateway. A common way to do this is to install a certificate on the gateway. Each gateway should have a unique certificate, which can be provided by systems like Linux IdM or MS Active Directory. Even greater security can be provided by placing the system identity into a hardware root of trust like a TPM (Trusted Processing Module), which prevents the identity from being copied, cloned, or spoofed.
  • Encrypted communications is always a good idea for security. Encryption protects the contents of the video stream from being revealed, prevents the contents of the video stream from being modified or spoofed, and verifies the integrity of the video stream – any modifications of the encrypted traffic, either deliberate or due to network error, are detected. Further, if you configure a VPN (Virtual Private Network) between the IoT Gateway and backend systems you can force all network traffic through the VPN, thus preventing network attacks against the IoT Gateway. For security systems it is good practice to encrypt all traffic, both internal and external.
  • Proactively manage the IoT Gateway. Regularly update it to get the latest security patches and bug fixes. Scan it regularly with tools like OpenSCAP to maintain secure configuration. Monitor logfiles for anomalies that might be related to security events, hardware issues, or software issues.
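The allow-list behaviour described above for the gateway firewall reduces to a simple rule: known camera, known port, nothing else. A toy Python sketch of that rule (the addresses and port number are made up for illustration; a real gateway would enforce this in the firewall itself):

```python
# Toy packet filter: only authorized camera addresses may talk to the
# video-monitoring port. Addresses and port are invented examples.
ALLOWED_CAMERAS = {'10.0.50.11', '10.0.50.12', '10.0.50.13'}
MONITOR_PORT = 8554

def allow(src_ip, dst_port):
    """Return True only for authorized camera-to-monitor traffic."""
    return src_ip in ALLOWED_CAMERAS and dst_port == MONITOR_PORT

print(allow('10.0.50.11', 8554))   # known camera, monitoring port: allowed
print(allow('192.0.2.99', 8554))   # unknown source: dropped
print(allow('10.0.50.11', 22))     # known camera, wrong port: dropped
```

The point of the sketch is how little the gateway needs to permit: everything not explicitly on the list is denied by default.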

You can see how a properly configured IoT Gateway can allow you to use insecure IoT devices as part of a secure system. This approach isn’t perfect – the cameras should also be managed like the gateway – but it is a viable approach to building a reasonably secure and robust system out of insecure devices.

One issue is that the cameras are not protected from local attack. If WiFi is used the attacker only needs to be nearby. If Ethernet is used an attacker can add another device to the network. This is difficult as you would need to gain access to the network switch and find a live port on the proper network. Attacking the Ethernet cable leaves signs, including network glitches. Physically attacking a camera also leaves signs. All of this can be done, but is more challenging than a network based attack over the Internet and can be managed through physical security and good network monitoring. These are some of the reasons why I strongly prefer wired network connections over wireless network connections.


Security and privacy are the same thing

Posted by Josh Bressers on July 23, 2017 12:36 AM
Earlier today I ran across this post on Reddit
Security but not Privacy (Am I doing this right?)

The poster basically said "I care about security but not privacy".

It got me thinking about security and privacy. There’s not really a difference between the two. They are two faces of the same coin, though that isn’t always obvious in today’s information universe. If a site like Facebook or Google knows everything about you, it doesn’t mean you don’t care about privacy; it means you’re putting your trust in those sites. The same sort of trust that makes passwords private.

The first thing we need to grasp is what I'm going to call a trust boundary. I trust you understand trust already (har har har). But a trust boundary is less obvious sometimes. A security (or privacy) incident happens when there is a breach of the trust boundary. Let's just dive into some examples to better understand this.

A web site is defaced
In this example the expectation is the website owner is the only person or group that can update the website content. The attacker crossed a trust boundary that allowed them to make unwanted changes to the website.

Your credit card is used fraudulently
It's expected that only you will be using your credit card. If someone gets your number somehow and starts to make purchases with your card, how they got the card crosses a trust boundary. You could easily put this example in the "privacy" bucket if you wanted to keep them separate, it's likely your card was stolen due to lax security at one of the businesses you visited.

Your wallet is stolen
This one is tricky. The trust boundary is probably your pocket or purse. Maybe you dropped it or forgot it on a counter. Whatever happened the trust boundary is broken when you lose control of your wallet. An event like this can trickle down though. It could result in identity theft, your credit card could be used. Maybe it's just about the cash. The scary thing is you don't really know because you lost a lot of information. Some things we'd call privacy problems, some we'd call security problems.

I use a confusing last example on purpose to help prove my point. The issue is all about who you trust with what. You can trust Facebook and give them tons of information; many of us do. You can trust Google for the same basic reasons. That doesn’t mean you don’t care about privacy, it just means you have put them inside a certain trust boundary. There are limits to that trust though.

What if Facebook decided to use your personal information to access your bank records? That would be a pretty substantial trust boundary abuse. What if your phone company decided to use the information they have to log into your Facebook account?

A good password isn't all that different from your credit card number. It's a bit of private information that you share with one or more other organizations. You are expecting them not to cross a trust boundary with the information you gave them.

The real challenge is to understand what trust boundaries you're comfortable with. What do you share with who? Nobody is an island, we must exist in an ecosystem of trust. We all have different boundaries of what we will share. That's quite all right. If you understand your trust boundary making good security/privacy decisions becomes a lot easier.

They say information is the new oil. If that's true then trust must be the currency.

Goldilocks Security: Bad, Won’t Work, and Plausible [Survive IoT Part 3]

Posted by Russel Doty on July 20, 2017 11:03 PM

Previous posts discussed the security challenge presented by IoT devices, using IP Video Cameras as an example. Now let’s consider some security alternatives:

Solution 1: Ignore Security

This is the most common approach to IoT security today. And, to a significant degree, it works. In the same way that ignoring fire safety usually works – only a few businesses or homes burn down each year!

Like fire safety, the risks from ignoring IoT security grow over time. Like fire safety, the cost of the relatively rare events can be catastrophic. Unlike fire safety, an IoT event can affect millions of entities at the same time.

And, unlike traditional IT security issues, IoT security issues can result in physical damage and personal injury. Needless to say, I do not recommend ignoring the issue as a viable approach to IoT security!

Solution 2: Secure the Cameras

Yes, you should secure IP cameras. They are computers sitting on your network – and should be treated like computers on your network! Best practices for IT security are well known and readily available. You should install and configure them securely, update them regularly, and monitor them continuously.

If you have a commercial implementation of an IP video security system you should have regular updates and maintenance of your system. You should be demanding strong security – both physical security and IT security – of the video security system.

You did have IT involved in selection, implementation and operation of the video security system, didn’t you? You did make security a key part of the selection process, just as you would for any other IT system, didn’t you? You are doing regular security scans of the video security system and monitoring all network traffic, aren’t you? Good, you have nothing to worry about!

If you are like many companies, you are probably feeling a bit nervous right now…

For home and small business customers, a secure the camera approach simply won’t work.

  • Customer ease of use expectations largely prevent effective security.
  • Customer knowledge and expertise doesn’t support secure configuration or updates to the system.
  • The IoT vendor business model doesn’t support security: Low cost, short product life, a great feature set, ease of use, and access over the Internet all conspire against security.
  • There is a demonstrated lack of demand for security. People have shown, by their actions and purchasing decisions, that effective security is not a priority. At least until there is a security breach – and then they are looking for someone to blame. And often someone to sue…

Securing the cameras is a great recommendation but generally will not work in practice. Unfortunately. Still, it should be a requirement for any Industrial IoT deployment.

Solution 3: Isolation

If ignoring the problem doesn’t work and fixing the problem isn’t viable, what is left? Isolation. If the IP cameras can’t be safely placed on the Internet, then isolate them from the Internet.

Such isolation will both protect the cameras from the Internet and protect the Internet from the cameras.

The challenge is that networked cameras have to be on the network to work.

Even though the cameras are designed to be directly connected to the Internet, they don’t have to be directly connected to the Internet. The cameras can be placed on a separate isolated network.

In my next post, I will go into detail on how to achieve this isolation using an IoT Gateway between the cameras and all the other systems.


Summer is coming

Posted by Josh Bressers on July 20, 2017 12:27 PM
I'm getting ready to attend Black Hat. I will miss BSides and Defcon this year, unfortunately, due to some personal commitments. As I was packing up my gear, I started thinking about what these conferences have really changed. We've been doing this every summer for longer than many of us can remember now. We make our way to the desert, we attend talks by what we consider the brightest minds in our industry. We meet lots of people. Everyone has a great time. But what are the actionable outcomes of these events?

The answer is nothing. They've changed nothing.

But I'm going to put an asterisk next to that.

I do think things are getting better, for some definition of better. Technology is marching forward, security is getting dragged along with a lot of it. Some things, like IoT, have some learning to do, but the real change won't come from the security universe.

Firstly we should understand that the world today has changed drastically. The skillset that mattered ten years ago doesn't have a lot of value anymore. Things like buffer overflows are far less important than they used to be. Coding in C isn't quite what it once was. There are many protections built into frameworks and languages. The cloud has taken over a great deal of infrastructure. The list can go on.

The point of such a list is to ask the question, how much of the important change that's made a real difference came from our security leaders? I'd argue not very much. The real change comes from people we've never heard of. There are people in the trenches making small changes every single day. Those small changes eventually pile up until we notice they're something big and real.

Rather than trying to fix the big problems, our time is better spent ignoring the thought leaders and just doing something small. Conferences are important, but not to listen to the leaders. Go find the vendors and attendees who are doing new and interesting things. They are the ones that will make a difference, they are literally the future. Even the smallest bug bounty, feature, or pull request can make a difference. The end goal isn't to be a noisy gasbag, instead it should be all about being useful.



New version of buildah 0.2 released to Fedora.

Posted by Dan Walsh on July 19, 2017 01:01 PM
New features and bugfixes in this release

Updated Commands
buildah run
     Add support for -- ending options parsing
     Add a way to disable PTY allocation
     Handle run without an explicit command correctly
Buildah build-using-dockerfile (bud)
    Ensure volume points get created, and with perms
buildah containers
     Add a -a/--all option - Lists containers not created by buildah.
buildah Add/Copy
     Support for glob syntax
buildah commit
     Add flag to remove containers on commit
buildah push
     Improve man page and help information
buildah export:
    Allows you to export a container image
buildah images:
    update commands
    Add JSON output option
buildah rmi
    update commands
buildah containers
     Add JSON output option

New Commands
buildah version
     Identify version information about the buildah command
buildah export
     Allows you to export a container image

Updates
Buildah docs: clarify --runtime-flag of run command
Update to match newer storage and image-spec APIs
Update containers/storage and containers/image versions


Episode 56 - Devil's Advocate and other fuzzy topics

Posted by Open Source Security Podcast on July 18, 2017 08:50 PM
Josh and Kurt talk about forest fires, fuzzing, old time Internet, and Net Neutrality. Listen to Kurt play the Devil's Advocate and manage to change Josh's mind about net neutrality.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5551879/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



Representative IoT Device: IP Video Camera [Survive IoT Part 2]

Posted by Russel Doty on July 17, 2017 09:58 PM

One of the most flexible, powerful, and useful IoT sensors is a video camera. Video streams can be used directly. They can also be analyzed using modern software and an incredible range of information extracted from the images: motion detection for eventing and alerts, automobile license recognition for parking systems and theft detection, facial recognition, manufacturing quality control, part location and orientation for robotics, local environment for autonomous vehicles, crop analysis for health and pests, and new uses that haven’t been thought of yet!

The IoT revolution for video cameras is the IP (Internet Protocol) camera – a video camera with integrated computer that can talk directly to a network and provide video and still images in a format that can be directly manipulated by software. An IP camera is essentially a computer with an image sensor and a network interface. A surprisingly powerful computer which can do image processing, image analysis, image conversion, image compression, and send multiple real-time video streams over the Internet. The IP cameras use standard processors, operating systems, and toolkits for video processing and networking.

Modern IP security cameras have high resolution – 3MP-5MP – excellent image quality, the ability to see in complete darkness, and good mechanical construction that can withstand direct exposure to the elements for many years. Many of these IP Video Cameras have enough processing power to be able to do motion detection inside the camera – a rather advanced video analysis capability! They can be connected to the network over WiFi or Ethernet. A popular capability is PoE or Power over Ethernet, which allows a camera to use a single Ethernet cable for both network and power. For ease of use these IP cameras are designed to automatically connect to back-end servers in the cloud and then to display the video stream on smartphones.

These IP cameras are available with full support and regular updates from industrial suppliers at prices ranging from several hundred to a few thousand dollars per camera. They are commonly sold in systems that include cameras, installation, monitoring and recording systems and software, integration, and service and support. There are a few actual manufacturers of the cameras, and many OEMs place their own brand names on the cameras.

These same cameras are readily available to consumers for less than $100 through unofficial, unsupported, “grey market” channels.

IP cameras need an account for setup, configuration and management. They contain an embedded webserver with full control of the camera. Virtually all cameras have a root level account with username of admin and password of admin. Some of them even recommend that you change this default password… One major brand of IP cameras also has two hardcoded maintenance accounts with root access; you can’t change the password on these accounts. And you can discover the username and password with about 15 seconds of Internet research.

The business model that allows you to purchase a high quality IP camera for <$100 does not support lifetime updates of software. It also does not support high security – ease of use and avoiding support calls is the highest priority. Software updates can easily cause problems – and the easiest way to avoid problems caused by software updates is to avoid software updates. The result is a “fire and forget” model where the software in the IP camera is never updated after the camera is installed. This means that security vulnerabilities are never addressed.

Let’s summarize:

  • IP video cameras are powerful, versatile and flexible IoT sensors that can be used for many purposes.
  • High quality IP cameras are readily available at low cost.
  • IP video cameras are powerful general purpose computers.
  • The business model for IP video cameras results in cameras that are seldom updated and are typically not configured for good security.
  • IP video cameras are easy to compromise and take over.
    • Can be used to penetrate the rest of your network.
    • Can be used to attack the Internet.
  • There are tens of millions of IP video cameras installed.

So far we have outlined the problem. The next post will begin to explore how we can address the security issues – including obvious approaches that won’t work…


Episode 55 - Good docs ruin my story

Posted by Open Source Security Podcast on July 12, 2017 01:56 PM
Josh and Kurt talk about Let's Encrypt, certificates, Kaspersky, A/V, code signing, Not Petya, self driving cars, and failures that become security problems.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5534632/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/87A93A/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



How I Survived the Internet of Things [Survive IoT Part 1]

Posted by Russel Doty on July 12, 2017 12:32 AM

Working with IoT from a software architecture perspective teaches you a lot, but leaves the nagging question “how does this really work?”. Theory is great and watching other people work is relaxing, but the time comes when I have to get my hands dirty. So I decided I had to actually implement an IoT project.

The first step was to define the goals for the project:

  • Hands-on experience with Industrial IoT technologies. I’m much more interested in Industrial IoT than Consumer IoT. I am not going to have anything to do with an Internet refrigerator!
  • Accomplish a real task with IoT:
    • Something useful and worthwhile; something that makes a difference.
    • Something usable – including by non-technical people!
    • Something robust and reliable. A system that can be expected to function for a decade or longer with essentially perfect reliability.
    • “Affordable” – a reasonably low cost entry cost, but with a bias toward functional capabilities, low maintenance, and long life. Balance initial costs with operational costs and minimize system elements that have monthly or yearly fees.
    • Secure – including system and network security. There will be much more on this topic!
  • Learn how things really work. Engage in hand to hand combat with sensors, devices, systems, wired vs. wireless, reliability, usability, interoperability, and the myriad other factors that crop up when you actually try to make something work.
  • A bias toward using commercial components and systems rather than building things out of Raspberry Pi and sensor modules. There isn’t anything wrong with Raspberry Pi and low level integration, I just wanted to work at a higher level.
  • And, to be completely honest, to have an excuse to play with some neat toys!

Based on these goals I chose to work on home automation with a focus on security and lighting. After considering many things that could be done I chose to implement monitoring of fire, carbon monoxide, power, temperature, water intrusion, perimeter intrusion, and video monitoring. I also implemented lighting control with the goals of power savings, convenience, and having lights on when you come home. When designing and implementing the various subsystems I chose commercial grade monitoring, sensors and controls.

I sometimes get the question “do you live in a bad neighborhood?” No, I live in a great neighborhood. The main reasons for this project were safety, reduced power consumption, and an excuse to play with neat toys. Yes, I got carried away…

October 2016: Things Attack the Internet

In October 2016 several large Internet sites were subjected to a massive DDoS (Distributed Denial of Service) attack carried out by hundreds of thousands, perhaps millions, of compromised IP video cameras and home routers. These attacks were some of the highest-bandwidth attacks ever observed and are hard to defend against.

In January of 2017, an estimated 70% of the security cameras in Washington DC were compromised by malware and were not able to stream video. Workers had to physically go to each individual camera and do a fresh install of the original firmware to return them to operation.

Security experts have been warning about weaknesses in IoT for years. Many of these warnings are about how easy it is to compromise and subvert IoT systems. The October 2016 attacks showed that these IoT weaknesses can also be used to directly attack key parts of the Internet. A larger attack could potentially make the Internet unusable!

Since IP cameras were used in the first major attack by IoT on the Internet and I have several of these cameras installed in my system, let’s start our case study with them.

The next article will begin exploring the capabilities, security, and business model of powerful and affordable IoT devices.


Time safety and Rust

Posted by William Brown on July 11, 2017 02:00 PM

Time safety and Rust

Recently I have had the great fortune to work on this ticket. This was an issue that stemmed from an attempt to make clock performance faster. Previously, a call to time or clock_gettime would involve a context switch and a system call (think Solaris, etc). On Linux we have the vDSO instead, so we can easily just swap to the use of raw time calls.

The problem

So what was the problem? And how did the engineers of the past try to solve it?

DS heavily relies on time. As a result, we call time() a lot in the codebase. But this would mean context switches.

So a wrapper was made called “current_time()”, which would cache a recent output of time(), and then provide that to the caller instead of making the costly context switch. So the code had the following:

static time_t   currenttime;
static int      currenttime_set = 0;

time_t
poll_current_time()
{
    if ( !currenttime_set ) {
        currenttime_set = 1;
    }

    time( &currenttime );
    return( currenttime );
}

time_t
current_time( void )
{
    if ( currenttime_set ) {
        return( currenttime );
    } else {
        return( time( (time_t *)0 ));
    }
}

In another thread, we would poll this every second to update the currenttime value:

void *
time_thread(void *nothing __attribute__((unused)))
{
    PRIntervalTime    interval;

    interval = PR_SecondsToInterval(1);

    while(!time_shutdown) {
        poll_current_time();
        csngen_update_time ();
        DS_Sleep(interval);
    }

    /*NOTREACHED*/
    return(NULL);
}

So what is the problem here

Besides the fact that we may not poll accurately (meaning we miss seconds but always advance), this is not thread safe. The reason is that CPUs have registers and store buffers that may hold both reads and writes until a series of other operations (barriers + atomics) flush them back out to cache. This means the time polling thread could update the clock, but unless both threads synchronise with a lock or a barrier+atomic, there is no guarantee the new value of currenttime will be seen in any other thread. The only way this ever worked was by luck, with no one noticing that time would jump about or often just be wrong.

Clearly this is a broken design, but this is C - we can do anything.

What if this was Rust?

Rust touts multithread safety high on its list. So let's try to recreate this in Rust.

First, the exact same way:

use std::time::{SystemTime, Duration};
use std::thread;


static mut currenttime: Option<SystemTime> = None;

fn read_thread() {
    let interval = Duration::from_secs(1);

    for x in 0..10 {
        thread::sleep(interval);
        let c_time = currenttime.unwrap();
        println!("reading time {:?}", c_time);
    }
}

fn poll_thread() {
    let interval = Duration::from_secs(1);

    for x in 0..10 {
        currenttime = Some(SystemTime::now());
        println!("polling time");
        thread::sleep(interval);
    }
}

fn main() {
    let poll = thread::spawn(poll_thread);
    let read = thread::spawn(read_thread);
    read.join().unwrap();
    poll.join().unwrap();
}

Rust will not compile this code.

> rustc clock.rs
error[E0133]: use of mutable static requires unsafe function or block
  --> clock.rs:13:22
   |
13 |         let c_time = currenttime.unwrap();
   |                      ^^^^^^^^^^^ use of mutable static

error[E0133]: use of mutable static requires unsafe function or block
  --> clock.rs:22:9
   |
22 |         currenttime = Some(SystemTime::now());
   |         ^^^^^^^^^^^ use of mutable static

error: aborting due to 2 previous errors

Rust has told us that this action is unsafe, and that we shouldn’t be modifying a global static like this.

This alone is a great reason and demonstration of why we need a language like Rust instead of C - the compiler can tell us when actions are dangerous at compile time, rather than being allowed to sit in production code for years.

For bonus marks, because Rust is stricter about types than C, we don’t have issues like:

int c_time = time();

Which is a 2038 problem in the making :)

Implications of Common Name deprecation for Dogtag and FreeIPA

Posted by Fraser Tweedale on July 11, 2017 03:25 AM

Or, ERR_CERT_COMMON_NAME_INVALID, and what we are doing about it.

Google Chrome version 58, released in April 2017, removed support for the X.509 certificate Subject Common Name (CN) as a source of naming information when validating certificates. As a result, certificates that do not carry all relevant domain names in the Subject Alternative Name (SAN) extension result in validation failures.

At the time of writing this post Chrome is just the first mover, but Mozilla Firefox and other programs and libraries will follow suit. The public PKI used to secure the web and other internet communications is largely unaffected (browsers and CAs moved a long time ago to ensure that certificates issued by publicly trusted CAs carried all DNS naming information in the SAN extension), but some enterprises running internal PKIs are feeling the pain.

In this post I will provide some historical and technical context to the situation, and explain what we are doing in Dogtag and FreeIPA to ensure that we issue valid certificates.

Background

X.509 certificates carry subject naming information in two places: the Subject Distinguished Name (DN) field, and the Subject Alternative Name extension. There are many types of attributes available in the DN, including organisation, country, and common name. The definitions of these attribute types came from X.500 (the precursor to LDAP) and all have an ASN.1 representation.

Within the X.509 standard, the CN has no special interpretation, but when certificates first entered widespread use in the SSL protocol, it was used to carry the domain name of the subject site or service. When connecting to a web server using TLS/SSL, the client would check that the CN matches the domain name they used to reach the server. If the certificate is chained to a trusted CA, the signature checks out, and the domain name matches, then the client has confidence that all is well and continues the handshake.

But there were a few problems with using the Common Name. First, what if you want a certificate to support multiple domain names? This was especially a problem for virtual hosts in the pre-SNI days, when one IP address could only have one certificate associated with it. You can have multiple CNs in a Distinguished Name, but the semantics of X.500 DNs are strictly hierarchical. It is not an appropriate use of the DN to cram multiple, possibly non-hierarchical domain names into it.

Second, the CN in X.509 has a length limit of 64 characters. DNS names can be longer. The length limit is too restrictive, especially in the world of IaaS and PaaS where hosts and services are spawned and destroyed en masse by orchestration frameworks.

Third, some types of subject names do not have a corresponding X.500 attribute, including domain names. The solution to all three of these problems was the introduction of the Subject Alternative Name X.509 extension, to allow more types of names to be used in a certificate. (The SAN extension is itself extensible; apart from DNS names, other important name types include IP addresses, email addresses, URIs and Kerberos principal names.) TLS clients added support for validating SAN DNSName values in addition to the CN.

The use of the CN field to carry DNS names was never a standard. The Common Name field does not have these semantics; but using the CN in this way was an approach that worked. This interpretation was later formalised by the CA/B Forum in their Baseline Requirements for CAs, but only as a reflection of a current practice in SSL/TLS server and client implementations. Even in the Baseline Requirements the CN was a second-class citizen; they mandated that if the CN was present at all, it must reflect one of the DNSName or IP address values from the SAN extension. All public CAs had to comply with this requirement, which is why Chrome’s removal of CN support is only affecting private PKIs, not public web sites.

Why remove CN validation?

So, Common Name was not ideal for carrying DNS naming information, but given that we now have SAN, was it really necessary to deprecate it, and is it really necessary to follow through and actually stop using it, causing non-compliant certificates that were previously accepted to now be rejected?

The most important reason for deprecating CN validation is the X.509 Name Constraints extension. Name Constraints, if they appear in a CA certificate or intermediate CA certificate, constrain the valid subject names on leaf certificates. Various name types are supported including DNS names; a DNS name constraint restricts the domain of validity to the domain(s) listed and subdomains thereof. For example, if the DNS name example.com appears in a CA certificate’s Name Constraints extension, leaf certificates with a DNS name of example.com or foo.example.com could be valid, but a DNS name of foo.example.net could not be valid. Conforming X.509 implementations must enforce these constraints.
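For reference, a DNS name constraint like the one described can be expressed in a CA certificate using OpenSSL's x509v3 config syntax, roughly as follows (the section name is illustrative; a real profile would carry additional extensions):

```
[ v3_constrained_ca ]
basicConstraints = critical, CA:TRUE
keyUsage        = critical, keyCertSign, cRLSign
# Leaf certificates chained to this CA may only carry DNS names
# under example.com (and its subdomains) in their SAN extension.
nameConstraints = critical, permitted;DNS:example.com
```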

But these constraints only apply to SAN DNSName values, not to the CN. This is why accepting DNS naming information in the CN had to be deprecated – the name constraints cannot be properly enforced!

So back in May 2000 the use of Common Name for carrying a DNS name was deprecated by RFC 2818. Although it deprecated the practice this RFC required clients to fall back to the Common Name if there were no SAN DNSName values on the certificate. Then in 2011 RFC 6125 removed the requirement for clients to fall back to the common name, making this optional behaviour. Over recent years, some TLS clients began emitting warnings when they encountered certificates without SAN DNSNames, or where a DNS name in the CN did not also appear in the SAN extension. Finally, Chrome has become the first widely used client to remove support.

Despite more than 15 years of notice on the deprecation of this use of Common Name, a lot of CA software and client tooling still does not have first-class support for the SAN extension. Most tools used to generate CSRs do not even ask about SAN, and require complex configuration to generate a request bearing the SAN extension. Similarly, some CA programs do not do a good job of issuing RFC-compliant certificates. Right now, this includes Dogtag and FreeIPA.

Subject Alternative Name and FreeIPA

For some years, FreeIPA (in particular, the default profile for host and service certificates, called caIPAserviceCert) has supported the SAN extension, but the client is required to submit a CSR containing the desired SAN extension data. The names in the CSR (the CN and all alternative names) get validated against the subject principal, and then the CA would issue the certificate with exactly those names. There was no way to ensure that the domain name in the CN was also present in the SAN extension.

We could add this requirement to FreeIPA’s CSR validation routine, but this imposes an unreasonable burden on the user to "get it right". Tools like OpenSSL have poor usability and complex configuration. Certmonger supports generating a CSR with the SAN extension but it must be explicitly requested. For FreeIPA’s own certificates, we have (in recent major releases) ensured that they have contained the SAN extension, but this is not the default behaviour and that is a problem.
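As an illustration of that complexity, here is roughly what it takes to get a SAN into a CSR with plain OpenSSL — the file names and domain below are made up for the example, and a real deployment would substitute its own:

```shell
# Write a minimal request config; the [san] section is what most
# CSR tools never ask you about.
cat > san.cnf <<'EOF'
[req]
distinguished_name = dn
req_extensions = san
[dn]
[san]
subjectAltName = DNS:www.example.com
EOF

# Generate a key and a CSR whose SAN carries the same DNS name as the CN.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout service.key -out service.csr \
  -subj "/CN=www.example.com" -config san.cnf

# Confirm the SAN extension made it into the request.
openssl req -in service.csr -noout -text | grep -A1 "Subject Alternative Name"
```

Getting this right requires knowing about request extensions, config sections, and the subjectAltName syntax — exactly the burden we do not want to place on every user.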

FreeIPA 4.5 brought with it a CSR autogeneration feature that, for a given certificate profile, lets the administrator specify how to construct a CSR appropriate for that profile. This reduces the burden on the end user, but they must still opt in to this process.

Subject Alternative Name and Dogtag

Until Dogtag 10.4, there were two ways to produce a certificate with the SAN extension. One was the SubjectAltNameExtDefault profile component, which, for a given profile, supports a fixed number of names, either hard coded or based on particular request attributes (e.g. the CN, the email address of the authenticated user, etc). The other was the UserExtensionDefault which copies a given extension from the CSR to the final certificate verbatim (no validation of the data occurs). We use UserExtensionDefault in FreeIPA’s certificate profile (all names are validated by the FreeIPA framework before the request is submitted to Dogtag).

Unfortunately, SubjectAltNameExtDefault and UserExtensionDefault are not compatible with each other. If a profile uses both and the CSR contains the SAN extension, issuance will fail with an error because Dogtag tried to add two SAN extensions to the certificate.

In Dogtag 10.4 we introduced a new profile component that improves the situation, especially for dealing with the removal of client CN validation. The CommonNameToSANDefault will cause any profile that uses it to examine the Common Name, and if it looks like a DNS name, it will add it to the SAN extension (creating the extension if necessary).

Ultimately, what is needed is a way to define a certificate profile that just makes the right certificate, without placing an undue burden on the client (be it a human user or a software agent). The complexity and burden should rest with Dogtag, for the sake of all users. We are gradually making steps toward this, but it is still a long way off. I have discussed this utopian vision in a previous post.

Configuring CommonNameToSANDefault

If you have Dogtag 10.4, here is how to configure a profile to use the CommonNameToSANDefault. Add the following policy directives (the policyset and serverCertSet and index 12 are indicative only, but the index must not collide with other profile components):

policyset.serverCertSet.12.constraint.class_id=noConstraintImpl
policyset.serverCertSet.12.constraint.name=No Constraint
policyset.serverCertSet.12.default.class_id=commonNameToSANDefaultImpl
policyset.serverCertSet.12.default.name=Copy Common Name to Subject

Add the index to the list of profile policies:

policyset.serverCertSet.list=1,2,3,4,5,6,7,8,9,10,11,12

Then import the modified profile configuration, and you are good to go. There are a few minor caveats to be aware of:

  • Names containing wildcards are not recognised as DNS names. The rationale is twofold; wildcard DNS names, although currently recognised by most programs, are technically a violation of the X.509 specification (RFC 5280), and they are discouraged by RFC 6125. Therefore if the CN contains a wildcard DNS name, CommonNameToSANDefault will not copy it to the SAN extension.
  • Single-label DNS names are not copied. It is unlikely that people will use Dogtag to issue certificates for top-level domains. If CommonNameToSANDefault encounters a single-label DNS name, it will assume it is actually not a DNS name at all, and will not copy it to the SAN extension.
  • The CommonNameToSANDefault policy index must come after UserExtensionDefault, SubjectAltNameExtDefault, or any other component that adds the SAN extension, otherwise an error may occur because the older components do not gracefully handle the situation where the SAN extension is already present.

What we are doing in FreeIPA

Updating FreeIPA profiles to use CommonNameToSANDefault is trickier – FreeIPA configures Dogtag to use LDAP-based profile storage, and mixed-version topologies are possible, so updating a profile to use the new component could break certificate requests on other CA replicas if they are not all at the new versions. We do not want this situation to occur.

The long-term fix is to develop a general, version-aware profile update mechanism that will import the best version of a profile supported by all CA replicas in the topology. I will be starting this effort soon. When it is in place we will be able to safely update the FreeIPA-defined profiles in existing deployments.

In the meantime, we will bump the Dogtag dependency and update the default profile for new installations only in the 4.5.3 point release. This will be safe to do because you can only install replicas at the same or newer versions of FreeIPA, and it will avoid the CN validation problems for all new installations.

Conclusion

In this post we looked at the technical reasons for deprecating and removing support for CN domain validation in X.509 certificates, and discussed the implications of this finally happening, namely: none for the public CA world, but big problems for some private PKIs and programs including FreeIPA and Dogtag. We looked at the new CommonNameToSANDefault component in Dogtag that makes it easier to produce compliant certs even when the tools to generate the CSR don’t help you much, and discussed upcoming and proposed changes in FreeIPA to improve the situation there.

One big takeaway from this is to be more proactive in dealing with deprecated features in standards, APIs or programs. It is easy to punt on the work, saying "well yes it is deprecated but all the programs still support it…" The thing is, tomorrow they may not support it anymore, and when it was deprecated for good reasons you really cannot lay the blame at Google (or whoever). On the FreeIPA team we (and especially me as PKI wonk in residence) were aware of these issues but kept putting off the work. Then one day users and customers start having problems accessing their internal services in Chrome! 15 years should have been enough time to deal with it… but we (I) did not.

Lesson learned.

Who's got your hack back?

Posted by Josh Bressers on July 09, 2017 12:22 AM
The topic of hacking back keeps coming up these days. There's an attempt to pass a bill in the US that would legalize hacking back. There are many opinions on this topic, I'm generally not one to take a hard stand against what someone else thinks. In this case though, if you think hacking back is a good idea, you're wrong. Painfully wrong.

Everything I've seen up to this point tells me the people who think hacking back is a good idea are either mistaken about the issue or they're misleading others on purpose. Hacking back isn't self defense, it's not about being attacked, it's not about protection. It's a terrible idea that has no place in a modern society. Hacking back is some sort of stone age retribution tribal law. It has no place in our world.

Rather than break the various arguments apart, let's think about two examples that exist in the real world.

Firstly, why don't we give the people doing mall security guns? There is one really good reason I can think of here. The insurance company that holds the policy on the mall would never allow the security to carry guns. If you let security carry guns, they will use them someday. They'll probably use them in an inappropriate manner, the mall will be sued, and they will almost certainly lose. That doesn't mean the mall has to pay a massive settlement, it means the insurance company has to pay a massive settlement. They don't want to do that. Even if some crazy law claims it's not illegal to hack back, no sane insurance company will allow it. I'm not talking about cyber insurance, I'm just talking about general policies here.

The second example revolves around shoplifting. If someone is caught stealing from a store, does someone go to their house and take some of their stuff in retribution? They don't of course. Why not? Because we're not cave people anymore. That's why. Retribution style justice has no place in a modern civilization. This is how a feud starts, nobody has ever won a feud, at best it's a draw when they all kill each other.

So this has me really thinking. Why would anyone want to hack back? There aren't many reasons that don't revolve around revenge. The way most attacks work you can't reliably know who is doing what with any sort of confidence. Hacking back isn't going to make anything better. It would make things a lot worse. Nobody wants to be stuck in the middle of a senseless feud. Well, nobody sane.

Redeploying just virt-controller for Kubevirt development

Posted by Adam Young on July 07, 2017 05:11 PM

Bottom line up front:

cluster/vagrant/sync_build.sh
cluster/kubectl.sh delete -f manifests/virt-controller.yaml
cluster/kubectl.sh create -f manifests/virt-controller.yaml

When reworking code (refactoring or rewriting) you want to make sure the tests run. While Unit tests run quickly and within the code tree, functional tests require a more dedicated setup. Since the time to deploy a full live cluster is non-trivial, we want to be able to deploy only the component we’ve been working on. In the case of virt-controller, this is managed as a service, a deployment, and a single pod. All are defined by manifests/virt-controller.yaml.

To update a deployment, we need to make sure that the next time the containers run, they contain the new code. ./cluster/vagrant/sync_build.sh does a few things to make that happen. It compiles the Go code, rebuilds the containers, and uploads them to the image repositories on the vagrant machines.

All of these steps can be done using the single line:

make vagrant-deploy

but it will take a while.  I ran it using the time command and it took 1m9.724s.

make alone takes 0m5.685s.

./cluster/vagrant/sync_build.sh  takes 0m24.773s

cluster/kubectl.sh delete -f manifests/virt-controller.yaml takes 0m3.265s

and

time cluster/kubectl.sh create -f manifests/virt-controller.yaml takes 0m0.203s.  Running it this way, I find, keeps me from getting distracted and losing the zone.

Running make docker is very slow, as it regenerates all of the docker containers.  If you don’t really care about all of them, you can generate just virt-controller by running:

./hack/build-docker.sh build virt-controller

Which takes 0m1.521s.

So, the gating factor seems to be the roughly 40 second deploy time for ./cluster/vagrant/sync_build.sh.  Not ideal for rapid development, but not horrible.

 

Episode 54 - Turning into an old person

Posted by Open Source Security Podcast on July 04, 2017 09:07 PM
Josh and Kurt talk about Canada Day, Not Petya, Interac goes down, Minecraft, airport security and books, then GDPR.


<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="http://html5-player.libsyn.com/embed/episode/id/5534634/height/90/width/640/theme/custom/autonext/no/thumbnail/yes/autoplay/no/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/87A93A/" style="border: none;" webkitallowfullscreen="" width="640"></iframe>

Show Notes



Sausage Factory: Advanced module building in Fedora

Posted by Stephen Gallagher on June 30, 2017 01:58 PM

First off, let me be very clear up-front: normally, I write my blog articles to be approachable by readers of varying levels of technical background (or none at all). This will not be one of those. This will be a deep dive into the very bowels of the sausage factory.

This blog post is a continuation of the Introduction to building modules in Fedora entry I wrote last month. It will assume a familiarity with all of the concepts discussed there.

Analyzing a more complicated module

Last time, we picked an extremely simple package to create. The talloc module needed to contain only a single RPM, since all the dependencies necessary both at build-time and runtime were available from the existing base-runtime, shared-userspace and common-build-dependencies packages.

This time, we will pick a slightly more complicated example that will require exploring some of the concepts around building with package dependencies. For this purpose, I am selecting the sscg package (one of my own and discussed previously on this blog in the article “Self-Signed SSL/TLS Certificates: Why they are terrible and a better alternative“).

We will start by analyzing sscg‘s dependencies. As you probably recall from the earlier post, we can do this with dnf repoquery:

dnf repoquery --requires sscg.x86_64 --resolve

Which returns with:

glibc-0:2.25-6.fc26.i686
glibc-0:2.25-6.fc26.x86_64
libpath_utils-0:0.2.1-30.fc26.x86_64
libtalloc-0:2.1.9-1.fc26.x86_64
openssl-libs-1:1.1.0f-4.fc26.x86_64
popt-0:1.16-8.fc26.x86_64

and then also get the build-time dependencies with:

dnf repoquery --requires --enablerepo=fedora-source --enablerepo=updates-source sscg.src --resolve

Which returns with:

gcc-0:7.1.1-3.fc26.i686
gcc-0:7.1.1-3.fc26.x86_64
libpath_utils-devel-0:0.2.1-30.fc26.i686
libpath_utils-devel-0:0.2.1-30.fc26.x86_64
libtalloc-devel-0:2.1.9-1.fc26.i686
libtalloc-devel-0:2.1.9-1.fc26.x86_64
openssl-devel-1:1.1.0f-4.fc26.i686
openssl-devel-1:1.1.0f-4.fc26.x86_64
popt-devel-0:1.16-8.fc26.i686
popt-devel-0:1.16-8.fc26.x86_64
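The entries above are in NAME-EPOCH:VERSION-RELEASE.ARCH form. When comparing dependency lists it helps to reduce them to bare package names first. A quick sketch (the sed expression is my own; dnf repoquery can also emit names directly with --qf '%{name}'):

```shell
# Reduce N-E:V-R.A strings, as printed by dnf repoquery, to package names
# by dropping everything from the "-EPOCH:" boundary onward.
# Sample entries copied from the output above.
printf '%s\n' \
  glibc-0:2.25-6.fc26.x86_64 \
  libtalloc-0:2.1.9-1.fc26.x86_64 \
  openssl-libs-1:1.1.0f-4.fc26.x86_64 |
  sed 's/-[0-9]*:.*//' | sort -u
# prints:
# glibc
# libtalloc
# openssl-libs
```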

So let’s start by narrowing down the set of dependencies we already have by comparing them to the three foundational modules. The base-runtime module provides gcc, glibc, openssl-libs, openssl-devel, popt, and popt-devel. The shared-userspace module provides libpath_utils and libpath_utils-devel as well, which leaves us with only libtalloc as an unsatisfied dependency. Wow, what a convenient and totally unexpected outcome when I chose this package at random! Kidding aside, in most real-world situations this would be the point at which we would start recursively going through the leftover packages and seeing what their dependencies are. In this particular case, we know from the previous article that libtalloc is self-contained, so we will only need to include sscg and libtalloc in the module.
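That narrowing step can also be scripted with comm. A small sketch, using the runtime package names from the repoquery output and the module contents described above (the file names are my own):

```shell
# Runtime dependencies of sscg (names only) vs. packages already
# provided by base-runtime and shared-userspace, per the text above.
printf '%s\n' glibc libpath_utils libtalloc openssl-libs popt | sort > deps.txt
printf '%s\n' gcc glibc openssl-libs openssl-devel popt popt-devel \
  libpath_utils libpath_utils-devel | sort > provided.txt
# comm -23 prints lines unique to the first file: the unsatisfied deps.
comm -23 deps.txt provided.txt
# prints: libtalloc
```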

As with the libtalloc example, we need to now clone the dist-git repositories of both packages and determine the git hash that we intend to use for building the sscg module. See the previous blog post for details on this.

Creating a module with internal dependencies

Now let’s set up our git repository for our new module:

mkdir sscg && cd sscg
touch sscg.yaml
git init
git add sscg.yaml
git commit -m "Initial setup of the module"

And then we’ll edit the sscg.yaml the same way we did for the libtalloc module:

document: modulemd
version: 1
data:
  summary: Simple SSL certificate generator
  description: A utility to aid in the creation of more secure "self-signed" certificates. The certificates created by this tool are generated in a way so as to create a CA certificate that can be safely imported into a client machine to trust the service certificate without needing to set up a full PKI environment and without exposing the machine to a risk of false signatures from the service certificate.
  stream: ''
  version: 0
  license:
    module:
    - GPLv3+
  references:
    community: https://github.com/sgallagher/sscg
    documentation: https://github.com/sgallagher/sscg/blob/master/README.md
    tracker: https://github.com/sgallagher/sscg/issues
  dependencies:
    buildrequires:
      base-runtime: f26
      shared-userspace: f26
      common-build-dependencies: f26
      perl: f26
    requires:
      base-runtime: f26
      shared-userspace: f26
  api:
    rpms:
    - sscg
  profiles:
    default:
    - sscg
  components:
    rpms:
      libtalloc:
        rationale: Provides a hierarchical memory allocator with destructors. Dependency of sscg.
        ref: f284a27d9aad2c16ba357aaebfd127e4f47e3eff
        buildorder: 0
      sscg:
        rationale: Purpose of this module. Provides certificate generation helpers.
        ref: d09681020cf3fd33caea33fef5a8139ec5515f7b
        buildorder: 1

There are several changes from the libtalloc example in this modulemd, so let’s go through them one at a time.

The first you may notice is the addition of perl in the buildrequires: dependencies. This is actually a workaround at the moment for a bug in the module-build-service where not all of the runtime requirements of the modules specified as buildrequires: are properly installed into the buildroot. It’s unfortunate, but it should be fixed in the near future and I will try to remember to update this blog post when it happens.

You may also notice that the api section only includes sscg and not the packages from the libtalloc component. This is intentional. For the purposes of this module, libtalloc satisfies some dependencies for sscg, but as the module owner I do not want to treat libtalloc as a feature of this module (and by extension, support its use for anything other than the portions of the library used by sscg). It remains possible for consumers of the module to link against it and use it for their own purposes, but they are doing so without any guarantee that the interfaces will remain stable or even be present on the next release of the module.

Next on the list is the addition of the entirely-new profiles section. Profiles are a way to indicate to the package manager (DNF) that some packages from this module should automatically be installed when the module is activated if a certain system profile is enabled. The ‘default’ profile will take effect if no other profile is explicitly set. So in this case, the expectation if a user did dnf module install sscg would be to activate this module and install the sscg package (along with its runtime dependencies) immediately.
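A module is not limited to a single profile. Purely as an illustration (this devel profile is hypothetical and not part of the actual sscg module), the same section could offer an alternate package set, following the same layout as the modulemd above:

```yaml
  profiles:
    default:
    - sscg
    devel:
    - sscg
    - libtalloc-devel
```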

Lastly, under the RPM components there is a new option, buildorder. This is used to inform the MBS that some packages are dependent upon others in the module when building. In our case, we need libtalloc to be built and added into the buildroot before we can build sscg or else the build will fail and we will be sad. By adding buildorder, we tell the MBS: it’s okay to build any of the packages with the same buildorder value concurrently, but we should not attempt to build anything with a higher buildorder value until all of those lower have completed. Once all packages in a buildorder level are complete, the MBS will generate a private buildroot repository for the next buildorder to use which includes these packages. If the buildorder value is left out of the modulemd file, it is treated as being buildorder: 0.
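To make the batching concrete, here is a tiny plain-shell illustration of my own (component names and buildorder values taken from the modulemd above) of how components group into sequential build batches:

```shell
# Group components by buildorder: equal values may build concurrently,
# while higher values wait for the previous batch to finish.
printf '%s %s\n' libtalloc 0 sscg 1 |
  sort -k2,2n |
  awk 'BEGIN { prev = "none" }
       { if ($2 != prev && NR > 1) printf "\n";
         if ($2 != prev) printf "buildorder %s:", $2;
         printf " %s", $1; prev = $2 }
       END { printf "\n" }'
# prints:
# buildorder 0: libtalloc
# buildorder 1: sscg
```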

At this point, you should be able to go ahead and commit this modulemd file to git and run mbs-build local successfully. Enjoy!



Quick Blog on Buildah.

Posted by Dan Walsh on June 30, 2017 12:39 PM
Buildah is a new tool that we released last week for building containers without requiring a container runtime daemon. #nodockerneeded

Here is a blog that talks about some of its features.

http://www.projectatomic.io/blog/2017/06/introducing-buildah/

Our main goal was to make this simple. I was asked by a fellow engineer about a feature that docker has for copying a file out of a container onto the host: "docker cp". In docker this ends up being a client-server operation, and required someone to code it up. We don't have this feature in buildah. :^(

BUT, buildah gives you the primitives you need to build this kind of functionality yourself, and lets you use the full power of bash. If I want to copy a file out of a container, I can simply mount the container and copy it out:

# mnt=$(buildah mount CONTAINER_ID)
# cp $mnt/PATHTOSRC /PATHTODEST
# buildah umount CONTAINER_ID


The beauty of this is that we can use lots of tools: I could use scp if I wanted to copy to another machine, or rsync, or ftp...

Once you have the container mounted up, you can use any bash command on it to move files in or out.

buildah == simplicity

Running Kubevirt functional tests in Gogland

Posted by Adam Young on June 30, 2017 12:42 AM

When tests fail, as they often will, the debugger can greatly shorten the time it takes to figure out why.  The Kubevirt functional tests run essentially as a remote client.  Getting a debuggable setup is not that different from my earlier post on running virt-launcher in the debugger.

I started by trying to run the unit tests like other tests, but had a similar problem to the virt-controller setup. Looking at cluster/run_tests.sh, what is not clear is that it is doing a directory-based run of the tests. Changing the config to look like this worked:

Note that I changed Kind to directory and edited the Directory field to point to where we have our unit tests. The program arguments field looks like this in full:

-master=http://192.168.200.2:8184 --kubeconfig=/home/ayoung/go/src/kubevirt.io/kubevirt/cluster/vagrant/.kubeconfig