Fedora Security Planet

Stealing from customers

Posted by Josh Bressers on May 29, 2017 09:59 PM
I was having some security conversations last week and cybersecurity insurance came up as a topic. This isn't overly unusual as it's a pretty popular topic, but someone said something that really got me thinking.
What if the insurance covered the customers instead of the companies?
Now I understand that many cybersecurity insurance policies can cover some amount of customer damage and loss, but fundamentally the coverage is for the company that is attacked; customers who have data stolen will maybe get a year of free credit monitoring or some other token service. That's all well and good, but I couldn't help thinking about this problem from another angle. Let's think about insurance in the context of shoplifting. For this thought exercise we're going to use a real store in our example, which won't be exactly correct, but the point is to think about the problem, not to get all the minor details right.

If you're in a busy store shopping and someone steals your wallet, it's generally accepted that the store is not at fault for this theft. Most would put some effort into helping you, but at the end of the day you're probably out of luck if you expect the store to repay you for anything you lost. They almost certainly won't have insurance to cover the theft of customer property in their store.

Now let's also imagine there are things taken from the store, actual merchandise gets stolen. This is called shoplifting. It has a special name and many stores even have special groups to help minimize this damage. They also have insurance to cover some of these losses. Most businesses see some shoplifting as a part of doing business. They account for some volume of this theft when doing their planning and profit calculations.

In the real world, I suspect customers being robbed while in a store isn't very common. If there is a store that gains a reputation for customers having wallets stolen, nobody will shop there. If you visit a store in a rough part of town they might even have a security guard at the door to help keep the riffraff out. This is because no shop wants to be known as a dangerous place. You can't exist as a store with that sort of reputation. Customers need to feel safe.

In the virtual world, all that can be stolen is basically information. Sometimes that information can be equated to actual money; sometimes it's just details about a person. Some of it has little to no value, like a well-known email address. Some of it can have huge value, like a tax identifier that can be used to commit identity theft. It can be very, very difficult to know when information has been stolen, and the value of the information taken can vary widely. We also seem to place very little value on our information. Many people will trade it away online for a trinket worth a fraction of what they just supplied.

Now let's think about insurance. Just like loss prevention insurance, cybersecurity insurance isn't there to protect customers. It exists to help protect the company from the losses of an attack. If customer data is stolen, the customers are not really covered; in many instances there's nothing a customer can do. It could be impossible to prove your information was stolen, and even if it gets used somewhere else, can you prove it came from the business in question?

After spending some time on the question of what would happen if insurance covered the customers, I realized how hard this problem is to deal with. If real-world customer theft isn't very common and is basically not covered, there's probably no hope for information. It's so hard to prove things beyond a reasonable doubt, and many of our laws require actual harm to happen before any action can be taken. Proving this harm is very, very difficult. We're almost certainly going to need new laws to deal with these situations.

Merging Kubernetes client configs at run time

Posted by Adam Young on May 26, 2017 03:20 PM

Last time I walked through the process of merging two sets of Kubernetes client configurations into one. For more ephemeral data, you might not want to munge it all into your main configuration. The KUBECONFIG environment variable lets you specify multiple configuration files and merge them into a single set of configuration data.

From

kubectl config --help

If $KUBECONFIG environment variable is set, then it is used [as] a list of paths (normal path delimiting rules for your system). These paths are merged. When a value is modified, it is modified in the file that defines the stanza. When a value is created, it is created in the first file that exists. If no files in the chain exist, then it creates the last file in the list.
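Before moving on, here is a quick sketch of those write rules in action (the file names here are hypothetical):

$ export KUBECONFIG=$HOME/config1:$HOME/config2
# a context defined in config2 is modified in config2:
$ kubectl config set-context some-context --namespace=default
# a brand-new entry is created in config1, the first file that exists:
$ kubectl config set-cluster new-cluster --server=https://example.com:6443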


So, let's start with the file downloaded by the kubevirt build system yesterday.


[ayoung@ayoung541 vagrant]$ echo $PWD
/home/ayoung/go/src/kubevirt.io/kubevirt/cluster/vagrant
[ayoung@ayoung541 vagrant]$ export KUBECONFIG=$PWD/.kubeconfig
[ayoung@ayoung541 vagrant]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* kubernetes-admin@kubernetes kubernetes kubernetes-admin 

Contrast this with what I get without the environment variable set, if I use the configuration in ~/.kube, which I synced over from my OpenShift cluster:

[ayoung@ayoung541 vagrant]$ unset KUBECONFIG
[ayoung@ayoung541 vagrant]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
 default/munchlax:8443/ayoung munchlax:8443 ayoung/munchlax:8443 default
* default/munchlax:8443/system:admin munchlax:8443 system:admin/munchlax:8443 default
 kube-system/munchlax:8443/system:admin munchlax:8443 system:admin/munchlax:8443 kube-system

I want to create a new configuration for the vagrant managed machines for Kubevirt.  It turns out that the API server specified there is actually a proxy, a short term shim we put in place as we anxiously await the Amalgamated API Server of 1.7.  However, sometimes this proxy is broken or we just need to bypass it.  The only difference between this setup and the proxied setup is the server URL.

So…I create a new file, based on the .kubeconfig file, but munged slightly.  Here is the diff:

[ayoung@ayoung541 vagrant]$ diff -Nurd .kubeconfig .kubeconfig-core 
--- .kubeconfig 2017-05-24 19:49:24.643158731 -0400
+++ .kubeconfig-core 2017-05-26 11:10:49.359955538 -0400
@@ -3,13 +3,13 @@
 - cluster:
 certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFM01EVXlOREl4TWpnek5sb1hEVEkzTURVeU1qSXhNamd6Tmxvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTmRmCnVINkE3Q1JVRVQ5VzhpSGgyam9EQUNxVGZXQ01ITFN3dzc5Q01DZXlyWFhZazVvR0lIbnZkeVB5aEVNZ2xYYysKczUwZVJZRDBkWTYrYUlnVmtVaElIVitSUHltVHE0WklrS3EzNnV5MXk0TFYzSDNTaGt2eVZBbitjL3htYldaZQp5bEZaZHhSMTFoVjRac0h4WXdzWTR4bmVoaWpkMnkwWUFaQnkwellkQm5xTmE4cFpDb3BNbStLdmtjVEJ1UERGCkp5ZWkzU0tJd3R1R0gxU3ByUCsxdi9OSGFCOTNXR0g0MFQxbm1HZTRGWWQ2SzErcWNNdndpdmY1dVQ4Nk10M2YKVWhEQWZNUlk3aW5maXVsVW1HeUNPWlNsbFhpWlRMWmpoOGZiUW1FdmZvOFJjMm1lOGtwTXJpMDdIWUQ4ZjZFNQpScjNhT05mcTkwd2s1VDM5YWxjQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFNTll1R3N1bGNpY3REQ0pFZ3R3K2ZQSUU3S04KQnRvV2RuZWZZdktya1l1WUVTRkk5alFXTFNmVGN1MnpibzNRWnYzZDg3WnkvNjYyK2R0SWloWFA5V3NJWVhHUApDUXNuTUMyZXY5djlmOU9WbVhZbEhuUUx0YXRiSDZVTFZPdWJZUXlFRlRSa21XV1dwcXpoR1pNWk1pbG8wRzhLCnBNd29Ia0dDWm5tUytyUVVEVWF6QlprcVdzRFNabW5jWUhtdFRtMEJ6RUJpa002SEFsNzAvT21rNGpHcmtHZEQKS2tMWU16UjJkZnlkSklCVGxKdGlGYjRhZ3R5amlFb3NDSGY0Z1oyY0xUMTRyOENud0QrOWxSbVk3dDNDRjIrdgpFOGxxb3RSYVI2TVRyWnZkUXUrOWtFYnNKWVZUN1NQR3pqeEpMZ1BmTGprK0g1YUJWQU9od0tvdTV0QT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
 server: https://192.168.200.2:6443
- name: kubernetes
+ name: core
 contexts:
 - context:
- cluster: kubernetes
+ cluster: core
 user: kubernetes-admin
- name: kubernetes-admin@kubernetes
-current-context: kubernetes-admin@kubernetes
+ name: kubernetes-admin@core
+current-context: kubernetes-admin@core
 kind: Config
 preferences: {}
 users:

Now I have a couple choices. I can just specify this second config file on the command line:

[ayoung@ayoung541 vagrant]$ kubectl --kubeconfig=$PWD/.kubeconfig-core config get-contexts
 CURRENT NAME CLUSTER AUTHINFO NAMESPACE
 kubernetes-admin@core core kubernetes-admin

Or I can munge the two together and provide a flag which states which context to use.

[ayoung@ayoung541 vagrant]$ export KUBECONFIG=$PWD/.kubeconfig:$PWD/.kubeconfig-core
[ayoung@ayoung541 vagrant]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* kubernetes-admin@kubernetes kubernetes kubernetes-admin 
 kubernetes-admin@core core kubernetes-admin

Note that this gives a different current context (with the asterisk) than if I reverse the order of the files in the env-var:

[ayoung@ayoung541 vagrant]$ export KUBECONFIG=$PWD/.kubeconfig-core:$PWD/.kubeconfig
[ayoung@ayoung541 vagrant]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* kubernetes-admin@core core kubernetes-admin 
 kubernetes-admin@kubernetes kubernetes kubernetes-admin

Whichever file declares the default context first wins.

However, regardless of the order, I can explicitly set the context I want to use on the command line:

[ayoung@ayoung541 vagrant]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
 kubernetes-admin@core core kubernetes-admin 
* kubernetes-admin@kubernetes kubernetes kubernetes-admin 
[ayoung@ayoung541 vagrant]$ kubectl --context=kubernetes-admin@core config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* kubernetes-admin@kubernetes kubernetes kubernetes-admin 
 kubernetes-admin@core core kubernetes-admin

Again, notice the line where the asterisk specifies which context is in use.

With only two files, it might be easier to just specify the --kubeconfig option, but as the number of configs you work with grows, you might find you want to share the user data between two of them, or have a bunch of scripts that work across them, and it is easier to track which context to use than to track which file contains which set of data.
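One last convenience: rather than passing --context on every call, kubectl can flip the default in place. A sketch against the merged setup above (exact output may vary):

$ export KUBECONFIG=$PWD/.kubeconfig:$PWD/.kubeconfig-core
$ kubectl config use-context kubernetes-admin@core
$ kubectl config current-context
kubernetes-admin@core

Note that use-context writes the change back into one of the files, per the merge rules quoted earlier.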

Merging two Kubernetes client configurations

Posted by Adam Young on May 25, 2017 03:22 PM

I have two distinct Kubernetes clusters I work with on a daily basis. One is a local vagrant-based set of VMs built by the Kubevirt code base. The other is a “baremetal” install of OpenShift Origin on a pair of Fedora workstations in my office. I want to be able to switch back and forth between them.

When you run the kubectl command without specifying where the application should look for the configuration file, it defaults to looking in $HOME/.kube/config. This file maintains the configuration values for a handful of object types. Here is an abbreviated look at the one set up by origin.

apiVersion: v1
clusters:
- cluster:
    api-version: v1
    certificate-authority-data: LS0...LQo=
    server: https://munchlax:8443
  name: munchlax:8443
contexts:
- context:
    cluster: munchlax:8443
    namespace: default
    user: system:admin/munchlax:8443
  name: default/munchlax:8443/system:admin
- context:
    cluster: munchlax:8443
    namespace: kube-system
    user: system:admin/munchlax:8443
  name: kube-system/munchlax:8443/system:admin
current-context: kube-system/munchlax:8443/system:admin
kind: Config
preferences: {}
users:
- name: system:admin/munchlax:8443
  user:
    client-certificate-data: LS0...tLS0K
    client-key-data: LS0...LS0tCg==

Note that I have elided the very long cryptographic entries for certificate-authority-data, client-certificate-data, and client-key-data.

First up is an array of clusters.  The minimal configuration for each provides a server entry, which is the remote URL to use, some set of certificate authority data, and a name by which this cluster is referenced elsewhere in this file.

At the bottom of the file, we see a chunk of data for user identification.  Again, the user has a local name,

 system:admin/munchlax:8443

with the rest of the identifying information hidden away inside the client certificate.

These two entities are pulled together in a context entry. In addition, a context entry has a namespace field. Again, we have an array, with each entry containing a name field. The name of the context object is what the current-context field refers to, and that is where kubectl starts its own configuration.   Here is an object diagram.

The next time I run kubectl, it will read this file.

  1. Based on the value of CurrentContext, it will see it should use the kube-system/munchlax:8443/system:admin context.
  2. From that context, it will see it should use
    1. the system:admin/munchlax:8443 user,
    2. the kube-system namespace, and
    3. the URL https://munchlax:8443 from the munchlax:8443 server.
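A quick way to verify that resolution is kubectl config view --minify, which trims the output down to just the entries the current context references:

$ kubectl config view --minify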

Below is a similar file from the kubevirt set up, found on my machine at the path ~/go/src/kubevirt.io/kubevirt/cluster/vagrant/.kubeconfig

apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: LS0...LS0tLQo=
    server: https://192.168.200.2:6443
  name: kubernetes
contexts:
- context:
    cluster: kubernetes
    user: kubernetes-admin
  name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
kind: Config
preferences: {}
users:
- name: kubernetes-admin
  user:
    client-certificate-data: LS0...LS0tLQo=
    client-key-data: LS0...LS0tCg==

Again, I’ve elided the long cryptographic data.  This file is organized the same way as the default one.  kubevirt uses it via a shell script that resolves to the following command line:

${KUBEVIRT_PATH}cluster/vagrant/.kubectl --kubeconfig=${KUBEVIRT_PATH}cluster/vagrant/.kubeconfig "$@"

which overrides the default configuration location.  What if I don’t want to use the shell script?  I’ve manually merged the two files into a single ~/.kube/config.  The resulting one has two users,

  • system:admin/munchlax:8443
  • kubernetes-admin

two clusters,

  • munchlax:8443
  • kubernetes

and three contexts.

  • default/munchlax:8443/system:admin
  • kube-system/munchlax:8443/system:admin
  • kubernetes-admin@kubernetes

With current-context: kubernetes-admin@kubernetes:

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
haproxy-686891680-k4fxp 1/1 Running 0 15h
iscsi-demo-target-tgtd-2918391489-4wxv0 1/1 Running 0 15h
kubevirt-cockpit-demo-1842943600-3fcf9 1/1 Running 0 15h
libvirt-199kq 2/2 Running 0 15h
libvirt-zj6vw 2/2 Running 0 15h
spice-proxy-2868258710-l85g2 1/1 Running 0 15h
virt-api-3813486938-zpd8f 1/1 Running 0 15h
virt-controller-1975339297-2z6lc 1/1 Running 0 15h
virt-handler-2s2kh 1/1 Running 0 15h
virt-handler-9vvk1 1/1 Running 0 15h
virt-manifest-322477288-g46l9 2/2 Running 0 15h

but with current-context: kube-system/munchlax:8443/system:admin

$ kubectl get pods
NAME READY STATUS RESTARTS AGE
tiller-deploy-3580499742-03pbx 1/1 Running 2 8d
youthful-wolverine-testme-4205106390-82gwk 0/1 CrashLoopBackOff 30 2h

There is support in the kubectl executable for inspecting and manipulating this configuration:

[ayoung@ayoung541 helm-charts]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
 kubernetes-admin@kubernetes kubernetes kubernetes-admin 
 default/munchlax:8443/system:admin munchlax:8443 system:admin/munchlax:8443 default
* kube-system/munchlax:8443/system:admin munchlax:8443 system:admin/munchlax:8443 kube-system
[ayoung@ayoung541 helm-charts]$ kubectl config current-context kubernetes-admin@kubernetes
kube-system/munchlax:8443/system:admin
[ayoung@ayoung541 helm-charts]$ kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
 default/munchlax:8443/system:admin munchlax:8443 system:admin/munchlax:8443 default
* kube-system/munchlax:8443/system:admin munchlax:8443 system:admin/munchlax:8443 kube-system
 kubernetes-admin@kubernetes kubernetes kubernetes-admin

The OpenShift login command can add additional configuration information.

$ oc login
Authentication required for https://munchlax:8443 (openshift)
Username: ayoung
Password: 
Login successful.

You have one project on this server: "default"

Using project "default".

This added the following information to my .kube/config

under contexts:

- context:
    cluster: munchlax:8443
    namespace: default
    user: ayoung/munchlax:8443
  name: default/munchlax:8443/ayoung

under users:

- name: ayoung/munchlax:8443
  user:
    token: 24i...o8_8

This time I elided the token.

It seems that it would be pretty easy to write a tool for merging two configuration files; in fact, kubectl itself gets most of the way there, as sketched after this list.  The caveats I can see include:

  • don’t duplicate entries
  • ensure that two entries with the same name but different values trigger an error
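Here is the sketch, combining files via KUBECONFIG and then writing out a single merged configuration with the certificate data inlined via config view --flatten. Note that the merge rules resolve name collisions by file order rather than raising an error, so the second caveat still needs checking by hand:

$ KUBECONFIG=~/.kube/config:~/go/src/kubevirt.io/kubevirt/cluster/vagrant/.kubeconfig kubectl config view --flatten > /tmp/merged
$ KUBECONFIG=/tmp/merged kubectl config get-contexts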

Getting started with helm on OpenShift

Posted by Adam Young on May 24, 2017 05:20 PM

After sitting in on a helm-based lab at the OpenStack summit, I decided I wanted to try it out for myself on my OpenShift cluster.

Since helm is not yet part of Fedora, I used the upstream binary distribution. Inside the tarball was, among other things, a standalone binary named helm, which I moved to ~/bin (which is in my path). Once I had that in place:

$ helm init
Creating /home/ayoung/.helm 
Creating /home/ayoung/.helm/repository 
Creating /home/ayoung/.helm/repository/cache 
Creating /home/ayoung/.helm/repository/local 
Creating /home/ayoung/.helm/plugins 
Creating /home/ayoung/.helm/starters 
Creating /home/ayoung/.helm/repository/repositories.yaml 
$HELM_HOME has been configured at /home/ayoung/.helm.

Tiller (the helm server side component) has been installed into your Kubernetes Cluster.
Happy Helming!

Checking on that Tiller install:

$ kubectl get pods --all-namespaces
NAMESPACE     NAME                             READY     STATUS    RESTARTS   AGE
default       docker-registry-2-z91cq          1/1       Running   0          23h
default       registry-console-1-g4qml         1/1       Running   0          1d
default       router-5-4w3zt                   1/1       Running   0          23h
kube-system   tiller-deploy-3210876050-8gx0w   1/1       Running   0          1m

But trying a helm command line operation fails.

$ helm list
Error: User "system:serviceaccount:kube-system:default" cannot list configmaps in project "kube-system"

This looks like an RBAC issue. I want to assign the role ‘admin’ to the user “system:serviceaccount:kube-system:tiller” on the project “kube-system”

$ oc project kube-system
Now using project "kube-system" on server "https://munchlax:8443".
[ansible@munchlax ~]$ oadm policy add-role-to-user admin system:serviceaccount:kube-system:tiller
role "admin" added: "system:serviceaccount:kube-system:tiller"
[ansible@munchlax ~]$ ./helm list
[ansible@munchlax ~]$

Now I can follow the steps outlined in the getting started guide:

[ansible@munchlax ~]$ ./helm create mychart
Creating mychart
[ansible@munchlax ~]$ rm -rf mychart/templates/
deployment.yaml  _helpers.tpl     ingress.yaml     NOTES.txt        service.yaml     
[ansible@munchlax ~]$ rm -rf mychart/templates/*.*
[ansible@munchlax ~]$ 
[ansible@munchlax ~]$ 
[ansible@munchlax ~]$ vi mychart/templates/configmap.yaml
[ansible@munchlax ~]$ ./helm install ./mychart
NAME:   esteemed-pike
LAST DEPLOYED: Wed May 24 11:46:52 2017
NAMESPACE: kube-system
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME               DATA  AGE
mychart-configmap  1     0s
[ansible@munchlax ~]$ ./helm get manifest esteemed-pike

---
# Source: mychart/templates/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mychart-configmap
data:
  myvalue: "Hello World"
[ansible@munchlax ~]$ ./helm delete esteemed-pike
release "esteemed-pike" deleted

Exploring OpenShift RBAC

Posted by Adam Young on May 24, 2017 03:27 PM

OK, since I did it wrong last time, I’m going to try creating a user in OpenShift, and granting that user permissions to do various things.

I’m going to start by removing the ~/.kube directory on my laptop and performing operations via SSH on the master node.  From my last session I can see I still have:

$ oc get users
NAME UID FULL NAME IDENTITIES
ayoung cca08f74-3a53-11e7-9754-1c666d8b0614 allow_all:ayoung
$ oc get identities
NAME IDP NAME IDP USER NAME USER NAME USER UID
allow_all:ayoung allow_all ayoung ayoung cca08f74-3a53-11e7-9754-1c666d8b0614

What openshift calls projects (perhaps taking the lead from Keystone?) Kubernetes calls namespaces:

$ oc get projects
NAME DISPLAY NAME STATUS
default Active
kube-system Active
logging Active
management-infra Active
openshift Active
openshift-infra Active
[ansible@munchlax ~]$ kubectl get namespaces
NAME STATUS AGE
default Active 18d
kube-system Active 18d
logging Active 7d
management-infra Active 10d
openshift Active 18d
openshift-infra Active 18d

According to the documentation here, I should be able to log in from my laptop, and all of the configuration files just get magically set up.  Let's see what happens:

$ oc login
Server [https://localhost:8443]: https://munchlax:8443 
The server uses a certificate signed by an unknown authority.
You can bypass the certificate check, but any data you send to the server could be intercepted by others.
Use insecure connections? (y/n): y

Authentication required for https://munchlax:8443 (openshift)
Username: ayoung
Password: 
Login successful.

You don't have any projects. You can try to create a new project, by running

oc new-project <projectname>

Welcome! See 'oc help' to get started.

Just to make sure I sent something, I typed in the password “test”, but it could have been anything.  The config file now has this:

$ cat ~/.kube
.kube/ .kube.bak/ 
[ayoung@ayoung541 ~]$ cat ~/.kube/config 
apiVersion: v1
clusters:
- cluster:
    insecure-skip-tls-verify: true
    server: https://munchlax:8443
  name: munchlax:8443
contexts:
- context:
    cluster: munchlax:8443
    user: ayoung/munchlax:8443
  name: /munchlax:8443/ayoung
current-context: /munchlax:8443/ayoung
kind: Config
preferences: {}
users:
- name: ayoung/munchlax:8443
  user:
    token: 4X2UAMEvy43sGgUXRAp5uU8KMyLyKiHupZg7IUp-M3Q

I’m going to resist the urge to look too closely into that token thing.
I’m going to work under the assumption that a user can be granted roles in several namespaces. Let's see:

 $ oc get namespaces
 Error from server (Forbidden): User "ayoung" cannot list all namespaces in the cluster

Not a surprise.  But the question I have now is “which namespace am I working with?”  Let me see if I can figure it out.

$ oc get pods
Error from server (Forbidden): User "ayoung" cannot list pods in project "default"

and via kubectl

$ kubectl get pods
Error from server (Forbidden): User "ayoung" cannot list pods in project "default"

What role do I need to be able to get pods?  Let's start by looking at the head node again:

[ansible@munchlax ~]$ oc get ClusterRoles | wc -l
64
[ansible@munchlax ~]$ oc get Roles | wc -l
No resources found.
0

This seems a bit strange. ClusterRoles are not limited to a namespace, whereas Roles are. Why am I not seeing any roles defined?

Let's start with figuring out who can list pods:

oadm policy who-can GET pods
Namespace: default
Verb:      GET
Resource:  pods

Users:  system:admin
        system:serviceaccount:default:deployer
        system:serviceaccount:default:router
        system:serviceaccount:management-infra:management-admin
        system:serviceaccount:openshift-infra:build-controller
        system:serviceaccount:openshift-infra:deployment-controller
        system:serviceaccount:openshift-infra:deploymentconfig-controller
        system:serviceaccount:openshift-infra:endpoint-controller
        system:serviceaccount:openshift-infra:namespace-controller
        system:serviceaccount:openshift-infra:pet-set-controller
        system:serviceaccount:openshift-infra:pv-binder-controller
        system:serviceaccount:openshift-infra:pv-recycler-controller
        system:serviceaccount:openshift-infra:statefulset-controller

Groups: system:cluster-admins
        system:cluster-readers
        system:masters
        system:nodes

And why is this? What roles are permitted to list pods?

$ oc get rolebindings
NAME                   ROLE                    USERS     GROUPS                           SERVICE ACCOUNTS     SUBJECTS
system:deployer        /system:deployer                                                   deployer, deployer   
system:image-builder   /system:image-builder                                              builder, builder     
system:image-puller    /system:image-puller              system:serviceaccounts:default                        

I don’t see anything that explains why admin would be able to list pods there. And the list is a bit thin.

Another page advises I try the command

oc describe  clusterPolicy

But the output of that is voluminous. With a little trial and error, I discover I can do the same thing using the kubectl command and get the output in JSON, which lets me inspect it. Here is a fragment of the output.

         "roles": [
                {
                    "name": "admin",
                    "role": {
                        "metadata": {
                            "creationTimestamp": "2017-05-05T02:24:17Z",
                            "name": "admin",
                            "resourceVersion": "24",
                            "uid": "f063233e-3139-11e7-8169-1c666d8b0614"
                        },
                        "rules": [
                            {
                                "apiGroups": [
                                    ""
                                ],
                                "attributeRestrictions": null,
                                "resources": [
                                    "pods",
                                    "pods/attach",
                                    "pods/exec",
                                    "pods/portforward",
                                    "pods/proxy"
                                ],
                                "verbs": [
                                    "create",
                                    "delete",
                                    "deletecollection",
                                    "get",
                                    "list",
                                    "patch",
                                    "update",
                                    "watch"
                                ]
                            },

There are many more rules, but this one shows what I want: there is a policy role named “admin” that has a rule that provides access to the pods via the list verbs, among others.
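For anyone reproducing this, the JSON came from a query along these lines; the resource name default is my assumption from the docs and may differ by version:

$ kubectl get clusterpolicy default -o json | less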

Let's see if I can make my ayoung account into a cluster-reader by adding the role to the user directly.

On the master

$ oadm policy add-role-to-user cluster-reader ayoung
role "cluster-reader" added: "ayoung"

On my laptop

$ kubectl get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-2-z91cq    1/1       Running   3          8d
registry-console-1-g4qml   1/1       Running   3          8d
router-5-4w3zt             1/1       Running   3          8d

Back on master, we see that:

$  oadm policy who-can list pods
Namespace: default
Verb:      list
Resource:  pods

Users:  ayoung
        system:admin
        system:serviceaccount:default:deployer
        system:serviceaccount:default:router
        system:serviceaccount:management-infra:management-admin
        system:serviceaccount:openshift-infra:build-controller
        system:serviceaccount:openshift-infra:daemonset-controller
        system:serviceaccount:openshift-infra:deployment-controller
        system:serviceaccount:openshift-infra:deploymentconfig-controller
        system:serviceaccount:openshift-infra:endpoint-controller
        system:serviceaccount:openshift-infra:gc-controller
        system:serviceaccount:openshift-infra:hpa-controller
        system:serviceaccount:openshift-infra:job-controller
        system:serviceaccount:openshift-infra:namespace-controller
        system:serviceaccount:openshift-infra:pet-set-controller
        system:serviceaccount:openshift-infra:pv-attach-detach-controller
        system:serviceaccount:openshift-infra:pv-binder-controller
        system:serviceaccount:openshift-infra:pv-recycler-controller
        system:serviceaccount:openshift-infra:replicaset-controller
        system:serviceaccount:openshift-infra:replication-controller
        system:serviceaccount:openshift-infra:statefulset-controller

Groups: system:cluster-admins
        system:cluster-readers
        system:masters
        system:nodes

And now to remove the role:
On the master

$ oadm policy remove-role-from-user cluster-reader ayoung
role "cluster-reader" removed: "ayoung"

On my laptop

$ kubectl get pods
Error from server (Forbidden): User "ayoung" cannot list pods in project "default"

Fixing Bug 968696

Posted by Adam Young on May 23, 2017 03:47 AM

Bug 968696

The word Admin is used all over the place. To administer was originally something servants did to their masters. In one of the greater inversions of linguistic history, we now use Admin as a way to indicate authority. In OpenStack, the admin role is used for almost all operations that are reserved for someone with a higher level of authority. These actions are not expected to be performed by people with the plebeian Member role.


Global versus Scoped

We have some objects that are global, and some that are scoped to projects. Global objects are typically things used to run the cloud, such as the set of hypervisor machines that Nova knows about. Everyday members are not allowed to “Enable Scheduling For A Compute Service” via the HTTP call PUT /os-services/enable.

Keystone does not have a way to do global roles. All roles are scoped to a project. This by itself is not a problem. The problem is that a resource like a hypervisor does not have a project associated with it. If keystone can only hand out tokens scoped to projects, there is still no way to match the scoped token to the unscoped resource.

So, what Nova and many other services do is just look for the Role. And thus our bug. How do we go about fixing this?

Use cases

Let me see if I can show this.

In our initial state, we have two users.  Annie is the cloud admin, responsible for maintaining the overall infrastructure, with tasks such as “Enable Scheduling For A Compute Service”.  Pablo is a project manager. As such, he has to do admin-level things, but only within his project, such as setting the metadata used for servers inside that project.  Both operations are currently protected by the “admin” role.

Role Assignments

Let's look at the role assignment object diagram.  For this discussion, we are going to assume everything is inside a domain called “Default”, which I will leave out of the diagrams to simplify them.

In both cases, our users are explicitly assigned roles on a project: Annie has the Admin role on the Infra project, and Pablo has the Admin role on the Devel project.

Policy

The API call to Add Hypervisor only checks the role on the token, and enforces that it must be “Admin.”  Thus, both Pablo and Annie’s scoped tokens will pass the policy check for the Add Hypervisor call.

How do we fix this?

Scope everything

Let's assume, for the moment, that we were able to instantly run a migration that added a project_id to every database table that holds a resource, and to every API that manages those resources.  What would we use to populate that project_id?  What value would we give it?

Let's say we add an admin project value to Keystone.  When a new admin-level resource is made, it gets assigned to this admin project.  All of the resources we already have should get this value, too. How would we communicate this project ID?  We don't have a Keystone instance available when running the Nova database migrations.

Turns out Nova does not need to know the actual project_id.  Nova just needs to know that Keystone considers the token valid for global resources.

Admin Projects

We’ve added a couple of values to the Keystone configuration file: admin_domain_name and admin_project_name.  These two values are how Keystone specifies which project represents the admin project.  When these two values are set, all token validation responses contain a value for is_admin_project.  If the project requested matches the domain and project name, that value is True; otherwise it is False.
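As a sketch, the configuration stanza looks something like the following; in recent Keystone releases these options live in the [resource] section under the names shown here, but verify against your version:

[resource]
admin_project_domain_name = Default
admin_project_name = admin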

is_admin_project

Instead, we want the create_cell call to use a different rule.  Instead of the scope check performed by admin_or_owner, it should confirm the admin role, as it did before, and also that the token has the is_admin_project flag set.
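In policy-file terms, that amounts to a rule along these lines; a hedged sketch in policy.json syntax, with the rule and API names illustrative rather than copied from Nova:

{
    "context_is_admin": "role:admin and is_admin_project:True",
    "os_compute_api:os-cells:create": "rule:context_is_admin"
}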

Transition

Keystone already has support for setting is_admin_project, but none of the remote services are honoring it yet. Why?  In part because, in order for it to make sense for one to do so, they all must do so.  But also because we cannot predict what project would be the admin project.

If we select a project based on name (e.g. Admin) we might be selecting a project that does not exist.

If we force that project to exist, we still do not know what users to assign to it.  We would have effectively broken their cloud, as no users could execute Global admin level tasks.

In the long run, the trick is to provide a transition plan for when the configuration options are unset.

The Hack

If no admin project is set, then every project is an admin project.  This is enforced by oslo-context, which is used in policy enforcement.

Yeah, that seems surprising, but it turns out that we have just codified what every deployment already has.  Look at the bug description again:

Problem: Granting a user an “admin” role on ANY tenant grants them unlimited “admin”-ness throughout the system because there is no differentiation between a scoped “admin”-ness and a global “admin”-ness.

Adding in the field is a necessary precursor to solving it, but the real problem is in the enforcement in Nova, Glance, and Cinder.  Until they enforce on the flag, the bug still exists.

Fixing things

There is a phased plan to fix things.

  1. Implement the is_admin_project mechanism in Keystone, but leave it disabled by default.
  2. Add is_admin_project enforcement in the policy file for all of the services
  3. Enable an actual admin_project in devstack and Tempest
  4. After a few releases, when we are sure that people are using admin_project, remove the hack from oslo-context.

This plan was discussed and agreed upon by the policy team within Keystone, and vetted by several of the developers in the other projects, but it seems it was never fully disseminated, and thus the patches have sat in a barely reviewed state for a long while…over half a year.  Meanwhile, the developers focused on this have shifted tasks.

Now’s The Time

We’ve got a renewed effort, and some new, energetic developers committed to making this happen.  The changes have been rewritten with advice from earlier code reviews and resubmitted.  This bug has been around for a long time: Bug #968696 was reported by Gabriel Hurley on 2012-03-29.  It's been a hard task to come up with and execute a plan to solve it.  If you are a core project reviewer, please look for the reviews for your project, or, even better, talk with us on IRC (Freenode #openstack-keystone) and help us figure out how to best adjust the default policy for your service.


You know how to fix enterprise patching? Please tell me more!!!

Posted by Josh Bressers on May 22, 2017 12:54 AM
If you pay attention to Twitter at all, you've probably seen people arguing about patching your enterprise after the WannaCry malware. The short story is that Microsoft fixed a very serious security flaw a few months before the malware hit. That means there are quite a few machines on the Internet that haven't applied a critical security update. Of course, as you can imagine, there is plenty of back and forth about updates. There are two basic arguments I keep seeing.

Patching is hard and if you think I can just turn on windows update for all these computers running Windows 3.11 on token ring you've never had to deal with a real enterprise before! You out of touch hipsters don't know what it's really like here. We've seen things, like, real things. We party like it's 1995. GET OFF MY LAWN.

The other side sounds a bit like this.

How can you be running anything that's less than a few hours old? Don't you know what the Internet looks like! If everyone just applied all updates immediately and ran their business in the cloud using agile scrum based SecDevSecOps serverless development practices everything would be fine!

Of course both of these groups are wrong, for basically the same reason. The world isn't simple, and whatever works for you won't work for anyone else. The tie that binds us all together is that everything is broken, all the time. All the things we use are broken, how we use them is broken, and how we manage them is broken. We can't fix them, even though we try, and sometimes we pretend we can.

However ...

Just because everything is broken, that's no excuse to do nothing. It's easy to declare something too hard and give up. A lot of enterprises do this; a lot of enterprise security people are using this defense to explain why they can't update their infrastructure. On the other side though, sometimes moving too fast is more dangerous than moving too slow. Reckless updates are no better than no updates. Sometimes there is nothing we can do. Security as an industry is basically a big giant Kobayashi Maru test.

I have no advice to give on how to fix this problem. I think both groups are silly and wrong, but why I think this is unimportant. The right way is for everyone to have civil conversations where we put ourselves in the other person's shoes. That won't happen though; it never happens, even though basically every leader ever has said that sort of behavior is a good idea. I suggest you double down on whatever bad practices you've hitched your horse to. In the next few months we'll all have an opportunity to show why our way of doing things is the worst way ever, and we'll also find an opportunity to mock someone else for not doing things the way we do.

In this game there are no winners and losers, just you. And you've already lost.

Episode 48 - Machine Learning: Not actually magic

Posted by Open Source Security Podcast on May 21, 2017 07:53 PM
Josh and Kurt have a guest! Mike Paquette from Elastic discusses the fundamentals and basics of Machine Learning. We also discuss how ML could have helped with WannaCry.

Download Episode

Show Notes



Unix Sockets For Auth

Posted by Robbie Harwood on May 21, 2017 04:00 AM

Let's not talk about the Pam/NSS stack and instead talk about a different weird auth thing on Linux.

So sockets aren't just for communication over the network. And by that I don't mean that one can talk to local processes on the same machine by connecting to localhost (which is correct, but goes over the "lo" network), but rather something designed for this purpose only: Unix domain sockets. Because they're restricted to local use only, their features can take advantage of both ends being managed by the same kernel.

I'm not interested in performance effects (and I doubt there are any worth writing home about), but rather what the security implications are. So of particular interest is SO_PEERCRED. With the receiving end of an AF_UNIX stream socket, if you ask getsockopt(2) nicely, it will give you back assurances about the connecting end of the socket in the form of a struct ucred. When _GNU_SOURCE is defined, this will contain pid, uid, and gid of the process on the other end.

It's worth noting that these are captured at the time of the connect(2) syscall, which is to say the process on the other end may have since changed them, by dropping privileges for instance. This isn't really a problem, though, in that it can't be exploited to gain a higher level of access, since the connector already had that access at connect time.

Anyway, the uid information is clearly useful; one can imagine filtering such that a connection came from apache, for instance (or not from apache, for that matter), or keeping per-user settings, or any number of things. The gid is less clearly useful, but I can immediately see uses in terms of policy setting, perhaps. But what about the pid?

Linux has a relative of plan9's procfs, which means there's a lot of information presented in /proc. (/proc can be locked down pretty hard by admins, but let's assume it's not.) proc(5) covers more of these than I will, but there are some really neat ones. Within /proc/[pid], the interesting ones for my purposes are:

  • cmdline shows the process's argv.

  • cwd shows the current working directory of the process.

  • environ similarly shows the process's environment.

  • exe is a symlink to the executable for the process.

  • root is a symlink to the process's root directory, which means we can tell whether it's in a chroot.

So it seems like we could use this to implement filtering by the process being run: for instance, we could do things only if the executable is /usr/bin/ssh. And indeed we can; /proc/[pid]/exe will be a symlink to the ssh binary, and everything works out.
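A quick shell illustration, poking at the current shell's pid ($$); the output shown is illustrative:

$ readlink /proc/$$/exe
/usr/bin/bash
$ tr '\0' ' ' < /proc/$$/cmdline; echo
-bash
$ readlink /proc/$$/cwd
/home/user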

There's a slight snag, though: /usr/bin/ssh is a native executable (in this case, an ELF file). But we can also run non-native executables using the shebang - e.g., #!/bin/sh, or #!/usr/bin/python2, and so on. While this is convenient for scripting, it makes the /proc/[pid]/exe value much less useful, since it will just point at the interpreter.

The way the shebang is implemented causes the interpreter to be run with argv[1] set to the input file. So we can pull it out of /proc/[pid]/cmdline and everything is fine, right?

Well, no. Linux doesn't canonicalize the path to the script file, so unless it was originally invoked using a non-relative path, we don't have that information.

Maybe we can do the resolution ourselves, though. We have the process environment, so $PATH-based resolution should be doable, right? And if it's a relative path, we can use /proc/[pid]/cwd, right?

Nope. Although inspecting the behavior of shells would suggest that /proc/[pid]/cwd doesn't change, this is a shell implementation detail; the program can just modify this value if it wants.

Even if we nix relative paths, we're still not out of the woods. /proc/[pid]/environ looks like exactly what we want, as the man page specifies that even getenv(3)/setenv(3) do not modify it. However, the next paragraph indicates the syscall needed to just move what region of memory it points to, so we can't trust that value either.

There's actually a bigger problem, though. Predictably, from the way the last two went, processes can just modify argv. So: native code only.

Anyway, thanks for reading this post about a piece of gssproxy's guts. Surprise!

OpenShift Origin Default Auth

Posted by Adam Young on May 16, 2017 04:37 PM

Once I got the Ansible playbook to run, I was able to poke at the openshift setup.

The install creates a default configuration in the Ansible user's home directory on the master node.

I can use the openshift client:

ssh ansible@munchlax oc get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-2-z91cq    1/1       Running   0          18h
registry-console-1-g4qml   1/1       Running   0          20h
router-5-4w3zt             1/1       Running   0          18h
ssh ansible@munchlax oc create user ayoung

Or even the kubectl executable:

$ ssh ansible@munchlax kubectl get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-2-z91cq    1/1       Running   0          18h
registry-console-1-g4qml   1/1       Running   0          20h
router-5-4w3zt             1/1       Running   0          18h

If I want to pull this over to my home machine, I can use rsync:

rsync -a  ansible@munchlax:.kube ~/
[ayoung@ayoung541 kubevirt-ansible]$ ls ~/.kube/
cache  config  munchlax_8443  schema
[ayoung@ayoung541 kubevirt-ansible]$ kubectl get pods
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-2-z91cq    1/1       Running   0          18h
registry-console-1-g4qml   1/1       Running   0          20h
router-5-4w3zt             1/1       Running   0          18h

Although the advice I got from sdodson in IRC sounds solid:

ansible_user on the first master should have admin’s kubeconfig in ~/.kube/config The intention is that you use that to provision additional admins/users and grant them required permissions. Then they can use `oc` or the web console using whatever credentials you’ve created for them.

I can use the WebUI by requesting the following URL from the Browser.

https://munchlax:8443/console/

Assuming I bypass the certificate warnings, I can see the login screen. Since the admin user is secured with a client cert, and the UI supports password login, I'll create a user to mirror my account and log in that way, following the instructions here:

[ayoung@ayoung541 kubevirt-ansible]$ oc create user ayoung
user "ayoung" created
[ayoung@ayoung541 kubevirt-ansible]$ oc get user ayoung
NAME      UID                                    FULL NAME   IDENTITIES
ayoung    cca08f74-3a53-11e7-9754-1c666d8b0614               
[ayoung@ayoung541 kubevirt-ansible]$ oc get identities
No resources found.
[ayoung@ayoung541 kubevirt-ansible]$ oc get identity
No resources found.

Hmmm, no Identity providers seem to be configured. I see I can override this via the Ansible inventory file if I rerun.

I can see the current configuration in

sudo cat /etc/origin/master/master-config.yaml

Which has this line in it…

      kind: AllowAllPasswordIdentityProvider

perhaps my new user will work?

From the login screen, using a password of ‘test’ (which I have not set anywhere), I get logged in and see the “new project” screen.

This works for development, but I need something more serious for a live deployment in the future.
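For the record, the usual next step is to swap AllowAllPasswordIdentityProvider for something like an htpasswd-backed provider in master-config.yaml; a sketch based on the Origin docs, with the htpasswd file path assumed:

oauthConfig:
  identityProviders:
  - name: htpasswd_auth
    challenge: true
    login: true
    mappingMethod: claim
    provider:
      apiVersion: v1
      kind: HTPasswdPasswordIdentityProvider
      file: /etc/origin/master/htpasswd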

Installing OpenShift Origin via Ansible on Fedora 25

Posted by Adam Young on May 15, 2017 11:18 PM

While many people referred me to one of the virtualized setups of OpenShift, I wanted something on baremetal in order to eventually test out KubeVirt.  Just running

oc cluster up

as some people suggested did not work, as it assumes prerequisites are properly set up;  the docker registry was one that I tripped over.  So, I decided to give openshift-ansible a test run.  Here are my notes.

SSH and Ansible have been set up and used for upstream Kubernetes testing on this machine.  Kubernetes has been removed.  There might be artifacts left behind, or steps not explicitly listed here.

There is no ~/.kube directory, which I know has messed me up elsewhere in the past.

git clone https://github.com/openshift/openshift-ansible

I have two nodes for the cluster. My head node is munchlax, and dialga is the compute node.

sudo yum install python3 --best --allowerasing
sudo yum install python3-yaml

I created a local file for inventory that looks like this:

[all]
munchlax
dialga

[all:vars]
ansible_ssh_user=ansible
containerized=true
openshift_deployment_type=origin
ansible_python_interpreter=/usr/bin/python3
openshift_release=v1.5
openshift_image_tag=v1.5.0

[masters]
munchlax

[masters:vars]
ansible_become=true

[nodes]
dialga openshift_node_labels="{'region': 'infra'}"

[nodes:vars]
ansible_become=true


Note that, while it might seem silly to specify

ansible_become=true

for each of the groups instead of under all, specifying it for all:vars will break the deployment as it then overrides local commands and performs them via sudo, and those should not be done as root.

I’m still working on getting the version values right, but these seemed to work, with a couple of workarounds.  I’ve posted a diff at the end.

The value openshift_node_labels="{'region': 'infra'}" is used to specify where the registry is installed.

To run the install, I ran:

ansible-playbook -vvvi /home/ayoung/devel/local-openshift-ansible/inventory.ini /home/ayoung/devel/openshift-ansible/playbooks/byo/config.yml

To test the cluster:

ssh ansible@munchlax

[ansible@munchlax ~]$ kubectl get pods


NAME READY STATUS RESTARTS AGE
docker-registry-1-deploy 0/1 Pending 0 31m
registry-console-1-g4qml 1/1 Running 0 31m
router-4-deploy 0/1 Pending 0 32m

Update: I also needed one commit from a pull request:

commit 75da091c3e917dc3cd673d4fd201c1b2606132f2
Author: Jeff Peeler <jpeeler>
Date:   Fri May 12 18:51:26 2017 -0400

    Fix python3 error in repoquery
    
    Explicitly convert from bytes to string so that splitting the string is
    successful. This change works with python 2 as well.
    
    Closes #4182

Here are the changes from master I had to make by hand:

  1. The certificate allocation used the unsupported flag --expire-days, which I removed.
  2. The Ansible sysctl module has a known issue for Python 3.  I converted it to running the CLI.
  3. The version check between the container and RPM versions was too strict and unpassable on my system.  Commented it out.
diff --git a/roles/openshift_hosted/tasks/registry/secure.yml b/roles/openshift_hosted/tasks/registry/secure.yml
index 29c164f..5134fdd 100644
--- a/roles/openshift_hosted/tasks/registry/secure.yml
+++ b/roles/openshift_hosted/tasks/registry/secure.yml
@@ -58,7 +58,7 @@
 - "{{ docker_registry_route_hostname }}"
 cert: "{{ openshift_master_config_dir }}/registry.crt"
 key: "{{ openshift_master_config_dir }}/registry.key"
- expire_days: "{{ openshift_hosted_registry_cert_expire_days if openshift_version | oo_version_gte_3_5_or_1_5(openshift.common.deployment_type) | bool else omit }}"
+# expire_days: "{{ openshift_hosted_registry_cert_expire_days if openshift_version | oo_version_gte_3_5_or_1_5(openshift.common.deployment_type) | bool else omit }}"
 register: server_cert_out
 
 - name: Create the secret for the registry certificates
diff --git a/roles/openshift_node/tasks/main.yml b/roles/openshift_node/tasks/main.yml
index 656874f..e2e187b 100644
--- a/roles/openshift_node/tasks/main.yml
+++ b/roles/openshift_node/tasks/main.yml
@@ -105,7 +105,12 @@
 # startup, but if the network service is restarted this setting is
 # lost. Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1372388
 - name: Persist net.ipv4.ip_forward sysctl entry
- sysctl: name="net.ipv4.ip_forward" value=1 sysctl_set=yes state=present reload=yes
+ command: sysctl -w net.ipv4.ip_forward=1 
+
+- name: reload for net.ipv4.ip_forward sysctl entry
+ command: sysctl -p/etc/sysctl.conf
+
+
 
 - name: Start and enable openvswitch service
 systemd:
diff --git a/roles/openshift_version/tasks/main.yml b/roles/openshift_version/tasks/main.yml
index 2e9b4ca..cc14453 100644
--- a/roles/openshift_version/tasks/main.yml
+++ b/roles/openshift_version/tasks/main.yml
@@ -99,11 +99,11 @@
 when: not rpm_results.results.package_found
 - set_fact:
 openshift_rpm_version: "{{ rpm_results.results.versions.available_versions.0 | default('0.0', True) }}"
- - name: Fail if rpm version and docker image version are different
- fail:
- msg: "OCP rpm version {{ openshift_rpm_version }} is different from OCP image version {{ openshift_version }}"
+# - name: Fail if rpm version and docker image version are different
+# fail:
+# msg: "OCP rpm version {{ openshift_rpm_version }} is different from OCP image version {{ openshift_version }}"
 # Both versions have the same string representation
- when: openshift_rpm_version != openshift_version
+# when: openshift_rpm_version != openshift_version
 when: is_containerized | bool
 
 # Warn if the user has provided an openshift_image_tag but is not doing a containerized install


Episode 47 - WannaCry: Everything is basically broken

Posted by Open Source Security Podcast on May 14, 2017 05:22 PM
Josh and Kurt discuss the WannaCry worm.
Download Episode

Show Notes



Please Remove Your Prng

Posted by Robbie Harwood on May 14, 2017 04:00 AM

The gist of this post is that if your program or library has its own PRNG, I would like you to remove it. If you are not convinced that this is a good idea, read on; if you want links on what to do instead, skip to the second section.

Why do you have this?

I believe in code re-use, in not re-inventing the wheel where it isn't necessary, and in cooperation. So if code to do something exists, and there are no strong reasons why I shouldn't use it: I use it. Formulated in this manner it sounds almost like a strict rule; in practice, the same result can be achieved for new code just by laziness. For existing code, it becomes a maintainability question. I maintain code as well as writing it (as everyone who writes code should), and while I won't deny a certain satisfaction in well-oiled machinery, less would be better. So everything we maintain should serve a purpose, and reducing unneeded size, scope, and complexity is worthwhile.

Every project is different, which means your project's reasons for having a PRNG will be different as well. Maybe it doesn't care about the quality of the pseudorandom numbers (at which point it should probably just read /dev/urandom). Or maybe it's performing cryptographic operations (use getrandom(2) or similar). But I invite you to think about whether continuing to maintain your own is worth it, or whether it might be better to use something which has been more strongly audited (e.g., the kernel CSPRNG).

To look at an example: a few months ago now, I performed this change for krb5. In our case, we had a Fortuna implementation because quality entropy used to be difficult to come by. Fortuna specifically was used due to its ability to recover from low-quality input. However, upon further examination, the time to recover is quite long (so it only really helps the server), and in the meantime, operation will appear to be normal, with low-quality random numbers. Since there are already quality random numbers available on all server platforms we support, I added the option to just use them directly. (This describes the current behavior on Fedora, as well as the behavior for all future RHEL/CentOS releases.)

What to do instead

For this post, I will be focusing on Linux. If you are not in a Linux environment, you call a different function.

Anyway, the short answer is: you should use getrandom(2).

The longer answer is just me telling you how to use getrandom(2). For that, I want to draw from this post which contains a useful diagram about how /dev/random and /dev/urandom relate. The author points out two issues with using /dev/urandom directly on Linux (that do not occur on certain BSD, where one just uses /dev/random instead):

  • First, that /dev/urandom is not guaranteed to be seeded. getrandom(2) actually provides seeding guarantees by default. More precisely, it will block the call until the requested number of bytes can be generated; in the case of the urandom pool, this means until the pool has been seeded. (If this behavior is not desired, one should just read directly from /dev/urandom instead.)

  • Second, that one may wish to use /dev/random despite it being slower if they're feeling especially paranoid. There's a getrandom(2) flag for this, it turns out: GRND_RANDOM.

There's one pitfall with this approach, which is that (for reasons that are opaque to me) glibc was slow to add a wrapper for this function. (See: rhbz, upstream.) So if you want to support older versions of glibc, you have to use syscall(2) instead, like this.

Future work

A while back, I remember reading (but can no longer find) a post which surveyed open source software's usage of rand()/srand() and related functions. There were some decidedly bizarre usages in there, but most of them were just run-of-the-mill incorrect. Anyway, inspired by that, I've been toying with the idea of writing a shim library of sorts to make these functions actually return cryptographically random numbers, discarding seeding and such. The only real pitfall I'm aware of with this is users of these functions that expect deterministic behavior, but I'm not really sure I want to care. Maybe an environment variable for configuration or something.

Why Quotas are Hard

Posted by Adam Young on May 12, 2017 02:39 AM

A quota is a numerical limit on a group of resources. Quotas have to be both recorded and enforced.

We had a session at the summit this past week about hierarchical quotas and, if I took anything away from it, it is that quotas are hard.

Keystone supports a project hierarchy. Here’s a sample one for you:
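Something along these lines (a sketch; the ProductionB leaf is my assumption to round out the nine projects):

Internal (domain)
├── Sales
│   ├── TestingA
│   ├── StagingA
│   └── ProductionA
└── Marketing
    ├── TestingB
    ├── StagingB
    └── ProductionB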

Hierarchical quotas are assigned to a parent project and applied to a child project.  This hierarchy is only 3 levels deep and only has 9 projects.  A real deployment will be much larger than this. Often, a large organization has one project per user, in addition to departmental projects like the ones shown above.

Let's assume that our local sys-admin has granted our Internal domain a quota of 100 virtual machines. How would we enforce this? If the user attempts to create a VM in the root project of the hierarchy (a domain IS-A project), then Nova should see that the quota for that domain is 100, and that there are currently 0 VMs, so it should create the VM. The second time this happens, there is a remaining quota of 99, and so on.

Now, let's assume that the quota is stored in Keystone, as in the current proposal we were discussing. When Nova asks Keystone what the quota for "Internal" is, Keystone can return 100. Nova can then query all VMs to find out which have a project ID that matches that of "Internal" and verify the count, say 2. Since 100 - 2 > 0, Nova should create the VM.

What if the user wants to create a VM in the “Sales” project?  That is where things get hierarchical.  We discussed schemes where the quota would be explicitly assigned to Sales and where the quota was assumed to come from “Internal.”  Both are tricky.

Let's say we allow the explicit allocation of quota from higher to lower. Does this mean that the parent project is reducing its own quota while creating an explicit quota for the lower project? Or does it mean that both quotas need to be enforced? If the quota for Sales is set to 10, and the quotas for the three node projects are all set to 10, is this legal or an error?

Let's assume, for a moment, that it is legal. Under this scheme, a user with a token scoped to TestingA creates 10 VMs. As each VM is created, Nova needs to check the number of machines already created in project TestingA. It also needs to check the number of machines in projects StagingA, ProductionA, and Sales to ensure that the quota for "Sales" has not been exceeded. If there is an explicit quota on "Internal", Nova needs to check the number of VMs created in that project and any project under it. Our entire tree must be searched and counted, and that count compared with the parent project's quota.

Ideally, we would only ever have to check the quota for a single project.  That only works if:

  1. Every project in the whole tree has an explicit quota
  2. Quotas can be “split” amongst child projects but never reclaimed.

If that second statement seems strong, assume the "Marketing" project with a quota of 10 chips off 9 for TestingB, creates 9 VMs there, drops the quota for TestingB back to 0, sets the quota for StagingB to 9, and creates 9 VMs in that project. This leaves it with 18 VMs running but only an explicit quota of 10.

The word “never” really is too strong, but reclaiming would require some form of reconciliation process, by which Nova confirmed that both projects were within the end-state limits.

Automated reconciliation is hard. Keystone needs to know how to query arbitrary quantities on remote objects, and it probably should not even have access to those objects. Or, Nova (and every other service using quotas) needs to provide an API for Keystone to query to confirm resources have been freed.

Manual reconciliation is probably possible, but will be labor intensive.

One possibility is that Keystone actually records the usage of quotas, as well as the freeing of actual resources. This is also painful, as now every single call that either creates or deletes a resource requires an additional call to Keystone. Or, if quotas are batch-fetched by Nova, Nova needs to remember them and store them locally. If quotas then change in Keystone, the cache is invalid.

This is only a fragment of the whole discussion.

Quotas are hard.

SELinux and --no-new-privs and the setpriv command.

Posted by Dan Walsh on May 05, 2017 03:00 PM
BOUNDED TRANSITIONS

SELinux transitions are in some ways similar to a setuid executable, in that when a transition happens the new process has different security properties than the calling process. When you execute a setuid executable, your parent process has one UID, but the child process has a different UID.

The kernel has a way to block these transitions, called --no-new-privs. If I turn on --no-new-privs, then setuid applications like sudo, su and others will no longer work. I.e., you can not get more privileges than the parent process.
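For illustration, here is a minimal sketch of how a process turns this on directly, via the prctl(2) call that setpriv uses under the hood:

#include <stdio.h>
#include <sys/prctl.h>
#include <unistd.h>

int main(void)
{
    /* Once set, no_new_privs can never be cleared, and it is
     * inherited across fork and execve. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
        perror("prctl");
        return 1;
    }

    /* sudo still executes, but its setuid bit no longer grants root. */
    execlp("sudo", "sudo", "id", (char *)NULL);
    perror("execlp");
    return 1;
}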

SELinux does something similar and in most cases the transition is just blocked.

For example:

We have a rule that states a httpd_t process executing a script labeled httpd_sys_script_exec_t (the cgi label) will transition to httpd_sys_script_t. But if you tell the kernel that your process tree has --no-new-privs, then depending on how you wrote the policy, when the process running as httpd_t executes the httpd_sys_script_exec_t script it will no longer transition; it will attempt to continue running the script as httpd_t.

SELinux enforces that the new transitioned type must have a subset of the allow rules of its parent process, and no other allow rules. I.e., the transitioned process can not get any NEW privileges.

This feature has not been used much in SELinux Policy, so we have found and fixed a few issues.

Container Runtimes like docker and runc have now added the --no-new-privs flag.  We have been working to make container-selinux follow the rules.

Under --no-new-privs, the container runtimes running as container_runtime_t can start a container_t process only if container_runtime_t has all of the privileges of container_t; i.e., container_t must be a subset of its parent.

In SELinux policy you write a rule like:

typebounds container_runtime_t container_t;

This tells the policy compiler to make sure that container_t is a subset of container_runtime_t. If you are running the compiler in strict mode, the compile will fail if it is not a subset.
If you are not running in strict mode, the compiler will silently remove any allow rules that are not in the parent, which can cause some surprises.


setpriv command

I recently heard about the setpriv command, which is pretty cool for playing with kernel security features like dropping capabilities and SELinux. One of the things you can do is execute:


$ setpriv --no-new-privs sudo sh
sudo: effective uid is not 0, is sudo installed setuid root?


But if you want to try out SELinux, you can combine the setpriv and runcon commands:

$ runcon -t container_t id -Z
staff_u:staff_r:container_t:s0-s0:c0.c1023
$ setpriv --no-new-privs runcon -t container_t id -Z
runcon: id: Operation not permitted


This happens because container_t is not bounded by staff_t with a typebounds rule.

Episode 46 - Turns out I'm not a bad guy

Posted by Open Source Security Podcast on May 04, 2017 08:38 PM
Josh and Kurt discuss the recent Google phish attack.

Download Episode

Show Notes






Security like it's 2005!

Posted by Josh Bressers on May 03, 2017 12:29 PM
I was reading the newspaper the other day (the real dead tree newspaper) and I came across an op-ed from my congressperson.

Gallagher: Cybersecurity for small business

It's about what you'd expect but comes with some actionable advice! Well, not really. Here it is so you don't have to read the whole thing.

Businesses can start by taking some simple and relatively inexpensive steps to protect themselves, such as:
» Installing antivirus, threat detection and firewall software and systems.
» Encrypting company data and installing security patches to make sure computers and servers are up to date.
» Strengthening password practices, including requiring the use of strong passwords and two-factor authentication.
» Educating employees on how to recognize an attempted attack, including preparing rapid response measures to mitigate the damage of an attack in progress or recently completed.
I read that and my first thought was "how on earth would a small business have a clue about any of this", but then it got me thinking about the bigger problem. This advice isn't even useful in 2017. It sort of made sense a long time ago when this was the way of thinking; it's not valid anymore though.

Let's pick them apart one by one.

Installing antivirus, threat detection and firewall software and systems.
It's no secret that antivirus doesn't really work anymore. It's expensive in terms of both money and resources. In most settings I've seen, it probably causes more trouble than it solves. Threat detection doesn't really mean anything. Virtually all systems come with a firewall enabled and some level of software protections that make existing antivirus obsolete. Honestly, this is about as solved as it's going to get. There's no positive value you can add here.

Encrypting company data and installing security patches to make sure computers and servers are up to date
These are two unrelated things. Encrypting data is probably overkill for most settings. Any encryption that's usable doesn't really protect you; encryption that actually protects needs a dedicated security team to manage. Let's not get into an argument about offline vs online data.

Keeping systems updated is a fantastic idea. Nobody does it because it's too hard to do. If you're a small business you'll either have zero updates, or automatically install them all. The right answer is to use something as a service so you don't have to think about updates. Make sure automatic updates are working on your desktops.

Strengthening password practices, including requiring the use of strong passwords and two-factor authentication

Just use two-factor auth from your as-a-service provider. If you're managing your own accounts and you lack a dedicated identity team, failure is the only option. Every major cloud provider can help you solve this.

Educating employees on how to recognize an attempted attack, including preparing rapid response measures to mitigate the damage of an attack in progress or recently completed

Just no. There is value in helping them understand the risks and threats, but this won't work. Social engineering attacks go after the fundamental nature of humanity. You can't stop this with training. The only hope is we create cold calculating artificial intelligence that can figure this out before it reaches humans. A number of service providers can even stop some of this today because they have ways to detect anomalies. A small business doesn't and probably never will.


As you can see, this list isn't really practical for anyone to worry about. Why should you have to worry about this today? These sorts of problems have been plaguing small business and home users for years. These points are all what I would call "mid 200X" advice. These were suggestions everyone was giving out around 2005; they didn't really work then, but they made everyone feel better. Most of these bullets aren't actionable unless you have a security person on staff. Would a non security person have any idea where to start or what any of these items mean?

The 2017 world has a solution to these problems. Use the cloud. Stuff as a Service is without question the way to solve these problems because it makes them go away. There are plenty who will naysay public cloud, citing various breaches, companies leaking data, companies selling data, and plenty of other problems. The cloud isn't magic, but it lets you trade a lot of horrible problems for "slightly bad". I guarantee the problems with the cloud are substantially better than letting most people try to run their own infrastructure. I see this a bit like airplane vs automobile crashes. There are magnitudes more deaths by automobile every year, but it's the airplane crashes that really get the attention. It's much much safer to fly than to drive, just as it's much much safer to use services than to manage your own infrastructure.

Episode 45 - Trust is more important now than the truth

Posted by Open Source Security Podcast on May 02, 2017 01:52 AM
Josh and Kurt discuss not-counterfeit MTG cards, antivirus, squirrelmail, unroll.me, grsecurity, baby monitors, and trust.

Download Episode

Show Notes



Security fail is people

Posted by Josh Bressers on April 30, 2017 05:16 PM
The other day I ran across someone trying to keep their locker secured by using a combination lock. As you can see in the picture, the lock is on the handle of the locker, not on the loop that actually locks the door. When I saw this I had a good chuckle, took a picture, and put out a snarky tweet. I then started to think about this quite a bit. Is this the user's fault or is this bad design? I'm going to blame bad design on this one. It's easy to blame users, we do it often, but I think in most instances, the problem is the design, not the user. If nothing is ever our fault, we will never improve anything. I suspect this is part of the problem we see across the cybersecurity universe.

On Humans

One of the great truths I'm starting to understand as I deal with humans more and more is that the one thing we all have in common is that we have waves of unpredictability. Sometimes we pay very close attention to our surroundings and situations, sometimes we don't. We can be distracted by someone calling our name, by something that happened earlier in the day, or even something that happened years ago. If you think you pay very close attention to everything at all times you're fooling yourself. We are squishy bags of confusing emotions that don't always make sense.

In the above picture, I can see a number of ways this happens. Maybe the person was very old and couldn't see; I have bad eyesight and could see this happening. Maybe they were talking to a friend and didn't notice where they put the lock. What if they dropped their phone moments before putting the lock on the door? Maybe they're just a clueless idiot who can't use locks! Well, not that last one.

This example is bad design. Why is there a handle that can hold a lock directly above the loop that is supposed to hold the lock? I can think of a few ways to solve this. The handle could be something other than a loop. A pull knob would be a lot harder to screw up. The handle could be farther up, or down. The loop could be larger or in a different place. No matter how you solve this, this is just a bad design. But we blame the user. We get a good laugh at a person making a simple mistake. Someday we'll make a simple mistake then blame bad design. It is also human nature to find someone or something else to blame.

The question I keep wondering: did whoever designed this door think about security in any way? Do you think they were wondering how a system can and would fail? How would it be misused? How could it be broken? In this case I doubt there was anyone thinking about security failures for the door to a locker; it's just a locker. They probably told the intern to go draw a rectangle and put a handle on it. If I could find the manufacturer and tell them about this, would they listen? I'd probably get pushed into the "crazy old kook" queue. You can even wonder if anyone really cares about locker security.

Wrapping up a post like this is always tricky. I could give advice about secure design, or tell everyone they should consult with a security expert. Maybe the answer is better user education (haha no). I think I'll target this at the security people who see something like this, take a picture, then write a tweet about how stupid someone is. We can use examples like this to learn and shape our own way of thinking. It's easy to use snark when we see something like this. The best thing we can do is make note of what we see, think about how this could have happened, and someday use it as an example to make something we're building better. We can't fix the world, but we can at least teach ourselves.

Episode 44 - Bug Bounties vs Pen Testing

Posted by Open Source Security Podcast on April 25, 2017 12:06 PM
Josh and Kurt discuss Lego, bug bounties, pen testing, thought leadership, cars, lemons, entropy, and CVE.

Download Episode

Show Notes



I have seen the future, and it is bug bounties

Posted by Josh Bressers on April 24, 2017 02:23 PM

Every now and then I see something on a blog or Twitter about how you can't replace a pen test with a bug bounty. For a long time I agreed with this, but I've recently changed my mind. I know this isn't a super popular opinion (yet), and I don't think either side of this argument is exactly right. Fundamentally the future of looking for issues will not be a pen test. They won't really be bug bounties either, but I'm going to predict pen testing will evolve into what we currently call bug bounties.

First let's talk about a pen test. There's nothing wrong with getting a pen test, I'd suggest everyone goes through a few just to see what it's like. I want to be clear that I'm not saying pen testing is bad. I'm going to be making the argument why it's not the future. It is the present, many organizations require them for a variety of reasons. They will continue to be a thing for a very long time. If you can only pick one thing, you should probably choose a pen test today as it's at least a known known. Bug bounties are still known unknowns for most of us.

I also want to clarify that internal pen testing teams don't fall under this post. Internal teams are far more focused and have special knowledge that an outside company never will. It's my opinion that an internal team is and will always be superior to an outside pen test or bug bounty. Of course a lot of organizations can't afford to keep a dedicated internal team, so they turn to the outside.

So anyhow, it's time for a pen test. You find a company to conduct it, you scope what will be tested (it can't be everything). You agree on various timelines, then things get underway. After perhaps a week of testing, you have a very very long and detailed report of what was found. Here's the thing about a pen test; you're paying someone to look for problems. You will get what you pay for, you'll get a list of problems, usually a huge list. Everyone knows that the bigger the list, the better the pen test! But here's the dirty secret. Most of the results won't ever be fixed. Most results will fall below your internal bug bar. You paid for a ton of issues, you got a ton of issues, then you threw most of them out. Of course it's quite likely there will be high priority problems found, which is great. Those are what you really care about, not all the unexciting problems that are 95% of the report. What's your cost per issue fixed from that pen test?

Now let's look at how a bug bounty works. You find a company to run the bounty (it's probably not worth doing this yourself, there are many logistics). You scope what will be tested. You can agree on certain timelines and/or payout limits. Then things get underway. Here's where it's very different though. You're paying for the scope of bounty, you will get what you pay for, so there is an aspect of control. If you're only paying for critical bugs, by definition, you'll only get critical bugs. Of course there will be a certain amount of false positives. If I had to guess it's similar to a pen test today, but it's going to decrease as these organizations start to understand how to cut down on noise. I know HackerOne is doing some clever things to prevent noise.

My point in this whole post revolves around getting what you pay for: essentially a cost-per-issue-fixed model instead of the current cost-per-issue-found model. The real difference is that in the case of a bug bounty, you can control the scope of incoming reports. In no way am I suggesting a pen test is a bad idea, I'm simply suggesting that 200 page report isn't very useful. Of course if a pen test returned three issues, you'd probably be pretty upset when paying the bill. We all have finite resources so naturally we can't and won't fix minor bugs. It's just how things work. Today at best you'll get about the same results from a bug bounty and a pen test, but I see a bug bounty as having room to improve. I think the pen test model isn't full of exciting innovation.

All this said, not every product and company will be able to attract enough interest in a bug bounty. Let's face it, the real purpose behind all this is to raise the security profiles of everyone involved. Some organizations will have to use a pen-test-like model to get their products and services investigated. This is why the bug bounty program won't be a long-term viable option. There are too many bugs and not enough researchers.

Now for the bit about the future. In the near future we will see the pendulum swing from pen testing to bug bounties. The next swing of the pendulum after bug bounties will be automation. Humans aren't very good at digging through huge amounts of data but computers are. What we're really good at and computers are (currently) really bad at is finding new and exciting ways to break systems. We once thought double free bugs couldn't be exploited. We didn't see a problem with NULL pointer dereferences. Someone once thought deserializing objects was a neat idea. I would rather see humans working on the future of security instead of exploiting the past. The future of the bug bounty can be new attack methods instead of finding bugs. We have some work to do; I've not seen an automated scanner that I'd even call "almost not terrible". It will happen though; tools always start terrible and get better through the natural march of progress. The road to this unicorn future will pass through bug bounties. However, if we don't have automation ready on the other side, it's nothing but dragons.

Episode 43 - We are totally immature

Posted by Open Source Security Podcast on April 19, 2017 12:52 PM
Josh and Kurt discuss Shadow Brokers, pronouncing GIF, Atlanta's road problems, browser phishing, warning sirens, IoT, and fake Magic the Gathering cards.

Download Episode

Show Notes


Crawl, Walk, Drive

Posted by Josh Bressers on April 17, 2017 12:16 AM
It's that time of year again. I don't mean when all the government secrets are leaked onto the Internet by some unknown organization. I mean the time of year when it's unsafe to cross streets or ride your bike. At least in the United States. It's possible more civilized countries don't have this problem. I enjoy getting around without a car, but I feel like the number of near misses has gone up a fair bit, and it's always a person much younger than me with someone much older than them in the passenger seat. At first I didn't think much about this and just dreamed of how self driving cars will rid us of the horror that is human drivers. After the last near fatality while crossing the street it dawned on me that now is the time all the kids have their driving learner's permit. I do think I preferred not knowing this since now I know my adversary. It has a name, and that name is "youth".

For those of you who aren't familiar with how this works in the US. Essentially after less training than is given to a typical volunteer, a young person generally around the age of 16 is given the ability to drive a car, on real streets, as long as there is a "responsible adult" in the car with them. We know this is impossible as all humans are terribly irresponsible drivers. They then spend a few months almost getting in accidents, take a proper test administered by someone who has one of the few jobs worse than IT security, and generally they end up with a real driver's license, ensuring we never run out of terrible human drivers.

There are no doubt a ton of stories that could be told here about mentorship, learning, encouraging, leadership, or teaching. I'm not going to talk about any of that today. I think often about how we raise up the next generation of security goons, but I'm tired of talking about how we're all terrible people and nobody likes us, at least for this week.

I want to discuss the challenges of dealing with someone who is very new, very ambitious, and very dangerous. There are always going to be "new" people in any group or organization. Eventually they learn the rules they need to know, generally because they screw something up and someone yells at them about it. Goodness knows I learned most everything I know like this. But the point is, as security people, we have to not only do some yelling but we have to keep things in order while the new person is busy making a mess of everything. The yelling can help make us feel better, but we still have to ensure things can't go too far off the rails.

In many instances the new person will have some sort of mentor. They will of course try to keep them on task and learning useful things, but just like the parent of our student driver, they probably spend more time gaping in terror than they do teaching anything useful. If things really go crazy you can blame them someday, but at the beginning they're just busy hanging on trying not to soil themselves in an attempt to stay composed.

This brings us back to the security group. If you're in a large organization, every day is new person screwing something up day. I can't even begin to imagine what it must be like at a public cloud provider where you not only have new employees but also all your customers are basically ongoing risky behavior. The solution to this problem is the same as our student driver problem. Stop letting humans operate the machines. I'm not talking about the new people, I'm talking about the security people. If you don't have heavy use of automation, if you're not aggregating logs and having algorithms look for problems for example, you've already lost the battle.

Humans in general are bad at repetitive boring tasks. Driving falls under this category, and a lot of security work does too. I touched on the idea of measuring what you do in my last post. I'm going to tie these together in the next post. We do a lot of things that don't make sense if we measure them, but we struggle to measure security. I suspect part of that reason is because for a long time we were the passenger with the student drivers. If we emerged at the end of the ride alive, we were mostly happy.

It's time to become the groups building the future of cars, not waiting for a horrible crash to happen. The only way we can do that is if we start to understand and measure what works and what doesn't work. Everything from ROI to how effective is our policy and procedure. Make sure you come back next week. Assuming I'm not run down by a student driver before then.

Using the OPTIONS Verb for RBAC

Posted by Adam Young on April 15, 2017 02:41 AM

Let's say you have a RESTful web service. For any given URL, you might support one or more of the HTTP verbs: GET, PUT, POST, DELETE and so on. A user might wonder what they mean, and which you actually support. One way of reporting that is by using the OPTIONS verb. While this is a relatively unusual verb, using it to describe a resource is a fairly well known mechanism. I want to take it one step further.

Both OpenStack and Kubernetes support scoped role based access control.  The OPTIONS verb can be used to announce to the world what role is associated with each verb.

Let's use Keystone's User API as an example. We have typical CRUD operations on users.

https://developer.openstack.org/api-ref/identity/v3/#list-users

Thus, the call OPTIONS https://hostname:port/v3/users

could return data like this:

"actions": {
  "POST": {
     "roles": ["admin"]
  },
  "GET": {
     "roles": ["admin", "Member"]
  }
}


That would be in addition to any other data you might feel relevant to return there:  JSON-HOME type information on the “POST” would be helpful in creating a new User, for example.

Ideally, the server would even respond to both template and actual URLs. Both of these should return the same response:

/v3/users/{user_id}

/v3/users/DEEDCAFE

Regardless of whether the ID passed was actually a valid ID or not.
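A client could then discover which roles it needs with a plain OPTIONS request, adding a token header first if the deployment requires authentication. For example, against a hypothetical endpoint:

curl -X OPTIONS https://hostname:port/v3/users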


ADDENDUM:

A few people have asked if this opens up a security hole. Nothing I am saying here proposes a change to the existing security approach. If you want to make sure a user is authenticated before telling them this information, do so. If you only want to return role information for a user that already has that role, go for it.

There is a flip side here to protecting the user. If a user does not know what role is required, and she wants to create a delegation to some other user, she cannot safely do that; she has to provide the full set of roles she has in that delegation. Without telling people what key opens the door, they have to try every key they own.

Episode 42 - Hitchhiker's Guide to Security

Posted by Open Source Security Podcast on April 13, 2017 12:13 PM
Josh and Kurt discuss the security themes and events in the context of the HHGG movie.

Download Episode

Show Notes


The obvious answer is never the secure answer

Posted by Josh Bressers on April 11, 2017 08:03 PM
One of the few themes that comes up time and time again when we talk about security is how bad people tend to be at understanding what's actually going on. This isn't really anyone's fault, we're expecting people to go against what is essentially millions of years of evolution that created our behaviors. Most security problems revolve around the human being the weak link and doing something that is completely expected and completely wrong.

This brings us to a news story I ran across that reminded me of how bad humans can be at dealing with actual risk. It seems that peanut free schools don't work. I think most people would expect a school that bans peanuts to have fewer peanut related incidents than a school that doesn't. This seems like a no brainer, but if there's anything I've learned from doing security work for as long as I have, the obvious answer is always wrong.

The report does have a nugget of info in it where they point out that having a peanut free table at lunch seems to work. I suspect this is different than a full on ban, in this case you have the kids who are sensitive to peanuts sit at a table where everyone knows peanuts are bad. There is of course a certain amount of social stigma that comes with having to sit at a special table, but I suspect anyone reading this often sat alone during schooltime lunch for a very different reason ;)

This is similar to Portugal making all drugs legal and having one of the lowest overdose rates in Europe. It seems logical that if you want fewer drugs you make them illegal. It doesn't make sense to our brains that if you want fewer drugs and problems you make them legal. There are countless other examples of reality seeming to be totally backwards from what we think should be true.

So that brings us to security. There are lessons in stories like these. It's not to do the opposite of what makes sense though. The lesson is to use real data to make decisions. If you think something is true and you can't prove it either way, you could be making decisions that are actually hurting instead of helping. It's a bit like the scientific method. You have a hypothesis, you test it, then you either update your hypothesis and try again or you end up with proof.

In the near future we'll talk about measuring things; how to do it, what's important, and why it will matter for solving your problems.

Episode 41 - All your money are belong to us

Posted by Open Source Security Podcast on April 10, 2017 12:32 AM
Josh and Kurt discuss airplane laptop bans, ATM hacking, pointing at things, and Certificate Authorities.

Download Episode

Show Notes



Be careful relabeling volumes with Container run times. Sometimes things can go very wrong?

Posted by Dan Walsh on April 07, 2017 04:08 AM
I recently received an email from someone who made the mistake of volume mounting /root into his container with the :Z option.

docker run -ti -v /root:/root:Z fedora sh

The container ran fine, and everything was well on his server machine until the next time he tried to ssh into the server.

The sshd daemon refused to allow him in. What went wrong?

I wrote about using volumes and SELinux on the Project Atomic blog. I explained there that in order to use a volume within a non-privileged container, you need to relabel the content on the volume. You can either use the :z or the :Z option.

:z will relabel with a shared label so other containers ran read and write the volume.
:Z will relabel with a private label so that only this specific container can read and write the volume.

What I probably did not emphasize enough is that, as Peter Parker (Spider-Man) says: with great power comes great responsibility.

Meaning you have to be careful what you relabel. Using one of the :Z or :z options recursively changes the labels of the source content to container_file_t. When doing this you must be sure that this content is truly private to the container, and not needed by other confined domains. For example, doing a volume mount of -v /var/lib/mariadb:/var/lib/mariadb:Z for a mariadb container is probably a good idea. But while doing -v /var/lib:/var/lib:Z will work, it is probably a bad idea.

Back to the email, the user relabeled all of the content under /root with a label similar to
system_u:object_r:container_file_t:s0:c103:c753.
Later, when he attempted to ssh in, the sshd daemon, running as the sshd_t type, attempted to read the content in /root/.ssh and got permission denied, since sshd_t is not allowed to read container_file_t.

The emailer realized what happened and tried to fix the situation by running restorecon -R -v /root, but this failed to change the labels.

Why did the labels not change when he ran restorecon?

There is a little-known feature of restorecon called customizable_types, which I talked about 10 years ago.

By default, restorecon does not change types defined in the customizable_types file. These types can be randomly scattered around the file system, and we don't want a global relabel to change them. This is meant to make things easier for the admin, but sometimes causes confusion. The -F option tells restorecon to force the relabel and ignore customizable_types.

restorecon -F -R /root

This command will reset the labels under /root to the system default and allow the emailer to log in to the system via sshd again.

We have safeguards built into the SELinux Go bindings which prevent container runtimes from relabeling /, /etc, and /usr.

I need to open a pull request to add a few more directories to help prevent users from making serious mistakes in labeling, starting with /root.

The expectation of security

Posted by Josh Bressers on April 02, 2017 10:04 PM
If you listen to my podcast (which you should be doing already), I had a bit of a rant at the start this week about an assignment my son had over the weekend. He wasn't supposed to use any "screens" which is part of a drug addiction lesson. I get where this lesson is going, but I've really been thinking about the bigger idea of expectations and reality. This assignment is a great example of someone failing to understand the world has changed around them.

What I mean is expecting anyone to go without a "screen" for a weekend doesn't make sense. A substantial number of activities we do today rely on some sort of screen because we've replaced more inefficient ways of accomplishing tasks with these screens. Need to look something up? That's a screen. What's the weather? Screen. News? Screen. Reading a book? Screen!

You get the idea. We've replaced a large number of books or papers with a screen. But this is a security blog, so what's the point? The point is I see a lot of similarities with a lot of security people. The world has changed quite a bit over the last few years, and I feel like a number of our rules are similar to thinking that spending time without a screen is some sort of learning experience. I bet we can all think of security people we know who think it's still 1995; if you don't know any, you might be that person (time for some self reflection).

Let's look at some examples.

You need to change your password every 90 days.
There are few people who think this is a good idea anymore; even the NIST guidance says it isn't. I hear this come up on a regular basis though. Password concepts have changed a lot over the last few years, but most people seem to be stuck somewhere between five and ten years ago.

If we put it behind the firewall we don't have to worry about securing it.
Remember when firewalls were magic? Me neither. There was a time from probably 1995 to 2007 or so that a lot of people thought firewalls were magic. Very recently the concept of zero trust networking has come to be a real thing. You shouldn't trust your network, it's probably compromised.

Telling someone they can't do something because it's insecure.
Remember when we used to talk about how security is the industry of "no"? That's not true anymore because now when you tell someone "no" they just go to Amazon and buy $2.38 worth of computing and do whatever it is they need to get done. Shadow IT isn't the problem, it's the solution to the problem that was the security people. It's fairly well accepted by the new trailblazers that "no" isn't an option, the only option is to work together to minimize risk.

I could probably build an enormous list of examples like this. The whole point is that everything changes, and we should always be asking ourselves if something still makes sense. It's very easy for us to decide change is dangerous and scary. I would argue that not understanding the new security norms is actually more dangerous than having no security knowledge at all. This is probably one of the few industries where old knowledge may be worse than no knowledge. Imagine if your doctor was using the best ideas and tools from 1875. You'd almost certainly find a new doctor. Password policies and firewalls are our version of bloodletting and leeches. We have a long way to go and I have no doubt we all have something to contribute.

Episode 40 - Let's fork bitcoin, again

Posted by Open Source Security Podcast on April 02, 2017 09:22 PM
Josh and Kurt discuss Verizon spyware, FCC privacy, Smart TVs, Tor's rewrite, Google's new operating system, bitcoin, and NanoCore.

Download Episode

Show Notes


Remember kids, if you're going to disclose, disclose responsibly!

Posted by Josh Bressers on March 28, 2017 02:02 AM
If you pay any attention to the security universe, you're aware that Tavis Ormandy is basically on fire right now with his security research. He found the Cloudflare data leak issue a few weeks back, and is currently going to town on LastPass. The LastPass crew seems to be dealing with this pretty well; I'm not seeing a lot of complaining, mostly just info and fixes, which is the right way to handle these things.

There are however a bunch of people complaining about how Tavis and Google Project Zero in general tend to disclose issues. These people are wrong. I've been there, it's not fun, but as crazy as it may seem from the outside, the Project Zero crew knows what they're doing.

Firstly let's get two things out of the way.

1) If nobody is complaining about what you're doing, you're not doing anything interesting (Tavis is clearly doing very interesting things).

2) Disclosure is hard, there isn't a perfect solution, what Project Zero does may seem heartless to some, but it's currently the best way. The alternative is an abusive relationship.

A long time ago I was a vendor receiving security reports from Tavis, and I won't lie, it wasn't fun. I remember complaining and trying to slow things down to a pace I thought was more reasonable. Few of us have any extra time and a new vulnerability disclosure means there's extra work to do. Sometimes a disclosure isn't very detailed or lacks important information. The disclosure date proposed may not line up with product schedules. You could have another more important issue you're working on already. There are lots of reasons to dread dealing with these issues as a vendor.

All that said, it's still OK to complain, and every now and then the criticism is good. We should always be thinking about how we do things, what makes sense today won't make sense tomorrow. The way Google Project Zero does disclosure today was pretty crazy even five years ago. Now it's how things have to work. The world moves very fast now, and as we've seen from various document dumps over the last few years, there are no secrets. If you think you can keep a security issue quiet for a year you are sadly mistaken. It's possible that was once true (I suspect it never was, but that's another conversation). Either way it's not true anymore. If you know about a security flaw it's quite likely someone else does too, and once you start talking to another group about it, the odds of leaking grow at an alarming rate.

The way things used to work is changing rapidly. Anytime there is change, there are always the trailblazers and laggards. We know we can't develop secure software, but we can respond quickly. Spend time where you can make a difference, not chasing the mythical perfect solution.

If your main contribution to society is complaining, you should probably rethink your purpose.

Episode 39 - Flash on your dishwasher

Posted by Open Source Security Podcast on March 28, 2017 01:08 AM
Josh and Kurt discuss certificates, OpenSSL, dishwashers, Flash, and laptop travel bans.

Download Episode

Show Notes



Inverse Law of CVEs

Posted by Josh Bressers on March 23, 2017 11:26 PM
I've started a project to put the CVE data into Elasticsearch and see if there is anything clever we can learn about it. Even if there isn't anything overly clever, it's fun to do. And I get to make pretty graphs, which everyone likes to look at.

I stuck a few of my early results on Twitter because it seemed like a fun thing to do. One of the graphs I put up was comparing the 3 BSDs. The image is below.


You can see that none of these graphs has enough data to really draw any conclusions from; again, I did this for fun. I did get one response claiming NetBSD is the best, because their graph is the smallest. I've actually heard this argument a few times over the past month, so I decided it's time to write about it, especially since I'm sure I'll find many more examples like this while I'm weeding through this mountain of CVE data.

Let's make up a new law, I'll call it the "Inverse Law of CVEs". It goes like this - "The fewer CVE IDs something has, the less secure it is".

That doesn't make sense to most people. If you have something that is bad, fewer bad things is certainly better than more bad things. This is generally true for physical concepts our brains can understand. Less crime is good. Fewer accidents are good. When it comes to something like how many CVE IDs your project or product has, this idea gets turned on its head. Less is probably bad when we think about CVE IDs. There's probably some sort of line somewhere where if you cross it things flip back to bad (wait until I get to PHP). We'll call that the security Maginot Line, because bad security decided to sneak in through the north.

If you have something with very very few CVE IDs it doesn't mean it's secure, it means nobody is looking for security issues. It's easy to understand that if something is used by a large diverse set of users, it will get more bug reports (some of which will be security bugs) and it will get more security attention from both good guys and bad guys because it's a bigger target. If something has very few users, it's quite likely there hasn't been a lot of security attention paid to it. I suspect what the above graphs really mean is FreeBSD is more popular than OpenBSD, which is more popular than NetBSD. Random internet searches seem to back this up.

I'm not entirely sure what to do with all this data. Part of the fun is understanding how to classify it all. I'm not a data scientist so there will be much learning. If you have any ideas by all means let me know, I'm quite open to suggestions. Once I have better data I may consider trying to find at what point a project has enough CVE IDs to be considered on the right path, and which have so many they've crossed over to the bad place.

Episode 38 - We Ruin Everything

Posted by Open Source Security Podcast on March 22, 2017 01:34 AM
Josh and Kurt discuss disclosing your password, pwn2own, wikileaks, Back Orifice, HTTPS inspection, and antivirus.

Download Episode

Show Notes


Supporting large key sizes in FreeIPA certificates

Posted by Fraser Tweedale on March 21, 2017 12:59 AM

A couple of issues around key sizes in FreeIPA certificates have come to my attention this week: how to issue certificates for large key sizes, and how to deploy FreeIPA with a 4096-bit key. In this post I’ll discuss the situation with each of these issues. Though related, they are different issues so I’ll address each separately.

Issuing certificates with large key sizes

While researching the second issue I stumbled across issue #6319: ipa cert-request limits key size to 1024,2048,3072,4096 bits. To wit:

ftweedal% ipa cert-request alice-8192.csr --principal alice
ipa: ERROR: Certificate operation cannot be completed:
  Key Parameters 1024,2048,3072,4096 Not Matched

The solution is straightforward. Each certificate profile configures the key types and sizes that will be accepted by that profile. The default profile is configured to allow up to 4096-bit keys, so the certificate request containing an 8192-bit key fails. The profile configuration parameter involved is:

policyset.<name>.<n>.constraint.params.keyParameters=1024,2048,3072,4096

If you append 8192 to that list and update the profile configuration via ipa certprofile-mod (or create a new profile via ipa certprofile-import), then everything will work!
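That is, the constraint would become:

policyset.<name>.<n>.constraint.params.keyParameters=1024,2048,3072,4096,8192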

Deploying FreeIPA with IPA CA signing key > 2048-bits

When you deploy FreeIPA today, the IPA CA has a 2048-bit RSA key. There is currently no way to change this, but Dogtag does support configuring the key size when spawning a CA instance, so it should not be hard to support this in FreeIPA. I created issue #6790 to track this.

Looking beyond RSA, there is also issue #3951: ECC Support for the CA, which concerns supporting an elliptic curve signing key in the FreeIPA CA. Once again, Dogtag supports EC signing algorithms, so supporting this in FreeIPA should be a matter of deciding the ipa-server-install(1) options and mechanically adjusting the pkispawn configuration.

If you have use cases for large signing keys and/or NIST ECC keys or other algorithms, please do not hesitate to leave comments in the issues linked above, or get in touch with the FreeIPA team on the freeipa-users@redhat.com mailing list or #freeipa on Freenode.

Installing R Packages in Fedora as a user

Posted by Adam Young on March 16, 2017 01:35 AM

When I was trying to run R code that required additional packages, I got the error message:

Installing packages into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
Warning in install.packages(new.pkg, dependencies = TRUE) :
 'lib = "/usr/lib64/R/library"' is not writable

Summary: If you create the following directory then R will install package files in there instead.

~/R/x86_64-redhat-linux-gnu-library/3.3/

Here are the more detailed steps I took.

In order to work around this, I tried running just the install command from an interactive R prompt. In this case, the package was “SuperLearner”

> install.packages("SuperLearner")
Installing package into ‘/usr/lib64/R/library’
(as ‘lib’ is unspecified)
Warning in install.packages("SuperLearner") :
  'lib = "/usr/lib64/R/library"' is not writable
Would you like to use a personal library instead?  (y/n) y
Would you like to create a personal library
~/R/x86_64-redhat-linux-gnu-library/3.3
to install packages into?  (y/n) y

After this, a dialog window popped up and had me select a CRAN mirror (I picked the geographically closest one) and it was off and running.

It errored out later with:

bit-ops.c:1:15: fatal error: R.h: No such file or directory
 #include <R.h>

Which looks like I am missing the development libraries for R. I’ll return to this in a bit.

If I exit out of R and check the directory ls ~/R/x86_64-redhat-linux-gnu-library/3.3/ I can now see it is populated.

The Base R install from fedora owns the library directory:

 rpmquery -f /usr/lib64/R/library/Matrix/
R-core-3.3.2-3.fc25.x86_64

And we don't want to mix the core libraries with user-installed libraries. Let me try a different one, now that the local user's library directory structure has been created:

> install.packages("doRNG")

Similar error…ok, let’s take care of the compile error.

sudo dnf install R-core-devel

And rerunning the install in an R session now completes successfully. I tried on a different machine and had to install the 'ed' command line tool first.

Security, Consumer Reports, and Failure

Posted by Josh Bressers on March 12, 2017 09:03 PM
Last week there was a story about Consumer Reports doing security testing of products.


As one can imagine there were a fair number of “they’ll get it wrong” sort of comments. They will get it wrong, at first, but that’s not a reason to pick on these guys. They’re quite brave to take this task on, it’s nearly impossible if you think about the state of security (especially consumer security). But this is how things start. There is no industry that has gone from broken to perfect in one step. It’s a long hard road when you have to deal with systemic problems in an industry. Consumer product security problems may be larger and more complex than any other industry has ever had to solve thanks to things such as globalization and how inexpensive tiny computers have become.

If you think about the auto industry, you’re talking about something that costs thousands of dollars. Safety is easy to justify as it’s going to be less than the overall cost of the vehicle. Now if we think about tiny computing devices, you could be talking about chips that cost less than one dollar. If the cost of security and safety will be more than the initial cost of the computing hardware it can be impossible to justify that cost. If adding security doubles the cost of something, the manufacturers will try very hard to find ways around having to include such features. There are always bizarre technicalities that can help avoid regulation, groups like Consumer Reports help with accountability.

Here is where Consumer Reports and other testing labs will be incredibly important to this story. Even if there is regulation a manufacturer chooses to ignore, a group like Consumer Reports can still review the product. Consumer Reports will get things very wrong at first, sometimes it will be hilariously wrong. But that’s OK, it’s how everything starts. If you look back at any sort of safety and security in the consumer space, it took a long time, sometimes decades, to get it right. Cybersecurity will be no different, it’s going to take a long time to even understand the problem.

Our default reaction to mistakes is often one of ridicule, this is one of those times we have to be mindful of how dangerous this attitude is. If we see a group trying to do the right thing but getting it wrong, we need to offer advice, not mockery. If we don’t engage in a useful and serious way nobody will take us seriously. There are a lot of smart security folks out there, we can help make the world a better place this time. Sometimes things can look hopeless and horrible, but things will get better. It’ll take time, it won’t be easy, but things will get better thanks to efforts such as this one.

Episode 37 - Your bathtub is more dangerous than a shark

Posted by Open Source Security Podcast on March 09, 2017 12:40 AM
Josh and Kurt discuss how the Vault 7 leaks shows we live in the Neuromancer world, and this is likely the new normal.

Download Episode

Show Notes


Episode 36 - A Good Enough Podcast

Posted by Open Source Security Podcast on March 05, 2017 06:48 PM
Josh and Kurt discuss an IoT bear, Alexa and Siri, Google's E2Email and S/MIME.

Download Episode

Show Notes


Better Resolution of Kerberos Credential Caches

Posted by Nathaniel McCallum on March 03, 2017 03:45 PM

DevConf is a great time of year. Lots of developers gather in one place and we get to discuss integration issues between projects that may not have a direct relationship. One of those issues this year was the desktop integration of Kerberos authentication.

GNOME Online Accounts has supported the creation of Kerberos accounts since nearly the beginning, thanks to the effort of Debarshi Ray. However, we were made aware of an issue this year that had not come up before. Namely, in a variety of cases GSSAPI would not be able to complete authentication for non-default TGTs.

Roughly, this meant that if you logged into Kerberos using two different accounts GSSAPI would only be able to complete authentication using your default credential cache - meaning the last account you logged into. Users could work around this problem by using kswitch to change their default credential cache. However, since authentication transparently failed, there was no indication to the user that this could work. So the user experience was particularly poor.
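For reference, the workaround itself is a one-liner, shown here with a hypothetical principal name:

kswitch -p alice@EXAMPLE.COM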

This difficulty became even more noticeable after the Fedora deployment of Kerberos by Patrick Uiterwijk. Many Fedora developers also use Kerberos for other realms, so the pain was spreading.

I am happy to say that we have discovered a cure for this malady!

Matt Rogers worked with upstream to merge this patch which causes GSSAPI to do the RightThing™. Robbie Harwood landed the patch in Fedora (rawhide, 26, 25). So we believe this issue to be resolved.

If you’re a Fedora 25 user, please help us test the fix! There is a pending update for krb5 on Bodhi. The easy way to reproduce this issue is as follows (a shell sketch follows the list):

  1. Log in with the Kerberos account you want to use for the test.
  2. Log in with another Kerberos account.
  3. Confirm that the second account is default with klist.
  4. Attempt to log in to a service using the first credential and GSSAPI. The easiest way to do this is probably to go to a Kerberos-protected website using your browser (assuming it is properly configured for GSSAPI).
  5. Before the patch, automatic login should fail. Afterwards, it shouldn’t.
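
The same steps as a rough shell sketch; the principals and URL here are hypothetical, and this assumes a collection-capable credential cache (such as the KEYRING type Fedora uses by default):

kinit alice@EXAMPLE.COM        # first account, the one we want to test
kinit bob@OTHER.EXAMPLE.COM    # second account, now the default cache
klist                          # confirm bob@OTHER.EXAMPLE.COM is the default
curl --negotiate -u : https://kerberized.example.com/   # GSSAPI as alice
kswitch -p alice@EXAMPLE.COM   # the old workaround: change the default cache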

Enjoy!

What the Oscars can teach us about security

Posted by Josh Bressers on March 02, 2017 07:05 PM
If you watched the 89th Academy Awards you saw a pretty big mistake at the end of the show. The short story is that Warren Beatty was handed the wrong envelope; he opened it, looked at it, then gave it to Faye Dunaway to read, which she did. The wrong people came on stage and started giving speeches, confused scrambling happened, and the correct winner was brought on stage. No doubt this will be talked about for many years to come as one of the most interesting and exciting events in the history of the awards ceremony.

People make mistakes, and we won’t dwell on how the wrong envelope made it into the announcer’s hands. The details of how this error came to be aren’t what’s important for this discussion. The important lesson for us is to watch Warren Beatty’s behavior. He clearly knew something was wrong; if you watch the video of him, you can tell things aren’t right. But he just kept going, gave the card to Faye Dunaway, and she read the name of the movie on the card. These people aren’t some young amateurs, these are seasoned actors. It’s not their first rodeo. So why did this happen?

The lesson for us all is to understand that when things start to break down, people will fall back to their instincts. The presenters knew their job was to open the card and read the name. Their job wasn’t to think about it or question what they were handed. As soon as they knew something was wrong, they went on autopilot and did what was expected. This happens with computer security all the time. If people get a scary phishing email, they will often go into autopilot and do things they wouldn’t do if they kept a level head. Most attackers know how this works and they prey on this behavior. It’s really easy to claim you’d never be so stupid as to download that attachment or click on that link, but you’re not under stress. Once you’re under stress, everything changes.

This is why police, firefighters, and soldiers get a lot of training. You want these people to do the right thing when they enter autopilot mode. As soon as a situation starts to get out of hand, training kicks in and these people will do whatever they were trained to do without thinking about it. Training works, there’s a reason they train so much. Most people aren’t trained like this so they generally make poor decisions when under stress.

So what should we take away from all this? The thing we as security professionals need to keep in mind is how this behavior works. If you have a system that isn’t essentially “secure by default”, anytime someone finds themselves under mental stress, they’re going to take the path of least resistance. If that path of least resistance is also something dangerous happening, you’re not designing for security. Even security experts will have this problem; we don’t have superpowers that let us make good choices in times of high stress. It doesn’t matter how smart you think you are: when you’re under a lot of stress, you will go into autopilot, and you will make bad choices if bad choices are the defaults.

Episode 35 - Crazy Cosmic Accident

Posted by Open Source Security Podcast on February 28, 2017 03:04 AM
Josh and Kurt discuss SHA-1 and cloudbleed. Bug bounties come up, we compare security to the Higgs boson, and IPv6 comes up at the end.

Download Episode

Show Notes



A Farewell To Git

Posted by Robbie Harwood on February 26, 2017 05:00 AM

Last week, by chance, I wrote the Git tutorial I'd been threatening friends with. And I say by chance, of course, because this week the ability to generate SHA-1 collisions was all but dropped on the world.

Which, let's be clear, is horrible news for everyone who now has to move their software off of SHA-1. But this work isn't entirely new: for the most part, it's a practicalization of work published in 2015. Which follows on the heels of a string of weakenings dating all the way back into early 2005 (or even earlier, depending on what you're willing to count). During which time there has been growing concern from the security-minded about software that used SHA-1, and even more importantly, serious efforts from industry leaders to futureproof their code.

It's a shame that Git, released in mid-2005, didn't heed the warnings. From John Gilmore, no less. But that was my main point in the previous post: Linus - and Git - abhor abstraction (and politeness) in any form. And it is to their detriment.

Problem?

So in practical terms, to you and me (both relatively normal Git users), what does the ability to generate SHA-1 collisions mean?

To be fair, Linus did think about this. And he's right, as far as I know: Git prefers the local version of a hash, so there is no danger of a remote overwriting it. But the problem space is bigger than that, and therefore not a "non-issue" (as he put it).

The important discussion is this: what if the collided hash isn't in the local repository already?

Of course, there are many hashes (most, even) which are not in any given local repository. We, as hypothetical attackers, need concern ourselves with predicting a hash that the user will download (which they do not already have) that we can substitute a malicious collision for. We also need to do this in such a way that the user will not notice: pulling a commit or branch for purposes of code review will not be sufficient, for instance.

I think the easiest way to ensnare users is to catch emergency re-clones, and similar operations (toot toot), during which the user places more trust than they ought in the server. If everything looks right on the surface, and the old repository is in a semi-destroyed state, are we really going to review the immense amount of code looking for problems? In a project the size of this blog: maybe. Certainly not for the larger projects.

And of course there are a few clever offshoots of this idea that could be exploited (how many people do you think really check that the code they reviewed in the web tool matches the contents of the commit they just merged?), but that interests me less than this attack I thought up this morning.

In last week's post, I mentioned release branches. These are a model I work with a great deal, both as a contributor to established upstream projects and as a maintainer for distributions. A release branch, of course, mostly consists of commits which are the result of cherry-picking the development branch. And in that post, I suggested further the use of a flag - -x - which embeds the hash of the original commit in the release branch's version (in the commit message, specifically).
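
As a concrete sketch (the hash and branch name here are invented for illustration):

git checkout release-1.2
git cherry-pick -x 9fceb02     # backport a development-branch commit
git log -1 --format=%B         # the message now ends with a line like:
                               #   (cherry picked from commit 9fceb02...)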

Having the "original" commit hash easily accessible from the release branch is advantageous because we are working around a design decision in Git: that the uniqueness of commits takes into account their parents. More specifically, two commits which make the exact same changes to the exact same files but with different parents are, for purposes of Git, considered different commits. They will have different SHA-1 hashes.

This design decision becomes a design failure when we want to make release branches (which the kernel itself does...) and need to backport (i.e., duplicate changing only the parent) commits from stable branches. So to paper over the issue, we use -x, and then we can (by hand!) extract the original commit hash from the backport's message. It's worth noting that in "patch-based" systems (Darcs, Pijul, and friends), the original commit and the backport are the same commit, just applied on different branches. They do not suffer from this problem.

As each commit contains parent information, Git can normally be quite good at leaving no references unresolved by just downloading all the hashes that are referred to. But the cherry-pick hashes are not references that are visible to Git, and even if we added logic to try to detect them, it would never be good enough. Git also has the misfeature of allowing unique substrings of hashes at any point where full hashes could be used, which, in hindsight, seems designed to enable collision (and is a decision that GPG also made for fingerprints, where it is a problem today).

All that remains at that point is for an unsuspecting user to have a release branch checked out without a full copy of the master branch. Which is surprisingly likely for a distro packager (speaking from experience), or for new users who don't really need years of development and so made a shallow clone (git clone --depth 1 or so) in order to save time/bandwidth.

This practice of embedding SHA-1 hashes in commits is also why I predict that migrating Git off of SHA-1, if it even happens, will require an effort on the scale of migrating off of SVN (which still hasn't happened for many modern projects!).

Final thoughts

I would love it if the ideas we abandoned in the quest for speed uber alles would return. I see projects like Pijul, improving on Darcs, written in a modern language, and boasting performance better than Git, and it gives me hope. BitKeeper has an open source license now, so perhaps we will think about weave merge once again. It doesn't feel like too much of a stretch to imagine a resurrected Monotone pushing the importance of integrity (cryptographic or otherwise) and abstraction, or perhaps large corporate players will continue to forcibly drag Mercurial along with them. Or maybe a new tool that hasn't yet seen attention will steal the show.

Just please, not Git. Not again. Not still. No more.

I am a Cranky, White, Male Feminist

Posted by Stephen Gallagher on February 25, 2017 02:52 AM

Today, I was re-reading a linux.com article from 2014 by Leslie Hawthorne which had been reshared by the Linux Foundation Facebook account yesterday in honor of #GirlDay2017 (which I was regrettably unaware of until it was over). It wasn’t so much the specific content of the article that got me thinking, but instead the level of discourse that it “inspired” on the Facebook thread that pointed me there (I will not link to it, as it is unpleasant and reflects poorly on The Linux Foundation, an organization which is in most circumstances largely benevolent).

In the article, Hawthorne describes the difficulties that she faced as a woman in getting involved in technology (including being dissuaded by her own family out of fear for her future social interactions). While in her case, she ultimately ended up involved in the open-source community (albeit through a roundabout journey), she explained the sexism that plagued this entire process, both casual and explicit.

What caught my attention (and drew my ire) was the response to this article. This included such thoughtful responses as “Come to my place baby, I’ll show you my computer” as well as completely tone-deaf assertions that if women really wanted to be involved in tech, they’d stick it out.

Seriously, what is wrong with some people? What could possibly compel you to “well, actually” a post about a person’s own personal experience? That part is bad enough, but to turn the conversation into a deeply creepy sexual innuendo is simply disgusting.

Let me be clear about something: I am a grey-haired, cis-gendered male of Eastern European descent. As Patrick Stewart famously said:

[image: Patrick Stewart quote]

I am also the parent of two young girls, one of whom is celebrating her sixth birthday today. The fact of the timing is part of what has set me off. You see, this daughter of mine is deeply interested in technology and has been since a very early age. She’s a huge fan of Star Wars, LEGOs and point-and-click adventure games. She is going to have a very different experience from Ms. Hawthorne’s growing up, because her family is far more supportive of her interests in “nerdy” pursuits.

But still I worry. No matter how supportive her family is: Will this world be willing to accept her when she’s ready to join it? How much pressure is the world at large going to put on her to follow “traditional” female roles? (By “traditional” I basically mean the set of things that were decided on in the 1940s and 1950s and suddenly became the whole history of womanhood…)

So let me make my position perfectly clear.  I am a grey-haired, cis-gendered male of Eastern European descent. I am a feminist, an ally and a human-rights advocate. If I see bigotry, sexism, racism, ageism or any other “-ism” that isn’t humanism in my workplace, around town, on social media or in the news, I will take a stand against it, I will fight it in whatever way is in my power and I will do whatever I can to make a place for women (and any other marginalized group) in the technology world.

Also, let me be absolutely clear about something: if I am interviewing two candidates for a job (any job, at my current employer or otherwise) of similar levels of suitability, I will fall on the side of hiring the woman, ethnic minority or non-cis-gendered person over a Caucasian man. No, this is not “reverse racism” or whatever privileged BS you think it is. Simply put: this is a set of people who have had to work at least twice as hard to get to the same point as their privileged Caucasian male counterpart and I am damned sure that I’m going to hire the person with that determination.

As my last point (and I honestly considered not addressing it), I want to call out the ignorant jerks who claim, quote “Computer science isn’t a social process at all, it’s a completely logical process. People interested in comp. sci. will pursue it in spite of people, not because of it. If you value building relationships more than logical systems, then clearly computer science isn’t for you.” When you say this, you are saying that this business should only permit socially-inept males into the club. So let me use some of your “completely logical process” to counter this – and I use the term extremely liberally – argument.

In computer science, we have an expression: “garbage in, garbage out”. What it essentially means is that when you write a function or program that processes data, if you feed it bad data in, you generally get bad (or worthless… or harmful…) data back out. This is however not limited to code. It is true of any complex system, which includes social and corporate culture. If the only input you have into your system design is that of egocentric, anti-social men, then the only things you can ever produce are those things that can be thought of by egocentric, anti-social men. If you want instead to have a unique, innovative idea, then you have to be willing to listen to ideas that do not fit into the narrow worldview that is currently available to you.

Pushing people away and then making assertions that “if people were pushed away so easily, then they didn’t really belong here” is the most deplorable ego-wank I can think of. You’re simultaneously disregarding someone’s potential new idea while helping to remove all of their future contributions from the available pool while at the same time making yourself feel superior because you think you’re “stronger” than they are.

To those who are reading this and might still feel that way, let me remind you of something: chances are, you were bullied as a child (I know I was). There are two kinds of people who come away from that environment. One is the type who remembers what it was like and tries their best to shield others from similar fates. The other is the type that finds a pond where they can be the big fish and then gets their “revenge” by being a bully themselves to someone else.

If you’re one of those “big fish”, let me be clear: I intend to be an osprey.


SHA-1 is dead, long live SHA-1!

Posted by Josh Bressers on February 24, 2017 01:45 AM
Unless you’ve been living under a rock, you heard that some researchers managed to create a SHA-1 collision. The short story as to why this matters: the whole purpose of a hashing algorithm is to make it impossible to generate collisions on purpose. Unfortunately, making collisions impossible is itself impossible, so in reality we just make sure it’s really, really hard to generate a collision. Thanks to Moore’s Law, hard things don’t stay hard forever. This is why MD5 had to go live on a farm out in the country, and we’re not allowed to see it anymore … because it’s having too much fun. SHA-1 will get to join it soon.

The details about this attack are widely published at this point, but that’s not what I want to discuss, I want to bring things up a level and discuss the problem of algorithm deprecation. SHA-1 was basically on the way out. We knew this day was coming, we just didn’t know when. The attack isn’t super practical yet, but give it a few years and I’m sure there will be some interesting breakthroughs against SHA-1. SHA-2 will be next, which is why SHA-3 is a thing now. At the end of the day though this is why we can’t have nice things.

A long time ago there weren’t a bunch of expired standards. There were mostly just current standards and what we would call “old” standards. We kept them around because it was less work than telling them we didn’t want to be friends anymore. Sure they might show up and eat a few chips now and then, but nobody really cared. Then researchers started to look at these old algorithms and protocols as a way to attack modern systems. That’s when things got crazy.

It’s a bit like someone bribing one of your old annoying friends to sneak the attacker through your back door during a party. The friend knows you don’t really like him anymore, so it won’t really matter if he gets caught. Thus began the long and horrible journey to start marking things as unsafe. Remember how long it took before MD5 wasn’t used anymore? How about SSL 2 or SSHv1? It’s not easy to get rid of widely used standards even if they’re unsafe. Anytime something works it won't be replaced without a good reason. Good reasons are easier to find these days than they were even a few years ago.

This brings us to the recent SHA-1 news. I think it's going better this time, a lot better. The browsers already have plans to deprecate it. There are plenty of good replacements ready to go. Did we ever discuss killing off MD5 before it was clearly dead? Not really. It wasn't until a zero-day MD5 attack was made public that it was decided maybe we should stop using it. Everyone knew it was bad for them, but they figured it wasn’t that big of a deal. I feel like everyone understands SHA-1 isn’t a huge deal yet, but it’s time to get rid of it now while there’s still time.

This is the world we live in now. If you can't move quickly you will fail. It's not a competitive advantage, it's a requirement for survival. Old standards no longer ride into the sunset quietly, they get their lunch money stolen, jacket ripped, then hung by a belt loop on the fence.

Episode 34 - Bathing in Ebola Virus

Posted by Open Source Security Podcast on February 22, 2017 09:26 PM
Josh and Kurt discuss RSA, the cryptographer's panel and of course, AI.

Download Episode

Show Notes


Our Bootloader Problem

Posted by Nathaniel McCallum on February 21, 2017 11:05 PM

GRUB, it is time we broke up. It’s not you, it’s me. Okay, it’s you. The last 15+ years have some great (read: painful) memories. But it is time to call it quits.

Red Hat Linux (not RHEL) deprecated LILO for version 9 (PDF; hat tip: Spot). This means that Fedora has used GRUB as its bootloader since the very first release: Fedora Core 1.

GRUB was designed for a world where bootloaders had to locate a Linux kernel on a filesystem. This meant it needed support for all the filesystems anyone might conceivably use. It was also built for a world where dual-booting meant having a bootloader-implemented menu to choose between operating systems.

The UEFI world we live in today looks nothing like this. UEFI requires support for a standard filesystem. This filesystem, which for all intents and purposes duplicates the contents of /boot, is required on every Linux system which boots UEFI. So UEFI loads the bootloader from the UEFI partition and then the bootloader loads the kernel from the /boot partition.

Did you know that UEFI can just boot the kernel directly? It can!
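
A rough sketch of what that can look like with efibootmgr, assuming a kernel built with CONFIG_EFI_STUB and copied to the UEFI filesystem (the disk, paths, and kernel arguments here are all illustrative):

efibootmgr --create --disk /dev/sda --part 1 \
    --label "Fedora (direct kernel boot)" \
    --loader '\vmlinuz-4.9.efi' \
    --unicode 'root=/dev/sda2 ro initrd=\initramfs-4.9.img'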

The situation, however, is much worse than just duplicated effort. With the exception of Apple hardware, practically all UEFI implementations ship with Secure Boot and a TPM enabled by default. Only appropriately signed UEFI code will be run. This means we now introduce a shim which is signed. This, in turn, loads GRUB from the UEFI partition.

This means that our boot process now looks like this:

  • UEFI filesystem
    1. shim
    2. GRUB
  • /boot filesystem
    1. Linux

It gets worse. Microsoft OEMs are now enabling BitLocker by default. BitLocker seals (encrypts) the Windows partition to the TPM PCRs. This means that if the boot process changes (and you have no backup of the key), you can’t decrypt your data. So remember that great boot menu that GRUB provided so we can dual-boot with Windows? It can never work, cryptographically.

The user experience of this process is particularly painful. Users who manage to get Fedora installed will see a nice GRUB menu entry for Windows. But if they select it, they are immediately greeted with a terrifying message telling them that the boot configuration has changed and their encrypted data is inaccessible.

To recap, where Secure Boot is enabled (pretty much all Intel hardware), we must use the boot menu provided by UEFI. If we don’t, the PCRs of the TPM have unknown hashes and anything sealed to the boot state will fail to decrypt.

The good news is that Intel provides a reference implementation of UEFI, and it includes pretty much everything we’d ever need. This means that most vendors get it pretty much correct as well. OEMs are even using these facilities for their own (hidden) recovery partitions.

So why not just have UEFI boot the kernel directly? There are still some drawbacks to this approach.

First, it requires signing every build of the kernel. This is definitely undesirable since kernels are updated pretty regularly.

Second, every kernel upgrade would mean a write to UEFI NVRAM. There are some concerns about the longevity of the hardware under such frequent UEFI writes.

Third, it exposes kernels as a menu option in UEFI. This menu typically contains operating systems, not individual kernels, which results in a poor user experience. Most users don’t need to care about what kernel they boot. There should be a bootloader which loads the most recently installed kernel and falls back to older kernels if the new kernels fail to boot. All of this can be done without a menu (unless the user presses a key).

Fortunately, systemd already implements precisely such a bootloader. Previously, this bootloader was called gummiboot. But it has since been merged into the systemd repository as systemd-boot.

With systemd-boot, our boot process can look like this:

  • UEFI filesystem
    1. shim
    2. systemd-boot
    3. Linux

It would even be possible (though, not necessarily desirable) to sign systemd-boot directly and get rid of the shim.

In short, we need to stop trying to make GRUB work in our current context and switch to something designed specifically for the needs of our modern systems. We already ship this code in systemd. Further, systemd already ships a tool for managing the bootloader. We just need to enable it in Anaconda and test it.
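
For the curious, a sketch of what the switch can look like by hand today (the kernel version, paths, and options are illustrative):

bootctl install    # copies systemd-boot into the UEFI filesystem

# /boot/loader/entries/fedora.conf - a Boot Loader Specification entry
title   Fedora
linux   /vmlinuz-4.9.14-200.fc25.x86_64
initrd  /initramfs-4.9.14-200.fc25.x86_64.img
options root=/dev/sda2 ro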

Who’s with me!?

P.S. - It would be very helpful if we could get some good documentation on manually migrating from GRUB to systemd-boot. This would at least enable the testing of this setup by brave users.

Wildcard certificates in FreeIPA

Posted by Fraser Tweedale on February 20, 2017 04:55 AM

The FreeIPA team sometimes gets asked about wildcard certificate support. A wildcard certificate is an X.509 certificate where the DNS-ID has a wildcard in it (typically as the most specific domain component, e.g. *.cloudapps.example.com). Most TLS libraries match wildcard domains in the obvious way: the wildcard stands in for a single label, so *.cloudapps.example.com matches app.cloudapps.example.com but not cloudapps.example.com itself.

In this blog post we will discuss the state of wildcard certificates in FreeIPA, but before proceeding it is fitting to point out that wildcard certificates are deprecated, and for good reason. While the compromise of any TLS private key is a serious matter, the attacker can only impersonate the entities whose names appear on the certificate (typically one or a handful of DNS addresses). But a wildcard certificate can impersonate any host whose name happens to match the wildcard value.

In time, validation of wildcard domains will be disabled by default and (hopefully) eventually removed from TLS libraries. The emergence of protocols like ACME that allow automated domain validation and certificate issuance mean that there is no real need for wildcard certificates anymore, but a lot of programs are yet to implement ACME or similar; therefore there is still a perceived need for wildcard certificates. In my opinion some of this boils down to lack of awareness of novel solutions like ACME, but there can also be a lack of willingness to spend the time and money to implement them, or a desire to avoid changing deployed systems, or taking a "wait and see" approach when it comes to new, security-related protocols or technologies. So for the time being, some organisations have good reasons to want wildcard certificates.

FreeIPA currently has no special support for wildcard certificates, but with support for custom certificate profiles, we can create and use a profile for issuing wildcard certificates.

Creating a wildcard certificate profile in FreeIPA

This procedure works on FreeIPA 4.2 (RHEL 7.2) and later.

First, kinit admin and export an existing service certificate profile configuration to a file:

ftweedal% ipa certprofile-show caIPAserviceCert --out wildcard.cfg
---------------------------------------------------
Profile configuration stored in file 'wildcard.cfg'
---------------------------------------------------
  Profile ID: caIPAserviceCert
  Profile description: Standard profile for network services
  Store issued certificates: TRUE

Modify the profile; the minimal diff is:

--- wildcard.cfg.bak
+++ wildcard.cfg
@@ -19 +19 @@
-policyset.serverCertSet.1.default.params.name=CN=$request.req_subject_name.cn$, o=EXAMPLE.COM
+policyset.serverCertSet.1.default.params.name=CN=*.$request.req_subject_name.cn$, o=EXAMPLE.COM
@@ -108 +108 @@
-profileId=caIPAserviceCert
+profileId=wildcard

Now import the modified configuration as a new profile called wildcard:

ftweedal% ipa certprofile-import wildcard \
    --file wildcard.cfg \
    --desc 'Wildcard certificates' \
    --store 1
---------------------------
Imported profile "wildcard"
---------------------------
  Profile ID: wildcard
  Profile description: Wildcard certificates
  Store issued certificates: TRUE

Next, set up a CA ACL to allow the wildcard profile to be used with the cloudapps.example.com host:

ftweedal% ipa caacl-add wildcard-hosts
-----------------------------
Added CA ACL "wildcard-hosts"
-----------------------------
  ACL name: wildcard-hosts
  Enabled: TRUE

ftweedal% ipa caacl-add-profile wildcard-hosts --certprofiles wildcard
  ACL name: wildcard-hosts
  Enabled: TRUE
  CAs: ipa
  Profiles: wildcard
-------------------------
Number of members added 1
-------------------------

ftweedal% ipa caacl-add-host wildcard-hosts --hosts cloudapps.example.com
  ACL name: wildcard-hosts
  Enabled: TRUE
  CAs: ipa
  Profiles: wildcard
  Hosts: cloudapps.example.com
-------------------------
Number of members added 1
-------------------------

An additional step is required in FreeIPA 4.4 (RHEL 7.3) and later (it does not apply to FreeIPA < 4.4):

ftweedal% ipa caacl-add-ca wildcard-hosts --cas ipa
  ACL name: wildcard-hosts
  Enabled: TRUE
  CAs: ipa
-------------------------
Number of members added 1
-------------------------
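
If you need a CSR to test with, a minimal openssl sketch (the key parameters and file names here are arbitrary):

ftweedal% openssl req -new -nodes \
    -newkey rsa:2048 -keyout wildcard.key \
    -subj '/CN=cloudapps.example.com' \
    -out my.csr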

Then, with a CSR with subject CN=cloudapps.example.com in hand, issue the certificate:

ftweedal% ipa cert-request my.csr \
    --principal host/cloudapps.example.com \
    --profile wildcard
  Issuing CA: ipa
  Certificate: MIIEJzCCAw+gAwIBAgIBCzANBgkqhkiG9w0BAQsFADBBMR8...
  Subject: CN=*.cloudapps.example.com,O=EXAMPLE.COM
  Issuer: CN=Certificate Authority,O=EXAMPLE.COM
  Not Before: Mon Feb 20 04:21:41 2017 UTC
  Not After: Thu Feb 21 04:21:41 2019 UTC
  Serial number: 11
  Serial number (hex): 0xB

Alternatively, you can use Certmonger to request the certificate:

ftweedal% ipa-getcert request \
  -d /etc/httpd/alias -p /etc/httpd/alias/pwdfile.txt \
  -n wildcardCert \
  -T wildcard

This will request a certificate for the current host. The -T option specifies the profile to use.

Discussion

Observe that the subject common name (CN) in the CSR does not contain the wildcard. FreeIPA requires naming information in the CSR to perfectly match the subject principal. As mentioned in the introduction, FreeIPA has no specific support for wildcard certificates, so if a wildcard were included in the CSR, it would not match the subject principal and the request would be rejected.

When constructing the certificate, Dogtag performs a variable substitution into a subject name string. That string contains the literal wildcard and the period to its right, and the common name (CN) from the CSR gets substituted in after that. The relevant line in the profile configuration is:

policyset.serverCertSet.1.default.params.name=CN=*.$request.req_subject_name.cn$, o=EXAMPLE.COM

When it comes to wildcards in Subject Alternative Name DNS-IDs, it might be possible to configure a Dogtag profile to add this in a similar way to the above, but I do not recommend it, nor am I motivated to work out a reliable way to do this, given that wildcard certificates are deprecated. (By the time TLS libraries eventually remove support for treating the subject CN as a DNS-ID, I will have little sympathy for organisations that still haven’t moved away from wildcard certs).

In conclusion: you shouldn’t use wildcard certificates, and FreeIPA has no special support for them, but if you really need to, you can do it with a custom certificate profile.

Episode 33 - Everybody who went to the circus is in the circus (RSA 2017)

Posted by Open Source Security Podcast on February 15, 2017 06:22 AM
Josh and Kurt are at the same place at the same time! We discuss our RSA sessions and how things went. Talk of CVE IDs, open source libraries, Wordpress, and early morning sessions.

Download Episode

Show Notes


Reality Based Security

Posted by Josh Bressers on February 13, 2017 04:37 AM
If I demand you jump off the roof and fly, and you say no, can I call you a defeatist? What would you think? To a reasonable person it would be insane to associate this attitude with being a defeatist. There are certain expectations that fall within the confines of reality. Expecting things to happen outside of those rules is reckless and can often be dangerous.

Yet in the universe of cybersecurity we do this constantly. Anyone who doesn’t pretend we can fix problems is a defeatist and part of the problem. We just have to work harder and not claim something can’t be done, that’s how we’ll fix everything! After being called a defeatist during a discussion, I decided to write some things down. We spend a lot of time trying to fly off of roofs instead of looking for practical realistic solutions for our security problems.

The way cybersecurity works today, someone will say “this is a problem”. Maybe it’s IoT, or ransomware, or antivirus, secure coding, security vulnerabilities; whatever, pick something, there’s plenty to choose from. It’s rarely in a general context though; it will be sort of specific, for example “we have to teach developers how to stop adding security flaws to software”. Someone else will say “we can’t fix that”, then they get called a defeatist for being negative, and it’s assumed the defeatists are the problem. The real problem is they’re not wrong. It can’t be fixed. We will never see humans write error-free code; there is no amount of training we can give them. Pretending it can is what’s dangerous. Pretending we can fix problems we can’t is lying.

The world isn’t fairy dust and rainbows. We can’t wish for more security and get it. We can’t claim to be working on a problem if we have no clue what it is or how to fix it. I’ll pick on IoT for a moment. How many IoT security “experts” exist now? The number is non-trivial. Does anyone have any ideas how to understand the IoT security problems? Talking about how to fix IoT doesn’t make sense today; we don’t even really understand what’s wrong. Is the problem devices that never get updates? What about poor authentication? Maybe managing the devices is the problem? It’s not one thing, it’s a lot of things put together in a martini shaker, shaken up, then dumped out in a heap. We can’t fix IoT because we don’t know what it even is in many instances. I’m not a defeatist, I’m trying to live in reality and think about the actual problems. It’s a lot easier to focus on solutions for problems you don’t understand; you will find a solution, but those solutions won’t make sense.

So what do we do now? There isn’t a quick answer, there isn’t an easy answer. The first step is to admit you have a problem though. Defeatists are a real thing, there’s no question about it. The trick is to look at the people who might be claiming something can’t be fixed. Are they giving up, or are they trying to reframe the conversation? If you declare them a defeatist, the conversation is now over, you killed it. On the other side of the coin, pretending things are fine is more dangerous than giving up, you’re living in a fantasy. The only correct solution is reality based security. Have honest and real conversations, don’t be afraid to ask hard questions, don’t be afraid to declare something unfixable. An unfixable problem is really just one that needs new ideas.

You can't fly off the roof, but trampolines are pretty awesome.

I'm @joshbressers on Twitter, talk to me.