Fedora security Planet

Testing if a patch has test coverage

Posted by Adam Young on July 17, 2018 05:35 PM

When a user requests a code review, the review is responsible for making sure that the code is tested.  While the quality of the tests is a subjective matter, their presences is not;  either they are there or they are not there.  If they are not there, it is on the developer to explain why or why not.

Not every line of code is testable.  Not every test is intelligent.  But, at a minimum, a test should ensure that the code in a patch is run at least once, without an unexpected exception.

For Keystone and related projects, we have a tox job called cover that we can run on a git repo at a given revision.  For example, I can code review (even without git review) by pulling down a revision using the checkout link in  gerrit, and then running tox:


git fetch git://git.openstack.org/openstack/keystoneauth refs/changes/15/583215/2 && git checkout FETCH_HEAD
git checkout -b netloc-and-version
tox -e cover

I can look at the patch using show –stat to see what files were changed:

$ git show --stat
commit 2ac26b5e1ccdb155a4828e3e2d030b55fb8863b2
Author: wangxiyuan <wangxiyuan>
Date:   Tue Jul 17 19:43:21 2018 +0800

    Add netloc and version check for version discovery
    If the url netloc in the catalog and service's response
    are not the same, we should choose the catalog's and
    add the version info to it if needed.
    Change-Id: If78d368bd505156a5416bb9cbfaf988204925c79
    Closes-bug: #1733052

 keystoneauth1/discover.py                                 | 16 +++++++++++++++-
 keystoneauth1/tests/unit/identity/test_identity_common.py |  2 +-

and I want to skip looking at any files in keystoneauth1/tests as those are not production code. So we have 16 lines of new code. What are they?

Modifying someone elses’ code, I got to

 git show | gawk 'match($0,"^@@ -([0-9]+),[0-9]+ [+]([0-9]+),[0-9]+ @@",a){left=a[1];right=a[2];next};\
   /^-/{left++; next};\
   /^[+]/{print right++;next};\
   {left++; right++}'

Which gives me:

+++ b/keystoneauth1/discover.py
+++ b/keystoneauth1/tests/unit/identity/test_identity_common.py

Looking in a the cover directory, I can see if a line is uncovered by its class:

class="stm mis"

For example:

$ grep n432\" cover/keystoneauth1_discover_py.html | grep "class=\"stm mis\""


For the lines above, I can use a seq to check them, since they are in order (with none missing)

for LN in `seq 420 437` ; do grep n$LN\" cover/keystoneauth1_discover_py.html ; done

Which produces:



















I drop the grep “class=\”stm mis\”” to make sure I get something, then add it back in, and get no output.

Episode 105 - More backdoors in open source

Posted by Open Source Security Podcast on July 16, 2018 01:08 AM
Josh and Kurt talk about some recent backdoor problems in open source packages. We touch on is open source secure, how that security works, and what it should look like in the future. This problem is never going to go away or get better, and that's probably OK.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6814252/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Building QGo on RHEL 7.5

Posted by Adam Young on July 12, 2018 03:06 PM

I’ve played Go for years. I’ve found that having a graphical Go client has helped me improve my game immensely. And, unlike many distractors,. I can make a move, then switch back in to work mode without really losing my train of thought.

I always like the QGo client. I have found it to be worthwhile to build and run from the git repo. After moving to RHEL 7.5 for my desktop, I had to go through the process again. Here is the short version.

Playing Go using the the QGo Client

All of the pre-reqs can come from Yum.

For the compiler and build tools, it is easiest to use a yum group:

sudo yum groupinstall "Development and Creative Workstation"

Once those packages are installed, you need some of the Qt5 development packages. At the bottom are is the complete list I have. I did not install all of these directly, but instead recently installed:


TO run the actual qmake command, things are a bit different from the README.

/usr/bin/qmake-qt5 src

That puts things in ../build, which took me a moment to find.

Now I can run qgo with


Et Voila

QGo Running on RHEL 7.5

The complete list of qt packages I have installed are:


unlabeled_t type

Posted by Dan Walsh on July 12, 2018 03:02 PM

I often see bug reports or people showing AVC messages about confined domains not able to deal with unlabeled_t files.

type=AVC msg=audit(1530786314.091:639): avc:  denied  { read } for  pid=4698 comm="modprobe" name="modules.alias.bin" dev="dm-0" ino=9115100 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:unlabeled_t:s0 tclass=file

I just saw this AVC, which shows the openvswitch domain attempting to read a file, modules.alias.bin, with modprobe.   The usual response to this is to run restorecon on the files and everything should be fine.

But the next question I get is how did this content get the label unlabeled_t, and my response is usually I don't know, you did something.

Well lets look at how unlabeled_t files get created.

unlabeled_t really just means that the file on disk does not have an SELinux xattr indicating a file label.  Here are a few ways these files can get created

1 File was created by on a file system when the kernel was not running in SELinux mode.  If you take a system that was installed without SELinux (God forbid) or someone booted the machine with SELinux disabled, then all files created will not have labels.  This is why we force a relabel, anytime someone changes from SELinux disabled to SElinux enabled at boot time.

2. An extension on content created while the kernel is not in SELinux mode is files created in the initramfs before SELinux Policy in the kernel.  We have an issue in CoreOS Right now, where when the system boots up the initramfs is running `ignition`, which runs before systemd loads SELinux policy.  The inition scrips create files on the file system, while SELinux is not enabled in the kernel, so those files get created as unlabeled_t.  Ignition is adding a onetime systemd unit file to run restorecon on the content created.

3. People create USB Sticks with ext4 or xfs on them, on a non SELinux system, and then stick into systems with SELinux enabled and 'mv' the content onto the system.  The `mv` command actually maintains the SELinux label or lack thereof, when it moves files across file systems.  If you use a `mv -Z`, the mv command will relabel the target content, or you can just use restorecon.

4 The forth way I can think of creating unlabeled_t files it to create a brand new file system on an SELinux system.  When you create  a new file system the kernel creates the "/" (root) of the file system without a label.  So if you mound the file system on to a mount point, the directory where you mounted it will have no labels.  If an unconfined domain creates files no this new file system, then it will also create unlabeled_t files since the default behaviour of the SELinux kernel is create content based on the parents directory, which in this case is labeled unlabeled_t.  I recommend people run restorecon on the mount point as soon as you mount a new file system, to fix this behaviour.  Or you can run `restorecon -R -v MOUNTPOINT ` to cleanup all the files.

Note: The unlabeled_t type can also show up on other objects besides file system objects.  For example on labeled networks, but this blog is only concerned with file system objects.

Bottom Line:

Unlabeled file should always be cleaned up ASAP since they will cause confined domains lots of problems and restorecon is your friend.

Fun with DAC_OVERRIDE and SELinux

Posted by Dan Walsh on July 10, 2018 02:31 PM

Lately the SELinux team has been trying to remove as many SELinux Domain Types that have DAC_OVERRIDE.

man capabilities



              Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

This means a process with CAP_DAC_OVERRIDE can read any file on the system and can write any file on the system from a standard permissions point of view.  With SELinux it means that they can read all file types that SELinux allows them to read, even if they are running with a process UID that is not allowed to read the file.  Similar they are allowed to write all SELinux writable types even if they aren't allowed to write based on UID.  

Obviously most confined domains never need to have this access, but some how over the years lots of domains got added this access.  

I recently received and email asking about syslog, generating lots of AVC's.  The writer said that he understood SELinux and has set up the types for syslog to write to, and even the content was getting written properly.  But the Kernel was generating an AVC every time the service started.

Here is the AVC.

Jul 09 15:24:57

 audit[9346]: HOSTNAME AVC avc:  denied  { dac_override }  for  pid=9346 comm=72733A6D61696E20513A526567 capability=1   scontext=system_u:system_r:syslogd_t:s0  tcontext=system_u:system_r:syslogd_t:s0 tclass=capability permissive=0

Sadly the kernel is not in full debug mode so we don't know what the path that the syslog process was trying to read or write.  

Note: You can turn on full auditing using a command like: `auctl -w /etc/shadow`. But this could effect your system performances.

But I had a guess on what could be causing the AVC's.


A couple of easy places that a root process needs DAC_OVERRIDE is to look at the /etc/shadow file.

 ls -l /etc/shadow
----------. 1 root root 1474 Jul  9 14:02 /etc/shadow

As you see in the permissions no UID is allowed to read or write /etc/shadow,  So the only want to examine this file is using DAC_OVERRIDE.  But I am pretty sure syslogd is not attempting to read this file.  (Other SELinux AVC's would be screaming it if was).

The other location that can easily cause DAC_OVERRIDE AVC's is attempting to create content in the /root directory.

 ls -ld /root
dr-xr-x---. 19 root root 4096 Jul  9 15:59 /root

On Fedora, RHEL, Centos boxes, the root directory is set with permissions that do not allow any process to write to it even the root process, unless it uses DAC_OVERRIDE.  This is a security measure which prevents processes running as root that drop privileges from being able to write content in /root.  If a process can write content in /root, they could modify the /root/.bashrc file.  This means later an admin logging into the system as root executing a shell would execute the .bashrc script with full privs.  By setting the privs on the /root directory to 550, the systems are a little more security and admins know that only processes with DAC_OVERRIDE can write to this directory.  

Well this causes an issue.  Turns out that starting a shell like bash, ut wants to write to the the .bash_history directory in its home dir, if the script is running as root it wants to write /root/.bash_history file.  If the file does not exists, then the shell would require DAC_OVERRIDE to write this file.  Luckily bash continues working fine if it can not write this file.  

But if you are running on an SELinux system a confined application that launches bash, will generate an AVC message to the kernel stating that the confined domain wans DAC_OVERRIDE.

I recommend that if this situation happens to just add a DONTAUDIT rule to the policy.  Then SELinux will be silent about the denial, but the process will still not gain that access.

audit2allow -D -i /tmp/avc.log
#============= syslogd_t ==============
dontaudit syslogd_t self:capability dac_override;

To generate policy

audit2allow -M mysyslog -D -i /tmp/t1
******************** IMPORTANT ***********************
To make this policy package active, execute:
semodule -i mysyslog.pp


Bottom line, DAC_OVERRIDE is a fairly dangerous access to grant and can often be granted when it is not really necessary.  So I recommend fixing the permissions on files/directories or just adding dontaudit rules.

Episode 104 - The Gentoo security incident

Posted by Open Source Security Podcast on July 09, 2018 12:14 AM
Josh and Kurt talk about the Gentoo security incident. Gentoo did a really good job being open and dealing with the incident quickly. The basic takeaway from all this is make sure your organization is forcing users to use 2 factor authentication. The long term solution is going to be all identity providers forcing everyone to use 2FA.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6784275/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Converting policy.yaml to a list of dictionaries

Posted by Adam Young on July 08, 2018 03:38 AM

The policy .yaml file generated from oslo has the following format:

# Intended scope(s): system
#"identity:update_endpoint_group": "rule:admin_required"

# Delete endpoint group.
# DELETE /v3/OS-EP-FILTER/endpoint_groups/{endpoint_group_id}
# Intended scope(s): system
#"identity:delete_endpoint_group": "rule:admin_required"

This is not very useful for anything other than feeding to oslo-policy to enforce. If you want to use these values for anything else, it would be much more useful to have each rule as a dictionary, and all of the rules in a list. Here is a little bit of awk to help out:

#!/usr/bin/awk -f
BEGIN {apilines=0; print("---")}
/#"/ {
    if (api == 1){
	printf("  ")
	printf("- ")
  split ($0,array,"\"")
  print ("rule:", array[2]);
  print ("  check:", array[4]);
/# / {api=1;}
/^$/ {api=0; apilines=0;}
api == 1 && apilines == 0 {print ("- description:" substr($0,2))}
/# GET/  || /# DELETE/ || /# PUT/ || /# POST/ || /# HEAD/ || /# PATCH/ {
     print ("  " $2 ": " $3)
api == 1 { apilines = apilines +1 }

I have it saved in mungepolicy.awk. I ran it like this:

cat etc/keystone.policy.yaml.sample | ./mungepolicy.awk > /tmp/keystone.access.yaml

And the output looks like this:

- rule: admin_required
  check: role:admin or is_admin:1
- rule: service_role
  check: role:service
- rule: service_or_admin
  check: rule:admin_required or rule:service_role
- rule: owner
  check: user_id:%(user_id)s
- rule: admin_or_owner
  check: rule:admin_required or rule:owner
- rule: token_subject
  check: user_id:%(target.token.user_id)s
- rule: admin_or_token_subject
  check: rule:admin_required or rule:token_subject
- rule: service_admin_or_token_subject
  check: rule:service_or_admin or rule:token_subject
- description: Show application credential details.
  GET: /v3/users/{user_id}/application_credentials/{application_credential_id}
  HEAD: /v3/users/{user_id}/application_credentials/{application_credential_id}
  rule: identity:get_application_credential
  check: rule:admin_or_owner
- description: List application credentials for a user.
  GET: /v3/users/{user_id}/application_credentials
  HEAD: /v3/users/{user_id}/application_credentials
  rule: identity:list_application_credentials
  check: rule:admin_or_owner

Which is valid yaml. It might be a pain to deal with the verbs in separate keys. Ideally, that would be a list, too, but this will work for starters.

Running OpenStack components on RHEL with Software Collections

Posted by Adam Young on July 07, 2018 01:50 PM

The Python world has long since embraced Python3.  However, the stability guarantees of RHEL have limited it to Python2.7 as the base OS.  Now that I am running RHEL on my laptop, I have to find a way to work with Python 3.5 in order to contribute to OpenStack.  To further constrain myself, I do not want to “pollute” the installed python modules by using PIP to mix and match between upstream and downstream.  The solution is the Software Collections version of Python 3.5.  Here’s how I got it to work.

Start by enabling the Software Collections Yum repos and refreshing:

sudo subscription-manager repos --enable rhel-workstation-rhscl-7-rpms
sudo subscription-manager refresh

Now what I need is Python 3.5.  Since I did this via trial and error, I don’t have the exact yum command I used, but I ended up with the following rpms installed, and they were sufficient.


To enable the software collections:

scl enable rh-python35 bash

However, thus far there is no Tox installed. I can get using pip, and I’m ok with that so long as I do a user install.  Make sure You have run the above scl enable command to do this for the right version of python.

 pip install --user --upgrade tox

This puts all the code in ~/.local/ as well as appending ~/.local/bin dir to the PATH env var.  You need to restart your terminal session to pick that up on first use.

Now I can run code in the Keystone repo.  For example, to build the sample policy.json files:

tox -e genpolicy

A Git Style change management for a Database driven app.

Posted by Adam Young on July 06, 2018 07:38 PM

The Policy management tool I’m working on really needs revision and change management.  Since I’ve spent so much time with Git, it affects my thinking about change management things.  So, here is my attempt to lay out my current thinking for implementing a git-like scheme for managing policy rules.

A policy line is composed of two chunks of data.  A Key and a Value.  The keys are in the form


Additionally, the keys are scoped to a specific service (Keystone, Nova, etc).

The value is the check string.  These are of the form

role:admin and project_id=target.project_id

It is the check string that is most important to revision control. This lends itself to an entity diagram like this:

Whether each of these gets its own table remains to be seen.  The interesting part is the rule_name to policy_rule mapping.

Lets state that the policy_rule table entries are immutable.  If we want to change policy, we add a new entry, and leave the old ones in there.  The new entry will have a new revision value.  For now, lets assume revisions are integers and are monotonically increasing.  So, when I first upload the Keystone policy.json file, each entry gets a revision ID of 1.  In this example, all check_strings start off as are “is_admin:True”

Now lets assume I modify the identity:create_user rule.  I’m going to arbitrarily say that the id for this record is 68.  I want to Change it to:

role:admin and domain_id:target.domain_id

So we can do some scope checking.  This entry goes into the policy_rule table like so:


rule_name_id check_string revision
68 is_admin:True 1
68 role:admin and domain_id:target.domain_id 2

From a storage perspective this is quite nice, but from a “what does my final policy look like” perspective it is a mess.

In order to build the new view, we need sql along the lines of

select * from policy_rule where revision = ?

Lets call this line_query and assume that when we call it, the parameter is substituted for the question mark.  We would then need code like this pseudo-code:

doc = dict()
for revision in 1 to max:
    for result in line_query.execute(revision):
        index = result['rule_name_id']
        doc[index] = result.check_string()


This would build a dictionary layer by layer through all the revisions.

So far so good, but what happens if we decided to revert, and then to go a different direction? Right now, we have a revision chain like this:

And if we keep going, we have,

But what happens if 4 was a mistake? We need to revert to 6 and create a new new branch.

We have two choices. First, we could be destructive and delete all of the lines in revision 4, 5, and 6. This means we can never recreate the state of 6 again.

What if we don’t know that 4 is a mistake? What if we just want to try another route, but come back to 4,5, and 6 in the future?

We want this:


But how will we know to take the branch when we create the new doc?

Its a database! We put it in another table.

revision_id revision_parent_id
2 1
3 2
4 3
5 4
6 5
7 3
8 7
9 8

In order to recreate revision 9, we use a stack. Push 9 on the stack, then find the row with revision_id 9 in the table, push the revision_parent_id on the stack, and continue until there are no more rows.  Then, pop each revision_id off the stack and execute the same kind of pseudo code I posted above.

It is a lot.  It is kind of complicated, but it is the type of complicated that Python does well.  However, database do not do this kind of iterative querying well.  It would take a stored procedure to perform this via a single database query.

Talking through this has encouraged me decide to take another look at using git as the backing store instead of a relational database.

Episode 103 - The Seven Properties of Highly Secure Devices

Posted by Open Source Security Podcast on July 02, 2018 02:13 AM
Josh and Kurt talk about a Microsoft Research paper titled "The Seven Properties of Highly Secure Devices". We take a real world view into how to secure our devices. What works, what doesn't work, and why this list is actually really good.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6764831/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Requirements for an OpenStack Access Control Policy Management Tool

Posted by Adam Young on June 28, 2018 06:57 PM

“We need a read only role.”

It seems like such a simple requirement.  Users have been requesting a read-only role for several years now.  Why is it so tough to implement?   Because it calls for  modifying access control policy across multiple, disjoint services deployed at innumerable distinct locations.

“We need help in modifying policy to implement our own read only role.”

This one is a little bit more attainable.  We should be able to provide better tools to help people customize their policy.  What should that look like?

We gathered some information at the last summit, and I am going to try and distill it to a requirements document here.


  • Verb and Path:  the combination of the HTTP verb and the templated sub path that is used by the mapping engines.  If I were to use Curl to call https://hostname:5000/v3/users/a0123ab6, the verb would be the implicit GET, and the path would be /v3/users/{user_id}.
  • policy key:  the key in the policy.json and policy.yaml file that is used to match the python code to the policy.  For example, the Keystone GET /v3/user/{user_id} verb and path tests against the policy key identity:get_user.
  • API Policy Mapping:  the mapping from Verb and Path to Policy key.

The tool needs to be run from the installer. While that means Tripleo for my team, it should be a tool that can be enlisted into any of the installers.  It should also be able to run for day 2 operations from numerous tools.

It should not be deployed as a standard service, at least not one tied in with the active OpenStack install, as modifying policy is a tricky and potentially destructive and dangerous operation.


Policy files need to be gathered from the various services, but this tool does not need to do that; the variations in how to generate, collect, and distribute policy files are too numerous to solve in a single, focused tool.  The collection and distribution fits more into Ansible playbooks than a tool for modifying policy.

External API definitions

End users need to be able to test their policy.  While the existing oslo-policy command line can tell whether a token would or would not pass the checks, those are done at the policy key level.  All integration is done at the URL level, even if it then passes through libraries or the CLI.  The Verb and URL  can be retrieved from network tools or debug mode of the CLI, and matched against the tuple of (service,verb,template path) to link back to the policy key, and the thus the policy rule that oslo-policy will enforce.  Deducing this mapping must be easy.  With this mapping, additional tools can mock a request/response to test whether a given set of auth-data would pass or fail a request.  Thus, the tool should accept a simple format for uploading the mappings of Verb and Path to policy key.


Policy files have several implementations.  The old Policy.json structure provides the least amount of information. Here is a sample:

"context_is_admin": "role:admin",
"default": "role:admin",

"add_image": "",
"delete_image": "",
"get_image": "",
"get_images": "",
"modify_image": "",
"publicize_image": "role:admin",
"copy_from": "",


The policy in code structure provides the most, including the HTTP Verbs and templated Paths that map to the rules that are the keys in the policy files. The Python code that is used by oslo-policy to generate the sample YAML files uses, but does not expose, all that data.  Here is an example:

# This policy only checks if the user has access to the requested
# project limits. And this check is performed only after the check
# os_compute_api:limits passes
# GET /limits
# "os_compute_api:os-used-limits": "rule:admin_api"

A secondary tool should expose all this data as YAML , probably a modification of the oslo-policy CLI.  The management tool should be able to consume this format.  It should also be able to consume a document that maps the policy keys  to the Verb and Path separate from the policy


A new version of an OpenStack service will likely have new APIs.  These APIs will not be covered by existing policy.  However, if a site has made major efforts into customizing policy in the past, they will not want to lose and redo all of their changes.  Thus, it should be possible to upload a new file indicating the over all or just changes to the API mapping from a previous version.  If an updated policy-in-code format is available, that file should merge in with the existing policy modifications.  The user needs to be able to identify

  • Any new APIs that require application of the transformations listed below
  • Any changes to base policy that the user has customized and now conflict with the assumptions.  The tool user should be able to accept the old version, the new version, or come up with a modified new, manually merged version.


End users need to be able to describe the transformations that then need to perform in simple terms.  Here are some that have been identified so far:

  • ensure that all APIs match against some role
  • ensure that APIs that require an role (especially admin) also perform a scope check
  • switch the role used for a given operation or set of operations
  • standardize the meaning of interim rules such as “owner.”
  • Inline an interim rule into the rules that use it
  • Extract an interim rule from all the rules that have a common fragment

Implied Roles

The Implied Roles mechanism provides support for policy,  The tool should be able to help the tool users to take advantage of implied roles.

  • Make use of implied roles to simplify complex matching rules
  • Make use of implied roles to provide additional granularity for an API:
  • Make it possible to expand implied rules in the policy file based on a data model

Change sets

The operations to transform the rules are complex enough that users will need to be able to role them forward and back, much like a set of changes to a git repository.

User Interface

While the tool should be visible, the majority of the business logic should reside in an API that is callable from other systems.  This seems to imply a pattern of REST API + A visible UI toolkit.

The User Interface should make working with large sets of rules possible and convenient.  Appropriate information hiding and selection should be coupled with the transformations to select the set of rules to be transformed.


The data store for the application should be light enough to run during the install process. For example, SQLite would be preferred over MySQL.


The tool should be able to produce the individual policy files consumed by the APIs.

It is possible to have a deployment where different policy is in place for different endpoints of the same service.  The tools should support endpoint specific overrides.  However, the main assumption is that these will be small changes from the core service definitions.  As such, they should be treated as “service X plus these changes” as opposed to a completely separate set of policy rules.


Episode 102 - Michael Feiertag from tCell

Posted by Open Source Security Podcast on June 25, 2018 12:35 AM
Josh and Kurt talk to Michael Feiertag, the CEO of tCell. We talk about what a Web Application Firewall is, what it does and doesn't do, and what the future of this technology looks like. We touch on how this affects a DevOps environment. Security has to fit into the existing model, not try to change it.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6735685/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Episode 101 - Our unregulated future is here to stay

Posted by Open Source Security Podcast on June 17, 2018 11:56 PM
Josh and Kurt talk about Bird scooters. The implications of the scooters on the city, segways, bicycles. The topic of how these vehicles interact with pedestrians on the road and trails. It's an example of humans not wanting to follow the rules and generally making the situation annoying for everyone. It's the old security story of new technology without clear rules. The show ends with some horrifying numbers behind how bad things get before  people really care.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6710747/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Episode 100 - You're bad at buying security, we can help!

Posted by Open Source Security Podcast on June 11, 2018 01:30 AM
Josh and Kurt talk about how to be a smart security buyer. We have guest Steve Mayzak walk us through how a the buying process works as well as giving out a ton of great advice. Even if you're experienced with how to buy security technology you should give this a listen.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6673481/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Cool SELinux hack provide by systemd

Posted by Dan Walsh on June 08, 2018 02:17 PM

Sometimes content is created in /run during boot that ends up mislabeled.  We sometimes here, every time I boot, this file gets created with the wrong label.   

This can happen if initramfs is creating content before systemd has loaded policy.  This means the content would get created with var_run_t as the label.

Well I was looking at tmpfs.d and it has a cool feature.

man tmpfs.d



           Recursively set the access mode, group and user, and restore the SELinux security context of a file or directory

           if it exists, as well as of its subdirectories and the files contained therein (if applicable). Lines of this type

           accept shell-style globs in place of normal path names. Does not follow symlinks.

One hack you could try, would be to add /run to the tmpfiles.d directory and systemd will relabel all of the content in /run when the system reboots.

echo "Z /run — — — — —" > /etc/tmpfiles.d/relabelrun.conf

Of course if the content gets created after the tmpfs runs with the wrong label, you are out of luck, or enabled the old service restorecond...

Episode 99 - Consumer security is too broken to fix, and it doesn't matter

Posted by Open Source Security Podcast on June 04, 2018 01:53 AM
Josh and Kurt talk about a number of consumer security issues. The FBI told everyone to reboot their routers which they won't do. The .app top level domain is a cesspool of malware. Everyone has a cell phone and won't update them properly. None of this probably matters though. Unless there are real measurable tragedies caused by this tech, people tend not to really care.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6658585/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Command line VPN connection

Posted by Adam Young on May 31, 2018 04:43 PM

I need to connect to my office via VPN. Fedora has a point and click interface, but I am trying to minimize mouse usage. So, instead I have a small bash function that does it for me.

I has an OTP that I need to enter in, so I have nmcli prompt me.

$ cat `which vpn_up `
nmcli --ask c up "Raleigh (RDU2)"

Passwordless access to System libvirt on Fedora 28

Posted by Adam Young on May 31, 2018 03:42 PM

I can connect to the system libvirtd on my system without password. I set this up some time ago, and forgot how, so figured I would document it.

TO check that I can connect via virsh to the libvirst unix domain socket without a password.

$ virsh -c qemu:///system list --all
Id Name State
- cfme-tng shut off
- generic shut off
- pagure_pagure shut off

How?  File permissions.  The socket file can be found using a command like:

$ strace virsh -c qemu:///system list --all 2>&1 | grep connect
connect(5, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(5, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
connect(6, {sa_family=AF_UNIX, sun_path="/var/lib/sss/pipes/nss"}, 110) = 0
connect(7, {sa_family=AF_UNIX, sun_path="/var/run/libvirt/libvirt-sock"}, 110) = 0

It is the last line that we care about.

[ayoung@ayoung541 rippowam]$ ls -la /var/run/libvirt/libvirt-sock
srwxrwx---. 1 root libvirtd 0 May 31 09:30 /var/run/libvirt/libvirt-sock

My user account is a member of the libvirtd group.

[ayoung@ayoung541 rippowam]$ groups
ayoung wheel kvm qemu dockerroot libvirt devel openstack gss-eng-collab idm-dev-lab libvirtd docker

SELinux team works to remove DAC_OVERRIDE Permissions.

Posted by Dan Walsh on May 29, 2018 01:00 PM

DAC_OVERRIDE is one of the most powerful capabilities, and most app developers don't understand when they are taking advantage of it, or how easy it is to eliminate the need.


man capabilities



              Bypass file read, write, and execute permission checks.  (DAC is an abbreviation of "discretionary access control".)

Looking at /usr/include/linux/capability.h

#define CAP_DAC_OVERRIDE     1

/* Overrides all DAC restrictions regarding read and search on files and directories, including ACL restrictions if [_POSIX_ACL] is defined. Excluding DAC access covered by CAP_LINUX_IMMUTABLE. */

Giving a process this access means it can ignore file system permission checks. Admittedly everyone thinks root can do this by default anyways, but if you can eliminate this access from a system service, you really can tighten the security.  


SELinux ignores DAC permissions, it does not care if a a processes is running as root or any other UID.  The only part of SELinux that concerns itself with UID/GID permissions is in linux capabilities like DAC_OVERRIDE.

With SELinux we often look at what process types require DAC_OVERRIDE and try to figure out if we can rid of the access.  

Usually services that need DAC_OVERRIDE run as a root process for a short time before becoming non root.  Since they are going to run as non root they set up permissions on directories or unix domain sockets to be accessible by the UID/GID pair the service is going to run with.   Often they accidentally or intentionally remove `root UID` access, thinking this will give them better security.  IE if I only want the UID of my process to access an object, I set the object permissions such that only its UID can access it.

Lets look at an example, I create a directory named myapp, and set the ownership and group to my UID/GID 3267 (My UID), now I also set the permissions to 770.

ls -ld /var/lib/myapp
drwxrwx---. 2 dwalsh dwalsh 6 May 28 06:55 /var/lib/myapp

 Now processes running as root are NOT allowed to create any content in this directory, or to execute any content in this directory without using DAC_OVERRIDE.  (Note: It might be able to see and traverse the directory using DAC_READ_SEARCH, but that is a story for another blog).

The simplest way to allow the root process to get full access this directory would be to change the group ownership to root.

chown 3267:0 /var/lib/myapp
# ls -ld /var/lib/myapp
drwxrwx---. 2 dwalsh root 6 May 28 06:55 /var/lib/myapp

The root process gets full access to the directory using its group permissions and processes running as 'dwalsh'  get full access running as that UID using its owner permissions.  

While this does not seem that significant from a DAC point of view, after all root processes has full access to all objects owned by UID=0, in an SELinux world you would be running as myapp_t, and might only have access to the file system objects labeled my myapp_t, if we can drop the DAC_OVERRIDE permissions from SELinux we can really tighten up the security.

Lets look at a real world example via the following bugzilla. 

<figure class="aentry-post__figure aentry-post__figure--media"> </figure>

When dovecot sets up a socket for mail clients to talk to, it sets up the permssions on the socket to be:

# ls -l /var/run/dovecot/login/ipc-proxy
srw-------. 1 dovenull root 0 May 27 12:34 /var/run/dovecot/login/ipc-proxy

This permission means that only process running as the 'dovenull' user can communicate with the socket.  At somepoint when dovecot, running as dovecot_t, is coming up, the 'root' process attempts to access the ipc-proxy socket and is denied by SELinux.  

type=AVC msg=audit(1526480141.321:6579): avc:  denied  { dac_override } for  pid=19839 comm="dovecot" capability=1  scontext=system_u:system_r:dovecot_t:s0 tcontext=system_u:system_r:dovecot_t:s0 tclass=capability permissive=0

The simple thing to do from an SELinux point of view would be to add the allow rule

allow dovecot_t self:capability dac_override;

But from a security proint of view, this is lousy.  The much better solution would be to 'relax' the permissions on the socket by adding group read/write.

# ls -l /var/run/dovecot/login/ipc-proxy
srwrw-----. 1 dovenull root 0 May 27 12:34 /var/run/dovecot/login/ipc-proxy

Now root processes are allowed to access the socket via DAC permissions and no longer need to use linux capabilities to access the socket.  This would be a far more secure way of running Dovecot and really involved a minor change to the code.

When I look at containers, we allow DAC_OVERRIDE by default, because so many containers are badly written, but I think it would be great for us to be able to remove this permission by default.

podman run -d --cap-drop DAC_OVERRIDE myimage  

Or for those of you still using Docker

docker run -d --cap-drop DAC_OVERRIDE myimage  

I will talk more about this in a future blog.

Bottom Line:

In most cases the requirement for DAC_OVERRIDE is a simple programmer error in the way he sets up his application and can be fixed by adjusting the permissions/ownership on file system objects.  Loosening the SELinux constraints should be the last resort.

Customizing container types

Posted by Dan Walsh on May 29, 2018 12:34 PM

In my previous blog, I talked about about container types container_t and svirt_lxc_net_t. Today I get an email, asking about the new container_t type replacing svirt_lxc_net_t.

On 05/23/2018 11:50 PM, Dustin C. Hatch wrote:
I recently upgraded some of my Docker hosts to CentOS 7.5 and started getting "Permission Denied" errors inside of containers. I traced this down to any container that mounts and uses /etc/passwd from the host (so that UIDs inside the container map to the same username as on the host), because the SELinux policy in CentOS 7.5 does not allow the new container_t domain to read passwd_file_t.  
The old svirt_lxc_net_t domain had the nsswitch_domain attribute, while its replacement, container_t, does not. I cannot find any reference for this change, so I was wondering if it was deliberate or not. If it was deliberate, what would be the consequences if I were to make a local policy change to add that attribute back? If it was not deliberate, I would be happy to open a ticket in Bugzilla. 

First let's remove the misconception, container_t was not a new type replacing svirt_lxc_net_t, it was a rename (typealias) of the old type.  

But the more important question was, why did I remove the access `nsswitch_domain` access.  This access allowed containers to read all sorts of user information is a container escaped.  The reason it was there originally was to allow virt-sandbox to care up a host into several containers, as opposed to the way 'Docker' ran containers each as separate unigue userspaces.

When Docker experienced a CVE a couple of  years ago where a container process was able to escape to the host, a security analyst was surprised on what a container process was allowed to read by default by SELinux.  And of course reading /etc/passwd seemed to like something we should prevent.  I agreed, and decided to tighten policy by removing this ability.  I still think it is the right decision.

The emailer goes on to ask if he were to add the attibute back would that be an issue, I say no, if you use case is to allow containers to read user data out of /etc/passwd then you can/should modify the policy to allow it.  Lets look at how.

Create a TE file that looks like the following:

# cat mycontainer.te
policy_module(mycontainer, 1.0)
type container_t;
# make -f /usr/share/selinux/devel/Makefile
# semodule -i mycontainer.pp

 Now container processes on systems with this policy module will be able to interact with the host in all the different ways that you can use nsswitch, which means you can not only read /etc/passwd, but also communicate with sssd, and remote IPA and authentication databases.  If you want to write a tighter policy that simply allows  your containers to read /etc/passwd, you could write a module like:

# cat mycontainer.te
policy_module(mycontainer, 1.0)
type container_t;
# make -f /usr/share/selinux/devel/Makefile
# semodule -i mycontainer.pp

Obviously I would recommend the second one.

Bottom Line

I am trying to balance container security with the ability to run most container workloads.  When you have a use case where these conflict, SELinux has the flexibility to allow you to customize the policy.

Episode 98 - When IT decisions kill people

Posted by Open Source Security Podcast on May 28, 2018 12:30 AM
Josh and Kurt talk about the NTSB report from the fatal Uber crash and what happened with Amazon's Alexa recording then emailing a private conversation. IT decisions now have real world consequences like never before.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6638938/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Tracking Quota

Posted by Adam Young on May 26, 2018 05:46 AM

This OpenStack summit marks the third that I have attended where we’ve discussed the algorithms to try and record quota in Keystone but not update it on each resource allocation and free.

We were stumped, again. The process we had planned on using was game-able and thus broken. I was kinda bummed.

Fortunately, I had a long car ride from Vancouver to Seattle and talked it over with Morgan Fainberg.

We also discussed the Pig War. Great piece of history from the region.

By the time we got to the airport the next day, I think we had it solved. Morgan came to the solution first, and I followed, slowly.  Here’s what we think will work.

First, lets get a 3 Level deep project setup to use for out discussion.

The rule is simple:  even if a quota is subdivided, you still need to check the overall quota of the project  and all the parent projects.

In the example structure above,  lets assume that project A gets a quota of 100 units of some resource: VMs, GB memory, network ports, hot-dogs, very small rocks, whatever.  I’ll use VMs in this example.

There are a couple ways this could be further managed.  The simplest is that, any resource allocated anywhere else in this tree is counted against this quota.  There are 9 total projects in the tree.  If each allocate 11 VMs, there will be 99 created and counted against the quota.  The next VM created uses up the quota.  The request after that will fail due to lack of available quota.

Lets say, however, that the users in project C33 are greedy, and allocate all 100 VMs.  The people in C11 Are filled with righteous indignation.  They need VMs too.

The Admins wipe everything out and we start all over.  They set up a system to fragment the quota by allowing A project to split its quota assignment up and allocate some of it to subordinate projects.

Project A says “I’m going to keep 50 VMs for myself, and allocate 25 to B1 and B2.”

Project B1 Says I am going to keep 10 for Me and I’m going to allocate 5 to each C11, C12, C13.  And the B1 Tree is happy.

B2 is a manipulative schemer and decides to play around.  B2 Allocates his entire quota of 25 to C21.  C21 Creates 25 VMs.

B2 now withdraws his quota from C21.  There is no communication with Nova.  The VMs keep running.  He then allocates his entire quota of 25 VMs to C22, and C22 creates 25 VMs.

Nova says “What project is this?  C22?   What is its quota?  25?  All good.”

But in reality, B2 has doubled his quota.  His subordinates have allocated 50 VMs total.  He does this again with project C33, gets up to 75 VMs, and contemplates creating yet another project C34 just to keep up the pattern.  This would allocate more VMs than project A was originally allocated.

The admins notice this and get mad, wipe everything out, and start over again.  This time they’ve made a change.  Whenever the check quota on a project, they also will go and check quota on the parent project, counting all VMs underneath that parent.  Essentially, they will record that a VM created in project C11 is also reducing the original quota on B1  and on A.  In essence, they record a table.  If the user creates a VM in Project C11, the following will be recorded and check for quota.

VM Project
VM1 B1
VM1 C11


When a User then creates a VM in C21 the table will extend to this:

VM Project
VM1 B1
VM1 C11
VM2 B2
VM2 C21

In addition, when creating the VM2, Nova will check quota and see that, after creation:

  • C21 now has 1 out of 25 allocated
  • B2 now has 1 out of 25 allocated
  • A now has 2 out of 100 allocated

(quota is allocated prior to the creation of the resource to prevent a race condition)

Note that the quota is checked against the original amount, and not the amount reduced by sub allocating the quota.  If project C21 allocates 24 more VMs, to quota check will show:

  • C21 now has 25 out of 25 allocated
  • B2 now has 25 out of 25 allocated
  • A now has 26 out of 100 allocated

If B2 tried to play games, and removes the quota from C21 and gives it to C22, project C21 will be over quota, but Nova will have no way to trap this.  However, the only people this affects is other people within projects B2, C21, C22, and C23.  If C22 attempts to allocate a virtual machine, the quota check will show that B2 has allocated its full quota and cannot create any more.  The quota check will fail.

You might have noticed that the higher level projects can rob quota from the child projects in this scheme.  For example.  If Project A allocates 74 more VMs now, project B1 and children will still have allocated quota, but their quota check will fail because A is at full.  This could be mitigated by having 2 checks for project A: total quota (max 100), and directly allocated quota (max 50).

This scheme removes the quota violation by gaming the system.  I promised to write it up so we could continue to try and poke holes in it.

EDIT:  Quota would also have to be allocated per endpoint, or the endpoints will have to communicate with each other to evaluate usage.


container_t versus svirt_lxc_net_t

Posted by Dan Walsh on May 22, 2018 06:12 PM

For some reason recently I have been asked via email and twitter about what the difference is between the container_t type and the svirt_lxc_net_t type. Or similarly between container_file_t and svirt_sandbox_file_t.  Bottom line, NOTHING.  They are aliases of each other.

In SELinux policy language they have a typealias  command.

typealias container_t alias svirt_lxc_net_t;

typealias container_file_t alias svirt_sandbox_file_t;

When I first started working on containers and SELinux prior to Docker, we were writing a tool called virt-sandbox that used libvirt to launch containers, specifically it used libvirt-lxc.  We had labeled all of the VMs launched by libvirt, svirt_t.  This stood for secure virt.  When I decided to write policy for the libvirt_lxc containers, I created a type called svirt_lxc_t.  This type was not allowed to do network access, so I added another type called svirt_lxc_net_t that had full network access.  The type for content that he svirt_lxc types could manage as svirt_sandbox_file_t.  (svirt_file_t was already used for virtual machine images.)  Why I did not call it svirt_lxc_file_t, I don't know. 

When Docker exploded on the scene we quickly adapted the SELinux policy we had written for virt-sandbot to work with Docker and that is how these types got associated with containers.  After a few years of using this and constantly telling people that svirt_lxc_net_t was the type of a container process, I figured it was time to change the type name.  I created container_t and container_file_t and then just aliased the old names to the new.  

One problem was that RHEL policy updated much slower and we were not able to get these changes in untile RHEL7.5 (Or maybe RHEL7.4, I don't remember).   But for now on we will use the new type names.  

Google Memory

One issue you have with technology is the old howto's and information out on the internet never goes away.  If someone googles how to label a volume so that it works with a container and SELinux, they are likely to get told to label the content svirt_sandbox_file_t.

This is not an issue with the type aliases.  If you have scripts or customer policy modules that use the old names you are fine. Since the old names will still work. 

But I would prefer that everyone just use the newer easily understandable types.

Episode 97 - Automation: Humans are slow and dumb

Posted by Open Source Security Podcast on May 20, 2018 11:18 PM
Josh and Kurt talk about the security of automation as well as automating security. The only way automation will really work long term is full automation. Humans can't be trusted enough to rely on them to do things right.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6584004/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Minicom to a Juniper SRX-220

Posted by Adam Young on May 14, 2018 08:28 PM

Cluster computing requires a cluster of computers. For the past several years, I have been attempting to get work down without having a home cluster. This is no longer tenable, and I need to build my own.

One of the requirements for a home cluster is a progammable network device. I’ve purchased a second hand Junper SRC22.

Juniper SRX 220

Juniper SRX 220

Here are my configuration notes.

To start, I needed a Get console cable. RJ456 on one end, serial port on the other…excpet I don’t have any more serial ports on my workstations, so an addtional USB to Serial port cable to connect to that.

Console Cables

Yeah, just get a USB to RJ45 Console cable. But don’t expect to be able to buy one at your local office supply store.

Configure Minicom

<preclass>sudo minicom -s

pick A


poweron SRX220

Can’t login

Reset root password


push and hold recessed “config” button until status turns Amber

hit spacebar.
hit spacebar
boot -s
configure shared
set system root-authentication plain-text-password

Got this error

root# commit
 [edit interfaces]
 HA management port cannot be configured
 error: configuration check-out failed

Delete Old Interfaces

SRX cluster

<iframe class="wp-embedded-content" data-secret="Qhxsri0zJF" frameborder="0" height="329" marginheight="0" marginwidth="0" sandbox="allow-scripts" scrolling="no" security="restricted" src="http://rtodto.net/srx-cluster/embed/#?secret=Qhxsri0zJF" title="“SRX cluster” — Tech Notes / RtoDto.net" width="584"></iframe>

delete interfaces ge0/0/6
 delete interfaces ge0/0/7

 commit complete

Share Certs Data into a container.

Posted by Dan Walsh on May 14, 2018 07:04 AM

Last week, on the Fedora Users list someone was asking a question about getting SElinux to work with a container.  The mailer said that he was sharing certs into the container but SELinux as blocking access.

Here are the AVC's that were reported. 

Fri May 11 03:35:19 2018 type=AVC msg=audit(1526024119.640:1052): avc:  denied  { write } for   pid=13291 comm="touch" name="php-fpm.access" dev="dm-2" ino=20186094 scontext=system_u:system_r:container_t:s0:c581,c880 tcontext=system_u:object_r:user_home_t:s0 tclass=file permissive=0 

Looks like there is a container (container_t) that is attempting to write some content in you homedir (user_home_t). 

I surmised that the mailer must have been volume mounting a directory from his homedir into the container.

I responded to him with:

Private to container:

If these certs are only going to be used within one container you should add a :Z to the volume mount. 

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

Or if you are still using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:Z fedora-app

This causes  the container runtime to relabel the volume with a SELinux label private to the container.

Shared with other Containers

If you want the container-Cert-Dir to be shared between multiple containers, and it can be shared read/only I would add the :z,ro flags

podman run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

Using Docker.

docker run -d -v ~/container-Cert-Dir:/PATHINCONTAINER:z,ro fedora-app

This causes  the container runtime(s) to relabel the volume with an SELinux label that can be shared between containers.  Of course if the containers need to write to the directory, you would remove the ,ro.

Shared with other confined domains on host

If you want to share the Cert Directory with other confined apps on  your system, then you will need to disable SELinux separation in the  container.

podman run -d --security-opt label:disable -v ~/container-Cert-Dir:/PATHINCONTAINER fedora-app

Using Docker.

docker run -d --security-opt label:disable -v ~/container-Cert-Dir:/PATHINCONTAINER fedora-app

This causes  the container runtime(s) to launch the container with a unconfined type, allowing the container to read/write the volume without relabeling.

Episode 96 - Are legal backdoors a good idea?

Posted by Open Source Security Podcast on May 14, 2018 06:43 AM
Josh and Kurt talk about backdoors in code and products that have been put there on purpose. We talk about unlocking phones. Encryption backdoors with a focus on why they won't work.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6583352/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Episode 95 - Twitter passwords and npm backdoors

Posted by Open Source Security Podcast on May 07, 2018 12:40 AM
Josh and Kurt talk about Twitter doing the right thing when they logged a lot of passwords, the npm malicious getcookies package, and how backdoors work in code.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6562568/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

SELinux and Containers

Posted by Dan Walsh on May 01, 2018 01:00 PM

Next week at the Red Hat summit, I have a short session to talk about SELinux and Containers.  I am constantly reminded in bugzilla about how great the combination is.  

It truly is like Peanut Butter and Jelly.  

Sadly, people are still surprised when it blocks access.  For example I got a bugzilla recently that talked about containers not working on Fedora.  The avc was

type=AVC msg=audit(1524873307.948:1814): avc:  denied  { connectto } for  pid=28746 comm="boinc" path=002F746D702F2E5831312D756E69782F5831 scontext=system_u:system_r:container_t:s0:c420,c759 tcontext=unconfined_u:unconfined_r:xserver_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0

This AVC shows SELinux blocking the container process from connecting to the Xserver. We definitely do not want to allow containers to connect to the Xserver.  SELinux is doing precisely what it is designed to do.

Allowing a process to connect to the XServer would allow it to screen scrape all of you data on the desktop, it would also allow it to fool humans into typing passwords.  It would also allow it to grab all data in the cut and paste buffer. Especially things like passwords.

I can imagine that this works fine on other platforms with SELinux disabled. 

I often say SELinux is the best tool for protecting the file system from container break out, and in this case it prevents a container that has broken out from communicating with other services on the system via unix domain sockets.  SELinux examines both ends of communications on fifos and domain sockets and will prevent it from fooling privileged services.  No other security prevents this communication.

But what if I want to allow this communication?

If you want to run trusted applications to connect to the desktop then you need to disable SELinux.

The way you do this with podman is

podman run --security-opt label=disable ...

Or with Docker

docker run --security-opt label=disable ...

Episode 94 - DNSSEC, BGP, and reality

Posted by Open Source Security Podcast on April 30, 2018 12:40 AM
Josh and Kurt talk about the Amazon Route 53 incident and what it really means for the modern infrastructure. Complaining nobody is using DNSSEC or securing BGP aren't the right conversations to be having. Reality must be considered in any honest conversation about these topics.

<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6535092/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

AD directory admins group setup

Posted by William Brown on April 25, 2018 02:00 PM

AD directory admins group setup

Recently I have been reading many of the Microsoft Active Directory best practices for security and hardening. These are great resources, and very well written. The major theme of the articles is “least privilege”, where accounts like Administrators or Domain Admins are over used and lead to further compromise.

A suggestion that is put forward by the author is to have a group that has no other permissions but to manage the directory service. This should be used to temporarily make a user an admin, then after a period of time they should be removed from the group.

This way you have no Administrators or Domain Admins, but you have an AD only group that can temporarily grant these permissions when required.

I want to explore how to create this and configure the correct access controls to enable this scheme.

Create our group

First, lets create a “Directory Admins” group which will contain our members that have the rights to modify or grant other privileges.

# /usr/local/samba/bin/samba-tool group add 'Directory Admins'
Added group Directory Admins

It’s a really good idea to add this to the “Denied RODC Password Replication Group” to limit the risk of these accounts being compromised during an attack. Additionally, you probably want to make your “admin storage” group also a member of this, but I’ll leave that to you.

# /usr/local/samba/bin/samba-tool group addmembers "Denied RODC Password Replication Group" "Directory Admins"

Now that we have this, lets add a member to it. I strongly advise you create special accounts just for the purpose of directory administration - don’t use your daily account for this!

# /usr/local/samba/bin/samba-tool user create da_william
User 'da_william' created successfully
# /usr/local/samba/bin/samba-tool group addmembers 'Directory Admins' da_william
Added members to group Directory Admins

Configure the permissions

Now we need to configure the correct dsacls to allow Directory Admins full control over directory objects. It could be possible to constrain this to only modification of the cn=builtin and cn=users container however, as directory admins might not need so much control for things like dns modification.

If you want to constrain these permissions, only apply the following to cn=builtins instead - or even just the target groups like Domain Admins.

First we need the objectSID of our Directory Admins group so we can build the ACE.

# /usr/local/samba/bin/samba-tool group show 'directory admins' --attributes=cn,objectsid
dn: CN=Directory Admins,CN=Users,DC=adt,DC=blackhats,DC=net,DC=au
cn: Directory Admins
objectSid: S-1-5-21-2488910578-3334016764-1009705076-1104

Now with this we can construct the ACE.


This permission grants:

  • RP: read property
  • WP: write property
  • LC: list child objects
  • LO: list objects
  • RC: read control

It could be possible to expand these rights: it depends if you want directory admins to be able to do “day to day” ad control jobs, or if you just use them for granting of privileges. That’s up to you. An expanded ACE might be:

# Same as Enterprise Admins

Now lets actually apply this and do a test:

# /usr/local/samba/bin/samba-tool dsacl set --sddl='(A;CI;RPWPLCLORC;;;S-1-5-21-2488910578-3334016764-1009705076-1104)' --objectdn='dc=adt,dc=blackhats,dc=net,dc=au'
# /usr/local/samba/bin/samba-tool group addmembers 'directory admins' administrator -U 'da_william%...'
Added members to group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'
# /usr/local/samba/bin/samba-tool group removemembers 'directory admins' -U 'da_william%...'
Removed members from group directory admins
# /usr/local/samba/bin/samba-tool group listmembers 'directory admins' -U 'da_william%...'

It works!


With these steps we have created a secure account that has limited admin rights, able to temporarily promote users with privileges for administrative work - and able to remove it once the work is complete.

Episode 93 - Security flaws in beep and patch, how did we get here?

Posted by Open Source Security Podcast on April 23, 2018 12:36 AM
Josh and Kurt talk about security flaws in beep and patch. How on earth were there security flaws in beep and patch?
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6483689/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Understanding AD Access Control Entries

Posted by William Brown on April 19, 2018 02:00 PM

Understanding AD Access Control Entries

A few days ago I set out to work on making samba 4 my default LDAP server. In the process I was forced to learn about Active Directory Access controls. I found that while there was significant documentation around the syntax of these structures, very little existed explaining how to use them effectively.

What’s in an ACE?

If you look at the the ACL of an entry in AD you’ll see something like:


This seems very confusing and complex (and someone should write a tool to explain these … maybe me). But once you can see the structure it starts to make sense.

Most of the access controls you are viewing here are DACLs or Discrestionary Access Control Lists. These make up the majority of the output after ‘O:DAG:DAD:AI’. TODO: What does ‘O:DAG:DAD:AI’ mean completely?

After that there are many ACEs defined in SDDL or ???. The structure is as follows:


Each of these fields can take varies types. These interact to form the access control rules that allow or deny access. Thankfully, you don’t need to adjust many fields to make useful ACE entries.

MS maintains a document of these field values here.

They also maintain a list of wellknown SID values here

I want to cover some common values you may see though:


Most of the types you’ll see are “A” and “OA”. These mean the ACE allows an access by the SID.


These change the behaviour of the ACE. Common values you may want to set are CI and OI. These determine that the ACE should be inherited to child objects. As far as the MS docs say, these behave the same way.

If you see ID in this field it means the ACE has been inherited from a parent object. In this case the inherit_object_guid field will be set to the guid of the parent that set the ACE. This is great, as it allows you to backtrace the origin of access controls!


This is the important part of the ACE - it determines what access the SID has over this object. The MS docs are very comprehensive of what this does, but common values are:

  • RP: read property
  • WP: write property
  • CR: control rights
  • CC: child create (create new objects)
  • DC: delete child
  • LC: list child objects
  • LO: list objects
  • RC: read control
  • WO: write owner (change the owner of an object)
  • WD: write dac (allow writing ACE)
  • SW: self write
  • SD: standard delete
  • DT: delete tree

I’m not 100% sure of all the subtle behaviours of these, because they are not documented that well. If someone can help explain these to me, it would be great.


We will skip some fields and go straight to SID. This is the SID of the object that is allowed the rights from the rights field. This field can take a GUID of the object, or it can take a “well known” value of the SID. For example ‘AN’ means “anonymous users”, or ‘AU’ meaning authenticated users.


I won’t claim to be an AD ACE expert, but I did find the docs hard to interpret at first. Having a breakdown and explanation of the behaviour of the fields can help others, and I really want to hear from people who know more about this topic on me so that I can expand this resource to help others really understand how AD ACE’s work.

Making Samba 4 the default LDAP server

Posted by William Brown on April 17, 2018 02:00 PM

Making Samba 4 the default LDAP server

Earlier this year Andrew Bartlett set me the challenge: how could we make Samba 4 the default LDAP server in use for Linux and UNIX systems? I’ve finally decided to tackle this, and write up some simple changes we can make, and decide on some long term goals to make this a reality.

What makes a unix directory anyway?

Great question - this is such a broad topic, even I don’t know if I can single out what it means. For the purposes of this exercise I’ll treat it as “what would we need from my previous workplace”. My previous workplace had a dedicated set of 389 Directory Server machines that served lookups mainly for email routing, application authentication and more. The didn’t really process a great deal of login traffic as the majority of the workstations were Windows - thus connected to AD.

What it did show was that Linux clients and applications:

  • Want to use anonymous binds and searchs - Applications and clients are NOT domain members - they just want to do searches
  • The content of anonymous lookups should be “public safe” information. (IE nothing private)
  • LDAPS is a must for binds
  • MemberOf and group filtering is very important for access control
  • sshPublicKey and userCertificate;binary is important for 2fa/secure logins

This seems like a pretty simple list - but it’s not the model Samba 4 or AD ship with.

You’ll also want to harden a few default settings. These include:

  • Disable Guest
  • Disable 10 machine join policy

AD works under the assumption that all clients are authenticated via kerberos, and that kerberos is the primary authentication and trust provider. As a result, AD often ships with:

  • Disabled anonymous binds - All clients are domain members or service accounts
  • No anonymous content available to search
  • No LDAPS (GSSAPI is used instead)
  • no sshPublicKey or userCertificates (pkinit instead via krb)
  • Access control is much more complex topic than just “matching an ldap filter”.

As a result, it takes a bit of effort to change Samba 4 to work in a way that suits both, securely.

Isn’t anonymous binding insecure?

Let’s get this one out the way - no it’s not. In every pen test I have seen if you can get access to a domain joined machine, you probably have a good chance of taking over the domain in various ways. Domain joined systems and krb allows lateral movement and other issues that are beyond the scope of this document.

The lack of anonymous lookup is more about preventing information disclosure - security via obscurity. But it doesn’t take long to realise that this is trivially defeated (get one user account, guest account, domain member and you can search …).

As a result, in some cases it may be better to allow anonymous lookups because then you don’t have spurious service accounts, you have a clear understanding of what is and is not accessible as readable data, and you don’t need every machine on the network to be domain joined - you prevent a possible foothold of lateral movement.

So anonymous binding is just fine, as the unix world has shown for a long time. That’s why I have very few concerns about enabling it. Your safety is in the access controls for searches, not in blocking anonymous reads outright.

Installing your DC

As I run fedora, you will need to build and install samba for source so you can access the heimdal kerberos functions. Fedora’s samba 4 ships ADDC support now, but lacks some features like RODC that you may want. In the future I expect this will change though.

These documents will help guide you:


build steps

install a domain

I strongly advise you use options similar to:

/usr/local/samba/bin/samba-tool domain provision --server-role=dc --use-rfc2307 --dns-backend=SAMBA_INTERNAL --realm=SAMDOM.EXAMPLE.COM --domain=SAMDOM --adminpass=Passw0rd

Allow anonymous binds and searches

Now that you have a working domain controller, we should test you have working ldap:

/usr/local/samba/bin/samba-tool forest directory_service dsheuristics 0000002 -H ldaps://localhost --simple-bind-dn='administrator@samdom.example.com'
ldapsearch -b DC=samdom,DC=example,DC=com -H ldaps://localhost -x

You can see the domain object but nothing else. Many other blogs and sites recommend a blanket “anonymous read all” access control, but I think that’s too broad. A better approach is to add the anonymous read to only the few containers that require it.

/usr/local/samba/bin/samba-tool dsacl set --objectdn=DC=samdom,DC=example,DC=com --sddl='(A;;RPLCLORC;;;AN)' --simple-bind-dn="administrator@samdom.example.com" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Users,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="administrator@samdom.example.com" --password=Passw0rd
/usr/local/samba/bin/samba-tool dsacl set --objectdn=CN=Builtin,DC=samdom,DC=example,DC=com --sddl='(A;CI;RPLCLORC;;;AN)' --simple-bind-dn="administrator@samdom.example.com" --password=Passw0rd

In AD groups and users are found in cn=users, and some groups are in cn=builtin. So we allow read to the root domain object, then we set a read on cn=users and cn=builtin that inherits to it’s child objects. The attribute policies are derived elsewhere, so we can assume that things like kerberos data and password material are safe with these simple changes.

Configuring LDAPS

This is a reasonable simple exercise. Given a ca cert, key and cert we can place these in the correct locations samba expects. By default this is the private directory. In a custom install, that’s /usr/local/samba/private/tls/, but for distros I think it’s /var/lib/samba/private. Simply replace ca.pem, cert.pem and key.pem with your files and restart.

Adding schema

To allow adding schema to samba 4 you need to reconfigure the dsdb config on the schema master. To show the current schema master you can use:

/usr/local/samba/bin/samba-tool fsmo show -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au' --password=Password1

Look for the value:

SchemaMasterRole owner: CN=NTDS Settings,CN=LDAPKDC,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au

And note the CN=ldapkdc = that’s the hostname of the current schema master.

On the schema master we need to adjust the smb.conf. The change you need to make is:

    dsdb:schema update allowed = yes

Now restart the instance and we can update the schema. The following LDIF should work if you replace ${DOMAINDN} with your namingContext. You can apply it with ldapmodify

dn: CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
cn: sshPublicKey
name: sshPublicKey
lDAPDisplayName: sshPublicKey
description: MANDATORY: OpenSSH Public key
oMSyntax: 4
isSingleValued: FALSE
searchFlags: 8

dn: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
cn: ldapPublicKey
name: ldapPublicKey
description: MANDATORY: OpenSSH LPK objectclass
lDAPDisplayName: ldapPublicKey
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: sshPublicKey

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
sudo ldapmodify -f sshpubkey.ldif -D 'administrator@adt.blackhats.net.au' -w Password1 -H ldaps://localhost
adding new entry "CN=sshPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

adding new entry "CN=ldapPublicKey,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

modifying entry "CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au"

To my surprise, userCertificate already exists! The reason I missed it is a subtle ad schema behaviour I missed. The ldap attribute name is stored in the lDAPDisplayName and may not be the same as the CN of the schema element. As a result, you can find this with:

ldapsearch -H ldaps://localhost -b CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au -x -D 'administrator@adt.blackhats.net.au' -W '(attributeId='

This doesn’t solve my issues: Because I am a long time user of 389-ds, that means I need some ns compat attributes. Here I add the nsUniqueId value so that I can keep some compatability.

dn: CN=nsUniqueId,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: attributeSchema
attributeID: 2.16.840.1.113730.3.1.542
cn: nsUniqueId
name: nsUniqueId
lDAPDisplayName: nsUniqueId
description: MANDATORY: nsUniqueId compatability
oMSyntax: 4
isSingleValued: TRUE
searchFlags: 9

dn: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: add
objectClass: top
objectClass: classSchema
governsID: 2.16.840.1.113730.3.2.334
cn: nsOrgPerson
name: nsOrgPerson
description: MANDATORY: Netscape DS compat person
lDAPDisplayName: nsOrgPerson
subClassOf: top
objectClassCategory: 3
defaultObjectCategory: CN=nsOrgPerson,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
mayContain: nsUniqueId

dn: CN=User,CN=Schema,CN=Configuration,DC=adt,DC=blackhats,DC=net,DC=au
changetype: modify
replace: auxiliaryClass
auxiliaryClass: ldapPublicKey
auxiliaryClass: posixAccount
auxiliaryClass: shadowAccount
auxiliaryClass: nsOrgPerson

Now with this you can extend your users with the required data for SSH, certificates and maybe 389-ds compatability.

/usr/local/samba/bin/samba-tool user edit william  -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au'


Out of the box a number of the unix attributes are not indexed by Active Directory. To fix this you need to update the search flags in the schema.

Again, temporarily allow changes:

    dsdb:schema update allowed = yes

Now we need to add some indexes for common types. Note that in the nsUniqueId schema I already added the search flags. We also want to set that these values should be preserved if they become tombstones so we can recove them.

/usr/local/samba/bin/samba-tool schema attribute modify uid --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify nsUniqueId --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify uidnumber --searchflags=9
/usr/local/samba/bin/samba-tool schema attribute modify gidnumber --searchflags=9
# Preserve on tombstone but don't index
/usr/local/samba/bin/samba-tool schema attribute modify x509-cert --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify sshPublicKey --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify gecos --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify loginShell --searchflags=8
/usr/local/samba/bin/samba-tool schema attribute modify home-directory --searchflags=24

AD Hardening

We want to harden a few default settings that could be considered insecure. First, let’s stop “any user from being able to domain join machines”.

/usr/local/samba/bin/samba-tool domain settings account_machine_join_quota 0 -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au'

Now let’s disable the Guest account

/usr/local/samba/bin/samba-tool user disable Guest -H ldaps://localhost --simple-bind-dn='administrator@adt.blackhats.net.au'

I plan to write a more complete samba-tool extension for auditing these and more options, so stay tuned!

SSSD configuration

Now that our directory service is configured, we need to configure our clients to utilise it correctly.

Here is my SSSD configuration, that supports sshPublicKey distribution, userCertificate authentication on workstations and SID -> uid mapping. In the future I want to explore sudo rules in LDAP with AD, and maybe even HBAC rules rather than GPO.

Please refer to my other blog posts on configuration of the userCertificates and sshKey distribution.

ignore_group_members = False

# There is a bug in SSSD where this actually means "ipv6 only".
# lookup_family_order=ipv6_first
cache_credentials = True
id_provider = ldap
auth_provider = ldap
access_provider = ldap
chpass_provider = ldap
ldap_search_base = dc=blackhats,dc=net,dc=au

# This prevents an infinite referral loop.
ldap_referrals = False
ldap_id_mapping = True
ldap_schema = ad
# Rather that being in domain users group, create a user private group
# automatically on login.
# This is very important as a security setting on unix!!!
# See this bug if it doesn't work correctly.
# https://pagure.io/SSSD/sssd/issue/3723
auto_private_groups = true

ldap_uri = ldaps://ad.blackhats.net.au
ldap_tls_reqcert = demand
ldap_tls_cacert = /etc/pki/tls/certs/bh_ldap.crt

# Workstation access
ldap_access_filter = (memberOf=CN=Workstation Operators,CN=Users,DC=blackhats,DC=net,DC=au)

ldap_user_member_of = memberof
ldap_user_gecos = cn
ldap_user_uuid = objectGUID
ldap_group_uuid = objectGUID
# This is really important as it allows SSSD to respect nsAccountLock
ldap_account_expire_policy = ad
ldap_access_order = filter, expire
# Setup for ssh keys
ldap_user_ssh_public_key = sshPublicKey
# This does not require ;binary tag with AD.
ldap_user_certificate = userCertificate
# This is required for the homeDirectory to be looked up in the sssd schema
ldap_user_home_directory = homeDirectory

services = nss, pam, ssh, sudo
config_file_version = 2
certificate_verification = no_verification

domains = blackhats.net.au
homedir_substring = /home

pam_cert_auth = True







With these simple changes we can easily make samba 4 able to perform the roles of other unix focused LDAP servers. This allows stateless clients, secure ssh key authentication, certificate authentication and more.

Some future goals to improve this include:

  • CLI tools to manage sshPublicKeys and userCertificates easily
  • Ship samba 4 with schema templates that can be used
  • Schema querying (what objectclass takes this attribute?)
  • Group editing (same as samba-tool user edit)
  • Security auditing tools
  • user/group modification commands
  • Refactor and improve the cli tools python to be api driven - move the logic from netcmd into samdb so that samdb can be an API that python can consume easier. Prevent duplication of logic.

The goal is so that an admin never has to see an LDIF ever again.

Episode 92 - Chat with Rami Saas the CEO of WhiteSource

Posted by Open Source Security Podcast on April 15, 2018 11:00 PM
Josh and Kurt talk to Rami Saas, the CEO of WhiteSource about 3rd party open source security as well as open source licensing.
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6483430/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Comparing Keystone and Istio RBAC

Posted by Adam Young on April 09, 2018 05:55 PM

To continue with my previous investigation to Istio, and to continue the comparison with the comparable parts of OpenStack, I want to dig deeper into how Istio performs
RBAC. Specifically, I would love to answer the question: could Istio be used to perform the Role check?


Let me reiterate what I’ve said in the past about scope checking. Oslo-policy performs the scope check deep in the code base, long after Middleware, once the resource has been fetched from the Database. Since we can’t do this in Middleware, I think it is safe to say that we can’t do this in Istio either. SO that part of the check is outside the scope of this discussion.

Istio RBAC Introduction

Lets look at how Istio performs RBAC.

The first thing to compare is the data that is used to represent the requester. In Istio, this is the requestcontext. This is comparable to the Auth-Data that Keystone Middleware populates as a result of a successful token validation. How does Istio populate the the requestcontext? My current assumption is that it makes an Remote call to Mixer with the authenticated REMOTE_USER name.

What is telling is that, in Istio, you have

      user: source.user | ""
      groups: ""
         service: source.service | ""
         namespace: source.namespace | ""

Groups no roles. Kubernetes has RBAC, and Roles, but it is a late addition to the model. However…

Istio RBAC introduces ServiceRole and ServiceRoleBinding, both of which are defined as Kubernetes CustomResourceDefinition (CRD) objects.

ServiceRole defines a role for access to services in the mesh.
ServiceRoleBinding grants a role to subjects (e.g., a user, a group, a service)

This is interesting. Where-as Keystone requires a user to go to Keystone to get a token that is then associated with a a set of role assignments, Istio expands this assignment inside the service.

Keystone Aside: Query Auth Data without Tokens

This is actually not surprising. When looking into Keystone Middleware years ago, in the context of PKI tokens, I realized that we could do exactly the same thing; make a call to Keystone based on the identity, and look up all of the data associated with the token. This means that a user can go from a SAML provider right to the service without first getting a Keystone token.

What this means is that the Mixer can respond return the Roles assigned by Kubernetes as additional parameters in the “Properties” collection. However, with the ServiceRole, you would instead get the Service Role Binding list from Mixer and apply it in process.

We discussed Service Roles on multiple occasions in Keystone. I liked the idea, but wanted to make sure that we didn’t limit the assignments, or even the definitions, to just a service. I could see specific Endpoints varying in their roles even within the same service, and certainly have different Service Role Assignments. I’m not certain if Istio distinguishes between “services” and “different endpoints of the same service” yet…something I need to delve in to. However, assuming that it does distinguish, what Istio needs to be able to get request is “Give me the set of Role bindings for this specific endpoint.”

A history lesson in Endpoint Ids.

It was this last step that was a problem in Keystonemiddleware. An endpoint did not know its own ID, and the provisioning tools really did not like the workflow of

  1. create an endpoint for a service
  2. register endpoint with Keystone
  3. get back the endpoint ID
  4. add endpoint  ID to the config file
  5. restart the service

Even if we went with an URL based scheme, we would have had this problem.  An obvious (in hindsight) solution would be to pre-generate the Ids as a unique hash, and to pre-populate the configuration files as well as to post the IDs to Keystone.  These IDs could easily be tagged as a nickname, not even the canonical name of the service.

Istio Initialization

Istio does no have this problem, directly, as it knows the name of the service that it is protecting, and can use that to fetch the correct rules.  However, it does point to a chicken-egg problem that Istio has to solve; which is created first, the service itself, or the abstraction in Istio to cover it?  Since Kubernetes is going to orchestrate the Service deployment, it can make the sensible call;  Istio can cover the service and just reject calls until it is properly configured.

URL Matching Rules

If we look at the Policy enforcement in Nova, we can use the latest “Policy in Code” mechanisms to link from the URL pattern to the Policy rule key, and the key to the actual enforced policy.  For example, to delete a server we can look up the API

And see that it is


And from the Nova source code:

        SERVERS % 'delete',
        "Delete a server",
                'method': 'DELETE',
                'path': '/servers/{server_id}'

With SERVERS %  expanding via :  SERVERS = 'os_compute_api:servers:%s'  to  os_compute_api:servers:delete.

Digging into Openstack Policy

Then, assuming you can get you hand on the policy file specific to that Nova server you could look at the policy for that rule. Nova no longer includes that generated file in the etc directory. But in my local repo I have:
"os_compute_api:servers:delete": "rule:admin_or_owner"

And the rule:admin_or_owner expanding to "admin_or_owner": "is_admin:True or project_id:%(project_id)s" which does not do a role check at all. The policy.yaml or policy.json file is not guaranteed to exist, in which case you can either use the tool to generate it, or read the source code. From the above link we see the Rule is:


and then we need to look where that is defined.

Lets assume, for the moment, that a Nova deployment has overridden the main rule to implement a custom role called Custodian which has the ability to execute this API. Could Istio match that? It really depends on whether it can match the URL-Pattern: '/servers/{server_id}'.

In ServiceRole, the combination of “namespace”+”services”+”paths”+”methods” defines “how a service (services) is allowed to be accessed”.

So we can match down to the Path level. However, there seems to be no way to tokenize a Path. Thus, while you could set a rule that says a client can call DELETE on a specific instance, or DELETE on /services, or even DELETE on all URLS in the catalog (whether they support that API or not) you could not say that it could call delete on all services within a specific Namespace. If the URL were defined like this:

DELETE /services?service_id={someuuid}

Istio would be able to match the service ID in the set of keys.

In order for Istio to be able to effectively match, all it really would need would be to identify that an URL that ends /services/feed1234 Matches the pattern /services/{service_id} which is all that the URL pattern matching inside the Web servers do.

Istio matching

It looks like paths can have wildcards. Scroll down a bit to the quote:

In addition, we support prefix match and suffix match for all the fields in a rule. For example, you can define a “tester” role that has the following permissions in “default” namespace:

which has the further example:

    - services: ["bookstore.default.svc.cluster.local"]
       paths: ["*/reviews"]
       methods: ["GET"]

Deep URL matching

So, while this is a good start, there are many more complicated URLs in the OpenStack world which are tokenized in the middle: for example, the new API for System role assignments has both the Role ID and the User ID embedded. The Istio match would be limited to matching: PUT /v3/system/users/* which might be OK in this case. But there are cases where a PUT at one level means one role, much more powerful than a PUT deeper in the URL chain.

For example: The base role assignments API itself is much more complex. To assign a role on a domain uses an URL fragment comparable to that to edit the domain specific configuration file. Both would have to be matched with

       paths: ["/v3/domains/*"]
       methods: ["PUT"]

But assigning a role is a far safer operation than setting a domain specific config, which is really an administrative only operation.

However, I had to dig deeply to find this conflict. I suspect that there are ways around it, and comparable conflicts in the catalog.


So, the tentative answer to my question is:

Yes, Istio could perform the Role check part of RBAC for OpenStack.

But it would take some work. Of Course. An early step would be to write a Mixer plugin to fetch the auth-data from Keystone based on a user. This would require knowing about Federated mappings and how to expand them, plus query the Role assignments. Of, and get the list of Groups for a user. And the project ID needs to be communicated, somehow.

Episode 91 - Security lessons from a 7 year old

Posted by Open Source Security Podcast on April 08, 2018 10:51 PM
Josh and Kurt talk to a 7 year old about security. We cover Minecraft security, passwords, hacking, and many many other nuggets of wisdom.
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6432455/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Comparing Istio and Keystone Middleware

Posted by Adam Young on April 07, 2018 10:00 PM

One way to learn a new technology is to compare it to what you already know. I’ve heard a lot about Istio, and I don’t really grok it yet, so this post is my attempt to get the ideas solid in my own head, and to spur conversations out there.

I asked the great Google “Why is Istio important” and this was the most interesting response it gave me: “What is Istio and Its Importance to Container Security.” So I am going to start there. There are obviously many articles about Istio, and this might not even be the best starting point, but this is the internet: I’m sure Ill be told why something else is better!

Lets start with the definition:

Istio is an intelligent and robust web proxy for traffic within your Kubernetes cluster as well as incoming traffic to your cluster

At first blush, these seems to be nothing like Keystone. However, Lets take a look at the software definition of Proxy:

A proxy, in its most general form, is a class functioning as an interface to something else.

In the OpenStack code base, the package python-keystonemiddleware provides a Python class that complies with the WSGI contract that serves as a Proxy to the we application underneath. Keystone Middleware, then is an analogue to the Istio Proxy in that it performs some of the same functions.

Istio enables you to specify access control rules for web traffic between Kubernetes services

So…Keystone + Oslo-policy serves this role in OpenStack. The Kubernetes central control is a single web server, and thus it can implement Access control for all subcomponenets in a single process space. OpenStack is distributed, and thus the access control is also distributed. However, due to the way that OpenStack objects are stored, we cannot do the full RBAC enforcement in middleware (much as I would like to). IN order to check access to an existing resource object in OpenStack, you have to perform the policy enforcement check after the object has been fetched from the Database. That check needs to ensure that the project of the token matches the project of the resource. Since we don’t know this information based solely on the URL, we cannot perform it in Middleware.

What we can perform in Middleware, and what I presented on last year at the OpenStack Summit, is the ability to perform the Role check portion of RBAC in middleware, but defer the project check until later. While we are not going to be doing exactly that, we are pursuing a related effort for application credentials. However, that requires a remote call to a database to create those rules. Istio is not going to have that leeway. I think? Please correct me if I am wrong.

I don’t think Istio could perform this level of deep check, either. It requires parsing the URL and knowing the semantics of the segments, and having the ability to correlate them. That is a lot to ask.

Isito enables you to seamlessly enforce encryption and authentication between node

Keystone certainly does not do this. Nothing enforced TLS between services in OpenStack. Getting TLS everywhere in Tripleo was a huge effort, and it still needs to be explicitly enabled. OpenStack does not provide a CA. Tripleo, when deployed, depends on the Dogtag instance from the FreeIPA server to manage certificates.

By the time Keystone Middleware is executed, the TLS layer would be a distant memory.

Keystoneauth1 is the client piece from Keystone, and it could be responsible for making sure that only HTTPS is supported, but it does not do that today.

Istio collects traffic logs, and then parses and presents them for you:

Keystone does not do this, although it does produce some essential log entries about access.

At this point, I am wondering if Istio would be a viable complement to the security story in OpenStack. My understand thus far is that it would. It might conflict a minor bit with the RBAC enforcement, but I suspect that is no the key piece of what it is doing, and conflict there could be avoided.

Please post your comments, as I would really like to get to know this better, and we can share the discussion with the larger community.

Recursive DNS and FreeIPA

Posted by Adam Young on April 05, 2018 05:38 PM

DNS is essential to Kerberos. Kerberos Identity for servers is based around host names, and if you don’t have a common view between client and server, you will not be able to access your remote systems. Since DNS is an essential part of FreeIPA, BIND is one of the services integrated into the IPA server.

When a user wants to visit a public website, like this one, they click a link or type that URL into their browsers navigation bar. The browser then requests the IP address for the hostname inside the URL from the operating system via a library call. On a Linux based system, the operating system makes the DNS call to the server specified in /etc/resolv.conf. But what happens if the DNS server does not know the answer? It depends on how it is configured. In the simple case, where the server is not allowed to make additional calls, it returns a response that indicates the record is not found.

Since IPA is supposed to be the one-source-of-truth for a client system, it is common practice to register the IPA server as the sole DNS resolver. As such, it cannot just short-circuit the request. Instead, it performs a recursive search to the machines it has set up as Forwarders. For example, I often will set up a sample server that points to the google resolver at Or, now CloudFlare has DNS privacy enabled, I might use that.

This is fine inside controlled environments, but is sub-optimal if the DNS portion of the IPA server is accessible on the public internet. It turns out that forwarding requests allows a DNS server to be used to attack these DNS servers via a distributed denial of service attack. In this attack, the attackers sends the request to all DNS servers that are acting as forwarders, and these forwarders hammer on the central DNS servers.

If you have set up a FreeIPA server on the public internet, you should plan on disabling Recursive DNS queries. You do this by editing the file /etc/named.conf and setting the values:

allow-recursion {"none";};
recursion no;

And restarting the named service.

And then everything breaks. All of your IPA clients can no longer resolve anything except the entries you have in your IPA server.

The fix for that is to add the (former) DNS forward address as a nameserver entry in /etc/resolv.conf on each machine, including your IPA server. Yes, it is a pain, but it limits the query capacity to only requests local to those machines. For example, if my IPA server is on (yes I know this is not routable, just for example) my resolve.conf would look like.

search younglogic.com

If you wonder if your Nameserver has this problem, use this site to test it.

Episode 90 - Humans and misinformation

Posted by Open Source Security Podcast on April 02, 2018 01:24 AM
Josh and Kurt talk about all the current misinformation, how humans react to it, and what it means for security.
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6425720/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Home made Matzo

Posted by Adam Young on April 01, 2018 05:01 PM

Plate of Matzo
Sufficient quantities to afflict everyone.

Recipe found from the story here.

Episode 89 - Short selling AMD security flaws

Posted by Open Source Security Podcast on March 25, 2018 09:04 PM
Josh and Kurt talk about the recent AMD flaws and the events surrounding the disclosure.
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6406210/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Ansible, Azure, and Managed Disks

Posted by Adam Young on March 22, 2018 02:27 AM

Many applications have a data directory, usually due to having an embedded database. For the set I work with, this includes Red Hat IdM/FreeIPA, CloudForms/ManageIQ, Ansible Tower/AWX, and OpenShift/Kubernetes. Its enough of a pattern that I have Ansible code for pairing a set of newly allocated partitions with a set of previously built virtual machines.

I’ll declare a set of variables like this:

  - {name: idm,    flavor:  m1.medium}
  - {name: sso,    flavor:  m1.medium}
  - {name: master0, flavor:  m1.xlarge}
  - {name: master1, flavor:  m1.xlarge}
  - {name: master2, flavor:  m1.xlarge}
  - {name: node0,  flavor:  m1.medium}
  - {name: node1,  flavor:  m1.medium}
  - {name: node2,  flavor:  m1.medium}
  - {name: bastion,  flavor:  m1.small}
  - {server_name: master0, volume_name: master0_var_volume, size: 30}
  - {server_name: master1, volume_name: master1_var_volume, size: 30}
  - {server_name: master2, volume_name: master2_var_volume, size: 30}

In OpenStack, the code looks like this:

- name: create servers
    cloud: "{{ cloudname }}"
    state: present
    name: "{{ item.name }}.{{ clustername }}"
    image: rhel-guest-image-7.4-0
    key_name: ayoung-pubkey
    timeout: 200
    flavor: "{{ item.flavor }}"
      - "{{ securitygroupname }}"
      -  net-id:  "{{ osnetwork.network.id }}"
         net-name: "{{ netname }}_network" 
      hostname: "{{ item.name }}.{{ clustername }}"
      fqdn: "{{ item.name }}.{{ clustername }}"
    userdata: |
      hostname: "{{ item.name }}.{{ clustername }}"
      fqdn:  "{{ item.name }}.{{ clustername }}"
        -   path: /etc/sudoers.d/999-ansible-requiretty
            permissions: 440
      content: |
        Defaults:{{ netname }} !requiretty
  with_items: "{{ cluster_hosts }}"
  register: osservers

- name: create openshift var volume
    cloud: "{{ cloudname }}"
    size: 40
    display_name: "{{ item.volume_name }}"
  register: openshift_var_volume
  with_items: "{{ cluster_volumes }}"

- name: attach var volume to OCE Master
    cloud: "{{ cloudname }}"
    state: present
    server: "{{ item.server_name }}.{{ clustername }}"
    volume:  "{{ item.volume_name }}"
    device: /dev/vdb
  with_items: "{{ cluster_volumes }}"

I wanted to do something comparable with Azure. My First take was this:

- name: Create virtual machine
    resource_group: "{{ az_resources }}"
    name: "{{ item.name }}"
    admin_username: "{{ az_username }}"
    admin_password: "{{ az_password }}"
      offer: RHEL
      publisher: RedHat
      sku: '7.3'
      urn: 'RedHat:RHEL:7.3:latest'
      version: '7.3.2017090723'
  with_items: "{{ cluster_hosts }}"
  register: az_servers

- name: create additional volumes
    name: "{{ item.volume_name }}"
    location: eastus
    resource_group: "{{ az_resources }}"
    disk_size_gb: 40
    managed_by: "{{ item.server_name }}"
  register: az_cluster_volumes
  with_items: "{{ cluster_volumes }}"

However, when I ran that, I got the error:

“Error updating virtual machine idm – Azure Error: OperationNotAllowed
Message: Addition of a managed disk to a VM with blob based disks is not supported.
Target: dataDisk”

And I was not able to reproduce using the CLI:

$ az vm create -g  ayoung_resources  -n IDM   --admin-password   e8f58a03-3fb6-4fa0-b7af-0F1A71A93605 --admin-username ayoung --image RedHat:RHEL:7.3:latest
  "fqdns": "",
  "id": "/subscriptions/362a873d-c89a-44ec-9578-73f2e492e2ae/resourceGroups/ayoung_resources/providers/Microsoft.Compute/virtualMachines/IDM",
  "location": "eastus",
  "macAddress": "00-0D-3A-1D-99-18",
  "powerState": "VM running",
  "privateIpAddress": "",
  "publicIpAddress": "",
  "resourceGroup": "ayoung_resources",
  "zones": ""
[ayoung@ayoung541 rippowam]$ az vm disk attach  -g ayoung_resources --vm-name IDM --disk  CFME-NE-DB --new --size-gb 100

However, looking into the https://github.com/ansible/ansible/blob/v2.5.0rc3/lib/ansible/modules/cloud/azure/azure_rm_virtualmachine.py#L1150:

if not data_disk.get('managed_disk_type'):
    data_disk_managed_disk = None
    disk_name = data_disk['storage_blob_name']
    data_disk_vhd = self.compute_models.VirtualHardDisk(uri=data_disk_requested_vhd_uri)

It looks like there is code to default to blob type if the “managed_disk_type” value is unset.

I added in the following line:

    managed_disk_type: "Standard_LRS"

Thus, my modified ansible task looks like this:

- name: Create virtual machine
    resource_group: "{{ az_resources }}"
    name: "{{ item.name }}"
    managed_disk_type: "Standard_LRS"
    admin_username: "{{ az_username }}"
    admin_password: "{{ az_password }}"
    managed_disk_type: "Standard_LRS"
      offer: RHEL
      publisher: RedHat
      sku: '7.3'
      urn: 'RedHat:RHEL:7.3:latest'
      version: '7.3.2017090723'
  with_items: "{{ cluster_hosts }}"
  register: az_servers

- name: create additional volumes
    name: "{{ item.volume_name }}"
    location: eastus
    resource_group: "{{ az_resources }}"
    disk_size_gb: 40
    managed_by: "{{ item.server_name }}"
  register: az_cluster_volumes
  with_items: "{{ cluster_volumes }}"

Which completed successfully:

Launching Custom Image VMs on Azure With Anisble

Posted by Adam Young on March 20, 2018 08:37 PM

Part of my Job is making sure our customers can run our software in Public clouds.  Recently, I was able to get CloudForms Management Engine (CFME) to deploy to Azure. Once I got it done manually, I wanted to automate the deployment, and that means Ansible.  Turns out that launching custom images from Ansible is not support int the current GA version of the Azure modules, but has been implemented upstream.

Ansible releases package versions here. I wanted the 2.5 that was aligned with Fedora 27 which is RC3 right now. Install using dnf:

sudo dnf install  http://releases.ansible.com/ansible/rpm/preview/fedora-27-x86_64/ansible-2.5.0-0.1003.rc3.fc27.ans.noarch.rpm

And then I can launch using the new syntax for the image dictionary. Here is the task fragment from my tasks/main.yml

- name: Create virtual machine
    resource_group: "{{ az_resources }}"
    name: CloudForms
    vm_size: Standard_D1
    admin_username: "{{ az_username }}"
    admin_password: "{{ az_password }}"
      name: cfme-azure-      
      resource_group: CFME-NE

Note the two dictionary values under image. This works, so long as the user has access to the image, even if it comes from a different resource group.

Big thanks to <sivel> and <jborean93> In #ansible FreeNode IRC for helping get this to work.

SELinux should and does BLOCK access to Docker socket

Posted by Dan Walsh on March 19, 2018 10:19 AM

I get lots of bugs from people complaining about SELinux blocking access to the Docker socket.  For example https://bugzilla.redhat.com/show_bug.cgi?id=1557893

The aggravating thing is, this is exactly what we want SELinux to prevent.  If a container process got to the point of talking to the /var/run/docker.sock, you know this is a serious security issue.  Giving a container access to the Docker socket, means you are giving it full root on  your system.  

A couple of years ago, I wrote why giving docker.sock to non privileged users is a bad idea.

Now I am getting bug reports about allowing containers access to this socket.

Access to the docker.sock is the equivalent of sudo with NOPASSWD, without any logging. You are giving the process that talks to the socket, the ability to launch a process on the system as full root.

Usually people are doing this because they want the container to do benign operations, like list which containers are on the system, or look a the container logs.  But Docker does not have a nice RBAC system, you basically get full access or no access.  I choose to default to NO ACCESS.

If you need to give container full access to the system then run it as a --privileged container or disable SELinux separation for the container.

podman run --privileged ...
docker run --privileged ...

podman run --security-opt label:disable ...
docker run --security-opt label:disable ...

Run it privileged

There is a discussion going on in Moby github, about breaking out more security options, specifically adding a flag to container runtimes, to allow users to specify whether they want kernel file systems in the container to be readonly. (/proc, /sys,...)

I am fine with doing this, but my concern with this is people want to make little changes to the security of their containers, but at a certain point you allow full breakout.  Like above where you allow a container to talk to the docker.sock.  

Security tools are being developed to search for things like containers running as --privileged, but they might not understand that --security-opt selinux:disable -v /run:/run is the SAME THING from a security point of view.  If it is simple to break out of container confinement, then we should just be honest and run the container with full privilege.  (--privileged). 

Episode 88 - Chat with Chris Rosen from IBM about Container Security

Posted by Open Source Security Podcast on March 18, 2018 08:34 PM
Josh and Kurt talk about container security with IBM's Chris Rosen.
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6378524/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Generating a list of URL patterns for OpenStack services.

Posted by Adam Young on March 16, 2018 05:35 PM

Last year at the Boston OpenStack summit, I presented on an Idea of using URL patterns to enforce RBAC. While this idea is on hold for the time being, a related approach is moving forward building on top of application credentials. In this approach, the set of acceptable URLs is added to the role, so it is an additional check. This is a lower barrier to entry approach.

One thing I requested on the specification was to use the same mechanism as I had put forth on the RBAC in Middleware spec: the URL pattern. The set of acceptable URL patterns will be specified by an operator.

The user selects the URL pattern they want to add as a “white-list” to their application credential. A user could further specify a dictionary to fill in the segments of that URL pattern, to get a delegation down to an individual resource.

I wanted to see how easy it would be to generate a list of URL patterns. It turns out that, for the projects that are using the oslo-policy-in-code approach, it is pretty easy;

cd /opt/stack/nova
 . .tox/py35/bin/activate
(py35) [ayoung@ayoung541 nova]$ oslopolicy-sample-generator  --namespace nova | egrep "POST|GET|DELETE|PUT" | sed 's!#!!'
 POST  /servers/{server_id}/action (os-resetState)
 POST  /servers/{server_id}/action (injectNetworkInfo)
 POST  /servers/{server_id}/action (resetNetwork)
 POST  /servers/{server_id}/action (changePassword)
 GET  /os-agents
 POST  /os-agents
 PUT  /os-agents/{agent_build_id}
 DELETE  /os-agents/{agent_build_id}

Similar for Keystone

$ oslopolicy-sample-generator  --namespace keystone  | egrep "POST|GET|DELETE|PUT" | sed 's!# !!' | head -10
GET  /v3/users/{user_id}/application_credentials/{application_credential_id}
GET  /v3/users/{user_id}/application_credentials
POST  /v3/users/{user_id}/application_credentials
DELETE  /v3/users/{user_id}/application_credentials/{application_credential_id}
PUT  /v3/OS-OAUTH1/authorize/{request_token_id}
GET  /v3/users/{user_id}/OS-OAUTH1/access_tokens/{access_token_id}
GET  /v3/users/{user_id}/OS-OAUTH1/access_tokens/{access_token_id}/roles/{role_id}
GET  /v3/users/{user_id}/OS-OAUTH1/access_tokens
GET  /v3/users/{user_id}/OS-OAUTH1/access_tokens/{access_token_id}/roles
DELETE  /v3/users/{user_id}/OS-OAUTH1/access_tokens/{access_token_id}

The output of the tool is a little sub-optimal, as the oslo policy enforcement used to be done using only JSON, and JSON does not allow comments, so I had to scrape the comments out of the YAML format. Ideally, we could tweak the tool to output the URL patterns and the policy rules that enforce them in a clean format.

What roles are used? Turns out, we can figure that out, too:

$ oslopolicy-sample-generator  --namespace keystone  |  grep \"role:
#"admin_required": "role:admin or is_admin:1"
#"service_role": "role:service"

So only admin or service are actually used. On Nova:

$ oslopolicy-sample-generator  --namespace nova  |  grep \"role:
#"context_is_admin": "role:admin"

Only admin.

How about matching the URL pattern to the policy rule?
If I run

oslopolicy-sample-generator  --namespace nova  |  less

In the middle I can see an example like this (# marsk removed for syntax):

# Create, list, update, and delete guest agent builds

# This is XenAPI driver specific.
# It is used to force the upgrade of the XenAPI guest agent on
# instance boot.
 GET  /os-agents
 POST  /os-agents
 PUT  /os-agents/{agent_build_id}
 DELETE  /os-agents/{agent_build_id}
"os_compute_api:os-agents": "rule:admin_api"

This is not 100% deterministic, though, as some services, Nova in particular, enforce policy based on the payload.

For example, these operations can be done by the resource owner:

# Restore a soft deleted server or force delete a server before
# deferred cleanup
 POST  /servers/{server_id}/action (restore)
 POST  /servers/{server_id}/action (forceDelete)
"os_compute_api:os-deferred-delete": "rule:admin_or_owner"

Where as these operations must be done by an admin operator:

# Evacuate a server from a failed host to a new host
 POST  /servers/{server_id}/action (evacuate)
"os_compute_api:os-evacuate": "rule:admin_api"

Both map to the same URL pattern. We tripped over this when working on RBAC in Middleware, and it is going to be an issue with the Whitelist as well.

Looking at the API docs, we can see that difference in the bodies of the operations. The Evacuate call has a body like this:

    "evacuate": {
        "host": "b419863b7d814906a68fb31703c0dbd6",
        "adminPass": "MySecretPass",
        "onSharedStorage": "False"

Whereas the forceDelete call has a body like this:

    "forceDelete": null

From these, it is pretty straight forward to figure out what policy to apply, but as of yet, there is no programmatic way to access that.

It would take a little more scripting to try and identity the set of rules that mean a user should be able to perform those actions with a project scoped token versus the set of APIs that are reserved for cloud operations. However, just looking at the admin_or_owner rule for most is sufficient to indicate that it should be performed using a scoped token. Thus, an end user should be able to determine the set of operations that she can include in a white-list.

Teaching an old dog new tricks

Posted by Dan Walsh on March 12, 2018 08:16 PM

I have been working on SELinux for over 15 years.  I switched my primary job to working on containers several years ago, but one of the first things I did with containers was to add SELinux support.  Now all of the container projects I work on including CRI-O, Podman, Buildah as well as Docker, Moby, Rocket, runc, systemd-nspawn, lxc ... all have SELinux support.  I also maintain the container-selinux policy package which all of these container runtimes rely on.

Any ways container runtimes started adding the no-new-privileges capabilities a couple of years ago. 


The no_new_privs kernel feature works as follows:

  • Processes set no_new_privs bit in kernel that persists across fork, clone, & exec.
  • no_new_privs bit ensures process/children processes do not gain any additional privileges.
  • Process aren't allowed to unset no_new_privs bit once set.
  • no_new_privs processes are not allowed to change uid/gid or gain any other capabilities, even if the process executes setuid binaries or executables with file capability bits set.
  • no_new_privs prevents Linux Security Modules (LSMs) like SELinux from transitioning to process labels that have access not allowed to the current process. This means an SELinux process is only allowed to transition to a process type with less privileges.

Oops that last flag is a problem for containers and SELinux.  If I am running a command like

# podman run -ti --security-opt no-new-privileges fedora sh

On and SELinux system, usually podman command would be running as unconfined_t, and usually podman asks for the container process to be launched as container_t.  


docker run -ti --security-opt no-new-privileges fedora sh

In the case of Docker the docker daemon is usually running as container_runtime_t. And will attempt to launch the container as container_t.

But the user also asked for no-new-privileges. If both flags are set the kernel would not allow the process to transition from unconfined_t -> container_t. And in the Docker case the kernel would not allow a transition from container_runtime_t -> container_t.

Well you may say that is pretty Dumb.  no_new_privileges is supposed to be a security measure that prevents a process from gaining further privs, but in this case it is actually preventing us from lessening its SELinux access.

Well the SELinux kernel and policy had the concept of "typebounds", where a policy writer could write that one type typebounted another type.  For example

typebounds container_runtime_t container_t, and the kernel would then make sure that container_t has not more allow rules then container_runtime_t.  This concept proved to be problematic for two reasons.  


Writing policy for the typebounds was very difficult and in some cases we would have to add additional access to the bounding type.  An example of this is SELinux can control the `entrypoint` of a process.  For example we write policy that says httpd_t can only be entered by executables labeled with the entrypoint type httpd_exec_t.  We also had a rule that container_runtime_t can only be entered via the entrpoint type of container_runtime_exec_t.  But we wanted to allow any processes to be run inside of a container, we wrote a rules that all executable types could be used as entrypoints to container_t.  With typebounds we needed to add all of these rules to container_runtime_t meaning we would have to allow all executables to be run as container_runtime_t.  Not ideal.

The second problem with typebounds and the kernel and policy only allowed a single typebounds of a type.  So if we wanted to allow unconfined_t processes to launch container_t processes, we would end up writing rules like

typebounds unconfined_t container_runtime_t
typebounds container_runtime_t container_t.

Now unconfined_t would need to grow all of the allow rules of container_runtime_t and container_t.



Well I was complaining about this to Lucas Vrabec, the guy who took over selinux-policy from me, and he tells me about this new allow rule called nnp_transitions.  The policy writer could write a rule into policy to say that a process could nnp_transition from one domain to another.

allow container_runtime_t confined_t:process2 nnp_transition;
allow unconfined_t confined_t:process2 nnp_transition;

With a recent enough kernel, SELinux would allow the transition even if the no_new_privs kernel flag was set, and the typebounds rules were NOT in place.  

Boy did I feel like an SELinux NEWBIE.  I added the rules on Fedora 27 and suddenly everything started working.  As of RHEL7.5 this feature will be back ported into the RHEL7.5 kernel.   Awesome.


While I was looking at the nnp_transition rules, I noticed that there was also a nosuid_transition permission. nosuid allows people to mount a file system with nosuid flag, this tells the kernel that even if a setuid application exists on this file system, the kernel should ignore it and not allow a process to gain privilege via the file.  You always want untrusted file systems like usb sticks to be mounted with this flag.  Well SELinux systems similarly ignore transition rules on labels based on a nosuid file system.  Similar to nnp_transition, this blocks a process from transition from a privileged domain to a less privileged domain.  But the nosuid_transtion flag allows us to tell the kernel to allow transitions from one domain to another even if the file system is marked nosuid.

allow container_runtime_t confined_t:process2 nosuid_transition;
allow unconfined_t container_t:process2 nosuid_transition;

This means that even if a user used podman to execute a file on a nosuid file system it would be allowed to transition from the unconfined_t to container_t. 

Well it is nice to know there are still things that I can learn new about SELinux.

Episode 87 - Chat with Let's Encrypt co-founder Josh Aas

Posted by Open Source Security Podcast on March 11, 2018 11:00 PM
Josh and Kurt talk about Let's Encrypt with co-founder Josh Aas. We discuss the past, present, and future of the project.
<iframe allowfullscreen="" height="90" mozallowfullscreen="" msallowfullscreen="" oallowfullscreen="" scrolling="no" src="https://html5-player.libsyn.com/embed/episode/id/6353715/height/90/theme/custom/autoplay/no/autonext/no/thumbnail/yes/preload/no/no_addthis/no/direction/backward/render-playlist/no/custom-color/6e6a6a/" style="border: none;" webkitallowfullscreen="" width="100%"></iframe>

Show Notes

Containers and MLS

Posted by Dan Walsh on March 08, 2018 10:43 AM

I have just updated the container-selinux policy to support MLS (Multi Level Security).  

SELinux and Container technology have a long history together.  Some people imagine that containers started just a few years ago with the introduction of Docker, but the technology goes back a lot further then that. 

SELinux originally supported two type of Mandatory Access Control,  Type Enforcement and RBAC (Roles Based Access Control).  

Type Enforcement

Type Enforcement is SELinux main security measure.  Basically every process on the system gets a assigned a type (httpd_t, unconfined_t, container_t ...)  And every object (file, directory, socket, tcp port ...) gets assigned a type. (httpd_sys_content_t, user_home_t, httpd_port_t, container_file_t).  Then we write rules that define the access between process types and object types. Note: The type field is always the third field of the SELinus label.

allow container_t container_file_t:file { open read write }

Anything that is not allowed is denied.

RBAC (Roles Based Access Control)

RBAC although seldom used was an enhancement on Type Enforcement, it basically controlled the types that a user process could become.  When a user logs onto the system he gets assigned User (staff_u, which can have one or more rules, including a default role  staff_r).    Note: the user field is the first field of the SELinux label and the role field is the second field.  Policy defines which types that staff_r role can run with.  For example a staff_r can run with the staff_t type, but it is not allowed to use the unconfined_t type.  If the SELinux user the user logs in with has multiple roles then he could change his role using the newrole command or sudo.  I often setup my account with a user with the staff_r and unconfined_r roles.   Where I login i get the staff_r role but when I become root through sudo I become the unconfined_r role with the unconfined_t type.  This means that if I accedently ran a setuid app on my system that made me root, I would still be staff_t and not able to modify the kernel.

MLS (Multi Level Security)

Back around the time we were working on RHEL 5 (2005/2006) time frame, we decided we wanted to use SELinux for MLS.  We wanted to get RHEL to be certified as EAL4+ LSPP, which required us to support handling data at different levels.  MLS Is very different then type enforcement, in that it is not concerned with the type of the process that is running, but at it sensitivity or security level.  The easiest way to understand this is imagine a group of processes running as TopSecret and another group of processes running as Secret, and then data/file on the system would also be assigned a level.  Like TopSecret and Secret.  The kernel then controls the communcations between these processes based on levels.  A Secret process can not read TopSecret data.  MLS Labels consist of this Sensitivity field and up to 1024 different categories.  Note: the MLS Field is the forth field of an SELinux Label.  You might have a TopSecret Process with the British Intelligence Category.   In order for processes to communicate the kernel enforces SELinux policy on both the sensitivities and categories.  I don't want to get into the weeds on this, there is plenty of information on the web on how MLS systems work. 

Way back in RHEL5 time the "mount namespace" was added to the kernel to handle MLS workloads.  We wanted to setup a system which allowed a user to login to a system at Secret  level, and then later login to the same system as TopSecret.  But we wanted him to have different home directories and /tmp directories when he logged in.  He would see his Secret Home directory when he logged in as secret and the Top Secret directory when he logged in as Top Secret. The Mount namespace allowed us to change the "mount tables" for a group of processes so /home/dwalsh and /tmp would vary for different processes on the system.  Now Mount Namespace is the crucial container feature that allows us to have multiple containers on a system each seeing different version of the OS mount on /.

MCS (Multi Category Security)

MCS was introduced around RHEL 6 Time frame (2008) in order to keep virtual machine separate.  We needed a way to make sure that two Virtual Machines could not attack each other if their was a hypervisor break out.  From an Type enforcement point of view each VM would run as svirt_t, and thier images would be svirt_image_t, we had rules that said:

allow svirt_t svirt_image_t:file manage_file_perms;

This means we could control a VM to only be able to read/write svirt images, and would not be allowed to attack the rest of the host.  But if every VM ran as svirt_t and all their images were svirt_image_t, they would be able to attack each other.  We needed a way in SELinux to keep the VMS apart.

Most systems did not use the MLS field of SELinux since they were not running in MLS mode,  If you are on a targeted system, you would see most processes running with an MLS filed of `s0`.  We decided to take advantage of the MLS field and create a new form of policy enforcement.  We would ignore Sensitivity and just use the categories.  We modified libvirt, the tool we use to launch VMs, to assign 2 random categories to each process and image, making sure the categories for the VM matched its image.  Then the Kernel would enforce that the categories had to match exactly to allow access.  This means a VM running with a Label like system_u:system_r:svirt_t:s0:C1,C2 would be allowed to manage an image labeled system_u:object_r:svirt_image_t:s0:c1,c2 but not allowed access to one labeled system_u:object_r:svirt_image_t:s0:c3,c4. We called this isolation svirt.

We now use MCS Separation and Type Enforcement to isolate containers.  We use different types for containers, so containers run as container_t and content is labeled container_file_t.

Containers and MLS

But what about people who want to run containers on MLS systems?  I recently worked with a user who wanted to run containers on an MLS machine, so we had to modify the policy.  Container Selinux policy is available in github. 


init_ranged_daemon_domain(container_runtime_t, container_runtime_exec_t, s0 - mls_systemhigh)

The only changed we need to make were to allow the container runtimes to run fully ranged.  This means container runtimes like CRI-O, Podman, Buildah and Docker can execute any any level and MCS Category.  We also had to allow container runtimes to be able to manage content at all of its sensitivies and categories.  Luckily this only meant we had to add three lines to the existing policy.  Now an MLS user could install the container-selinux package onto a MLS system and install a container runtime, and they can select which MLS Label they want to run at.

podman run --secutity-opt level:TopSecret:BritishIntelligence fedora sh
docker run --security-opt level:s15:c100,c200 -v /var/lib/content:/var/lib/content:Z nginx 

And the processes in the container and the content in the container will match the label.