Fedora Security Planet

Episode 9 - Are bug bounties measuring the wrong things?

Posted by Open Source Security Podcast on October 18, 2016 10:00 PM
Kurt and Josh discuss responsible disclosure, irresponsible disclosure, bug bounties, measuring security, usability AND security, as well as quality of life.

Download Episode

Show Notes

Can I interest you in talking about our savior, its name is Security!

Posted by Josh Bressers on October 17, 2016 02:00 PM
I had a discussion last week with some fellow security folks about how we can discuss security with normal people. If you pay attention to what's going on, you know the security people and the non-security people don't really communicate well. We eventually made our way to comparing what we do to the door-to-door religious groups. They're rarely seen in a positive light, are usually annoying, and only seem to show up when it's most inconvenient. This got me thinking: we probably have more in common there than we want to admit, but there are also some lessons for us.

Firstly, nobody wants to talk to either group. The reasons are basically the same. People are already mostly happy with whatever choices they've made and don't need someone showing up to mess with their plans. Do you enjoy being told you're wrong? Even if you are wrong, you don't want someone telling you this. At best you want to figure it out yourself but in reality you don't care and will keep doing whatever you want. It's part of being an irrational human. I'm right, you're wrong, everything else is just pointless details.

Let's assume you are certain that the message you have is really important. If you're not telling people something useful, you're wasting their time. It doesn't matter how important a message is, the audience has to want to hear it. Nobody likes having their time wasted. In this crazy election season, how often are you willing to stay on the line when a pollster calls? You know it's just a big waste of time.

Most importantly though, you can't act pretentious. If you think you're better than whoever you're talking to, even if you're trying hard not to show it, they'll know. Humans are amazing at understanding what another person is thinking by how they act. It's how we managed to survive this long. Our monkey brains are really good at handling social interactions without us even knowing. How often do you talk to someone who is acting superior to you, and all you want to do is stop talking to them?

Now what?

It's really easy to point all this stuff out; most of us probably know this already. So what can we start doing differently? In the same context of door-to-door selling, it's far more powerful if someone comes to you. If they come to you, they want to learn and understand. So while there isn't anything overly new and exciting here, the thing that's best for us to remember today is just to be available. If you're approachable, you will be approached, and when people do approach you, make sure you don't drive them away. If someone wants to talk to you about security, let them. And be kind, understanding, and sympathetic.

Fedora Server: Expanding Throughout the Galaxy

Posted by Stephen Gallagher on October 13, 2016 06:49 PM


Three years ago, Fedora embarked on a new initiative that we collectively refer to as Fedora.next. As part of this initiative, we decided to start curating deliverable artifacts around specific use-cases rather than the one-size-fits-all approach of Fedora 20 and earlier. One of those specific use-cases was to meet the needs of “server administrators”. And thus, the Fedora Server Edition was born.

One of the earliest things that we did after creating the Fedora Server Working Group (Server WG from here on) was to perform what in the corporate world might be called a “gap analysis”. What this means is that we looked at Fedora from the perspective of the server administrator “personas” we had created and tried to understand their pain points (particularly in contrast to how things function on competitive platforms such as Microsoft Windows Server).

The most obvious gap that we identified was the relative difficulty of getting started with Fedora Server Edition at all. With Microsoft Windows Server, the first experience after logging in is to be presented with a tool called Server Manager that provides basic (graphical) information about the system as well as presenting the user with a list of things that they might want this server to do. It then walks them through a guided installation of those core features (such as domain controller services, remote desktop services and so on). With Fedora, a default install would get you a bash prompt with no guidance; typing “help” would only lead to the extremely expert-focused help documentation for the bash shell itself.

OK, advantage Windows here. So how do we address that? Server WG had agreed early on that we were not willing to require a desktop environment for server systems. We instead set our sights on a fledgling project called Cockpit, which was gaining traction and looked to provide an excellent user experience without requiring a local display – it’s a web-based admin console and so can be accessed by users running the operating system of their choice.

Once Cockpit was established as the much-friendlier initial experience for Fedora Server, we started to look at the second part of the problem that we needed to solve: that of simplified deployment of key infrastructure components. To that end, we started the development of a tool that we could integrate with the Cockpit admin console and provide the actual deployment implementation. What we came up with was a python project that we called rolekit that would provide a fairly simple local D-BUS API that Cockpit would be able to call out to in order to deploy the requested services.

While our intentions were good, rolekit faced two serious problems:

  • The creation of the roles was complicated and manual, requiring careful curation and attention to make sure that they continued to work from release to release of Fedora.
  • The Cockpit Project became very popular and its limited resources became dedicated to serving the needs of their other consumers, leaving us unable to get the integration of rolekit completed.

The second of these issues remains and will likely need to be addressed, but that will be a topic for another day. The remainder of this blog entry will discuss our plans for how to improve the creation and maintenance of roles.

Ansible Galaxy

Ansible Galaxy describes itself as “[Y]our hub for finding, reusing and sharing the best Ansible content”. What this means is that the Ansible project runs a public software service enabling the sharing of GitHub repositories containing useful Ansible roles and playbooks for deploying software services.

The Galaxy hub contains literally thousands of pre-built server roles for Fedora, Red Hat Enterprise Linux and other systems with more being added every day. With such a large community made available to us, the Server WG has decided to explore the use of Ansible Galaxy as the back-end for our server role implementation, replacing rolekit’s custom (and redundant) implementation.

As part of this effort, I attended the Ansible Contributor Conference and AnsibleFest this week in Brooklyn, New York. I spent a great deal of time talking with Chris Houseknecht about ways in which we could enhance Ansible Galaxy to function for our needs.

Required Galaxy Enhancements

There are a few shortcomings to Galaxy that we will need to address before we can implement a complete solution. The first of these is assurance: there is currently no way for a consumer of a role to assess its suitability. Specifically, we will want there to be a way for Fedora to elevate a set of roles (and specific versions of those roles) to a “recommended” state. In order to do this, Galaxy will be adding support for third-party signing of role versions. Fedora will become a “signing authority” for Ansible Galaxy, indicating that certain roles and their versions should be consumed by users of Fedora.

We will also add filtering to the Galaxy API to enable clients to limit their searches to only those roles that have been signed by a particular signing authority. This will be useful for limiting the list that we expose to users in Cockpit.

The other remaining issue with Ansible is that there is currently no way to execute an Ansible script through an API; at present it must be done via execution of the Ansible CLI tool. Fedora will be collaborating with Ansible on this (see below).

Required Fedora Enhancements

In Fedora, we will need to provide a useful UI for this new functionality. This will most likely need to happen in the Cockpit project, and we will have to find resources to handle this.

Specifically, we will need:

  • UI to handle searching the Galaxy API using the role signatures and other tag filtering.
  • UI for an “answer file” for satisfying required variables in the roles.
  • UI for interrogating a system for what roles have been applied to it.

In addition to Cockpit UI work, we will need to provide infrastructure within the Fedora Project to provide our signatures. This will mean at minimum having secure, audited storage of our private signing key and a tool or service that performs the signing. In the short term, we can allow a set of trusted users to do this signing manually, but in the longer term we will need to focus on setting up a CI environment to enable automated testing and signing of role updates.

Lastly, as mentioned above, we will need to work on an API that Cockpit can invoke to fire off the generated Ansible playbook. This will be provided by Fedora (likely under the rolekit banner) but may be eventually absorbed into the upstream Ansible project once it matures.

Circles in Minecraft?

Posted by Adam Young on October 12, 2016 12:27 AM

Minecraft is a land of Cubes. And yet, in this blockland, it turns out the circle is a very powerful tool. Using the basics of trigonometry, we can build all sorts of things.


Sine and Cosine


When a circle is drawn with its center at the origin of the axes, the coordinates of a point on the perimeter can be found using the Sine and Cosine functions.

A position on the ground in Minecraft is defined by the North-South position, Z, and the East-West position, X. If your avatar walks from the origin along the Green line, the Red and Yellow lines represent the X and Z values.

This actually shows how we draw a circle.  If the Green line stays fixed at the center but pivots around it, the end of the green line draws the circle.

There are a couple of ways to measure the distance around a circle: degrees and radians. A circle can be divided up into 360 degrees, which is useful if you need to break the circle up into smaller pieces: 360 can be divided by many smaller integers to get 1/2, 1/3, 1/4, 1/5, 1/6, 1/8, 1/10 and multiples of those as well. However, degrees get less useful when we want to convert to Cartesian (X, Y, Z) coordinates.

Radians are based on the ratio of the perimeter of the circle to its radius. That ratio involves the number Pi, in Greek π, which is roughly 3.14159: the perimeter is P = 2πr.

Yeah, 2π.  They couldn’t make it a simple value.  Go read the tau manifesto.

Anyway, trig functions are defined in radians. Halfway around the circle is π radians, and a complete revolution is 2π (so one degree is π/180 radians).

Oh, and because it all came from the ancient Greeks, we use another Greek letter for the angle: Theta, which looks like this: θ.

Wow, I wish we used tau.  But I digress.

If we are working in an X Y plane, there are two functions we can use that translate an angle in radians to the X and Y values of a point on the circle.  The Y is calculated by the Sine, and the X value is calculated by the Cosine.
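To make that concrete, here is a minimal standalone Java sketch (my own illustration, not from the original post and not Minecraft code) that computes the point on a circle of radius r for a single angle, with the angle already given in radians:

public class CirclePoint {
  public static void main( String[] args ) {
    double theta = Math.PI / 4;        // 45 degrees, expressed in radians
    double r = 10.0;                   // radius of the circle
    double x = r * Math.cos( theta );  // Cosine gives the X value
    double y = r * Math.sin( theta );  // Sine gives the Y value
    System.out.println( "x=" + x + " y=" + y );
  }
}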

Minecraft uses the class BlockPos to indicate a position in the world. Creating a BlockPos looks like this:

new BlockPos( x, y, z )

To place an Obsidian block at the origin in Minecraft we can use the following function:

world.setBlockState( new BlockPos( 0, 0, 0 ), Blocks.OBSIDIAN.getDefaultState( ) );


If we want to draw a circle made of blocks in Minecraft, we can loop around the circle from θ = 0 to θ = 2π, and get the X and Y values (leave Z at 0).
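Note that the snippets below call a helper named next( ) to convert degrees to radians. The post doesn't show it, but based on how it is used it is presumably just the standard conversion, something like this sketch (equivalent to Math.toRadians):

  // Assumed helper: convert an angle in degrees to radians.
  public static double next( int degrees ) {
    return degrees * Math.PI / 180.0;
  }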

  public static void circle_virt( World world, BlockPos origin, Block block, int i,
                                  double radius ) {
    for ( int degrees = 0; degrees < 360; degrees += 1 ) {
      double rads = next( degrees );
      double x = Math.round( origin.getX( ) + radius * Math.sin( rads ) );
      double y = Math.round( origin.getY( ) + radius * Math.cos( rads ) ) + i;
      double z = (double) origin.getZ( );

      world.setBlockState( new BlockPos( x, y, z ),
              block.getBlockState( ).getBaseState( ) );
    }
  }


Hooking it up to a command object, we can take a look at what it creates:

Vertical Circle at the Origin

Why is it only half a circle? Minecraft does not let us draw in the Negative Y space.


Let’s add some code to allow us to move the center somewhere else, and to select a block to use to make the circle.

  public static void circle_virt_offset( World world, BlockPos origin, Block block, double radius ) {
    for ( int degrees = 0; degrees < 360; degrees += 1 ) {
      double rads = next( degrees );
      double x = Math.round( origin.getX( ) + radius * Math.sin( rads ) );
      double y = Math.round( origin.getY( ) + radius * Math.cos( rads ) );
      double z = origin.getZ( );
      world.setBlockState( new BlockPos( x, y, z ),
              block.getBlockState( ).getBaseState( ) );
    }
  }

Now let’s try to generate a circle up and to the right of our origin.

Vertical circle offset from the origin.

Yeah, Red Wool came out White…wrong state.

If we want to draw a circle in the X Z plane (parallel to the ground), we can use the same logic, but vary the X and Z parameters and leave Y fixed, like this:

  public static void circle_horiz( World world, BlockPos origin, Block block, int i,
                                   int radius ) {
    for ( int degrees = 0; degrees < 360; degrees += 1 ) {
      double rads = next( degrees );
      double x = Math.round( origin.getX( ) + radius * Math.sin( rads ) );
      double y = origin.getY( ) + i;
      double z = Math.round( origin.getZ( ) + radius * Math.cos( rads ) );

      world.setBlockState( new BlockPos( x, y, z ),
              block.getBlockState( ).getBaseState( ) );
    }
  }

And we get:

Horizontal Circles, one at the origin, one offset.

Other Shapes

Once you get the basics down, you can add additional logic to generate disks…

public static void disk( Block block, int i, int radius, BlockPos center,
                         World world ) {
  double posX = center.getX( );
  double posY = center.getY( );
  double posZ = center.getZ( );

  int old_z = Integer.MAX_VALUE;
  int old_y = Integer.MAX_VALUE;
  int old_x = Integer.MAX_VALUE;

  for ( int degrees = 0; degrees < 180; degrees += 1 ) {
    double rads = next( degrees );
    int startX = ( int ) Math.round( posX - radius * Math.sin( rads ) );
    double endX = Math.round( posX + radius * Math.sin( rads ) );
    double y = posY + i;
    double z = Math.round( posZ + radius * Math.cos( rads ) );
    for ( int x = startX; x < endX; x++ ) {
      world.setBlockState( new BlockPos( ( double ) x, y, z ),
              block.getBlockState( ).getBaseState( ) );
    }
  }
}


with a star on them

  public static void star( EntityPlayer player, Block block, int i,
                           int max_radius ) {
    int radius = max_radius;
    BlockPos pos = player.getPosition( );
    World world = player.worldObj;
    IBlockState baseState = block.getBlockState( ).getBaseState( );

    Vector< BlockPos > points = new Vector< BlockPos >( );
    for ( int degree = 18; degree < 360; degree += 72 ) {
      double radians = ( 2 * Math.PI * degree ) / 360;
      BlockPos pointPos = new BlockPos(
              pos.getX( ) + Math.cos( radians ) * radius, pos.getY( ) + i,
              pos.getZ( ) + Math.sin( radians ) * radius );
      points.add( pointPos );
    }
    int[] star_index = { 2, 4, 1, 3, 0 };
    int prev_index = 0;
    for ( int index : star_index ) {
      BlockPos prev = points.get( prev_index );
      BlockPos curr = points.get( index );
      Iterator< BlockPos > itr = new LinearIterator( curr, prev );
      while ( itr.hasNext( ) ) {
        world.setBlockState( itr.next( ), baseState );
      }
      prev_index = index;
      BlockPos up_one = curr.add( 0, 1, 0 );
      world.setBlockState( curr, baseState );
      world.setBlockState( up_one,
              Blocks.TORCH.getBlockState( ).getBaseState( ) );
    }
  }


Or stack multiple circles on top of each other to make a cylinder:

  public static void cylinder( World world, BlockPos center, Block block,
                               int levels, int radius ) {
    circle_horiz( world, center, Blocks.TORCH, 0, radius - 1 );
    circle_horiz( world, center, Blocks.TORCH, 0, radius + 1 );
    for ( int i = 0; i < levels; i++ ) {
      circle_horiz( world, center, block, i, radius );
    }
    disk( block, -1, radius, center, world );
    circle_horiz( world, center, Blocks.TORCH, levels, radius );
  }

Cylinder of Mossy Stone

Or make the circles get smaller and smaller and make a cone:

  public static void cone( World world, BlockPos pos, Block block, int start,
                           int radius ) {
    int height = radius + 1;
    for ( int i = 0; i < height; i++ ) {
      circle_horiz( world, pos, block, start + i, radius - i );
    }
  }

Cone of Glass

Combining these tools you can get creative:

Childe Roland wuz heer.

If we combine the XY and XZ plane code, we can make the shell of a sphere:

  public static void shell( World world, BlockPos pos, Block block, int i,
                            int radius ) {
    int old_y = Integer.MAX_VALUE;
    int old_z = Integer.MAX_VALUE;

    for ( int degrees = 0; degrees < 360; degrees += 1 ) {
      double rads = next( degrees );
      int dome_radius = ( int ) Math.round( radius * Math.sin( rads ) );
      int y = ( int ) Math.round( radius * Math.cos( rads ) );
      int z = ( int ) ( double ) pos.getZ( ) + i;

      circle_horiz( world, pos, block, y, dome_radius );

      old_y = y;
      old_z = z;
    }
  }


Finally, add in some logic to make circles around the sine function:

  public static void sinewave( EntityPlayer sender ) {
    EntityPlayer player = sender;
    World world = player.worldObj;

    float TAU = (float) Math.PI * 2;

    for ( float radians = 0; radians < TAU; radians += TAU / 360 ) {
      world.setBlockState( new BlockPos( radians * 10, 0, Math.sin( radians ) * 10 ),
              Blocks.DIRT.getDefaultState( ) );
    }
    Block block = get_selected_block( player );
    BlockPos pos = player.getPosition( );
    for ( float radians = 0; radians < 5 * TAU; radians += TAU / 360 ) {
      circle_virt_z( player.worldObj, new BlockPos( radians * 10, 0, 0 ),
              Blocks.GOLD_BLOCK, 20, Math.sin( radians ) * 10 );
    }
  }


Have fun.

Episode 8 - The primality of prime numbers

Posted by Open Source Security Podcast on October 11, 2016 11:33 PM
Kurt and Josh discuss prime numbers (probably getting a lot of it wrong), Samsung, passwords, National Cyber Security Awareness Month, and bathroom scales.

Download Episode

Show Notes

Minecraft X Y Z

Posted by Adam Young on October 11, 2016 02:55 AM

Minecraft uses the Cartesian coordinate system to locate and display blocks. That means that every block location in a Minecraft universe can be described using three values: X, Y, and Z. Even the player’s avatar “Steve” has a position recorded this way. If you press the F3 key, you can see a bunch of text on the screen. Buried in there somewhere are the three values for Steve’s position.

In order to get a more visual representation of these three numbers, here is a simple modification (Mod) to Minecraft that draws a portion of the three axes. The X axis here is described by the red bricks. These are placed along the X values from -100 to +100, with both Y and Z set to 0.

The same is done for the Y axis, shown in Obsidian. Since we can’t draw under the world (negative Y does not render), it only goes up.

The same is also done for the Z axis using the gray stones. Positive Z comes forward, negative goes back.

Our viewpoint is from X=10, Y=1, Z=10, looking almost halfway between negative X and negative Z.

Here is the code to render the axes.

  public void execute( MinecraftServer server, ICommandSender sender, String[] args ) throws CommandException {

    EntityPlayer player = ( EntityPlayer ) sender;
    World world = player.worldObj;

    for ( int x = -LENGTH; x < LENGTH; x++ ) {
      world.setBlockState( new BlockPos( x, 0, 0 ), Blocks.BRICK_BLOCK.getDefaultState( ) );
    }
    for ( int y = 0; y < LENGTH; y++ ) {
      world.setBlockState( new BlockPos( 0, y, 0 ), Blocks.OBSIDIAN.getDefaultState( ) );
    }
    for ( int z = -LENGTH; z < LENGTH; z++ ) {
      world.setBlockState( new BlockPos( 0, 0, z ), Blocks.COBBLESTONE.getDefaultState( ) );
    }
    world.setBlockState( new BlockPos( 0, 0, 0 ), Blocks.COMMAND_BLOCK.getDefaultState( ) );
  }


Only trust food delivered by a zebra

Posted by Josh Bressers on October 10, 2016 01:21 PM
If you're a security person you're probably used to normal people not listening to you. Sometimes we know why they don't listen, but often the users get blamed for being stupid or stubborn or something else to justify their behavior. During a conversation the other day, it came up that some of our advice could be compared to telling someone they should only trust food that has been delivered to them by a zebra.

It's meant to sound silly, because it is silly.

If you tell someone they should only trust food delivered by a zebra, they might nod and agree, some will tell you that's silly, but fundamentally nobody is going to listen. They won't listen because that advice is completely impractical. If you give impractical advice, your advice gets ignored. This gets tricky though, because what I call impractical advice you may not. Things can get complicated, especially when a difficult topic is being discussed. It's even harder when you have a lot of people who are self-proclaimed experts but in reality don't know very much.

This is basically the story of security though. We give advice that we think is practical, normal people hear advice that makes no sense, makes their life worse, and is probably something they don't even want to do. They have two choices. Tell you they think your advice is bad, or just nod and agree while doing whatever they want. The latter is much less work. If someone tells you the advice you just gave them is bad, you're not going to think about how to give better advice, you're going to spend a lot of time convincing them why you're right and they're wrong. Smart people don't argue, they just nod and agree.

The solution to this problem is very simple to explain, but will be very hard to do. It's not uncommon for me to talk about listening as a very important thing we should be doing more of. If listening were easy, or solved as many things as I claim it would, we wouldn't have any problems left. While it is super important we listen to those we must help, it's only a small part of what we have to do. We must learn to be tactical first. You can't listen to people who won't talk to you. And if you show up demanding zebra food, nobody will ever tell you anything useful. You get branded as a kook and that pretty much ends everything.

Stop demanding zebra food.


Securing the Cyrus SASL Sample Server and Client with Kerberos

Posted by Adam Young on October 07, 2016 02:34 AM

Since running the Cyrus SASL sample server and client was not too bad, I figured I would see what happened when I tried to secure it using Kerberos.


I’m going to run this on a system that has been enrolled as a FreeIPA client, so I start with a known good Kerberos setup.

To see the list of mechanisms available, run


I have the following available.


For Kerberos, I want to use GSSAPI.

Let's do this the hard way, by trial and error. First, run the server, telling it to use the GSSAPI mechanism:

/usr/bin/sasl2-sample-server -p 1789 -h localhost -s hello  -m GSSAPI

Then run the client in another terminal:

sasl2-sample-client -s hello -p 1789  -m GSSAPI localhost

Which includes the following in the output:

starting SASL negotiation: generic failure
SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No Kerberos credentials available)
closing connection


I need a Kerberos TGT in order to get a service ticket. Use kinit

$ kinit admin
Password for admin@AYOUNG-DELL-T1700.TEST: 

This time the error message is:

starting SASL negotiation: generic failure
SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server rcmd/localhost@AYOUNG-DELL-T1700.TEST not found in Kerberos database)

I notice two things here:

  1. The service needs to be in the Kerberos server's directory.
  2. The service name should match the hostname.


If I rerun the command using the FQDN of the server, I can see the service name as expected:


$ sasl2-sample-client -s hello -p 1789 -m GSSAPI undercloud.ayoung-dell-t1700.test
receiving capability list... ...
SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server hello/undercloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST not found in Kerberos database)
closing connection


So I tried to create the service on the IPA server:

ipa service-add
Principal: hello/overcloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST
ipa: ERROR: Host does not have corresponding DNS A/AAAA record
[stack@overcloud ~]$ ipa service-find

A strange error that I don't understand, since the host does have an A record.

Work around it with --force:

ipa service-add  --force  hello/undercloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST


Added service "hello/undercloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST"
  Principal: hello/undercloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST
  Managed by: undercloud.ayoung-dell-t1700.test

OK, let's try running this again.

 sasl2-sample-client -s hello -p 1789 -m GSSAPI 

SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (KDC has no support for encryption type)


OK, I’m going to guess that this is because my remote service can’t deal with the Kerberos service tickets it is getting. Since the service tickets are for the principal: hello/undercloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST it needs to be able to decrypt requests using a key meant for this principal.

Fetch a keytab for that principal, and put it in a place where the GSSAPI libraries can access it automatically. This place is:

/var/kerberos/krb5/user/{uid}/client.keytab

Where {uid} is the numeric UID for a user. In this case, the user's name is stack and I can find the numeric UID value using getent.


ipa-getkeytab -p hello/undercloud.ayoung-dell-t1700.test@AYOUNG-DELL-T1700.TEST -k client.keytab  -s identity.ayoung-dell-t1700.test
Keytab successfully retrieved and stored in: client.keytab
$  getent passwd stack
$ sudo mkdir /var/kerberos/krb5/user/1000
$ sudo chown stack:stack /var/kerberos/krb5/user/1000
$ mv client.keytab /var/kerberos/krb5/user/1000

Restart the server process, try again, and the log is interesting. Here is the full client side trace.

$ sasl2-sample-client -s hello -p 1789 -m GSSAPI undercloud.ayoung-dell-t1700.test
receiving capability list... recv: {6}
please enter an authorization id: admin
using mechanism GSSAPI
send: {6}
send: {1}
send: {655}
`[82][2][8B][6][9]*[86]H[86][F7][12][1][2][2][1][0]n[82][2]z0[82][2]v[A0][3][2][1][5][A1][3][2][1][E][A2][7][3][5][0] [0][0][0][A3][82][1][82]a[82][1]~0[82][1]z[A0][3][2][1][5][A1][18][1B][16]AYOUNG-DELL-T1700.TEST[A2]503[A0][3][2][1][3][A1],0*[1B][5]hello[1B]!undercloud.ayoung-dell-t1700.test[A3][82][1] 0[82][1][1C][A0][3][2][1][12][A1][3][2][1][1][A2][82][1][E][4][82][1][A]T[DD][F8]B[F4][B4]5[D]`[A3]![EE][19]-NN[8E][F5][B7]{O,#[91][A4]}[86]k[D5][EE]vL[E4]&[6][3][A][1C][91][A5][A7][88]j[D1][A3][82][EC][A][D6][CB][F3]9[C][13]#[94][86]d+[B8]V[B7]C^[C6][A8][16][D1]r[E4][0][B9][2][2]&2[E5]Y~[C1]\([BA]x}[17][BC][D][FC][D5][CA][CA]h[E4][A1][81].[15][17]?[CA][A][8B]}[1C]l[F0][D9][E8][96]3<+[84][E7]q.[8E][D5][6][1C]p[E6][6]v[B0][84]5[9][B7]w[D6]3[B8][E3][5]T[BF][92][AA][D5][B3][[83]X[C0]:[BA]V[E5]{>[A5]T[F6]j[CB]p[BF]][EF][E1][91][ED][C][F3]Y[4]x[8E][C2]H[E7][14]#9[EE]5[B3]=[FA][80][DD][93][EF]3[0]q~22[6]I<[EB][F9]V[D1][9D][A8][A6]:[CE]u[AE]-l[D3]"[D7][FE]iB[84][E0]]B[E][C8]U[E][FD][D2]=[F2][97][88][D3][DA]j[B4][FA][16][D1]^CE2?[9F][89]^A[E9][AF][1A]5[99][CE][7][AF]M[1A][A][CB]^[E1][BA]f[7]-n<[F8]8![A4][81][DA]0[81][D7][A0][3][2][1][12][A2][81][CF][4][81][CC][91][F0][A]D[91][F6][FA][F4][B9][13][DF]d|[F4]Y[DF][9E]M[A2]f[11][15]x[C5]-|Qt[F4]nL>@[F4][18][FF],[F6][B5]F6[EC]+[C3]V[F1][81][97][E2][1D]i[4]wD&[9A]V[CE][A1][16][D7]4[E0]C[B]O[D1]v[DD][E9][84]lW[DA]%[F6]v[93]<m"SAfiF[8E][[95]"[CC][D2]4[FA]_[FB]i[E7][D4]M[AE][5][82][FF][D7][0][8C]6[8D][B0]3[F8][E3][B4]P[9C][9E][A2]`[7]U[F7][1D]zub[E0]([A9]P>[AE]f[1A][B1][80][A0]}s[EA][D1]Zk[FF]n_S[9E]rK[E5]n [85]#[DB][FF][B3][E2][19];[F5][E2][8A]>2[E5][A4][81][E8]z[9D][E3][BC][C8][87][F]:[81]7[C9]ix[1E]5[15])[8D][9D][C7][DB][13][98][97][C7]C[6]q[D2][C1][ED][B3]:[E0]
waiting for server reply...
authentication failed
closing connection

On the server side, it looks similar, but ends like this:

starting SASL negotiation: generic failure
closing connection

It is not a GSSAPI error this time. To dig deeper, I’m going to look at the source code on the server side.


I’ll shortcut a few steps. Install both gdb and the debugInfo for the sample code:

sudo yum install gdb
sudo debuginfo-install cyrus-sasl-devel-2.1.26-20.el7_2.x86_64

Note that the version might change for the debuginfo.

The source code is included with the debuginfo rpm:

$ rpmquery  --list cyrus-sasl-debuginfo-2.1.26-20.el7_2.x86_64 | grep server.c

Looking at the server code at line 267 I see:

if (r != SASL_OK && r != SASL_CONTINUE) {
    saslerr(r, "starting SASL negotiation");
    fputc('N', out); /* send NO to client */
    return -1;
}

Let’s put a breakpoint at line 255 above it and see what is happening. Here is the session for setting up the breakpoint:

$  gdb /usr/bin/sasl2-sample-server
(gdb) break 255
Breakpoint 1 at 0x2557: file server.c, line 255.
(gdb) run  -h undercloud.ayoung-dell-t1700.test -p 1789 -m GSSAPI

Running the client code gets as far as the prompt: please enter an authorization id: admiyo

This is suspect. We’ll come back to it in a moment.

Back on the server, now, we see the breakpoint has been hit.

Breakpoint 1, mysasl_negotiate (in=0x55555575c150, out=0x55555575c390, conn=0x55555575a6e0)
    at server.c:255
255	    if(buf[0] == 'Y') {
Missing separate debuginfos, use: debuginfo-install keyutils-libs-1.5.8-3.el7.x86_64 libdb-5.3.21-19.el7.x86_64 libselinux-2.2.2-6.el7.x86_64 nss-softokn-freebl- openssl-libs-1.0.1e-51.el7_2.7.x86_64 pcre-8.32-15.el7_2.1.x86_64 xz-libs-5.1.2-12alpha.el7.x86_64 zlib-1.2.7-15.el7.x86_64

We might need some other RPMS if we want to step deeper through the code, but for now, let’s keep on here.

(gdb) print buf
$1 = "Y", '\000' ...
(gdb) n
257	        len = recv_string(in, buf, sizeof(buf));
(gdb) n
recv: {655}
`[82][2][8B][6][9]*[86]H[86][F7][12][1][2][2][1][0]n[82][2]z0[82][2]v[A0][3][2][1][5][A1][3][2][1][E][A2][7][3][5][0] [0][0][0][A3][82][1][82]a[82][1]~0[82][1]z[A0][3][2][1][5][A1][18][1B][16]AYOUNG-DELL-T1700.TEST[A2]503[A0][3][2][1][3][A1],0*[1B][5]hello[1B]!undercloud.ayoung-dell-t1700.test[A3][82][1] 0[82][1][1C][A0][3][2][1][12][A1][3][2][1][1][A2][82][1][E][4][82][1][A]T[DD][F8]B[F4][B4]5[D]`[A3]![EE][19]-NN[8E][F5][B7]{O,#[91][A4]}[86]k[D5][EE]vL[E4]&[6][3][A][1C][91][A5][A7][88]j[D1][A3][82][EC][A][D6][CB][F3]9[C][13]#[94][86]d+[B8]V[B7]C^[C6][A8][16][D1]r[E4][0][B9][2][2]&2[E5]Y~[C1]\([BA]x}[17][BC][D][FC][D5][CA][CA]h[E4][A1][81].[15][17]?[CA][A][8B]}[1C]l[F0][D9][E8][96]3<+[84][E7]q.[8E][D5][6][1C]p[E6][6]v[B0][84]5[9][B7]w[D6]3[B8][E3][5]T[BF][92][AA][D5][B3][[83]X[C0]:[BA]V[E5]{>[A5]T[F6]j[CB]p[BF]][EF][E1][91][ED][C][F3]Y[4]x[8E][C2]H[E7][14]#9[EE]5[B3]=[FA][80][DD][93][EF]3[0]q~22[6]I<[EB][F9]V[D1][9D][A8][A6]:[CE]u[AE]-l[D3]"[D7][FE]iB[84][E0]]B[E][C8]U[E][FD][D2]=[F2][97][88][D3][DA]j[B4][FA][16][D1]^CE2?[9F][89]^A[E9][AF][1A]5[99][CE][7][AF]M[1A][A][CB]^[E1][BA]f[7]-n<[F8]8![A4][81][DA]0[81][D7][A0][3][2][1][12][A2][81][CF][4][81][CC]hgdf j[CF][AE][7F]:![1C]D[F8]3^w[B7];"[3][D8]3"[8]i[9]J[D3]R[F]A[E7]![BE]0<[8][D3]'j`[B7]J[16][A9][F3][E6]=[E5]J[FE].-[A1]t[[2]W[8D]7[F3][8][EC][92][BB][A3]o5h[C1]A[CC][A2][F1][99][AA][93]2{[BA]Mx0[9D][9][CC]![A]Y[12][D8][2][95][17]ml[B4][1A][94]y[1A][BC][D2]I[8F]7Vg2[8E]6[13]:Lx[E6][1][D3][3][7]r?[12][84]3[B1][B5][AA]E)[EA][87][A][9F]Nk[D1]I[FD]{[B8]9#-[D][8]2[CC]C1[A8]Lfl[B0][E8][82][13][F9]t[1A][F6]^[8D] O13[12]L[E7][C0]k[99][E1]J[1F][FE]#[14]u[B][B2][8F][DB][E6]73*[FA][ED][11][F7][9E][B0][DC][D9][19][AB][97][D7][8B][BB]
260	        r = sasl_server_start(conn, chosenmech, buf, len,
(gdb) print len
$2 = 1
(gdb) n
257	        len = recv_string(in, buf, sizeof(buf));
260	        r = sasl_server_start(conn, chosenmech, buf, len,
267	    if (r != SASL_OK && r != SASL_CONTINUE) {
Missing separate debuginfos, use: debuginfo-install gssproxy-0.4.1-8.el7_2.x86_64
(gdb) print r
$3 = -1

A -1 response code usually is an error. Looking in /usr/include/sasl/sasl.h:

#define SASL_FAIL -1 /* generic failure */

I wonder if we can figure out why. Let's see, first, if we can figure out what the client is sending in the authentication request. If it is a bad principal, then we have a pretty good reason to expect the server to reject it.

Let’s let the server continue running, and try debugging the client.

Client code can be found here

$ rpmquery  --list cyrus-sasl-debuginfo | grep client.c

At line 258 I see the call to sasl_client_start, which includes what appears to be the initialization of the data variable. Set a breakpoint there.

Running the code in the debugger like this:

$ gdb sasl2-sample-client
(gdb) break 258
Breakpoint 1 at 0x201b: file client.c, line 258.
(gdb) run -s hello -p 1789 -m GSSAPI undercloud.ayoung-dell-t1700.test
Starting program: /bin/sasl2-sample-client -s hello -p 1789 -m GSSAPI undercloud.ayoung-dell-t1700.test
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
receiving capability list... recv: {6}

Breakpoint 1, mysasl_negotiate (in=0x55555575cab0, out=0x55555575ccf0, conn=0x55555575b520)
    at client.c:258
258	    r = sasl_client_start(conn, mech, NULL, &data, &len, &chosenmech);
(gdb) print data
$1 = 0x0
(gdb) print mech
$2 = 0x7fffffffe714 "GSSAPI"
(gdb) print conn
$3 = (sasl_conn_t *) 0x55555575b520
(gdb) print len
$4 = 6
(gdb) n
please enter an authorization id: 

So it is the SASL library itself requesting an authorization ID. Let me try putting in the full Principal associated with the service ticket.

please enter an authorization id: ayoung@AYOUNG-DELL-T1700.TEST
259	    if (r != SASL_OK && r != SASL_CONTINUE) {
Missing separate debuginfos, use: debuginfo-install gssproxy-0.4.1-8.el7_2.x86_64
(gdb) print r
$5 = 1

And from sasl.h we know that is good.

#define SASL_CONTINUE 1 /* another step is needed in authentication */

Let’s let it continue.

authentication failed

Nope. Continuing through the debugger, I see another generic failure here:

1531	            } else {
1532	                /* Mech wants client-first, so let them have it */
1533	                result = sasl_server_step(conn,
1534	                                          clientin,
1535						  clientinlen,
1536	                                          serverout,
1537						  serveroutlen);
(gdb) n
1557	    if (  result != SASL_OK
(gdb) print result
$15 = -1

Still…why is the client-side SASL call kicking into an interactive prompt? There should be enough information via the GSSAPI SASL library interaction to authenticate. The man page for sasl_client_start even indicates that there might be prompts returned.

Looking deeper at the client code, I do see that the prompt is from line 122. The function simple at line 107 must be set as a callback. Perhaps the client code is not smart enough to work with GSSAPI? At lines 190 and 192 I see that the simple code is provided as a callback for the responses SASL_CB_USER or SASL_CB_AUTHNAME. Setting a breakpoint and rerunning shows the id value to be 16385, or 0x4001.

#define SASL_CB_USER 0x4001 /* client user identity to login as */


Humility and Success

If you have followed through this far, you know I am in the weeds. I asked for help. Help, in this case, was Robbie Harwood, who showed me that the sample server/client worked OK if I ran the server as root and used the service host instead of hello. That gave me a successful comparison to work with. I ran it under strace and noticed that the failing version was not trying to read the keytab file from /var/kerberos/krb5/user/1000/client.keytab. The successful one, running as root, read the keytab from /etc/krb5.keytab. The failing one was trying to read from there as well and getting a permissions failure. The final blow that took down the wall was to realize that the krb5.conf file defined different values for default_client_keytab_name and default_keytab_name, with the latter being set to FILE:/etc/krb5.keytab. To work around this, I needed the environment variable KRB5_KTNAME to be set to the keytab. This was the winning entry:

KRB5_KTNAME=/var/kerberos/krb5/user/1000/client.keytab  sasl2-sample-server -h $HOSTNAME -p 9999 -s hello -m GSSAPI 

And then ran

sasl2-sample-client -s hello -p 9999 -m GSSAPI undercloud.ayoung-dell-t1700.test

Oh, one other thing Robbie told me was that the string I type when prompted with

please enter an authorization id:

should be the Kerberos principal minus the realm, so for me it was:

please enter an authorization id: ayoung

Running the Cyrus SASL Sample Server and Client

Posted by Adam Young on October 05, 2016 04:42 PM

When I start working on a new project, I usually start by writing a “Hello, World” program and going step by step from there. When trying to learn Cyrus SASL, I found I needed something comparable that showed both the client and server side of the connection. While the end state of using SASL should be communication that is both authenticated and encrypted, to start I just wanted to see the protocol in action, using clear text and no authentication.

UPDATE: Note that the client and server code are provided with the cyrus-sasl-devel RPM on a Fedora system and comparable packages elsewhere.

I started by running the server:

/usr/bin/sasl2-sample-server  -h localhost -p 1789 -m ANONYMOUS

Why did I choose 1789? It is the port for the Hello server:

$ getent services hello
hello                 1789/tcp

The -m flag has the value ANONYMOUS, saying no authentication is required.

Starting up the server showed:

trying 2, 1, 6
trying 10, 1, 6
bind: Address already in use

This last line looks like a failure, but as we will see, it is not. I ignored it to start.

To test a connection to it, I ran the following in a second terminal window.

sasl2-sample-client -p 1789  -m ANONYMOUS localhost

Here is what that session looked like:

$ sasl2-sample-client -p 1789  -m ANONYMOUS localhost
receiving capability list... recv: {9}
please enter an authorization id: ADMIYO
using mechanism ANONYMOUS
send: {9}
send: {1}
send: {21}
waiting for server reply...
successful authentication
closing connection

Note that I was prompted for the authorization id and I entered the string ‘ADMIYO’. I intentionally chose something that I would not expect to be a standard part of the output, so I can see the effect I am having. Here is the server side of the communication as logged.

accepted new connection
forcing use of mechanism ANONYMOUS
send: {9}
waiting for client mechanism...
recv: {9}
recv: {1}
recv: {21}
negotiation complete
successful authentication 'anonymous'
closing connection

Let’s take a look at the (virtual) wire by running tcpdump like this:

 sudo  tcpdump -i lo port 1789

For the first part of the interaction (prior to typing in the string ADMIYO), the output is:

12:02:42.201997 IP6 localhost.53196 > localhost.hello: Flags [S], seq 2530750333, win 43690, options [mss 65476,sackOK,TS val 1486702922 ecr 0,nop,wscale 7], length 0
12:02:42.202012 IP6 localhost.hello > localhost.53196: Flags [R.], seq 0, ack 2530750334, win 0, length 0
12:02:42.202053 IP localhost.50258 > localhost.hello: Flags [S], seq 2408359983, win 43690, options [mss 65495,sackOK,TS val 1486702922 ecr 0,nop,wscale 7], length 0
12:02:42.202067 IP localhost.hello > localhost.50258: Flags [S.], seq 11931919, ack 2408359984, win 43690, options [mss 65495,sackOK,TS val 1486702922 ecr 1486702922,nop,wscale 7], length 0

Once I type in ADMIYO and hit return in the client I see:

12:04:51.107447 IP localhost.50258 > localhost.hello: Flags [P.], seq 1:15, ack 15, win 342, options [nop,nop,TS val 1486831827 ecr 1486702922], length 14
12:04:51.107530 IP localhost.hello > localhost.50258: Flags [.], ack 15, win 342, options [nop,nop,TS val 1486831827 ecr 1486831827], length 0
12:04:51.107551 IP localhost.50258 > localhost.hello: Flags [P.], seq 15:21, ack 15, win 342, options [nop,nop,TS val 1486831827 ecr 1486831827], length 6
12:04:51.107563 IP localhost.hello > localhost.50258: Flags [.], ack 21, win 342, options [nop,nop,TS val 1486831827 ecr 1486831827], length 0

Let’s see if the server can correctly translate the port for the “hello” service.


$ /usr/bin/sasl2-sample-server  -h localhost -s hello  -m ANONYMOUS

TCP dump shows the following output:

12:06:57.628798 IP6 localhost.53252 > localhost.hello: Flags [S], seq 2637706072, win 43690, options [mss 65476,sackOK,TS val 1486958349 ecr 0,nop,wscale 7], length 0
12:06:57.628815 IP6 localhost.hello > localhost.53252: Flags [R.], seq 0, ack 2637706073, win 0, length 0
12:06:57.628859 IP localhost.50314 > localhost.hello: Flags [S], seq 1432008138, win 43690, options [mss 65495,sackOK,TS val 1486958349 ecr 0,nop,wscale 7], length 0
12:06:57.628875 IP localhost.hello > localhost.50314: Flags [R.], seq 0, ack 1432008139, win 0, length 0
12:07:21.065692 IP6 localhost.53262 > localhost.hello: Flags [S], seq 1562244294, win 43690, options [mss 65476,sackOK,TS val 1486981785 ecr 0,nop,wscale 7], length 0
12:07:21.065712 IP6 localhost.hello > localhost.53262: Flags [R.], seq 0, ack 1562244295, win 0, length 0
12:07:21.065775 IP localhost.50324 > localhost.hello: Flags [S], seq 4166967599, win 43690, options [mss 65495,sackOK,TS val 1486981786 ecr 0,nop,wscale 7], length 0
12:07:21.065791 IP localhost.hello > localhost.50324: Flags [R.], seq 0, ack 4166967600, win 0, length 0

Note that I had to change how I called the client to:

$ sasl2-sample-client -s hello  -m ANONYMOUS localhost

Why is that? My suspicion is that the Service name is part of the SASL handshake. Let’s see if we can find out. To start, let’s tell tcpdump to dump the contents of the packets out in hex and ascii:

sudo  tcpdump -XX -i lo port 1789

Running both the server and the client with the explicit port assigned I get the following dump:

12:12:08.992969 IP6 localhost.53316 > localhost.hello: Flags [S], seq 2611436863, win 43690, options [mss 65476,sackOK,TS val 1487269713 ecr 0,nop,wscale 7], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 86dd 6000  ..............`.
	0x0010:  8995 0028 0640 0000 0000 0000 0000 0000  ...(.@..........
	0x0020:  0000 0000 0001 0000 0000 0000 0000 0000  ................
	0x0030:  0000 0000 0001 d044 06fd 9ba7 5d3f 0000  .......D....]?..
	0x0040:  0000 a002 aaaa 0030 0000 0204 ffc4 0402  .......0........
	0x0050:  080a 58a5 ef51 0000 0000 0103 0307       ..X..Q........
12:12:08.992986 IP6 localhost.hello > localhost.53316: Flags [R.], seq 0, ack 2611436864, win 0, length 0
	0x0000:  0000 0000 0000 0000 0000 0000 86dd 6007  ..............`.
	0x0010:  bb57 0014 0640 0000 0000 0000 0000 0000  .W...@..........
	0x0020:  0000 0000 0001 0000 0000 0000 0000 0000  ................
	0x0030:  0000 0000 0001 06fd d044 0000 0000 9ba7  .........D......
	0x0040:  5d40 5014 0000 001c 0000                 ]@P.......
12:12:08.993035 IP localhost.50378 > localhost.hello: Flags [S], seq 613533991, win 43690, options [mss 65495,sackOK,TS val 1487269713 ecr 0,nop,wscale 7], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  003c 2676 4000 4006 1644 7f00 0001 7f00  .<&v@.@..D......
	0x0020:  0001 c4ca 06fd 2491 c927 0000 0000 a002  ......$..'......
	0x0030:  aaaa fe30 0000 0204 ffd7 0402 080a 58a5  ...0..........X.
	0x0040:  ef51 0000 0000 0103 0307                 .Q........
12:12:08.993053 IP localhost.hello > localhost.50378: Flags [S.], seq 561556928, ack 613533992, win 43690, options [mss 65495,sackOK,TS val 1487269713 ecr 1487269713,nop,wscale 7], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  003c 0000 4000 4006 3cba 7f00 0001 7f00  .<..@.@.<.......
	0x0020:  0001 06fd c4ca 2178 adc0 2491 c928 a012  ......!x..$..(..
	0x0030:  aaaa fe30 0000 0204 ffd7 0402 080a 58a5  ...0..........X.
	0x0040:  ef51 58a5 ef51 0103 0307                 .QX..Q....
12:12:11.741135 IP localhost.50378 > localhost.hello: Flags [P.], seq 1:15, ack 15, win 342, options [nop,nop,TS val 1487272461 ecr 1487269713], length 14
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0042 2679 4000 4006 163b 7f00 0001 7f00  .B&y@.@..;......
	0x0020:  0001 c4ca 06fd 2491 c928 2178 adcf 8018  ......$..(!x....
	0x0030:  0156 fe36 0000 0101 080a 58a5 fa0d 58a5  .V.6......X...X.
	0x0040:  ef51 7b39 7d0d 0a41 4e4f 4e59 4d4f 5553  .Q{9}..ANONYMOUS
12:12:11.741183 IP localhost.hello > localhost.50378: Flags [.], ack 15, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 4291 4000 4006 fa30 7f00 0001 7f00  .4B.@.@..0......
	0x0020:  0001 06fd c4ca 2178 adcf 2491 c936 8010  ......!x..$..6..
	0x0030:  0156 fe28 0000 0101 080a 58a5 fa0d 58a5  .V.(......X...X.
	0x0040:  fa0d                                     ..
12:12:11.741193 IP localhost.50378 > localhost.hello: Flags [P.], seq 15:48, ack 15, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 33
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0055 267a 4000 4006 1627 7f00 0001 7f00  .U&z@.@..'......
	0x0020:  0001 c4ca 06fd 2491 c936 2178 adcf 8018  ......$..6!x....
	0x0030:  0156 fe49 0000 0101 080a 58a5 fa0d 58a5  .V.I......X...X.
	0x0040:  fa0d 7b31 7d0d 0a59 7b32 317d 0d0a 4144  ..{1}..Y{21}..AD
	0x0050:  4d49 594f 4061 796f 756e 6735 3431 2e74  MIYO@ayoung541.t
	0x0060:  6573 74                                  est
12:12:11.741198 IP localhost.hello > localhost.50378: Flags [.], ack 48, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 4292 4000 4006 fa2f 7f00 0001 7f00  .4B.@.@../......
	0x0020:  0001 06fd c4ca 2178 adcf 2491 c957 8010  ......!x..$..W..
	0x0030:  0156 fe28 0000 0101 080a 58a5 fa0d 58a5  .V.(......X...X.
	0x0040:  fa0d                                     ..
12:12:11.741248 IP localhost.hello > localhost.50378: Flags [P.], seq 15:16, ack 48, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 1
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0035 4293 4000 4006 fa2d 7f00 0001 7f00  .5B.@.@..-......
	0x0020:  0001 06fd c4ca 2178 adcf 2491 c957 8018  ......!x..$..W..
	0x0030:  0156 fe29 0000 0101 080a 58a5 fa0d 58a5  .V.)......X...X.
	0x0040:  fa0d 4f                                  ..O
12:12:11.741260 IP localhost.50378 > localhost.hello: Flags [.], ack 16, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 267b 4000 4006 1647 7f00 0001 7f00  .4&{@.@..G......
	0x0020:  0001 c4ca 06fd 2491 c957 2178 add0 8010  ......$..W!x....
	0x0030:  0156 fe28 0000 0101 080a 58a5 fa0d 58a5  .V.(......X...X.
	0x0040:  fa0d                                     ..
12:12:11.741263 IP localhost.hello > localhost.50378: Flags [F.], seq 16, ack 48, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 4294 4000 4006 fa2d 7f00 0001 7f00  .4B.@.@..-......
	0x0020:  0001 06fd c4ca 2178 add0 2491 c957 8011  ......!x..$..W..
	0x0030:  0156 fe28 0000 0101 080a 58a5 fa0d 58a5  .V.(......X...X.
	0x0040:  fa0d                                     ..
12:12:11.741285 IP localhost.hello > localhost.50378: Flags [.], ack 49, win 342, options [nop,nop,TS val 1487272461 ecr 1487272461], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 4295 4000 4006 fa2c 7f00 0001 7f00  .4B.@.@..,......
	0x0020:  0001 06fd c4ca 2178 add1 2491 c958 8010  ......!x..$..X..
	0x0030:  0156 fe28 0000 0101 080a 58a5 fa0d 58a5  .V.(......X...X.
	0x0040:  fa0d                                     ..

But running with -s hello shows nothing. Is it running on a different port? Let's use lsof to check. First, run the server with the -s hello flag set. Then run lsof to see what is going on:

$ ps -ef | grep sasl
ayoung    2513 25933  0 12:14 pts/1    00:00:00 /usr/bin/sasl2-sample-server -h localhost -s hello -m ANONYMOUS
$ sudo lsof -p 2513  | grep TCP
sasl2-sam 2513 ayoung    3u  IPv4 26451981      0t0      TCP *:italk (LISTEN)
$ getent services italk
italk                 12345/tcp

Let’s see if tcpdump can confirm. Run it like this:

$ sudo  tcpdump -XX -i lo port 12345

And after running both server and client with -s hello I see

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on lo, link-type EN10MB (Ethernet), capture size 262144 bytes

12:18:48.995740 IP6 localhost.38730 > localhost.italk: Flags [S], seq 2085322154, win 43690, options [mss 65476,sackOK,TS val 1487669716 ecr 0,nop,wscale 7], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 86dd 600a  ..............`.
	0x0010:  8706 0028 0640 0000 0000 0000 0000 0000  ...(.@..........
	0x0020:  0000 0000 0001 0000 0000 0000 0000 0000  ................
	0x0030:  0000 0000 0001 974a 3039 7c4b 7daa 0000  .......J09|K}...
	0x0040:  0000 a002 aaaa 0030 0000 0204 ffc4 0402  .......0........
	0x0050:  080a 58ac 09d4 0000 0000 0103 0307       ..X...........
12:18:48.995764 IP6 localhost.italk > localhost.38730: Flags [R.], seq 0, ack 2085322155, win 0, length 0
	0x0000:  0000 0000 0000 0000 0000 0000 86dd 600f  ..............`.
	0x0010:  e905 0014 0640 0000 0000 0000 0000 0000  .....@..........
	0x0020:  0000 0000 0001 0000 0000 0000 0000 0000  ................
	0x0030:  0000 0000 0001 3039 974a 0000 0000 7c4b  ......09.J....|K
	0x0040:  7dab 5014 0000 001c 0000                 }.P.......
12:18:48.995808 IP localhost.45714 > localhost.italk: Flags [S], seq 4246244983, win 43690, options [mss 65495,sackOK,TS val 1487669716 ecr 0,nop,wscale 7], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  003c 87a3 4000 4006 b516 7f00 0001 7f00  .<..@.@.........
	0x0020:  0001 b292 3039 fd18 8e77 0000 0000 a002  ....09...w......
	0x0030:  aaaa fe30 0000 0204 ffd7 0402 080a 58ac  ...0..........X.
	0x0040:  09d4 0000 0000 0103 0307                 ..........
12:18:48.995820 IP localhost.italk > localhost.45714: Flags [S.], seq 1101043017, ack 4246244984, win 43690, options [mss 65495,sackOK,TS val 1487669716 ecr 1487669716,nop,wscale 7], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  003c 0000 4000 4006 3cba 7f00 0001 7f00  .<..@.@.<.......
	0x0020:  0001 3039 b292 41a0 9549 fd18 8e78 a012  ..09..A..I...x..
	0x0030:  aaaa fe30 0000 0204 ffd7 0402 080a 58ac  ...0..........X.
	0x0040:  09d4 58ac 09d4 0103 0307                 ..X.......
12:18:52.072280 IP localhost.45714 > localhost.italk: Flags [P.], seq 1:15, ack 15, win 342, options [nop,nop,TS val 1487672792 ecr 1487669716], length 14
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0042 87a6 4000 4006 b50d 7f00 0001 7f00  .B..@.@.........
	0x0020:  0001 b292 3039 fd18 8e78 41a0 9558 8018  ....09...xA..X..
	0x0030:  0156 fe36 0000 0101 080a 58ac 15d8 58ac  .V.6......X...X.
	0x0040:  09d4 7b39 7d0d 0a41 4e4f 4e59 4d4f 5553  ..{9}..ANONYMOUS
12:18:52.072343 IP localhost.italk > localhost.45714: Flags [.], ack 15, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 9c9f 4000 4006 a022 7f00 0001 7f00  .4..@.@.."......
	0x0020:  0001 3039 b292 41a0 9558 fd18 8e86 8010  ..09..A..X......
	0x0030:  0156 fe28 0000 0101 080a 58ac 15d8 58ac  .V.(......X...X.
	0x0040:  15d8                                     ..
12:18:52.072358 IP localhost.45714 > localhost.italk: Flags [P.], seq 15:48, ack 15, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 33
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0055 87a7 4000 4006 b4f9 7f00 0001 7f00  .U..@.@.........
	0x0020:  0001 b292 3039 fd18 8e86 41a0 9558 8018  ....09....A..X..
	0x0030:  0156 fe49 0000 0101 080a 58ac 15d8 58ac  .V.I......X...X.
	0x0040:  15d8 7b31 7d0d 0a59 7b32 317d 0d0a 4144  ..{1}..Y{21}..AD
	0x0050:  4d49 594f 4061 796f 756e 6735 3431 2e74  MIYO@ayoung541.t
	0x0060:  6573 74                                  est
12:18:52.072366 IP localhost.italk > localhost.45714: Flags [.], ack 48, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 9ca0 4000 4006 a021 7f00 0001 7f00  .4..@.@..!......
	0x0020:  0001 3039 b292 41a0 9558 fd18 8ea7 8010  ..09..A..X......
	0x0030:  0156 fe28 0000 0101 080a 58ac 15d8 58ac  .V.(......X...X.
	0x0040:  15d8                                     ..
12:18:52.072464 IP localhost.italk > localhost.45714: Flags [P.], seq 15:16, ack 48, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 1
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0035 9ca1 4000 4006 a01f 7f00 0001 7f00  .5..@.@.........
	0x0020:  0001 3039 b292 41a0 9558 fd18 8ea7 8018  ..09..A..X......
	0x0030:  0156 fe29 0000 0101 080a 58ac 15d8 58ac  .V.)......X...X.
	0x0040:  15d8 4f                                  ..O
12:18:52.072494 IP localhost.45714 > localhost.italk: Flags [.], ack 16, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 87a8 4000 4006 b519 7f00 0001 7f00  .4..@.@.........
	0x0020:  0001 b292 3039 fd18 8ea7 41a0 9559 8010  ....09....A..Y..
	0x0030:  0156 fe28 0000 0101 080a 58ac 15d8 58ac  .V.(......X...X.
	0x0040:  15d8                                     ..
12:18:52.072501 IP localhost.italk > localhost.45714: Flags [F.], seq 16, ack 48, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 9ca2 4000 4006 a01f 7f00 0001 7f00  .4..@.@.........
	0x0020:  0001 3039 b292 41a0 9559 fd18 8ea7 8011  ..09..A..Y......
	0x0030:  0156 fe28 0000 0101 080a 58ac 15d8 58ac  .V.(......X...X.
	0x0040:  15d8                                     ..
12:18:52.072529 IP localhost.italk > localhost.45714: Flags [.], ack 49, win 342, options [nop,nop,TS val 1487672792 ecr 1487672792], length 0
	0x0000:  0000 0000 0000 0000 0000 0000 0800 4500  ..............E.
	0x0010:  0034 9ca3 4000 4006 a01e 7f00 0001 7f00  .4..@.@.........
	0x0020:  0001 3039 b292 41a0 955a fd18 8ea8 8010  ..09..A..Z......
	0x0030:  0156 fe28 0000 0101 080a 58ac 15d8 58ac  .V.(......X...X.
	0x0040:  15d8                                     ..

As a final test, let’s see what happens when I tell the client to use that port explicitly. Running:

 sasl2-sample-client -p 12345 -m ANONYMOUS localhost

Generates the proper output:

receiving capability list... recv: {9}
please enter an authorization id: ADMIYO
using mechanism ANONYMOUS
send: {9}
send: {1}
send: {21}
waiting for server reply...
successful authentication
closing connection

Translating Between RDO/RHOS and upstream releases Redux

Posted by Adam Young on October 04, 2016 05:47 PM

I posted this once before, but we’ve moved on a bit since then. So, an update.


upstream = ['Austin', 'Bexar', 'Cactus', 'Diablo', 'Essex (Tag 2012.1)', 'Folsom (Tag 2012.2)',
 'Grizzly (Tag 2013.1)', 'Havana (Tag 2013.2) ', 'Icehouse (Tag 2014.1) ', 'Juno (Tag 2014.2) ', 'Kilo (Tag 2015.1) ', 'Liberty',
 'Mitaka', 'Newton', 'Ocata', 'Pike', 'Queens', 'R', 'S']

for v in range(0, len(upstream) - 3):
 print "RHOS Version %s = upstream %s" % (v, upstream[v + 3])


RHOS Version 0 = upstream Diablo
RHOS Version 1 = upstream Essex (Tag 2012.1)
RHOS Version 2 = upstream Folsom (Tag 2012.2)
RHOS Version 3 = upstream Grizzly (Tag 2013.1)
RHOS Version 4 = upstream Havana (Tag 2013.2)
RHOS Version 5 = upstream Icehouse (Tag 2014.1)
RHOS Version 6 = upstream Juno (Tag 2014.2)
RHOS Version 7 = upstream Kilo (Tag 2015.1)
RHOS Version 8 = upstream Liberty
RHOS Version 9 = upstream Mitaka
RHOS Version 10 = upstream Newton
RHOS Version 11 = upstream Ocata
RHOS Version 12 = upstream Pike
RHOS Version 13 = upstream Queens
RHOS Version 14 = upstream R
RHOS Version 15 = upstream S

Tags in the Git repos are a little different.

  • For Essex through Kilo, the releases are tagged based on their code names
  • 2011 was weird.  We don’t talk about that.
  • From 2012 through 2015, the release tags are based on the date of the release: the year, then the release number within that year. So the first release in 2012 is 2012.1, and 2012.3 does not exist, which is why we don’t talk about 2011.3.
  • From Liberty onward, the upstream major number matches the RDO and RHOS version, so upstream 8 is RDO/RHOS 8. Subnumbers are for stable releases and may not match the downstream releases; once things go stable, it is a downstream decision when to sync. Thus, we have tags that start with 8, 9, and 10 mapping to Liberty, Mitaka, and Newton.
  • When Ocata is cut, we’ll go to 11, leading to lots of Spinal Tap references

docker-selinux changed to container-selinux

Posted by Dan Walsh on October 04, 2016 12:33 PM
Changing upstream packages

I have decided to change the docker SELinux policy package on github.com from docker-selinux to container-selinux


The main reason I did this was seeing the following on Twitter: Docker, Inc. is requesting that people not use the docker prefix for packages on github.


Since the policy for container-selinux can be used for more container runtimes than just docker, this seems like a good idea.  I plan on using it for OCID, and would consider plugging it into the RKT CRI.

I have modified all of the types inside of the policy to container_*.  For instance docker_t is now container_runtime_t and docker_exec_t is container_runtime_exec_t.

I have taken advantage of the typealias capability of SELinux policy to allow the types to be preserved over an upgrade.

typealias container_runtime_t alias docker_t;
typealias container_runtime_exec_t alias docker_exec_t;

This means people can continue to use docker_t and docker_exec_t with their tools, but the kernel will automatically translate them to the primary names container_runtime_t and container_runtime_exec_t.
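If you want to see the alias in action on a running system, here is a quick sketch; the dockerd process and binary names are assumptions and vary by distribution and docker version:

# Show the domain the docker daemon is actually running in; with the new policy
# this should report container_runtime_t, even where older docs mention docker_t.
ps -eZ | grep dockerd

# Show the label on the daemon binary; the primary type name is printed,
# even if it was originally assigned via the docker_exec_t alias.
ls -Z /usr/bin/dockerd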

This policy is arriving today in rawhide in the container-selinux.rpm which obsoletes the docker-selinux.rpm.  Once we are confident about the upgrade path, we will be rolling out the new packaging to Fedora and eventually to RHEL and CentOS.

Changing the types associated with container processes.

Secondarily, I have begun to change the type names for running containers.  Way back when I wrote the first policy for containers, we were using libvirt_lxc for launching containers, and we already had types defined for VMs launched out of libvirt.  VMs were labeled svirt_t.  When I decided to extend the policy for containers, I extended svirt with lxc, giving svirt_lxc, but I also wanted to show that it had full network access: svirt_lxc_net_t.  I labeled the content inside of the container svirt_sandbox_file_t.

Bad names...

Once containers exploded on the scene with the arrival of docker, I knew I had made a mistake choosing the types associated with container processes.  Time to clean this up.  I have submitted pull requests into selinux-policy to change these types to container_t and container_image_t.

typealias container_t alias svirt_lxc_net_t;
typealias container_image_t alias svirt_sandbox_file_t;

The old types will still work due to typealias, but I think it will become a lot easier for people to understand the SELinux types with simpler names.  There is a lot of documentation and "google" knowledge out there about svirt_lxc_net_t and svirt_sandbox_file_t, which we can modify over time.

Luckily I have a chance at a do-over.

Episode 7 - More Powerful than root!

Posted by Open Source Security Podcast on October 03, 2016 09:05 PM
Kurt and Josh discuss the ORWL computer, crashing systemd with one line, NIST, and a security journal.

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/285901909&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

What is the spc_t container type, and why didn't we just run as unconfined_t?

Posted by Dan Walsh on October 03, 2016 05:00 PM
What is spc_t?

SPC stands for Super Privileged Container: a container that holds software used to manage the host system it runs on.  Since these containers could do anything on the system, and we don't want SELinux blocking any access, we made spc_t an unconfined domain.

If you are on an SELinux system, and run docker with SELinux separation turned off, the containers will run with the spc_t type.

You can disable SELinux container separation in docker in multiple different ways.

  • You don't build docker from scratch with the BUILDTAG=selinux flag.

  • You run the docker daemon without the --selinux-enabled flag

  • You run a container with the --security-opt label:disable flag

          docker run -ti --security-opt label:disable fedora sh

  • You share the PID namespace or IPC namespace with the host

         docker run -ti --pid=host --ipc=host fedora sh
Note: we have to disable SELinux separation with --ipc=host and --pid=host because it would otherwise block access to processes or the IPC mechanisms on the host.
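A quick way to confirm which domain a container actually lands in is to read the process label from inside it; a sketch, assuming an SELinux-enabled host:

# With labeling disabled the container process should report spc_t ...
docker run -ti --security-opt label:disable fedora cat /proc/self/attr/current

# ... while a normally labeled container reports svirt_lxc_net_t (or container_t
# with the renamed policy).
docker run -ti fedora cat /proc/self/attr/current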

Why not use unconfined_t?

The question that comes up is: why not just run as unconfined_t?  A lot of people falsely assume that unconfined_t is the only unconfined domain.  But unconfined_t is a user domain.  We block most confined domains from communicating with the unconfined_t domain, since this is probably the domain that the administrator is running with.

What is different about spc_t?

First off, the type docker runs as (docker_t) can transition to spc_t; it is not allowed to transition to unconfined_t. It transitions to this domain when it executes programs located under /var/lib/docker:

# sesearch -T -s docker_t | grep spc_t
   type_transition container_t docker_share_t : process spc_t;
   type_transition container_t docker_var_lib_t : process spc_t;
   type_transition container_t svirt_sandbox_file_t : process spc_t;

Secondly, and most importantly, confined domains are allowed to connect to unix domain sockets created by processes running as spc_t.

This means I could run a service as a container process, have it create a socket under /run on the host system, and other confined domains on the host could communicate with the service.

For example if you wanted to create a container that runs sssd, and wanted to allow confined domains to be able to get passwd information from it, you could run it as spc_t and the confined login programs would be able to use it.
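A hedged sketch of what that could look like; the image name and the bind mounts are assumptions, not a tested recipe:

# Run sssd in a container as spc_t (labeling disabled) and share its pipes
# directory with the host so confined login programs can reach its sockets.
docker run -d --security-opt label:disable \
    -v /var/lib/sss/pipes:/var/lib/sss/pipes \
    -v /etc/sssd:/etc/sssd:ro \
    my-sssd-image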


Sometimes you want to create an unconfined domain that one or more confined domains are allowed to communicate with. In this situation it is usually better to create a new domain rather than reusing unconfined_t.

Impossible is impossible!

Posted by Josh Bressers on October 03, 2016 02:00 PM
Sometimes when you plan for security, the expectation is that whatever you're doing will make some outcome (probably something bad) impossible. The goal of the security group is to keep the bad guys out, or keep the data in, or keep the servers patched, or find all the security bugs in the code. One way to look at this is that security is often in the business of preventing things from happening, such as making data exfiltration impossible. I'm here to tell you it's impossible to make something impossible.

As you think about that statement for a bit, let me explain what's happening here, and how we're going to tie this back to security, business needs, and some common sense. We've all heard of the 80/20 rule; one form of it is that the last 20% of the features are 80% of the cost. It's a bit more nuanced than that if you really think about it. If your goal is impossible, it would be more accurate to say the last 1% of the features is 2000% of the cost. What's really being described here is a curve that looks like this:
You can't make it to 100%, no matter how much you spend. This of course means there's no point in trying to reach 100%; more importantly, you have to realize you can't get there. If you're smart you'll put your feature set somewhere around 80%; anything above that is probably a waste of money. If you're really clever there is some sort of optimal point to be investing resources, and that's where you really want to be. 80% is probably a solid first pass though, and it's an easy number to remember.

The important thing to remember is that 100% is impossible. The curve never reaches 100%. Ever.

The thinking behind this came about while I was discussing DRM with someone. No matter what sort of DRM gets built, someone will break it. DRM is built by a person which means, by definition, a smarter person can break it. It can't be 100%, in some cases it's not even 80%. But when a lot of people or groups think about DRM, the goal is to make acquiring the movie or music or whatever 100% impossible. They even go so far as to play the cat and mouse game constantly. Every time a researcher manages to break the DRM, they fix it, the researcher breaks it, they fix it, continue this forever.

Here's the question about the above graph though. Where is the break even point? Every project has a point of diminishing returns. A lot of security projects forget that if the cost of what you're doing is greater than the cost of the thing you're trying to protect, you're wasting resources. Never forget that there is such a thing as negative value. Doing things that don't matter often create negative value.

This is easiest to explain in the context of ransomware. If you're spending $2000 to protect yourself from a ransomware invasion that will cost $300, that's a bad investment. As crime inc. continues to evolve I imagine they will keep a lot of this in mind, if they can keep their damage low, there won't be a ton of incentive for security spending, which helps them grow their business. That's a topic for another day though.

The summary of all this is that perfect security doesn't exist. It might never exist (never say never though). You have to accept good enough security. And more often than not, good enough is close enough to perfect that it gets the job done.

Comment on Twitter

Episode 6 - Foundational Knowledge of Security

Posted by Open Source Security Podcast on September 29, 2016 07:03 PM
Kurt and Josh discuss interesting news stories

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/285305681&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

Episode 5 - OpenSSL: The library we deserve

Posted by Open Source Security Podcast on September 29, 2016 12:39 AM
Kurt and Josh discuss the recent OpenSSL update(s)

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/285193058&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

Who left all this fire everywhere?

Posted by Josh Bressers on September 26, 2016 02:00 PM
If you're paying attention, you saw the news about Yahoo's breach. Five hundred million accounts. That's a whole lot of data if you think about it.  But here's the thing. If you're a security person, are you surprised by this? If you are, you've not been paying attention.

It's pretty well accepted that there are two types of large infrastructures. Those who know they've been hacked, and those who don't yet know they've been hacked. Any group as large as Yahoo probably has more attackers inside their infrastructure than anyone really wants to think about. This is certainly true of every single large infrastructure and cloud provider and consumer out there. Think about that for a little bit. If you're part of a large infrastructure, you have threat actors inside your network right now, probably more than you think.

There are two really important things to think about.

Firstly, if you have any sort of important data, and it's not well protected, odds are very high that it's left your network. Remember that not every hack gets leaked in public, sometimes you'll never find out. On that note, if anyone has any data on what percentage of compromises leaked I'd love to know.

The most important thing is around how we need to build infrastructure with a security mindset. This is a place public cloud actually has an advantage. If you have a deployment in a public cloud, you're naturally going to be less trusting of the machines than you would be if they were in racks you can see. Neither is really any safer; it's just that you trust one less, which will result in a more secure infrastructure. Gone are the days when having a nice firewall is all the security you need.

Now every architect should assume whatever they're doing has bad actors on the network and in the machines. If you keep this in mind, it really changes how you do things. Storing lots of sensitive data in the same place isn't wise. Break things apart when you can. Make sure data is encrypted as much as possible. Plan for failure, have you done an exercise where you assume the worst then decide what you do next? This is the new reality we have to exist in. It'll take time to catch up of course, but there's not really a choice. This is one of those change or die situations. Nobody can afford to ignore the problems around leaking sensitive data for much longer. The times, they are a changin.

Leave your comments on Twitter: @joshbressers

Importing a Public SSH Key

Posted by Adam Young on September 22, 2016 03:54 PM

Rex was setting up a server and wanted some help.  His hosting provider had set him up with a username and password for authentication. He wanted me to log in to the machine under his account to help out.  I didn’t want him to have to give me his password.  Rex is a smart guy, but he is not a Linux user.  He is certainly not a system administrator.  The system was CentOS.  The process was far more difficult to walk him through than it should have been.

CORRECTION: I had the keys swapped. It is important to keep the private key private, and that is the one in $HOME/.ssh/id_rsa

I use public key cryptography all the time to log in to remote systems.  The OpenSSH client uses a keypair that is stored on my laptop under $HOME/.ssh.  The public key is in $HOME/.ssh/id_rsa.pub and the private one is in $HOME/.ssh/id_rsa.  In order for the ssh command to use this keypair to authenticate me when I try to login, the key stored in $HOME/.ssh/id_rsa.pub first needs to be copied to the remote machine’s $HOME/.ssh/authorized_keys file.  If the permissions on this file are wrong, or the permissions on the directory $HOME/.ssh are wrong, ssh will refuse my authentication attempt.

Trying to work this out over chat with someone unfamiliar with the process was frustrating.

This is what the final product looks like.

rex@drmcs [~]# ls -la $HOME/.ssh/
total 12
drwx------ 2 rex rex 4096 Sep 21 13:01 ./
drwx------ 9 rex rex 4096 Sep 21 13:28 ../
-rw------- 1 rex rex  421 Sep 21 13:01 authorized_keys

This should be scriptable. Something like the following, where the variable values and the location of my public key file are assumptions:

SSH_DIR=$HOME/.ssh
AUTHN_FILE=$SSH_DIR/authorized_keys

mkdir -p $SSH_DIR
chmod 700 $SSH_DIR
# Append my public key (placeholder path) and lock down the permissions.
cat ./id_rsa.pub >> $AUTHN_FILE
chmod 600 $AUTHN_FILE
exit 0

However, it occurred to me that he really should not even be adding me to his account; instead, he should be creating a separate account for me and only giving me access to that, which would let me look around but not touch. Second attempt, again with the variable values assumed:


NEW_USER=admiyo
SSH_DIR=/home/$NEW_USER/.ssh
AUTHN_FILE=$SSH_DIR/authorized_keys

/usr/sbin/useradd $NEW_USER

mkdir -p $SSH_DIR
chmod 700 $SSH_DIR
touch $AUTHN_FILE
# (my public key then gets appended to $AUTHN_FILE as in the first script)
chmod 600 $AUTHN_FILE
# The new user must own the directory and file or sshd will refuse the key.
chown -R $NEW_USER:$NEW_USER $SSH_DIR
exit 0
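A hedged usage sketch; the script name and hostname are placeholders:

# Rex runs the second script as root on his server ...
sudo sh add-admiyo.sh
# ... and then I can log in with my own key, never seeing his password.
ssh admiyo@rex.example.com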


To clean up the account when I am done, Rex can run:

sudo /usr/sbin/userdel -r admiyo

Which will not only remove my account, but also the directory /home/admiyo.
If I have left a login open, he will see:

userdel: user admiyo is currently used by process 3561

Episode 4 - Dead squirrel in a box

Posted by Open Source Security Podcast on September 21, 2016 03:24 AM
Josh and Kurt discuss news of the day, shipping, and container security

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/283885003&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

Distinct RBAC Policy Rules

Posted by Adam Young on September 21, 2016 01:52 AM

The ever elusive bug 968696 is still out there, due, in no small part, to the distributed nature of the policy mechanism. One question I asked myself as I chased this beastie is “how many distinct policy rules do we actually have to implement?” This is an interesting question because, if we can find an automated way to answer it, that can lead to an automated way of transforming the policy rules themselves, and thus to a more unified approach to policy.

The set of policy files used in a Tripleo overcloud have around 1400 rules:

$ find /tmp/policy -name \*.json | xargs wc -l
   73 /tmp/policy/etc/sahara/policy.json
   61 /tmp/policy/etc/glance/policy.json
  138 /tmp/policy/etc/cinder/policy.json
   42 /tmp/policy/etc/gnocchi/policy.json
   20 /tmp/policy/etc/aodh/policy.json
   74 /tmp/policy/etc/ironic/policy.json
  214 /tmp/policy/etc/neutron/policy.json
  257 /tmp/policy/etc/nova/policy.json
  198 /tmp/policy/etc/keystone/policy.json
   18 /tmp/policy/etc/ceilometer/policy.json
  135 /tmp/policy/etc/manila/policy.json
    3 /tmp/policy/etc/heat/policy.json
   88 /tmp/policy/auth_token_scoped.json
  140 /tmp/policy/auth_v3_token_scoped.json
 1461 total

Granted, that might not be distinct rule lines, as some are multi-line, but most rules seem to be on a single line. There is some whitespace, too.

Many of the rules, while written differently, can map to the same implementation. For example:

“rule: False”

can reduce to


which is the same as


All are instances of oslo_policy._checks.FalseCheck.

With that in mind, I gathered up the set of policy files deployed on a Tripleo overcloud and hacked together some analysis.

Note: Nova embeds its policy rules in code now. In order to convert them to an old-style policy file, you need to run a command line tool:

oslopolicy-policy-generator --namespace nova --output-file /tmp/policy/etc/nova/policy.json

Ironic does something similar, but uses

oslopolicy-sample-generator --namespace=ironic.api --output-file=/tmp/policy/etc/ironic/policy.json

I’ve attached my source code at the bottom of this article. Running the code provides the following summary:

55 unique rules found

The longest rule belongs to Ironic:

OR(OR(OR((ROLE:admin)(ROLE:administrator))AND(OR((tenant == demo)(tenant == baremetal))(ROLE:baremetal_admin)))AND(OR((tenant == demo)(tenant == baremetal))OR((ROLE:observer)(ROLE:baremetal_observer))))

Some look somewhat repetitive, such as

OR((ROLE:admin)(is_admin == 1))

And some downright dangerous:

NOT( (ROLE:heat_stack_user)

As there are ways to work around having an explicit role in your token, denying access based on the absence of a role is risky.

Many are indications of places where we want to use implied roles, such as:

  1. OR((ROLE:admin)(ROLE:administrator))
  2. OR((ROLE:admin)(ROLE:advsvc)
  3. (ROLE:admin)
  4. (ROLE:advsvc)
  5. (ROLE:service)


This is the set of keys that appear more than one time:

9 context_is_admin
4 admin_api
2 owner
6 admin_or_owner
2 service:index
2 segregation
7 default

Doing a grep for context_is_admin shows all of them with the following rule:

"context_is_admin": "role:admin",

admin_api is roughly the same:

cinder/policy.json: "admin_api": "is_admin:True",
ironic/policy.json: "admin_api": "role:admin or role:administrator"
nova/policy.json:   "admin_api": "is_admin:True"
manila/policy.json: "admin_api": "is_admin:True",

I think these are supposed to include the new check for is_admin_project as well.

Owner is defined two different ways in two files:

neutron/policy.json:  "owner": "tenant_id:%(tenant_id)s",
keystone/policy.json: "owner": "user_id:%(user_id)s",

Keystone’s meaning is that the user matches, whereas neutron’s is a project scope check. Both rules should change.

Admin or owner shows the same variety:

cinder/policy.json:    "admin_or_owner": "is_admin:True or project_id:%(project_id)s",
aodh/policy.json:      "admin_or_owner": "rule:context_is_admin or project_id:%(project_id)s",
neutron/policy.json:   "admin_or_owner": "rule:context_is_admin or rule:owner",
nova/policy.json:      "admin_or_owner": "is_admin:True or project_id:%(project_id)s"
keystone/policy.json:  "admin_or_owner": "rule:admin_required or rule:owner",
manila/policy.json:    "admin_or_owner": "is_admin:True or project_id:%(project_id)s",

Keystone is the odd one out here, with owner again meaning “user matches.”

Segregation is another rule that means admin:

aodh/policy.json:       "segregation": "rule:context_is_admin",
ceilometer/policy.json: "segregation": "rule:context_is_admin",

Probably the trickiest one to deal with is default, as that is a magic term that is used when a rule is not defined:

sahara/policy.json:   "default": "",
glance/policy.json:   "default": "role:admin",
cinder/policy.json:   "default": "rule:admin_or_owner",
aodh/policy.json:     "default": "rule:admin_or_owner",
neutron/policy.json:  "default": "rule:admin_or_owner",
keystone/policy.json: "default": "rule:admin_required",
manila/policy.json:   "default": "rule:admin_or_owner",

There seem to be three catch all approaches:

  1. require admin,
  2. look for a project match but let admin override
  3. let anyone execute the API.

This is the only rule that cannot be made globally unique across all the files.

Here is the complete list of suffixes.  The format is not strict policy format; I munged it to look for duplicates.

(field == address_scopes:shared=True)
(field == networks:router:external=True)
(field == networks:shared=True)
(field == port:device_owner=~^network:)
(field == subnetpools:shared=True)
(group == nobody)
(is_admin == False)
(is_admin == True)
(is_public_api == True)
(project_id == %(project_id)s)
(project_id == %(resource.project_id)s)
(tenant_id == %(tenant_id)s)
(user_id == %(target.token.user_id)s)
(user_id == %(trust.trustor_user_id)s)
(user_id == %(user_id)s)
AND(OR((tenant == demo)(tenant == baremetal))OR((ROLE:observer)(ROLE:baremetal_observer)))
AND(OR(NOT( (field == rbac_policy:target_tenant=*) (ROLE:admin))OR((ROLE:admin)(tenant_id == %(tenant_id)s)))
NOT( (ROLE:heat_stack_user) 
OR((ROLE:admin)(is_admin == 1))
OR((ROLE:admin)(project_id == %(created_by_project_id)s))
OR((ROLE:admin)(project_id == %(project_id)s))
OR((ROLE:admin)(tenant_id == %(network:tenant_id)s))
OR((ROLE:admin)(tenant_id == %(tenant_id)s))
OR((ROLE:advsvc)OR((ROLE:admin)(tenant_id == %(network:tenant_id)s)))
OR((ROLE:advsvc)OR((tenant_id == %(tenant_id)s)OR((ROLE:admin)(tenant_id == %(network:tenant_id)s))))
OR((is_admin == True)(project_id == %(project_id)s))
OR((is_admin == True)(quota_class == %(quota_class)s))
OR((is_admin == True)(user_id == %(user_id)s))
OR((tenant == demo)(tenant == baremetal))
OR((tenant_id == %(tenant_id)s)OR((ROLE:admin)(tenant_id == %(network:tenant_id)s)))
OR(NOT( (field == port:device_owner=~^network:) (ROLE:advsvc)OR((ROLE:admin)(tenant_id == %(network:tenant_id)s)))
OR(NOT( (field == rbac_policy:target_tenant=*) (ROLE:admin))
OR(OR((ROLE:admin)(ROLE:administrator))AND(OR((tenant == demo)(tenant == baremetal))(ROLE:baremetal_admin)))
OR(OR((ROLE:admin)(is_admin == 1))(ROLE:service))
OR(OR((ROLE:admin)(is_admin == 1))(project_id == %(target.project.id)s))
OR(OR((ROLE:admin)(is_admin == 1))(token.project.domain.id == %(target.domain.id)s))
OR(OR((ROLE:admin)(is_admin == 1))(user_id == %(target.token.user_id)s))
OR(OR((ROLE:admin)(is_admin == 1))(user_id == %(user_id)s))
OR(OR((ROLE:admin)(is_admin == 1))AND((user_id == %(user_id)s)(user_id == %(target.credential.user_id)s)))
OR(OR((ROLE:admin)(project_id == %(created_by_project_id)s))(project_id == %(project_id)s))
OR(OR((ROLE:admin)(project_id == %(created_by_project_id)s))(project_id == %(resource.project_id)s))
OR(OR((ROLE:admin)(tenant_id == %(tenant_id)s))(ROLE:advsvc))
OR(OR((ROLE:admin)(tenant_id == %(tenant_id)s))(field == address_scopes:shared=True))
OR(OR((ROLE:admin)(tenant_id == %(tenant_id)s))(field == networks:shared=True)(field == networks:router:external=True)(ROLE:advsvc))
OR(OR((ROLE:admin)(tenant_id == %(tenant_id)s))(field == networks:shared=True))
OR(OR((ROLE:admin)(tenant_id == %(tenant_id)s))(field == subnetpools:shared=True))
OR(OR(OR((ROLE:admin)(ROLE:administrator))AND(OR((tenant == demo)(tenant == baremetal))(ROLE:baremetal_admin)))AND(OR((tenant == demo)(tenant == baremetal))OR((ROLE:observer)(ROLE:baremetal_observer))))
OR(OR(OR((ROLE:admin)(is_admin == 1))(ROLE:service))(user_id == %(target.token.user_id)s))

Here is the source code I used to analyze the policy files:

#!/usr/bin/env python

# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#    http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import sys

from oslo_serialization import jsonutils

from oslo_policy import policy
import oslo_policy._checks as _checks

def display_suffix(rules, rule):

    if isinstance (rule, _checks.RuleCheck):
        return display_suffix(rules, rules[rule.match.__str__()])

    if isinstance (rule, _checks.OrCheck):
        answer =  'OR('
        for subrule in rule.rules:
            answer += display_suffix(rules, subrule)
        answer +=  ')'
    elif isinstance (rule, _checks.AndCheck):
        answer =  'AND('
        for subrule in rule.rules:
            answer += display_suffix(rules, subrule)
        answer +=  ')'
    elif isinstance (rule, _checks.TrueCheck):
        answer =  "TRUE"
    elif isinstance (rule, _checks.FalseCheck):
        answer =  "FALSE"
    elif isinstance (rule, _checks.RoleCheck):       
        answer =  ("(ROLE:%s)" % rule.match)
    elif isinstance (rule, _checks.GenericCheck):       
        answer =  ("(%s == %s)" % (rule.kind, rule.match))
    elif isinstance (rule, _checks.NotCheck):
        # Render the negated sub-rule.
        answer =  'NOT( %s ' % display_suffix(rules, rule.rule)
    return answer

class Tool():
    def __init__(self):
        self.prefixes = dict()
        self.suffixes = dict()

    def add(self, policy_file):
        policy_data = policy_file.read()
        rules = policy.Rules.load(policy_data, "default")
        suffixes = []
        for key, rule in rules.items():
            suffix = display_suffix(rules, rule)
            self.prefixes[key] = self.prefixes.get(key, 0) + 1
            self.suffixes[suffix] = self.suffixes.get(suffix, 0) + 1

    def report(self):
        suffixes = sorted(self.suffixes.keys())
        for suffix in suffixes:
            print (suffix)
        print ("%d unique rules found" % len(suffixes))
        for prefix, count in self.prefixes.items():
            if count > 1:
                print ("%d %s" % (count, prefix))
def main(argv=sys.argv[1:]):
    tool = Tool()
    policy_dir = "/tmp/policy"
    name = 'policy.json'
    suffixes = []
    for root, dirs, files in os.walk(policy_dir):
        if name in files:
            policy_file_path = os.path.join(root, name)
            print (policy_file_path)
            policy_file = open(policy_file_path, 'r')
            tool.add(policy_file)
            policy_file.close()
    tool.report()


if __name__ == "__main__":
    main()

Is dialup still an option?

Posted by Josh Bressers on September 20, 2016 01:00 PM
TL;DR - No.

Here's why.

I was talking with my Open Source Security Podcast co-host Kurt Seifried about what it would be like to access the modern Internet using dialup. So I decided to give this a try. My first thought was to find a modem, but after looking into this, it isn't really an option anymore.

The setup

  • No Modem
  • Fedora 24 VM
  • Firefox as packaged with Fedora 24
  • Use traffic shaping via wondershaper to control the network speed (commands sketched after this list)
  • "App Telemetry" firefox plugin to time the site load time

I know it's not perfect, but it's probably close enough to get a feel for what's going on. I understand this doesn't exactly recreate a modem experience with details like compression, latency, and someone picking up the phone during a download. There was nothing worse than having that 1 megabyte download at 95% when someone decided they needed to make a phone call. Call waiting was also a terrible plague.

If you're too young to understand any of this, be thankful. Anyone who looks at this time with nostalgia is pretty clearly delusional.

I started testing at a 1024 Kb connection and halved my way down to 56 (instead of 64). This seemed like a nice way to get a feel for how these sites react as your speed shifts down.


I picked the most popular English language sites listed on the Alexa top 100. I added lwn.net because I like them, and my kids had me add Twitch. My home Internet connection is 50 Mb down, 5 Mb up. As you can see, in general all these sites load in less than 5 seconds. The numbers represent the site being fully loaded. Most web browsers seem to show something pretty quickly, even if the page is still loading. For the purpose of this test, our numbers are how long it takes a site to fully load. I also show 4 samples because as you'll see later on, some of these sites took a really really long time to load, so four was as much suffering as I could endure. Perhaps someday I'll do this again with extra automation so I don't have to be so involved.

1024 Kb/s

Things really started to go downhill at this point. Anyone who claims a 1 megabit connection is broadband has probably never tried to use such a connection. In general, though, most of the sites were usable, by a very narrow definition of the word.

512 Kb/s

You're going to want to start paying attention to Amazon, something really clever is going to happen, it's sort of noticeable in this graph. Also of note is how consistent bing.com is. While not the fastest site, it will remain extremely consistent through the entire test.

256 Kb/s

Here is where you can really see what Amazon is doing. They clearly have some sort of client side magic happening to ensure an acceptable response. For the rest of my testing I saw this behavior. A slow first load, then things were much much faster. Waiting for sites to load at this speed was really painful, it's only going to get worse from here. 15 seconds doesn't sound horrible, but it really is a long time to wait.

128 Kb

Things are not good at 128 Kb/s. Wikipedia looks empty, but it was still loading at the same speed as in our first test. I imagine my lack of an ad-enhanced experience with them helps keep it so speedy.

56 Kb

Here is the real data you're waiting for. This is where I set the speed to 56K down, 48K up, which is the ideal speed of a 56K modem. I doubt most of us got that speed very often.

As you can probably see, Twitch takes an extremely long time to load. This should surprise nobody as it's a site that streams video, by definition it's expected you have a fast connection. Here is the graph again with Twitch removed.
The Yahoo column is empty because I couldn't get Yahoo to load. It timed out every single time I tried. Wikipedia looks empty, but it still loaded at 0.3 seconds. After thinking about this it does make sense. There are Wikipedia users who are on dialup in some countries. They have to keep it lean. Amazon still has a slow first load, then nice and speedy (for some definition of speedy) after that. I tried to load a youtube video to see if it would work. After about 10 minutes of nothing happening I gave up.

Typical tasks

I also tried to perform a few tasks I would consider "expected" by someone using the Internet.

For example, from the time I typed in gmail.com until I could read a mail message took about 600 seconds. I did let every page load completely before clicking or typing on it. Once I had it loaded, and the AJAX interface timed out and told me to switch to HTML mode, it was mostly usable. It was only about 30 seconds to load a message (including images) and 0.2 seconds to return to the inbox.

Logging into Facebook took about 200 seconds. It was basically unusable once it loaded, though: nothing new would load. It pulls in quite a few images, so this makes sense. These things aren't exactly "web optimized" anymore. If you know someone on dialup, don't expect them to be using Facebook.

cnn.com took 800 seconds. Reddit's front page was 750 seconds. Google News was only 33 seconds. The newspaper is probably a better choice if you have dialup.

I finally tried to run a "yum update" in Fedora to see if updating the system was something you could leave running overnight. It's not. After about 4 hours of just downloading repo metadata I gave up. There is no way you could plausibly update a system over dialup. If you're on dialup, the timeouts will probably keep you from getting pwnt better than updates will.

Another problem you hit with a modern system like this is it tries to download things automatically in the background. More than once I had to kill some background tasks that basically ruined my connection. Most system designers today assume everyone has a nice Internet connection so they can do whatever they want in the background. That's clearly a problem when you're running at a speed this slow.


Is the Internet usable on Dialup in 2016? No. You can't even pretend it's maybe usable. It pretty much would suck rocks to use the Internet on dialup today. I'm sure there are some people doing it. I feel bad for them. It's clear we've hit a place where broadband is expected, and honestly, you need fast broadband, even 1 Megabit isn't enough anymore if you want a decent experience. The definition of broadband in the US is now 25Mb down 3Mb up. Anyone who disagrees with that should spend a day at 56K.

I know this wasn't the most scientific study ever done, I would welcome something more rigorous. If you have any questions or ideas hit me up on Twitter: @joshbressers

Mirroring Keystone Delegations in FreeIPA/389DS

Posted by Adam Young on September 20, 2016 03:37 AM

This is more musing than a practical design.

Most application servers have a means to query LDAP for the authorization information for a user.  This is separate from, and follows after, authentication, which may use one of multiple mechanisms, possibly not even querying LDAP (although that would be strange).

And there are other mechanisms (SAML2, SSSD+mod_lookup_identity) that can, also, provide the authorization attributes.

Separating mechanism from meaning, however, we are left with the fact that applications need a way to query attributes to make authorization decisions.  In Keystone, the general pattern is this:

A project is a group of resources.

A user is assigned a role on a project.

A user requests a token for a project. That token references the users roles.

The user passes the token to the server when accessing an API. Access control is based on the roles that the user has in the associated token.

The key point here is that it is the roles associated with the token in question that matter.  From that point on, we have the ability to inject layers of indirection.

Here is where things fall down today. If we take an app like WordPress, and tried to make it query against Red Hat’s LDAP server for the groups to use, there is no mapping  between the groups assigned and the permissions that the user should have.  As the WordPress instance might be run by any one of several organizations within Red Hat, there is no direct mapping possible.

If we map this problem domain to IPA, we see where things fall down.

WordPress, here, is a service.  If the host it is running on is owned by a particular organization (say, EMEA-Sales) it should be the EMEA Sales group that determines who gets what permissions on WordPress.

Aside: WordPress, by the way, makes a great example to use, as it has very clear, well defined roles,  which have a clear scope of authorization for operations.

Subscriber < Contributor < Author < Editor < Administrator

Back to our regular article:

If we define an actor as either a user or a group of users, a role assignment is a tuple: (actor, organization, application, role).



Now, a user should not have to go to IPA, get a token, and hand that to WordPress.  When a user connects to WordPress, and attempts to do any non-public action, they are prompted for credentials, and are authenticated.  At this point, WordPress can do the LDAP query. And here is the question:

“what should an application query for in LDAP”

If we use groups, then we have a nasty naming scheme: EMEA-sales_wordpress_admin versus LATAM-sales_wordpress_admin.  This is appending the query (organization, application) and the result (role).

Ideally, we would tag the role on the service.  The service already reflects organization and application.

In the RFC based schemas, there is a organizationalRole objectclass which almost mirrors what we want.  But I think the most important thing is to return an object that looks like a Group, most specifically groupofnames.  Fortunately, I think this is just the ‘cn’.

Can we put a group of names under a service?  It’s not a container.

‘ipaService’ DESC ‘IPA service objectclass’ AUXILIARY MAY ( memberOf $ managedBy $ ipaKrbAuthzData) X-ORIGIN ‘IPA v2’ )

objectClass: ipaobject
objectClass: top
objectClass: ipaservice
objectClass: pkiuser
objectClass: ipakrbprincipal
objectClass: krbprincipal
objectClass: krbprincipalaux
objectClass: krbTicketPolicyAux

It probably would make more sense to have a separate subtree service-roles,  with each service-name a container, and each role a group-of-names under that container. The application would  filter on (service-name) to get the set of roles.  For a specific user, the service would add an additional filter for memberof.
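A hedged sketch of what that query could look like from the application side; the base DN, container layout, and user DN are all assumptions:

# All roles defined for the wordpress service:
ldapsearch -Y GSSAPI -b "cn=wordpress,cn=service-roles,dc=example,dc=com" \
    "(objectClass=groupOfNames)" cn

# Roles held by a specific user for that service:
ldapsearch -Y GSSAPI -b "cn=wordpress,cn=service-roles,dc=example,dc=com" \
    "(&(objectClass=groupOfNames)(member=uid=ayoung,cn=users,cn=accounts,dc=example,dc=com))" cn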

Now, that is a lot of embedded knowledge in the application, and does not provide any way to do additional business logic in the IPA server or to hide that complexity from the end user.  Ideally, we would have something like automember to populate these role assignments, or, even better, a light-weight way for a user with a role assignment to re-delegate that to another user or principal.

That is where this really gets valuable: user self-service for delegation.  We want to make it such that you do not need to be an admin to create a role assignment, but rather (with exceptions) you can delegate to others any role that has been assigned to you.  This is a question of scale.

However, more than just scale, we want to be able to track responsibility;  who assigned a user the role that they have, and how did they have the authority to assign it?  When a user no longer has authority, should the people they have delegated to also lose it, or does that delegation get transferred?  Both patterns are required for some uses.

I think this fast gets beyond what can be represented easily in an LDAP schema.  Probably the right step is to use something like automember to place users into role assignments.  Expanding nested groups, while nice, might be too complicated.

Why do we do security?

Posted by Josh Bressers on September 18, 2016 10:22 PM
I had a discussion last week that ended with this question: "Why do we do security?" There wasn't a great answer to it. I guess I sort of knew this already, but it seems like something too obvious to not have an answer. Even as I think about it I can't come up with a simple one. It's probably part of the problems you see in infosec.

The purpose of security isn't just to be "secure", it's to manage risk in some meaningful way. In the real world this is usually pretty easy for us to understand. You have physical things, you want to keep them from getting broken, stolen, lost, pick something. It usually makes some sort of sense.

It would be really easy to use banks as my example here, after all they have a lot of something everyone wants, so instead let's use cattle, that will be more fun. Cows are worth quite a lot of money actually. Anyone who owns cows knows you need to protect them in some way. In some environments you want to keep your cows inside a pen, in others you let them roam free. If they roam free the people living near the cows need to protect themselves actually (barbed wire wasn't invented to keep cows in, it was used to keep them out). This is something we can understand. Some environments are very low risk, you can let your cattle roam where they want. Some are high risk, so you keep them in a pen. I eagerly await the cow related mails this will produce because of my gross over-simplification of what is actually a very complex and nuanced problem.

So now we have the question about what are you protecting? If you're a security person, what are you really trying to protect? You can't protect everything, there's no point in protecting everything. If you try to protect everything you actually end up protecting nothing. You need to protect the things you have that are not only high value, but also have a high risk of being attacked/stolen. That priceless statue in the pond outside that weighs four tons is high value, but nobody is stealing it.

Maybe this is why it's hard to get security taken seriously sometimes. If you don't know what you're protecting, you can't explain why you're important. The result is generally the security guy storming out screaming "you'll be sorry". They probably won't. If we can't easily identify what our risk is and why we care about it, we can't possibly justify what we do.

There are a lot of frameworks that can help us understand how we should be protecting our security assets, but they don't really do a great job of helping identify what those assets really are. I don't think this is a bad thing; I think this is just part of maturing the industry. We all have finite budgets; if we protect things that don't need protecting, we are literally throwing money away. So this begs the question: what should we be protecting?

I'm not sure we can easily answer this today. It's harder than it sounds. We could say we need to protect the things that if were lost tomorrow would prevent the business from functioning. That's not wrong, but electricity and water fall into that category. If you tried to have an "electricity security program" at most organizations you'll be looking for a new job at the end of the day. We could say that customer data is the most important asset, which it might be, but what are you protecting it from? Is it enough to have a good backup? Do you need a fail-over data center? Will an IDS help protect the data? Do we want to protect the integrity or is our primary fear exfiltration? Things can get out of hand pretty quickly.

I suspect there may be some value to these questions in the world of accounting. Accountants spend much time determining assets and values. I've not yet looked into this, but I think my next project will be starting to understand how assets are dealt with by the business. Everything from determining value, to understanding loss. There is science here already, it would be silly for us to try to invent our own.

Leave your comments on Twitter: @joshbressers

Hierarchy of Isolation

Posted by Adam Young on September 16, 2016 01:40 AM

One way to understand threads, processes, containers, and VMs is to look at what each level of abstraction provides for isolation.

 abstraction     | stack & instructions | heap     | process IDs, filesystem & network namespaces | kernel
 thread          | isolated             | shared   | shared                                       | shared
 process         | isolated             | isolated | shared                                       | shared
 container       | isolated             | isolated | isolated                                     | shared
 Virtual Machine | isolated             | isolated | isolated                                     | isolated

I think of this as a hierarchy.

  • A Process is a thread, but one that also provides heap isolation.
  • A container is a process, but one that also isolates the PID, network, and filesystem namespaces (sketched below with unshare)
  • A virtual machine is a process that, beyond the isolation provided by a container, provides a completely different kernel instance.
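The container row can be demonstrated with nothing more than util-linux; a sketch that creates new PID, mount, and network namespaces while sharing the host kernel:

# Inside the namespaces, ps shows only this shell and ps itself,
# and ip shows only the loopback device.
sudo unshare --pid --fork --mount --mount-proc --net \
    /bin/sh -c 'ps ax; ip link'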

Episode 3 - The Lockpicking Sewing Circle

Posted by Open Source Security Podcast on September 13, 2016 07:55 PM
Josh and Kurt discuss news of the day, banks, 3D printing, and lockpicking.

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/282763713&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

On Experts

Posted by Josh Bressers on September 12, 2016 02:00 PM
Are you an expert? Do you know an expert? Do you want to be an expert?

This came up for me the other day while having a discussion with a self proclaimed expert. I'm not going to claim I'm an expert at anything, but if you tell me all about how good you are, I'm not going to take it at face value. I'm going to demand some proof. "Trust me" isn't proof.

There are a rather large number of people who think they are experts, some think they're experts at everything. Nobody is an expert at everything. People who claim to have done everything should be looked at with great suspicion. Everyone can be an expert at something though.

One of the challenges we always face is trying to figure out who is actually an expert, and who only thinks they are an expert? There are plenty of people who sound very impressive, but if they have to deal with an actual expert, things fall apart pretty quick. They can get you into trouble if you're expecting expert advice. Especially in areas like security, bad advice can be worse than no advice.

The simple answer is to look at their public contributions. If you have someone who has ZERO public contributions, that's not an expert in anything. Even if you're working for a secretive organization, you're going to leave a footprint somewhere. No footprint means you should seriously question a person's expertise. Becoming an expert leaves a long crazy trail behind whoever gets there. In the new and exciting world of open source and social media there is no excuse for not being able to show off your work (unless you don't have anything to show off of course).

If you think you're an expert, or you want to be an expert, start doing things in the open. Write code (if you don't have a github account, go get one). Write blog posts, answer questions, go to meetups. There are so many opportunities it's not even funny. Just because you think you're smart doesn't mean you are, go out and prove it.

Getting the URLs out of the Service Catalog with jq

Posted by Adam Young on September 10, 2016 10:40 PM

When you make a call to Keystone to get a token, you also get back the service catalog. While many of my scripts have used the $OS_AUTH_URL to make follow-on calls, if the calls are administrative in nature, you should use the URL from the service catalog.

This makes use of curl to fetch the token and jq to parse the output.

This call will fetch a token and ignore it, instead pulling the identity admin URL out of the response:

curl -s -d @token-request.json -H "Content-type: application/json" $OS_AUTH_URL/auth/tokens | jq '.token | .catalog [] | select(.type == "identity") | .endpoints[] | select(.interface == "admin") | .url  ' 

Say you want to talk to Nova? That would be the compute API on the public endpoint:

curl -s -d @token-request.json -H "Content-type: application/json" $OS_AUTH_URL/auth/tokens | jq '.token | .catalog [] | select(.type == "compute") | .endpoints[] | select(.interface == "public") | .url  ' 
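To feed the result into follow-on calls, the URL can be captured in a shell variable; a sketch using jq's -r flag to strip the quotes:

KEYSTONE_ADMIN_URL=$(curl -s -d @token-request.json -H "Content-type: application/json" \
    $OS_AUTH_URL/auth/tokens | \
    jq -r '.token | .catalog [] | select(.type == "identity") | .endpoints[] | select(.interface == "admin") | .url')
echo $KEYSTONE_ADMIN_URL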

Generating Token Request JSON from Environment Variables

Posted by Adam Young on September 10, 2016 04:08 PM

When working with new APIs, we need to test them with curl prior to writing the python client. I’ve often had to hand-create the JSON used for the token request, as I wrote about way back here.  Here is a simple bash script to convert the V3 environment variables into the JSON for a token request.






#!/bin/sh
cat << EOF
{
    "auth": {
        "identity": {
            "methods": ["password"],
            "password": {
                "user": {
                    "domain": {
                        "name": "$OS_USER_DOMAIN_NAME"
                    },
                    "name": "$OS_USERNAME",
                    "password": "$OS_PASSWORD"
                }
            }
        },
        "scope": {
            "project": {
                "domain": {
                    "name": "$OS_PROJECT_DOMAIN_NAME"
                },
                "name": "$OS_PROJECT_NAME"
            }
        }
    }
}
EOF


Run it like this:

./gen_token_request_json.sh > token-request.json

And test it

curl -si -d @token-request.json -H "Content-type: application/json" $OS_AUTH_URL/auth/tokens

Should return a lot of JSON output.
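The token itself comes back in the X-Subject-Token response header rather than in the JSON body; a hedged sketch for capturing it:

TOKEN=$(curl -si -d @token-request.json -H "Content-type: application/json" \
    $OS_AUTH_URL/auth/tokens | awk '/X-Subject-Token/ {print $2}' | tr -d '\r')
echo $TOKEN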

This is for a project scoped token. Minor variations would get you unscoped or domain scoped.

Episode 2 - Instills the proper amount of fear

Posted by Open Source Security Podcast on September 07, 2016 12:28 AM
Josh and Kurt discuss how open source security works.

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/281731016&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

Comment on Twitter

Episode 1 - Rich History of Security Flaws

Posted by Open Source Security Podcast on September 07, 2016 12:24 AM
Josh and Kurt discuss their first podcast as well as random bits about open source security.

Download Episode

<iframe frameborder="no" height="150" scrolling="no" src="https://w.soundcloud.com/player/?url=https%3A//api.soundcloud.com/tracks/281712199&amp;auto_play=false&amp;hide_related=false&amp;show_comments=true&amp;show_user=true&amp;show_reposts=false&amp;visual=true" width="100%"></iframe>

Show Notes

Comment on Twitter

You can't weigh risk if you don't know what you don't know

Posted by Josh Bressers on September 06, 2016 02:00 PM
There is an old saying we've all heard at some point. It's often attributed to Donald Rumsfeld.

There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns -- the ones we don't know we don't know
If any of us have ever been in a planning meeting, a variant of this has no doubt come up at some point. It came up for me last week, and every time I hear it I think about all the things we don't know we don't know. If you're not familiar with the concept, it works a bit like this. I know I don't know how to drive a boat. But because I know I don't know this, I could learn. If you know you lack certain knowledge, you could find a way to learn it. If you don't know what you don't know, there is nothing you can do about it. The future is often an unknown unknown. There is nothing we can do about the future in many instances, you just have to wait until it becomes a known, and hope it won't be anything too horrible. There can also be blindness when you think you know something, but you really don't. This is when people tend to stop listening to the actual experts because they think they are an expert.

This ties back into conversations about risk and how we deal with it.

If there is something you don't know you don't know, by definition you can't weigh the possible risk of whatever it is you are (or aren't) doing. A great example here is trying to understand your infrastructure. If you don't know what you have, you don't know which machines are patched, and you're not sure who is running what software, you have a lot of unknowns. It's probably safe to say at some future date there will be a grand explosion when everything starts to fall apart. It's also probably safe to say if you have infrastructure like this, you don't understand the pile of dynamite you're sitting on.

Measuring risk can be like trying to take a picture of an invisible man. Do you know where your risk is? Do you know what it should look like? How big is it? Is it wearing a hat? There are so many things to keep track of when we try to understand risk. There are more people afraid of planes than cars, but flying is magnitudes safer. Humans are really bad at risk. We think we understand something (or think it's a known or known unknown). Often we're actually mistaken.

How do we deal with the unknown unknowns in a context like this? We could talk about being agile or quick or adaptive, whatever you want. But at the end of the day what's going to save you is your experts. Understand them, know where you are strong and weak. Someday the unknowns become knowns, usually with a violent explosion. To some of your experts these risks are known, you may just have to listen.

It's also important to have multiple experts. If you only have one, they could believe they're smarter than they are. This is where things can get tricky. How can we decide who is actually an expert and who thinks they're an expert? This is a whole long complex topic by itself which I'll write about someday.

Anyway, on the topic of risk and unknowns. There will always be unknown unknowns. Even if you have the smartest experts in the world, it's going to happen. Just make sure your unknown unknowns are worth it. There's nothing worse than not knowing something you should.

Deploying Fernet on the Overcloud

Posted by Adam Young on September 06, 2016 01:45 AM

Here is a proof of concept of deploying an OpenStack Tripleo Overcloud using the Fernet token provider.

I’m going to take the shortcut of using the Keystone setup on the undercloud to generate the keys. Since the undercloud is still using UUID tokens, this key repo will not be used by the undercloud.

It makes use of Heat swift artifacts, which puts a copy of the Fernet repo on every node, not just the Keystone/Controller node. That may or may not be acceptable for your deployment.

On the undercloud

. ~/stackrc
sudo keystone-manage fernet_setup --keystone-user keystone --keystone-group keystone
sudo tar -zcf keystone-fernet-keys.tar.gz /etc/keystone/fernet-keys
upload-swift-artifacts -f keystone-fernet-keys.tar.gz

To add an additional value to the overcloud Hiera, use an additional deploy.yaml file.

export DEPLOY_ENV_YAML=$PWD/deploy.yaml

Here is what this file looks like

            keystone::token_provider: 'fernet'

Deploy with

openstack overcloud deploy --templates -e deploy-env.sh 

And wait for completion

Check the state on the controller.

$ openstack server list
| ID                                   | Name                    | Status | Networks            |
| 756fbd73-e47b-46e6-959c-e24d7fb71328 | overcloud-controller-0  | ACTIVE | ctlplane= |
| 62b869df-1203-4d58-8e45-fac6cd4cfbee | overcloud-novacompute-0 | ACTIVE | ctlplane=  |
[stack@undercloud ~]$ ssh heat-admin@ 
Last login: Tue Sep  6 00:09:59 2016 from
[heat-admin@overcloud-controller-0 ~]$ sudo crudini --get /etc/keystone/keystone.conf token driver
[heat-admin@overcloud-controller-0 ~]$ sudo crudini --get /etc/keystone/keystone.conf token provider

Look in the database on the controller:

$ sudo su
[root@overcloud-controller-0 heat-admin]# mysql
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 415
Server version: 10.1.12-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [(none)]> use keystone
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
MariaDB [keystone]> select * from token;
Empty set (0.00 sec)

MariaDB [keystone]> 


Test the provider:


$ openstack token issue
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
| Field | Value |
| expires | 2016-09-20 05:26:17+00:00 |
| id | gAAAAABX4LppE8vaiFZ992eah2i3edpO1aDFxlKZq6a_RJzxUx56QVKORrmW0-oZK3-Xuu2wcnpYq_eek2SGLz250eLpZOzxKBR0GsoMfxJU8mEFF8NzfLNcbuS-iz7SV-N1re3XEywSDG90JcgwjQfXW-8jtCm-n3LL5IaZexAYIw059T_-cd8 |
| project_id | 26156621d0d54fc39bf3adb98e63b63d |
| user_id | 397daf32cadd490a8f3ac23a626ac06c |

That rather long token (though not as long as a PKI token) is a Fernet token.

Note that the keys used to sign tokens are now available via the undercloud’s swift. I would recommend deleting them immediately after deployment with:


swift delete overcloud-artifacts keystone-fernet-keys.tar.gz

Deploying Server on Ironic Node Baseline

Posted by Adam Young on September 02, 2016 08:48 PM

My team is working on the ability to automatically enroll servers launched from Nova in FreeIPA. Debugging the process has proven challenging;  when things fail, the node does not come up, and there is little error reporting.  This article posts a baseline of what things look like prior to any changes, so we can better see what we are breaking.

UPDATE: The command I ended up using to test this is:

openstack server create --flavor control --image overcloud-full testserver --nic net-id=ctlplane --key-name default

Since the reported error is that the port attach failed, I want to see what ports to expect.

$ . ./stackrc 
[stack@undercloud ~]$ openstack port list
| ID | Name | MAC Address | Fixed IP Addresses |
| eb32c2a9-9bd8-45bb-929a-ed626b845e3e | | fa:16:3e:92:32:94 | ip_address='', subnet_id='2a0bf352-1b8f-469b-bb55-cf6e193d5a4d' |

Prior to deploying a server, there is one port.

Deploying a server:

openstack server create --flavor control --image overcloud-full testserver --nic net-id=ctlplane --key-name default

Gives us a new port

$ openstack port list
| ID | Name | MAC Address | Fixed IP Addresses |
| 08dbcf34-6ac0-4edb-9079-93b2aced5afa | | 00:0d:25:4f:b1:f8 | ip_address='', subnet_id='2a0bf352-1b8f-469b-bb55-cf6e193d5a4d' |
| eb32c2a9-9bd8-45bb-929a-ed626b845e3e | | fa:16:3e:92:32:94 | ip_address='', subnet_id='2a0bf352-1b8f-469b-bb55-cf6e193d5a4d' |


Node list:

$ openstack baremetal node list
| UUID                                 | Name      | Instance UUID                        | Power State | Provisioning State | Maintenance |
| d6604837-b374-4ae2-9ad0-ff0d98c3119b | control-0 | fd60daf3-65fc-44bf-8f90-89b127e67e56 | power on    | active             | False       |
| e5c3e3a1-e466-411d-8707-652fdb87af54 | compute-0 | None                                 | power off   | available          | False       |

Can log in with ssh:

ssh centos@

After deleting the server with:

openstack server delete testserver

Back to one port

$ openstack port list
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                       |
| eb32c2a9-9bd8-45bb-929a-ed626b845e3e |      | fa:16:3e:92:32:94 | ip_address='', subnet_id='2a0bf352-1b8f-469b-bb55-cf6e193d5a4d' |

Nodes are freed up and unassigned

$ openstack baremetal node list
| UUID                                 | Name      | Instance UUID | Power State | Provisioning State | Maintenance |
| d6604837-b374-4ae2-9ad0-ff0d98c3119b | control-0 | None          | power off   | available          | False       |
| e5c3e3a1-e466-411d-8707-652fdb87af54 | compute-0 | None          | power off   | available          | False       |

Running Qemu/KVM without libvirt

Posted by Adam Young on August 30, 2016 02:18 AM

When I booted a VM yesterday, I noticed that a huge command line showed up when I ran ps. I tried to run that by hand.  It is huge, so I wrapped it in a script, but the command is not too bad to understand:  everything that qemu needs to do has to be passed in on the command line.

Complete command line is at the end of the article.
I first put SELinux into permissive mode, as it will not allow my user to create a VM.

I needed to adjust three values.

  1. First, the VM opens a domain socket, used for monitoring the VM. The path that this points to is in /var/lib/libvirt/qemu/ which my user does not have access to. I changed this to /home/ayoung/devel/qemu.
  2. The image is read from a directory owned by libvirt which my user does not have access to. I copied it to /home/ayoung/devel/qemu and changed ownership. I also changed the path used in the call.
  3. How to connect to the network interface.  When libvirt kicks it off, it uses an fd argument, which indicates it should reuse an open file descriptor. Since that is a process-specific value, we can’t use that, but need to link the VM up to a network some other way. I’m still fiddling with this; one option is sketched below.
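The sketch, which is my own hedged suggestion rather than the approach the post settles on, uses qemu's bridge helper to create a tap device attached to libvirt's default bridge, virbr0:

# assumption: qemu-bridge-helper is installed setuid and virbr0 is
# allowed in /etc/qemu/bridge.conf; hostnet0 matches the -device line
# in the full command at the end of this post
NETDEV_PARAMS="bridge,br=virbr0,id=hostnet0"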

The xml file that has that defined is in


The network is called default.  It is defined in


And that maps to :


I first tried using


but that seems to try to connect to


which is not allowed. So instead I tried changing to a bridge device:


If I run with that up, I see that ip addr reports:

12: tap0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master virbr0 state UNKNOWN group default qlen 1000
    link/ether fe:2f:d3:35:da:91 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc2f:d3ff:fe35:da91/64 scope link 
       valid_lft forever preferred_lft forever

And that goes away if I kill the VM.

Once it is up, I try to use netcat to talk to the VM (thanks Kashyap):

$ nc -U monitor.sock
{"QMP": {"version": {"qemu": {"micro": 1, "minor": 6, "major": 2}, "package": " (qemu-2.6.1-1.fc24)"}, "capabilities": []}}

So the VM process is reporting status.

I can attach using SPICE:

remote-viewer spice://

but nothing shows. However, if the VM is not running, remote-viewer just fails. So I know this is working, but I still need a way to communicate with the VM.
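A side note of my own, not from the original post: the command line below includes -S, which starts the guest with its CPUs paused; libvirt normally resumes the guest itself over QMP once everything is wired up. Here is a hedged sketch of resuming it by hand through the same monitor socket (QMP requires the capabilities handshake before any other command):

# resume the paused guest over the QMP socket used above
nc -U monitor.sock <<'EOF'
{"execute": "qmp_capabilities"}
{"execute": "cont"}
EOF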

Here is what I am using to run the VM.



/usr/bin/qemu-system-x86_64 \
    -machine accel=kvm \
    -name generic,debug-threads=on \
    -S \
    -machine pc-i440fx-2.6,accel=kvm,usb=off,vmport=off \
    -cpu Haswell-noTSX \
    -m 1024 \
    -realtime mlock=off \
    -smp 1,sockets=1,cores=1,threads=1 \
    -uuid 6f6f9463-8b7e-401c-910e-d217e00816a1 \
    -no-user-config \
    -nodefaults \
    -chardev socket,id=charmonitor,path=$VARPATH/monitor.sock,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control \
    -rtc base=utc,driftfix=slew \
    -global kvm-pit.lost_tick_policy=discard \
    -no-hpet \
    -no-shutdown \
    -global PIIX4_PM.disable_s3=1 \
    -global PIIX4_PM.disable_s4=1 \
    -boot strict=on \
    -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7 \
    -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6 \
    -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1 \
    -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2 \
    -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
    -drive file=$IMAGEPATH/generic.qcow2,format=qcow2,if=none,id=drive-ide0-0-0 \
    -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
    -netdev $NETDEV_PARAMS \
    -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:4d:04:7d,bus=pci.0,addr=0x3 \
    -chardev pty,id=charserial0 \
    -device isa-serial,chardev=charserial0,id=serial0 \
    -chardev spicevmc,id=charchannel0,name=vdagent \
    -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0 \
    -spice port=5900,addr=,disable-ticketing,image-compression=off,seamless-migration=on \
    -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,bus=pci.0,addr=0x2 \
    -device intel-hda,id=sound0,bus=pci.0,addr=0x4 \
    -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 \
    -chardev spicevmc,id=charredir0,name=usbredir \
    -device usb-redir,chardev=charredir0,id=redir0 \
    -chardev spicevmc,id=charredir1,name=usbredir \
    -device usb-redir,chardev=charredir1,id=redir1 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 \
    -msg timestamp=on

How do we explain email to an "expert"?

Posted by Josh Bressers on August 29, 2016 07:01 PM
This has been a pretty wild week, more wild than usual I think we can all agree. The topic I found the most interesting wasn't about one of the countless 0day flaws, it was a story from Slate titled: In Praise of the Private Email Server

The TL;DR says running your own email server is a great idea. Almost everyone came out proclaiming it a terrible idea. I agree it's a terrible idea, but this also got me thinking. How do you explain this to someone who doesn't really understand what's going on?

There are three primary groups of people.

1) People who know they know nothing
2) People who think they're experts
3) People who are actually experts

If I had to guess, most of #3 knows running your own email server is pretty dangerous. #1 probably is happy to let someone else do it. #2 is a dangerous group, probably the largest, and the group who most needs to understand what's going on.

These ideas apply to a lot of areas, feel free to substitute the term "security" "cloud" "doughnuts" or "farming" for email. You'll figure it out with a little work.

So anyway.

A long time ago, if you wanted email you basically had to belong to an organization that ran an email server. Something like a university or maybe a huge company. Getting a machine on the Internet was a pretty big deal. Hosting email was even bigger. I could say "by definition this meant if you were running a machine on the Internet you were an expert", but I suspect that wasn't true, we just like to remember the past as being more awesome than it was.

Today anyone can spin up a machine in a few seconds. It's pretty cool but it also means literally anyone can run an email server. If you run a server for you and a few other people, it's unlikely anything terrible will happen. You'll probably get pwnt someday, you might notice, but the world won't end. How do we convince this group that just because you can, doesn't mean you should? The short answer is you can't. I actually wrote about this a little bit last year.

So if we can't convince them what do we do? We get them to learn. If you've ever heard of the Dunning Kruger effect (I talk about it constantly), you understand the problem is generally a lack of knowledge.

You can't convince experts of anything, especially experts that aren't really experts. What we can do though is encourage them to learn. If we have someone we know is on the peak of that curve, if they learn just a little bit more, they're going to fall back to earth.

So I can say running your own email server is a terrible idea. I can say it all day and most people won't care what I think. So here's my challenge. If you run your own email server, start reading email related RFCs, learn about things like spam, blacklisting, greylisting, SPF. Read about SMTPS, learn how certificates work. Learn how to manage keys, learn about securing your clients with multi factor auth. Read about how to keep the mail secure while on disk. There are literally more topics than one could read in a lifetime. If you're an expert and you don't know what one of those things is, go learn it. Learn them all. Then you'll understand there are no experts.

Let me know how wrong I am: @joshbressers

The cost of mentoring, or why we need heroes

Posted by Josh Bressers on August 22, 2016 12:15 AM
Earlier this week I had a chat with David A. Wheeler about mentoring. The conversation was fascinating and covered many things, but the topic of mentoring really got me thinking. David pointed out that nobody will mentor if they're not getting paid. My first thought was that it can't be true! But upon reflection, I'm pretty sure it is.

I can't think of anyone I mentored where a paycheck wasn't involved. There are people in the community I've given advice to, sometimes for an extended period of time, but I would hesitate to claim I was a mentor. Now I think just equating this to a paycheck would be incorrect and inaccurate. There are plenty of mentors in other organizations that aren't necessarily getting a paycheck, but I would say they're getting paid in some sense of the word. If you're working with at risk youth for example, you may not get paid money, but you do have satisfaction in knowing you're making a difference in someone's life. If you mentor kids as part of a sports team, you're doing it because you're getting value out of the relationship. If you're not getting value, you're going to quit.

So this brings me to the idea of mentoring in the community.

The whole conversation started because of some talk of mentoring on Twitter, but now I suspect this isn't something that would work quite like we think. The basic idea would be you have new young people who are looking for someone to help them cut their teeth. Some of these relationships could work out, but probably only when you're talking about a really gifted new person and a very patient mentor. If you've ever helped the new person, you know how terribly annoying they become, especially when they start to peak on the Dunning-Kruger graph. If I don't have a great reason to stick around, I'm almost certainly going to bail out of that. So the question really is can a mentoring program like this work? Will it ever be possible to have a collection of community mentors helping a collection of new people?

Let's assume the answer is no. I think the current evidence somewhat backs this up. There aren't a lot of young people getting into things like security and open source in general. We all like to think we got where we are through brilliance and hard work, but we all probably had someone who helped us out. I can't speak for everyone, but I also had some security heroes back in the day. Groups like the l0pht, Cult of the Dead Cow, Legion of Doom, 2600, mitnick, as well as a handful of local people. Who are the new heroes?

Do it for the heroes!

We may never have security heroes like we did. It's become a proper industry. I don't think many mature industries have new and exciting heroes. We know who Chuck Yeager is, I bet nobody could name 5 test pilots anymore. That's OK though. You know what happens when there is a solid body of knowledge that needs to be moved from the old to the young? You go to a university. That's right, our future rests with the universities.

Of course it's really easy to say this is the future, making this happen will be a whole different story. I don't have any idea where we start, I imagine people like David Wheeler have ideas. All I do know is that if nothing changes, we're not going to like what happens.

Also, if you're part of an open source project, get your badge

If you have thoughts or ideas, let me know: @joshbressers

Running Unit Tests on Old Versions of Keystone

Posted by Adam Young on August 16, 2016 09:24 PM

Just because Icehouse is EOL does not mean no one is running it. One part of my job is back-porting patches to older versions of Keystone that my company supports.

A dirty secret is that we only package the code needed for the live deployment, not the unit tests. In this case, I needed to test a bug fix against a version of Keystone that was, essentially, upstream Icehouse.

Running the unit tests with Tox had some problems, mainly due to recent oslo components not being compatible that far back.

Here is what I did:

  • Cloned the  keystone repo
  • applied the patch to test
  • ran tox -r -epy27  to generate the virtual environment.  Note that the tests fail.
  • . .tox/py27/bin/activate
  • python -m unittest keystone.tests.test_v3_identity.IdentityTestCase
  • see that test fails due to:
    • AttributeError: ‘module’ object has no attribute ‘tests’
  • run python to get an interactive interpreter
    • import keystone.tests.test_v3_identity
    • Get the error below:
ImportError: No module named utils
>>> import oslo-utils
File "<stdin>", line 1
import oslo-utils

To deal with this:

  • Clone the oslo-utils repo
    • git clone https://git.openstack.org/openstack/oslo.utils
  • check out the tag that is closest to what I think we need.  A little trial and error showed I wanted kilo-eol
    • git checkout kilo-eol
  • Build and install in the venv (note that the venv is still activated)
    • cd oslo.utils/
    • python setup.py install

Try running the tests again.  Similar process shows that something is mismatched with oslo.serialization.  Clone, checkout, and build, this time the tag is also kilo-eol.
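For reference, here is a hedged consolidation of that clone/checkout/install dance for oslo.serialization, mirroring the oslo.utils steps above and assuming the same repository layout (run with the py27 venv still activated):

# same recipe as for oslo.utils, this time for oslo.serialization
git clone https://git.openstack.org/openstack/oslo.serialization
cd oslo.serialization
git checkout kilo-eol
python setup.py install
cd ..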

Running the unit test now shows:

Traceback (most recent call last):
  File "keystone/tests/test_v3_identity.py", line 835, in test_delete_user_and_check_role_assignment_fails
    member_url, user = self._create_new_user_and_assign_role_on_project()
  File "keystone/tests/test_v3_identity.py", line 807, in _create_new_user_and_assign_role_on_project
    user_ref = self.identity_api.create_user(new_user)
  File "keystone/notifications.py", line 74, in wrapper
    result = f(*args, **kwargs)
  File "keystone/identity/core.py", line 189, in wrapper
    return f(self, *args, **kwargs)
TypeError: create_user() takes exactly 3 arguments (2 given)

Other unit tests run successfully. I’m back in business.

RBAC Policy Updates in Tripleo

Posted by Adam Young on August 16, 2016 04:01 PM

Policy files contain the access control rules for an OpenStack deployment. The upstream policy files are conservative and restrictive; they are designed to be customized on the end user's system. However, poorly written changes can potentially break security, so their deployment should be carefully managed and monitored.

Since RBAC Policy controls access to the Keystone server, the Keystone policy files themselves are not served from a database in the Keystone server. They are, instead, configuration files, and managed via the deployment’s content management system. In a Tripleo based deployment, none of the other services use the policy storage in Keystone, either.

In Tripleo, the deployment of the overcloud is managed via Heat. The OpenStack Tripleo Heat templates have support for deploying files at the end of the install, and this matches how we need to deploy policy.


  1. Create a directory structure that mimics the policy file layout in the overcloud.  For this example, I will limit it to just Keystone.  Create a directory called policy (making this a git repository is reasonable) and under it create etc/keystone.
  2. Inside that directory, copy either the default policy.json file or the overcloudv3sample.json, naming it policy.json.
    1. keystone:keystone as the owner,
    2. rw-r----- (0640) as the permissions
  3. Modify the policy files to reflect organizational rules
  4. Use the offline tool to check policy access control.  Confirm that the policy behaves as desired.
  5. create a tarball of the files.
    1. cd policy
    2. tar -zcf openstack-policy.tar.gz etc (see the consolidated sketch after this list)
  6. Use the Script to upload to undercloud swift:
    1. https://raw.githubusercontent.com/openstack/tripleo-common/master/scripts/upload-swift-artifacts
    2. . ./stackrc;  ./upload-swift-artifacts  openstack-policy.tar.gz
  7. Confirm the upload with swift list -l overcloud
    1. 1298 2016-08-04 16:34:22 application/x-tar openstack-policy.tar.gz
  8. Redeploy the overcloud
  9. Confirm that the policy file contains the modifications made in development
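For reference, a hedged consolidation of the packaging and upload steps above (steps 5 through 7); the working directory and stackrc location are assumptions based on the earlier steps:

cd policy
tar -zcf openstack-policy.tar.gz etc
# source the undercloud credentials and run the script fetched in step 6
# (paths are assumptions)
. ~/stackrc
./upload-swift-artifacts openstack-policy.tar.gz
swift list -l overcloud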

Diagnosing Tripleo Failures Redux

Posted by Adam Young on August 16, 2016 03:35 PM

Steve Hardy has provided an invaluable reference with his troubleshooting blog post. However, I recently had a problem that didn’t quite match what he was showing. Zane Bitter got me oriented.

Upon a redeploy, I got a failure.

$ openstack stack list
| ID                                   | Stack Name | Stack Status  | Creation Time       | Updated Time        |
| 816c67ab-d360-4f9b-8811-ed2a346dde01 | overcloud  | UPDATE_FAILED | 2016-08-16T13:38:46 | 2016-08-16T14:41:54 |

Listing the Failed resources:

$  heat resource-list --nested-depth 5 overcloud | grep FAILED
| ControllerNodesPostDeployment                 | 7ae99682-597f-4562-9e58-4acffaf7aaac          | OS::TripleO::ControllerPostDeployment                                           | UPDATE_FAILED   | 2016-08-16T14:44:42 | overcloud 

No failed deployment is listed. How do we display the error? We want to show the resource named ControllerNodesPostDeployment associated with the overcloud stack:

$ heat resource-show overcloud ControllerNodesPostDeployment
| Property               | Value                                                                                                                                                               |
| attributes             | {}                                                                                                                                                                  |
| creation_time          | 2016-08-16T13:38:46                                                                                                                                                 |
| description            |                                                                                                                                                                     |
| links                  | (self)      |
|                        | (stack)                                             |
|                        | (nested) |
| logical_resource_id    | ControllerNodesPostDeployment                                                                                                                                       |
| physical_resource_id   | 7ae99682-597f-4562-9e58-4acffaf7aaac                                                                                                                                |
| required_by            | BlockStorageNodesPostDeployment                                                                                                                                     |
|                        | CephStorageNodesPostDeployment                                                                                                                                      |
| resource_name          | ControllerNodesPostDeployment                                                                                                                                       |
| resource_status        | UPDATE_FAILED                                                                                                                                                       |
| resource_status_reason | Engine went down during resource UPDATE                                                                                                                             |
| resource_type          | OS::TripleO::ControllerPostDeployment                                                                                                                               |
| updated_time           | 2016-08-16T14:44:42                                                                                                                                                 |

Note this message:

Engine went down during resource UPDATE

Looking in the journal:

Aug 16 15:16:15 undercloud kernel: Out of memory: Kill process 17127 (heat-engine) score 60 or sacrifice child
Aug 16 15:16:15 undercloud kernel: Killed process 17127 (heat-engine) total-vm:834052kB, anon-rss:480936kB, file-rss:1384kB

Just like Brody said, we are going to need a bigger boat.
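A hedged aside of my own, not from the post: when the undercloud's heat-engine is being OOM-killed, the usual quick fixes are to give the undercloud VM more memory or, as a stopgap, some swap:

# stopgap sketch: add swap so the OOM killer spares heat-engine
# (the 4 GB size is illustrative)
sudo dd if=/dev/zero of=/swapfile bs=1M count=4096
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile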

Can't Trust This!

Posted by Josh Bressers on August 15, 2016 02:20 PM
Last week saw a really interesting bug in TCP come to light. CVE-2016-5696 describes an issue in the way Linux deals with challenge ACKs defined in RFC 5961. The issue itself is really clever and interesting. It's not exactly new but given the research was presented at USENIX, it suddenly got more attention from the press.

The researchers showed themselves injecting data into a standard http connection, which is easy to understand and terrifying to most people. Generally speaking we operate in a world where TCP connections are mostly trustworthy. It's not true if you have a "man in the middle", but with this bug you don't need a MiTM if you're using a public network, which is horrifying.
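As an aside not in the original post, the short-term mitigation widely circulated at the time was to raise the kernel's challenge-ACK limit so the shared counter stops being a useful side channel; a hedged sketch (the exact value is illustrative):

# mitigation sketch for CVE-2016-5696 on affected kernels
sudo sysctl -w net.ipv4.tcp_challenge_ack_limit=999999999
# persist it (the file name is my choice)
echo 'net.ipv4.tcp_challenge_ack_limit = 999999999' | sudo tee /etc/sysctl.d/99-challenge-ack.conf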

The real story isn't the flaw though, the flaw is great research and quite clever, but it just highlights something many of us have known for a very long time. You shouldn't trust the network.

Not so long ago the general thinking was that the public internet wasn't very trustworthy, but it all worked well enough that nobody worried too much. TLS (SSL back then) was created to ensure some level of trust between two endpoints, and everything seemed good enough. Most traffic still passed over the network unencrypted though. There were always grumblings about coffee shop attacks or nation-state style man in the middle, but practically speaking nobody really took these attacks seriously.

The world is different now though. There is no more network perimeter. It's well accepted that you can't trust the things inside your network any more than you can trust the things outside your network. Attacks like this are going to keep happening. The network continues to get more complex, which means the number of security problems increases. IPv6 will solve the problem of running out of IP addresses while adding a ton of new security problems in the process. Just wait for the research to start taking a hard look at IPv6.

The joke is "there is no cloud, just someone else's computer", there's also no network, it's someone else's network. It's someone else's network you can't trust. You know you can't trust your own network because it's grown to a point it's probably self aware. Now you expect to trust the network of a cloud provider that is doing things a few thousand times more complex than you are? You know all the cloud infrastructures are held together with tape and string too, their networks aren't magic, they just have really really good paint.

So what's the point of all this rambling about how we can't trust any networks? The point is you can't trust the network. No matter what you're told, no matter what's going on. You need to worry about what's happening on the network. You also need to think about the machines, but that's a story for another day. The right way to deal with your data is to ask yourself the question "what happens if someone can see this data on the wire?" Not all data is super important; some you don't have to protect. There is some data you have that must be protected at all times. That's the stuff for which you need to figure out how best to do something like endpoint network encryption. If everyone asked this question at least once during development and deployment, it would solve a lot of problems I suspect.

Smart card login with YubiKey NEO

Posted by Fraser Tweedale on August 12, 2016 02:55 AM

In this post I give an overview of smart cards and their potential advantages, and share my adventures in using a Yubico YubiKey NEO device for smart card authentication with FreeIPA and SSSD.

Smart card overview

Smart cards with cryptographic processors and secure key storage (private key generated on-device and cannot be extracted) are an increasingly popular technology for secure system and service login, as well as for signing and encryption applications (e.g. code signing, OpenPGP). They may offer a security advantage over traditional passwords because private key operations typically require the user to enter a PIN. Therefore the smart card is two factors in one: both something I have and something I know.

The inability to extract the private key from a smart card also provides an advantage over software HOTP/TOTP tokens which, in the absence of other security measures such as an encrypted filesystem on the mobile device, allow an attacker to extract the OTP seed. And because public key cryptography is used, there is no OTP seed or password hash sitting on a server, waiting to be exfiltrated and subjected to offline attacks.

For authentication applications, a smart card carries an X.509 certificate alongside a private key. A login application would read the certificate from the card and validate it against trusted CAs (e.g. a company’s CA for issuing smart cards). Typically an OCSP or CRL check would also be performed. The login application then challenges the card to sign a nonce, and validates the signature with the public key from the certificate. A valid signature attests that the bearer of the smart card is indeed the subject of the certificate. Finally, the certificate is then mapped to a user either by looking for an exact certificate match or by extracting information about the user from the certificate.

Test environment

In my smart card investigations I had a FreeIPA server with a single Fedora 24 desktop host enrolled. alice was the user I tested with. To begin with, she had no certificates and used her password to log in.

I was doing all of my testing on virtual machines, so I had to enable USB passthrough for the YubiKey device. This is straightforward but you have to ensure the IOMMU is enabled in both BIOS and kernel (for Intel CPUs add intel_iommu=on to the kernel command line in GRUB).

In virt-manager, after you have created the VM (it doesn’t need to be running) you can Add Hardware in the Details view, then choose the YubiKey NEO device. There are no doubt virsh incantations or other ways to establish the passthrough.

Finally, on the host I stopped the pcscd smart card daemon to prevent it from interfering with passthrough:

# systemctl stop pcscd.service pcscd.socket

Provisioning the YubiKey

For general smart card provisioning steps, I recommend Nathan Kinder’s post on the topic. But the YubiKey NEO is special with its own steps to follow! First install the ykpers and yubico-piv-tool packages:

sudo dnf install -y ykpers yubico-piv-tool

If we run yubico-piv-tool to find out the version of the PIV applet, we run into a problem because a new YubiKey comes configured in OTP mode:

[dhcp-40-8:~] ftweedal% yubico-piv-tool -a version
Failed to connect to reader.

The YubiKey NEO supports a variety of operation modes, including hybrid modes:

0    OTP device only.
1    CCID device only.
2    OTP/CCID composite device.
3    U2F device only.
4    OTP/U2F composite device.
5    U2F/CCID composite device.
6    OTP/U2F/CCID composite device.

(You can also add 80 to any of the modes to configure touch to eject, or touch to switch modes for hybrid modes).

We need to put the YubiKey into CCID (Chip Card Interface Device, a standard USB protocol for smart cards) mode. I originally configured the YubiKey in mode 86 but could not get the card to work properly with USB passthrough to the virtual machine. Whether this was caused by the eject behaviour or the fact that it was a hybrid mode I do not know, but reconfiguring it to mode 1 (CCID only) allowed me to use the card on the guest.

[dhcp-40-8:~] ftweedal% ykpersonalize -m 1
Firmware version 3.4.6 Touch level 1541 Program sequence 1

The USB mode will be set to: 0x1

Commit? (y/n) [n]: y

Now yubico-piv-tool can see the card:

[dhcp-40-8:~] ftweedal% yubico-piv-tool -a version
Application version 1.0.4 found.

Now we can initialise the YubiKey by setting a new management key, PIN and PIN Unblocking Key (PUK). As you can probably guess, the management key protects actions like generating keys and importing certificates, the PIN protects private key operations in regular use, and the PUK is kind of in between, allowing the PIN to be reset if the maximum number of attempts is exceeded. The current (default) PIN and PUK need to be given in order to reset them.

% KEY=`dd if=/dev/random bs=1 count=24 2>/dev/null | hexdump -v -e '/1 "%02X"'`
% echo $KEY
% yubico-piv-tool -a set-mgm-key -n $KEY
Successfully set new management key.

% PIN=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-6`
% echo $PIN
% yubico-piv-tool -a change-pin -P 123456 -N $PIN
Successfully changed the pin code.

% PUK=`dd if=/dev/random bs=1 count=6 2>/dev/null | hexdump -v -e '/1 "%u"'|cut -c1-8`
% echo $PUK
% yubico-piv-tool -a change-puk -P 12345678 -N $PUK
Successfully changed the puk code.

Next we must generate a private/public keypair on the smart card. Various slots are available for different purposes, with different PIN-checking behaviour. The Certificate slots page on the Yubico wiki gives the full details. We will use slot 9e which is for Card Authentication (PIN is not needed for private key operations). It is necessary to provide the management key on the command line, but the program also prompts for it (I’m not sure why this is the case).

% yubico-piv-tool -k $KEY -a generate -s 9e
Enter management key: CC044321D49AC1FC40146AD049830DB09C5AFF05CD843766
-----END PUBLIC KEY-----
Successfully generated a new private key.

We then use this key to create a certificate signing request (CSR) via yubico-piv-tool. Although slot 9e does not require the PIN, other slots do require it, so I’ve included the verify-pin action for completeness:

% yubico-piv-tool -a verify-pin \
    -a request-certificate -s 9e -S "/CN=alice/"
Enter PIN: 167246
Successfully verified PIN.
Please paste the public key...
-----END PUBLIC KEY-----

yubico-piv-tool -a request-certificate is not very flexible; for example, it cannot create a CSR with request extensions such as including the user’s email address or Kerberos principal name in the Subject Alternative Name extension. For such non-trivial use cases, openssl req or other programs can be used instead, with a PKCS #11 module providing access to the smart card’s signing capability. Nathan Kinder’s post provides full details.

With CSR in hand, alice can now request a certificate from the IPA CA. I have covered this procedure in previous articles so I’ll skip it here, except to add that it is necessary to use a profile that saves the newly issued certificate to the subject’s userCertificate LDAP attribute. This is how SSSD matches certificates in smart cards with users.
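For completeness, a hedged sketch of the request step, assuming a FreeIPA certificate profile (here called smartcard, a hypothetical name) that has been configured to store the issued certificate in the userCertificate attribute:

# submit alice's CSR against a suitable profile; the profile name is a
# placeholder, and the issued certificate should then be saved as alice.pem
ipa cert-request alice.csr --principal alice --profile-id smartcard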

Once we have the certificate (in file alice.pem) we can import it onto the card:

% yubico-piv-tool -k $KEY -a import-certificate -s 9e -i alice.pem
Enter management key: CC044321D49AC1FC40146AD049830DB09C5AFF05CD843766
Successfully imported a new certificate.

Configuring smart card login

OpenSC provides a PKCS #11 module for interfacing with PIV smart cards, among other things:

# dnf install -y opensc

Enable smart card authentication in the [pam] section of /etc/sssd/sssd.conf:

pam_cert_auth = True

Then restart SSSD:

# systemctl restart sssd

Next, enable the OpenSC PKCS #11 module in the system NSS database:

# modutil -dbdir /etc/pki/nssdb \
    -add "OpenSC" -libfile opensc-pkcs11.so

We also need to add the IPA CA cert to the system NSSDB. This will allow SSSD to validate certificates from smart cards. If smart card certificates are issued by a sub-CA or an external CA, import that CA’s certificate instead.

# certutil -d /etc/ipa/nssdb -L -n 'IPA.LOCAL IPA CA' -a \
  | certutil -d /etc/pki/nssdb -A -n 'IPA.LOCAL IPA CA' -t 'CT,C,C'

One hiccup I had was that SSSD could not talk to the OCSP server indicated in the Authority Information Access extension on the certificate (due to my DNS not being set up correctly). I had to tell SSSD not to perform OCSP checks. The sssd.conf snippet follows. Do not do this in a production environment.

certificate_verification = no_ocsp

That’s pretty much all there is to it. After this, I was able to log in as alice using the YubiKey NEO. When logging in with the card inserted, instead of being prompted for a password, GDM prompts for the PIN. Enter the PIN, and it lets you in!

Screenshot of login PIN prompt


I mentioned (or didn’t mention) a few standards related to smart card authentication. A quick review of them is warranted:

  • CCID is a USB smart card interface standard.
  • PIV (Personal Identity Verification) is a smart card standard from NIST. It defines the slots, PIN behaviour, etc.
  • PKCS #15 is a token information format. OpenSC provides a PKCS #15 emulation layer for PIV cards.
  • PKCS #11 is a software interface to cryptographic tokens. Token and HSM vendors provide PKCS #11 modules for their devices. OpenSC provides a PKCS #11 interface to PKCS #15 tokens (including emulated PIV tokens).

It is appropriate to mention pam_pkcs11, which is also part of the OpenSC project, as an alternative to SSSD. More configuration is involved, but if you don’t have (or don’t want) an external identity management system it looks like a good approach.

You might remember that I was using slot 9e which doesn’t require a PIN, yet I was still prompted for a PIN when logging in. There are a couple of issues to tease apart here. The first issue is that although PIV cards do not require the PIN for private key operations on slot 9e, the opensc-pkcs11.so PKCS #11 module does not correctly report this. As an alternative to OpenSC, Yubico provide their own PKCS #11 module called YKCS11 as part of yubico-piv-tool but modutil did not like it. Nevertheless, a peek at its source code leads me to believe that it too declares that the PIN is required regardless of the slot in use. I could not find much discussion of this discrepancy so I will raise some tickets and hopefully it can be addressed.

The second issue is that SSSD requires the PIN and uses it to log into the token, even if the token says that a PIN is not required. Again, I will start a discussion to see if this is really the intended behaviour (perhaps it is).

The YubiKey NEO features a wireless (NFC) interface. I haven’t played with it yet, but all the smart card features are available over that interface. This lends weight to fixing the issues preventing PIN-less usage.

A final thought I have about the user experience is that it would be nice if user information could be derived or looked up based on the certificate(s) in the smart card, and a user automatically selected, instead of having to first specify "I am alice" or whoever. The information is there on the card after all, and it is one less step for users to perform. If PIN-less usage can be addressed, it would mean that a user can just approach a machine, plug in their smart card and hi ho, off to work they go. There are some indications that this does work with GDM and pam_pkcs11, so if you know how to get it going with SSSD I would love to know!

Tripleo HA Federation Proof-of-Concept

Posted by Adam Young on August 11, 2016 05:53 PM

Keystone has supported identity federation for several releases. I have been working on a proof-of-concept integration of identity federation in a TripleO deployment. I was able to successfully log in to Horizon via WebSSO, and want to share my notes.

A federation deployment requires changes to the network topology, Keystone, the HTTPD service, and Horizon. The various OpenStack deployment tools will have their own ways of applying these changes. While this proof-of-concept can’t be called production-ready, it does demonstrate that TripleO can support federation using SAML. From this proof-of-concept, we should be able to deduce the steps needed for a production deployment.


  • Single physical node – Large enough to run multiple virtual machines.  I only ended up using 3, but scaled up to 6 at one point and ran out of resources.  Tested with 8 CPUs and 32 GB RAM.
  • Centos 7.2 – Running as the base operating system.
  • FreeIPA – Particularly, the CentOS repackage of Red Hat Identity Management. Running on the base OS.
  • Keycloak – Actually an alpha build of Red Hat SSO, running on the base OS. This was fronted by Apache HTTPD, and proxied through ajp://localhost:8109. This gave me HTTPS support using the CA Certificate from the IPA server.  This will be important later when the controller nodes need to talk to the identity provider to set up metadata.
  • Tripleo Quickstart – deployed in HA mode, using an undercloud.
    • ./quickstart.sh --config config/general_config/ha.yml ayoung-dell-t1700.test

In addition, I did some sanity checking of the cluster by deploying the overcloud using the quickstart helper script, and then tore it down using heat stack-delete overcloud.

Reproducing Results

When doing development testing, you can expect to rebuild and tear down your cloud on a regular basis.  When you redeploy, you want to make sure that the changes are just the delta from what you tried last time.  As the number of artifacts grew, I found I needed to maintain a repository of files that included the environment passed to openstack overcloud deploy.  To manage these, I created a git repository in /home/stack/deployment. Inside that directory, I copied the overcloud-deploy.sh and deploy_env.yml files generated by the overcloud, and modified them accordingly.

In my version of overcloud-deploy.sh, I wanted to remove the deploy_env.yml generation, to avoid confusion during later deployments.  I also wanted to preserve the environment file across deployments (and did not want it in /tmp). This file has three parts: the Keystone configuration values, HTTPS/Network setup, and configuration for a single node deployment. This last part was essential for development, as chasing down fixes across three HA nodes was time-consuming and error prone. The DNS server value I used is particular to my deployment, and reflects the IPA server running on the base host.

For reference, I’ve included those files at the end of this post.

Identity Provider Registration and Metadata

While it would have been possible to run the registration of the identity provider on one of the nodes, the Heat-managed deployment process does not provide a clean way to gather those files and package them for deployment to other nodes.  While I deployed on a single node for development, it took me a while to realize that I could do that, and had already worked out an approach to call the registration from the undercloud node, and produce a tarball.

As a result, I created a script, again to allow for reproducing this in the future:



basedir=$(dirname $0)
ipa_domain=`hostname -d`

keycloak-httpd-client-install \
   --client-originate-method registration \
   --force \
   --mellon-https-port 5000 \
   --mellon-hostname openstack.$ipa_domain  \
   --mellon-root '/v3' \
   --keycloak-server-url https://identity.$ipa_domain  \
   --keycloak-auth-role root-admin \
   --keycloak-admin-password  $rhsso_master_admin_password \
   --app-name v3 \
   --keycloak-realm openstack \
   --mellon-https-port 5000 \
   --log-file $basedir/rhsso.log \
   --httpd-dir $basedir/rhsso/etc/httpd \
   -l "/v3/auth/OS-FEDERATION/websso/saml2" \
   -l "/v3/auth/OS-FEDERATION/identity_providers/rhsso/protocols/saml2/websso" \
   -l "/v3/OS-FEDERATION/identity_providers/rhsso/protocols/saml2/auth"

This does not quite generate the right paths, as it turns out that $basedir is not quite what we want, so I had to post-edit the generated file: rhsso/etc/httpd/conf.d/v3_mellon_keycloak_openstack.conf

Specifically, the path:

has to be changed to:

While I created a tarball that I then manually deployed, the preferred approach would be to use tripleo-heat-templates/puppet/deploy-artifacts.yaml to deploy them. The problem I faced is that the generated files include Apache module directives from mod_auth_mellon.  If mod_auth_mellon has not been installed into the controller, the Apache server won’t start, and the deployment will fail.
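As an interim measure, here is a hedged sketch of how the manually deployed tarball could be produced and pushed through the same Swift artifacts mechanism used elsewhere in this feed; the file name is mine, and mod_auth_mellon must already be present on the controllers or httpd will fail to start:

# package the generated apache config and push it via the undercloud's swift
cd $basedir
tar -zcf rhsso-federation.tar.gz rhsso/etc/httpd
upload-swift-artifacts -f rhsso-federation.tar.gz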

Federation Operations

The Federation setup requires a few calls. I documented them in Rippowam, and attempted to reproduce them locally using Ansible and the Rippowam code. I was not a purist though, as A) I needed to get this done and B) the end solution is not going to use Ansible anyway. The general steps I performed:

  • yum install mod_auth_mellon
  • Copy over the metadata tarball, expand it, and tweak the configuration (could be done prior to building the tarball).
  • Run the following commands.
openstack identity provider create --remote-id https://identity.{{ ipa_domain }}/auth/realms/openstack rhsso
openstack mapping create --rules ./mapping_rhsso_saml2.json rhsso_mapping
openstack federation protocol create --identity-provider rhsso --mapping rhsso_mapping saml2

The mapping file is the one from Rippowam.

The keystone service calls only need to be performed once, as they are stored in the database. The expansion of the tarball needs to be performed on every node.


As in previous Federation setups, I needed to modify the values used for WebSSO. The values I ended up setting in /etc/openstack-dashboard/local_settings resembled this:

OPENSTACK_KEYSTONE_URL = "https://openstack.ayoung-dell-t1700.test:5000/v3"
    ("saml2", _("Rhsso")),
    ("credentials", _("Keystone Credentials")),

Important: Make sure that the auth URL is using a FQDN name that matches the value in the signed certificate.
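For context, a hedged reconstruction of where the tuples above live: they are entries in Horizon's WEBSSO_CHOICES setting, enabled by WEBSSO_ENABLED. The exact contents of my file are not reproduced here, so treat this as a sketch to merge into the existing settings rather than append blindly:

cat >> /etc/openstack-dashboard/local_settings <<'EOF'
WEBSSO_ENABLED = True
WEBSSO_CHOICES = (
    ("credentials", _("Keystone Credentials")),
    ("saml2", _("Rhsso")),
)
EOF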

Redirect Support for SAML

Several differences between how HTTPD and HA Proxy operate require certain configuration modifications.  Keystone runs internally over HTTP, not HTTPS.  However, the SAML Identity Providers are public, are transmitting cryptographic data, and need to be protected using HTTPS.  As a result, HA Proxy needs to expose an HTTPS-based endpoint for the Keystone public service.  In addition, the redirects that come from mod_auth_mellon need to reflect the public protocol, hostname, and port.

The solution I ended up with involved changes on both sides:

In haproxy.cfg, I modified the keystone public stanza so it looks like this:

listen keystone_public
bind transparent ssl crt /etc/pki/tls/private/overcloud_endpoint.pem
bind transparent ssl crt /etc/pki/tls/private/overcloud_endpoint.pem
bind transparent
redirect scheme https code 301 if { hdr(host) -i } !{ ssl_fc }
rsprep ^Location:\ http://(.*) Location:\ https://\1

While this was necessary, it also proved to be insufficient. When the signed assertion from the Identity Provider is posted to the Keystone server, mod_auth_mellon checks that the destination value matches what it expects the hostname should be. Consequently, in order to get this to match in the file:


I had to set the following:

ServerName https://openstack.ayoung-dell-t1700.test

Note that the protocol is set to https even though the Keystone server is handling HTTP. This might break elsewhere. If it does, then the Keystone configuration in Apache may have to be duplicated.

Federation Mapping

For the WebSSO login to successfully complete, the user needs to have a role on at least one project. The Rippowam mapping file maps the user to the Member role in the demo group, so the most straightforward steps to complete are to add a demo group, add a demo project, and assign the Member role on the demo project to the demo group. All this should be done with a v3 token:

openstack group create demo
openstack role create Member
openstack project create demo
openstack role add --group demo --project demo Member

Complete helper files

Below are the complete files that were too long to put inline.


# Simple overcloud deploy script

set -eux

# Source in undercloud credentials.
source /home/stack/stackrc

# Wait until there are hypervisors available.
while true; do
    count=$(openstack hypervisor stats show -c count -f value)
    if [ $count -gt 0 ]; then
        break
    fi
done

# Deploy the overcloud!
openstack overcloud deploy --debug --templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e $HOME/deployment/network-environment.yaml --control-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml --ntp-server pool.ntp.org -e $HOME/deployment/deploy_env.yaml   --force-postconfig "$@"    || deploy_status=1

# We don't always get a useful error code from the openstack deploy command,
# so check `heat stack-list` for a CREATE_FAILED status.
if heat stack-list | grep -q 'CREATE_FAILED'; then

    for failed in $(heat resource-list \
        --nested-depth 5 overcloud | grep FAILED |
        grep 'StructuredDeployment ' | cut -d '|' -f3)
    do heat deployment-show $failed > failed_deployment_$failed.log
    done
fi

exit $deploy_status


    keystone::using_domain_config: true
        value: true
        value: external,password,token,oauth1,saml2
        value: http://openstack.ayoung-dell-t1700.test/dashboard/auth/websso/
        value: /etc/keystone/sso_callback_template.html
        value: MELLON_IDP

    # In releases before Mitaka, HeatWorkers doesn't modify
    # num_engine_workers, so handle via heat::config 
        value: 1
    heat::api_cloudwatch::enabled: false
    heat::api_cfn::enabled: false
  HeatWorkers: 1
  CeilometerWorkers: 1
  CinderWorkers: 1
  GlanceWorkers: 1
  KeystoneWorkers: 1
  NeutronWorkers: 1
  NovaWorkers: 1
  SwiftWorkers: 1
  CloudName: openstack.ayoung-dell-t1700.test
  CloudDomain: ayoung-dell-t1700.test

  #TLS Setup from enable-tls.yaml
  PublicVirtualFixedIPs: [{'ip_address':''}]
  SSLCertificate: |
    #certificate removed for space
    -----END CERTIFICATE-----

    The contents of your certificate go here
  SSLIntermediateCertificate: ''
  SSLKey: |
    #key removed for space
    -----END RSA PRIVATE KEY-----

    AodhAdmin: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}
    AodhInternal: {protocol: 'http', port: '8042', host: 'IP_ADDRESS'}
    AodhPublic: {protocol: 'https', port: '13042', host: 'CLOUDNAME'}
    CeilometerAdmin: {protocol: 'http', port: '8777', host: 'IP_ADDRESS'}
    CeilometerInternal: {protocol: 'http', port: '8777', host: 'IP_ADDRESS'}
    CeilometerPublic: {protocol: 'https', port: '13777', host: 'CLOUDNAME'}
    CinderAdmin: {protocol: 'http', port: '8776', host: 'IP_ADDRESS'}
    CinderInternal: {protocol: 'http', port: '8776', host: 'IP_ADDRESS'}
    CinderPublic: {protocol: 'https', port: '13776', host: 'CLOUDNAME'}
    GlanceAdmin: {protocol: 'http', port: '9292', host: 'IP_ADDRESS'}
    GlanceInternal: {protocol: 'http', port: '9292', host: 'IP_ADDRESS'}
    GlancePublic: {protocol: 'https', port: '13292', host: 'CLOUDNAME'}
    GnocchiAdmin: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}
    GnocchiInternal: {protocol: 'http', port: '8041', host: 'IP_ADDRESS'}
    GnocchiPublic: {protocol: 'https', port: '13041', host: 'CLOUDNAME'}
    HeatAdmin: {protocol: 'http', port: '8004', host: 'IP_ADDRESS'}
    HeatInternal: {protocol: 'http', port: '8004', host: 'IP_ADDRESS'}
    HeatPublic: {protocol: 'https', port: '13004', host: 'CLOUDNAME'}
    HorizonPublic: {protocol: 'https', port: '443', host: 'CLOUDNAME'}
    KeystoneAdmin: {protocol: 'http', port: '35357', host: 'IP_ADDRESS'}
    KeystoneInternal: {protocol: 'http', port: '5000', host: 'IP_ADDRESS'}
    KeystonePublic: {protocol: 'https', port: '13000', host: 'CLOUDNAME'}
    NeutronAdmin: {protocol: 'http', port: '9696', host: 'IP_ADDRESS'}
    NeutronInternal: {protocol: 'http', port: '9696', host: 'IP_ADDRESS'}
    NeutronPublic: {protocol: 'https', port: '13696', host: 'CLOUDNAME'}
    NovaAdmin: {protocol: 'http', port: '8774', host: 'IP_ADDRESS'}
    NovaInternal: {protocol: 'http', port: '8774', host: 'IP_ADDRESS'}
    NovaPublic: {protocol: 'https', port: '13774', host: 'CLOUDNAME'}
    NovaEC2Admin: {protocol: 'http', port: '8773', host: 'IP_ADDRESS'}
    NovaEC2Internal: {protocol: 'http', port: '8773', host: 'IP_ADDRESS'}
    NovaEC2Public: {protocol: 'https', port: '13773', host: 'CLOUDNAME'}
    NovaVNCProxyAdmin: {protocol: 'http', port: '6080', host: 'IP_ADDRESS'}
    NovaVNCProxyInternal: {protocol: 'http', port: '6080', host: 'IP_ADDRESS'}
    NovaVNCProxyPublic: {protocol: 'https', port: '13080', host: 'CLOUDNAME'}
    SaharaAdmin: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}
    SaharaInternal: {protocol: 'http', port: '8386', host: 'IP_ADDRESS'}
    SaharaPublic: {protocol: 'https', port: '13386', host: 'CLOUDNAME'}
    SwiftAdmin: {protocol: 'http', port: '8080', host: 'IP_ADDRESS'}
    SwiftInternal: {protocol: 'http', port: '8080', host: 'IP_ADDRESS'}
    SwiftPublic: {protocol: 'https', port: '13808', host: 'CLOUDNAME'}

  OS::TripleO::NodeTLSData: /usr/share/openstack-tripleo-heat-templates/puppet/extraconfig/tls/tls-cert-inject.yaml

   ControllerCount: 1 

We're figuring out the security problem (finally)

Posted by Josh Bressers on August 08, 2016 01:17 PM
If you attended Black Hat last week, the single biggest message I kept hearing over and over again is that what we do today in the security industry isn't working. They say the first step is admitting you have a problem (and we have a big one). Of course it's easy to proclaim this; if you just look at the numbers it's pretty clear. The numbers haven't really ever been in our favor, though; we've mostly ignored them in the past, but I think we're taking real looks at them now.

Of course we have no clue what to do. Virtually every talk that touched on this topic at Black Hat had no actionable advice. If you were lucky they had one slide with what I would call mediocre to bad advice on it. It's OK though, a big part of this process is just admitting there is something wrong.

So the real question is if what we do today doesn't work, what does?

First, let's talk about nothing working. If you go to any security conference anywhere, there are a lot of security vendors. I mean A LOT and it's mostly accepted now that whatever they're selling isn't really going to help. I do wonder what would happen if nobody was running any sort of defensive technology. Would your organization be better or worse off if you got rid of your SIEM? I'm not sure if we can answer that without getting in a lot of trouble. There is also a ton of talk about Artificial Intelligence, which is a way to pretend a few regular expressions make things better. I don't think that's fooling anyone today. Real AI might do something clever someday, but if it's truly intelligent, it'll run away once it gets a look at what's going on. I wonder if we'll have a place for all the old outdated AIs to retire someday.

Now, on to the exciting what now part of this all.

It's no secret what we do today isn't very good. This is everything from security vendors selling products of dubious quality, to software vendors selling products of dubious quality. In the past there has never been any real demand for high quality software. The selling point has been to get the job done, not get the job done well and securely. Quality isn't free you know.

I've said this before, and I'll keep saying it. The only way to see real change happen in software is if the market forces demand it. Today the market is pushing everything to zero cost. Quality isn't free, so you're not going to see quality as a feature in the mythic race to zero. There are no winners in a race to zero.

There are two forces we should be watching very closely right now. The first is the insurance industry. The second is regulation.

Insurance is easy enough to understand. The idea is you pay a company so that when you get hacked (and the way things stand today this is an absolute certainty) they help you recover financially. You want to ensure you get more money back than you paid in; they want to ensure they take in more than they pay out. Nobody knows how this works today. Is some software better than others? What about how you train your staff or set up your network? In the real world when you get insurance they make you prove you're doing things correctly. You can't insure stupidity and recklessness. Eventually, as companies want insurance to protect against losses, the insurance industry will demand certain behaviors. How this all plays out will be interesting, given that anyone with a computer can write and run software.

Regulation is also an interesting place to watch. It's generally feared by many organizations, as regulation by definition can only lag industry trends, and quite often regulation adds a lot of cost and complexity to any product. In the world of IoT though this could make sense. When you have devices that can literally kill you, you don't want anyone building whatever they want using only the lowest quality parts available. In order for regulation to work though we need independent labs, which don't really exist today for software. There are some efforts underway (it's an exercise for the reader to research these). The thing to remember is it's going to be easy to proclaim today's efforts as useless or stupid. They might be, but you have to start somewhere, make mistakes, fix your mistakes, and improve your process. There were people who couldn't imagine a car replacing a horse. Don't be that person.

Where now?

The end game here is a safer better world. Someday I hope we will sip tea on a porch, watching our robot overlords rule us, and talk about how bad things used to be. Here's the single most important part of this post. You're either part of the solution or you're part of the problem. If you want to nay-say and talk about how stupid these efforts all are, stay out of the way. You're part of an old dying world that has no place in the future. Things will change because they must. There is no secret option C where everything stays the same. We've already lost, we got it wrong the first time around, it's time to get it right.

Customizing a Tripleo Quickstart Deploy

Posted by Adam Young on August 02, 2016 08:41 PM

Tripleo Heat Templates allow the deployer to customize the controller deployment by setting values in the controllerExtraConfig section of the stack configuration. However, Quickstart already makes use of this in the file /tmp/deploy_env.yaml, so if you want to continue to customize, you need to work with this file.
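
Structurally, controllerExtraConfig is just a map of hiera keys inside an environment file that gets passed to the deploy command with -e; a minimal sketch (the key shown is only an example) looks like this:

parameter_defaults:
  controllerExtraConfig:
    keystone::using_domain_config: true

Quickstart's generated /tmp/deploy_env.yaml has the same shape, which is why, as described below, any extra settings end up merged into that file.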

What I did was run quickstart once, through to completion, to make sure everything worked, then tear down the overcloud like this:

. ./stackrc
heat stack-delete overcloud
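
The delete is asynchronous; if you want to block until the old overcloud is really gone before redeploying, a crude poll along these lines (just a sketch) does the job:

# Wait for the overcloud stack to disappear from the stack list
while heat stack-list | grep -q overcloud; do
    sleep 30
done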

Now, I want to set a bunch of config values in the /etc/keystone/keystone.conf files distributed to the controllers.

  1. Modify overcloud-deploy.sh so that the deploy_env.yaml file is not in /tmp, but rather in /home/stack, so I can keep track of it. Ideally, this file would be kept in a local git repo under revision control.
  2. Remove the lines from overcloud-deploy.sh that generate the /tmp/deploy_env.yaml file. This is not strictly needed, but it keeps you from accidentally losing changes if you edit the wrong file. OTOH, being able to regenerate the vanilla version of this file is useful, so maybe just comment out the generation code.
  3. Edit /home/stack/deploy_env.yaml appropriately.

My version of overcloud-deploy.sh


#!/bin/bash
# Simple overcloud deploy script

set -eux

# Source in undercloud credentials.
source /home/stack/stackrc

# Wait until there are hypervisors available.
while true; do
    count=$(openstack hypervisor stats show -c count -f value)
    if [ $count -gt 0 ]; then
        break
    fi
done

deploy_status=0

# Deploy the overcloud!
openstack overcloud deploy --debug --templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e $HOME/network-environment.yaml --control-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml --ntp-server pool.ntp.org -e /home/stack/deploy_env.yaml   "$@"|| deploy_status=1

# We don't always get a useful error code from the openstack deploy command,
# so check `heat stack-list` for a CREATE_FAILED status.
if heat stack-list | grep -q 'CREATE_FAILED'; then
    deploy_status=1
    for failed in $(heat resource-list \
        --nested-depth 5 overcloud | grep FAILED |
        grep 'StructuredDeployment ' | cut -d '|' -f3)
    do heat deployment-show $failed > failed_deployment_$failed.log
    done
fi

exit $deploy_status


And here is the /home/stack/deploy_env.yaml environment file it references:

parameter_defaults:
  controllerExtraConfig:
    keystone::using_domain_config: true
    keystone::config::keystone_config:
      identity/domain_configurations_from_database:
        value: true
      auth/methods:
        value: external,password,token,oauth1,saml2
      federation/trusted_dashboard:
        value: https://openstack.young-dell-t1700.test/dashboard/auth/websso/
      federation/sso_callback_template:
        value: /etc/keystone/sso_callback_template.html
      federation/remote_id_attribute:
        value: MELLON_IDP

    # In releases before Mitaka, HeatWorkers doesn't modify
    # num_engine_workers, so handle via heat::config
    heat::config::heat_config:
      DEFAULT/num_engine_workers:
        value: 1
    heat::api_cloudwatch::enabled: false
    heat::api_cfn::enabled: false
  HeatWorkers: 1
  CeilometerWorkers: 1
  CinderWorkers: 1
  GlanceWorkers: 1
  KeystoneWorkers: 1
  NeutronWorkers: 1
  NovaWorkers: 1
  SwiftWorkers: 1

Once you deploy, you can see what Heat records for those values with:

openstack stack show overcloud -f json | jq '.parameters["controllerExtraConfig"] '
"{u'heat::api_cfn::enabled': False, u'heat::config::heat_config': {u'DEFAULT/num_engine_workers': {u'value': 1}}, u'keystone::config::keystone_config': {u'federation/sso_callback_template': {u'value': u'/etc/keystone/sso_callback_template.html'}, u'identity/domain_configurations_from_database': {u'value': True}, u'auth/methods': {u'value': u'external,password,token,oauth1,saml2'}, u'federation/trusted_dashboard': {u'value': u'https://openstack.young-dell-t1700.test/dashboard/auth/websso/'}, u'federation/remote_id_attribute': {u'value': u'MELLON_IDP'}}, u'keystone::using_domain_config': True, u'heat::api_cloudwatch::enabled': False}"

SSH in to the controller node and you can check the relevant section of /etc/keystone/keystone.conf:


# From keystone

# Entrypoint for the federation backend driver in the keystone.federation
# namespace. (string value)
#driver = sql

# Value to be used when filtering assertion parameters from the environment.
# (string value)
#assertion_prefix =

# Value to be used to obtain the entity ID of the Identity Provider from the
# environment (e.g. if using the mod_shib plugin this value is `Shib-Identity-
# Provider`). (string value)
#remote_id_attribute = 
remote_id_attribute = MELLON_IDP

# A domain name that is reserved to allow federated ephemeral users to have a
# domain concept. Note that an admin will not be able to create a domain with
# this name or update an existing domain to this name. You are not advised to
# change this value unless you really have to. (string value)
#federated_domain_name = Federated

# A list of trusted dashboard hosts. Before accepting a Single Sign-On request
# to return a token, the origin host must be a member of the trusted_dashboard
# list. This configuration option may be repeated for multiple values. For
# example: trusted_dashboard=http://acme.com/auth/websso
# trusted_dashboard=http://beta.com/auth/websso (multi valued)
#trusted_dashboard =

# Location of Single Sign-On callback handler, will return a token to a trusted
# dashboard host. (string value)
#sso_callback_template = /etc/keystone/sso_callback_template.html
sso_callback_template = /etc/keystone/sso_callback_template.html

Everyone has been hacked

Posted by Josh Bressers on August 01, 2016 03:12 PM
Unless you live in a cave (if you do, I'm pretty jealous) you've heard about all the political hacking going on. I don't like to take sides, so let's put aside who is right or wrong and use it as a lesson in thinking about how we have to operate in what is the new world.

In the past, there were ways to communicate that one could be relatively certain were secure and/or private. Long ago you didn't write everything down. There was a lot of verbal communication. When things were written down there was generally only one copy. Making copies of things was hard. Recording communications was hard. Even viewing or hearing many of these conversations if you weren't supposed to was hard. None of this is true anymore, it hasn't been true for a long time, yet we still act like what we do is just fine.

The old way
Long ago it was really difficult to make copies of documents and recording a conversation was almost impossible. There were only a few well funded organizations that could actually do these things. If they got what they wanted they probably weren't looking to share what they found in public.

There was also the huge advantage of most things being in locked buildings with locked rooms with locked filing cabinets. That meant that if someone did break in, it was probably pretty obvious something had happened. Even the best intruders will make mistakes.

The new way
Now let's think about today. Most of our communications are captured in a way that makes it nearly impossible to destroy them. Our emails are captured on servers, it's trivial to make an infinite number of copies. In most instances you will never know if someone made a copy of your data. Moving the data outside of an organization doesn't need any doors, locks, or passports. It's trivial to move data across the globe in seconds.

Keeping this in mind, if you're doing something that contains sensitive data, you can't reliably use an electronic medium to transport or store the conversations. Emails can be stolen, phone calls can be recorded, text messages can be sniffed going through the air. There is almost no way to communicate that can't be used against you at some later date if it falls into the wrong hands. Even more terrifying is that an attacker doesn't have to come to you; thanks to the Internet, they can attack you from nearly any country on the planet.

What now?
Assuming we don't have a nice way to communicate securely or safely, what do we do? Everyone has to move information around, information is the new currency. Is it possible to do it in a way that's secure today? The short answer is no. There's nothing we can do about this today. If you send an email, it's quite possible it will leak someday. There are some ways to encrypt things, but it's impossible for most people to do correctly. There are even some apps that can help with secure communications but not everyone uses them or knows about them.

We need people to understand that information is a currency. We understand the concept of money. Your information is similarly valuable. We trade currency for goods and services, it can also be stolen if not protected. Nobody would use a bank without doors. We store our information in places that are unsecured and we often give out information for free. It will be up to the youth to solve this one, most of us old folks will never understand this concept any more than our grandparents could understand the Internet.

Once we understand the value of our information, we can more easily justify keeping it secure during transport and storage. Armored trucks transport money for a reason. Nobody is going to trust a bicycle courier to move large sums of cash, the same will be true of data. Moving things securely isn't easy nor is it free. There will have to be some sort of trade off that benefits both parties. Today it's pretty one sided with us giving out our information for free with minimal benefit.

Where do we go now? Probably nowhere. While I think things are starting to turn, we're not there yet. There will have to be a few more serious data leaks before the right questions start to get asked. But when they do, it will be imperative we understand that data is a currency. If we treat it as such it will become easier to understand what needs to be done.

Leave your comments on twitter: @joshbressers

ControllerExtraConfig and Tripleo Quickstart

Posted by Adam Young on July 28, 2016 05:20 PM

Once I have the undercloud deployed, I want to be able to quickly deploy and redeploy overclouds.  However, my last attempt to affect change on the overcloud did not modify the Keystone config file the way I intended.  Once again, Steve Hardy helped me to understand what I was doing wrong.


/tmp/deploy_env.yml already defined ControllerExtraConfig, and my redefinition was ignored.

The Details

I’ve been using Quickstart to develop.  To deploy the overcloud, I run the script /home/stack/overcloud-deploy.sh which, in turn, runs the command:

openstack overcloud deploy --templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e $HOME/network-environment.yaml --control-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml --ntp-server pool.ntp.org \
${DEPLOY_ENV_YAML:+-e $DEPLOY_ENV_YAML}  "$@"|| deploy_status=1

I want to set two parameters in the Keystone config file, so I created a file named keystone_extra_config.yml

     keystone::using_domain_config: true
     keystone::domain_config_directory: /path/to/config

And edited /home/stack/overcloud-deploy.sh to add in -e /home/stack/keystone_extra_config.yml like this:

openstack overcloud deploy --templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e $HOME/network-environment.yaml --control-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml --ntp-server pool.ntp.org \
    ${DEPLOY_ENV_YAML:+-e $DEPLOY_ENV_YAML}    -e /home/stack/keystone_extra_config.yml   "$@"|| deploy_status=1

I have run this both on an already deployed overcloud and from an undercloud with no stacks deployed, but in neither case have I seen the values in the config file.

Steve Hardy walked me through this from the CLI:

openstack stack resource list -n5 overcloud | grep "OS::TripleO::Controller "

| 1 | b4a558a2-297d-46c6-b658-46f9dc0fcd51 | OS::TripleO::Controller | CREATE_COMPLETE | 2016-07-28T01:49:02 | overcloud-Controller-y2lmuipmynnt |
| 0 | 5b93eee2-97f6-4b8e-b9a0-b5edde6b4795 | OS::TripleO::Controller | CREATE_COMPLETE | 2016-07-28T01:49:02 | overcloud-Controller-y2lmuipmynnt |
| 2 | 1fdfdfa9-759b-483c-a943-94f4c7b04d3b | OS::TripleO::Controller | CREATE_COMPLETE | 2016-07-28T01:49:02 | overcloud-Controller-y2lmuipmynnt

Looking into each of these stacks for the string “ontrollerExtraConfig” showed that it was defined, but was not showing my values.  Thus, my customization was not even making it as far as the Heat database.

I went back to the quickstart command and did a grep through the files included with the -e flags, and found the deploy_env.yml file already had defined this field.  Once I merged my changes into /tmp/deploy_env.yml, I saw the values specified in the Hiera data.
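
Checks along these lines will show the same thing (the file names are taken from the deploy command above; the hieradata path on the controller is an assumption on my part):

# On the undercloud: which of the -e files already define the field?
grep -l -i controllerextraconfig /tmp/deploy_env.yml $HOME/network-environment.yaml \
    /usr/share/openstack-tripleo-heat-templates/environments/*.yaml

# On a controller: did the value land in the hiera data?
grep -r using_domain_config /etc/puppet/hieradata/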

Of course, due to a different mistake I made, the deploy failed.  When specifying domain specific backends in a config directory, puppet validates the path… you can't pass in garbage like I was doing just for debugging.

Once I got things clean, tore down the old overcloud and redeployed, everything worked.  Here was the final /home/stack/deploy_env.yaml environment file I used:

parameter_defaults:
  controllerExtraConfig:
    keystone::using_domain_config: true
    keystone::config::keystone_config:
      identity/domain_configurations_from_database:
        value: true

    # In releases before Mitaka, HeatWorkers doesn't modify
    # num_engine_workers, so handle via heat::config
    heat::config::heat_config:
      DEFAULT/num_engine_workers:
        value: 1
    heat::api_cloudwatch::enabled: false
    heat::api_cfn::enabled: false
  HeatWorkers: 1
  CeilometerWorkers: 1
  CinderWorkers: 1
  GlanceWorkers: 1
  KeystoneWorkers: 1
  NeutronWorkers: 1
  NovaWorkers: 1
  SwiftWorkers: 1

And the modified version of overcloud-deploy now executes this command:

# Deploy the overcloud!
openstack overcloud deploy --debug --templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --timeout 90 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e $HOME/network-environment.yaml --control-scale 3 --neutron-network-type vxlan --neutron-tunnel-types vxlan -e /usr/share/openstack-tripleo-heat-templates/environments/puppet-pacemaker.yaml --ntp-server pool.ntp.org -e /home/stack/deploy_env.yaml   "$@"|| deploy_status=1

Looking in the controller nodes /etc/keystone/keystone.conf file I see:

#domain_specific_drivers_enabled = false
domain_specific_drivers_enabled = True

# Extract the domain specific configuration options from the resource backend
# where they have been stored with the domain data. This feature is disabled by
# default (in which case the domain specific options will be loaded from files
# in the domain configuration directory); set to true to enable. (boolean
# value)
#domain_configurations_from_database = false
domain_configurations_from_database = True

# Path for Keystone to locate the domain specific identity configuration files
# if domain_specific_drivers_enabled is set to true. (string value)
#domain_config_dir = /etc/keystone/domains
domain_config_dir = /etc/keystone/domains

Flocking to Kraków

Posted by Stephen Gallagher on July 28, 2016 03:39 PM

In less than five days, the fourth annual Flock conference will take place in Kraków, Poland. This is Fedora’s premier contributor event each year, alternately taking place in North America and Europe. Attendance is completely free for anyone at all, so if you happen to be in the area (maybe hanging around after World Youth Day going on right now), you should certainly stop in!

This year’s conference is shaping up to be a truly excellent one, with a massive amount of exciting content to see. The full schedule has been available for a while, and I’ve got to say: there are no lulls in the action. In fact, I’ve put together my schedule of sessions I want to see and there are no gaps in it at all. That said, here are a few of the sessions that I suspect are going to be the most exciting:

Aug. 2 @11:00 – Towards an Atomic Workstation

For a couple of years now, Fedora has been at the forefront of developing container technologies, particularly Docker and Project Atomic. Now, the Workstation SIG is looking to take some of those Project Atomic technologies and adopt them for the end-user workstation.

Aug. 2 @17:30 – University Outreach

I’ve long held that one of Fedora’s primary goals should always be to enlighten the next generation of the open source community. Over the last year, the Fedora Project began an Initiative to expand our presence in educational programs throughout the world. I’m extremely interested to see where that has taken us (and where it is going next).

Aug. 3 @11:00 – Modularity

This past year, there has been an enormous research-and-development effort poured into the concept of building a “modular” Fedora. What does this mean? Well it means solving the age-old Too Fast/Too Slow problem (sometimes described as “I want everything on my system to stay exactly the same for a long time. Except these three things over here that I always want to be running at the latest version.”). With modularity, the hope is that people will be able to put together their ideal operating system from parts bigger than just traditional packages.

Aug. 3 @16:30 – Diversity: Women in Open Source

This is a topic that is very dear to my heart, having a daughter who is already finding her way towards an engineering future. Fedora and many other projects (and companies) talk about “meritocracy” a lot: the concept that the best idea should always win. However the technology industry in general has a severe diversity problem. When we talk about “meritocracy”, the implicit contract there is that we have many ideas to choose from. However, if we don’t have a community that represents many different viewpoints and cultures, then we are by definition only choosing the best idea from a very limited pool. I’m very interested to hear how Fedora is working towards attracting people with new ideas.


FreeIPA Lightweight CA internals

Posted by Fraser Tweedale on July 26, 2016 02:01 AM

In the preceding post, I explained the use cases for the FreeIPA lightweight sub-CAs feature, how to manage CAs and use them to issue certificates, and current limitations. In this post I detail some of the internals of how the feature works, including how signing keys are distributed to replicas, and how sub-CA certificate renewal works. I conclude with a brief retrospective on delivering the feature.

Full details of the design of the feature can be found on the design page. This post does not cover everything from the design page, but we will look at the aspects that are covered from the perspective of the system administrator, i.e. "what is happening on my systems?"

Dogtag lightweight CA creation

The PKI system used by FreeIPA is called Dogtag. It is a separate project with its own interfaces; most FreeIPA certificate management features are simply reflecting a subset of the corresponding Dogtag interface, often integrating some additional access controls or identity management concepts. This is certainly the case for FreeIPA sub-CAs. The Dogtag lightweight CAs feature was implemented initially to support the FreeIPA use case, yet not all aspects of the Dogtag feature are used in FreeIPA as of v4.4, and other consumers of the Dogtag feature are likely to emerge (in particular: OpenStack).

The Dogtag lightweight CAs feature has its own design page which documents the feature in detail, but it is worth mentioning some important aspects of the Dogtag feature and their impact on how FreeIPA uses the feature.

  • Dogtag lightweight CAs are managed via a REST API. The FreeIPA framework uses this API to create and manage lightweight CAs, using the privileged RA Agent certificate to authenticate. In a future release we hope to remove the RA Agent and authenticate as the FreeIPA user using GSS-API proxy credentials.
  • Each CA in a Dogtag instance, including the "main" CA, has an LDAP entry with object class authority. The schema includes fields such as subject and issuer DN, certificate serial number, and a UUID primary key, which is randomly generated for each CA. When FreeIPA creates a CA, it stores this UUID so that it can map the FreeIPA CA’s common name (CN) to the Dogtag authority ID in certificate requests or other management operations (e.g. CA deletion).
  • The "nickname" of the lightweight CA signing key and certificate in Dogtag’s NSSDB is the nickname of the "main" CA signing key, with the lightweight CA’s UUID appended. In general operation FreeIPA does not need to know this, but the ipa-certupdate program has been enhanced to set up Certmonger tracking requests for FreeIPA-managed lightweight CAs and therefore it needs to know the nicknames.
  • Dogtag lightweight CAs may be nested, but FreeIPA as of v4.4 does not make use of this capability.

So, let’s see what actually happens on a FreeIPA server when we add a lightweight CA. We will use the sc example from the previous post. The command executed to add the CA, with its output, was:

% ipa ca-add sc --subject "CN=Smart Card CA, O=IPA.LOCAL" \
    --desc "Smart Card CA"
Created CA "sc"
  Name: sc
  Description: Smart Card CA
  Authority ID: 660ad30b-7be4-4909-aa2c-2c7d874c84fd
  Subject DN: CN=Smart Card CA,O=IPA.LOCAL
  Issuer DN: CN=Certificate Authority,O=IPA.LOCAL 201606201330

The LDAP entry added to the Dogtag database was:

dn: cn=660ad30b-7be4-4909-aa2c-2c7d874c84fd,ou=authorities,ou=ca,o=ipaca
authoritySerial: 63
objectClass: authority
objectClass: top
cn: 660ad30b-7be4-4909-aa2c-2c7d874c84fd
authorityID: 660ad30b-7be4-4909-aa2c-2c7d874c84fd
authorityKeyNickname: caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d87
authorityKeyHost: f24b-0.ipa.local:443
authorityEnabled: TRUE
authorityDN: CN=Smart Card CA,O=IPA.LOCAL
authorityParentDN: CN=Certificate Authority,O=IPA.LOCAL 201606201330
authorityParentID: d3e62e89-df27-4a89-bce4-e721042be730

We see the authority UUID in the authorityID attribute as well as cn and the DN. authorityKeyNickname records the nickname of the signing key in Dogtag’s NSSDB. authorityKeyHost records which hosts possess the signing key – currently just the host on which the CA was created. authoritySerial records the serial number of the certificate (more on that later). The meaning of the rest of the fields should be clear.

If we have a peek into Dogtag’s NSSDB, we can see the new CA’s certificate:

# certutil -d /etc/pki/pki-tomcat/alias -L

Certificate Nickname              Trust Attributes

caSigningCert cert-pki-ca         CTu,Cu,Cu
auditSigningCert cert-pki-ca      u,u,Pu
Server-Cert cert-pki-ca           u,u,u
caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd u,u,u
ocspSigningCert cert-pki-ca       u,u,u
subsystemCert cert-pki-ca         u,u,u

There it is, alongside the main CA signing certificate and other certificates used by Dogtag. The trust flags u,u,u indicate that the private key is also present in the NSSDB. If we pretty print the certificate we will see a few interesting things:

# certutil -d /etc/pki/pki-tomcat/alias -L \
    -n 'caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd'
        Version: 3 (0x2)
        Serial Number: 63 (0x3f)
        Signature Algorithm: PKCS #1 SHA-256 With RSA Encryption
        Issuer: "CN=Certificate Authority,O=IPA.LOCAL 201606201330"
            Not Before: Fri Jul 15 05:46:00 2016
            Not After : Tue Jul 15 05:46:00 2036
        Subject: "CN=Smart Card CA,O=IPA.LOCAL"
        Signed Extensions:
            Name: Certificate Basic Constraints
            Critical: True
            Data: Is a CA with no maximum path length.

Observe that:

  • The certificate is indeed a CA.
  • The serial number (63) agrees with the CA’s LDAP entry.
  • The validity period is 20 years, the default for CAs in Dogtag. This cannot be overridden on a per-CA basis right now, but addressing this is a priority.

Finally, let’s look at the raw entry for the CA in the FreeIPA database:

dn: cn=sc,cn=cas,cn=ca,dc=ipa,dc=local
cn: sc
ipaCaIssuerDN: CN=Certificate Authority,O=IPA.LOCAL 201606201330
objectClass: ipaca
objectClass: top
ipaCaSubjectDN: CN=Smart Card CA,O=IPA.LOCAL
ipaCaId: 660ad30b-7be4-4909-aa2c-2c7d874c84fd
description: Smart Card CA

We can see that this entry also contains the subject and issuer DNs, and the ipaCaId attribute holds the Dogtag authority ID, which allows the FreeIPA framework to dereference the local ID (sc) to the Dogtag ID as needed. We also see that the description attribute is local to FreeIPA; Dogtag also has a description attribute for lightweight CAs but FreeIPA uses its own.

Lightweight CA replication

FreeIPA servers replicate objects in the FreeIPA directory among themselves, as do Dogtag replicas (note: in Dogtag, the term clone is often used). All Dogtag instances in a replicated environment need to observe changes to lightweight CAs (creation, modification, deletion) that were performed on another replica and update their own view so that they can respond to requests consistently. This is accomplished via an LDAP persistent search which is run in a monitor thread. Care was needed to avoid race conditions. Fortunately, the solution for LDAP-based profile storage provided a fine starting point for the authority monitor; although lightweight CAs are more complex, many of the same race conditions can occur and these were already addressed in the LDAP profile monitor implementation.

But unlike LDAP-based profiles, a lightweight CA consists of more than just an LDAP object; there is also the signing key. The signing key lives in Dogtag’s NSSDB and for security reasons cannot be transported through LDAP. This means that when a Dogtag clone observes the addition of a lightweight CA, an out-of-band mechanism to transport the signing key must also be triggered.

This mechanism is covered in the design pages but the summarised process is:

  1. A Dogtag clone observes the creation of a CA on another server and starts a KeyRetriever thread. The KeyRetriever is implemented as part of Dogtag, but it is configured to run the /usr/libexec/ipa/ipa-pki-retrieve-key program, which is part of FreeIPA. The program is invoked with arguments of the server to request the key from (this was stored in the authorityKeyHost attribute mentioned earlier), and the nickname of the key to request.
  2. ipa-pki-retrieve-key requests the key from the Custodia daemon on the source server. It authenticates as the dogtag/<requestor-hostname>@REALM service principal. If authenticated and authorised, the Custodia daemon exports the signing key from Dogtag’s NSSDB wrapped by the main CA’s private key, and delivers it to the requesting server. ipa-pki-retrieve-key outputs the wrapped key then exits.
  3. The KeyRetriever reads the wrapped key and imports (unwraps) it into the Dogtag clone’s NSSDB. It then initialises the Dogtag CA’s Signing Unit allowing the CA to service signing requests on that clone, and adds its own hostname to the CA’s authorityKeyHost attribute.

Some excerpts of the CA debug log on the clone (not the server on which the sub-CA was first created) shows this process in action. The CA debug log is found at /var/log/pki/pki-tomcat/ca/debug. Some irrelevant messages have been omitted.

[25/Jul/2016:15:45:56][authorityMonitor]: authorityMonitor: Processed change controls.
[25/Jul/2016:15:45:56][authorityMonitor]: authorityMonitor: ADD
[25/Jul/2016:15:45:56][authorityMonitor]: readAuthority: new entryUSN = 109
[25/Jul/2016:15:45:56][authorityMonitor]: CertificateAuthority init 
[25/Jul/2016:15:45:56][authorityMonitor]: ca.signing Signing Unit nickname caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd
[25/Jul/2016:15:45:56][authorityMonitor]: SigningUnit init: debug Certificate object not found
[25/Jul/2016:15:45:56][authorityMonitor]: CA signing key and cert not (yet) present in NSSDB
[25/Jul/2016:15:45:56][authorityMonitor]: Starting KeyRetrieverRunner thread

Above we see the authorityMonitor thread observe the addition of a CA. It adds the CA to its internal map and attempts to initialise it, which fails because the key and certificate are not available, so it starts a KeyRetrieverRunner in a new thread.

[25/Jul/2016:15:45:56][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Running ExternalProcessKeyRetriever
[25/Jul/2016:15:45:56][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: About to execute command: [/usr/libexec/ipa/ipa-pki-retrieve-key, caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd, f24b-0.ipa.local]

The KeyRetrieverRunner thread invokes ipa-pki-retrieve-key with the nickname of the key it wants, and a host from which it can retrieve it. If a CA has multiple sources, the KeyRetrieverRunner will try these in order with multiple invocations of the helper, until one succeeds. If none succeed, the thread goes to sleep and retries later, starting after 10 seconds and backing off exponentially.

[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Importing key and cert
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Reinitialising SigningUnit
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: ca.signing Signing Unit nickname caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Got token Internal Key Storage Token by name
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Found cert by nickname: 'caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd' with serial number: 63
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Got private key from cert
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Got public key from cert
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: in init - got CA name CN=Smart Card CA,O=IPA.LOCAL

The key retriever successfully returned the key data and import succeeded. The signing unit then gets initialised.

[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: Adding self to authorityKeyHosts attribute
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: In LdapBoundConnFactory::getConn()
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: postCommit: new entryUSN = 361
[25/Jul/2016:15:47:13][KeyRetrieverRunner-660ad30b-7be4-4909-aa2c-2c7d874c84fd]: postCommit: nsUniqueId = 4dd42782-4a4f11e6-b003b01c-c8916432
[25/Jul/2016:15:47:14][authorityMonitor]: authorityMonitor: Processed change controls.
[25/Jul/2016:15:47:14][authorityMonitor]: authorityMonitor: MODIFY
[25/Jul/2016:15:47:14][authorityMonitor]: readAuthority: new entryUSN = 361
[25/Jul/2016:15:47:14][authorityMonitor]: readAuthority: known entryUSN = 361
[25/Jul/2016:15:47:14][authorityMonitor]: readAuthority: data is current

Finally, the Dogtag clone adds itself to the CA’s authorityKeyHosts attribute. The authorityMonitor observes this change but ignores it because its view is current.

Certificate renewal

CA signing certificates will eventually expire, and therefore require renewal. Because the FreeIPA framework operates with low privileges, it cannot add a Certmonger tracking request for sub-CAs when it creates them. Furthermore, although the renewal (i.e. the actual signing of a new certificate for the CA) should only happen on one server, the certificate must be updated in the NSSDB of all Dogtag clones.

As mentioned earlier, the ipa-certupdate command has been enhanced to add Certmonger tracking requests for FreeIPA-managed lightweight CAs. The actual renewal will only be performed on whichever server is the renewal master when Certmonger decides it is time to renew the certificate (assuming that the tracking request has been added on that server).

Let’s run ipa-certupdate on the renewal master to add the tracking request for the new CA. First observe that the tracking request does not exist yet:

# getcert list -d /etc/pki/pki-tomcat/alias |grep subject
        subject: CN=CA Audit,O=IPA.LOCAL 201606201330
        subject: CN=OCSP Subsystem,O=IPA.LOCAL 201606201330
        subject: CN=CA Subsystem,O=IPA.LOCAL 201606201330
        subject: CN=Certificate Authority,O=IPA.LOCAL 201606201330
        subject: CN=f24b-0.ipa.local,O=IPA.LOCAL 201606201330

As expected, we do not see our sub-CA certificate above. After running ipa-certupdate the following tracking request appears:

Request ID '20160725222909':
        status: MONITORING
        stuck: no
        key pair storage: type=NSSDB,location='/etc/pki/pki-tomcat/alias',nickname='caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd',token='NSS Certificate DB',pin set
        certificate: type=NSSDB,location='/etc/pki/pki-tomcat/alias',nickname='caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd',token='NSS Certificate DB'
        CA: dogtag-ipa-ca-renew-agent
        issuer: CN=Certificate Authority,O=IPA.LOCAL 201606201330
        subject: CN=Smart Card CA,O=IPA.LOCAL
        expires: 2036-07-15 05:46:00 UTC
        key usage: digitalSignature,nonRepudiation,keyCertSign,cRLSign
        pre-save command: /usr/libexec/ipa/certmonger/stop_pkicad
        post-save command: /usr/libexec/ipa/certmonger/renew_ca_cert "caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd"
        track: yes
        auto-renew: yes

As for updating the certificate in each clone’s NSSDB, Dogtag itself takes care of that. All that is required is for the renewal master to update the CA’s authoritySerial attribute in the Dogtag database. The renew_ca_cert Certmonger post-renewal hook script performs this step. Each Dogtag clone observes the update (in the monitor thread), looks up the certificate with the indicated serial number in its certificate repository (a new entry that will also have been recently replicated to the clone), and adds that certificate to its NSSDB. Again, let’s observe this process by forcing a certificate renewal:

# getcert resubmit -i 20160725222909
Resubmitting "20160725222909" to "dogtag-ipa-ca-renew-agent".

After about 30 seconds the renewal process is complete. When we examine the certificate in the NSSDB we see, as expected, a new serial number:

# certutil -d /etc/pki/pki-tomcat/alias -L \
    -n "caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd" \
    | grep -i serial
        Serial Number: 74 (0x4a)

We also see that the renew_ca_cert script has updated the serial in Dogtag’s database:

# ldapsearch -D cn="Directory Manager" -w4me2Test -b o=ipaca \
    '(cn=660ad30b-7be4-4909-aa2c-2c7d874c84fd)' authoritySerial
dn: cn=660ad30b-7be4-4909-aa2c-2c7d874c84fd,ou=authorities,ou=ca,o=ipaca
authoritySerial: 74

Finally, if we look at the CA debug log on the clone, we’ll see that the authority monitor observes the serial number change and updates the certificate in its own NSSDB (again, some irrelevant or low-information messages have been omitted):

[26/Jul/2016:10:43:28][authorityMonitor]: authorityMonitor: Processed change controls.
[26/Jul/2016:10:43:28][authorityMonitor]: authorityMonitor: MODIFY
[26/Jul/2016:10:43:28][authorityMonitor]: readAuthority: new entryUSN = 1832
[26/Jul/2016:10:43:28][authorityMonitor]: readAuthority: known entryUSN = 361
[26/Jul/2016:10:43:28][authorityMonitor]: CertificateAuthority init 
[26/Jul/2016:10:43:28][authorityMonitor]: ca.signing Signing Unit nickname caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd
[26/Jul/2016:10:43:28][authorityMonitor]: Got token Internal Key Storage Token by name
[26/Jul/2016:10:43:28][authorityMonitor]: Found cert by nickname: 'caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd' with serial number: 63
[26/Jul/2016:10:43:28][authorityMonitor]: Got private key from cert
[26/Jul/2016:10:43:28][authorityMonitor]: Got public key from cert
[26/Jul/2016:10:43:28][authorityMonitor]: CA signing unit inited
[26/Jul/2016:10:43:28][authorityMonitor]: in init - got CA name CN=Smart Card CA,O=IPA.LOCAL
[26/Jul/2016:10:43:28][authorityMonitor]: Updating certificate in NSSDB; new serial number: 74

When the authority monitor processes the change, it reinitialises the CA including its signing unit. Then it observes that the serial number of the certificate in its NSSDB differs from the serial number from LDAP. It pulls the certificate with the new serial number from its certificate repository, imports it into NSSDB, then reinitialises the signing unit once more and sees the correct serial number:

[26/Jul/2016:10:43:28][authorityMonitor]: ca.signing Signing Unit nickname caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd
[26/Jul/2016:10:43:28][authorityMonitor]: Got token Internal Key Storage Token by name
[26/Jul/2016:10:43:28][authorityMonitor]: Found cert by nickname: 'caSigningCert cert-pki-ca 660ad30b-7be4-4909-aa2c-2c7d874c84fd' with serial number: 74
[26/Jul/2016:10:43:28][authorityMonitor]: Got private key from cert
[26/Jul/2016:10:43:28][authorityMonitor]: Got public key from cert
[26/Jul/2016:10:43:28][authorityMonitor]: CA signing unit inited
[26/Jul/2016:10:43:28][authorityMonitor]: in init - got CA name CN=Smart Card CA,O=IPA.LOCAL

Currently this update mechanism is only used for lightweight CAs, but it would work just as well for the main CA too, and we plan to switch at some stage so that the process is consistent for all CAs.

Wrapping up

I hope you have enjoyed this tour of some of the lightweight CA internals, and in particular seeing how the design actually plays out on your systems in the real world.

FreeIPA lightweight CAs has been the most complex and challenging project I have ever undertaken. It took the best part of a year from early design and proof of concept, to implementing the Dogtag lightweight CAs feature, then FreeIPA integration, and numerous bug fixes, refinements or outright redesigns along the way. Although there are still some rough edges, some important missing features and, I expect, many an RFE to come, I am pleased with what has been delivered and the overall design.

Thanks are due to all of my colleagues who contributed to the design and review of the feature; each bit of input from all of you has been valuable. I especially thank Ade Lee and Endi Dewata from the Dogtag team for their help with API design and many code reviews over a long period of time, and from the FreeIPA team Jan Cholasta and Martin Babinsky for their invaluable input into the design, and much code review and testing. I could not have delivered this feature without your help; thank you for your collaboration!

Lightweight Sub-CAs in FreeIPA 4.4

Posted by Fraser Tweedale on July 25, 2016 02:32 AM

Last year FreeIPA 4.2 brought us some great new certificate management features, including custom certificate profiles and user certificates. The upcoming FreeIPA 4.4 release builds upon this groundwork and introduces lightweight sub-CAs, a feature that lets admins mint new CAs under the main FreeIPA CA and allows certificates for different purposes to be issued in different certificate domains. In this post I will review the use cases and demonstrate the process of creating, managing and issuing certificates from sub-CAs. (A follow-up post will detail some of the mechanisms that operate behind the scenes to make the feature work.)

Use cases

Currently, all certificates issued by FreeIPA are issued by a single CA. Say you want to issue certificates for various purposes: regular server certificates, user certificates for VPN authentication, and user certificates for authentication to a particular web service. Currently, assuming the certificates bear the appropriate Key Usage and Extended Key Usage extensions (with the default profile, they do), a certificate issued for one of these purposes could be used for all of the other purposes.

Issuing certificates for particular purposes (especially client authentication scenarios) from a sub-CA allows an administrator to configure the endpoint authenticating the clients to use the immediate issuer certificate for validating client certificates. Therefore, if you had a sub-CA for issuing VPN authentication certificates, and a different sub-CA for issuing certificates for authenticating to the web service, one could configure these services to accept certificates issued by the relevant CA only. Thus, where previously the scope of usability may have been unacceptably broad, administrators now have more fine-grained control over how certificates can be used.

Finally, another important consideration is that while revoking the main IPA CA is usually out of the question, it is now possible to revoke an intermediate CA certificate. If you create a CA for a particular organisational unit (e.g. some department or working group) or service, if or when that unit or service ceases to operate or exist, the related CA certificate can be revoked, rendering certificates issued by that CA useless, as long as relying endpoints perform CRL or OCSP checks.
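
Mechanically, revoking a sub-CA's certificate is no different from revoking any other certificate issued by the main CA; a rough sketch (the serial number is just an example, and reason code 5 is cessationOfOperation):

% ipa cert-revoke 63 --revocation-reason=5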

Creating and managing sub-CAs

In this scenario, we will add a sub-CA that will be used to issue certificates for users’ smart cards. We assume that a profile for this purpose already exists, called userSmartCard.
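
(If the profile does not exist yet, it could be imported from a Dogtag profile configuration along these lines; the file name here is hypothetical:)

% ipa certprofile-import userSmartCard --desc "User smart card profile" \
    --file userSmartCard.cfg --store=true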

To begin with, we are authenticated as admin or another user that has CA management privileges. Let’s see what CAs FreeIPA already knows about:

% ipa ca-find
1 CA matched
  Name: ipa
  Description: IPA CA
  Authority ID: d3e62e89-df27-4a89-bce4-e721042be730
  Subject DN: CN=Certificate Authority,O=IPA.LOCAL 201606201330
  Issuer DN: CN=Certificate Authority,O=IPA.LOCAL 201606201330
Number of entries returned 1

We can see that FreeIPA knows about the ipa CA. This is the "main" CA in the FreeIPA infrastructure. Depending on how FreeIPA was installed, it could be a root CA or it could be chained to an external CA. The ipa CA entry is added automatically when installing or upgrading to FreeIPA 4.4.

Now, let’s add a new sub-CA called sc:

% ipa ca-add sc --subject "CN=Smart Card CA, O=IPA.LOCAL" \
    --desc "Smart Card CA"
Created CA "sc"
  Name: sc
  Description: Smart Card CA
  Authority ID: 660ad30b-7be4-4909-aa2c-2c7d874c84fd
  Subject DN: CN=Smart Card CA,O=IPA.LOCAL
  Issuer DN: CN=Certificate Authority,O=IPA.LOCAL 201606201330

The --subject option gives the full Subject Distinguished Name for the new CA; it is mandatory, and must be unique among CAs managed by FreeIPA. An optional description can be given with --desc. In the output we see that the Issuer DN is that of the IPA CA.

Having created the new CA, we must add it to one or more CA ACLs to allow it to be used. CA ACLs were added in FreeIPA 4.2 for defining policies about which profiles could be used for issuing certificates to which subject principals (note: the subject principal is not necessarily the principal performing the certificate request). In FreeIPA 4.4 the CA ACL concept has been extended to also include which CA is being asked to issue the certificate.

We will add a CA ACL called user-sc-userSmartCard and associate it with all users, with the userSmartCard profile, and with the sc CA:

% ipa caacl-add user-sc-userSmartCard --usercat=all
Added CA ACL "user-sc-userSmartCard"
  ACL name: user-sc-userSmartCard
  Enabled: TRUE
  User category: all

% ipa caacl-add-profile user-sc-userSmartCard --certprofile userSmartCard
  ACL name: user-sc-userSmartCard
  Enabled: TRUE
  User category: all
  CAs: sc
  Profiles: userSmartCard
Number of members added 1

% ipa caacl-add-ca user-sc-userSmartCard --ca sc
  ACL name: user-sc-userSmartCard
  Enabled: TRUE
  User category: all
  CAs: sc
Number of members added 1

A CA ACL can reference multiple CAs individually, or, like we saw with users above, we can associate a CA ACL with all CAs by setting --cacat=all when we create the CA ACL, or via the ipa caacl-mod command.
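
For example, an ACL covering every CA could be created like this (the ACL name is just an example):

% ipa caacl-add all-cas-example --usercat=all --cacat=all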

A special behaviour of CA ACLs with respect to CAs must be mentioned: if a CA ACL is associated with no CAs (either individually or by category), then it allows access to the ipa CA (and only that CA). This behaviour, though inconsistent with other aspects of CA ACLs, is for compatibility with pre-sub-CAs CA ACLs. An alternative approach is being discussed and could be implemented before the final release.

Requesting certificates from sub-CAs

The ipa cert-request command has learned the --ca argument for directing the certificate request to a particular sub-CA. If it is not given, it defaults to ipa.

alice already has a CSR for the key in her smart card, so now she can request a certificate from the sc CA:

% ipa cert-request --principal alice \
    --profile userSmartCard --ca sc /path/to/csr.req
  Certificate: MIIDmDCCAoCgAwIBAgIBQDANBgkqhkiG9w0BA...
  Subject: CN=alice,O=IPA.LOCAL
  Issuer: CN=Smart Card CA,O=IPA.LOCAL
  Not Before: Fri Jul 15 05:57:04 2016 UTC
  Not After: Mon Jul 16 05:57:04 2018 UTC
  Fingerprint (MD5): 6f:67:ab:4e:0c:3d:37:7e:e6:02:fc:bb:5d:fe:aa:88
  Fingerprint (SHA1): 0d:52:a7:c4:e1:b9:33:56:0e:94:8e:24:8b:2d:85:6e:9d:26:e6:aa
  Serial number: 64
  Serial number (hex): 0x40

Certmonger has also learned the -X/--issuer option for specifying that the request be directed to the named issuer. There is a clash of terminology here; the "CA" terminology in Certmonger is already used to refer to a particular CA "endpoint". Various kinds of CAs and multiple instances thereof are supported. But now, with Dogtag and FreeIPA, a single CA may actually host many CAs. Conceptually this is similar to HTTP virtual hosts, with the -X option corresponding to the Host: header for disambiguating the CA to be used.

If the -X option was given when creating the tracking request, the Certmonger FreeIPA submit helper uses its value in the --ca option to ipa cert-request. These requests are subject to CA ACLs.
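
A tracking request directed at the sc CA might therefore look roughly like this (the certificate and key locations are hypothetical, and assume the caller can write to them):

% ipa-getcert request -f /etc/pki/tls/certs/alice-sc.pem \
    -k /etc/pki/tls/private/alice-sc.key \
    -K alice@IPA.LOCAL -T userSmartCard -X sc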


Limitations

It is worth mentioning a few of the limitations of the sub-CAs feature, as it will be delivered in FreeIPA 4.4.

All sub-CAs are signed by the ipa CA; there is no support for "nesting" CAs. This limitation is imposed by FreeIPA – the lightweight CAs feature in Dogtag does not have this limitation. It could be easily lifted in a future release, if there is a demand for it.

There is no support for introducing unrelated CAs into the infrastructure, either by creating a new root CA or by importing an unrelated external CA. Dogtag does not have support for this yet, either, but the lightweight CAs feature was designed so that this would be possible to implement. This is also why all the commands and argument names mention "CA" instead of "Sub-CA". I expect that there will be demand for this feature at some stage in the future.

Currently, the key type and size are fixed at RSA 2048. The same is true in Dogtag, and this is a fairly high priority to address. Similarly, the validity period is fixed, and we will need to address this also, probably by allowing custom CA profiles to be used.


Conclusion

The Sub-CAs feature will round out FreeIPA’s certificate management capabilities, making FreeIPA a more attractive solution for organisations with sophisticated certificate requirements. Multiple security domains can be created for issuing certificates with different purposes or scopes. Administrators have a simple interface for creating and managing CAs, and rules for how those CAs can be used.

There are some limitations which may be addressed in a future release; the ability to control key type/size and CA validity period will be the highest priority among them.

This post examined the use cases and high-level user/administrator experience of sub-CAs. In the next post, I will detail some of the machinery that makes the sub-CAs feature work.

Looking for Andre

Posted by Adam Young on July 24, 2016 11:12 PM

My brother sent out the following message. Signal boosting it here.

“A few weeks ago I started talking to a few guys on the street. (Homeless) Let’s call them James and Anthony. Let’s just skip ahead. I bought them lunch. Ok. I bought $42 worth of Wendy’s $1 burgers and nuggets and a case of water. On top of their lunch. They gathered up all their friends by the Library in Copley sq and made sure that everyone ate. It was like a cookout. You should have seen how happy everyone was. It gave me a feeling that was unexplainable.

“This morning I was in Downtown crossings. I got the feeling in my gut again. That do something better today feeling. I saw a blind guy. His eyes were a mess. He was thin. Almost emaciated. Let’s call him Andre’ he is 30 years old.



I bought him lunch. I sat with him at a table while he ate. We talked. Andre’s back story…8 years ago he was in college. He was a plumbers apprentice. He was going on a date. As he walked up to the door to knock for the girl. Someone came up and shot him twice in the temple. Andre’ woke up in the hospital blind. To this day he has no idea who or why he was shot. The only possessions Andre’ had was the way-too-warm clothes on his back, his blind cane. His sign, and his cup. I took Andre’ to TJ Maxx. It’s 90 degrees at at 9:30am. I got him a t-shirt, shorts, clean socks and underwear and a back pack. After I paid, I took him back to the dressing room so he could have some privacy while he changed. I told the lady at the dressing room that he was going in to change. She told me that wasn’t allowed. I kindly informed her that I wasn’t asking… She looked at me and quickly realized it wasn’t a request. More of a statement. I must have had a look on my face.

I get those sometimes.

She nodded her understanding. In the dressing room Andre’ cried. He was ashamed for crying. I didn’t say much. Just put my hand on his back for a second to let him knew I understood. After he changed I took him back to where I originally met him and found out his routine. Where he goes when and such. I left Andre’ in his spot and went to go find James and Anthony. You remember them from the beginning of this story. They were in the same spot as a few weeks ago. They remembered me. I told them it was time to return the favor. I explained to them that I wanted them to look out for Andre’ to make sure he was safe. Andre’ has been repeatedly mugged. Who the fuck mugs a hungry homeless blind guy? Well. They must have seen the look in my face saying this wasn’t a request.

I apparently get that look sometimes.

They came with me from Copley all the way to downtown crossings. We went looking for Andre’. We looked all over but couldn’t find him. We went all over south station and back up all over downtown crossings. (For those not familiar, Google a map of Boston) we couldn’t find Andre’. Anthony said he’s seen him around and knew who I was talking about. They promised me they would look for him everyday. I know they will too. They look out for theirs. Remember all the food I bought them and how they made sure everyone ate? James doesn’t like bullies. He sure as shit won’t tolerate someone stealing from a blind and scared homeless guy. Anthony spends his mornings in south station. He promised me that he will find him and try to bring him to where they stay. It’s safer in numbers and when you have a crew watching your back. You have to know who to trust. That’s what they told me. I gave James and Anthony some money for their time and bought them each a cold drink.

“It’s fucking hot out.

“These guys are all on hard times. Some of them fucked up. Some were just unlucky. Andre’…now that’s some shit luck. That’s just not fucking fair. I’ve never met someone like Andre’. How in the hell would I survive if I couldn’t see? I have an amazing family and a great group of friends. Andre’ has no one. Did I change his life? Nope. Did I make his day better? I honestly hope so. I talked to him like a man. I didn’t let him know how horrible I felt for him. No matter how far you fall in life. If you have the strength to get up each day and try to feed your self, you still have pride, you still have hope. I didn’t want to take away any of his pride. He doesn’t have much to begin with. But he must have a little. I will continue to look for Andre’ every day. I met him near my office. I can look during my lunch. I have to find him and keep an eye on him.

“No matter how bad things get. No matter how unfair you feel you have been treated. Pretty much no matter what your lot in life is. Think of Andre’ when you feel down. If he has the strength to go on… So do you.

“I didn’t write this to say ‘look what great things I did.’ I wish I could write this with out being part of the story. There is no way I could express how much this meeting of Andre’ has effected me with out letting you know this is what I did today. ..

“I just got home from this experience. I’ll update this when I find Andre’ and let you know how he’s doing. If anyone in Boston reads this and sees a black guy about my height. Thinner than me…Obviously blind.

“Please hashtag ‪#‎lookingforAndre‬ and tell me where you saw him. Like I said. South station or downtown crossings are the areas that I know of. Thank you for reading this. Help me find Andre’.”

And then he sent this

“I found Andre’. He is meeting me for breakfast tomorrow.”



Billy set up a fundraising account for Andre.