Fedora Security Planet

Musical Midi Accompaniment: First Tune

Posted by Adam Young on November 23, 2020 02:16 AM

Here is a tune I wrote called “Standard Deviation” done as an accompaniment track using MMA. This is a very simplistic interpretation that makes no use of dynamics, variations in the BossaNova Groove, or even decent repeat logic. But it compiles.

Here’s the MMA file.

Slightly Greater than one Standard Deviation from the Mean:

Episode 225 – Who is responsible if IoT burns down your house?

Posted by Josh Bressers on November 23, 2020 12:01 AM

Josh and Kurt talk about the safety and liability of new devices. What happens when your doorbell can burn down your house? What if it’s your fault the doorbell burned down your house? There isn’t really any prior art for where our devices are taking us; who knows what the future will look like.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2077-1" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_225_Who_is_responsible_if_IoT_burns_down_your_house.mp3?_=1" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_225_Who_is_responsible_if_IoT_burns_down_your_house.mp3</audio>

Show Notes

Musical Midi Accompaniment: Understanding the Format

Posted by Adam Young on November 22, 2020 07:52 PM

Saxophone is a solo instrument. Unless you are into the sounds of Saxophone multiphonics, harmony requires playing with some other instrument. For Jazz, this tends to be a rhythm section of Piano, Bass, and Drums. As a kid, practicing without a live rhythm section meant playing along with pre-recorded tunes. I had my share of Jamie Aebersold records.

Nowadays, the tool of choice for most Jazz musicians, myself included, is iReal Pro, a lovely little app for the phone. All of the Real Book tunes have had their chord progressions posted and generated. The format is simple enough.

But it is a proprietary app. While I continue to support and use it, I am also looking for alternatives that let me get more involved. One such tool is Musical MIDI Accompaniment. I’m just getting started with it, and I want to keep my notes here.

First is just getting it to play. Whether you get the tarball or check out from Git, there is a trick you need to do in order to even play the examples: regenerate the libraries.

./mma.py -G

That allows me to generate a MIDI file from a file in the MMA Domain Specific Language (DSL), which is also called MMA. I downloaded the backing track for I’ve Got You Under My Skin from https://www.mellowood.ca/mma/examples/examples.html and, once I regenerated the libraries with the above command, was able to run:

./mma.py ~/Downloads/ive-got-you-under-my-skin.mma
Creating new midi file (120 bars, 4.57 min / 4:34 m:s): '/home/ayoung/Downloads/ive-got-you-under-my-skin.mid'

Which I can then play with timidity.

The file format is not quite as simple as iReal Pro’s, but it does not look so complex that I won’t be able to learn it.

There are examples of things that look like real programming: Begin and End blocks.

Line Numbers. This is going to give me flashbacks to coding in Basic on my C64… not such an unpleasant set of memories. And musical ones at that.

OK, let’s take this apart. Here are the first few lines:

// I've Got You Under My Skin

Tempo 105
Groove Metronome2-4

	z * 2

Comments are double slashes. The title is just for documentation.

Tempo is in BPM.

Groove Metronome2-4 says to use a Groove. “Grooves, in some ways, are MMA’s answer to macros … but they are cooler, easier to use, and have a more musical name,” says the manual. So somewhere we have inherited a Groove called Metronome-something. Is the 2-4 part of the name? It looks like it. I found this in the library:

lib/stdlib/metronome.mma:97:DefGroove Metronome2-4 A very useful introduction. On bar one we have hits on beats 1 and 3; on bar two hits on beats 1, 2, 3 and 4.

This is based on a leader counting off the time of the song. If you play the MIDI file, you can hear the cowbell effect used to count off.

z * 2 is the way of saying that this extends for 2 measures.

The special sequences, “-” or “z”, are also the equivalent of a rest or “tacet” sequence. For example, in defining a 4 bar sequence with a bass pattern on the first 3 bars and a walking bass on bar 4 you might do something like:

If you already have a sequence defined you can repeat or copy the existing pattern by using a single “*” as the pattern name. This is useful when you are modifying an existing sequence.

The next block is the definition of a section he calls Solo. This is a Track.

Begin Solo
	Voice Piano2
	Octave 4
	Harmony 3above
	Articulate 90
	Accent 1 20
	Volume f
End

I think that the expectation is that you get the majority of the defaults from the Groove, and customize the Solo track.


As a general rule, MELODY tracks have been designed as a “voice” to accompany a predefined form defined in a GROOVE—it is a good idea to define MELODY parameters as part of a GROOVE. SOLO tracks are thought to be specific to a certain song file, with their parameters defined in the song file.

So if this were a Melody track definition, it would be ignored, and the track from the Rhumba base would be used instead.

The next section defines what is done overall.

Keysig 3b


Groove Rhumba
Alltracks SeqRnd Off
Bass-Sus Sequence -		// disable the strings

Cresc pp mf 4

The Keysig directive can be found here. It generates a MIDI Key Signature event. 3b means 3 flats in the MIDI spec. Major is assumed if not specified; thus this is the key of E flat.
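The flats-to-key mapping follows the circle of fifths, and can be sanity-checked with a tiny Python sketch (the helper name is hypothetical, not part of MMA):

```python
# Major keys ordered by the number of flats in the signature:
# 1 flat is F, 2 flats is Bb, 3 flats is Eb, and so on.
FLAT_KEYS = ['F', 'Bb', 'Eb', 'Ab', 'Db', 'Gb', 'Cb']

def major_key_from_flats(n_flats: int) -> str:
    """Hypothetical helper mapping a flat count to its major key."""
    return 'C' if n_flats == 0 else FLAT_KEYS[n_flats - 1]

assert major_key_from_flats(3) == 'Eb'  # Keysig 3b, the key of E flat
```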

The Groove Rhumba directive is going to drive most of the song. The definitions for this Groove can be found under the standard library. I might tear apart a file like this one in a future post.

The next two lines specify how the Groove is to be played. SeqRnd inserts randomness into the sequencing, to make it more like a live performance. This directive shuts down the randomness.

Bass-Sus Sequence – seems to be defining a new, blank sequence. The comment implies that it is shutting off the strings. I have to admit, I don’t quite understand this. I’ve generated the file with this directive commented out and detect no differences. Since Bass-Sus is defined in the Bossa Nova Groove under the standard library, I’m tempted to think this is a copy-paste error. Note that it defines “Voice Strings”, and I think that is what he was trying to disable. I suspect a git history would show the Bass-Sus getting pulled out of the Rhumba file.

Cresc pp mf 4 grows the volume from piano (soft) to mezzo-forte (medium loud) over 4 bars. Since no track is specified, it applies to the master volume.

// 4 bar intro

1 	Eb		{4.g;8f;2e;}
2 	Ab      {4.a;8g;2f;}
3 	Gm7     {1g;}
4 	Bb7     {2b;}

Delete Solo

Now we start seeing the measures. The numbers are optional, and are just there to help human readers keep track.
Measure 1 is an E flat chord. The braces delineate a Riff line. The 4 means a quarter note. The period after it makes it dotted: half again as long, or the equivalent of 3 tied eighth notes. The note played is a g, adjusted for the octave appropriate to the voice. This is followed by an eighth note f and a half note e. This adds up to a full measure: 3/8 + 1/8 + 4/8.
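That arithmetic can be checked with a few lines of Python (an illustrative sketch, not MMA’s actual parser; the duration helper is hypothetical):

```python
from fractions import Fraction

def duration(token: str) -> Fraction:
    """Whole-note fraction for an MMA-style length token such as '4.' or '8'.
    A trailing period marks a dotted note, half again as long."""
    dotted = token.endswith('.')
    base = Fraction(1, int(token.rstrip('.')))
    return base * Fraction(3, 2) if dotted else base

# The riff {4.g;8f;2e;}: dotted quarter + eighth + half
total = sum(duration(t) for t in ('4.', '8', '2'))
assert total == 1  # exactly one 4/4 measure: 3/8 + 1/8 + 4/8
```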

After the four bar intro, the solo part is deleted, and the normal Rhumba patterns take effect.

The next line is a Repeat directive, which is paired with the Repeatending directive on line 129 and the Repeatend directive on line 135. This says that measures 5-60 should be repeated once, first- and second-ending style.

The Groove changes many times during the song, and I think this leads to the one bug I noticed: the time keeps changing, speeding up and slowing down. I think these shifts match up with the Groove changes, but I am not yet certain.

It should be fairly easy to translate one of my songs into this format.

We can’t move forward by looking back

Posted by Josh Bressers on November 19, 2020 03:24 PM

For the last few weeks Kurt and I have been having a lively conversation about security rating scales. Is CVSS good enough? What about the Microsoft scale? Are there other scales we should be looking at? What’s good, what’s missing, what should we be talking about?

There’s been a lot of back and forth and different ideas. Over the course of our discussions I’ve come to realize an important aspect of security: we don’t look forward very often. What I mean by this is that there is a very strong force in the world of security to use prior art to drive our future decisions, except all of that prior art is comically out of date in the world of today.

An easy example is existing security standards. All of the working groups that build the standards, and the ideas those working groups bring to the table, are using ideas from the past to solve problems for the future. You can argue that standards are at best a snapshot of the past, made in the present, to slow down the future. I will elaborate on that “slow down the future” line in a future blog post; for now I just want to focus on the larger problem.

It might be easiest to use an example; I shall pick on CVSS. The vast majority of ideas and content in a standard such as CVSS is heavily influenced by what once was. If you look at how CVSS scores things, it’s clear a computer in a datacenter was in mind for many of the metrics. That was fine a decade ago, but it’s not fine anymore. Right now anyone overly familiar with CVSS is screaming “BUT CVSS DOESN’T MEASURE RISK, IT MEASURES SEVERITY”, to which I will say: you are technically correct, nobody cares, and nobody uses it like this. Sit down. CVSS is a perfect example of the theory being out of touch with reality.

Am I suggesting CVSS has no value? I am not. In its current form CVSS has some value (it should have a lot more). It’s all we have today, so everyone is using it, and it’s mostly good enough in the same way you can drive a nail with a rock. I have a suspicion it won’t be turned into something truly valuable because it is a standard based on the past. I would like to say we should take this into account when we use CVSS, but nobody will. The people doing the work don’t have time to care about improving something that’s mostly OK, and the people building the standards don’t do the work, so it’s sort of like a Mexican standoff, but one where nobody showed up.

There are basically two options for CVSS: don’t use it because it doesn’t work properly, or use it and just deal with the places it falls apart. Both of those are terrible options. There’s little chance it’s going to get better in the near future. There is a CVSSv4 design document here. If you look at it, does it look like something describing a modern cloud based architecture? They’ve been working on this for almost five years; do you remember what your architecture looked like even a year ago? For most of us in the world of IT a year is a lifetime now. Looking backwards isn’t going to make anything better.

OK, I’ve picked on CVSS enough. The real reason to explain all of this is to change the way we think about problems. Trying to solve problems we already had in the past won’t help with problems we have today, or will have in the future. I think this is more about having a different mindset than security had in the past. If you look at the history of infosec and security, there has been a steady march of progress, but much of that progress has been slower than the forward movement of IT in general. What’s holding us back?

Let’s break this down into People, Places, and Things.

People

I used the line above: “The people doing the work don’t have time to care, and the people building the standards don’t do the work.” What I mean by this is there are plenty of people doing amazing security work. We don’t hear about them very often though because they’re busy working. Go talk to someone building detection rules for their SIEM; those are the people making a difference. They don’t have time to work on the next version of CVSS. They probably don’t even have the time to file a bug report against an open source project they use. There are many people in this situation in the security world. They are doing amazing work and getting zero credit. These are the heroes we need.

But we have the heroes we deserve. If you look at many of the people working on standards, and giving keynotes, and writing blogs (oh hi), a lot of them live in a world that no longer exists. I willingly admit I used to live in a world that didn’t exist. I had an obsession with problems nobody cared about because I didn’t know what anyone was really doing. I didn’t understand cloud, or detection, or compliance, or really anything new. Working at Elastic and seeing what our customers are accomplishing in the world of security has been a life-changing experience. It made me realize some of those people I thought were leaders weren’t actually making the world a better place. They were desperately trying to keep the world in a place where they were relevant and could understand it.

Places

One of my favorite examples these days is the fact that cloud won, but a lot of people are still talking about data centers or “hybrid cloud” or some other term that means owning a computer. A data center is a place. Places don’t exist anymore, at least not for the people making a difference. Now there are reasons to have a data center, just like there are reasons to own an airplane. Those reasons are pretty niche and solve a unique problem. We’re not worried about those niche problems today.

How many of our security standards focus on having a computer in a room, in a place? Too many. Why doesn’t your compliance document ask about the seatbelts on your airplane? Because you don’t own an airplane, just like you don’t (or shouldn’t) own a server. The world changed, security is still catching up. There are no places anymore. Trying to secure a server in a room isn’t actually helping anyone.

Things

Things is one of the most interesting topics today. How many of us have corporate policies that say you can only access company systems from your laptop, while connected to a VPN, and wearing a hat? Or some other draconian rule. Then how many of us have email on our personal phones? But that’s not a VPN, or a hat, or a laptop! Trying to secure a device is silly because there are a near-infinite number of devices and possible problems.

We used to think about securing computers. Servers, desktops, laptops, maybe a router or two. Those are tangible things that exist. We can look at them, we can poke them with a stick, we can unplug them. We don’t have real things to protect anymore and that’s a challenge. It’s hard to think about protecting something that we can’t hold in our hand. The world has changed in such a way that the “things” we care about aren’t even things anymore.

The reality is we used to think of things as objects we use, but things of today are data. Data is everything now. Every service, system, and application we use is just a way to understand and move around data. How many of our policies and ideas focus on computers that don’t really exist instead of the data we access and manipulate?

Everything new is old again

I hope the one lesson you take away from all of this is to be wary of leaning on the past. The past contains lessons, not directions. Security exists in a world unlike any we’ve ever seen, the old rules are … old. But it’s also important to understand that even what we think of as a good idea today might not be a good idea tomorrow.

Progress is ahead of you, not behind.

Keystone and Cassandra: Parity with SQL

Posted by Adam Young on November 18, 2020 09:41 PM

Look back at our Pushing Keystone over the Edge presentation from the OpenStack Summit. Many of the points we make are problems faced by any application trying to scale across multiple datacenters. Cassandra is a database designed to deal with this level of scale, so Cassandra may well be a better choice than MySQL or another RDBMS as a datastore for Keystone. What would it take to enable Cassandra support in Keystone?

Let’s start with the easy part: defining the tables. Let’s look at how we define the Federation back end for SQL. We use SQLAlchemy to handle the migrations; we will need something comparable for Cassandra Query Language (CQL), but we also need to translate the table definitions themselves.

Before we create the tables, we need to create a keyspace. I am going to make separate keyspaces for each of the subsystems in Keystone: Identity, Assignment, Federation, and so on. Here’s the Federated one:

CREATE KEYSPACE keystone_federation WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': '3'}  AND durable_writes = true;

The Identity provider table is defined like this:

    idp_table = sql.Table(
        'identity_provider',
        meta,
        sql.Column('id', sql.String(64), primary_key=True),
        sql.Column('enabled', sql.Boolean, nullable=False),
        sql.Column('description', sql.Text(), nullable=True),
        mysql_engine='InnoDB',
        mysql_charset='utf8')
    idp_table.create(migrate_engine, checkfirst=True)

The comparable CQL to create a table would look like this:

CREATE TABLE identity_provider (id text PRIMARY KEY, enabled boolean, description text);

However, when I describe the schema to view the table definition, we see that there are many tuning and configuration parameters that are defaulted:

CREATE TABLE federation.identity_provider (
    id text PRIMARY KEY,
    description text,
    enabled boolean
) WITH additional_write_policy = '99p'
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND cdc = false
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND default_time_to_live = 0
    AND extensions = {}
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair = 'BLOCKING'
    AND speculative_retry = '99p';

I don’t know Cassandra well enough to say if these are sane defaults to have in production. I do know that someone, somewhere, is going to want to tweak them, and we are going to have to provide a means to do so without battling the upgrade scripts. I suspect we are going to want to use only the short form (what I typed into the CQL prompt) in the migrations, not the form with all of the options. In addition, we might want an if not exists clause on the table creation to allow people to make these changes themselves. Then again, that might make things get out of sync. Hmmm.

There are three more entities in this back end:

CREATE TABLE federation_protocol (id text, idp_id text, mapping_id text, PRIMARY KEY(id, idp_id));
CREATE TABLE mapping (id text PRIMARY KEY, rules text);
CREATE TABLE service_provider (auth_url text, id text PRIMARY KEY, enabled boolean, description text, sp_url text, relay_state_prefix text);

One thing that is interesting is that we will not be limiting the ID fields to 32, 64, or 128 characters. There is no performance benefit to doing so in Cassandra, nor is there any way to enforce the length limits. From a Keystone perspective, there is not much value either; we still need to validate the UUIDs in Python code. We could autogenerate the UUIDs in Cassandra, and there might be some benefit to that, but it would diverge from the logic in the Keystone code and explode the test matrix.
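For illustration, a minimal check of the kind the Python code would still have to perform might look like this (a sketch; `is_valid_id` is a hypothetical helper, not Keystone’s actual validator):

```python
import uuid

def is_valid_id(value: str) -> bool:
    """Hypothetical helper: accept any string that parses as a UUID
    (hex with or without dashes). Not Keystone's actual validation logic."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False

assert is_valid_id(uuid.uuid4().hex)   # 32-char hex form, as Keystone generates
assert is_valid_id(str(uuid.uuid4()))  # dashed form also parses
assert not is_valid_id('not-a-uuid')
```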

There is only one foreign key in the SQL section; the federation protocol has an idp_id that points to the identity provider table. We’ll have to accept this limitation and ensure the integrity is maintained in code. We can do this by looking up the Identity provider before inserting the protocol entry. Since creating a Federated entity is a rare and administrative task, the risk here is vanishingly small. It will be more significant elsewhere.

For access to the database, we should probably use Flask-CQLAlchemy. Fortunately, Keystone is already a Flask-based project, so the two align nicely.

For migration support, it looks like the best option out there is cassandra-migrate.

An effort like this would best be started out of tree, with an expectation that it would be merged in once it had shown a degree of maturity. Thus, I would put it into a namespace that would not conflict with the existing keystone project. The python imports would look like:

from keystone.cassandra import migrations
from keystone.cassandra import identity
from keystone.cassandra import federation

This could go in its own git repo and be separately pip installed for development. The entrypoints would be registered such that the configuration file would have entries like:

[application_credential]
driver = cassandra

Any tuning of the database could be put under a [cassandra] section of the conf file, or tuning for individual subsystems could go in keys prefixed with cassandra_ in the appropriate sections, such as the application_credential section shown above.

It might be interesting to implement a Cassandra token backend and use the default_time_to_live value on the table to control the lifespan and automate the cleanup of the tables. This might provide some performance benefit over the fernet approach, as the token data would be cached. However, the drawbacks due to token invalidation upon change of data would far outweigh the benefits unless the TTL was very short, perhaps 5 minutes.

Just making it work is one thing. In a follow on article, I’d like to go through what it would take to stretch a cluster from one datacenter to another, and to make sure that the other considerations that we discussed in that presentation are covered.

Feedback?

Episode 224 – Are old Android devices dangerous?

Posted by Josh Bressers on November 16, 2020 12:01 AM

Josh and Kurt talk about what happens when important root certificates expire on old Android devices? Who should be responsible? How can we fix this? Is this even something we can or should fix? How devices should age is a really hard problem that needs a lot of discussion.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2038-2" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_224_Are_old_Android_devices_dangerous.mp3?_=2" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_224_Are_old_Android_devices_dangerous.mp3</audio>

Show Notes

Episode 223 – Full disclosure won, deal with it

Posted by Josh Bressers on November 09, 2020 12:01 AM

Josh and Kurt talk about the idea behind the full disclosure of security vulnerability details. There have been discussions about this topic for decades with many people on all sides of the issue. The reality is however, if you look at the current state of things, this discussion is settled, full disclosure won.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2027-3" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_223_Full_disclosure_won_deal_with_it.mp3?_=3" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_223_Full_disclosure_won_deal_with_it.mp3</audio>

Show Notes

Dependency Injection in Java

Posted by Adam Young on November 07, 2020 12:04 AM

You might be thinking that this is a long solved problem. I think I have something a little bit different.

This is very similar to the C++ based one that I wrote long ago.

There are several design decisions in this approach that I would like to make explicit.

  • It favors constructor dependency resolution, although it does not require it.
  • It favors type safety.
  • It does not require modification of existing code to add decorators, and thus allows you to mix code from different sources.
  • The dependency declarations are all done in simple Java.
  • The components built this way can implement the composite design pattern, and the framework can be thought of as the builder for them.
  • The declaration of the factories is separate from the implementation of the objects themselves. This allows different factories to be used in different settings.

The resolvers can be chained in parent/child relationships. A common example of this is in a web application, a request will be the child of a session, which in turn will be the child of the web application. The rule is that a child object can resolve a dependency via the parent, but a parent cannot resolve a dependency via a child. Parent scoped objects are expected to live beyond the lifespan of child scoped objects.

Java Generics can be used to select the appropriate factory function to create an instance of a class. Say I want to create an instance of MessagePump, a class that pulls a message from a source and sends it to a sink.

MessagePump messagePump = childResolver.fetch(MessagePump.class);

If we have a factory interface like this:

package com.younglogic.resolver;

public interface Factory<T> {
	T create(Resolver registry);
}

You can collect them up in a registry like this:

package com.younglogic.resolver;

import java.util.HashMap;
import java.util.Map;

public class Registry {
	
	@SuppressWarnings("rawtypes")
	Map<Class, Factory> factories = new HashMap<Class, Factory>();

	public Resolver createResolver(Resolver parent) {
		return new Resolver(this, parent);
	}

	<T> void register(Class<T> c, Factory<T> f) {
		factories.put(c, f);
	};

}

The actual factory implementations can be anonymous classes.

registry.register(MessageSink.class, new Factory<MessageSink>() {
    @Override
    public ConcreteMessageSink create(Resolver registry) {
	return new ConcreteMessageSink();
    }
});

The resolution is done via a lazy load proxy defined like this:

package com.younglogic.resolver;

import java.util.HashMap;
import java.util.Map;
import java.lang.InstantiationError;

public class Resolver {

	@SuppressWarnings("rawtypes")
	Map<Class, Object> instances = new HashMap<Class, Object>();

	private Registry registry;
	private Resolver parent;

	Resolver(Registry registry, Resolver parent) {
		super();
		this.registry = registry;
		this.parent = parent;
	}

	@SuppressWarnings("unchecked")
	public <T extends Object> T fetch(@SuppressWarnings("rawtypes") Class c) throws InstantiationError {
		T o = (T) instances.get(c);
		if (o == null) {
			// Don't synchronize for the fast path, only the
			// slow path where we need to create a new object.

			Factory<T> factory = registry.factories.get(c);
			if (factory == null) {
				if (parent == null){
					throw new InstantiationError();
				}
				return parent.fetch(c);
			}
			synchronized (instances) {
				// Re-check under the lock: another thread may have created it.
				o = (T) instances.get(c);
				if (o == null) {
					try {
						o = (T) factory.create(this);
						instances.put(c, o);
					} catch (ClassCastException e) {
						throw new InstantiationError();
					}
				}
			}
		}
		return o;
	}
}

One feature I have not implemented is a way to distinguish between two different components that implement the same interface.

The drawback to Java in this case is that there is no cleanup; you are still depending on the garbage collector, and finalize might never be called for your objects.

Full code is here: https://github.com/admiyo/javaresolver

Hidden Tuples

Posted by Adam Young on November 06, 2020 11:17 PM

If you are going to write a Sudoku solver, write a brute force, depth first search. You can get it running fast enough.

But what if you couldn’t? What if the puzzles were so big that solving them by brute force was not computationally feasible? A Sudoku puzzle is built on a basis of 3: the blocks are 3×3, there are 3×3 of them in the puzzle, and the rows and columns are 9 cells (3 × 3) long. This approach scales up. If you were to use a basis of 4, you could use the hexadecimal digits and have 16×16 puzzles.

A basis of K leads to a puzzle size of K^4 cells. The basis can be any integer; a basis of 10 would lead to a puzzle size of 10,000.
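The growth is easy to see with a quick sketch (the helper name is hypothetical):

```python
def puzzle_cells(basis: int) -> int:
    """Hypothetical helper: total cells for a Sudoku with the given basis.
    The side length is basis**2, so the board holds (basis**2)**2 = basis**4 cells."""
    return basis ** 4

assert puzzle_cells(3) == 81       # the classic 9x9 puzzle
assert puzzle_cells(4) == 256      # the 16x16 hexadecimal variant
assert puzzle_cells(10) == 10_000
```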

The Sudoku puzzle shows exponential growth. https://en.wikipedia.org/wiki/Combinatorial_explosion#Sudoku

What could you do for a complex puzzle? Use heuristics to reduce the problem set to the point where the brute force algorithm can complete.

One of the most powerful heuristics to use is the Hidden Tuples algorithm, and I recently went through the process of implementing it. I’ll post the code, but first, a bit on notation.

I am going to use the term house to refer to a row, block, or column. It is a generic term to mean any one of them, or, when I say houses, all three of them.

The simplest format for storing a Sudoku puzzle is as a string. You can mark the solved cells with a single digit, and an unsolved cell as a blank. However, to operate on a column you need to do the math to find each cell’s location. It turns out that most programming languages do exactly this for two-dimensional arrays, so the logical way to store the puzzle in memory while working on it is as a 2D array.

Thus a puzzle defined like this:

sample_puzzle = ("800000000" +
                 "003600000" +
                 "070090200" +
                 "050007000" +
                 "000045700" +
                 "000100030" +
                 "001000068" +
                 "008500010" +
                 "090000400")

Would be laid out logically like this:

 8                
     3 6          
   7     9   2    
   5       7      
         4 5 7    
       1       3  
     1         6 8
     8 5       1  
   9         4    

Another way of interpreting the blank cell is that it can potentially be any value: 1-9 for a basis of 3, 1 through Basis^2 for larger puzzles. However, those blank cells can be rewritten as arrays of digits. You can then remove each digit once it is no longer viable. Thus, a solved cell is one that contains a single digit, and an unsolved cell contains more than one digit. This is one of the techniques used when solving a puzzle by hand. But when solving it computationally, it is actually easier to go ahead and expand out a blank board so that every cell is filled with:

['1','2','3','4','5','6','7','8','9']
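Maintaining those candidate sets means removing each placed digit from every peer cell, i.e. every cell sharing a row, column, or block. A minimal Python sketch of that bookkeeping for the basis-3 case (`peers` and `assign` are hypothetical helper names, not from the solver discussed here):

```python
BASIS = 3
SIDE = BASIS * BASIS

def peers(row, col):
    """Cells sharing a row, column, or block with (row, col), excluding itself."""
    result = set()
    for i in range(SIDE):
        result.add((row, i))
        result.add((i, col))
    br, bc = row - row % BASIS, col - col % BASIS  # top-left corner of the block
    for r in range(br, br + BASIS):
        for c in range(bc, bc + BASIS):
            result.add((r, c))
    result.discard((row, col))
    return result

def assign(board, row, col, digit):
    """Write a solved digit and eliminate it from every peer's candidates."""
    board[(row, col)] = {digit}
    for cell in peers(row, col):
        board[cell].discard(digit)

# Start with every cell holding all nine candidates, then place an 8 at (0, 0).
board = {(r, c): set('123456789') for r in range(SIDE) for c in range(SIDE)}
assign(board, 0, 0, '8')
assert board[(0, 0)] == {'8'}
assert '8' not in board[(0, 5)]  # same row
assert '8' not in board[(5, 0)]  # same column
assert '8' not in board[(1, 1)]  # same block
```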

Now, when you write your puzzle on the board, as you write each solved digit into its location you also go ahead and remove that digit from all of the other cells in the same column, row, and block. Thus, a partial solution looks like this:

┏━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┳━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┳━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┓
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃    8    │   1246  │  24569  ┃   2347  │  12357  │   1234  ┃  13569  │   4579  │ 1345679 ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┠─────────┼─────────┼─────────╂─────────┼─────────┼─────────╂─────────┼─────────┼─────────┨
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  12459  │   124   │    3    ┃    6    │  12578  │   1248  ┃   1589  │  45789  │  14579  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┠─────────┼─────────┼─────────╂─────────┼─────────┼─────────╂─────────┼─────────┼─────────┨
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃   1456  │    7    │   456   ┃   348   │    9    │   1348  ┃    2    │   458   │  13456  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┣━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━╋━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━╋━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━┫
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  123469 │    5    │   2469  ┃   2389  │   2368  │    7    ┃   1689  │   2489  │  12469  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┠─────────┼─────────┼─────────╂─────────┼─────────┼─────────╂─────────┼─────────┼─────────┨
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  12369  │  12368  │   269   ┃   2389  │    4    │    5    ┃    7    │   289   │   1269  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┠─────────┼─────────┼─────────╂─────────┼─────────┼─────────╂─────────┼─────────┼─────────┨
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  24679  │   2468  │  24679  ┃    1    │   268   │   2689  ┃   5689  │    3    │  24569  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┣━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━╋━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━╋━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━┫
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  23457  │   234   │    1    ┃  23479  │   237   │   2349  ┃   359   │    6    │    8    ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┠─────────┼─────────┼─────────╂─────────┼─────────┼─────────╂─────────┼─────────┼─────────┨
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  23467  │   2346  │    8    ┃    5    │   2367  │  23469  ┃    39   │    1    │   2379  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┠─────────┼─────────┼─────────╂─────────┼─────────┼─────────╂─────────┼─────────┼─────────┨
┃         │         │         ┃         │         │         ┃         │         │         ┃
┃  23567  │    9    │   2567  ┃   2378  │  123678 │  12368  ┃    4    │   257   │   2357  ┃
┃         │         │         ┃         │         │         ┃         │         │         ┃
┗━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┻━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┻━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┛

We can further reduce the above puzzle by running additional algorithms against it. At the simplest, we can just search for singletons: numbers that appear in only one cell in a house.
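As a sketch (my own illustration, not code from the solver), the singleton search over a single house can be written as:

```python
# A digit whose only home in a house is a single unsolved cell must go in
# that cell. The house below is a made-up example, not a row from the board.
def find_hidden_singles(house):
    """house: one candidate string per cell. Returns {digit: cell_index}."""
    singles = {}
    for digit in '123456789':
        spots = [i for i, cell in enumerate(house) if digit in cell]
        # exactly one candidate cell, and that cell is not already solved
        if len(spots) == 1 and len(house[spots[0]]) > 1:
            singles[digit] = spots[0]
    return singles

house = ['279', '15', '2579', '4', '257', '8', '57', '3', '6']
print(find_hidden_singles(house))  # {'1': 1}: the 1 can only go in cell 1
```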

The most complex algorithm I have heard of people actually using is hidden tuple removal. I use the word tuple to cover the three lengths that are most valuable: pairs, triplets, and quads. A hidden pair is a pair of digits that exists in only 2 cells in a house. When these are identified, you can remove any other numbers from the cells where the pair resides. More powerfully, you can extrapolate into the other houses; if both of those cells from a row or column reside in the same block, you can remove those two numbers from all of the other cells in that block.

When we scale this pattern up to the quad tuple, it gets even more powerful. See, no one cell needs to hold all four numbers. We just need 4 cells. These cells can contain other numbers, but the numbers in this set must be constrained to those four cells. Thus, if one cell has [1,2,8,9], another has [1,3,8,9], another has [2,3,4,8,9], and a fourth has [4,8,9], and both 8 and 9 appear in other cells, we can define the tuple as [1,2,3,4] and remove the 8 and the 9 from all of these cells.
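To make that reduction concrete, here is a toy version over a single house. The first four candidate lists come from the example above; the remaining cells are hypothetical filler I invented to round out the house:

```python
# Hidden-quad reduction on one house. The quad [1,2,3,4] appears only in the
# first four cells, so every other digit can be stripped from those cells.
quad = {'1', '2', '3', '4'}
house = [
    ['1', '2', '8', '9'],
    ['1', '3', '8', '9'],
    ['2', '3', '4', '8', '9'],
    ['4', '8', '9'],
    ['5', '6', '8'],       # filler cells: no quad digits here
    ['5', '6', '7', '9'],
    ['6', '7'],
    ['5', '7', '8'],
    ['6', '9'],
]

# The quad is "hidden" if its digits appear in exactly four cells of the house.
quad_cells = [cell for cell in house if quad & set(cell)]
assert len(quad_cells) == 4

# Reduction: strip every non-quad digit (here 8 and 9) from those four cells.
for cell in quad_cells:
    cell[:] = [d for d in cell if d in quad]

print(house[:4])  # [['1', '2'], ['1', '3'], ['2', '3', '4'], ['4']]
```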

The algorithm is expensive. Basically: create an exhaustive list of candidate sets, figure out which cells contain any digit of each set, and then keep the sets that landed in exactly the right number of cells, e.g. 4 for hidden quads.

def find_and_reduce_hidden_tuples(board, n):
    found = find_hidden_tuples(board, n)
    reduce_hidden_tuples(board, found, n)

First, let's talk about finding the hidden tuples: generate a map, then look through it for entries that matched exactly k cells.

def find_hidden_tuples(board, k):
    tuple_map = generate_hidden_tuple_map(board, k)
    found = dict()
    for (key, values) in tuple_map.items():
        if len(values) == k:
            found[key] = values
    return found

To generate the map, we need to generate an exhaustive list of all the subsets, then iterate through each cell on the board. If a cell contains a non-empty subset of the initial set, we add it to the potential matches.

def generate_hidden_tuple_map(board, k):
    tuples = gen_subsets(common.FULL_SET, k)
    tuple_map = dict()
    for cell in BoardCellIterator(board):
        if len(cell.get()) == 1:
            continue
        for tuple in tuples:
            for digit in tuple:
                if digit in cell.get():
                    append_to_tuple_map(
                        tuple_map, cell.r, cell.c, tuple, board)
                    break
    return tuple_map

To create the exhaustive list of sets:

# select k from n.  Number of elements will be
# n!/(k!(n-k)!)
def gen_subset_indexes(n, k):
    subsets = []
    max = 1 << n
    for i in range(max):
        indexes = []
        for x in range(n):  # use n, not a hard-coded 9, so larger boards work
            if (i >> x) & 1 == 1:
                indexes.append(x)
        if len(indexes) == k:
            subsets.append(indexes)
    return subsets


# generates all subsets of length k
def gen_subsets(allset, k):
    subsets = []
    indexes = gen_subset_indexes(len(allset), k)
    for i in indexes:
        subset = []
        for j in i:
            subset.append(allset[j])
        subsets.append(subset)
    return subsets

I am kind of proud of this code. It uses the fact that a subset can be implemented as a bit map, a technique I remember learning from Programming Pearls. Basically, look at each n-bit value and keep the ones where the number of bits set is k. Those bits map to the indexes of the subset characters, which happen to correspond to their digit values as well.
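As a quick sanity check (mine, not from the post), the bitmap enumeration agrees with the standard library's combination generator:

```python
# Bitmap subset enumeration, checked against itertools: every n-bit value
# with exactly k bits set names one k-subset of the indexes 0..n-1.
from itertools import combinations

def gen_subset_indexes(n, k):
    subsets = []
    for i in range(1 << n):  # i is a bitmap over the n elements
        indexes = [x for x in range(n) if (i >> x) & 1]
        if len(indexes) == k:
            subsets.append(indexes)
    return subsets

bitmap = {tuple(s) for s in gen_subset_indexes(9, 4)}
assert bitmap == set(combinations(range(9), 4))
print(len(bitmap))  # 126, i.e. 9!/(4! * 5!)
```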

Once we have all of the subsets, we look for the matches. These will go in a map. I created a custom type for the key for this map:

class TupleKey:
    def __init__(self, house, location, tuple):
        self.house = house
        self.location = location
        self.tuple = tuple

    def __eq__(self, other):
        if not isinstance(other, TupleKey):
            return False
        return ((self.house == other.house) and
                (self.location == other.location) and
                (self.tuple == other.tuple))

    def __hash__(self):
        return hash(self.__str__())

    def __repr__(self):
        return self.__str__()

    def __str__(self):
        return "%s %s %s" % (self.house, self.location, self.tuple)

This key is tuned to how we need to look up the value at the end; it contains all of the information that makes it unique, and it provides a way to find the value in the overall board. Note that the location field is tuned to the house; if the house is a row, we store the row, if it is a block, we store a composite value that uniquely identifies the block.

Appending to the map involves checking for singletons: cells already solved with one of the tuple's digits. A house containing such a cell disqualifies the tuple as a potential match for that house.

def append_to_tuple_map(tuple_map, r, c, tuple, board):
    def no_singletons(board, itr):
        ok = True
        for cell in itr:
            if not ok:
                break
            for digit in tuple:
                if cell.get() == digit:
                    ok = False
                    break
        return ok

    tuple_str = ""
    for i in tuple:
        tuple_str += i
    keys = []

    if no_singletons(board, RowCellIterator(board, r)):
        keys.append(TupleKey('r', r, tuple_str))

    if no_singletons(board, ColCellIterator(board, c)):
        keys.append(TupleKey('c', c, tuple_str))

    if no_singletons(board, BlockCellIterator(board, r, c)):
        keys.append(TupleKey('b', block_for_rc(r, c), tuple_str))

    for key in keys:
        if tuple_map.get(key) is None:
            tuple_map[key] = []
        tuple_map[key].append((r, c))


def block_for_rc(row, col):
    sg_row = math.floor(row / common.BASIS)
    sg_col = math.floor(col / common.BASIS)
    return [sg_row, sg_col]
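To see the composite block key in action, here is the helper run standalone, with BASIS pinned to 3 as a stand-in for common.BASIS on a 9x9 board:

```python
import math

BASIS = 3  # stands in for common.BASIS on a 9x9 board

def block_for_rc(row, col):
    # block coordinates: which of the 3x3 blocks the cell falls in
    sg_row = math.floor(row / BASIS)
    sg_col = math.floor(col / BASIS)
    return [sg_row, sg_col]

assert block_for_rc(0, 0) == [0, 0]  # top-left block
assert block_for_rc(4, 7) == [1, 2]  # middle band, rightmost block
print(block_for_rc(8, 8))  # [2, 2]
```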

Now that we have a map that contains all the hidden tuples, we reduce the cells affected by each hidden tuple:

def reduce_hidden_tuples(board, found, n):
    for (key, cells) in found.items():
        digits_in = key.tuple
        digits_out = common.FULL_SET
        for c in digits_in:
            digits_out = digits_out.replace(c, '')

        itr = None
        if key.house == 'b':
            itr = BlockCellIterator(board, cells[0][0], cells[0][1])
        elif key.house == 'r':
            itr = RowCellIterator(board, cells[0][0])
        elif key.house == 'c':
            itr = ColCellIterator(board, cells[0][1])
        else:
            assert(itr is not None)

        for cell in itr:
            ct = (cell.r, cell.c)
            if ct in cells:
                cell.set(cell.remove_from_set(digits_out))
            else:
                cell.set(cell.remove_from_set(digits_in))

The digits_out removal is, I think, ineffectual. I convinced myself that I needed it when I wrote it, but I am fairly certain that the cells outside of the found list will never contain those numbers.

I am not certain if it makes sense to run this algorithm unless it is coupled with other techniques that make use of the small reductions found by it to do larger reductions on the whole board.

If we go back to our Sudoku puzzles with basis > 3, the tuples found by this technique can be larger than 4. I suspect that there is a relationship on the order of Basis+1. Thus, for our basis of 10, we probably could expect to find matches with a tuple size of 11, slightly greater than one in ten cells in a house.

But this algorithm is expensive. The tuple map grows with the number of elements on the board. I've not calculated the growth, but I suspect it is on the order of n^2. Still, that should be better than n! or exponential.

The number of distinct subsets of size k from a set of size n is n!/(k!(n-k)!), which means, among other things, that the count is symmetric: there are exactly as many subsets of size k as there are of size n-k.
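A quick standard-library check of that count (my own sketch, not from the post):

```python
import math

# C(9, 4): the number of candidate quad sets over a 9-digit house,
# matching the factorial formula n! / (k! (n-k)!)
assert math.comb(9, 4) == math.factorial(9) // (math.factorial(4) * math.factorial(5))
print(math.comb(9, 4))  # 126

# The count is symmetric in k: choosing k elements is the same as leaving out n-k
assert math.comb(9, 2) == math.comb(9, 7) == 36
```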

However, as the size of n grows, the number of subset lengths we need to generate also grows. A 9 x 9 Sudoku only needs subsets of lengths 2, 3, and 4. If we assume that we should not cross the halfway boundary, a 25 x 25 will need lengths from 2 through 12.

Well, not “need” but rather “benefit from.” You can limit it to hidden “quads” and lower. But the larger puzzle is also going to have larger percentages of hit rates at all tuple sizes. I don’t think I’m prepared to advise what would be the right way to scale it.

Episode 222 – HashiCorp Boundary with Jeff Mitchell

Posted by Josh Bressers on November 02, 2020 12:01 AM

Josh and Kurt talk to Jeff Mitchell about the new HashiCorp project Boundary. We discuss what Boundary is, why it’s cooler than a VPN, and how you can get involved.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2023-4" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_222_HashiCorp_Boundary_with_Jeff_Mitchell.mp3?_=4" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_222_HashiCorp_Boundary_with_Jeff_Mitchell.mp3</audio>

Show Notes

Long Refactoring: Completing the Iterator

Posted by Adam Young on October 29, 2020 01:01 AM

Now that the algorithm does not need a new test_board every time, we no longer need to treat the Node object as a Flyweight. We can move the board into the Node object and remove it from the parameter list of all the functions that operate on it.

Changing the Node object parameter list requires changing all the places it is called. The unit test helps here by identifying the places I break when I modify the function declaration.

______________________________________________________ ERROR collecting treesudoku/test_tree_soduku.py _______________________________________________________
treesudoku/test_tree_soduku.py:1: in <module>
    from treesudoku import tree_sudoku
treesudoku/tree_sudoku.py:245: in <module>
    solver = SudokuSolver(import_csv())
treesudoku/tree_sudoku.py:34: in __init__
    return_string = self.tree_to_solution_string(value)
treesudoku/tree_sudoku.py:38: in tree_to_solution_string
    head_node = Tree_Node(None, 0)
E   TypeError: __init__() missing 1 required positional argument: 'table'

Here are the places I had to change. Note that the new parameter is not yet used. We just know, logically, that the parameter passed in to the functions is always going to match the member.

diff --git a/treesudoku/test_tree_soduku.py b/treesudoku/test_tree_soduku.py
index be4b95b..1266e11 100644
--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -50,7 +50,7 @@ def test_sudoku_solver():
 
 def test_advance():
     test_board = tree_sudoku.build_board(puzzle0)
-    node = tree_sudoku.Tree_Node(None, 0)
+    node = tree_sudoku.Tree_Node(None, 0, test_board)
     node.write(test_board)
     assert(test_board[0][0] == '9')
     node = node.advance(test_board)
diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 89096d7..7d288ce 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -35,9 +35,9 @@ class SudokuSolver:
             self.solved_board_strings[key] = return_string
 
     def tree_to_solution_string(self, original_board):
-        head_node = Tree_Node(None, 0)
-        curr_node = head_node
         test_board = copy.deepcopy(original_board)
+        head_node = Tree_Node(None, 0, test_board)
+        curr_node = head_node
         curr_node.check_solved(test_board)
         while True:
             if not curr_node.solved:
@@ -187,12 +187,13 @@ board_index = BoardIndexTable()
 
 
 class Tree_Node:
-    def __init__(self, last_node, index):
+    def __init__(self, last_node, index, board):
         self.board_spot = board_index.table[index]
         self.last_node = last_node
         self.next_node = None
         self.solved = False
         self.index = index
+        self.board = board
         self.reset()
 
     def reset(self):
@@ -223,7 +224,7 @@ class Tree_Node:
     def advance(self, test_board):
         node = self
         if node.next_node is None:
-            new_node = Tree_Node(node, self.index + 1)
+            new_node = Tree_Node(node, self.index + 1, self.board)
             new_node.check_solved(test_board)
             node.next_node = new_node
         return node.next_node

Next step is to wire up the functions to use the member variable instead of the parameter. There are two ways to do this: change one function's body, parameter list, and calling locations all at the same time; or change all the function bodies first, and then each of the places that call them. I prefer the latter, as it lets me keep the code running successfully with fewer breaking changes between stable points.

This is a long change.

--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -50,19 +50,19 @@ def test_sudoku_solver():
 
 def test_advance():
     test_board = tree_sudoku.build_board(puzzle0)
-    node = tree_sudoku.Tree_Node(None, 0)
-    node.write(test_board)
+    node = tree_sudoku.Tree_Node(None, 0, test_board)
+    node.write()
     assert(test_board[0][0] == '9')
-    node = node.advance(test_board)
-    node = node.advance(test_board)
-    node.write(test_board)
+    node = node.advance()
+    node = node.advance()
+    node.write()
     assert(test_board[0][3] == '0')
-    node = node.advance(test_board)
-    node.write(test_board)
+    node = node.advance()
+    node.write()
     assert(test_board[0][3] == '9')
-    back_node = node.retreat(test_board)
+    back_node = node.retreat()
     assert(test_board[0][3] == '0')
     assert(node.value == "9")
-    back_node.write(test_board)
+    back_node.write()
     assert(test_board[0][2] == '3')
     assert(back_node.board_spot == '02')
diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 89096d7..00ee60c 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -35,24 +35,24 @@ class SudokuSolver:
             self.solved_board_strings[key] = return_string
 
     def tree_to_solution_string(self, original_board):
-        head_node = Tree_Node(None, 0)
-        curr_node = head_node
         test_board = copy.deepcopy(original_board)
-        curr_node.check_solved(test_board)
+        head_node = Tree_Node(None, 0, test_board)
+        curr_node = head_node
+        curr_node.check_solved()
         while True:
             if not curr_node.solved:
-                curr_node.write(test_board)
+                curr_node.write()
             if self.box_index.is_value_valid(test_board, curr_node):
                 if curr_node.index + 1 >= MAX:
                     break
-                curr_node = curr_node.advance(test_board)
-                curr_node.check_solved(test_board)
+                curr_node = curr_node.advance()
+                curr_node.check_solved()
             else:
                 # backtrack
                 while len(curr_node.possible_values) == 0:
-                    curr_node = curr_node.retreat(test_board)
+                    curr_node = curr_node.retreat()
                 curr_node.next()
-                curr_node.write(test_board)
+                curr_node.write()
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):
@@ -187,12 +187,13 @@ board_index = BoardIndexTable()
 
 
 class Tree_Node:
-    def __init__(self, last_node, index):
+    def __init__(self, last_node, index, board):
         self.board_spot = board_index.table[index]
         self.last_node = last_node
         self.next_node = None
         self.solved = False
         self.index = index
+        self.board = board
         self.reset()
 
     def reset(self):
@@ -205,36 +206,36 @@ class Tree_Node:
     def __str__(self):
         return self.value
 
-    def write(self, board):
+    def write(self):
         row = int(self.board_spot[0])
         col = int(self.board_spot[1])
-        board[row][col] = self.value
+        self.board[row][col] = self.value
 
-    def check_solved(self, board):
+    def check_solved(self):
         if self.solved:
             return
         row = int(self.board_spot[0])
         col = int(self.board_spot[1])
-        if board[row][col] != '0':
-            self.value = board[row][col]
+        if self.board[row][col] != '0':
+            self.value = self.board[row][col]
             self.possible_values = []
             self.solved = True
 
-    def advance(self, test_board):
+    def advance(self):
         node = self
         if node.next_node is None:
-            new_node = Tree_Node(node, self.index + 1)
-            new_node.check_solved(test_board)
+            new_node = Tree_Node(node, self.index + 1, self.board)
+            new_node.check_solved()
             node.next_node = new_node
         return node.next_node
 
-    def retreat(self, test_board):
+    def retreat(self):
         node = self
         node.reset()
         if not self.solved:
             row = int(self.board_spot[0])
             col = int(self.board_spot[1])
-            test_board[row][col] = '0'
+            self.board[row][col] = '0'
         node = self.last_node
         node.next_node = None
         return node

Here is our current algorithmic function:

    def tree_to_solution_string(self, original_board):
        test_board = copy.deepcopy(original_board)
        head_node = Tree_Node(None, 0, test_board)
        curr_node = head_node
        curr_node.check_solved()
        while True:
            if not curr_node.solved:
                curr_node.write()
            if self.box_index.is_value_valid(test_board, curr_node):
                if curr_node.index + 1 >= MAX:
                    break
                curr_node = curr_node.advance()
                curr_node.check_solved()
            else:
                # backtrack
                while len(curr_node.possible_values) == 0:
                    curr_node = curr_node.retreat()
                curr_node.next()
                curr_node.write()
        return self.build_solution_string(head_node)

One thing that occurred to me when I reviewed this is that when we create a new node, we immediately check whether it is solved. Node creation is done at the top of the function and in the advance function. If we move the check_solved call into the constructor, we treat it as an invariant.

-- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -38,7 +38,6 @@ class SudokuSolver:
         test_board = copy.deepcopy(original_board)
         head_node = Tree_Node(None, 0, test_board)
         curr_node = head_node
-        curr_node.check_solved()
         while True:
             if not curr_node.solved:
                 curr_node.write()
@@ -46,7 +45,6 @@ class SudokuSolver:
                 if curr_node.index + 1 >= MAX:
                     break
                 curr_node = curr_node.advance()
-                curr_node.check_solved()
             else:
                 # backtrack
                 while len(curr_node.possible_values) == 0:
@@ -195,6 +193,7 @@ class Tree_Node:
         self.index = index
         self.board = board
         self.reset()
+        self.check_solved()
 
     def reset(self):
         self.value = '9'
@@ -225,7 +224,6 @@ class Tree_Node:
         node = self
         if node.next_node is None:
             new_node = Tree_Node(node, self.index + 1, self.board)
-            new_node.check_solved()
             node.next_node = new_node
         return node.next_node

Something else that is now apparent: we are writing the value both at the bottom and the top of the loop, and that is redundant. We should be able to remove the write at the bottom.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index eadd374..0c7b16c 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -50,7 +50,6 @@ class SudokuSolver:
                 while len(curr_node.possible_values) == 0:
                     curr_node = curr_node.retreat()
                 curr_node.next()
-                curr_node.write()
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):

This is the common pattern of a long refactoring. You tease apart, and then you simplify. Extract method in one place, Inline that method somewhere else if applicable. The algorithm is now fairly straightforward and understandable, and I would feel that the effort put in thus far would be justified in a production project. But more can certainly be done.

Long Refactoring: Streamline the Algorithm

Posted by Adam Young on October 29, 2020 12:56 AM

The code in tree_to_solution_string mixes the logic of solving the puzzle with the management of a linked list. Splitting your attention between these two levels can make it hard to track down errors. To continue teasing these two aspects apart, we need to make heavier use of the iterator. We’re in the middle of it now, and the code might actually feel like it is more of a mess than when we started. That is common, natural, and nothing to be afraid of.

Well, unless we get directed on to a different task tomorrow. Let's finish this up tonight.

Remember to run the tests after every change.

The allocation of new nodes is done in this code:

                new_node = Tree_Node(
                    self.board_index.table[index], curr_node)
                new_node.check_solved(test_board)
                curr_node.next_node = new_node
                curr_node = new_node

Let's pull that out into its own function.

And…as we do it, we trip over that board_index table again. I am going to pull that out into a global variable for the moment. I have a feeling that I am going to end up encapsulating it inside our Node class eventually, but we are not there yet.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -29,15 +29,20 @@ class SudokuSolver:
         self.board_strings = board_strings
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
         self.box_index = BoxIndex()
-        self.board_index = BoardIndexTable()
         self.solved_board_strings = dict()
         for key, value in self.boards_dict.items():
             return_string = self.tree_to_solution_string(value)
             self.solved_board_strings[key] = return_string
 
     def tree_to_solution_string(self, original_board):
+        def advance(node, test_board):
+            new_node = Tree_Node(
+                board_index.table[index], node)
+            new_node.check_solved(test_board)
+            node.next_node = new_node
+            return new_node
         index = 0
-        head_node = Tree_Node(self.board_index.table[index])
+        head_node = Tree_Node(board_index.table[index])
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -51,11 +56,8 @@ class SudokuSolver:
                 index += 1
                 if index >= MAX:
                     continue
-                new_node = Tree_Node(
-                    self.board_index.table[index], curr_node)
-                new_node.check_solved(test_board)
-                curr_node.next_node = new_node
-                curr_node = new_node
+                curr_node = advance(curr_node, test_board)
+                curr_node.check_solved(test_board)
             else:
                 if len(curr_node.possible_values) == 0:
                     # backtrack
@@ -194,6 +196,9 @@ class BoardIndexTable:
         return return_list
 
 
+board_index = BoardIndexTable()
+
+
 class Tree_Node:
     def __init__(self, board_spot, last_node=None, next_node=None):
         self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']

The advance function belongs in the Node class. I don't think there is any benefit to keeping it in this interim form, even in git, so I am going to move it straight away. I do have to add an index parameter to it. Still messy.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -29,7 +29,6 @@ class SudokuSolver:
         self.board_strings = board_strings
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
         self.box_index = BoxIndex()
-        self.board_index = BoardIndexTable()
         self.solved_board_strings = dict()
         for key, value in self.boards_dict.items():
             return_string = self.tree_to_solution_string(value)
@@ -37,7 +36,7 @@ class SudokuSolver:
 
     def tree_to_solution_string(self, original_board):
         index = 0
-        head_node = Tree_Node(self.board_index.table[index])
+        head_node = Tree_Node(board_index.table[index])
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -51,11 +50,8 @@ class SudokuSolver:
                 index += 1
                 if index >= MAX:
                     continue
-                new_node = Tree_Node(
-                    self.board_index.table[index], curr_node)
-                new_node.check_solved(test_board)
-                curr_node.next_node = new_node
-                curr_node = new_node
+                curr_node = curr_node.advance(test_board, index)
+                curr_node.check_solved(test_board)
             else:
                 if len(curr_node.possible_values) == 0:
                     # backtrack
@@ -194,6 +190,9 @@ class BoardIndexTable:
         return return_list
 
 
+board_index = BoardIndexTable()
+
+
 class Tree_Node:
     def __init__(self, board_spot, last_node=None, next_node=None):
         self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']
@@ -222,6 +221,14 @@ class Tree_Node:
             self.possible_values = []
             self.solved = True
 
+    def advance(self, test_board, index):
+        node = self
+        new_node = Tree_Node(
+            board_index.table[index], node)
+        new_node.check_solved(test_board)
+        node.next_node = new_node
+        return new_node
+
 
 start = time.time()
 solver = SudokuSolver(import_csv())

Let's do the same with the code that deletes the node.

                if len(curr_node.possible_values) == 0:
                    # backtrack
                    while len(curr_node.possible_values) == 0:
                        curr_node = curr_node.last_node
                        curr_node.next_node = None
                        index -= 1

Move that to a helper function….

index 165948e..8d58d6f 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -35,6 +35,11 @@ class SudokuSolver:
             self.solved_board_strings[key] = return_string
 
     def tree_to_solution_string(self, original_board):
+        def retreat(node):
+            node = curr_node.last_node
+            node.next_node = None
+            return node
+
         index = 0
         head_node = Tree_Node(board_index.table[index])
         curr_node = head_node
@@ -56,8 +61,7 @@ class SudokuSolver:
                 if len(curr_node.possible_values) == 0:
                     # backtrack
                     while len(curr_node.possible_values) == 0:
-                        curr_node = curr_node.last_node
-                        curr_node.next_node = None
+                        curr_node = retreat(curr_node)
                         index -= 1
                 curr_node.next()
         return self.build_solution_string(head_node)

And then move it to the Node class

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -56,8 +56,7 @@ class SudokuSolver:
                 if len(curr_node.possible_values) == 0:
                     # backtrack
                     while len(curr_node.possible_values) == 0:
-                        curr_node = curr_node.last_node
-                        curr_node.next_node = None
+                        curr_node = curr_node.retreat()
                         index -= 1
                 curr_node.next()
         return self.build_solution_string(head_node)
@@ -229,6 +228,11 @@ class Tree_Node:
         node.next_node = new_node
         return new_node
 
+    def retreat(self):
+        node = self.last_node
+        node.next_node = None
+        return node
+
 
 start = time.time()
 solver = SudokuSolver(import_csv())

Let's do something about that index. What is clear is that there should only ever be 81 nodes, max, and each node knows its index. Having both the index and the board location is overkill, but you can never have too much overkill. First, let's clean up the parameter list for the Node initialization.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -36,7 +36,7 @@ class SudokuSolver:
 
     def tree_to_solution_string(self, original_board):
         index = 0
-        head_node = Tree_Node(board_index.table[index])
+        head_node = Tree_Node(board_index.table[index], None)
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -193,10 +193,9 @@ board_index = BoardIndexTable()
 
 
 class Tree_Node:
-    def __init__(self, board_spot, last_node=None, next_node=None):
+    def __init__(self, board_spot, last_node):
         self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']
         self.board_spot = board_spot
-        self.next_node = next_node
         self.last_node = last_node
         self.value = '9'
         self.solved = False

Now we can add an index parameter to the end. Yes, I know I removed it before; I was wrongish.

With the index parameter back, we can encapsulate the BoardIndexTable lookup inside the node object constructor.

    self.board_spot = board_index.table[index] 

And then remove it from the parameter list.

 --- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -36,7 +36,7 @@ class SudokuSolver:
 
     def tree_to_solution_string(self, original_board):
         index = 0
-        head_node = Tree_Node(board_index.table[index])
+        head_node = Tree_Node(None, index)
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -193,13 +193,14 @@ board_index = BoardIndexTable()
 
 
 class Tree_Node:
-    def __init__(self, board_spot, last_node=None, next_node=None):
+    def __init__(self, last_node, index):
         self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']
-        self.board_spot = board_spot
-        self.next_node = next_node
+        self.board_spot = board_index.table[index]
         self.last_node = last_node
+        self.next_node = None
         self.value = '9'
         self.solved = False
+        self.index = index
 
     def next(self):
         self.value = self.possible_values.pop()
@@ -222,8 +223,7 @@ class Tree_Node:
 
     def advance(self, test_board, index):
         node = self
-        new_node = Tree_Node(
-            board_index.table[index], node)
+        new_node = Tree_Node(node, self.index + 1)
         new_node.check_solved(test_board)
         node.next_node = new_node
         return new_node

Including in the advance function.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -50,7 +50,7 @@ class SudokuSolver:
                 index += 1
                 if index >= MAX:
                     continue
-                curr_node = curr_node.advance(test_board, index)
+                curr_node = curr_node.advance(test_board)
                 curr_node.check_solved(test_board)
             else:
                 if len(curr_node.possible_values) == 0:
@@ -221,7 +221,7 @@ class Tree_Node:
             self.possible_values = []
             self.solved = True
 
-    def advance(self, test_board, index):
+    def advance(self, test_board):
         node = self
         new_node = Tree_Node(node, self.index + 1)
         new_node.check_solved(test_board)
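As an aside: if the board index table enumerates cells in row-major order (which fill_board_index_table appears to do), the board spot could in principle be computed from the index with divmod rather than looked up. A hypothetical sketch, not part of the actual change:

```python
DIM = 9  # board dimension, matching the solver


def board_spot_for(index):
    # Row-major mapping: linear index 0..80 to the two-digit "rowcol" string.
    row, col = divmod(index, DIM)
    return str(row) + str(col)


assert board_spot_for(0) == '00'   # first cell
assert board_spot_for(9) == '10'   # start of the second row
assert board_spot_for(80) == '88'  # last cell
```

That would let us drop the table entirely, but keeping the table keeps this change small.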

Get rid of the index variable in the code, and use the value from the current node pointer instead. Yeah, I called it a pointer. The one place we have to watch is where it is used for early exit from the loop, but there we can use curr_node.index + 1.

What is trickier is that check at the top.

 while index < MAX:

This controls the whole loop. But curr_node is not always the maximum node when we reach this point. We can't just check curr_node.index < MAX: the continue statement kicks us back up here when we pass beyond the last node, so we'd never exit. If we use a break statement instead of the continue, it works.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -35,10 +35,9 @@ class SudokuSolver:
             self.solved_board_strings[key] = return_string
 
     def tree_to_solution_string(self, original_board):
-        index = 0
-        head_node = Tree_Node(None, index)
+        head_node = Tree_Node(None, 0)
         curr_node = head_node
-        while index < MAX:
+        while curr_node.index < MAX:
             curr_board_filling_node = head_node
             test_board = copy.deepcopy(original_board)
             curr_board_filling_node.write(test_board)
@@ -47,9 +46,8 @@ class SudokuSolver:
                 curr_board_filling_node.write(test_board)
 
             if self.box_index.is_value_valid(test_board, curr_node):
-                index += 1
-                if index >= MAX:
-                    continue
+                if curr_node.index + 1 >= MAX:
+                    break
                 curr_node = curr_node.advance(test_board)
                 curr_node.check_solved(test_board)
             else:
@@ -57,7 +55,6 @@ class SudokuSolver:
                     # backtrack
                     while len(curr_node.possible_values) == 0:
                         curr_node = curr_node.retreat()
-                        index -= 1
                 curr_node.next()
         return self.build_solution_string(head_node)

At this point, the statement at the top of the loop will never evaluate to False. We could replace it with while True.

It should be clear by now that the code writes the partial solution on a clean board, and then rolls forward with the next thing it is trying. Now that we have a backtrack function, we could reset the nodes to their default state and try again.

And that would mean we don't even need the early steps of the algorithm that roll forward. This is a pretty big jump. But the break in the middle of the code is the signal that this is where we can change things.

It seems I had a bug in advance; I did not find it until I wrote a unit test.

Here is the unit test change:

--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -31,7 +31,38 @@ puzzles = {
 }
 
 
+puzzle0 = ("003020600" +
+           "900305001" +
+           "001806400" +
+           "008102900" +
+           "700000008" +
+           "006708200" +
+           "002609500" +
+           "800203009" +
+           "005010300")
+
+
 def test_sudoku_solver():
     solver = tree_sudoku.SudokuSolver(tree_sudoku.import_csv())
     for key, solution in solver.solved_board_strings.items():
         assert solver.solved_board_strings[key] == puzzles[key]
+
+
+def test_advance():
+    test_board = tree_sudoku.build_board(puzzle0)
+    node = tree_sudoku.Tree_Node(None, 0)
+    node.write(test_board)
+    assert(test_board[0][0] == '9')
+    node = node.advance(test_board)
+    node = node.advance(test_board)
+    node.write(test_board)
+    assert(test_board[0][3] == '0')
+    node = node.advance(test_board)
+    node.write(test_board)
+    assert(test_board[0][3] == '9')
+    back_node = node.retreat(test_board)
+    assert(test_board[0][3] == '0')
+    assert(node.value == "9")
+    back_node.write(test_board)
+    assert(test_board[0][2] == '3')
+    assert(back_node.board_spot == '02')

And here is the change to the algorithm, including the fix for the advance method.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -38,24 +38,21 @@ class SudokuSolver:
         head_node = Tree_Node(None, 0)
         curr_node = head_node
         test_board = copy.deepcopy(original_board)
+        curr_node.check_solved(test_board)
         while True:
-            curr_board_filling_node = head_node
-            curr_board_filling_node.write(test_board)
-            while curr_board_filling_node.next_node:
-                curr_board_filling_node = curr_board_filling_node.next_node
-                curr_board_filling_node.write(test_board)
-
+            if not curr_node.solved:
+                curr_node.write(test_board)
             if self.box_index.is_value_valid(test_board, curr_node):
                 if curr_node.index + 1 >= MAX:
                     break
                 curr_node = curr_node.advance(test_board)
                 curr_node.check_solved(test_board)
             else:
-                if len(curr_node.possible_values) == 0:
-                    # backtrack
-                    while len(curr_node.possible_values) == 0:
-                        curr_node = curr_node.retreat(test_board)
+                # backtrack
+                while len(curr_node.possible_values) == 0:
+                    curr_node = curr_node.retreat(test_board)
                 curr_node.next()
+                curr_node.write(test_board)
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):
@@ -227,9 +224,9 @@ class Tree_Node:
         node = self
         if node.next_node is None:
             new_node = Tree_Node(node, self.index + 1)
-        new_node.check_solved(test_board)
-        node.next_node = new_node
-        return new_node
+            new_node.check_solved(test_board)
+            node.next_node = new_node
+        return node.next_node
 
     def retreat(self, test_board):
         node = self

It took a lot of trial and error to get this to run. Here is roughly what I learned:

The original code was rewriting the potential values on the board every pass. This should not have been necessary if the backtracking was correct. In general, the code was selecting a value on one trip through the loop, then writing it, checking it, and backtracking on the next.

With backtracking erasing the old values, there was no need to re-write the potential answer every time. You also don't need to free and reallocate the nodes on backtrack. Right now, the linked list is built on demand in a forward-only manner.

We are not done here, not by a long shot, but this is the biggest change in behavior. Going forward, there should not be a need to change the functioning of the algorithm itself.
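The shape we have arrived at (pick a value, validate, advance, backtrack along a forward-only linked list that reuses nodes) can be demonstrated on a toy problem. This sketch is not the solver itself; it searches for a strictly increasing digit string instead of a Sudoku solution, but the Node class mirrors the advance/retreat mechanics described above:

```python
class Node:
    """Toy version of Tree_Node: a forward-only list with resettable nodes."""

    DIGITS = ['1', '2', '3']

    def __init__(self, last_node, index):
        self.last_node = last_node
        self.next_node = None
        self.index = index
        self.reset()

    def reset(self):
        self.possible_values = list(self.DIGITS)
        self.value = None

    def next(self):
        self.value = self.possible_values.pop()

    def advance(self):
        # Build the list on demand; reuse a node left over from a prior attempt.
        if self.next_node is None:
            self.next_node = Node(self, self.index + 1)
        return self.next_node

    def retreat(self):
        # Reset this node so it can be reused, then hand control back.
        self.reset()
        return self.last_node


def solve(length):
    """Find a strictly increasing digit string of the given length."""
    head = Node(None, 0)
    curr = head
    curr.next()
    while True:
        ok = curr.last_node is None or int(curr.value) > int(curr.last_node.value)
        if ok:
            if curr.index + 1 >= length:
                break
            curr = curr.advance()
            curr.next()
        else:
            # Backtrack past exhausted nodes, resetting each as we go.
            while len(curr.possible_values) == 0:
                curr = curr.retreat()
            curr.next()
    result, node = '', head
    while node is not None:
        result += node.value
        node = node.next_node
    return result
```

Backtracking here resets nodes in place rather than discarding them, which is exactly the behavior change described above.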

Long Refactoring: Introduce Iterator

Posted by Adam Young on October 29, 2020 12:52 AM

In a previous article, I had to shorten a bunch of lines that used a row and column value as indexes into the board array. This repeated pattern is a call to action.

We want to encapsulate the logic for referring to a particular place on the board, and for advancing through the board. This is the responsibility of the Iterator pattern.

Spoiler Alert: we don’t get all the way there in this article.
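For reference, getting "all the way there" would mean something like Python's iterator protocol: an object whose __iter__ yields elements so that callers never touch the row/column arithmetic. A hypothetical board-cell iterator (illustrative only, not the code we end up with):

```python
class BoardCells:
    """Iterate over a dim x dim board in row-major order, yielding (row, col, value)."""

    def __init__(self, board, dim=9):
        self.board = board
        self.dim = dim

    def __iter__(self):
        # A generator satisfies the iterator protocol; callers never see
        # the index arithmetic.
        for index in range(self.dim * self.dim):
            row, col = divmod(index, self.dim)
            yield row, col, self.board[row][col]


# Usage: count unsolved ('0') cells without index math at the call site.
board = [['0'] * 9 for _ in range(9)]
board[0][0] = '5'
unsolved = sum(1 for _, _, value in BoardCells(board) if value == '0')
```

The point is that the traversal logic lives in one place, which is what the refactoring below is edging toward.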

Let's take a look at one specific piece of code, the code that sets a value:

curr_board_filling_node = curr_board_filling_node.next_node
row = int(curr_board_filling_node.board_spot[0])
col = int(curr_board_filling_node.board_spot[1])
test_board[row][col] = curr_board_filling_node.value

curr_board_filling_node is acting like an iterator already, except that it does not encapsulate the set function of the assignment in the last line. If the curr_board_filling_node had a pointer back to the board, the code to set the value would be

    def set(self, value):
        row = int(self.board_spot[0])
        col = int(self.board_spot[1])
        self.board[row][col] = value

However, the nodes do not currently hold on to a reference to the table. Could we make them? Right now, they act like flyweights, taking the table as external state. We can keep that pattern up for now. Also, we note that whenever we are setting a value, it is from the node in question. That means we can write the method like this:

    def write(self, board):
        row = int(self.board_spot[0])
        col = int(self.board_spot[1])
        board[row][col] = self.value

Note that I call this “write” as opposed to “set” for clarity. Set implies that you are setting a value on the object. Here, we are implying that we are writing a value into the parameter that has been passed in.

Here is the entirety of the code change. Again, it is minimal, to be legible. Already the main function is starting to be more understandable.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index dc5a96b..cd8a002 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -42,14 +42,10 @@ class SudokuSolver:
         while index < MAX:
             curr_board_filling_node = head_node
             test_board = copy.deepcopy(original_board)
-            row = int(curr_board_filling_node.board_spot[0])
-            col = int(curr_board_filling_node.board_spot[1])
-            test_board[row][col] = curr_board_filling_node.value
+            curr_board_filling_node.write(test_board)
             while curr_board_filling_node.next_node:
                 curr_board_filling_node = curr_board_filling_node.next_node
-                row = int(curr_board_filling_node.board_spot[0])
-                col = int(curr_board_filling_node.board_spot[1])
-                test_board[row][col] = curr_board_filling_node.value
+                curr_board_filling_node.write(test_board)
 
             curr_row = int(curr_node.board_spot[0])
             curr_col = int(curr_node.board_spot[1])
@@ -216,6 +212,11 @@ class Tree_Node:
     def __str__(self):
         return self.value
 
+    def write(self, board):
+        row = int(self.board_spot[0])
+        col = int(self.board_spot[1])
+        board[row][col] = self.value
+
 start = time.time()
 solver = SudokuSolver(import_csv())
 for key, solution in solver.solved_board_strings.items():

How about this spot?

curr_row = int(curr_node.board_spot[0])
curr_col = int(curr_node.board_spot[1])
if self.box_index.value_valid(test_board, curr_row, curr_col):

Again, we can move the logic to the Node object. It would probably be just as clean to move it to the BoxIndex. This one alone doesn't clean up much, but every bit helps. We might need to do some re-balancing of these classes later.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index cd8a002..9290c2f 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -47,9 +47,7 @@ class SudokuSolver:
                 curr_board_filling_node = curr_board_filling_node.next_node
                 curr_board_filling_node.write(test_board)
 
-            curr_row = int(curr_node.board_spot[0])
-            curr_col = int(curr_node.board_spot[1])
-            if self.box_index.value_valid(test_board, curr_row, curr_col):
+            if self.box_index.is_value_valid(test_board, curr_node):
                 index += 1
                 if index >= MAX:
                     continue
@@ -118,6 +116,11 @@ class BoxIndex:
     def __init__(self):
         self.table = self.fill_box_index_table()
 
+    def is_value_valid(self, board, node):
+        row = int(node.board_spot[0])
+        col = int(node.board_spot[1])
+        return self.value_valid(board, row, col)
+
     def value_valid(self, board, row_index, column_index):
         row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
         column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']

OK, what about this spot?

                row = int(new_node.board_spot[0])
                col = int(new_node.board_spot[1])
                if test_board[row][col] != '0':
                    new_node.value = test_board[row][col]
                    new_node.possible_values = []

I have to be honest, I am not sure what to call this. If the test board already has a value placed in it, then the node that represents that square has no possible values assigned. It is reducing the possibilities because the current board space is marked as solved. And that last sentence gives us the name of the new function: check_solved(). Note that I forgot to commit to git after the last change, and so this diff is bigger... and note how that makes things harder to review. But not impossible: there are two chunks of code removed from the algorithm, and two functions added later.

index cd8a002..17926c3 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -47,20 +47,13 @@ class SudokuSolver:
                 curr_board_filling_node = curr_board_filling_node.next_node
                 curr_board_filling_node.write(test_board)
 
-            curr_row = int(curr_node.board_spot[0])
-            curr_col = int(curr_node.board_spot[1])
-            if self.box_index.value_valid(test_board, curr_row, curr_col):
+            if self.box_index.is_value_valid(test_board, curr_node):
                 index += 1
                 if index >= MAX:
                     continue
                 new_node = Tree_Node(
                     index, self.board_index.table[index], curr_node)
-
-                row = int(new_node.board_spot[0])
-                col = int(new_node.board_spot[1])
-                if test_board[row][col] != '0':
-                    new_node.value = test_board[row][col]
-                    new_node.possible_values = []
+                new_node.check_solved(test_board)
                 curr_node.next_node = new_node
                 curr_node = new_node
             else:
@@ -118,6 +111,11 @@ class BoxIndex:
     def __init__(self):
         self.table = self.fill_box_index_table()
 
+    def is_value_valid(self, board, node):
+        row = int(node.board_spot[0])
+        col = int(node.board_spot[1])
+        return self.value_valid(board, row, col)
+
     def value_valid(self, board, row_index, column_index):
         row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
         column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
@@ -217,6 +215,14 @@ class Tree_Node:
         col = int(self.board_spot[1])
         board[row][col] = self.value
 
+    def check_solved(self, board):
+        row = int(self.board_spot[0])
+        col = int(self.board_spot[1])
+        if board[row][col] != '0':
+            self.value = board[row][col]
+            self.possible_values = []
+
+
 start = time.time()
 solver = SudokuSolver(import_csv())
 for key, solution in solver.solved_board_strings.items():

The code is written with linked list modifications interspersed with the logic of solving the problem. The steps we have taken so far mitigate some of the problem. However, a proper iterator solution would extract that logic from this function altogether. The node object really is not the iterator. The node object should be the iterated object.

Here is another minor refactoring that removes duplicate code. It also clarifies the intention.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 17926c3..459619b 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -58,13 +58,12 @@ class SudokuSolver:
                 curr_node = new_node
             else:
                 if len(curr_node.possible_values) == 0:
+                    # backtrack
                     while len(curr_node.possible_values) == 0:
                         curr_node = curr_node.last_node
                         curr_node.next_node = None
                         index -= 1
-                    curr_node.next()
-                else:
-                    curr_node.next()
+                curr_node.next()
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):

We could statically allocate all of the nodes we would ever need at the beginning (MAX) and iterate through a list. This would remove the need for the index variable as well. It also seems that the index member variable on the node is not used. Let's remove it.

index 17926c3..3778782 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -37,7 +37,7 @@ class SudokuSolver:
 
     def tree_to_solution_string(self, original_board):
         index = 0
-        head_node = Tree_Node(index, self.board_index.table[index])
+        head_node = Tree_Node(self.board_index.table[index])
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -52,19 +52,18 @@ class SudokuSolver:
                 if index >= MAX:
                     continue
                 new_node = Tree_Node(
-                    index, self.board_index.table[index], curr_node)
+                    self.board_index.table[index], curr_node)
                 new_node.check_solved(test_board)
                 curr_node.next_node = new_node
                 curr_node = new_node
             else:
                 if len(curr_node.possible_values) == 0:
+                    # backtrack
                     while len(curr_node.possible_values) == 0:
                         curr_node = curr_node.last_node
                         curr_node.next_node = None
                         index -= 1
-                    curr_node.next()
-                else:
-                    curr_node.next()
+                curr_node.next()
         return self.build_solution_string(head_node)
 
     def build_solution_string(self, head_node):
@@ -196,9 +195,8 @@ class BoardIndexTable:
 
 
 class Tree_Node:
-    def __init__(self, index, board_spot, last_node=None, next_node=None):
+    def __init__(self, board_spot, last_node=None, next_node=None):
         self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']
-        self.index = index
         self.board_spot = board_spot
         self.next_node = next_node
         self.last_node = last_node

Let's take a closer look at our board_index. This is only used in the creation of new nodes. If we could pre-create all our nodes, we would not need this table. Take a mental note that we can inline that later on.

When we remove a node from the list, what are we doing? We are indicating "this is the furthest we've progressed so far," and we ensure that when we progress past that point again, the nodes have been re-initialized. Something feels wasteful about removing these nodes. When we backtrack, we could set each node back to its fresh state and reuse it. If only we didn't use None as the way to indicate how far to roll forward; we'll be getting rid of that in the future.
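The reuse idea could look something like this hypothetical sketch: preallocate all MAX nodes up front, reset instead of free, and track the frontier with an index rather than a None next-pointer. None of these names exist in the solver yet; this is just the direction being hinted at.

```python
MAX = 81  # assumption: one node per board cell, as in the solver


class CellState:
    """Preallocated per-cell state; reset() restores defaults instead of freeing."""

    def __init__(self, index):
        self.index = index
        self.reset()

    def reset(self):
        self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        self.value = '0'


# Allocate every node once, up front.
nodes = [CellState(i) for i in range(MAX)]

# Backtracking then becomes: reset the node we are leaving, step the frontier back.
frontier = 5
nodes[frontier].reset()
frontier -= 1
```

With this layout there is nothing to deallocate, and "how far we have progressed" is an integer rather than the shape of the list.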


Long Refactoring: Extract Method

Posted by Adam Young on October 29, 2020 12:43 AM

This refactoring is my bread and butter. Functions tend to grow. Eventually, you need to split them. Find a section of the method that has its own self-contained reason for existence, and make that its own function.

I have in the back of my head that I want to extract a class that abstracts the boards from this code. I’ve been resisting the urge thus far, as keeping the board as a 2D array of cells is really conducive to sharing with other implementations of this code. However, the following refactoring should work to support either direction: pull out the code that constructs a board from the string. This has the added benefit of making it possible to write a unit test that calls tree_to_solution_string without having to parse all of the strings.

I do this refactoring in two steps. In the first step, I create an inner function and move the code into it, calling the function from the original location.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 7c4004e..6a8cc15 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -86,15 +86,19 @@ class SudokuSolver:
         return return_string
 
     def strings_to_board_dict(self, board_strings):
-        return_dict = {}
-        for index, board in enumerate(board_strings):
+
+        def build_board(board):
             rows = re.findall(r"\d{9}", board)
             board_list = []
             for row in rows:
                 row_list = []
                 row_list[:0] = row
                 board_list.append(row_list)
-            return_dict[str(index)] = board_list
+            return board_list
+
+        return_dict = {}
+        for index, board in enumerate(board_strings):
+            return_dict[str(index)] = build_board(board)
         return return_dict
 
     def print_board(self, board):

Run the tests. Commit to git. Again, this is making the code as reviewable as possible. Now promote the function to top level. We remove the self parameter, as it is not used.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 6a8cc15..316df52 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -87,15 +87,6 @@ class SudokuSolver:
 
     def strings_to_board_dict(self, board_strings):
 
-        def build_board(board):
-            rows = re.findall(r"\d{9}", board)
-            board_list = []
-            for row in rows:
-                row_list = []
-                row_list[:0] = row
-                board_list.append(row_list)
-            return board_list
-
         return_dict = {}
         for index, board in enumerate(board_strings):
             return_dict[str(index)] = build_board(board)
@@ -114,6 +105,16 @@ class SudokuSolver:
                 print('-' * 21)
 
 
+def build_board(board):
+    rows = re.findall(r"\d{9}", board)
+    board_list = []
+    for row in rows:
+        row_list = []
+        row_list[:0] = row
+        board_list.append(row_list)
+    return board_list
+
+

With this code promoted, we can now opt to promote the print_board method to a top-level function as well. Note that this is an untested function; you should run the code from the command line and visually inspect the output after this change.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 316df52..9dc5fe6 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -92,17 +92,18 @@ class SudokuSolver:
             return_dict[str(index)] = build_board(board)
         return return_dict
 
-    def print_board(self, board):
-        for index1, row in enumerate(board):
-            if index1 == 0 or index1 == 3 or index1 == 6:
-                print('-' * 21)
-            for index, char in enumerate(row):
-                print(char, '', end='')
-                if index == 2 or index == 5:
-                    print('| ', end='')
-            print('')
-            if index1 == 8:
-                print('-' * 21)
+
+def print_board(board):
+    for index1, row in enumerate(board):
+        if index1 == 0 or index1 == 3 or index1 == 6:
+            print('-' * 21)
+        for index, char in enumerate(row):
+            print(char, '', end='')
+            if index == 2 or index == 5:
+                print('| ', end='')
+        print('')
+        if index1 == 8:
+            print('-' * 21)
 
 
 def build_board(board):
@@ -217,7 +218,7 @@ start = time.time()
 solver = SudokuSolver(import_csv())
 for key, solution in solver.solved_board_strings.items():
     print(f"Board: {key}")
-    solver.print_board(solver.strings_to_board_dict([solution])['0'])
+    print_board(solver.strings_to_board_dict([solution])['0'])
 
 end = time.time()
 print(f"Finished in {end - start} seconds")

That is all the prep work. Next we have to tackle the algorithm at the heart of this code. While the bulk of that work is going to happen later, it so happens that the start of it is pulling out the tail end of the code into its own method, and that seems like an extension of what we have been doing here. Here’s the change:

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 9dc5fe6..dc5a96b 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -76,7 +76,9 @@ class SudokuSolver:
                     curr_node.next()
                 else:
                     curr_node.next()
+        return self.build_solution_string(head_node)
 
+    def build_solution_string(self, head_node):
         return_string = ''
         curr_node = head_node
         return_string += str(curr_node.value)

It may happen that we want to move the logic for building this string into another object. If so, this is a pre-req. However, at a minimum, it documents the meaning of this section of code.
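In isolation, the extracted method is a plain linked-list traversal. Sketched with a stripped-down node class (not the solver's Tree_Node), it amounts to:

```python
class Node:
    """Stripped-down stand-in for Tree_Node: just a value and a next pointer."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


def build_solution_string(head_node):
    # Walk from head to tail, concatenating each node's value.
    return_string = ''
    curr_node = head_node
    while curr_node is not None:
        return_string += str(curr_node.value)
        curr_node = curr_node.next_node
    return return_string


chain = Node('1', Node('2', Node('3')))
assert build_solution_string(chain) == '123'
```

Seeing it this way makes it obvious the logic has nothing to do with solving; it only reads the finished list, which is why it extracts so cleanly.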

Long Refactoring: Extract Helpers

Posted by Adam Young on October 28, 2020 11:20 PM

As the refactoring process continues, we will continue to decompose the large central class and functions. Right now, the SudokuSolver class is performing two functions. It is holding and managing the list of the puzzles, and it is solving them.

The heart of this code is the function tree_to_solution_string. Right now, I can't call that by itself, as the SudokuSolver creates a bunch of helper objects before running through the whole set of tests. How can we tease this apart?

We have two small helper objects in this code: the box index table and the board index table. Let's split them off the Solver, so we can move them around and reuse them as needed. Start with the BoardIndexTable:

class BoardIndexTable:
    def __init__(self):
        pass

Add this code below SudokuSolver. Why? Because the functions that will become the methods of these classes are down there, and we want to make the code as easy to review as possible. We want to avoid moving code around too far. The change has been trivial, but run the tests just to be sure.

OK, time to make a real change. We are going to replace the variable in the SudokuSolver with an instance of our new class. However, in order to do that, we need to initialize the instance, which means calling the function fill_board_index_table. If we make that a non-member function, it gets de-indented, and the code review gets tougher. We are going to keep it as a member function, but of our new class. We can only get away with this because it is the last function in the file. On the other hand, that is why we chose it.

We need a member variable to hold the table. I call it table. I also changed the places that reference the table to look like this:

-                    index, self.board_index_table[index], curr_node)
+                    index, self.board_index.table[index], curr_node)

Pretty subtle. Some diff tools do a better job with minor changes on the line, but notice that the underscore before table became a dot. It keeps the lines the same length, and keeps the naming consistent. Here is the whole change.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index f382971..0305598 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -29,7 +29,7 @@ class SudokuSolver:
         self.board_strings = board_strings
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
         self.box_index_table = self.fill_box_index_table()
-        self.board_index_table = self.fill_board_index_table()
+        self.board_index = BoardIndexTable()
         self.solved_board_strings = dict()
         for key, value in self.boards_dict.items():
             return_string = self.tree_to_solution_string(value)
@@ -37,7 +37,7 @@ class SudokuSolver:
 
     def tree_to_solution_string(self, original_board):
         index = 0
-        head_node = Tree_Node(index, self.board_index_table[index])
+        head_node = Tree_Node(index, self.board_index.table[index])
         curr_node = head_node
         while index < MAX:
             curr_board_filling_node = head_node
@@ -58,7 +58,7 @@ class SudokuSolver:
                 if index >= MAX:
                     continue
                 new_node = Tree_Node(
-                    index, self.board_index_table[index], curr_node)
+                    index, self.board_index.table[index], curr_node)
 
                 row = int(new_node.board_spot[0])
                 col = int(new_node.board_spot[1])
@@ -174,6 +174,11 @@ class SudokuSolver:
             box_center[1] -= DIM
         return boxes
 
+
+class BoardIndexTable:
+    def __init__(self):
+        self.table = self.fill_board_index_table()
+
     def fill_board_index_table(self):
         return_list = []
         for row in range(DIM):

Commit to git. That keeps the change as a standalone, reviewable chunk of code. A comparable change can be made with the BoxIndex. Note that we get lucky: value_valid refers only to the box index table; any other value it requires is passed in as a parameter. Let's make it a method of our new class, too.

--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -28,7 +28,7 @@ class SudokuSolver:
     def __init__(self, board_strings):
         self.board_strings = board_strings
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
-        self.box_index_table = self.fill_box_index_table()
+        self.box_index = BoxIndex()
         self.board_index = BoardIndexTable()
         self.solved_board_strings = dict()
         for key, value in self.boards_dict.items():
@@ -53,7 +53,7 @@ class SudokuSolver:
 
             curr_row = int(curr_node.board_spot[0])
             curr_col = int(curr_node.board_spot[1])
-            if self.value_valid(test_board, curr_row, curr_col):
+            if self.box_index.value_valid(test_board, curr_row, curr_col):
                 index += 1
                 if index >= MAX:
                     continue
@@ -109,6 +109,11 @@ class SudokuSolver:
             if index1 == 8:
                 print('-' * 21)
 
+
+class BoxIndex:
+    def __init__(self):
+        self.table = self.fill_box_index_table()
+
     def value_valid(self, board, row_index, column_index):
         row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
         column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
@@ -130,8 +135,9 @@ class SudokuSolver:
             else:
                 return False
 
-        box_indexes = self.box_index_table[
-            self.find_box_of_index(str(row_index) + str(column_index))]
+        box_indexes = self.table[
+            self.find_box_of_index(
+                str(row_index) + str(column_index))]
         for index in box_indexes:
             row = int(index[0])
             column = int(index[1])
@@ -147,8 +153,8 @@ class SudokuSolver:
 
     def find_box_of_index(self, index):
         box = 'box not found'
-        for each_box in self.box_index_table:
-            if index in self.box_index_table[each_box]:
+        for each_box in self.table:
+            if index in self.table[each_box]:
                 box = each_box
                 break
         return box

Commit to git.

Each of these changes is a minimal, stand-alone change. Each could be reviewed and merged into a larger project by someone not very familiar with the code base. This is how we make progress.

Long Refactoring: pep8

Posted by Adam Young on October 28, 2020 10:45 PM

While writing the last article, I noticed that I had made a bunch of little changes to the files without explicitly meaning to. I didn’t include them in the diffs. However, on my screen, I have a bunch of changes that appear to be…not changes:

                     current_node.next()
                 else:
                     current_node.next()
-            
+
         return_string = ''

This is due to me running whitespace cleanup on the code base: I noticed that I had introduced some whitespace while editing the file, and it shows up in the diffs. Cleaning those up has become an ingrained habit.

Since I fixed that, I am also going to fix the rest of the pep8 errors reported by the flake8 tool.

After running the emacs command whitespace-cleanup, a bunch of the tox errors go away. Here is the tail end of what I am left with:

./treesudoku/tree_sudoku.py:154:80: E501 line too long (84 > 79 characters)
                    box_list.append(str(box_center[0] + i) + str(box_center[1] + 1))
                                                                               ^
./treesudoku/tree_sudoku.py:169:1: E302 expected 2 blank lines, found 1
class Tree_Node:
^

When fixing these errors, I start at the bottom of the file and work my way to the top. Why? So the line numbers stay consistent.

pep8 is pedantic about spacing. Non-class functions should be exactly two lines apart; methods of a class should be one line apart. The tool tells you where this is not met. Start by adding an additional newline on line 169, making sure you don't introduce trailing whitespace.
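For reference, here is a toy sketch (not project code) of the blank-line spacing flake8 expects:

```python
# Top-level definitions are separated by two blank lines (E302);
# methods within a class are separated by one blank line (E301).
def standalone_function():
    return 1


def another_function():
    return 2


class Example:
    def method_a(self):
        return "a"

    def method_b(self):
        return "b"
```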

Long lines are a pain. Making long lines shorter is a different kind of pain. There are several techniques. For python, the most important is to know that you can wrap at open parenthesis, brackets, and braces. Change line 154 to two lines like this.

-                    box_list.append(str(box_center[0] + i) + str(box_center[1] + 1))
+                    box_list.append(
+                        str(box_center[0] + i) + str(box_center[1] + 1))

Go ahead and do that for the two lines above it as well…they are all guilty. Make a similar change at line 122:

-        box_indexes = self.box_index_table[self.find_box_of_index(str(row_index) + str(column_index))]
+        box_indexes = self.box_index_table[
+            self.find_box_of_index(str(row_index) + str(column_index))]

pep8 is also strict about spacing around various keywords and tokens. For assignments on stand-alone lines it wants spaces around an equals sign, but for keyword arguments in function calls and signatures it does not. Change line 96 like this:

-                    print('| ', end = '')
+                    print('| ', end='')

Now we come to this pair of lines that are too long:

            if test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])] != '0':
                    new_node.value = test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])]

The code formatter does not do us much good here. The easiest way to shorten these lines is to introduce a shorter variable for the values inside the brackets. Since this is a code change, we’ll want to run the unit tests before and after.

I am going to name these variables row and col, short for column.

-                if test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])] != '0':
-                    new_node.value = test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])]
+
+                row = int(new_node.board_spot[0])
+                col = int(new_node.board_spot[1])
+                if test_board[row][col] != '0':
+                    new_node.value = test_board[row][col]

The next few changes are long lines, and we can apply one of the previous techniques to each. But be careful with the change on line 51. The section above it is indented (it is in a while loop), and we do not want to indent our new variables into the loop: if the while loop were never executed, the variables would be undefined.
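A toy illustration of that hazard (not project code): a name bound only inside a loop body is undefined when the loop never runs.

```python
items = []          # an empty sequence, so the loop body never executes
for item in items:
    row = item      # 'row' is only ever bound here

try:
    print(row)
except NameError:
    print("row was never defined")
```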

I also note that the variables can start getting confusing. I am going to name this pair curr_row and curr_col, since they come from int(current_node.board_spot[0]) and int(current_node.board_spot[1]).

-            if self.value_valid(test_board, int(current_node.board_spot[0]), int(current_node.board_spot[1])):
+
+            curr_row = int(current_node.board_spot[0])
+            curr_col = int(current_node.board_spot[1])
+            if self.value_valid(test_board, curr_row, curr_col):

For the long line 80, I use the short names row and col to avoid spilling over even after extracting; long variable names and deep indentation can make this task bothersome. Line 47 is too long, and none of our previous tricks will work, so I am going to change the string current_ to curr_ in this whole function. That shortens the line by six characters, just enough.

Line 45 is another extract variable shortening.

The final change for this file is the comment block at the top. Insert a space after the hash mark on each line:

@@ -1,8 +1,8 @@
-#turn each csv row into a board
-#find what values can go into what spot
-#create a tree trying to put in each value
-#if value can not be possible, end stem, go back up the tree
-#retrun the branch when tree is MAX=81 layers deep, the board is filled
+# turn each csv row into a board
+# find what values can go into what spot
+# create a tree trying to put in each value
+# if value can not be possible, end stem, go back up the tree
+# return the branch when tree is MAX=81 layers deep, the board is filled

That leaves the code in the test_tree_soduku file. We'll ignore the typo in the name for now and just shorten the strings. Using the fact that strings can be added together and grouped within parentheses, we get a style that works like this:

 puzzles = {
-    "0": "483921657967345821251876493548132976729564138136798245372689514814253769695417382",
-    "1": "245981376169273584837564219976125438513498627482736951391657842728349165654812793",
-    "2": "462831957795426183381795426173984265659312748248567319926178534834259671517643892"
+    "0": ("483921657" +
+          "967345821" +
+          "251876493" +
+          "548132976" +
+          "729564138" +
+          "136798245" +
+          "372689514" +
+          "814253769" +
+          "695417382"),
+    "1": ("245981376" +
+          "169273584" +
+          "837564219" +
+          "976125438" +
+          "513498627" +
+          "482736951" +
+          "391657842" +
+          "728349165" +
+          "654812793"),
+    "2": ("462831957" +
+          "795426183" +
+          "381795426" +
+          "173984265" +
+          "659312748" +
+          "248567319" +
+          "926178534" +
+          "834259671" +
+          "517643892")
 }

Note that this makes the puzzles readable to the human eye, as they are in the standard 9 X 9 layout.
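Incidentally, adjacent string literals inside parentheses concatenate implicitly in Python, so the + signs are optional. Either way, a quick check (using puzzle 0 from above) confirms the reformatting loses nothing:

```python
# Adjacent string literals are joined at compile time; no '+' needed.
puzzle_0 = (
    "483921657"
    "967345821"
    "251876493"
    "548132976"
    "729564138"
    "136798245"
    "372689514"
    "814253769"
    "695417382"
)
assert len(puzzle_0) == 81  # 9 rows of 9 digits each
```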

We have one error left:

./treesudoku/test_tree_soduku.py:1:1: F401 'subprocess' imported but unused
import subprocess

This is exactly the kind of thing we want the flake8 check for. We now know we can remove this import.

@@ -1,11 +1,33 @@
-import subprocess
-

and now tox reports

===================================================================== 1 passed in 13.43s =====================================================================
__________________________________________________________________________ summary ___________________________________________________________________________
  pep8: commands succeeded
  py37: commands succeeded
  congratulations 🙂

Commit to git and take a break.

Long Refactoring: Extract Read

Posted by Adam Young on October 28, 2020 10:40 PM

The typical approach to data handling is:

  1. read
  2. munge
  3. write

We want our code to reflect that structure. In doing so, we make it much easier to adapt the code to read from different sources and write to different destinations.
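A minimal sketch of that shape (the function names and file handling here are illustrative, not from the project):

```python
def read(path):
    # read: pull raw lines from a source
    with open(path) as f:
        return f.read().splitlines()


def munge(lines):
    # munge: transform the data; knows nothing about where it came from
    return [line.upper() for line in lines]


def write(lines, out):
    # write: deliver the result; knows nothing about how it was produced
    for line in lines:
        out.write(line + "\n")
```

Because each stage talks to its neighbors only through plain data, swapping the CSV file for some other source means replacing only the read step.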

In our last article, we split the display code off from the data munging. Let's do the same for the code that reads the puzzles in from a file.

The initialization function starts like this:

class SudokuSolver:
    def __init__(self):
        self.board_strings = self.import_csv()

The import_csv function is self-contained and does not need to be a method of SudokuSolver. Let's move it outside the class.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 400450c..555ff27 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -14,9 +14,20 @@ BASIS = 3
 DIM = BASIS * BASIS
 MAX = DIM * DIM
 
+
+def import_csv():
+    list_of_boards = []
+    with open('sample_sudoku_board_inputs.csv', 'r') as file:
+        reader = csv.reader(file)
+        for row in reader:
+            list_of_boards.append(str(*row))
+    return list_of_boards
+    
+
+
 class SudokuSolver:
     def __init__(self):
-        self.board_strings = self.import_csv()
+        self.board_strings = import_csv()
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
         self.box_index_table = self.fill_box_index_table()
         self.board_index_table = self.fill_board_index_table()
@@ -64,14 +75,6 @@ class SudokuSolver:
             return_string += str(current_node.value)
         return return_string
 
-    def import_csv(self):
-        list_of_boards = []
-        with open('sample_sudoku_board_inputs.csv', 'r') as file:
-            reader = csv.reader(file)
-            for row in reader:
-                list_of_boards.append(str(*row))
-        return list_of_boards
-    
     def strings_to_board_dict(self, board_strings):
         return_dict = {}
         for index, board in enumerate(board_strings):

We can now pass in the list of puzzles to the constructor. Note that to make this work we need to modify the test code as well. The constructor now looks like this:

 
class SudokuSolver:
-    def __init__(self):
-        self.board_strings = self.import_csv()
+    def __init__(self, board_strings):
+        self.board_strings = board_strings

And the calling code like this:

-solver = SudokuSolver()
+solver = SudokuSolver(import_csv())

The test code looks like this now:

--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -10,6 +10,6 @@ puzzles = {
 
 
 def test_sudoku_solver():
-    solver = tree_sudoku.SudokuSolver()
+    solver = tree_sudoku.SudokuSolver(tree_sudoku.import_csv())
     for key, solution in solver.solved_board_strings.items():
         assert solver.solved_board_strings[key] == puzzles[key]

Long Refactoring: First New Unit Test

Posted by Adam Young on October 28, 2020 10:37 PM

The heart of the code is the call that solves a single puzzle: the function tree_to_solution_string. We want to write a test that runs just this function. Getting there is not so easy.

This method is buried deep within the class SudokuSolver, and creating one of these essentially runs the entire code. In fact, we can redo our original test to run the code as a Python call rather than a subprocess call. Let's start with that.

In order to create a SudokuSolver, we need to import the code. Add the following line to the top of the test file:

from treesudoku import tree_sudoku

But when we run tox, we get an error:

______________________________________________________ ERROR collecting treesudoku/test_tree_soduku.py _______________________________________________________
ImportError while importing test module '/home/ayoung/Documents/CodePlatoon/tree_sudoku/treesudoku/test_tree_soduku.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
.tox/py37/lib64/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
treesudoku/test_tree_soduku.py:3: in <module>
    from treesudoku import tree_sudoku
E   ModuleNotFoundError: No module named 'treesudoku'

Create a blank file named __init__.py in the treesudoku directory.
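If the mechanics are unclear, here is a minimal reproduction in a throwaway directory (the paths and module contents are illustrative):

```shell
# An empty __init__.py is what marks a directory as an importable package.
mkdir -p /tmp/pkgdemo/treesudoku
echo "x = 1" > /tmp/pkgdemo/treesudoku/tree_sudoku.py
touch /tmp/pkgdemo/treesudoku/__init__.py
cd /tmp/pkgdemo
python3 -c "from treesudoku import tree_sudoku; print(tree_sudoku.x)"
```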

Here’s my first implementation of the test.

def test_sudoku_solver():
    solver = tree_sudoku.SudokuSolver()
    assert(False)

Note that the assert(False), aside from making the test fail, also produces all of the output from running the program. I want to start by getting rid of that output, and I can do that by pulling the display code out of the object. The output happens at the very end of tree_to_solution_string. Here is the first transformation:

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 8db59ce..60b9c24 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -23,7 +23,10 @@ class SudokuSolver:
         self.solved_board_strings = []
         for key, value in self.boards_dict.items():
             print(f"Board: {key}")
-            self.solved_board_strings.append([self.tree_to_solution_string(value)])
+            return_string = self.tree_to_solution_string(value)
+            self.solved_board_strings.append([return_string])
+            self.print_board(self.strings_to_board_dict([return_string])['0'])
+
 
     def tree_to_solution_string(self, original_board):
         index = 0
@@ -62,7 +65,6 @@ class SudokuSolver:
         while(current_node.next_node):
             current_node = current_node.next_node
             return_string += str(current_node.value)
-        self.print_board(self.strings_to_board_dict([return_string])['0'])
         return return_string
 
     def import_csv(self):

Aside from breaking up the long line, this code moves the print statement from the called function to the loop. However, it still prints the return value from the function instead of the value from the dictionary. What is the difference? Use the debugger to look:

self.print_board(self.strings_to_board_dict([return_string])['0'])
(Pdb) print (return_string)
483921657967345821251876493548132976729564138136798245372689514814253769695417382
(Pdb) print (value)
[['0', '0', '3', '0', '2', '0', '6', '0', '0'], ['9', '0', '0', '3', '0', '5', '0', '0', '1'], ['0', '0', '1', '8', '0', '6', '4', '0', '0'], ['0', '0', '8', '1', '0', '2', '9', '0', '0'], ['7', '0', '0', '0', '0', '0', '0', '0', '8'], ['0', '0', '6', '7', '0', '8', '2', '0', '0'], ['0', '0', '2', '6', '0', '9', '5', '0', '0'], ['8', '0', '0', '2', '0', '3', '0', '0', '9'], ['0', '0', '5', '0', '1', '0', '3', '0', '0']]

One is the solved string. The other is the original puzzle in array form. We don't yet have a set of solutions. This seems to call for a dictionary much like the original one. Let's create that here, and use it for the display. The member variable self.solved_board_strings = [] is not used outside of this function, so we can convert it to a dictionary.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index 8db59ce..a0e9c29 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -20,10 +20,13 @@ class SudokuSolver:
         self.boards_dict = self.strings_to_board_dict(self.board_strings)
         self.box_index_table = self.fill_box_index_table()
         self.board_index_table = self.fill_board_index_table()
-        self.solved_board_strings = []
+        self.solved_board_strings = dict()
         for key, value in self.boards_dict.items():
+            return_string = self.tree_to_solution_string(value)
+            self.solved_board_strings[key] = return_string
+        for key, solution in self.solved_board_strings.items():
             print(f"Board: {key}")
-            self.solved_board_strings.append([self.tree_to_solution_string(value)])
+            self.print_board(self.strings_to_board_dict([solution])['0'])
 
     def tree_to_solution_string(self, original_board):
         index = 0
@@ -62,7 +65,6 @@ class SudokuSolver:
         while(current_node.next_node):
             current_node = current_node.next_node
             return_string += str(current_node.value)
-        self.print_board(self.strings_to_board_dict([return_string])['0'])
         return return_string
 
     def import_csv(self):

Note that we have introduced a minor inefficiency by going through the loop twice. Compared to the time it takes to solve the puzzle, this is irrelevant. We will do this a lot: add a small inefficiency to tease things apart. Once we have refactored, we can more than make up for it by optimizing the expensive parts of the separated code.

Commit to git before continuing.

Let's move the display logic out of this class.

diff --git a/treesudoku/tree_sudoku.py b/treesudoku/tree_sudoku.py
index a0e9c29..400450c 100644
--- a/treesudoku/tree_sudoku.py
+++ b/treesudoku/tree_sudoku.py
@@ -24,9 +24,6 @@ class SudokuSolver:
         for key, value in self.boards_dict.items():
             return_string = self.tree_to_solution_string(value)
             self.solved_board_strings[key] = return_string
-        for key, solution in self.solved_board_strings.items():
-            print(f"Board: {key}")
-            self.print_board(self.strings_to_board_dict([solution])['0'])
 
     def tree_to_solution_string(self, original_board):
         index = 0
@@ -183,6 +180,10 @@ class Tree_Node:
         return self.value
 
 start = time.time()
-x = SudokuSolver()
+solver = SudokuSolver()
+for key, solution in solver.solved_board_strings.items():
+    print(f"Board: {key}")
+    solver.print_board(solver.strings_to_board_dict([solution])['0'])
+
 end = time.time()
 print(f"Finished in {end - start} seconds")

Now we can redo our unit test like this:

diff --git a/treesudoku/test_tree_soduku.py b/treesudoku/test_tree_soduku.py
index 67b2287..9969c94 100644
--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -2,30 +2,14 @@ import subprocess
 
 from treesudoku import tree_sudoku
 
-check_data ="""
-Board:0483921657967345821251876493548132976729564138136798245372689514814253769695417382
-Board:1245981376169273584837564219976125438513498627482736951391657842728349165654812793
-Board:2462831957795426183381795426173984265659312748248567319926178534834259671517643892"""
-
-def test_end_to_end():
-    print ("Running Test")
-    p = subprocess.run(["python3", "treesudoku/tree_sudoku.py"], capture_output=True)
-    output = p.stdout.decode("utf-8").split("\n")
-    output = "".join(output[:-2])
-    output = output.replace("-","").replace("|","")
-    output = output.replace(" ","").replace("\n","")
-    output = output.replace("Board","\nBoard")
-    print("comparing output ")
-    assert(len(check_data) == len(output))
-    assert(check_data == output)
-    print("OK")
-
+puzzles = {
+    "0": "483921657967345821251876493548132976729564138136798245372689514814253769695417382",
+    "1": "245981376169273584837564219976125438513498627482736951391657842728349165654812793",
+    "2": "462831957795426183381795426173984265659312748248567319926178534834259671517643892"
+}
 
 
 def test_sudoku_solver():
     solver = tree_sudoku.SudokuSolver()
-    assert(False)
-
-test_end_to_end()
-
-
+    for key, solution in solver.solved_board_strings.items():
+        assert solver.solved_board_strings[key] == puzzles[key]

Of course, I did this in two stages: first I got the test to run, then I removed the old test. I can no longer run the tests directly from the CLI, only via tox.

Long Refactoring: Call a Unit Test

Posted by Adam Young on October 28, 2020 10:31 PM

Our test takes 5 seconds to run. That is not horrible, but it does not bode well for when we write more tests. We don't want that number to explode, and we want to be able to write more focused tests. I'm going to do some project setup that will enable more development, including fine-grained unit tests.

I’m going to start by using tox.

I’m going to grab a simplistic tox file I have from previous projects.

# tox (https://tox.readthedocs.io/) is a tool for running tests
# in multiple virtualenvs. This configuration file will run the
# test suite on all supported python versions. To use it, "pip install tox"
# and then run "tox" from this directory.

[tox]
envlist = pep8,py37

skipsdist = True


[testenv:pep8]
commands =
  flake8 --ignore=D100,D101,D102,D103,D104,E305,E402,W503,W504,W605

[flake8]
filename= *.py
show-source = true
enable-extensions = H203,H904
ignore = D100,D101,D102,D103,D104,D203,E402,W503,W504

exclude=.venv,.git,.tox,build,dist,*lib/python*,*egg,tools,vendor,.update-venv,*.ini,*.po,*.pot
max-complexity=24

[testenv]
deps = -rrequirements.txt

commands =
    pytest

I created a simple requirements.txt file that just has flake8 and the test runner in it for now. Ideally, this would live in a testing-only requirements file, but simplicity wins for now.

flake8
pytest>=3.0.7

I create a subdirectory called treesudoku, with no underscores or hyphens for simplicity, and move all of the code there. Let's run tox. There is a slew of output, and I am not going to look at it all now. For starters, the pep8 checks run by flake8 produce far more output than I want to deal with yet.

treesudoku/test_tree_soduku.py:14: in <module>
    assert(len(check_data) == len(output))
E   AssertionError: assert 267 == 0
E    +  where 267 = len('\nBoard:0483921657967345821251876493548132976729564138136798245372689514814253769695417382\nBoard:1245981376169273584...1391657842728349165654812793\nBoard:2462831957795426183381795426173984265659312748248567319926178534834259671517643892')
E    +  and   0 = len('')

I am more concerned with the failing py37 test. We can see that we are not actually running our program. Maybe because it is in a different subdirectory? Maybe. But instead of guessing, let's step through the code.

Edit the file to introduce the debugger, and step through until we can reproduce the error:

(Pdb) print(p.stderr)
b'Traceback (most recent call last):\n  File "treesudoku/tree_sudoku.py", line 184, in <module>\n    x = SudokuSolver()\n  File "treesudoku/tree_sudoku.py", line 19, in __init__\n    self.board_strings = self.import_csv()\n  File "treesudoku/tree_sudoku.py", line 70, in import_csv\n    with open(\'sample_sudoku_board_inputs.csv\', \'r\') as file:\nFileNotFoundError: [Errno 2] No such file or directory: \'sample_sudoku_board_inputs.csv\'\n'

I moved the test data file… I move it back. The test runs, but still produces an error line in the output. However, I can still run the test by hand:

python3 ./treesudoku/test_tree_soduku.py

Let's convert that test file into something the test framework can run. Commit to git before continuing.

Take the body of the test and move it into a function with a name that starts with test_. The testing framework will now pick that up.

diff --git a/treesudoku/test_tree_soduku.py b/treesudoku/test_tree_soduku.py
index f4d8532..c36355c 100644
--- a/treesudoku/test_tree_soduku.py
+++ b/treesudoku/test_tree_soduku.py
@@ -3,16 +3,20 @@ check_data ="""
 Board:0483921657967345821251876493548132976729564138136798245372689514814253769695417382
 Board:1245981376169273584837564219976125438513498627482736951391657842728349165654812793
 Board:2462831957795426183381795426173984265659312748248567319926178534834259671517643892"""
-print ("Running Test")
-p = subprocess.run(["python3", "treesudoku/tree_sudoku.py"], capture_output=True)
-output = p.stdout.decode("utf-8").split("\n")
-output = "".join(output[:-2])
-output = output.replace("-","").replace("|","")
-output = output.replace(" ","").replace("\n","")
-output = output.replace("Board","\nBoard")
-print("comparing output ")
-assert(len(check_data) == len(output))
-assert(check_data == output)
-print("OK")
+
+def test_end_to_end():
+    print ("Running Test")
+    p = subprocess.run(["python3", "treesudoku/tree_sudoku.py"], capture_output=True)
+    output = p.stdout.decode("utf-8").split("\n")
+    output = "".join(output[:-2])
+    output = output.replace("-","").replace("|","")
+    output = output.replace(" ","").replace("\n","")
+    output = output.replace("Board","\nBoard")
+    print("comparing output ")
+    assert(len(check_data) == len(output))
+    assert(check_data == output)
+    print("OK")
+
+test_end_to_end()

Now I can run it the old way or via tox.

==================================================================== test session starts =====================================================================
platform linux -- Python 3.7.9, pytest-6.1.1, py-1.9.0, pluggy-0.13.1
cachedir: .tox/py37/.pytest_cache
rootdir: /home/ayoung/Documents/CodePlatoon/tree_sudoku
collected 1 item                                                                                                                                             

treesudoku/test_tree_soduku.py .                                                                                                                       [100%]

===================================================================== 1 passed in 13.53s =====================================================================
__________________________________________________________________________ summary ___________________________________________________________________________
ERROR:   pep8: commands failed
  py37: commands succeeded

Long Refactoring: Symbolic Constants

Posted by Adam Young on October 28, 2020 10:26 PM

180 lines of code is not horrible, but it is a lot to wrap your mind around all at once. To get oriented in the code, we're going to take on one of the simpler refactorings: replacing a bunch of the magic numbers with symbolic constants.

Sudoku puzzles are 9 squares by 9 squares. Thus, we should expect to see a lot of 9s in the code.

grep 9 tree_sudoku.py
            rows = re.findall(r"\d{9}", board)
        row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        square = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        for a_row in range(9):
            box_center[1] -= 9
        for row in range(9):
            for column in range(9):
        self.value = '9'

We're going to skip the places where the 9s appear in the character arrays for now, but those ranges and centers look like good targets. To start, let's run our test to make sure everything still works. OK. Now let's add our symbolic constant at the top of the file. I am going to use DIM, short for dimension. It is long enough that people should be able to figure it out, but short enough that it won't make the code less readable.

diff --git a/tree_sudoku.py b/tree_sudoku.py
index 21a6a34..f5abef7 100644
--- a/tree_sudoku.py
+++ b/tree_sudoku.py
@@ -9,6 +9,10 @@ import re
 import copy
 import time
 
+
+DIM = 9
+
+
 class SudokuSolver:
     def __init__(self):
         self.board_strings = self.import_csv()

It might be tempting to search and replace automatically. Don’t do that. Part of the reason we are doing this one first is to get familiar with the code. You can use the search function, but make the changes by hand.

diff --git a/tree_sudoku.py b/tree_sudoku.py
index 21a6a34..f456817 100644
--- a/tree_sudoku.py
+++ b/tree_sudoku.py
@@ -9,6 +9,10 @@ import re
 import copy
 import time
 
+
+DIM = 9
+
+
 class SudokuSolver:
     def __init__(self):
         self.board_strings = self.import_csv()
@@ -104,7 +108,7 @@ class SudokuSolver:
             else:
                 return False
 
-        for a_row in range(9):
+        for a_row in range(DIM):
             number = board[a_row][column_index]
             if number == '0':
                 continue
@@ -150,13 +154,13 @@ class SudokuSolver:
                 box_number += 1
                 box_center[1] += 3
             box_center[0] += 3
-            box_center[1] -= 9
+            box_center[1] -= DIM
         return boxes
     
     def fill_board_index_table(self):
         return_list = []
-        for row in range(9):
-            for column in range(9):
+        for row in range(DIM):
+            for column in range(DIM):
                 return_list.append(str(row) + str(column))
         return return_list

Make that change and run the tests. Everything should still run.

Notice those 3s on the lines above box_center[1] -= DIM? Those are also magic numbers, and we know that there is a relationship between them and the one we just created. Sudoku puzzles have blocks of 3 X 3 cells. If you scale the puzzle down to 4 numbers instead of 9, it has four blocks that are 2 X 2 in size. If you scale the puzzle up to 16 X 16, you get sixteen blocks that are 4 X 4.

Let's call the smaller number BASIS. While we could calculate BASIS as the square root of DIM, that would be a floating-point operation, and we'd have float-to-int conversions. It is much easier, and clearer, to calculate DIM from BASIS.
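A quick sanity check of that relationship across board sizes (toy code, not from the project):

```python
# DIM derives from BASIS: a DIM x DIM board has DIM blocks of BASIS x BASIS.
for BASIS in (2, 3, 4):
    DIM = BASIS * BASIS
    print(f"{DIM} x {DIM} board: {DIM} blocks of {BASIS} x {BASIS}")
```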

Before we make this change, however, we check in our code to git. We have working code, and we don’t want to have to redo that work if we break something.

$ git status
On branch master
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   tree_sudoku.py

no changes added to commit (use "git add" and/or "git commit -a")
[ayoung@ayoungP40 tree_sudoku]$ git add tree_sudoku.py 
[ayoung@ayoungP40 tree_sudoku]$ git commit -m "introduced DIM symbolic constant"
[master 653dac8] introduced DIM symbolic constant
 1 file changed, 9 insertions(+), 5 deletions(-)

The “print_board” function has a bunch of 3s in it, as well as numbers based on 3 for drawing the board. Since we don’t check that in our tests, we are not going to change this function. We only want to change things that we can test. The output will be a separate concern that we will get to once we have a better approach toward testing.

This is also the case with the “fill_box_index_table” code. This is too bad, as it is very tempting to change, but we can leave it for now. That leaves us with the trivial code change:

diff --git a/tree_sudoku.py b/tree_sudoku.py
index f456817..130e935 100644
--- a/tree_sudoku.py
+++ b/tree_sudoku.py
@@ -10,7 +10,8 @@ import copy
 import time
 
 
-DIM = 9
+BASIS = 3
+DIM = BASIS * BASIS
 
 
 class SudokuSolver:
@@ -143,8 +144,8 @@ class SudokuSolver:
         boxes = {}
         box_center = [1, 1]
         box_number = 0
-        for _row_of_boxes in range(3):
-            for _each_box in range(3):
+        for _row_of_boxes in range(BASIS):
+            for _each_box in range(BASIS):
                 box_list = []
                 for i in range(-1, 2):
                     box_list.append(str(box_center[0] + i) + str(box_center[1] - 1))
@@ -152,8 +153,8 @@ class SudokuSolver:
                     box_list.append(str(box_center[0] + i) + str(box_center[1] + 1))
                 boxes[box_number] = box_list
                 box_number += 1
-                box_center[1] += 3
-            box_center[0] += 3
+                box_center[1] += BASIS
+            box_center[0] += BASIS
             box_center[1] -= DIM
         return boxes

 

Note that the logic in this function makes use of the range from -1 to 2. This seems to be related to the BASIS: we could change the 2 to BASIS - 1, but there is more logic in this stretch that is hard-coded to assume a square of size 3. We'll leave those for now.

Run the tests and check in.

The last magic number we want to deal with is 81. This is the maximum number of squares in the puzzle, and we'll modify the code to use it as MAX.

diff --git a/tree_sudoku.py b/tree_sudoku.py
index 130e935..8db59ce 100644
--- a/tree_sudoku.py
+++ b/tree_sudoku.py
@@ -2,7 +2,7 @@
 #find what values can go into what spot
 #create a tree trying to put in each value
 #if value can not be possible, end stem, go back up the tree
-#retrun the branch when tree is 81 layers deep, the board is filled
+#retrun the branch when tree is MAX=81 layers deep, the board is filled
 
 import csv
 import re
@@ -12,7 +12,7 @@ import time
 
 BASIS = 3
 DIM = BASIS * BASIS
-
+MAX = DIM * DIM
 
 class SudokuSolver:
     def __init__(self):
@@ -29,7 +29,7 @@ class SudokuSolver:
         index = 0
         head_node = Tree_Node(index, self.board_index_table[index])
         current_node = head_node
-        while index < 81:
+        while index < MAX:
             current_board_filling_node = head_node
             test_board = copy.deepcopy(original_board)
             test_board[int(current_board_filling_node.board_spot[0])][int(current_board_filling_node.board_spot[1])] = current_board_filling_node.value
@@ -38,7 +38,7 @@ class SudokuSolver:
                 test_board[int(current_board_filling_node.board_spot[0])][int(current_board_filling_node.board_spot[1])] = current_board_filling_node.value
             if self.value_valid(test_board, int(current_node.board_spot[0]), int(current_node.board_spot[1])):
                 index += 1
-                if index >= 81:
+                if index >= MAX:
                     continue
                 new_node = Tree_Node(index, self.board_index_table[index], current_node)
                 if test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])] != '0':

Now that we have some simple work done, we should feel confident in our approach. Next we’ll do something a little more aggressive, and look into the overall code structure.

Long Refactoring: Testing

Posted by Adam Young on October 27, 2020 11:17 PM

The code runs. How do we make sure that continues to be the case? Write a test.

The output from the previous post shows that the code runs. The ideal functional test would be to snapshot this and compare with the snapshot. If it is identical, we haven’t borked up nuffin.

To start with, let’s create a new file called test_tree_soduku.py. Populate it with this old chestnut:

print ("Hello World")
assert(False)

Make sure we can run it.

$ python3 test_tree_soduku.py
Hello World
Traceback (most recent call last):
  File "test_tree_soduku.py", line 2, in <module>
    assert(False)
AssertionError

I always want to see output. I want to make sure I am not fooling myself. The assert(False) is what I put into every test before I run it, so I don't accidentally skip running a test.

Let’s make this a little more functional. For now, use the python module for running another program. https://docs.python.org/3/library/subprocess.html

import subprocess
print ("Running Test")
p = subprocess.run(["python3", "./tree_sudoku.py"], capture_output=True)
print(p.stdout)
assert(False)

Here is the output.

$ python3 test_tree_soduku.py
Running Test
b'Board: 0\n---------------------\n4 8 3 | 9 2 1 | 6 5 7 \n9 6 7 | 3 4 5 | 8 2 1 \n2 5 1 | 8 7 6 | 4 9 3 \n---------------------\n5 4 8 | 1 3 2 | 9 7 6 \n7 2 9 | 5 6 4 | 1 3 8 \n1 3 6 | 7 9 8 | 2 4 5 \n---------------------\n3 7 2 | 6 8 9 | 5 1 4 \n8 1 4 | 2 5 3 | 7 6 9 \n6 9 5 | 4 1 7 | 3 8 2 \n---------------------\nBoard: 1\n---------------------\n2 4 5 | 9 8 1 | 3 7 6 \n1 6 9 | 2 7 3 | 5 8 4 \n8 3 7 | 5 6 4 | 2 1 9 \n---------------------\n9 7 6 | 1 2 5 | 4 3 8 \n5 1 3 | 4 9 8 | 6 2 7 \n4 8 2 | 7 3 6 | 9 5 1 \n---------------------\n3 9 1 | 6 5 7 | 8 4 2 \n7 2 8 | 3 4 9 | 1 6 5 \n6 5 4 | 8 1 2 | 7 9 3 \n---------------------\nBoard: 2\n---------------------\n4 6 2 | 8 3 1 | 9 5 7 \n7 9 5 | 4 2 6 | 1 8 3 \n3 8 1 | 7 9 5 | 4 2 6 \n---------------------\n1 7 3 | 9 8 4 | 2 6 5 \n6 5 9 | 3 1 2 | 7 4 8 \n2 4 8 | 5 6 7 | 3 1 9 \n---------------------\n9 2 6 | 1 7 8 | 5 3 4 \n8 3 4 | 2 5 9 | 6 7 1 \n5 1 7 | 6 4 3 | 8 9 2 \n---------------------\nFinished in 4.880897283554077 seconds\n'
Traceback (most recent call last):
  File "test_tree_soduku.py", line 9, in <module>
    assert(False)
AssertionError

Now let’s convert this into an automated test. We’ll capture the output in a string and compare it. Note that the letter “b” at the start of the output tells us this is bytes, not a string. Converting the output to a string will give us more ability to morph it. Let’s start there.

import subprocess
print ("Running Test")
p = subprocess.run(["python3", "./tree_sudoku.py"], capture_output=True)

print(type(p.stdout))
output = p.stdout.decode("utf-8")
print(type(output))
#assert(False)
$ python3 test_tree_soduku.py
Running Test
<class 'bytes'>
<class 'str'>
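The two types involved here are bytes and str: subprocess hands us bytes, and decode() turns them into a string we can manipulate. A minimal illustration:

```python
# subprocess.run(...).stdout is bytes; decode() converts it to str.
raw = b"Board: 0\n"
assert isinstance(raw, bytes)
text = raw.decode("utf-8")
assert isinstance(text, str)
assert text == "Board: 0\n"
```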

Since we want to avoid changing the original code, we’ll instead strip out anything from the output we don’t want. Lets get rid of all of the formatting characters first. A little trial and error gives me this code:

import subprocess
print ("Running Test")
p = subprocess.run(["python3", "./tree_sudoku.py"], capture_output=True)

output = p.stdout.decode("utf-8").split("\n")
output = "".join(output[:-2])
output = output.replace("-","").replace("|","")
output = output.replace(" ","").replace("\n","")
output = output.replace("Board","\nBoard")
print(output)

#assert(False)
$ python3 test_tree_soduku.py
Running Test

Board:0483921657967345821251876493548132976729564138136798245372689514814253769695417382
Board:1245981376169273584837564219976125438513498627482736951391657842728349165654812793
Board:2462831957795426183381795426173984265659312748248567319926178534834259671517643892

I removed the last line of output (last two actually) as the time changes each time I run the program.

This output is a little horrible, in that it appends a number to the front of each solution: 0, 1, 2. However, this is not our end state, and it is good enough for our purposes. I could do regex matches to get rid of other stuff, but I don't want to spend more time than necessary here. I just want the output to be short enough to fit in code.
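If we did want to strip that index digit, a regex anchored to the start of the line would do it. A quick sketch (the line below is built from the first board's output above):

```python
import re

# The board index is the single digit right after "Board:"; the 81-digit
# solution follows immediately.
line = ("Board:0"
        "483921657967345821251876493548132976"
        "729564138136798245372689514814253769695417382")
solution = re.sub(r"^Board:\d", "", line)
assert len(solution) == 81
```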

Let’s codify that output into our test.

import subprocess
check_data ="""
Board:0483921657967345821251876493548132976729564138136798245372689514814253769695417382
Board:1245981376169273584837564219976125438513498627482736951391657842728349165654812793
Board:2462831957795426183381795426173984265659312748248567319926178534834259671517643892"""
print ("Running Test")
p = subprocess.run(["python3", "./tree_sudoku.py"], capture_output=True)
output = p.stdout.decode("utf-8").split("\n")
output = "".join(output[:-2])
output = output.replace("-","").replace("|","")
output = output.replace(" ","").replace("\n","")
output = output.replace("Board","\nBoard")
print("comparing output ")
assert(len(check_data) == len(output))
assert(check_data == output)
print("OK")

I originally had a problem when I put the final “”” on its own line, as it added an additional newline. I checked the length with:

 print("comparing output %d %d " % (len(check_data), len(output)))

Which is why that code is still in there.

OK, we have a test. Now we can start hacking on the original code.

A Long Refactoring: introduction

Posted by Adam Young on October 27, 2020 07:43 PM

Congratulations, you got your code to run! You are done! Ship it! Just don’t expect to be able to read it next month. You need to maintain it. You need to add new features. It is a mess.

Give yourself a well deserved break. Then come back to it.

I have a piece of code written for that very common job interview question: write a Sudoku solver. Jarrett, A Code Platoon student I am mentoring, wrote it for his project, and he got it to work. I think it is an ideal candidate for a refactoring story. Over the next few posts, I am going to work through how I would tackle this.

It works. At 180 lines, it is complex enough that the mind cannot hold on to the entire structure at once. Here is the code in its current form.

In order to run the code, you need some sample data. I’ll post that afterwards.

Here is tree_sudoku.py

#turn each csv row into a board
#find what values can go into what spot
#create a tree trying to put in each value
#if value can not be possible, end stem, go back up the tree
#retrun the branch when tree is 81 layers deep, the board is filled

import csv
import re
import copy
import time

class SudokuSolver:
    def __init__(self):
        self.board_strings = self.import_csv()
        self.boards_dict = self.strings_to_board_dict(self.board_strings)
        self.box_index_table = self.fill_box_index_table()
        self.board_index_table = self.fill_board_index_table()
        self.solved_board_strings = []
        for key, value in self.boards_dict.items():
            print(f"Board: {key}")
            self.solved_board_strings.append([self.tree_to_solution_string(value)])

    def tree_to_solution_string(self, original_board):
        index = 0
        head_node = Tree_Node(index, self.board_index_table[index])
        current_node = head_node
        while index < 81:
            current_board_filling_node = head_node
            test_board = copy.deepcopy(original_board)
            test_board[int(current_board_filling_node.board_spot[0])][int(current_board_filling_node.board_spot[1])] = current_board_filling_node.value
            while current_board_filling_node.next_node:
                current_board_filling_node = current_board_filling_node.next_node
                test_board[int(current_board_filling_node.board_spot[0])][int(current_board_filling_node.board_spot[1])] = current_board_filling_node.value
            if self.value_valid(test_board, int(current_node.board_spot[0]), int(current_node.board_spot[1])):
                index += 1
                if index >= 81:
                    continue
                new_node = Tree_Node(index, self.board_index_table[index], current_node)
                if test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])] != '0':
                    new_node.value = test_board[int(new_node.board_spot[0])][int(new_node.board_spot[1])]
                    new_node.possible_values = []
                current_node.next_node = new_node
                current_node = new_node
            else:
                if len(current_node.possible_values) == 0:
                    while len(current_node.possible_values) == 0:
                        current_node = current_node.last_node
                        current_node.next_node = None
                        index -= 1
                    current_node.next()
                else:
                    current_node.next()
            
        return_string = ''
        current_node = head_node
        return_string += str(current_node.value)
        while(current_node.next_node):
            current_node = current_node.next_node
            return_string += str(current_node.value)
        self.print_board(self.strings_to_board_dict([return_string])['0'])
        return return_string

    def import_csv(self):
        list_of_boards = []
        with open('sample_sudoku_board_inputs.csv', 'r') as file:
            reader = csv.reader(file)
            for row in reader:
                list_of_boards.append(str(*row))
        return list_of_boards
    
    def strings_to_board_dict(self, board_strings):
        return_dict = {}
        for index, board in enumerate(board_strings):
            rows = re.findall(r"\d{9}", board)
            board_list = []
            for row in rows:
                row_list = []
                row_list[:0] = row
                board_list.append(row_list)
            return_dict[str(index)] = board_list
        return return_dict
    
    def print_board(self, board):
        for index1, row in enumerate(board):
            if index1 == 0 or index1 == 3 or index1 == 6:
                print('-' * 21)
            for index, char in enumerate(row):
                print(char, '', end='')
                if index == 2 or index == 5:
                    print('| ', end = '')
            print('')
            if index1 == 8:
                print('-' * 21)
    
    def value_valid(self, board, row_index, column_index):
        row = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        column = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        square = ['1', '2', '3', '4', '5', '6', '7', '8', '9']
        for number in board[row_index]:
            if number == '0':
                continue
            if number in row:
                row.remove(number)
            else:
                return False

        for a_row in range(9):
            number = board[a_row][column_index]
            if number == '0':
                continue
            if number in column:
                column.remove(number)
            else:
                return False

        box_indexes = self.box_index_table[self.find_box_of_index(str(row_index) + str(column_index))]
        for index in box_indexes:
            row = int(index[0])
            column = int(index[1])
            number = board[row][column]
            if number == '0':
                continue
            if number in square:
                square.remove(number)
            else:
                return False
        
        return True
  
    def find_box_of_index(self, index):
        box = 'box not found'
        for each_box in self.box_index_table:
            if index in self.box_index_table[each_box]:
                box = each_box
                break
        return box

    def fill_box_index_table(self):
        boxes = {}
        box_center = [1, 1]
        box_number = 0
        for _row_of_boxes in range(3):
            for _each_box in range(3):
                box_list = []
                for i in range(-1, 2):
                    box_list.append(str(box_center[0] + i) + str(box_center[1] - 1))
                    box_list.append(str(box_center[0] + i) + str(box_center[1]))
                    box_list.append(str(box_center[0] + i) + str(box_center[1] + 1))
                boxes[box_number] = box_list
                box_number += 1
                box_center[1] += 3
            box_center[0] += 3
            box_center[1] -= 9
        return boxes
    
    def fill_board_index_table(self):
        return_list = []
        for row in range(9):
            for column in range(9):
                return_list.append(str(row) + str(column))
        return return_list

class Tree_Node:
    def __init__(self, index, board_spot, last_node=None, next_node=None):
        self.possible_values = ['1', '2', '3', '4', '5', '6', '7', '8']
        self.index = index
        self.board_spot = board_spot
        self.next_node = next_node
        self.last_node = last_node
        self.value = '9'

    def next(self):
        self.value = self.possible_values.pop()
    
    def __str__(self):
        return self.value

start = time.time()
x = SudokuSolver()
end = time.time()


Here is the sample data you can use to run it. I cut it down to just 3 puzzles. To convert to a sudoku puzzle, break it up into 9 lines of 9 characters each. A 0 indicates a blank spot on the puzzle.
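That conversion, 81 characters into 9 rows of 9, can be sketched in a couple of lines (using the first puzzle from the data below as input):

```python
# Split an 81-character puzzle string into 9 rows of 9 characters each.
# A '0' marks a blank spot on the board.
puzzle = ("003020600900305001001806400008102900"
          "700000008006708200002609500800203009005010300")
board = [list(puzzle[i:i + 9]) for i in range(0, len(puzzle), 9)]
assert len(board) == 9
assert all(len(row) == 9 for row in board)
assert board[0] == ['0', '0', '3', '0', '2', '0', '6', '0', '0']
```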

The full test file has 50 puzzles in it. It takes the code a while to run against all of them. I am going to cut it down to 3, but I want to hold on to the original data as well. I’ll do that using a symlink.

Here is sample_sudoku_board_inputs.csv.short. I can then symlink sample_sudoku_board_inputs.csv to this file.

003020600900305001001806400008102900700000008006708200002609500800203009005010300
200080300060070084030500209000105408000000000402706000301007040720040060004010003
000000907000420180000705026100904000050000040000507009920108000034059000507000000
ln -s sample_sudoku_board_inputs.csv.short sample_sudoku_board_inputs.csv

Let’s make sure we can run it:

$ python3 tree_sudoku.py
Board: 0
---------------------
4 8 3 | 9 2 1 | 6 5 7 
9 6 7 | 3 4 5 | 8 2 1 
2 5 1 | 8 7 6 | 4 9 3 
---------------------
5 4 8 | 1 3 2 | 9 7 6 
7 2 9 | 5 6 4 | 1 3 8 
1 3 6 | 7 9 8 | 2 4 5 
---------------------
3 7 2 | 6 8 9 | 5 1 4 
8 1 4 | 2 5 3 | 7 6 9 
6 9 5 | 4 1 7 | 3 8 2 
---------------------
Board: 1
---------------------
2 4 5 | 9 8 1 | 3 7 6 
1 6 9 | 2 7 3 | 5 8 4 
8 3 7 | 5 6 4 | 2 1 9 
---------------------
9 7 6 | 1 2 5 | 4 3 8 
5 1 3 | 4 9 8 | 6 2 7 
4 8 2 | 7 3 6 | 9 5 1 
---------------------
3 9 1 | 6 5 7 | 8 4 2 
7 2 8 | 3 4 9 | 1 6 5 
6 5 4 | 8 1 2 | 7 9 3 
---------------------
Board: 2
---------------------
4 6 2 | 8 3 1 | 9 5 7 
7 9 5 | 4 2 6 | 1 8 3 
3 8 1 | 7 9 5 | 4 2 6 
---------------------
1 7 3 | 9 8 4 | 2 6 5 
6 5 9 | 3 1 2 | 7 4 8 
2 4 8 | 5 6 7 | 3 1 9 
---------------------
9 2 6 | 1 7 8 | 5 3 4 
8 3 4 | 2 5 9 | 6 7 1 
5 1 7 | 6 4 3 | 8 9 2 
---------------------
Finished in 4.777865409851074 seconds

The last thing before we really get started is to check this all into git.

$ git status
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
[ayoung@ayoungP40 tree_sudoku]$ git init
Initialized empty Git repository in /home/ayoung/Documents/CodePlatoon/tree_sudoku/.git/
[ayoung@ayoungP40 tree_sudoku]$ git add .
[ayoung@ayoungP40 tree_sudoku]$ git commit -m "Initial checking of functioning code and sample data"
[master (root-commit) 5eb15e4] Initial checking of functioning code and sample data
 5 files changed, 478 insertions(+)
 create mode 120000 sample_sudoku_board_inputs.csv
 create mode 100644 sample_sudoku_board_inputs.csv.full
 create mode 100644 sample_sudoku_board_inputs.csv.short
 create mode 100644 tree_sudoku.py
 create mode 100644 tree_sudoku.py.comments

Next up; we’ll take a look through the code and write a test.

Episode 221 – Security, magic, and FaceID

Posted by Josh Bressers on October 26, 2020 12:01 AM

Josh and Kurt talk about how to get started in security. It’s like the hero’s journey, but with security instead of magic. We then talk about what Webkit bringing Face ID and Touch ID to the browsers will mean.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2019-5" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_221_Security_magic_and_FaceID.mp3?_=5" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_221_Security_magic_and_FaceID.mp3</audio>

Show Notes

Adding Nodes to Ironic

Posted by Adam Young on October 22, 2020 03:14 AM

TheJulia was kind enough to update the docs for Ironic to show me how to include IPMI information when creating nodes.

To delete all of the old nodes:

for UUID in `openstack baremetal node list -f json | jq -r '.[] | .UUID' ` ; do openstack baremetal node delete $UUID; done

nodes definition

I removed the ipmi common data from each definition as there is a password there, and I will set that afterwards on all nodes.

{
  "nodes": [
    {
      "ports": [
        {
          "address": "00:21:9b:93:d0:90"
        }
      ],
      "name": "zygarde",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "192.168.123.10"
      }
    },
    {
      "ports": [
        {
          "address": "00:21:9b:9b:c4:21"
        }
      ],
      "name": "umbreon",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "192.168.123.11"
      }
    },
    {
      "ports": [
        {
          "address": "00:21:9b:98:a3:1f"
        }
      ],
      "name": "zubat",
      "driver": "ipmi",
      "driver_info": {
        "ipmi_address": "192.168.123.12"
      }
    }
  ]
}

Create the nodes

openstack baremetal create  ./nodes.ipmi.json 

Check that the nodes are present

$ openstack baremetal node list
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name    | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+
| 3fa4feae-0d5c-4e38-a012-29258d40651b | zygarde | None          | None        | enroll             | False       |
| 00965ad4-c972-46fa-948a-3ce87aecf5ac | umbreon | None          | None        | enroll             | False       |
| 8702ea0c-aa10-4542-9292-3b464fe72036 | zubat   | None          | None        | enroll             | False       |
+--------------------------------------+---------+---------------+-------------+--------------------+-------------+

Update IPMI common data

for UUID in `openstack baremetal node list -f json | jq -r '.[] | .UUID' ` ; 
do  openstack baremetal node set $UUID --driver-info ipmi_password=`cat ~/ipmi.password`  --driver-info   ipmi_username=admin   ; 
done

EDIT: I had ipmi_user before and it does not work. Needs to be ipmi_username.

Final Check

And if I look in the returned data for the definition, we see the password is not readable:

$ openstack baremetal node show zubat  -f yaml | grep ipmi_password
  ipmi_password: '******'

Power On

for UUID in `openstack baremetal node list -f json | jq -r '.[] | .UUID' ` ; do  openstack baremetal node power on $UUID  ; done

Change “on” to “off” to power off.

Running Cassandra on Fedora 32

Posted by Adam Young on October 20, 2020 01:10 PM

This is not a tutorial. These are my running notes from getting Cassandra to run on Fedora 32. The debugging steps are interesting in their own right. I’ll provide a summary at the end for any sane enough not to read through the rest.

<figure class="wp-block-image"></figure>

Old Instructions

So…starting with https://www.liquidweb.com/kb/how-to-install-cassandra-2-on-fedora-20/ — the dsc-20 is, I think, version specific, so I want to see if there is something more appropriate for F32 (has it really been so many years?).

Looking in here https://rpm.datastax.com/community/noarch/ I see that there is still a dsc-20 series of packages, but also dsc-30…which might be a bit more recent of a release.

Dependencies resolved.
==================================================================================================================
 Package                      Architecture            Version                     Repository                 Size
==================================================================================================================
Installing:
 dsc30                        noarch                  3.0.9-1                     datastax                  1.9 k
Installing dependencies:
 cassandra30                  noarch                  3.0.9-1                     datastax                   24 M

Transaction Summary
==================================================================================================================
Install  2 Packages

I’d be interested to see what is in the dsc30 package versus Cassandra.

$ rpmquery --list dsc30
(contains no files)

OK. But…there is no systemd unit file:

sudo systemctl start cassandra
Failed to start cassandra.service: Unit cassandra.service not found.

Garbage Collection Configuration

Well, let’s just try to run it.

sudo /usr/sbin/cassandra

Unrecognized VM option 'UseParNewGC'

Seems like it is built to use an older version of the Java CLI params, which is now gone. Where does this come from?

$ rpmquery --list cassandra30 | xargs grep UseParNewGC  2>&1 | grep -v "Is a direc" 
/etc/cassandra/default.conf/jvm.options:-XX:+UseParNewGC

We can remove it there. According to this post, the appropriate replacement is -XX:+UseG1GC

OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Unrecognized VM option 'PrintGCDateStamps'

OK, let’s take care of both of those. According to this post, the GC line we put in above should cover UseConcMarkSweepGC.

The second option is in the logging section. It is not included in the jvm.options. However, if I run it with just the first option removed, I now get:

$ sudo /usr/sbin/cassandra
[0.000s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/var/log/cassandra/gc.log instead.
Unrecognized VM option 'PrintGCDateStamps'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

More trial and error shows I need to comment out all of the GC logging values at the bottom of the file:

#-XX:+PrintGCDetails
#-XX:+PrintGCDateStamps
#-XX:+PrintHeapAtGC
#-XX:+PrintTenuringDistribution
#-XX:+PrintGCApplicationStoppedTime
#-XX:+PrintPromotionFailure
#-XX:PrintFLSStatistics=1
#-Xloggc:/var/log/cassandra/gc.log
#-XX:+UseGCLogFileRotation
#-XX:NumberOfGCLogFiles=10
#-XX:GCLogFileSize=10M

-Xloggc is deprecated. Will use -Xlog:gc:/var/log/cassandra/gc.log instead. This is not from the jvm.options file (it was already commented out above).

$ rpmquery --list cassandra30 | xargs grep loggc  2>&1 | grep -v "Is a direc" 
/etc/cassandra/default.conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
/etc/cassandra/default.conf/cassandra-env.sh.orig:JVM_OPTS="$JVM_OPTS -Xloggc:${CASSANDRA_HOME}/logs/gc.log"
/etc/cassandra/default.conf/jvm.options:#-Xloggc:/var/log/cassandra/gc.log

I’m going to replace this with -Xlog:gc:/var/log/cassandra/gc.log as the message suggests in /etc/cassandra/default.conf/cassandra-env.sh

Thread Priority Policy

$ sudo /usr/sbin/cassandra
intx ThreadPriorityPolicy=42 is outside the allowed range [ 0 ... 1 ]
Improperly specified VM option 'ThreadPriorityPolicy=42'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
$ rpmquery --list cassandra30 | xargs grep ThreadPriorityPolicy  2>&1 | grep -v "Is a direc" 
/etc/cassandra/default.conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS -XX:ThreadPriorityPolicy=42"
/etc/cassandra/default.conf/cassandra-env.sh.orig:JVM_OPTS="$JVM_OPTS -XX:ThreadPriorityPolicy=42"

Looks like that was never a legal value. Since I am running a pretty tip-of-tree Linux distribution and OpenJDK version, I am going to set this to 1.

And with that, Cassandra will run. Too much output here. Let’s try to connect:

cqlsh doesn’t run

cqlsh
Connection error: ('Unable to connect to any servers', {'127.0.0.1': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})

OK…let’s dig: First, is it listening:

$ ps -ef | grep java
root      618809    5573  7 14:30 pts/3 ....
java    618809 root   61u     IPv4           32477117       0t0        TCP localhost:7199 (LISTEN)
java    618809 root   62u     IPv4           32477118       0t0        TCP localhost:46381 (LISTEN)
java    618809 root   70u     IPv4           32477124       0t0        TCP localhost:afs3-fileserver (LISTEN)
$ grep afs3-file /etc/services 
afs3-fileserver 7000/tcp                        # file server itself
afs3-fileserver 7000/udp                        # file server itself

I’m not sure off the top of my head which of those is the Query language port, but I can telnet to 7000, 7199, and 46381
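As an aside, a small helper like this (my own, not from any Cassandra tooling) does the same check as telnet without the interactive session; the ports are just the ones lsof reported above:

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: check the ports lsof reported for the Cassandra process.
# for port in (7000, 7199):
#     print(port, port_open("127.0.0.1", port))
```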

Running cqlsh --help I see:

Connects to 127.0.0.1:9042 by default. These defaults can be changed by
setting $CQLSH_HOST and/or $CQLSH_PORT. When a host (and optional port number)
are given on the command line, they take precedence over any defaults.

Lets give that a try:

[ayoung@ayoungP40 ~]$ cqlsh 
Connection error: ('Unable to connect to any servers', {'127.0.0.1': ConnectionShutdown('Connection <asyncoreconnection> is already closed',)})
[ayoung@ayoungP40 ~]$ export CQLSH_PORT=7100
[ayoung@ayoungP40 ~]$ cqlsh 
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 7100)]. Last error: Connection refused")})
[ayoung@ayoungP40 ~]$ export CQLSH_PORT=46381
[ayoung@ayoungP40 ~]$ cqlsh 
nOPConnection error: ('Unable to connect to any servers', {'127.0.0.1': ConnectionShutdown('Connection <asyncoreconnection> is already closed',)})

Nope. OK, maybe there is a log file. Perhaps the Cassandra process is stuck.

[ayoung@ayoungP40 ~]$ ls -lah /var/log/cassandra/
total 52M
drwxr-xr-x.  2 cassandra cassandra 4.0K Oct 19 15:41 .
drwxr-xr-x. 23 root      root      4.0K Oct 19 11:49 ..
-rw-r--r--.  1 root      root       19M Oct 19 15:41 debug.log

That is a long log file. I’m going to stop the process, wipe this directory and start again. Note that just hitting Ctrl C on the terminal was not enough to stop the process, I had to send a kill by pid.

This time the shell script exited on its own, but the cassandra process is running in the background of that terminal. lsof provides similar output. The high number port is now 44823 which means that I can at least rule that out; I think it is an ephemeral port anyway.

[ayoung@ayoungP40 ~]$ export CQLSH_PORT=7199
[ayoung@ayoungP40 ~]$ cqlsh 
Connection error: ('Unable to connect to any servers', {'127.0.0.1': ConnectionShutdown('Connection <asyncoreconnection> is already closed',)})

According to this post, the port for the query language is not open. That would be 9042. The two open ports are for data sync and for Java Management Extensions (JMX).

Why don’t I get Query port? Lets look in the log:

INFO  [main] 2020-10-19 15:46:11,640 Server.java:160 - Starting listening for CQL clients on localhost/127.0.0.1:9042 (unencrypted)...
INFO  [main] 2020-10-19 15:46:11,665 CassandraDaemon.java:488 - Not starting RPC server as requested. Use JMX (StorageService->startRPCServer()) or nodetool (enablethrift) to start it

But starting it seems to trigger a cascading failure: I now have a lot of log files. Let me see if I can find the first error. Nah, they are all zipped up. Going to wipe and restart, using tail -f on the log file before asking to restart thrift.

$ grep start_rpc /etc/cassandra/conf/*
/etc/cassandra/conf/cassandra.yaml:start_rpc: false
/etc/cassandra/conf/cassandra.yaml.orig:start_rpc: false
grep: /etc/cassandra/conf/triggers: Is a directory

Trying to start it with nodetool enablethrift failed, so let me try changing that value in the config file and restarting. My log file now ends with:

INFO  [main] 2020-10-19 16:12:47,695 ThriftServer.java:119 - Binding thrift service to localhost/127.0.0.1:9160
INFO  [Thread-1] 2020-10-19 16:12:47,699 ThriftServer.java:136 - Listening for thrift clients...
$ cqlsh 
Connection error: ('Unable to connect to any servers', {'127.0.0.1': OperationTimedOut('errors=Timed out creating connection (5 seconds), last_host=None',)})

Something is not happy. Let me see where the errors start. tail the log and tee it into a file in /tmp so I can look at it in the end.

ERROR [SharedPool-Worker-6] 2020-10-19 16:24:34,069 Message.java:617 - Unexpected exception during request; channel = [id: 0x0e698e4a, /127.0.0.1:35340 => /127.0.0.1:9042]
java.lang.RuntimeException: Unable to access address of buffer
        at io.netty.channel.epoll.Native.read(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:678) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe$3.run(EpollSocketChannel.java:755) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:268) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
        at java.base/java.lang.Thread.run(Thread.java:834) ~[na:na]
ERROR [SharedPool-Worker-2] 2020-10-19 16:24:34,069 Message.java:617 - Unexpected exception during request; channel = [id: 0x0e698e4a, /127.0.0.1:35340 => /127.0.0.1:9042]

Note: At this point, I suspected SELinux, so I put my machine in permissive mode. No change.

Native Transport

So I turned to Minecraft: it turns out they hit the same problem there, and the solution is to disable the native transport. Let's see if that applies to Cassandra.

$ grep start_native_transport /etc/cassandra/conf/*
/etc/cassandra/conf/cassandra.yaml:start_native_transport: true
/etc/cassandra/conf/cassandra.yaml.orig:start_native_transport: true

Ok, and looking in that file I see:

# Whether to start the native transport server.
# Please note that the address on which the native transport is bound is the
# same as the rpc_address. The port however is different and specified below.

Let me try disabling that and see what happens. No love, but in the log file I now see:

INFO  [main] 2020-10-19 17:23:28,272 ThriftServer.java:119 - Binding thrift service to localhost/127.0.0.1:9160
INFO  [main] 2020-10-19 17:23:28,272 ThriftServer.java:119 - Binding thrift service to localhost/127.0.0.1:9160

So let me try on that port.

[ayoung@ayoungP40 ~]$ export CQLSH_PORT=9160
[ayoung@ayoungP40 ~]$ cqlsh

Maybe it needs the native transport, and it should not be on the same port? Sure enough, further down the conf I find:

rpc_port: 9160

Change the value for start_native_transport back to true and restart the server.
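A quick way to confirm which listener actually came up after the restart (the port numbers come from the config above; this is just a sketch):

```shell
# 9042 is the CQL native transport, 9160 the legacy Thrift rpc_port.
# Whichever is missing here is the one cqlsh cannot reach.
ss -tln 2>/dev/null | grep -E ':(9042|9160)\b' || echo "neither port is listening"
```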

Now it fails with no indication of why.

This native_transport intrigues me. Let's see what else we can find…hmm, it seems Thrift is an old protocol, and the native transport has been the default for five or so years, which would explain why Thrift shows up in the F22 page but is not really supported. I should probably turn it off.

Interlude: nodetool status

OK…what else can I do to test my cluster? Nodetool?

$ nodetool status
WARN  21:37:34 Only 51050 MB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  186.68 KB  256          100.0%            6cf084ed-a30a-4b80-9efb-acbcd22362c2  rack1

A better install repository

OK….wipe the system, try with a different repo. Before this, I wiped out all of my old config files, and will need to remake any changes that I noted above.

sudo yum install http://apache.mirror.digitalpacific.com.au/cassandra/redhat/311x/cassandra-3.11.8-1.noarch.rpm http://apache.mirror.digitalpacific.com.au/cassandra/redhat/311x/cassandra-tools-3.11.8-1.noarch.rpm

Still no systemd scripts. Maybe in the 4 beta; I'll check that later. Make the same config changes for jvm.options. Note that the thread priority setting has moved here, too. Also, it does not want to run as root any more…progress. Do this to let things log:

sudo chown -R ayoung:ayoung /var/log/cassandra/

Insufficient permissions on directory /var/lib/cassandra/data

Get that one too.

sudo chown -R ayoung:ayoung /var/lib/cassandra/

telnet localhost 9160

Connects….I thought that would be turned off…interesting

$ cqlsh
Connection error: ('Unable to connect to any servers', {'127.0.0.1': ConnectionShutdown('Connection to 127.0.0.1 was closed',)})
$ strace cqlsh 2>&1 | grep -i connect\(
connect(5, {sa_family=AF_INET, sin_port=htons(9160), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)

This despite the fact that the docs say

Connects to 127.0.0.1:9042 by default.

Possibly it is picking the port up from config files, or from the environment.

Anticlimax: Works in a different terminal

I just opened another window, typed cqlsh, and it worked…go figure. Maybe some phantom env var from a previous incantation, like the CQLSH_PORT=9160 exported above.

Summary

  • For stable (not beta), use the 3.11 version
  • Running as a non-root user is now enforced
  • Change ownership of the /var directories so your user can read and write
  • Remove the GC options from jvm.options
  • Set the threading priority policy to 1 (or 0)
  • Make sure that you have a clean env when running cqlsh

It would be nice to have a better understanding of what went wrong running cqlsh. Problems that go away by themselves tend to return by themselves.
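If the phantom-env-var theory is right, something like this would have exposed it (a sketch; CQLSH_PORT is the variable exported in the earlier experiment):

```shell
# Look for leftover cqlsh/cassandra settings from earlier experiments
env | grep -i -E 'cqlsh|cassandra' || echo "environment is clean"
# Clear the one we know we exported before retrying cqlsh
unset CQLSH_PORT
```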

Episode 220 – Securing network time and IoT

Posted by Josh Bressers on October 19, 2020 12:01 AM

Josh and Kurt talk about Network Time Security (NTS) how it works and what it means for the world (probably not very much). We also talk about Singapore’s Cybersecurity Labelling Scheme (CLS). It probably won’t do a lot in the short term, but we hope it’s a beacon of hope for the future.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2014-6" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_220_Securing_network_time_and_IoT.mp3?_=6" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_220_Securing_network_time_and_IoT.mp3</audio>

Show Notes

Running Ironic Standalone on RHEL

Posted by Adam Young on October 15, 2020 09:26 PM

This is only going to work if you have access to the OpenStack code. If you are not an OpenStack customer, you are going to need an evaluation entitlement. That is beyond the scope of this article.

For me, I had to enable the repo openstack-16-for-rhel-8-x86_64-rpms. With that enabled I was able to install via yum:

yum install openstack-ironic-api mariadb-server python3-openstackclient python3-ironicclient openstack-ironic-conductor

Set up Maria DB

systemctl enable mariadb 
Created symlink /etc/systemd/system/mysql.service → /usr/lib/systemd/system/mariadb.service.
Created symlink /etc/systemd/system/mysqld.service → /usr/lib/systemd/system/mariadb.service.
Created symlink /etc/systemd/system/multi-user.target.wants/mariadb.service → /usr/lib/systemd/system/mariadb.service.
[root@nuzleaf ~]# systemctl start mariadb 
mysql -u root 
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 8
Server version: 10.3.17-MariaDB MariaDB Server

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> create database ironic;
Query OK, 1 row affected (0.001 sec)

MariaDB [(none)]> CREATE USER 'ironic'@'localhost' IDENTIFIED BY 'ironic';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'localhost';
Query OK, 0 rows affected (0.001 sec)

MariaDB [(none)]> \q
Bye

Make sure I can log in as the ironic user

mysql ironic -u ironic -p

Back up the config file:

 
cp /etc/ironic/ironic.conf /etc/ironic/ironic.conf.orig

I’ll use these values:

auth_strategy = noauth
enabled_hardware_types = ipmi
enabled_power_interfaces = ipmitool
connection=mysql+pymysql://ironic:ironic@localhost/ironic
rpc_transport = json-rpc
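For reference, those values land in different sections of /etc/ironic/ironic.conf. A minimal sketch, with section placement assumed from the usual oslo.config layout, so double-check against your packaged file:

```
[DEFAULT]
auth_strategy = noauth
enabled_hardware_types = ipmi
enabled_power_interfaces = ipmitool
rpc_transport = json-rpc

[database]
connection = mysql+pymysql://ironic:ironic@localhost/ironic
```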

Now let me see if I can sync the database:

ironic-dbsync 
INFO  [alembic.runtime.migration] Context impl MySQLImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 2581ebaf0cb2, initial migration
INFO  [alembic.runtime.migration] Running upgrade 2581ebaf0cb2 -> 21b331f883ef, Add provision_updated_at
...

INFO  [alembic.runtime.migration] Running upgrade 28c44432c9c3 -> 2aac7e0872f6, Create deploy_templates and deploy_template_steps tables.
INFO  [alembic.runtime.migration] Running upgrade 2aac7e0872f6 -> 1e15e7122cc9, add extra column to deploy_templates

Seemed to work. Let’s start conductor:

systemctl start openstack-ironic-conductor
# systemctl status openstack-ironic-conductor
● openstack-ironic-conductor.service - OpenStack Ironic Conductor service
   Loaded: loaded (/usr/lib/systemd/system/openstack-ironic-conductor.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2020-10-15 17:11:36 EDT; 1s ago
 Main PID: 491085 (ironic-conducto)
    Tasks: 1 (limit: 99656)
   Memory: 40.0M
   CGroup: /system.slice/openstack-ironic-conductor.service
           └─491085 /usr/bin/python3 /usr/bin/ironic-conductor

Oct 15 17:11:36 nuzleaf.home.younglogic.net systemd[1]: Started OpenStack Ironic Conductor service.

So far so good.

The API server is going to be an interesting piece. For a first run, I'll try the command line version I ran on the upstream. However, my longer term approach will be to run it as a WSGI app behind the Apache server that is already running on this host.

Run the API server in one terminal

ironic-api

and open up a second one to try to hit it with curl.

$ curl http://127.0.0.1:6385
{"name": "OpenStack Ironic API", "description": "Ironic is an OpenStack project which aims to provision baremetal machines.", "versions": [{"id": "v1", "links": [{"href": "http://127.0.0.1:6385/v1/", "rel": "self"}], "status": "CURRENT", "version": "1.58", "min_version": "1.1"}], "default_version": {"id": "v1", "links": [{"href": "http://127.0.0.1:6385/v1/", "rel": "self"}], "status": "CURRENT", "version": "1.58", "min_version": "1.1"}}

Excellent. Let's try the command line. The python3-ironicclient RPM does not ship the baremetal executable, but using the common openstack client, we can see all of the baremetal sub-commands. To list the drivers and the nodes, we can use the following:

openstack baremetal driver list
+---------------------+-----------------------------+
| Supported driver(s) | Active host(s)              |
+---------------------+-----------------------------+
| ipmi                | nuzleaf.home.younglogic.net |
+---------------------+-----------------------------+
$ openstack baremetal node list


Edit: Next up was going to be configuring Apache to run the Ironic API as a WSGI App, but I decided to use the systemd files to enable and run Ironic for the time being. Here’s all the commands together.

systemctl start mariadb 
systemctl start openstack-ironic-conductor
systemctl start openstack-ironic-api
systemctl enable mariadb 
systemctl enable openstack-ironic-conductor
systemctl enable openstack-ironic-api

Edit 2: Seems I am having trouble connecting. I’m back to running the API from the command line. More to come.

Introduction to Ironic

Posted by Adam Young on October 15, 2020 07:27 PM

“I can do any thing. I can’t do everything.”

The sheer number of projects and problem domains covered by OpenStack was overwhelming. I never learned several of the other projects under the big tent. One project that is getting relevant to my day job is Ironic, the bare metal provisioning service. Here are my notes from spelunking the code.

The Setting

I want just Ironic. I don’t want Keystone (personal grudge) or Glance or Neutron or Nova.

Ironic will write files to e.g. /var/lib/tftp and /var/www/html/pxe. It will not handle DHCP, but it can make use of static DHCP configurations.

Ironic is just an API server at this point (a Python-based web service) that manages the above files, and that can also talk to the IPMI ports on my servers to wake them up and perform configurations on them.

I need to provide ISO images to Ironic so it can put them in the right place to boot them.

Developer steps

I checked the code out of git. I am working off the master branch.

I ran tox to ensure the unit tests are all at 100%

I have mysql already installed and running, but with a Keystone Database. I need to make a new one for ironic. The database name, user, and password are all going to be ironic, to keep things simple.

CREATE USER 'ironic'@'localhost' IDENTIFIED BY 'ironic';
create database ironic;
GRANT ALL PRIVILEGES ON ironic.* TO 'ironic'@'localhost';
FLUSH PRIVILEGES;

Note that I did this as the Keystone user. That dude has way too much privilege…good thing this is JUST for DEVELOPMENT. This will be used to follow the steps in the developer quickstart docs. I also set the mysql URL in the config file to this:

connection = mysql+pymysql://ironic:ironic@localhost/ironic

Then I can run ironic-dbsync. Let's see what I got:

mysql ironic --user ironic --password
#....
MariaDB [ironic]> show tables;
+-------------------------------+
| Tables_in_ironic              |
+-------------------------------+
| alembic_version               |
| allocations                   |
| bios_settings                 |
| chassis                       |
| conductor_hardware_interfaces |
| conductors                    |
| deploy_template_steps         |
| deploy_templates              |
| node_tags                     |
| node_traits                   |
| nodes                         |
| portgroups                    |
| ports                         |
| volume_connectors             |
| volume_targets                |
+-------------------------------+
15 rows in set (0.000 sec)

OK, so the first table shows that Ironic uses Alembic to manage migrations. Unlike the SQLAlchemy migrations table, you can’t just query this table to see how many migrations have been performed:

MariaDB [ironic]> select * from alembic_version;
+--------------+
| version_num  |
+--------------+
| cf1a80fdb352 |
+--------------+
1 row in set (0.000 sec)

Running The Services

The script to start the API server is:
ironic-api -d --config-file etc/ironic/ironic.conf.local

Looking in the file requirements.txt, I see that the Web framework for Ironic is Pecan:

$ grep pecan requirements.txt 
pecan!=1.0.2,!=1.0.3,!=1.0.4,!=1.2,>=1.0.0 # BSD

This is new to me. On Keystone, we converted from no framework to Flask. I'm guessing that if I look in the chain that starts with the ironic-api file, I will see a Pecan launcher for a web application. We can find that file with:

$ which ironic-api
/opt/stack/ironic/.tox/py3/bin/ironic-api

Looking in that file, it references ironic.cmd.api, which is the file ironic/cmd/api.py which in turn refers to ironic/common/wsgi_service.py. This in turn refers to ironic/api/app.py from which we can finally see that it imports pecan.

Now I am ready to run the two services. Like most of OpenStack, there is an API server and a “worker” server. In Ironic, this is called the Conductor. This maps fairly well to the Operator pattern in Kubernetes. In this pattern, the user makes changes to the API server via a web VERB on a URL, possibly with a body. These changes represent a desired state. The state change is then performed asynchronously. In OpenStack, the asynchronous communication is performed via a message queue, usually Rabbit MQ. The Ironic team has a simpler mechanism used for development; JSON RPC. This happens to be the same mechanism used in FreeIPA.

Command Line

OK, once I got the service running, I had to do a little fiddling around to get the command lines to work. There was an old reference to

OS_AUTH_TYPE=token_endpoint

which needed to be replaced with

OS_AUTH_TYPE=none

Both are in the documentation, but only the second one will work.

I can run the following commands:

$ baremetal driver list
+---------------------+----------------+
| Supported driver(s) | Active host(s) |
+---------------------+----------------+
| fake-hardware       | ayoungP40      |
+---------------------+----------------+
$ baremetal node list


curl

Let's see if I can figure out from curl what APIs those are. There is only one version, and one link, so:

curl http://127.0.0.1:6385 | jq '.versions  | .[] | .links | .[] |  .href'

"http://127.0.0.1:6385/v1/"


Doing curl against that second link gives a list of the top level resources:

  • media_types
  • chassis
  • nodes
  • drivers

And I assume that, if I use curl to GET the drivers, I should see the fake driver entry from above:

$ curl "http://127.0.0.1:6385/v1/drivers" | jq '.drivers |.[] |.name'

"fake-hardware"

OK, that is enough to get started. I am going to try and do the same with the RPMs that we ship with OSP and see what I get there.

But that is a tale for another day.

Thank You

A conversation with Julia Kreger, a long-time core member of the Ironic project, helped get me oriented.

Recovering Audio off an Old Tape Using Audacity

Posted by Adam Young on October 14, 2020 05:30 PM

One of my friends wrote a bunch of music back in high school. The only remaining recordings are on a cassette tape that he produced. Time has not been kind to the recordings, but they are audible…barely. He has a device that produces MP3s from the tape. My job has been to try and clean them up so that we can understand them well enough to recover the original songs.

I have the combined recording on a single MP3. I’ve gone through and noted the times where each song starts and stops. I am going to go through the steps I’ve been using to go from that single long MP3 to an individual recording.

<figure class="wp-block-image"></figure>

Cutting from the big file

First, I pull up the original recording in Audacity. Today I am working on the recording that starts at 51 minutes and a few seconds into the track. Visually, I can see that the last song ends at 51:23 and this one starts at 51:26. I highlight the majority of the silence that precedes the song, through to the end. Select copy, and start a new project. Select Tracks->add-new->Stereo Track and paste the selection from the longer file.

I’d suggest listening to the recording prior to and after each step, so you can hear the differences.

Removing the Tape Hiss

The reason I keep the silence from the beginning of the song is to use it as a noise profile for removing the hiss from the rest of the recording.

<figure class="wp-block-image"></figure>
  • Highlight a small section of the empty segment
  • Select Effect->Noise Reduction
  • Click Noise Profile; the dialog will close
  • Select the entire clip (Ctrl+A)
  • Select Effect->Noise Reduction again
  • Select OK to take the defaults

Vocals

The vocals are pretty hard to hear in the original. I use the graphic equalizer to try to pull the vocal frequencies out from the piano background.

  • Select the whole clip
  • Select Effect->Compressor. Take the defaults
  • Select Effect->Normalize. Take the defaults
  • Select Effect->Graphic Equalizer. I use a shape like this:
<figure class="wp-block-image"></figure>

I tend to undo and redo this a few times until I can best hear the vocals.

Pitch Correction

The tape is slightly sped up. I need to take it down just shy of one whole step. Again, select the whole clip. Then select Effect->Change Pitch. This is what worked for one of the tunes:

<figure class="wp-block-image"></figure>

I suspect that I should drop that “7” in the last fields to a more reasonable octave, probably the 3 or 4 in the middle of the range. I’ll play with that in the future.

Again, I undo and redo this multiple times, playing the chords on a keyboard along with the recording, until I feel the pitch matches.

Close enough, anyway.

Once this is done, I save the project and export to MP3, and share with my other players.

Episode 219 – Chat with Larry Cashdollar

Posted by Josh Bressers on October 12, 2020 12:01 AM

Josh and Kurt have a chat with Larry Cashdollar. The three of us go way back. Larry has done some amazing things and he tells us all about it!

<audio class="wp-audio-shortcode" controls="controls" id="audio-2010-7" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_219_Chat_with_Larry_Cashdollar.mp3?_=7" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_219_Chat_with_Larry_Cashdollar.mp3</audio>

Show Notes

Home Gym

Posted by Adam Young on October 07, 2020 06:39 PM

Concrete weighs 5 pounds per quart.

<figure class="wp-block-image size-large"><figcaption>Concrete Weights</figcaption></figure>

For benching I needed something bigger. The big ones are 60 pounds.

<figure class="wp-block-image size-large"><figcaption>60 pounds Weights for bench press</figcaption></figure>

Stretching and calisthenics…and kicking. Great stress relief.

<figure class="wp-block-image size-large"><figcaption>Half a ring</figcaption></figure>

Use the rubber bands for lat work and hips.

<figure class="wp-block-image size-large"></figure>

Episode 218 – The past was a terrible place

Posted by Josh Bressers on October 05, 2020 12:01 AM

Josh and Kurt talk about change. Specifically we discuss how the past was a terrible place. Never believe anyone who tells you it was better. Part of a career now is learning how to learn. The things you learn today won't be useful skills in a few years. The future is always better than the past. Even in 2020.

<audio class="wp-audio-shortcode" controls="controls" id="audio-2005-8" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_218_The_past_was_a_terrible_place.mp3?_=8" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_218_The_past_was_a_terrible_place.mp3</audio>

Show Notes

When to play notes out of the Key

Posted by Adam Young on October 02, 2020 08:12 PM

Another question from the Reddit Theory board. Here is my answer.

Ok…here are the simplest extensions.

  1. Use the F as a replacement for the F# to convert to the relative minor. D minor is based on F major, so you can also use the C natural and the B flat. https://en.wikipedia.org/wiki/Relative_key
  2. Convert the third chord from minor to major. In this case, that is the F#. Add in the A#. Yes, this is the same as the Bflat from the relative minor in step 1, but it is really different sounding. Listen to how Blues Traveller uses it in “Hook” which is based on Canon in D changes. Chords are D A Bmin F# G D G A7 https://en.wikipedia.org/wiki/Borrowed_chord
  3. Use the bVII. In the key of G, this is the F7 (natural) . Read up on Back door ii V progressions. It will land back on G. https://en.wikipedia.org/wiki/Backdoor_progression https://antonjazz.com/2012/01/backdoor-ii-v-progression/
  4. Add in the #5. This converts the major scale to the bebop scale. I love this note. https://adam.younglogic.com/2019/01/g/
  5. Use a tritone substitution: a ii V7 I progression in G is Amin, D7, G. Replace the D7 with an Ab7. https://en.wikipedia.org/wiki/Tritone_substitution
  6. Extend the Major 7 chord with the #11. This is the tritone, but it sounds good . The natural 11 is typically an avoid note over the major 7th chord. This is the C#
  7. Extend the Dominant 7th chord with the Flat 9. This is the D7 chord, and you add the C#. Yep, same as the one above. This is not coincidence.
  8. Create a iiø V7b9 i progression on the relative minor. Again, this is the D minor. It does not really add any notes outside the scale, but the different chords sound good when played with chords beyond the diatonic. The Diminished scale in particular sounds good over the V7b9. So for G major, this would be F#ø B7b9 Eminor. B7 Diminished uses the Half/Whole scale https://en.wikipedia.org/wiki/Octatonic_scale#Jazz
  9. Add the major 7th on the Mixolydian mode to get the bebop scale. So, on D7, this is the C#. Seeing a trend?
  10. Use the blues scale. This means the Minor third over the major third, the tritone, the dom 7th…all things referred to in early items on this list. You actually have a few blues scales to pick from. The Tonic, the dorian, the relative minor, and the Dom7th can all be the basis for a blues scale depending on position. You’ll find that a lot of the above techniques are actually basic parts of the blues scale, and are much easier to think of as blues tones.
  11. On downward running riffs, add in chromatic passing tones to land on chord tones. This is the Barry Harris stuff. In addition to the #5, you can use the b9 and #9, tritone, dom 7th, etc.: all of the above techniques. When the down beat has a chord tone, the upbeat is very accommodating of non-chord and non-scale tones.

That enough to get you started?

Altered chords

Posted by Adam Young on October 02, 2020 06:22 PM

When you start looking at Jazz lead sheets, you might notice a Chord with the Alt modification. This is an altered chord.

You can play anything and call it an altered chord. You can sit on the Piano if you really want to. OK, maybe not quite…

The Dominant 7th chord allows a lot of leeway. Pretty much any note works over it to some degree. The clash of the tritone between the third and the seventh just opens up the possibilities. Thus, pianists started introducing all sorts of voicings to the chord. The obvious first note to add was the b9.

The ABC notation behind the rendered example:

X:1
K:C
L:1/4
[CEG_B_d]4

Why is this obvious? Well, a natural 9 actually sounds pretty tame. The b9, OTOH, sits a half step above the root. This is the blues note of the relative dorian blues, and the 7th that turns the natural minor into the harmonic minor.

Ok, that is a lot of words. An example helps. Take G7, the Dom7th of the C major scale, and an Ab over that G7. The enharmonic G# is the 7th of the A harmonic minor scale, and the blues note of the D dorian blues. It is the note that turns the C major scale into the C bebop scale. The G7 includes the notes of the B dim triad; the Ab makes it a B half diminished chord. Thus, the two get used somewhat interchangeably.

I’ve talked about this note before.

Once that got established in the Jazz tradition, the race was on to see what else could be altered. Some threw a #9 in there, some a #5, some a #4 (a tritone off the root). The root, the fifth, and even the third of the chord were often dropped in the voicings.

These chords are often played deep in the heart of a sequence. The most obvious is a ii V7 I. There the V7 is often played as a tritone substitution, so ii bII I .

Again, in the key of C:

Dmin Db7 C.

The ABC notation for this progression:

X:1
K:C
L:1/4
|[DFAc]4 [_DF_Ac]4 | [CEGB]8|

The Db7 shares the two notes of the tritone with the G7: the B and the F. So, is the chord played a Db7, or is it a G7 plus Db and G#? Only the pianist knows for sure, and he already forgot.

The ABC notation with the altered voicing spelled out:

X:1
K:C
L:1/4
|[DFAc]4 [_DfG_aBd]4 | [CEGB]8|

It's just easier to write it as altered.

A bug by any other name

Posted by Josh Bressers on October 01, 2020 02:37 PM

This tweet from Jim Manico really has me thinking about why we like to consider security bugs special. There are a lot of tools on the market today to scan your github repos, containers, operating systems, web pages … pick something, for security vulnerabilities. I’ve written a very very long series about these scanners and why they’re generally terrible today but will get better, but only if we demand it. I’m now wondering why we want to consider security special. Why do we have an entire industry focused just on security bugs?

Let’s change the conversation a little bit. Rather than focus on security bugs, let’s ask the question: Which bugs in a given project should we care about?

There are of course bugs an attacker could use to compromise your system. There are also bugs that could result in data loss. Or bugs that could bring everything down. What about a bug that uses 10% more CPU? Every piece of software has bugs. All bugs are equal, but some bugs are more equal than others.

We are at a time in software history where we have decided security bugs are more equal than other bugs. This has created entire industries around scanning just for security problems. Unfortunately the end goal isn’t always to fix problems, the goal is often to find problems, so problems are found (a LOT of problems). I think this is a pretty typical case of perverse incentives. You will always find what you measure. The pendulum will swing back in time, maybe we can help it swing a little faster.

Finding bugs

How do you find bugs in your software? There are probably two big sources. You have users, and those users report problems (and of course demand fixes). You can also have a bug show up during testing. Ideally it's automated testing. Automated testing is one of the most important innovations in software development in the last few decades. The most important functionality of your project can be tested to make sure nothing obvious breaks when you make a change, like updating a dependency.

But users and testing don't really find security bugs. A lot of these are really bizarre corner cases that no normal user would ever hit. If you've ever run a fuzzer you know what this means. You might have an API that expects positive integers as the input. When you hand it the string "banana stand" it suddenly decides to crash, because a string isn't an integer. Why would anyone do this is the obvious question. Most automated tests don't do extremely silly things such as this.

Yet this is basically how security bugs work. Now imagine that when we send our API "banana stand", instead of crashing, it deletes all your files. That would be a very bad thing. Now imagine someone in Australia figures out they can delete all your files and decides it would be a fun thing to do on a Friday night. Your weekend is ruined, either because you have to spend it restoring data, or because you just lost your job.

And this is probably why we like to make a huge deal about security bugs.

Open Source

I also think the other half of this story is open source. A lot of these scanners are focused on open source libraries you are using in your application. Initially the threat was open source itself. It’s written by miscreants who want to give away things for free! The first iteration of these scanners was all about just finding hidden open source in your products so you can remove it.

But the siren song of free lured away too many developers. If we can’t scare you away from open source, maybe we can scare you into thinking the open source you have is inherently broken, you just don’t know it, and you need a tool to tell you what’s wrong with it, and we happen to have such a tool!

Security and fear were forged on the same anvil. The forward thinking security folks know that fear is not a motivator, it is a de-motivator. Fear paralyzes people, it does not empower. The vast majority of these scanners are focused on what can go wrong. Why you should be careful with what libraries you are using. There is some truth in this, but we should be focusing on what can go right and how to make things better. There is more good than bad.

This is where I start to wonder why we focus only on security bugs. Is there a tool that tells us when a new version of a library has a 20% speed improvement? Or what about when a library fixes a bug you had to work around last year. What if a project fixed a bug that could cause data corruption? What about giving advice as to which library has a healthier community and better features? These are arguably more important than 98% of all security bugs affecting your project.

It’s easy to say “But those problems are hard” and I would agree. Solving hard problems is how the future is built. Telling me my project has 600 security vulnerabilities isn’t very useful. How many of those 600 actually matter? If my project is this insecure how is it not literally on fire all of the time? Maybe I should just put a paper bag over my head and ignore the scan. Yes, that sounds easier.

Now, imagine the scanner telling me I should upgrade 3 libraries to get a 20% speed improvement, fix a data corruption bug, and get rid of a cross site scripting vulnerability. Now I’m listening! This is the tool we should be demanding.

The reality is many of these tools cost the price of a full time employee and deliver the value of an intern. That’s probably too harsh. I apologize to the interns.

Episode 217 – How to tell your story with Travis Murdock

Posted by Josh Bressers on September 28, 2020 12:01 AM

Josh and Kurt talk to Travis Murdock about how to tell your story. Travis explains how to talk to the press and how to tell our story in a way that helps get our message across and lets the reporter do their job better.

<audio class="wp-audio-shortcode" controls="controls" id="audio-1968-8" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_217_How_to_tell_your_story_with_Travis_Murdock.mp3?_=8" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_217_How_to_tell_your_story_with_Travis_Murdock.mp3</audio>

Show Notes

DHCP Lease Design

Posted by Adam Young on September 22, 2020 11:34 PM

The key piece of persisted data in a DHCP server is the lease. A lease is a time-limited mapping between a MAC address and an IP address. A lease typically has a start time and an end time, but can be renewed. Because I am still living in an IPv4 world, I have to deal with arbitrarily small pools of IP addresses. Thus, the design needs to strike a balance between static and dynamic: a machine should generally get back the same IP address each time. However, if addresses get tight, address reuse should be aggressive.

A DHCP server typically works with a Class C network subnet. That means that roughly 250 addresses are managed per server. This is on the low side. For example, the clusters on the Top 500 site often have thousands of nodes. However, those nodes also don't tend to come and go at random.

What does happen, however, is that certain pieces of hardware fail and get replaced. When that happens, the new servers need to reclaim leases from the retired ones.

Assuming that the client servers regularly send DHCP packets out to renew their leases, the server should be able to reuse the IP addresses based on the order in which the leases expire.

Rust is very strict about memory ownership. That was giving me a bit of an issue when I was trying to design a data structure to solve the allocation and reallocation of IP addresses. My first thought was that, for renewal, I could look up the lease by MAC address. This calls for a hashmap where the key is the MAC address and the value is the lease.

But I also want to order the leases by expiry time, which seems to call for a linked-list style collection, where additions are made to one end (newest) and removed from the other (oldest) but also allowing for a reshuffling of the values when a renewal happens. These need to be removed from the middle.

My current thinking is that I can keep the MAC addresses as the primary key for the MAP, but I will also maintain a separate list of the MAC addresses ordered by expiration time.


This last bit is going to be tricky. If it were just renewal time, pushing the value onto the head of the list would be sufficient. If all of the leases are the same duration, then ordering by renewal time is the same as ordering by expiration. If the lease duration can vary, the insertion process is going to require looking up each lease in the map to retrieve the expiration time.
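A minimal sketch of that design, with illustrative types (`MacAddr`, `Lease`, `LeaseTable`) that are not from the post: a HashMap keyed by MAC for lookup, plus a VecDeque of MACs kept in expiry order. On renewal the MAC is pulled out of the queue and re-inserted at the position its new expiry calls for, which is the lookup-on-insert cost described above.

```rust
use std::collections::{HashMap, VecDeque};

type MacAddr = [u8; 6];

#[derive(Clone, Copy)]
struct Lease {
    ip: u32,         // IPv4 address as a u32
    expires_at: u64, // expiry, seconds since the epoch
}

struct LeaseTable {
    // Primary lookup: MAC address -> lease.
    by_mac: HashMap<MacAddr, Lease>,
    // MACs ordered oldest-expiry-first; reshuffled on renewal.
    expiry_order: VecDeque<MacAddr>,
}

impl LeaseTable {
    fn new() -> Self {
        LeaseTable { by_mac: HashMap::new(), expiry_order: VecDeque::new() }
    }

    fn renew(&mut self, mac: MacAddr, now: u64, duration: u64) {
        let lease = self.by_mac.entry(mac).or_insert(Lease { ip: 0, expires_at: 0 });
        lease.expires_at = now + duration;
        // Remove the MAC from wherever it was in the queue...
        self.expiry_order.retain(|m| *m != mac);
        // ...and re-insert it just before the first lease expiring later,
        // which requires looking each candidate up in the map.
        let by_mac = &self.by_mac;
        let pos = self
            .expiry_order
            .iter()
            .position(|m| by_mac[m].expires_at > now + duration)
            .unwrap_or(self.expiry_order.len());
        self.expiry_order.insert(pos, mac);
    }

    // The oldest-expiring lease is the first candidate for reuse.
    fn oldest(&self) -> Option<Lease> {
        self.expiry_order.front().map(|m| self.by_mac[m])
    }
}
```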

There are various usage patterns for DHCP. Probably the biggest differentiator is whether the addresses are going to be recycled frequently, say in a coffee shop, or if we expect long-term traffic. In an office setting, people expect to come in and get to work, so making sure there is an IP address available for them is paramount. This setting requires longer leases; a few posts I've found suggest a week plus one day. This makes sense: if you get your address on Monday, and miss renewal for some reason, when you come in the following Monday, you still have your lease.


Some devices get very long term leases. A printer or a server with a public Hostname should probably have something very much approximating a fixed IP address. Since these often need to be on local networks, it means that certain MAC addresses are assigned leases for IP addresses from different pools. This can cause some frustration when initially setting up a machine, as it might get its address from an 8 day pool but the admin wants to give it a long term assignment from a static pool.

A virtual machine based deployment is going to have an order of magnitude more hosts than a pure physical data center. The need for new IP address leases will also be far more frequent.

A container based system could potentially have even higher numbers of lease requests than a virtual machine based deployment.

Maybe we should just go with IPv6.

Moving back to hardware provisioning, one pattern that comes up a lot is new-hardware classification. This is the usage where a new MAC address gets a shorter-term lease, installs a short-lived, inventory-gathering operating system, performs initial registration, and then reboots into its longer-term configuration. Again, we have two pools of IP addresses with two different lease durations.

What I think this means is that there are going to be a couple of different abstractions required. I do not like the word profile or the word flavor, but something that indicates the class of hardware available in the machine. IP addresses are allocated out of subnets, but finer grained than that will be pools: subsets of the subnet's addresses that share a common usage. Static, short-lived, and week-long are three examples of pools based on what I wrote above.

Since a machine can move from one pool to another, there should be a link from the lease to the pool. That way the pool used can be determined from the original MAC address.
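The pool abstraction described above can be sketched as a named address range with its own lease duration. The struct, names, and address ranges here are illustrative, not from the post:

```rust
use std::net::Ipv4Addr;

// A pool: a named slice of the subnet with a common lease duration.
struct Pool {
    name: &'static str,
    first: Ipv4Addr, // first address in the pool
    last: Ipv4Addr,  // last address, inclusive
    lease_secs: u64, // lease duration for this pool
}

const DAY: u64 = 86_400;

fn pools() -> Vec<Pool> {
    vec![
        // Printers and servers with public hostnames: effectively static.
        Pool {
            name: "static",
            first: Ipv4Addr::new(192, 168, 123, 10),
            last: Ipv4Addr::new(192, 168, 123, 29),
            lease_secs: 365 * DAY,
        },
        // New-hardware classification: inventory boot, then reboot.
        Pool {
            name: "short-lived",
            first: Ipv4Addr::new(192, 168, 123, 30),
            last: Ipv4Addr::new(192, 168, 123, 99),
            lease_secs: 15 * 60,
        },
        // Office use: a week plus a day, so a Monday lease survives
        // a missed renewal until the following Monday.
        Pool {
            name: "week-long",
            first: Ipv4Addr::new(192, 168, 123, 100),
            last: Ipv4Addr::new(192, 168, 123, 250),
            lease_secs: 8 * DAY,
        },
    ]
}
```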

Episode 216 – Security didn’t find life on Venus

Posted by Josh Bressers on September 21, 2020 12:01 AM

Josh and Kurt talk about how we talk about what we do in the context of life on Venus. We didn’t really discover life on Venus, we discovered a gas that could be created by life on Venus. The world didn’t hear that though. We have a similar communication problem in security. How often are your words misunderstood?

<audio class="wp-audio-shortcode" controls="controls" id="audio-1957-9" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_216_Security_didnt_find_life_on_Venus.mp3?_=9" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_216_Security_didnt_find_life_on_Venus.mp3</audio>

Show Notes

Episode 215 – Real security is boring

Posted by Josh Bressers on September 14, 2020 12:00 AM

Josh and Kurt talk about attacking open source. How serious is the threat of developers being targeted or a git repo being watched for secret security fixes? The reality of it all is there are many layers in a security journey, the most important things you can do are also the least exciting.

<audio class="wp-audio-shortcode" controls="controls" id="audio-1923-9" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_215_Real_security_is_boring.mp3?_=9" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_215_Real_security_is_boring.mp3</audio>

Show Notes

Interpreting DHCP packets

Posted by Adam Young on September 13, 2020 02:55 PM

To capture DHCP packets I ran:

 tcpdump port 67 -i vnet0 -vvvv  -w /tmp/packets.bin

That gave me a binary file 940 bytes long. This is actually 2 packets: the request and the response. This has the IP header, the UDP header, and the DHCP packet payload in it.

First, let's let tcpdump interpret the whole thing for us, and then we can start picking it apart:

#  tcpdump -r /tmp/packets.bin -vvvv
reading from file /tmp/packets.bin, link-type EN10MB (Ethernet)
19:42:46.742937 IP (tos 0x0, ttl 64, id 278, offset 0, flags [none], proto UDP (17), length 428)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 52:54:00:94:9e:f2 (oui Unknown), length 400, xid 0xff77df41, secs 4, Flags [none] (0x0000)
	  Client-Ethernet-Address 52:54:00:94:9e:f2 (oui Unknown)
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Discover
	    MSZ Option 57, length 2: 1472
	    ARCH Option 93, length 2: 0
	    NDI Option 94, length 3: 1.2.1
	    Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
	    User-Class Option 77, length 4: 
	      instance#1: ERROR: invalid option
	    Parameter-Request Option 55, length 23: 
	      Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG
	      Hostname, Domain-Name, RP, MTU
	      Vendor-Option, Vendor-Class, TFTP, BF
	      Option 119, Option 128, Option 129, Option 130
	      Option 131, Option 132, Option 133, Option 134
	      Option 135, Option 175, Option 203
	    T175 Option 175, length 48: 2969895296,2249199339,50397184,385941794,16847617,17891585,654377241,16846849,35717377,352387352,16852481,17957121
	    Client-ID Option 61, length 7: ether 52:54:00:94:9e:f2
	    GUID Option 97, length 17: 0.178.35.76.56.225.195.173.69.183.151.210.221.34.14.27.157
	    END Option 255, length 0
19:42:50.798338 IP (tos 0x0, ttl 64, id 666, offset 0, flags [none], proto UDP (17), length 428)
    0.0.0.0.bootpc > 255.255.255.255.bootps: [udp sum ok] BOOTP/DHCP, Request from 52:54:00:94:9e:f2 (oui Unknown), length 400, xid 0xff77df41, secs 10, Flags [none] (0x0000)
	  Client-Ethernet-Address 52:54:00:94:9e:f2 (oui Unknown)
	  Vendor-rfc1048 Extensions
	    Magic Cookie 0x63825363
	    DHCP-Message Option 53, length 1: Discover
	    MSZ Option 57, length 2: 1472
	    ARCH Option 93, length 2: 0
	    NDI Option 94, length 3: 1.2.1
	    Vendor-Class Option 60, length 32: "PXEClient:Arch:00000:UNDI:002001"
	    User-Class Option 77, length 4: 
	      instance#1: ERROR: invalid option
	    Parameter-Request Option 55, length 23: 
	      Subnet-Mask, Default-Gateway, Domain-Name-Server, LOG
	      Hostname, Domain-Name, RP, MTU
	      Vendor-Option, Vendor-Class, TFTP, BF
	      Option 119, Option 128, Option 129, Option 130
	      Option 131, Option 132, Option 133, Option 134
	      Option 135, Option 175, Option 203
	    T175 Option 175, length 48: 2969895296,2249199339,50397184,385941794,16847617,17891585,654377241,16846849,35717377,352387352,16852481,17957121
	    Client-ID Option 61, length 7: ether 52:54:00:94:9e:f2
	    GUID Option 97, length 17: 0.178.35.76.56.225.195.173.69.183.151.210.221.34.14.27.157
	    END Option 255, length 0

My last post was a one liner I use to look at the packets with hexdump. This is the first couple lines of output from the packets I captured.

00000000  212 195 178 161 002 000 004 000  000 000 000 000 000 000 000 000  |................|
00000016  000 000 004 000 001 000 000 000  246 092 093 095 025 086 011 000  |.........\]_.V..|

Another way to look at the packets is using emacs and hexl-mode. Both have their uses. Most valuable is the ability to run the file through tcpdump with various flags to see what it gives.

For example, I can run:

#  tcpdump -r /tmp/packets.bin -X
reading from file /tmp/packets.bin, link-type EN10MB (Ethernet)
19:42:46.742937 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:94:9e:f2 (oui Unknown), length 400
	0x0000:  4500 01ac 0116 0000 4011 782c 0000 0000  E.......@.x,....

This shows the first two bytes of the UDP packet data as the 4500 (hex). The same is true of the second packet:

19:42:50.798338 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:94:9e:f2 (oui Unknown), length 400
	0x0000:  4500 01ac 029a 0000 4011 76a8 0000 0000  E.......@.v.....

Hex 45 is Decimal 69. Looking with Hexdump:

[root@nuzleaf ~]# hexdump  /tmp/packets.bin  -e'"%07.8_ad  " 8/1 "%03d " "  " 8/1 "%03d " "  |"' -e'16/1  "%_p"  "|\n"' -v | grep 69
00000048  000 148 158 242 008 000 069 000  001 172 001 022 000 000 064 017  |......E.......@.|

We see 069 000. That matches the 4500.
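That first byte, 0x45, is itself two fields: the IP version in the high nibble and the header length, counted in 32-bit words, in the low nibble. A quick sketch of the split (the function name is mine, for illustration):

```rust
// Split the first byte of an IPv4 header into its two fields:
// 0x45 = version 4 (high nibble), IHL 5 (low nibble),
// i.e. a 5 * 4 = 20-byte header with no options.
fn ip_version_and_header_len(first_byte: u8) -> (u8, usize) {
    let version = first_byte >> 4;
    let header_len = (first_byte & 0x0f) as usize * 4;
    (version, header_len)
}
```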

Next in the decimal sequence is 001 172. In the hex we have 01ac, which matches. This offset is 54. We know the whole file is 940 bytes long. That means each packet is probably 470 bytes. 470 minus 54 is 416, which is still longer than the 300 that we know is the length of the DHCP packet.

tcpdump shows us that the length is 400, which is, I think, the length of the DHCP payload.

Possibly the best pattern to match is the MAC address, which we know to be

82, 84, 0, 148, 158, 242 (converted from hex 0x52,0x54,0x00,0x94,0x9E,0xF2)

The MAC address is supposed to be put in the packet at an offset of 28 bytes in (the chaddr field). The length is 6. We see this pattern on the line that starts with offset 00000096 and continues onto the next line:

00000096  000 000 000 000 000 000 000 000  000 000 000 000 000 000 082 084  |..............RT|
00000112  000 148 158 242 000 000 000 000  000 000 000 000 000 000 000 000  |................|

Working backwards now, we should be able to pick out the pattern for the start of the packet: 01 for the opcode, 01 for the hardware type, and 06 for the HW length. We see that here:

00000080  077 216 001 001 006 000 255 119  223 065 000 004 000 000 000 000  |M......w.A......|

So our DHCP payload starts at offset 82. It should continue for 300 bytes.

00000368  050 048 048 049 077 004 105 080  088 069 055 023 001 003 006 007  |2001M.iPXE7.....|
00000384  012 015 017 026 043 060 066 067  119 128 129 130 131 132 133 134  |....+<BCw.......|

That actually lines up with the error reported in the tcpdump:

instance#1: ERROR: invalid option Parameter-Request Option 55, length 23:

The 55 is followed by the length 23, the two values 001 and 003...and that is the end of the packet. 006 007 012 and so on might be part of the packet, but the BOOTP protocol hardcodes the length at 300, and so they get ignored. If we could find the length of the data section of the UDP header, we might know better.

Note that the bad checksum is reported as well. That value should be in the two bytes right before the start of the actual UDP data, and the two bytes prior to that are the UDP length. Looking at the dump above, we see that the line at offset 00000080 starts with 077 216 before going into the 001 001 006 of the boot packet. This is the checksum. The line before that ends with the two values 001 152. In hex that is 0198 (I looked in hexl-mode), which in decimal is 408. The UDP length field counts the 8-byte UDP header as well as the data, so the payload is 400 bytes: the same "length 400" that tcpdump reported.

Our DHCP packet, which is supposed to be 300 bytes long, is 400 bytes long. Is this legal?

Apparently, yes: RFC 1531 extended the BOOTP definition. https://tools.ietf.org/html/rfc1531#section-2 DHCP extended the BOOTP packet size by renaming the vendor-specific area to options and giving it a minimum size of 312 octets.

 The options field is now variable length, with the minimum extended
   to 312 octets.

This seems to mean that tcpdump is working with the older format, which is no longer valid. I also need to extend the size of my packet buffer in Rust.
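Putting the two findings together, here is a sketch of what the larger buffer and the fixed-offset parsing look like. The function name `parse_boot_header` is mine (the post's own parser is called build_boot_packet), and the constants follow from the 236-byte fixed BOOTP header plus the 312-octet minimum options area quoted above:

```rust
// RFC 1531 raises the minimum message to a 236-byte fixed header plus
// at least 312 octets of options, so the receive buffer must grow
// past the original 300 bytes.
const DHCP_FIXED_LEN: usize = 236;
const DHCP_MIN_OPTIONS: usize = 312;
const DHCP_MIN_LEN: usize = DHCP_FIXED_LEN + DHCP_MIN_OPTIONS; // 548

// Pull the fixed-offset header fields out of the DHCP payload (the
// UDP data, which started at file offset 82 in the capture above).
fn parse_boot_header(buf: &[u8]) -> Option<(u8, u8, u8, u32, [u8; 6])> {
    if buf.len() < DHCP_FIXED_LEN {
        return None;
    }
    let opcode = buf[0];      // 1 = BOOTREQUEST
    let hwtype = buf[1];      // 1 = Ethernet
    let hw_addr_len = buf[2]; // 6 for a MAC address
    let txn_id = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);
    let mut mac = [0u8; 6];
    mac.copy_from_slice(&buf[28..34]); // chaddr starts at offset 28
    Some((opcode, hwtype, hw_addr_len, txn_id, mac))
}
```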

hexdump one byte decimal display

Posted by Adam Young on September 12, 2020 03:34 AM
hexdump  boot-packet.bin  -e'"%07.8_ad  " 8/1 "%03d " "  " 8/1 "%03d " "  |"' -e'16/1  "%_p"  "|\n"' -v

Found here:

Extract Function Refactoring using inline functions.

Posted by Adam Young on September 09, 2020 07:06 PM

The Extract Function refactoring is the starting point for much of my code clean up. Once a “Main” function gets sufficiently complicated, I pull pieces of it out into their own functions, often with an eye to making them methods of the involved classes.

While working with some rust code, I encountered an opportunity to execute this refactoring on some logging code. Here’s how I executed it.

Here is what the code looks like at the start:

use std::net::{IpAddr, SocketAddr, UdpSocket};
use std::str::FromStr;

fn main() -> std::io::Result<()> {
    let local_ip4 = IpAddr::from_str("0.0.0.0").unwrap();
    let listen4_port: u16 = 67;
    let socket = UdpSocket::bind(&SocketAddr::new(local_ip4, listen4_port))?;
    socket.set_broadcast(true).expect("set_broadcast call failed");
    // Receives a single datagram message on the socket. If `buf` is too small to hold
    // the message, it will be cut off.
    let mut buf = [0; 300];
    let (_amt, _src) = socket.recv_from(&mut buf)?;
    let boot_packet = build_boot_packet(&buf);

    println!("packet received");
    println!("opcode      = {0}", boot_packet.opcode);
    println!("hwtype      = {0}", boot_packet.hwtype);
    println!("hw addr len = {0}", boot_packet.hw_addr_len);
    println!("hop count   = {0}", boot_packet.hop_count);
    println!("txn_id      = {:x}", boot_packet.txn_id);
    println!("num_secs    = {:}", boot_packet.num_secs);
    println!("ips {0} {1} {2} {3}",
             boot_packet.client_ip, boot_packet.your_ip,
             boot_packet.server_ip,  boot_packet.gateway_ip);
    println!("Mac Addr:   = {:}", boot_packet.client_mac);
    Ok(())
}

My next task is to modify the data in the packet in order to send it back to the client. I know that I am going to want to log the data prior to writing it to the socket. That means I am going to end up needing those same log lines, possibly with additional entries.

I also know that I am going to want to see the difference when logging the packets. So…I decide first to implement the Copy trait on my structure, and make a duplicate of the packet so I can see the before and after.

        let boot_request = build_boot_packet(&buf);
        let mut boot_packet = boot_request;

As always: run the code after making this change. I don't have automated unit tests for this yet, as all I am doing is reading data off the wire. That points in the direction my tests should go, and I will do that shortly.

For now, I spin up a virtual machine that makes a DHCP request, and I make sure that I can still see the log data.
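Making that duplicate a copy rather than a move requires BootPacket, and every field in it, to implement Copy. A minimal sketch with hypothetical, trimmed-down fields (the real struct has more):

```rust
// Deriving Copy (and Clone, which Copy requires) means
// `let mut boot_packet = boot_request;` copies the value
// instead of moving it, so both bindings stay usable.
#[derive(Clone, Copy)]
struct BootPacket {
    opcode: u8,
    hwtype: u8,
    hw_addr_len: u8,
    hop_count: u8,
    txn_id: u32,
    num_secs: u16,
    client_mac: [u8; 6],
}
```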

Now, to extract the logging code, I first wrap that section in a local function. Make sure to call the function after it is defined:

        fn log_packet(boot_packet: &BootPacket) {
            println!("packet received");
            println!("opcode      = {0}", boot_packet.opcode);
            println!("hwtype      = {0}", boot_packet.hwtype);
            println!("hw addr len = {0}", boot_packet.hw_addr_len);
            println!("hop count   = {0}", boot_packet.hop_count);
            println!("txn_id      = {:x}", boot_packet.txn_id);
            println!("num_secs    = {:}", boot_packet.num_secs);
            println!("ips {0} {1} {2} {3}",
                     boot_packet.client_ip, boot_packet.your_ip,
                     boot_packet.server_ip,  boot_packet.gateway_ip);
            println!("Mac Addr:   = {:}", boot_packet.client_mac);
        }
        log_packet(&boot_packet);

Again, run the test. If it runs successfully, continue on to moving the log_packet function up to the file level namespace. Run the tests.

The above steps are the main technique, and can be used to extract a function. I am going to take it one step further and convert it to a method implemented on struct BootPacket. The final code looks like this:

impl BootPacket {
    fn log_packet(&self){
        println!("packet received");
        println!("opcode      = {0}", self.opcode);
        println!("hwtype      = {0}", self.hwtype);
        println!("hw addr len = {0}", self.hw_addr_len);
        println!("hop count   = {0}", self.hop_count);
        println!("txn_id      = {:x}", self.txn_id);
        println!("num_secs    = {:}", self.num_secs);
        println!("ips {0} {1} {2} {3}",
                 self.client_ip, self.your_ip,
                 self.server_ip,  self.gateway_ip);
        println!("Mac Addr:   = {:}", self.client_mac);
    }
}

Episode 213 – Security Signals: What are you telling the world

Posted by Josh Bressers on September 07, 2020 01:57 AM

Josh and Kurt talk about how your actions can tell the world if you actually take security seriously. We frame the discussion in the context of Slack paying a very low bug bounty and discover some ways we can look at Slack and decide if they do indeed take our security very seriously.

<audio class="wp-audio-shortcode" controls="controls" id="audio-1918-8" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_213_Security_Signals_What_are_you_telling_the_world.mp3?_=8" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_213_Security_Signals_What_are_you_telling_the_world.mp3</audio>

Show Notes

Swinging eighth notes via articulation

Posted by Adam Young on September 03, 2020 10:44 PM

Someone asked (on Reddit) how to swing eighth notes. Here’s my response, somewhat cleaned up and expanded.

Don’t. If you try to swing them, they will be overdone.
Play straight eighth notes. The “swing” is done via articulation.

You want to think “forward motion” http://www.halgalper.com/articles/understandingforwardmotion/

But the short version is that, in 4/4 time, the on beats are slurred and the off beats (the ‘and’) gets the tongue. Example:

X:1
T:Swing via articulation
M:4/4
C:Adam
K:C
L: 1/8
%%MIDI program 67
|LC (LDE) (L^F =F) (LGA) (L=B|G) (L^F=F) (L=ED) (LCD) L^D|

Say you have a measure with an eighth rest and then seven eighth notes. Play the eighth notes with the articulation pattern: Tee-AH. So

rest tee-ah tee-ah tee-ah tah.

If the measure starts with an eighth note, say tah. So a full eight eighth notes is articulated:

tah tee-ah tee-ah tee-ah tah.

If the line crosses the measure boundary, don’t stop the music. For 8 beats, you get:

tah tee-ah tee-ah tee-ah tee-ah tee-ah tee-ah tee-ah tah.

If there is an eighth-note pickup, tongue it and slur the one:

tee | ah tee ah ….

But play the eighth notes as straight as possible. When you get up to speed, it is impossible to swing them via duration.


Podman login to a secured registry

Posted by Adam Young on September 01, 2020 10:15 PM

I took the container registry I ran via podman and put it behind an Apache HTTPD instance secured with mod_ssl. Now when I try to log in to it, I get:

error authenticating creds for “nuzleaf.home.younglogic.net”: error pinging docker registry nuzleaf.home.younglogic.net: invalid status code from registry 403 (Forbidden)

Here’s my debugging notes.

To get more information, I rerun the login command with the debug flag:

podman login  -u ayoung  -p `cat password`   nuzleaf.home.younglogic.net   --log-level debug

And I see this in the output:

DEBU[0000] Credentials not found 
DEBU[0000] Looking for TLS certificates and private keys in /etc/docker/certs.d/nuzleaf.home.younglogic.net 

If I try with the flag --tls-verify=false, things work:

$ podman login  -u ayoung  -p `cat password`   nuzleaf.home.younglogic.net      --tls-verify=false 
WARNING! Using --password via the cli is insecure. Please consider using --password-stdin
Login Succeeded!

Yes, that warning bothers me. One issue at a time.

And...there is this: https://github.com/golang/go/issues/40521 which shows that the TLS library in Go does not support the client authentication mechanism in TLS 1.3.

I'm going to leave the stuff below this in the page, as it was good debugging, but the short of it is that it won't work until Go is updated and podman is rebuilt against it.

Looking at the Man page for containers-certs.d(5) I see the following:

A certs directory can contain one or more files with the following extensions:
· *.crt files with this extensions will be interpreted as CA certificates

Hmmm...my machines are both IPA clients, and there should be no further certificate nonsense. I am assuming here the podman code is following the pattern from Docker. Let me try putting the IPA server CA cert into this directory.

curl -O https://nuzleaf.home.younglogic.net/ipa/config/ca.crt
sudo mkdir -p /etc/docker/certs.d/nuzleaf.home.younglogic.net
sudo mv ca.crt /etc/docker/certs.d/nuzleaf.home.younglogic.net/

While this does not solve my problem, it does have some effect. Rerunning with tls checking on, and debugging, I now see:

DEBU[0000] Returning credentials from /run/user/1000/containers/auth.json 
DEBU[0000] Looking for TLS certificates and private keys in /etc/docker/certs.d/nuzleaf.home.younglogic.net 
DEBU[0000]  crt: /etc/docker/certs.d/nuzleaf.home.younglogic.net/ca.crt 

Since I manage the server as well as the client, I can go over there to see what is happening. I use tail -f on /var/log/httpd/error_log to see the error specific to the login. It produces this output:

[Tue Sep 01 12:13:32.801607 2020] [ssl:error] [pid 131048:tid 140649565263616] [client 192.168.123.116:41734] AH: verify client post handshake
[Tue Sep 01 12:13:32.801681 2020] [ssl:error] [pid 131048:tid 140649565263616] [client 192.168.123.116:41734] AH10158: cannot perform post-handshake authentication
[Tue Sep 01 12:13:32.801710 2020] [ssl:error] [pid 131048:tid 140649565263616] SSL Library Error: error:14268117:SSL routines:SSL_verify_client_post_handshake:extension not received

Looking at the proxy config file in the /etc/httpd/conf.d/registry-proxy.conf:

ProxyRequests Off

<locationmatch>
    SSLOptions +StdEnvVars +ExportCertData +StrictRequire +OptRenegotiate
    SSLVerifyClient optional
    ProxyPassMatch http://localhost:4000
    ProxyPassReverse http://localhost:4000
</locationmatch>

So it looks like it is trying to validate the client certificate. Let's issue one and see what happens. I'm going to create a service principal for the client machine, request a cert for it, and use that cert to authenticate when I call to the registry.

The man page for containers-certs.d has the following layout specified:

             /etc/containers/certs.d/    <- Certificate directory
              └── my-registry.com:5000    <- Hostname:port
                 ├── client.cert          <- Client certificate
                 ├── client.key           <- Client key
                 └── ca.crt               <- Certificate authority that signed the registry certificate

Here is how I requested the certificate. Note that my user is an admin on the IPA server. I also need to set the SELinux context on the target directory so certmonger can write the key and certificate files.

$ kinit 
Password for ayoung@HOME.YOUNGLOGIC.NET: 
$ ipa service-add containerregistry/yamask.home.younglogic.net@HOME.YOUNGLOGIC.NET
--------------------------------------------------------------------------------
Added service "containerregistry/yamask.home.younglogic.net@HOME.YOUNGLOGIC.NET"
--------------------------------------------------------------------------------
  Principal name: containerregistry/yamask.home.younglogic.net@HOME.YOUNGLOGIC.NET
  Principal alias: containerregistry/yamask.home.younglogic.net@HOME.YOUNGLOGIC.NET
  Managed by: yamask.home.younglogic.net
$ sudo semanage fcontext -a   -t cert_t /etc/docker/certs.d/nuzleaf.home.younglogic.net/
$ sudo restorecon /etc/docker/certs.d/nuzleaf.home.younglogic.net/
$ sudo ipa-getcert request -f /etc/docker/certs.d/nuzleaf.home.younglogic.net/client.cert -k /etc/docker/certs.d/nuzleaf.home.younglogic.net/client.key --principal=containerregistry/yamask.home.younglogic.net@HOME.YOUNGLOGIC.NET

Let's take a look at the files to see if there is anything there:

$ ls -lZ /etc/docker/certs.d/nuzleaf.home.younglogic.net/
total 12
-rw-rw-r--. 1 ayoung ayoung unconfined_u:object_r:user_home_t:s0 1667 Sep  1 12:05 ca.crt
-rw-------. 1 root   root   system_u:object_r:cert_t:s0          1996 Sep  1 12:40 client.cert
-rw-------. 1 root   root   system_u:object_r:cert_t:s0          1704 Sep  1 12:40 client.key

And lets try again:

$ podman login  -u ayoung  -p `cat password`   nuzleaf.home.younglogic.net   --log-level debug
...
DEBU[0000]  cert: /etc/docker/certs.d/nuzleaf.home.younglogic.net/client.cert 
ERRO[0000] error authenticating creds for "nuzleaf.home.younglogic.net": error creating new docker client: open /etc/docker/certs.d/nuzleaf.home.younglogic.net/client.cert: permission denied 

Closer. So it is trying to read something....these next two steps are not secure, but included for debugging. Specifically, the key file should NEVER be made world readable.

$ sudo chmod a+r /etc/docker/certs.d/nuzleaf.home.younglogic.net/client.key
$ sudo chmod a+r /etc/docker/certs.d/nuzleaf.home.younglogic.net/client.cert

No love: still get a 403 from the server. However, I do notice this:

DEBU[0000] Returning credentials from /run/user/1000/containers/auth.json 

And in there I do, in fact, see an entry for my server:

$ cat /run/user/1000/containers/auth.json
{
	"auths": {
		"nuzleaf.home.younglogic.net": {
			"auth": "YXlvdW5nOldvbGZob3VuZHM0LTI3"
		},
		"nuzleaf.home.younglogic.net:4000": {
			"auth": "YXlvdW5nOldvbGZob3VuZHM0LTI3"
		}
	}
}

We take security seriously, VERY SRSLY!

Posted by Josh Bressers on August 31, 2020 12:54 PM

Every company tells you they take security seriously. Some even take it very seriously. But do they? I started to think about this because of a recent Slack bug. I think there are a lot of interesting things we can look at to decide if a company is taking security seriously or if the company thinks security is just a PR problem. I’m going to call the behavior we want to look at “security signals”.

On August 28, 2020 a remote code execution (RCE) bug was made public from the Slack bug bounty program. This bug really got me thinking about how a company or project signals their security maturity to the world. Unless you’ve been living under a rock, you know Slack has become one of the most popular communication platforms of the modern era, so one would assume Slack has some of the best security around. It turns out the security signals from Slack are pretty bad.

Let’s first focus on the Slack public bug bounty. Having a bug bounty is often a good sign that you consider security important. In the case of Slack, however, it’s a signal against them. Their bounties are comically low.


To put this $1500 critical into context, Mozilla has a critical bounty of $10,000. Twitter has $20,000. I would expect something that is critical infrastructure to be much higher. The security signal here is security isn’t taken very seriously. It’s likely someone managed to convince leadership they need a bug bounty, but didn’t manage to convince anyone to properly staff and fund it. Not having a bug bounty is better than one that is being starved of resources.

The next thing I look for is a security landing page. In the case of Slack it’s https://slack.com/security. That’s a great address! The page doesn’t explicitly tell us they take security seriously so we’ll have to figure it out on our own. They do list their vast array of compliance certifications, but we all know compliance and security are not the same thing. The page has no instructions on how to report security vulnerabilities. No link to the bug bounty program. There are no security advisories. There are links to whitepapers. It looks like this page is owned by marketing. Once again we have a security signal that tells us the security team is not in charge.

Next I looked at how many CVE IDs have been assigned to Slack products. I could only find one: CVE-2020-11498. If I search slack.com for this CVE, I can’t find it listed anywhere. Slack has a number of libraries and a desktop client. If they took security seriously, they would have CVE IDs for their client so their customers can track their risk. I tried to find security advisories on the Slack web site and found a page owned by their legal team that does link to the bug bounty program. On the plus side that page does say “We take the security of your data very seriously at Slack.” They’re not just serious, they’re VERY serious!

I could continue looking for signals from Slack, but I think it’s more useful to focus on how this can happen and what can be done about it. Plenty of people have discussed not letting marketing, or PR, or legal, or some other department be in charge of your security. This is of course easier said than done. A great deal of these problems are the result of the security team not understanding how to properly communicate with the business people. Security is not an end in itself; it has to exist as part of the larger strategy.

Signaling Security

The single most important thing you need to learn is never NEVER try to scare people. Nobody likes to be scared. If you pull the “if you don’t do this, monsters will eat the children” line, you’ve already lost. Fear does not work. If you have to, write “FEAR DOES NOT WORK” on your wall in huge red letters.

Now that we have that out of the way, let’s talk about what a bug bounty amount tells us. The idea behind a bug bounty isn’t to entice researchers to report issues to you instead of the black market. Anyone willing to sell exploits will always make more selling them illegally than you will pay. If you pay nothing or a terribly small amount you will still get reports for things people accidentally find.

The purpose of the bounty is to entice a security researcher into spending some time on your project. Good bugs take time, and the more time a researcher spends, the more good bugs they can find. It’s a bit circular. Think of these people as contract workers for a project you don’t even know about yet. The other thing you have to understand is that if your bounty amounts are too low, it either means you couldn’t get a proper budget for your program, or you have so many critical bugs that a critical has no value. Neither is a good signal.

How do you get enough budget for a proper bug bounty program? You have to frame a bug bounty as something that can enable business. A bug bounty seen as only a cost will always have to battle for funding. A proper bug bounty program signals to the world you take security seriously (you might not even have to tell everyone how serious you take it). It gives customers a sense of calm that shows you’re more than just some compliance certificates. It helps you gain unexpected contractors to help you find and fix bugs you may never find on your own. If you run the numbers it will be drastically more expensive to let one of your full time people hunt for bugs, and you still have to pay them if they don’t find anything!

What about having a security contact page? It’s a great tool for marketing to convince security-minded customers how seriously you take security! This again comes back to the business value of having a proper security landing page. The answer will be very cultural. If your business values technical users, you use this as a way to show how smart your security people are. If you value transparency, you show off how open and honest your security is. This is one of those places where you have to understand what your company considers business value. If you don’t know this, that’s going to be your single biggest problem to solve before worrying about any other security problems. Make your security landing page reflect the business strategy and culture of your company.

What about CVE IDs? Should your company issue CVE IDs for your products? I would reframe this one a little bit. CVE IDs are great, but they’re really the second step in publishing security advisories: advisories first, CVEs second. Security advisories are a way to communicate important security details to customers. Advisories show customers you have a competent security team on the inside. Good security advisories are hard. If you have zero, why is that? No product has zero security vulnerabilities.

There is an external value to having a team responsible for advisories. If you have a product, your customers are running 3rd party scanners against it. It’s the cool new thing to do. Do you know what those scans look like? Are you staying ahead of the scanners? Are you publishing details for customers about what you’re fixing? Security advisories aren’t the goal; the goal is to have a product that has someone keeping ahead of industry standard security. The industry standard says security advisories and 3rd party scanners are now table stakes.

This post is already pretty long and I don’t want to make it any longer; every one of these paragraphs could be a series of blog posts on its own. The real takeaway here is that security needs to work with the business. Show how you make the product better and help with the bottom line. If security is seen only as a cost, it will always be a battle to get the funding needed. It’s not up to the business leaders to figure out security can help the bottom line, it’s up to security leadership to show it. If you just expect everyone else to make security a big deal, you’re going to end up with marketing owning your security landing page.

And I want to end on an important concept that is often overlooked. If there is something you do that you can’t prove adds to the bottom line, you need to stop doing it. Focus on the work that matters, eschew the work that doesn’t. That’s how you take security seriously, not by putting it on a web site.

Episode 212 – Grab Bag: The Security We Deserve Edition

Posted by Josh Bressers on August 31, 2020 12:01 AM

Josh and Kurt talk about Chromium sending traffic to root DNS servers. Telemetry watching what we do. Cryptocurrency scams and a few other random topics. Also pandas.

<audio class="wp-audio-shortcode" controls="controls" id="audio-1911-8" preload="none" style="width: 100%;"><source src="https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_212_Grab_Bag_The_Security_We_Deserve_Edition.mp3?_=8" type="audio/mpeg">https://traffic.libsyn.com/secure/opensourcesecuritypodcast/Episode_212_Grab_Bag_The_Security_We_Deserve_Edition.mp3</audio>

Show Notes

Dealing with reused Serial Numbers for CAs

Posted by Adam Young on August 27, 2020 06:03 PM

“An error occurred during a connection to nuzleaf.home.younglogic.net. You are attempting to import a cert with the same issuer/serial as an existing cert, but that is not the same cert.”

Many years ago I battled this problem and had different solutions. Today, I got one that worked for Firefox on Fedora 32.

My IPA server is for HOME.YOUNGLOGIC.NET

My backing store for my default profile in firefox is /home/ayoung/.mozilla/firefox/x4kktanr.default

To see the certificates I have in that store:

$ pwd
/home/ayoung/.mozilla/firefox/x4kktanr.default
$ certutil -L -d . | grep -i younglogic
adam.younglogic.com                                          ,,   
younglogic.com                                               ,,   
openshift.younglogic.info                                    ,,   
nuzleaf.home.younglogic.net                                  ,,   
Certificate Authority - HOME.YOUNGLOGIC.NET                  CT,C,

To delete that last one:

certutil -d . -D -n "Certificate Authority - HOME.YOUNGLOGIC.NET"

Then restart Firefox. It holds this in some other cache, perhaps memory, until a restart. Note that the file that backs the certificate database is: cert9.db. It is a SQLite database with a very unfriendly schema.
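The same list-then-delete cycle can be rehearsed against a scratch NSS database instead of a live Firefox profile. This is a sketch: the “Example CA” nickname is made up, and certutil comes from the nss-tools package on Fedora.

```shell
# Create an empty NSS database (cert9.db and friends) in a temp directory.
DBDIR="$(mktemp -d)"
certutil -N -d "sql:$DBDIR" --empty-password

# Generate and import a self-signed CA under an example nickname.
# -z needs any file as an entropy seed; /etc/passwd exists everywhere.
certutil -S -d "sql:$DBDIR" -n "Example CA" -s "CN=Example CA" \
         -x -t "CT,C," -m 1 -z /etc/passwd

certutil -L -d "sql:$DBDIR"                 # lists "Example CA"

# Delete it by nickname, exactly as with the Firefox profile store.
certutil -D -d "sql:$DBDIR" -n "Example CA"
certutil -L -d "sql:$DBDIR"                 # nickname is gone
```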

Running git and gitweb in a container with Fedora

Posted by Adam Young on August 24, 2020 03:51 PM

There are many reasons to run a web service in a container. One of the remote services I rely on most heavily is git. While git local operations are fine in a global namespace, running a shared git repository on a remote server is a web-service based use case. There are three protocols used most commonly to remotely access git: git, ssh, and https. I am going to focus on the last one here.

In a previous post I ran git and gitweb as Web services directly on my laptop. Then I got push access to work. Now I want to do the same thing, but in a container. The directory that stores the repository files is going to be mounted into the container so that the repositories survive a restart.

We are going to use the same git.conf and gitweb.conf files as we used for the base OS, but we will put them into the container image via a Dockerfile.

FROM fedora:latest
#registry.fedoraproject.org/f29/httpd
USER root
LABEL maintainer="Adam Young"
# Update image
RUN yum -y install httpd
RUN yum -y install gitweb
COPY git.conf /etc/httpd/conf.d/
RUN mkdir -p /var/www/git

EXPOSE 80
# Start the service
CMD ["-D", "FOREGROUND"]
ENTRYPOINT ["/usr/sbin/httpd"]

To build the container image, I used buildah:

buildah bud -t ayoung/gitserver

And to run it:

podman run  -p 9999:80/tcp  --mount type=bind,source=/var/lib/git,destination=/var/lib/git --rm -it localhost/ayoung/gitserver  

Note that I run from a very non-standard port: 9999. To clone from this server:

git clone http://gitserver:9999/repo/gitserver.repo
 

Note that I have put an entry into /etc/hosts for gitserver.
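That hosts entry is just a name-to-address mapping; it looks something like this (the IP address is an example for whatever machine runs the container):

```
192.168.1.10    gitserver
```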

Make a change and trying to push back to the server gives me an error:

$ git push
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 347 bytes | 347.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
error: remote unpack failed: unable to create temporary object directory
To http://gitserver:9999/repo/gitserver.repo
 ! [remote rejected] master -> master (unpacker error)
error: failed to push some refs to 'http://gitserver:9999/repo/gitserver.repo'
 

This happens even when running with SELinux in permissive mode. It is because the user writing the file has a different ID than the ayoung user that owns the repo. If I make the repo world writable:

$ cd /var/lib/git/
$ sudo chmod -R a+w .
$ cd -
$ git push origin HEAD:test
Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 347 bytes | 347.00 KiB/s, done.
Total 3 (delta 1), reused 0 (delta 0), pack-reused 0
To http://gitserver:9999/repo/gitserver.repo
 * [new branch]      HEAD -> test

I can successfully push to the remote repository.
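A narrower alternative to the blanket chmod is git’s own shared-repository support, which tells git to create new objects with relaxed permissions. This is a sketch against a throwaway repo; the path and the 0666 mode are examples, and the setting does not retroactively loosen files that already exist.

```shell
# Create a bare repo whose new files git will make world-writable (mode 0666).
REPO="$(mktemp -d)/shared.repo"
git init --bare --shared=0666 "$REPO"

# git records the choice in the repository config for future object writes.
git config --file "$REPO/config" core.sharedRepository
```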

Both this change and the config change in the last post are not how I would want to run in production, but these are interim steps toward what I really want: running the git server in OpenShift. We’ll get there.

Enabling Write access to git-http-backend

Posted by Adam Young on August 24, 2020 03:31 PM

While I was able to read/clone a git repository self-hosted with HTTPD and git-http-backend, I found that I could not push to it. Here’s how I fixed it.

The directions say this:

By default, only the upload-pack service is enabled, which serves git fetch-pack and git ls-remote clients, which are invoked from git fetch, git pull, and git clone. If the client is authenticated, the receive-pack service is enabled, which serves git send-pack clients, which is invoked from git push.

https://git-scm.com/docs/git-http-backend#_description

It goes on to say this:

These services can be enabled/disabled using the per-repository configuration file:

https://git-scm.com/docs/git-http-backend#_services

It turns out it was talking about the standard git configuration file, and not something web specific. Since I created a bare repository named gitserver.repo, this file is /var/lib/git/gitserver.repo/config. I added the following content, and was then able to push:

[http]
        receivepack = true
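The same setting can be made with git config instead of editing the file by hand. A sketch against a throwaway bare repo; on the server the real path would be the repository’s own config file, /var/lib/git/gitserver.repo/config.

```shell
# Create a scratch bare repo and enable the receive-pack service over HTTP.
REPO="$(mktemp -d)/gitserver.repo"
git init --bare "$REPO"
git config --file "$REPO/config" http.receivepack true

# Confirm the value git-http-backend will see.
git config --file "$REPO/config" http.receivepack   # prints: true
```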