The Unhappy Lambda

Posted on Jun 4.

tl;dr: Progress on The Happy Lambda is currently blocked, because I didn't think it through before starting, and because functional programming in Ruby sucks.

It's over half a year now since I started this project. I've gotten a lot of support since then, and I'm very grateful for that. However, progress has been lagging lately. I still intend to finish the book, but I need to take a step back and rethink some assumptions.

I'm much more a Rubyist than a functional programmer, however I've dabbled with Haskell and a few LISPs, and I find a lot to like there. When I started work on Hexp I decided to make all the core types immutable. This forced me to do certain things in a functional style, and I came up with some interesting patterns and techniques along the way. Parallel to that I spent time toying around trying to bring more ideas from functional languages to Ruby, getting creative with lambdas, that sort of thing. I combined all this in a lightning talk and delivered it at a few conferences, and people were enthusiastic. So on the long train ride from ArrrrCamp back to Berlin I started writing The Happy Lambda.

My original scope for the book was roughly this

  • explain and demystify FP terms and concepts
  • a “practical” part with patterns and tips on how to use those FP concepts to write better code
  • an “experimental” part where I use Ruby's malleability to make it resemble as much a functional programming language as possible

I've written little bits of all three parts, but now I find I'm unsure how to continue. What I really didn't think through enough is who this book is for, and what the book's goal is.

One of the things I had in mind originally was that I would teach functional programming using Ruby, so you could learn all that stuff “in the comfort of your own home”, your familiar language, so to speak. The thing is that Ruby makes doing stuff in a functional way very complicated. I'll get to that in a bit, but basically there is a lot of syntax that just works against you. So while it's possible to explain and showcase a lot of the concepts in Ruby, it just seems really pointless. It's hard to demonstrate the benefits of a technique when, taken at face value, all it does is make your code more cryptic.

I also tried to make the book very beginner friendly, assuming nothing but a basic knowledge of Ruby. But even for beginners, maybe especially for beginners, it makes more sense to explain laziness, partial application, functional composition, etc, in a language that makes these things elegant. So if one of the book's goals is teaching these concepts to people that never came across them, maybe I should be using Haskell or Clojure or ML for my code samples.

So what is it that sucks so much? Well let's see, if FP is about functions, and the elegance of treating functions as first class citizens, then Ruby surely offers us even more than we could wish for (note my sarcasm). Instead of the unifying concept of a function, and the simple mechanism of passing functions around, we have procs, lambdas, methods and method objects. Three of these are kind of like first class functions. Each is very much unlike the other. We have a million different syntactic constructs to arrive at them, and two separate mechanisms for passing a function-like-thing to another function-like-thing.
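
If you haven't kept the whole menagerie straight, here's a quick illustration in plain Ruby of the four function-like things and the two passing mechanisms (nothing book-specific, just the basics):

double_proc   = Proc.new { |x| x * 2 }   # a proc
double_lambda = ->(x) { x * 2 }          # a lambda
def double(x); x * 2; end                # a method
double_method = method(:double)          # a Method object

double_proc.call(3)     # => 6
double_lambda.(3)       # => 6
double(3)               # => 6
double_method.call(3)   # => 6

# Passing them around: either as a block...
[1, 2, 3].map { |x| x * 2 }      # => [2, 4, 6]
[1, 2, 3].map(&double_method)    # => [2, 4, 6]  (coerced into a block with &)
# ...or as a regular argument that the receiver has to .call explicitly.
def apply_twice(f, x); f.call(f.call(x)); end
apply_twice(double_lambda, 3)    # => 12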

On the other hand we have very few general-purpose higher-order functions, unless you count the stuff in Enumerable. There's not even a general purpose “compose” (f comes after g). Let alone a “juxtapose”, “bind”, whatever. Procs and lambdas have “curry”, except that it throws together currying and partial application, has pretty funny semantics when used on varargs, and is described by core developers as an “easter egg”. But it's better than nothing. A patch to add “curry” to Method, so at least these three function-like-things are a little more like one another, hasn't received any feedback more than a month later. Which just goes to show that while the situation is bad, no one in a position to do something about it gives a single fuck.
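
To make that concrete, here's the kind of thing you end up hand-rolling; a rough sketch, not library code I'd recommend as-is:

# No built-in compose, so roll your own: compose(f, g) is "f after g"
compose = ->(f, g) { ->(x) { f.(g.(x)) } }

inc    = ->(x) { x + 1 }
double = ->(x) { x * 2 }
compose.(double, inc).(3)   # => 8, i.e. double(inc(3))

# curry / partial application exists on procs and lambdas...
add = ->(a, b, c) { a + b + c }
add.curry[1][2][3]      # => 6
add.curry.(1).(2, 3)    # => 6

# ...but not on Method objects, you have to convert first:
def add3(a, b, c); a + b + c; end
method(:add3).to_proc.curry[1][2][3]   # => 6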

Another small but annoying thing you've probably never noticed until you start making your classes immutable: the foo.bar=baz syntax is completely unusable. No matter what a method ending in = returns, the value of the expression is always what was passed in. So the only side-effect free method you can write that ends in = is the identity function. A persistent hashmap structure can't use hashmap[:foo]=:bar, it has to use something like hashmap.put(:foo, :bar). Sure it's a tiny thing, but it's the straw that breaks my pseudo-functional Ruby's back.
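
If you've never run into this, here's a two-minute demonstration:

class Box
  def contents=(value)
    :whatever_i_return_here   # makes no difference whatsoever
  end
end

result = (Box.new.contents = :gold)
result   # => :gold, an assignment expression always evaluates to the assigned value

# So a persistent map can't return a new map from map[:foo] = :bar,
# it's stuck with something like: new_map = map.put(:foo, :bar)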

So now you're sitting back and grinning thinking, “but Arne, please, surely you could have realized all of that earlier”. And fair enough, I chose Ruby+functional for the topic of this book even though it's not the most natural fit, because I felt there was something there that was worth exploring. And I still do.

I'm still undecided how much of the stuff you do with functions in an FP language can be easily brought to Ruby, but apart from functions FP brings something else to the table, something maybe even more important: values (the “immutable” is implied).

FP is a collection of techniques that together have some interesting emergent properties, but that doesn't mean it's all or nothing. And while some will say that “mostly functional” programming does not work, I think getting used to building systems based on value semantics is something all programmers should be doing, and it's probably the biggest lesson Ruby can take home from the functional world. If you're not following, please go watch Rich Hickey's excellent The Value of Values.

I'm clearly not the only one reasoning in this direction. To have value semantics of composed data types without sacrificing too much in terms of performance you need good persistent data types, and several projects are underway to bring these to Ruby. I've started a humble effort to coordinate and align these efforts by having shared specs and benchmarks.

So maybe that should form the core of the book? Less about functions and lambdas, more about values? It's certainly more practical advice than trying to write lisp-with-ruby-syntax. Except there's no implementation of persistent data structures I would recommend for use in a production setting today, so how pragmatic are we talking, really? But yeah, maybe I should start from the “value object” section I already have and turn that into half a book, show how it composes into bigger systems. Demonstrate helpful gems like Anima, Adamantium, Equalizer. Show step by step how to implement a cons based list, a hash array mapped trie, a zipper, that kind of thing.
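
To give an idea of the kind of material I mean, this is roughly where the cons list part would start (a bare-bones sketch, not the code that would end up in the book):

# A minimal immutable cons-based list, the "hello world" of persistent data structures
class Cons
  include Enumerable
  attr_reader :head, :tail

  def initialize(head, tail = nil)
    @head = head
    @tail = tail
    freeze
  end

  # "adding" an element returns a new list, the old one is untouched and can be shared
  def cons(value)
    Cons.new(value, self)
  end

  def each
    node = self
    while node
      yield node.head
      node = node.tail
    end
  end
end

list   = Cons.new(1)
longer = list.cons(2).cons(3)
longer.to_a   # => [3, 2, 1]
list.to_a     # => [1], the original is unchanged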

And for the “functional” stuff, I think for that there's a bright (or at least dimly lit) future as well. I've been trying stuff out for over a year, for Hexp, for Yaks, for other projects. I bundled a bunch of utility functions here, but I can't say I've found a sweet spot of expressive syntax just yet.

Conclusion: I need to clarify who this book is for, and what it tries to achieve. I need to write an outline, and basically (almost) start from scratch. I've also been doing a lot of traveling the past year, which has really cut into my productivity. I will be back in Berlin in a few days and plan to travel a lot less the coming months. I also went from contracting five days a week to four days a week. All of that means that I should have some time on my hands. I intend to start working a bit harder on my open source projects, especially Hexp and Yaks, and also on RubyDataSpec, which should indirectly keep me involved in Hamster, Persistent (working title) and Clojr. Hopefully I can then return to writing on the Happy Lambda with more experience, a better battle plan, and higher confidence.

Thanks for listening.

Rails is No Longer Alone

Posted on May 16.

This is an article I've been meaning to write for a while. I was pushed over the edge by Adam Hawkins' post Fragmentation in the Ruby Community. Adam writes that fragmentation in the community is accelerating, and that Rails is a major fault line. I'd like to add some context and my personal view on the Rails vs “pure Ruby” debate.

I don't remember exactly the first time I came across Ruby. It must have been 2005 or 2006, and it most certainly was because of Rails.

Rails was revolutionary at the time. It combines a powerful and pleasant language with a pragmatic “rapid application development” approach. Getting so much from doing so little was truly a breakthrough. And Rails is still one of the most complete solutions out there. But it has its flaws.

When talking about Rails we have to remember the time it stems from. Rails chose a language that not many people were using at the time, and Ruby's ecosystem was tiny compared to what is out there now. There was no Rack, no Nokogiri, no RSpec. Chances of already being a Ruby programmer and then moving on to Rails were small. Instead people came to Rails from Java, C++, Perl, Python. They learned “Rails Ruby”, the distinction didn't matter.

There also wasn't a whole lot available in the Ruby world to build upon when it came to web programming. People were using the standard lib's CGI module, that's pretty much as far as it went.

These circumstances make some choices that Rails made very reasonable. The Ruby language explicitly allows extending core classes, so why not use that capability to make it even easier, even better? There weren't that many third party libs available, so interop was only of limited concern. In fact most third party stuff that came later was built explicitly for Rails, so those libs would take care of not conflicting with Rails, with ActiveSupport.

The ability to reopen classes was in turn embraced by much that came after Rails, extending and changing Rails classes to make them “even better”. It wasn't too hard to see this style of development would lead to a mess. But hey, it worked, startups shipped and made money. Life was good.

Looking at it from a 2005 perspective also explains why Rails has this “everything but the kitchen sink” approach. Rails was ambitious from the start, trying to solve as many challenges of web development as it possibly could. And since there was very little to build upon, it had to do all of that itself. This explains the strong “Not Invented Here” tendencies of the Rails developers. And the fact that “adding tech available in gems to core” is an official policy.

In short, Rails assumes it is alone in the world, whereas in fact it no longer is. We now have a rich Ruby ecosystem. Building on top of that, using a modular approach, promoting libraries over frameworks, having lean, focused components each with their own maintainers. Libraries that stick to their own namespace (sorry, major pet peeve there). That's how I envision a healthy Ruby ecosystem. I think it's what many people hanker for. It's what I see in initiatives like micro-rb.

I think what turns people away from Rails is basically this, the accidental complexity that grew out of being an island. The ball of mud that is ActiveSupport. I don't see this getting better. But I do see a vibrant community of people and projects with a different vision, one that grows by the day. One that values high quality Ruby code, and high quality Ruby implementations. The future is bright. Yay Ruby!

HTML and URIs in Javascript

Posted on May 14.

I have written and spoken a few times about the perils of string arithmetic on formal data. Those talks were focused on theory and fundamentals; in this post you'll get very practical tips.

When programming for the web there are two types of formal data you'll come across All The Time: HTML and URIs. These formats have well specified structure and semantics, so that machines can unambiguously generate and consume them. Don't try to do what the machine does better, or you'll shoot yourself in the foot.

URIs

This one is easy: use URI.js. It is unfortunate that browsers don't have built-in APIs to deal with URIs in a sane way, but URI.js really gives you all you need.

Some simple examples

// bad
window.location.origin + '/foo/bar'
//good
URI('/foo/bar').absoluteTo(window.location.origin).toString()

// bad
uri = 'http://example.com/posts/' + escapeURI(postId) + '/comments/' + escapeURI(commentId)
// good
uri = URI.expand('http://example.com/posts/{pid}/comments/{cid}', {pid: postId, cid: commentId})

// complete example from the README
URI("http://example.org/foo.html?hello=world")
  .username("rodneyrehm")
    // -> http://rodneyrehm@example.org/foo.html?hello=world
  .username("")
    // -> http://example.org/foo.html?hello=world
  .directory("bar")
    // -> http://example.org/bar/foo.html?hello=world
  .suffix("xml")
    // -> http://example.org/bar/foo.xml?hello=world
  .query("")
    // -> http://example.org/bar/foo.xml
  .tld("com")
    // -> http://example.com/bar/foo.xml
  .query({ foo: "bar", hello: ["world", "mars"] });

There are tons of edge cases that this covers that your naive let's-mash-some-strings-together code does not, including proper escaping.

Update

I should have mentioned this earlier: URI templates are actually an RFC-standardized mechanism for building and recognizing URIs. This is what URI.expand above is based on. It's a very useful and underused mechanism.

HTML

In contrast to URIs, browsers do come with a sane API for building HTML, it's called the DOM (Document Object Model) API.

var divNode  = document.createElement("div");
var textNode = document.createTextNode("We all live in happy HTML! &<>");
divNode.appendChild(textNode);
document.body.appendChild(divNode);

So that's great, except that no one wants to actually write code like that, so people end up committing atrocities like setting innerHTML with the tagsoup of the day. Notice though how this version has already eliminated the need to manually call escape functions.

The highly informative MDN article DOM Building and HTML Insertion has some great tips, for instance a handy jsonToDOM function.

The implementation there is already quite clever, allowing one to set event handlers in one go. Since this article is meant for people building browser extensions, it also has some XUL stuff that's not relevant when programming for the web.

document.body.appendChild(jsonToDOM(
  ["div", {},
    ["a", { href: href, onclick: function() { } }, text])));

Great idea, and with some tweaking very useful in a browser context. But chances are you're already using jQuery, in which case I have good news for you: jQuery has everything covered!

var divNode = $('<div>', {class: 'my-div'}).append($('<a>', {href: '..'}));

The $('<tag>', {attributes}) syntax provides an easy way to build DOM objects. The result is a jQuery object. You'll have to unwrap it to get to the actual DOM element.

var domNode = divNode[0];

You might want to convert this to an HTML string now. In that case it's highly likely you're doing it wrong, but there are some cases where this is actually legit, e.g. Ember.js Handlebars helpers don't allow returning DOM nodes. I assume this will change with HTMLBars.

In this case keep in mind that calling html() on the jQuery object will only return the inner HTML. You can get the full thing from the DOM node though.

var nodeHTML = divNode[0].outerHTML;

For example in Ember.js:

Ember.Handlebars.registerBoundHelper('linkToPost', function(postId) {
  var uri  = URI.expand('/posts/{id}', {id:  postId});
  var html = $('<a>', {href: uri, text: "goto post"})[0].outerHTML;
  return new Handlebars.SafeString(html);
});

Putting the two together

Take this simple function

function linkToPost(postId) {
  var uri = '/posts/' + encodeURI(postId);
  return '<a href="' + uri + '">goto post</a>';
}

The problem here is that there are two levels of interpretation going on. While the URI is correctly escaped in itself, when placing it in the context of HTML, in particular as an attribute value, there's extra escaping that needs to happen, so the value can't break out of the attribute (by including ' or ") or out of the HTML tag (by including > or <).

Escaping always depends on context, and when there are multiple levels of context the manual approach will inevitably fail. In short, if you find yourself:

  • writing HTML fragments inside strings ('<a href=…')
  • calling escape functions (e.g. for URI or HTML) manually

then ask yourself whether you can let some other component, one that knows the details of the language you're generating better than you do, do the work for you. Here's a corrected version of the above.

function linkToPost(postId) {
  var uri = URI.expand('/posts/{id}', {id:  postId});
  return $('<a>', {href: uri, text: "goto post"});
}

Finally

Browsers don't come with a function for manually escaping HTML. That is because you don't need it. Having it there might encourage bad practices and hence do more harm than good.

But as with everything there are exceptions. If you really need to escape HTML, and you're sure your use case is legit, there are a few options.

Let the browser do it for you:

var divNode  = document.createElement("div");
var textNode = document.createTextNode("We all live in happy HTML! &<>");
divNode.appendChild(textNode);
divNode.innerHTML // "We all live in happy HTML! &amp;&lt;&gt;"

Use Underscore.js

_.escape("We all live in happy HTML! &<>");
// "We all live in happy HTML!We all live in happy HTML! &amp;&lt;&gt;"

or copy any of the functions you find on the web. Make sure it escapes < > ' " &.

Syck vs Psych: Differences and Conversion

Posted on Feb 9.

YAML (rhymes with camel) is a data serialization format designed to be both human and machine readable. Its distinguishing features are the use of semantic whitespace, and support for a rich set of built-in and user defined types.

While not the inventor of the format, the first widespread implementation of YAML was written by Ruby-famous and now virtually deceased “Why the lucky stiff”. His C implementation, titled Syck, became part of the Ruby distribution, and got bindings to several other languages as well.

Later on the PyYAML project wrote their own “libyaml”, which better kept up with the evolving YAML specification, and has since become the implementation recommended as a reference by the YAML folks.

Aaron Patterson wrote bindings to libyaml, called Psych, which made it into Ruby 1.9.2. With the release of 1.9.3 Psych became the default YAMLer, although Syck was only removed in 2.0, so 1.9 users could still opt out of using the new version.

All of this is old news, why bring it up again? Two reasons: there are probably still quite a few systems running on Syck because of legacy YAML data, and I couldn't find any resource on the web describing the differences in behavior between Syck and Psych.

At Ticketsolve we have a number of database columns that contain serialized YAML, several dozen million records in total. This is why up to now we have kept using the old Syck. It's also the main reason we haven't migrated to Ruby 2.0 yet. So I set out to investigate how involved the change would be. I also thought that it would be a great opportunity to document the differences in behavior, based on a large enough real world data set.

The first question is: if we simply flip over and read our existing data with Psych, what would happen? It turned out that 0.016% of records would be interpreted differently. Not a lot in relative terms, but still thousands of records. These are the main differences we have found:

Representation of non-ASCII Strings

Syck dates from a time when Ruby was still blissfully unaware of string encodings. Strings were byte arrays, rather than character arrays. Interpretation of those bytes was left to the program. To prevent emitting invalid output when fed random binary gobbledygook, Syck will emit hexadecimal escape sequences for any higher order bytes (decimal value >= 128).

puts Syck.dump('utf8' => 'é', 'latin1' => 'é'.encode('ISO-8859-1'))
# ---
# utf8: "\xC3\xA9"
# latin1: "\xE9"

Psych will convert any strings it encounters to UTF-8, and then output actual UTF-8.

puts Psych.dump('utf8' => 'é', 'latin1' => 'é'.encode('ISO-8859-1'))
# ---
# utf8: é
# latin1: é

Interestingly, Syck will happily and correctly parse the UTF-8 version emitted by Psych.

This difference caused the majority of incompatible interpretations. But we found more!

Single vs Double Quotes

Syck and Psych seem to have different heuristics for when to pick single quotes, when to go for double quotes, and when to use “block text” syntax. It seems Syck pretty much always uses double quotes, whereas Psych will only switch to doubles when the string contains something like a newline which requires an escape sequence.

puts Syck.dump('\ ')
puts Psych.dump('\ ')
# --- "\\ "
# --- ! '\ '

puts Syck.dump("\n")
puts Psych.dump("\n")
# --- "\n"
# --- ! "\n"

Digging more into this I found one case where Psych output confuses Syck. When Psych uses double quotes, and needs to output multiple consecutive spaces right after a line break, it will start the new line with a backslash. When reading this Syck will read an actual backslash.

p data = {foo: {bar: {baz: 'foo '*18 + "   \n"}}}
p Syck.load(Psych.dump(data))
# {:foo=>{:bar=>{:baz=>"foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo    \n"}}}
# {:foo=>{:bar=>{:baz=>"foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo foo \\   \n"}}}

Long Hash Keys

When the key of a Hash is longer than 128 characters, Psych will put the key and value on separate lines, using special YAML prefixes (? and :) to indicate which is which.

puts Psych.dump('x'*129 => 'y')
# ---
# ? xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# : y

Documents Containing a Single Scalar

Psych will put an “end of document” marker consisting of three dots after a document consisting of a single string, symbol, number, boolean or null. This is a difference in how they generate YAML, although both parsers can correctly read the other's output in this case.

puts Psych.dump('foo')
# --- foo
# ...
p Syck.load(Psych.dump('foo'))
# "foo"

Another tiny change is that Syck will always put a space after the opening marker, while Psych does not.

p Psych.dump('abc' => 123)
p Syck.dump('abc' => 123)
# "---\nabc: 123\n"
# "--- \nabc: 123\n"

Red Herrings

Last year after a number of vulnerabilities concerning Rails and YAML were exposed we started using the 'safe_yaml' gem. The main feature of SafeYaml is that it allows you to whitelist the types of objects that can be created when deserializing YAML. A HashWithIndifferentAccess is probably fine, a Rails RouteSet might be more problematic.

SafeYaml will only install itself for the YAMLer that is active at the time it is loaded, so in our case it was active for Syck, not for Psych. I found differences in how timestamps with timezone information were handled, that went away when disabling SafeYaml. Similarly, data like the following, coming from YAML generated by as3yaml was being incorrectly interpreted because we hadn't whitelisted the !str tag.

---
:lat: !str 53.363665
:long: !str -8.02002
:zoom: !str 7

So the numbers were being interpreted as numbers, rather than strings.

Converting

The only way to be certain your data will be interpreted the same after the conversion as before is to read it with Syck, then dump it again with Psych. You'll have to do this on Ruby 1.9.3 so you have both implementations available. First make sure Syck is properly loaded by changing the YAML engine:

require 'yaml'
YAML::ENGINE.yamler = 'syck'

Now you can use Syck.load and Psych.dump. An earlier blog post I found on the topic tells you to switch engine, load the data, then switch again and dump it. If you do this in a tight loop you will find this to be horribly slow, so just use the constants directly.

Note that you basically need to stop the world while the conversion happens, then immediately afterwards start using Psych for everything. To limit the time this migration takes, I'm first generating a list of all primary keys that are affected. Remember this was only 0.016%, so that will be a big speedup, and you can prepare this beforehand.

ids = []   # primary keys of records Psych would interpret differently
records.each do |id, yaml|
  ids << id if Syck.load(yaml) != Psych.load(yaml)
end
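
The conversion itself is then just a matter of reading with Syck and writing back with Psych for the affected ids. Sketched here with a hypothetical ActiveRecord model and column name, adapt it to your own schema:

ids.each do |id|
  record = SerializedThing.find(id)         # hypothetical model
  data   = Syck.load(record.payload_yaml)   # read with the old interpretation...
  record.update_column(:payload_yaml, Psych.dump(data))   # ...write back the Psych version
end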

After converting these you only need to run the conversion for records that have been created or changed since scanning for differences.

The Future: JSON?

Having faced security and other issues with YAML, people are opting more and more for JSON. In fact we are also looking to eventually move this data to JSON. One thing to keep in mind is that JSON is significantly less powerful than YAML. Beyond objects and arrays, it can only represent strings, numbers, booleans and null. No timestamps, symbols, or other richer types. Some extensions do allow storing timestamps, but these are non-standard and not guaranteed to work in an inter-operable way.

We've come to realize that YAML probably isn't the best format for serializing data, and I think that's a good thing. YAML's most redeeming quality, in my humble opinion, is that it's great for data that is mostly written and managed by humans, like configuration files, where the strictness of JSON (trailing commas anyone?) can be an annoyance.

Conclusion

For such a “simple” format YAML has a surprising number of gotchas and subtleties. I've learned a lot about the format in conducting this exercise. Incompatible implementations of formal languages can be a real nuisance, but also a security liability. I highly recommend looking into some of the Langsec material if that stuff interests you.

Audio Compression for the Rest of Us

Posted on Jan 10.

It's the onset of summer, and tomorrow you'll be trotting off with your bff's to the biggest bestest music festival in decades. Full of anticipation you fall asleep… and suddenly find yourself transported to the festival grounds, the main act is just about to take the stage, and you're the one behind the mixing desk, making sure they sound amazing (p.s. this is a dream).

The thing is, you have a problem. The singer has an amazing voice, and she uses the full range of it. Not only from low to high notes (the frequency range), but also from really really quiet to really really loud (the dynamic range). Meanwhile the rest of the band is producing this steady wall of sound, so when her singing gets close to a whisper there's just no way anyone will hear her. And when the chorus starts and she starts screaming and hopping around it's deafening, and the sound of the band vanishes into the background.

So with a sense of duty you grab hold of the fader that controls the volume of her voice. You set the baseline volume loud enough so her whispers can be heard, and every time she gets really loud you pull the fader down a little bit, and then back up afterwards.

You're basically trying to reduce her “very very loud” into “very loud”, and her “very loud” into “just loud”. But when she's singing quiet, or normal, anything less than very loud, you do nothing.

So you're standing there, bobbing that fader up and down, trying to track her musical escapades, but you're always a little too late and the whole thing just sounds horrible. If only a machine could do what you were trying to do.

And then you notice this device in the rack on your right. It says “dynamic range compressor” in shiny silver letters at the top right, it has a couple of knobs you can turn, and two rows of colored leds. The first row is labeled “input volume”, and you can see that it lights up in sync with her singing. It seems like the device is tracking the volume of her voice.

The second row of leds is labeled “gain reduction”, but they don't light up. It seems there's something the machine could be doing but isn't.

“Maybe”, you think to yourself, “gain reduction” just means “reducing the volume by this much”, and I can make this machine do what I'm trying to do manually.

The first knob is labeled “threshold”, and it has markings going from 0dB all the way to ∞dB (you know those are decibels, i.e. a measure of how loud something is). It's currently all the way to the right, at “infinite loudness”. Seems like it's time to lower the bar.

So you start turning the knob to the left, and around 120dB you notice that the “gain reduction” leds start lighting up just a little bit when her singing gets really loud. You keep your ears cocked and keep turning. And amazed you find out that IT IS WORKING. Each time she goes “very very loud” the machine turns it into “very loud” just like you were trying to do before.

In other words, it's bringing loud and quiet closer together. If her singing was like a picture of mountains, with the high peaks representing the really loud singing, what you've done is press the top part of the picture together a bit. So the tops of the highest mountains are a little less high, but the rest of the hills haven't changed.

The second knob on the machine is labeled ratio, and it's at 2:1. What this means is that every two decibels over the threshold get squeezed down to one. You can turn it up to 3:1 or 5:1 to get more compression. You're still only changing the highest peaks, but you're pressing them together a little harder.
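
If you prefer code over knobs, the level mapping is really this simple (a toy sketch in Ruby, ignoring attack and release):

# Map an input level (in dB) to an output level, given a threshold and a ratio.
# Below the threshold nothing happens, above it the excess is divided by the ratio.
def compress(input_db, threshold_db, ratio)
  return input_db if input_db <= threshold_db
  threshold_db + (input_db - threshold_db) / ratio
end

compress(110, 120, 2.0)   # => 110    below the threshold, untouched
compress(126, 120, 2.0)   # => 123.0  6 dB over becomes 3 dB over
compress(126, 120, 3.0)   # => 122.0  harder ratio, pressed together more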

Other parameters

I'll quickly summarize these

Attack and release

Sound is a very organic thing, it typically goes up and down gradually. If we all of a sudden decrease the volume when a certain threshold is reached it might sound a bit weird. The “attack” (a time in ms) is the time it takes the compressor to go from zero compression to the configured compression ratio (like 2:1).

Release is the opposite, it's the time it takes the compressor to stop compressing after the sound has dropped below the threshold again.

Make up gain

Since a compressor makes the loudest parts a bit more quiet, it actually makes the whole signal on average a bit more quiet. To compensate many compressors have a “make up gain” setting, it's basically just an extra amplifier to boost the volume a bit after compression.

Special types of compressors

Expander

If the ratio is e.g. 1:2 instead of 2:1, then 1 decibel over the threshold will be turned into 2, so it will make the loudest parts louder. This is called an expander. Think of it as a reverse compressor.

Limiter

Rather than reducing the volume in proportion to the actual signal, a limiter will simply stop the signal from exceeding the threshold. It's like setting an infinite ratio. As soon as the threshold is reached, any excess volume is squashed to nothing.

Normalization

With digital music there is a fixed volume range to operate in, you can't go louder than 100% (all bits to 1). Digital volume is measured in the dBFS scale, where 0 means maximum volume, anything less is a negative number.

If the loudest part of a recording is at -30dBFS, then that's 30dB of dynamic range you're not using. Normalization means automatically boosting the whole recording so that its loudest peak sits at the top of the scale.

Ever noticed how with home recordings you sometimes have to put the volume waaay up, and still can barely understand what is being said, while at the same time you've also cranked up the background noise? Professional recordings don't have this problem because they've been normalized.

Closing thoughts

Ever noticed that when TV commercials start the volume suddenly jumps up? Technically they are not louder than the rest of the broadcast, it's all been normalized to use the full range of whatever medium is being used to transmit the audio.

However the commercials will have had very hard compression applied to them, so instead of quiet and loud, there's only loud and loud. It's a trick to catch your attention. In some places there is regulation to limit this “loudness war”.

I hope this article manages to make compression understandable. Whether it does or not, let me know in the comments!

White Ribbons

Posted on Oct 19.

The Ruby world is known to be a happy place. Rubyists are friendly, relaxed folks who like to party and have a good time. And it's an inclusive community as well, just look at Rails Girls, Railsbridge, or the couple of speakers who do the conference circuit talking about diversity.

So much for the theory. While this is how the Ruby community likes to see itself, we've had enough incidents, stories and stats by now to know that reality isn't always so rosy.

A recent report of sexual harassment at a conference has brought this topic to the forefront again. But the Twitter flamewar is already cooling down again, as these things go, and gradually we slip back to business as usual.

Before that happens, however, let's all talk and think about how something good might still come of this.

I coach at a weekly study group consisting of former Rails Girls attendees, and at our last meetup we spontaneously ended up having a long group discussion about the recent events and the state of the community. Shortly afterwards some of us decided to set up Ruby White Ribbon, as a way to take this discussion to the conferences.

I was at dotRB in Paris yesterday, a really great conference. I wore and handed out ribbons, and again talked with many people about diversity, discrimination, sexism, and how we, as a community can become better.

This was new to me. While I really, really want to see a more diverse and welcoming community, I haven't been too vocal about it. I think mostly for fear of being misconstrued, of saying something clumsy, of getting the rage of the Twitters over me. I do help out with Rails Girls workshops and study groups, helped a bit with Rails Girls Summer of Code, so I figured I was doing my share. But I've realized that it's too easy to be silent. It's a very privileged thing. The status quo doesn't hurt me directly, so it's easy to just go my merry way without touching on hard subjects. But as I told some people yesterday, I don't want to be silent any more. We can't just leave it to a few people like Ashe Dryden or Julie Pagano to do this work.

The fact is that we're mostly running on auto-pilot. Perhaps the Ruby world isn't worse than other tech communities (perhaps), perhaps not even worse than society in general (unlikely). But even if that were the case, that's still pretty bad. Do your homework. The daily experience as a woman, an LGBT person, a person of color, etc, contains countless instances of discrimination, micro-aggressions, and much much worse. That's not cool. That's horrible. We as a community shape our community. We can work on being better than that.

Opening your eyes to this stark reality is a process. It takes effort and learning to see through your cultural programming, to empathize with those going through these experiences.

One thing I've realized when deciding to be more vocal, is that that also means knowing what you're talking about. And I admit that I still have a lot to learn. Maybe there will be times that I unknowingly make someone uncomfortable by something I say or do. If so please call me out on it. I'm not perfect, but I want to live what I stand for, I want to own up to my mistakes.

I hope I can inspire others to speak up more as well. It's easy to say “of course rape is wrong, why do you need me to say that”. That's not what it's about. But by opening your mouth you start a process of educating yourself and people around you, and to eventually grow to become a better person, and a better community.

It would be great if we could encourage “big names” in our community, the people that draw crowds to conferences, to do a conference talk on these topics for a change. A few people have been doing amazing work, but they can't reach everyone. If we want to grow as a community, we need a strong message from the icons that people respect and look up to.

So that's at least something actionable. Other loose ideas that have come up in discussions over the past week:

Can we please make it established practice that every conference starts with saying a few words about their code of conduct? Something along the lines of

  • These are our core values
  • We have a code of conduct
  • You can find the full document here
  • These are the main points
  • If something makes you uncomfortable during the conference, or you believe the code of conduct has been violated, these are the people to talk to

I know some conferences already do this (notably, Eurucamp), but it's not established practice. If you're going to a conference soon, please ask the organizers if they could do this.

Another point that has been raised and discussed are the pre- and post-conference drink ups. While it's not necessary to demonize people that want to go out and have a ball with their Ruby peeps, it's still unfortunate that this is often the only social event included in the conference program. Not everyone is comfortable with that. And as we have seen, tons of sponsored booze, darkened rooms and loud music can sometimes bring out the worst in people.

There are many alternatives that could be offered, perhaps as a “parallel track”. I personally would love to just have a space in the evening where people can get together and hack on stuff. Especially with open source you often meet people on conferences that you otherwise only collaborate with on-line. It would be great to sit down and do a bit of pair programming face to face. Maybe with some snacks and drinks available. Can't speak for all, but I for one would love that.

Hacking a Presentation with Mdpress

Posted on Jun 25.

Last weekend I had the chance to speak at the RuLu conference in Lyon, France. I've spoken at user groups before, but this was my first time speaking at a proper conference.

Previously I would typically use OpenOffice to prepare my presentations, like this one about Rails security, but this time I decided to do something different. I'm a minimalist when it comes to computing, and a big fan of plain text formats. At least half of my screen time is spent either at the command line, or in the Emacs text editor.

My friend, the inimitable K.M. Lawson, wrote an article recently about using Markdown to create presentations, and I was eager to give it a try. The approach he outlines uses a tool called Mdpress, which converts Markdown into an HTML presentation powered by Impress.js.

Impress.js provides many of the features of Prezi, so you can create these ultra-dynamic infinite canvas type presentations. It's pretty neat, and it works well even if your needs are more modest than that.

So with these in my toolbox I could start playing around. The great part about using small, dedicated, open source tools like these is that they are super hackable. I ended up making lots of modifications to Mdpress just to suit this presentation. Some of them were ugly hacks on top of ugly hacks, but that's ok, because for once “works for me” was actually good enough.

Setting things up

For starters I set up a local Git repository to be able to track changes as I worked on my presentation. This is the basic layout I started with

.
└── 2013-rulu
    ├── presentation.md
    ├── presentation.org
    ├── Gemfile
    ├── Gemfile.lock
    └── Rakefile

The presentation.md is the source file for my presentation; it's a standard markdown file, but it starts with a YAML preamble to configure some aspects of Mdpress, and individual slides are separated with ---. The presentation.org contains my notes in Emacs org-mode format.

Custom Mdpress branch

As soon as I realized I would be tweaking and molding Mdpress to my liking, I forked it on Github and created my own rulu2013 branch. This way I have versioning there as well, and if I want to rebuild my presentation at a later date (or on someone else's laptop after mine has crashed five minutes before going on stage), I can quickly retrieve the exact version I used for this talk.

The Gemfile refers to my custom branch. This is what it looks like

source 'https://rubygems.org'

gem 'mdpress', path: '/home/arne/github/mdpress'
#gem 'mdpress', github: 'plexus/mdpress', branch: 'rulu2013'
gem 'rake'
gem 'rb-inotify'

Basically as long as I'm working on things I want to just use the version of Mdpress that's on my laptop, so I can change things right away. That's what the first gem line does. After I'm done and everything is pushed to github under rulu2013 I will replace the first line with the second.

Rake tasks

To automate things I'm using a Rakefile, it ended up having four tasks.

rake mdpress:build  # build the impress.js presentation
rake mdpress:watch  # watch the presentation and theme for changes and rebuild
rake preview        # push a preview version to a unique URI
rake publish        # Publish the final result after the talk

You can find the final Rakefile in this gist

So now I can open a terminal, type rake mdpress:watch, and it will automatically update the presentation whenever I save the markdown file. When I want to get some feedback on my current version I do rake preview, and I get a unique URL that I can mail around but that is hard to guess.

Mdpress tweaks

So here comes the fun part, I'm just going to mention the most interesting things I've done, you can find all the changes I made in my rulu2013 branch

Inline graphs

Markdown has a feature called 'fenced code blocks', a way of marking blocks of code so they get syntax highlighting. To do this you specify the language the code is written in.

One of the coolest changes I made was to the way code in the “dot” language is handled. This format, used by the Graphviz suite of tools, lets you describe complex graphs in a textual way. My presentation contained lots of syntax trees, and being able to edit them in place like that was a real lifesaver.

This went through a few iterations. Basically I just call the command line tool 'dot' in the background to turn the description into SVG (an XML format for vector graphics). Since HTML5 you're allowed to stick that SVG inline in your HTML, so that's exactly what I do. It also receives some post-processing with Nokogiri to change the size, and remove the white background.

See commits e98fcd8 44d48f6 324243d and 2acb0ab
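
The gist of it is something like this (a simplified sketch of the approach, not the exact code from those commits; it assumes the dot binary and the Nokogiri gem are available):

require 'open3'
require 'nokogiri'

# Turn the contents of a "dot" fenced code block into inline SVG
def dot_to_svg(dot_source)
  svg, _status = Open3.capture2('dot', '-Tsvg', stdin_data: dot_source)
  doc  = Nokogiri::XML(svg)
  doc.remove_namespaces!                 # keeps the selectors below simple
  root = doc.at_css('svg')
  root['width'] = '100%'                 # let the slide dictate the size
  root.remove_attribute('height')
  background = root.at_css('polygon[fill="white"]')
  background.remove if background        # drop the white background rectangle
  root.to_xml
end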

Speaker notes

In the end I didn't really use these, but it's still a great feature that I might use in the future. Most presentation software has a speaker view where you can display some notes to help you remember what to say. The trick to getting something like this in an HTML presentation is to use the Javascript console. You can decouple this from your browser, display the window on another screen (i.e. your laptop, not the projector), and you can increase the font size with Control-+.

Here again I misused the fenced code blocks but now used the language name “notes”. These got turned into a hidden div with class="notes", and a little bit of Javascript tied it all together

document.addEventListener("impress:init", function (event) {
    document.addEventListener("impress:stepenter", function (event) {
        var step = $(event.target);
        if (console.clear) { console.clear(); }
        console.log(step.attr('id'));
        console.log(step.find('.notes').text());
    }, false);
});

Return to start

While I was practicing, I found that I wanted to jump back to the first slide more easily. I had a look at how impress.js handled its keyboard input, and came up with this

document.addEventListener("keydown", function ( event ) {
  if ( event.keyCode === 48) { //0
    event.preventDefault();
    impress().goto(0);
  }
}, false);

Now I could press the 0 key and would jump straight back.

Relative offsets

Impress.js uses data attributes to position each 'slide' on an infinite canvas, so you need to specify all of these when using Mdpress as well. By default I wanted my slides to just be laid out from left to right, unless otherwise specified. What I came up with was this:

  • The first slide gets position 0, 0
  • When an attribute like data-x or data-y starts with a + or - it is taken to be relative from the position of the previous slide
  • I added the option to set defaults for these in the preamble, in my case data-x defaults to +1000, and data-y to +0

So now slides are laid out left to right with 1000px between them. I can position them differently, but if I keep using offsets then I can reshuffle slides without having to change the values of all that comes later.
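
A back-of-the-envelope version of that resolution step looks like this (just the idea, not the actual Mdpress code):

# A value like "+1000" is relative to the previous slide, a plain "4000" is absolute,
# and the defaults come from the preamble. The first slide starts at 0, 0.
def resolve_positions(slides, defaults = { 'data-x' => '+1000', 'data-y' => '+0' })
  previous = nil
  slides.map do |slide|
    resolved = defaults.merge(slide).each_with_object({}) do |(key, value), pos|
      relative = value.to_s.start_with?('+', '-')
      pos[key] = if previous.nil?
                   relative ? 0 : value.to_i
                 elsif relative
                   previous[key] + value.to_i
                 else
                   value.to_i
                 end
    end
    previous = resolved
  end
end

resolve_positions([{}, {}, { 'data-y' => '+500' }, { 'data-x' => '0' }])
# => [{"data-x"=>0, "data-y"=>0}, {"data-x"=>1000, "data-y"=>0},
#     {"data-x"=>2000, "data-y"=>500}, {"data-x"=>0, "data-y"=>500}]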

In the end I ended up going for a more traditional slide deck look, so I didn't really use this after all.

Configurable transition time

Mdpress/Impress default to 1000ms transition between slides. When practicing I found that I wanted to be able to set that to something snappier, so I made it configurable in the preamble. In the end I just set it to 0ms to forego animation completely. Maybe in the future I will use this fancier stuff, but since I didn't have the time to focus on that kind of polish, I decided to keep it simple.

Extra classes

To allow easier styling of individual slides, I added the ability to add CSS classes to the div that wraps the slide, using the same notation as is used for the data attributes. See commit 7ce982e8.

Conclusion

Each of these “hacks” took minutes in itself, and in the end I had tools and a workflow that suited me just right! Of course this is due to time invested learning programming, Ruby, web stuff, Javascript, so I know this won't be as easy for everybody.

But don't be afraid to look under the hood. These kinds of changes can sometimes be easy to make without having to understand the whole program, and they are great appetizers for playing around more with the code you find. Nowadays after using a tool or library for a little while, I almost always end up with a copy of the sources on my hard drive, just to poke around with and learn. Happy hacking :)

The Devil in Plain Text

Posted on Apr 15.

When developing for the web, one inevitably deals with lots of strings. When a browser talks to your killer web app they converse in plain text. String manipulation seems to be a web developer's core business.

A language like Ruby is a natural fit for this kind of job, since it inherits the exquisite text manipulation features from Perl. Here's a partial list of the languages you might be dealing with in a modern-day web project:

  • HTML, CSS, Javascript
  • HAML, SASS/SCSS, Coffeescript
  • JSON, XML, YAML
  • SQL, Ruby, Regex
  • URL, HTTP request/response, Mbox/MIME

Precious Plain Text

These are all formal languages, with specific rules of what constitutes a well formed string, and with specific semantics. Yet a lot of the time we deal with them in our programs as mere strings of characters, generating them on the fly with string interpolation and templating systems, parsing them ad-hoc with regexp matching.

This is in itself an amazing accomplishment, a consequence of the Unix history of writing simple but composable tools with plain text interfaces. Plain text is The Universal Data Type, the One to Rule Them All. There is something profoundly pragmatic about reducing all problems to text manipulation. But like the One Ring, one should consider carefully when to wield its power.

What we (still haven't) learned from SQL injection attacks

No self-respecting web dev would dare to commit this code:

User.where("age > #{params[:min_age]}")

Several decades of SQL injection attacks, and little Bobby Tables, have taught us that escaping values in queries is not optional, so instead we write :

User.where("age > ?", params[:min_age])

Now the database driver will 'escape' the value before inserting it into the SQL statement, making sure that in its target context it is still just a single value.

Sadly SQL seems to be the only case where this mechanism has become standardized, automated, and commonly used. We still manually CGI.escape, Regexp.escape, json_escape, Shellwords.escape, and just as often, we forget.

Semantics, semantics, semantics!

The obvious problem is that we are dealing with “dumb” strings containing “smart” data. You, the programmer, know what is in them, but your program has no clue.

The case above is common: we move a primitive value like an integer or a string into a new context. Its meaning is supposed to stay the same (e.g. integer: 4, string: 'foo'), but because it ends up in a context with different laws, it needs to be encoded in a particular way. Here is the literal string "Foo&Bar, just 4U!" in a few different contexts:

<p>Foo&amp;Bar, just 4U!</p>
http://example.com/Foo%26Bar%2C+just+4U%21
echo Foo\&Bar,\ just\ 4U\!
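
Ruby's standard library can produce each of those encodings, but you have to know which escape belongs to which context, and remember to call it, which is exactly the problem:

require 'cgi'
require 'shellwords'

s = "Foo&Bar, just 4U!"
puts CGI.escapeHTML(s)      # Foo&amp;Bar, just 4U!
puts CGI.escape(s)          # Foo%26Bar%2C+just+4U%21
puts Shellwords.escape(s)   # Foo\&Bar,\ just\ 4U\!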

If only Ruby Strings were a little bit smarter! But wait, they have already smartened up. Ruby 1.9 strings contain characters rather than bytes. They are aware of their own encoding, adding a level of interpretation on top of the underlying array of bytes. It would be an interesting exercise to make strings content-type aware. Here's how it could work.

p1, p2 = String.html('<p>'), String.html('</p>')
foo = 'foo&<bar>'
p1.type
# => 'html'
foo.type
# => 'raw'
html = p1 + foo + p2
# => '<p>foo&amp;&lt;bar&gt;</p>'

This is a step in the right direction. It is a trivial example however, and I don't want to dwell on it too long in this post. My main point is that we could use a unified API for constructing and composing 'strings with meaning'. But it would be no more than a compromise, an iterative step up from where we are.
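
For the curious, a toy version of the example above could look like this (a hypothetical API, not an existing library):

require 'cgi'

# A string that knows it is HTML; anything "raw" gets escaped on the way in
class HtmlString < String
  def +(other)
    other = CGI.escapeHTML(other) unless other.is_a?(HtmlString)
    HtmlString.new(super(other))
  end
end

p1  = HtmlString.new('<p>')
p2  = HtmlString.new('</p>')
foo = 'foo&<bar>'
p1 + foo + p2   # => "<p>foo&amp;&lt;bar&gt;</p>"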

Update: Coping by James Coglan is an implementation of this idea.

The Universal Data Type, Revisited

Properly encoding strings matters, it is something we should always keep in mind, but there is an iceberg of other potential issues lurking underneath the water when we treat structured data as merely textual strings. We are playing doctor Frankenstein, tinkering with characters to create monstrosities of ill-formed strings with dubious semantics.

The reason we do this seems to be that our tools are so well suited for textual manipulation. We are wielding Maslow's Regexp and treating every problem as a textual nail. Surely we can do better.

Ruby has more than one parent, and while it has the powerful string processing of Perl, it is also inspired by the elegant list processing of LISP.

Long before the plain text hegemony of Unix, there was already the world of LISP in which everything is a list. Strings are lists, as are nil, true, functions, lambdas, and (surprise) lists. LISP pioneered the idea of having a unifying data type, and providing powerful tools to manipulate it. And half a century later we are still dealing with data in a representation that's several levels of abstraction removed from that.

We could be dealing with lists of tokens, or abstract syntax trees, and yet we aren't. We are concatenating strings because we need to “get shit done”.

Here's an exercise: go back to the list of languages at the top and for each of them ask yourself (a worked answer for HTML follows right after the list):

  • do you know a parser library for that language?
  • do you know how to use it?
  • can you manipulate the parsed data structure, adding, removing and changing nodes?
  • can you turn the result back into its textual representation?
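
For HTML at least, the answer to all four can already be yes. Here's the round trip with Nokogiri; the same exercise works for the other languages given the right library:

require 'nokogiri'

doc = Nokogiri::HTML.fragment('<ul><li>one</li><li>two</li></ul>')

# manipulate the parsed structure instead of the text
ul = doc.at_css('ul')
ul['class'] = 'numbers'
doc.css('li').first.remove
ul.add_child(Nokogiri::HTML.fragment('<li>three</li>'))

doc.to_html   # => '<ul class="numbers"><li>two</li><li>three</li></ul>'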

There is a gap in our tooling waiting to be filled. We need an elegant API contract that all parser/generator libraries can implement. Learn once, use everywhere. My hope is that you will look at all this string wizardry with different eyes. You might find it hard to unsee a pattern, it may even start to itch. And when it does, scratch.

Emergence for Developers

Posted on Mar 17.

In this post I want to dig into “Emergence”, what it is, how it applies to software development, and why it matters.

What it is

Simply put, emergence explains how some things are “more than the sum of their parts.” When something is made up out of simple components performing simple interactions, and yet the end result exhibits complex, seemingly intelligent properties, then we call these emergent properties.

The reason that sentence is stated in such an abstract way is that this concept applies to very different things, such as games, birds and free markets. A few examples will help.

Go!

Take the game of Go, an East Asian game that's been around for centuries. The rules of Go are very simple. One player plays with white stones, the other with black, and taking turns you place one of your stones on an intersection of the grid. When you have completely surrounded a group of your opponent's stones, she loses those stones.

That's pretty much all there is to it, you could be up and playing in ten minutes. And yet the game of Go is known for being incredibly deep. It takes years of practice to become good at it, and a lot of that time is spent studying the behavior of specific patterns. One of the first things you'll learn is that a group of stones can no longer be captured once it has two “eyes”, two holes in an otherwise connected group. Yet this is not part of the rules, it is a higher order property that emerges from applying the rules.

Flocks of Birds

There are many good examples of emergence in nature. There are emergent structures like sand dunes or water crystals. Organisms like animals and plants could be called emergent, since their patterns of behavior are not immediately apparent by looking at the organs, cells and molecules they are composed of.

Flocks of birds can appear to have a mind of their own, changing directions, dodging and diving with wondrous coordination. Yet the basic “rules” that govern the flock are simply that each bird 1) flies in the same direction as its neighbors, 2) remains close to its neighbors and 3) avoids collisions.

Bitcoin

A favorite topic among programmers! While there is some very ingenious cryptography involved in Bitcoin, the general principles are relatively easy to understand. Yet the fact that Bitcoin “works” is down to some of the amazing properties that emerge through its network of cooperating clients. Essentially Bitcoin consists of a long history of financial transactions. This history is continuously shared between all clients.

The clients have some simple rules to determine what they accept as the correct version of history. With enough processing power, and enough luck, you can “find” the next chunk of history, one that has all the necessary properties to be accepted by the other clients. It can happen however that two valid “chunks” are found at the same time, creating two versions of history. Yet again by some simple rules, before long all will “agree” which version is correct, and the other one will be discarded.

(There has been a recent case of history actually “forking”, requiring human intervention. This was due to a difference in client implementations, it shows though that small changes in the system can be enough to make emergent properties disappear again.)

Emergentism vs Reductionism

Can everything be explained by sufficiently understanding its parts? This has been a topic of discussion in the natural sciences, and the reductionists, those on the “divide and conquer” side of the argument, have come out victorious.

But that doesn't mean interest in emergence has disappeared. The properties of complex systems could be explained using the properties of their parts, but often the interactions are so intricate and complex that it makes more sense to look at things at a higher level, and study these emergent properties for their own sake.

Emergent Software

Software is one of the most complex things that humans build, and a lot has been said and written about managing that complexity. But as we have seen in previous examples, complex behavior doesn't imply complex systems. What lessons can we learn from other emergent systems on how to achieve our desired end results while keeping it simple?

In some ways we already have. Programmers and mathematicians like to aspire to elegance in their work. It implies that things fall into place in a natural, harmonic way. That a lot is accomplished without trying too hard. This harks back to ancient Taoist philosophy.

When nothing is done, nothing is left undone. - Lao Tze

The Taoists have other interesting things to say that relate to this. For instance they often refer to the properties of water, a prime example of emergence in nature! I can highly recommend some Lao Tze or Zhuangzi when in need of some metaphysical inspiration.

Interacting Objects

I find that once you start framing things in the picture of emergence, many rules of thumb in modern day programming become self-apparent. Take the Single Responsibility Principle (let each object do one thing well) and the rule of Encapsulation (have clear defined boundaries between objects). They simply set the stage for your emergent feature to unfold.

In computer science literature it seems emergence is really only talked about in cases that take direct inspiration from nature, such as genetic algorithms, neural networks or Conway's Game of Life. But we do have a term for many of the emergent properties our programs exhibit. These are called “non-functional requirements”, such as security, performance, maintainability, and being adaptable to change.

Agile!

Does everyone remember the Agile Manifesto? It talks about “individuals and interactions”, and “responding to change”, among other things. By bringing humans back into the picture, the Agile movement has paved the way for a whole new type of emergence. I'm no longer talking about objects in memory interacting with each other, I'm talking about you interacting with others, and with your code.

Take pair programming. At first it takes some time to get into it, but after a while a dynamic interaction emerges between the two people coding, and the code. I've heard people say that it can almost feel like you've become “one brain”. There are no hard rules on how to do pairing, because every pair is different. In that sense it's a self-organizing system, with three agents (the pair, and the code) interacting and organically coming up with a way of working that feels right.

In fact, I would argue that there is a fourth agent that comes into play : your test suite. Have you ever felt that your tests were “pushing back”? That, by trying to come up with good tests, you realized you had to improve your overall design? Those are your tests talking to you, informing you, and your code. And by improving testability you will decrease coupling, factor out new methods and classes, and pave the way for more emergence!