Reserved Words in YAML and Translating Booleans in Rails

A while back, while researching my Rails boolean formatting helper, I came across some strange behaviour in Rails’ internationalization system when certain words are used as keys in the YAML files containing the translated strings. I didn’t investigate the issue any further back then, but I recently stumbled upon it again, so I figured there should be a place where this is documented.

The Problem

Let’s first look at the problem in the context where I first encountered it: a minimal Rails application. After generating a new app, copy the following YAML into the default config/locales/en.yml:

en:
  hello: "Hello world"
  yes: "Yes, Sir!"
  no: "Nope"

This would normally let us translate the keys 'hello', 'yes' and 'no' with a simple call to I18n.t. We can test this in the Rails console and see the following:

I18n.t 'hello'
# => "Hello world"
I18n.t 'yes'
# => "translation missing: en.yes"
I18n.t 'no'
# => "translation missing: en.no"

What’s that? Didn’t we provide a translation for the string “yes” right below the translation for “hello”, which seems to work perfectly fine? No, actually we didn’t. The problem is that the keys in YAML’s key-value collections (called mappings) aren’t just strings. They are themselves YAML nodes. Here’s the relevant excerpt from the YAML 1.2 specification, section 3.2.1.1 (Nodes):

The content of a mapping node is an unordered set of key: value node pairs, with the restriction that each of the keys is unique. YAML places no further restrictions on the nodes. In particular, keys may be arbitrary nodes [emphasis added], the same node may be used as the value of several key: value pairs, and a mapping could even contain itself as a key or a value (directly or indirectly).

What this means is that when we parse YAML, we also have to parse the keys of mapping nodes, again as YAML content. How does this apply in the above situation? Let’s look at an even more minimal example. Save the following code as en.yml somewhere:

yes: Yes, Sir!
no: Nope!

If we parse this YAML tree, we can now roughly see what is happening:

require 'yaml'

# Read en.yml and parse it into a Ruby hash
YAML.load_file('en.yml')
# => {true=>"Yes, Sir!", false=>"Nope!"}

The keys of the resulting hash are actually true and false rather than the strings "yes" and "no"! This is because these words are alternative spellings of YAML’s boolean scalars, true and false. This is the case for all of the following words: true, false, yes, no, on, off. In addition, at least the Ruby YAML parser is case insensitive here, meaning that a YAML mapping key of yEs will still be parsed to true. The YAML 1.1 boolean type, however, only lists these keywords in all lowercase, with the initial letter capitalized, and in all uppercase, and the YAML 1.2 core schema recognizes only true and false in those three spellings.
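To see which spellings are affected, here is a small sketch that parses a handful of candidate keys and reports what each one turns into. The word list is my own choice for illustration, and the results reflect Ruby’s bundled YAML parser (Psych); other parsers may behave differently.

```ruby
require 'yaml'

# Candidate keys: the six words from above, two odd capitalizations,
# and an ordinary word for comparison (list chosen for illustration).
words = %w[true false yes no on off yEs OFF hello]

# Parse each word as the key of a one-entry mapping and keep the parsed key.
parsed = words.to_h { |word| [word, YAML.load("#{word}: value").keys.first] }

parsed.each do |word, key|
  puts "#{word.ljust(5)} -> #{key.inspect} (#{key.class})"
end
```

Only hello should come back as a String. Every other entry is reported as TrueClass or FalseClass.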

Now we see why our initial attempt to translate the words “yes” and “no” with I18n failed: These strings are reserved words in YAML and will be parsed to their respective boolean value. Accessing the resulting hash with e.g. "yes" then won’t find anything because it does not contain that string as a key. It only contains a key true.
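The failed lookup can be reproduced without Rails. This sketch parses the same two lines and then accesses the resulting hash with a string, the way a string-based lookup would. It only imitates the idea and makes no claim about what I18n does internally.

```ruby
require 'yaml'

translations = YAML.load("yes: Yes, Sir!\nno: Nope!")

by_string  = translations['yes'] # nil, because there is no "yes" key
by_boolean = translations[true]  # the value is stored under the boolean true

puts translations.key?('yes')
puts by_string.inspect
puts by_boolean
```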

Note: Using true and false as keys in your YAML file and/or passing true and false (i.e. the boolean values, not the strings) to I18n.t won’t fix this problem either, since I18n.t apparently calls to_s on its argument, or converts it to a string in some other way, before looking up a translation for it. To actually make it work, read on!

The Solution

The obvious solution is, of course, to avoid the reserved words as keys in your YAML files. However, if you still must or want to use them, here’s the fix: we want to tell the YAML parser to treat true, yes, etc. as strings, despite their special meaning. Don’t forget that keys are just YAML themselves, so we can use YAML’s two standard ways to force something to be parsed as a string. The following example illustrates both:

"yes": Yes, Sir!
!!str no: Nope!

I personally prefer the first one for readability, but both mean the same thing.
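A quick check in plain Ruby that both notations produce string keys. As before, this is a sketch against Ruby’s bundled parser rather than a statement about every YAML implementation.

```ruby
require 'yaml'

# One key is quoted, the other carries an explicit !!str tag.
fixed = YAML.load(<<~DOC)
  "yes": Yes, Sir!
  !!str no: Nope!
DOC

puts fixed.keys.inspect
puts fixed['yes']
puts fixed['no']
```

Both keys now come back as strings, so a lookup with "yes" or "no" finds its value.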

Before we come to an end here, you might be asking: What about the “Yes” in the actual translation “Yes, Sir!”? Good question! This “Yes” already appears in the context of a string and is thus just parsed as part of it. You can also use this in your keys:

Yes, Sir!: Yes, Sir!

This won’t convert anything to boolean, but the key in the resulting hash will obviously not be "yes", either.
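The conversion only kicks in when the whole scalar is one of the reserved words. A short sketch, where the second key is a made-up example of mine:

```ruby
require 'yaml'

doc = YAML.load(<<~DOC)
  Yes, Sir!: Yes, Sir!
  no way: Nope!
DOC

# Both keys merely contain a reserved word, so they stay strings.
puts doc.keys.inspect
puts doc.keys.all? { |key| key.is_a?(String) }
```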

Also, be aware that this post isn’t really about boolean values. What was said applies equally to anything that has a special meaning in YAML. The following code will parse to a hash containing the array [1, 2, 3] as a key, not the string "[1, 2, 3]":

[1, 2, 3]: A list!
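Since this can bite with any kind of non-string key, one way to catch it early is to scan a parsed locale file for keys that did not end up as strings. The helper below, non_string_keys, is a hypothetical name of mine and not part of Rails or I18n, and the sample document is made up for the demonstration.

```ruby
require 'yaml'

# Hypothetical helper: walks nested hashes and returns the path to every
# mapping key that was not parsed as a String.
def non_string_keys(node, path = [])
  return [] unless node.is_a?(Hash)

  node.flat_map do |key, value|
    current = path + [key]
    found = key.is_a?(String) ? [] : [current]
    found + non_string_keys(value, current)
  end
end

# Sample locale data: one plain key, one reserved word, one quoted
# reserved word and one flow sequence used as a key.
locale = YAML.load(<<~DOC)
  en:
    hello: Hello world
    yes: Yes, Sir!
    "no": Nope
    [1, 2, 3]: A list!
DOC

suspicious = non_string_keys(locale)
suspicious.each { |keys| puts keys.map(&:inspect).join(' > ') }
```

Here the helper flags the unquoted yes (parsed as true) and the list key, while the quoted "no" passes because it is a plain string.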