Nathan Peck
Senior Developer Advocate for Generative AI at Amazon Web Services
Nov 18, 2020 7 min read

Is YAML a real programming language?

A few days ago I wrote an article about common misconceptions about the AWS Cloud Development Kit. In that article I made the following statement: “YAML is definitely a real programming language.” A couple of readers reached out to me via email and Twitter direct message to disagree or ask for clarification on this:

I read your latest article about the CDK and I agree in all points except the first. You state YAML is a real programming language. This is factually wrong and misleading!

and

I liked the article but you make it sound like YAML is a real programming language when its not.

In addition to responding directly, I decided to share this longer post. Spoiler alert: in defiance of Betteridge’s law of headlines, this article will explain why I do in fact believe that YAML is a real programming language.

What is programming?

The definition of programming is “to provide (a computer or other machine) with coded instructions for the automatic performance of a task.” There are many ways and formats for providing those coded instructions.

One of the most direct ways to program would be to send raw bits by flipping switches on and off. Back in 1975 that was how you had to program the Altair 8800.

[Image: Altair 8800. Attribution: Wikimedia Commons]

Over time we developers have created higher and higher level abstractions for providing coded instructions to machines. First came punch cards and punch card readers, so we wouldn’t have to flip switches by hand. Eventually we created higher level programming languages, where a few statements in a more human readable format could be translated into numerous underlying bits. From assembly language, then C and C++, to higher level interpreted languages: the abstraction levels keep getting higher.

Programming operates on layers of abstraction

At this point, no matter how you choose to write code, it goes through many, many layers of software interpretation. Even if you tried to write machine code directly for a specific processor, that wouldn’t stop the modern processor microcode from reading your machine code instructions, interpreting them, and optimizing and reorganizing them as it sees fit, right inside the processor.

So no matter what format or language you use to write your code the final result of what the machine does in response to your code has been interpreted and remixed by layers and layers of programs in between your input and the final result. All of this happens no matter what format you use: Assembly Language, C, C++, Python, JavaScript, HTML, or YAML.

From this perspective, programming a system to change its behavior using YAML and a YAML interpreter isn’t that different from using HTML and a browser, or JavaScript and a Node.js interpreter, or C++ and a compiler, or machine code and processor microcode. All are forms of programming: providing “instructions for the automatic performance of a task”. None of these languages is any more or less “real” than the others.

Argument: But YAML isn’t a general purpose language like <insert language here>

You can definitely make the argument that the lower the level of the language you are working in, the more flexibility you have in what you can accomplish. Some languages are more general purpose while others are more specialized. If you are writing YAML you can only accomplish what your YAML-based tooling allows (including any YAML add-ons or preprocessors).

But this holds true for other programming languages as well. If you are writing HTML and CSS you can arguably accomplish a lot more than you can with YAML, but you are still stuck inside the browser sandbox. If you are writing code for an executable binary program that runs directly under an operating system you can accomplish even more, but you are still limited to what the underlying system calls allow you to do.

Every level of programming abstraction is subject to limits. The limits of YAML do not make it any less of a “real” programming language.

Gatekeeping programming languages is toxic

In the end, the most important point for me is this: most of the time, when devs say that a language is not “real”, it is because doing so gives them a false sense of superiority for knowing how to work at a lower level of the stack. This leads to a toxic culture in engineering. Imagine this ludicrous scenario:

  • Infrastructure engineers: Writing YAML to control and automate deployments across thousands of servers in the cloud, distributed around the globe in multiple cloud regions
  • HTML and CSS developers: Well at least I work with “real” languages unlike these infrastructure engineers who just write YAML all day
  • Node.js API developers: Hah, those frontend devs only write HTML and CSS… but I know how to code in a “real” language
  • Systems developers writing the underlying C++ for V8: lol most of these Node.js developers only know how to write in an interpreted language, not a “real” compiled language like C++!
  • Compiler developers: imagine all these noobs above me not knowing how to make a compiler to generate “real” machine code
  • Processor designers: lol I’m the one who made the “real” microcode inside your processor, that runs all your machine code!

In this broader perspective the entire premise of “real” programming becomes a ridiculous exercise of superiority complexes.

The reality is that all these developers are providing a tremendous amount of value at their own level of the stack, and if they have done their job well enough for long enough they most likely have deep domain-specific expertise that the other developers aren’t even aware of.

YAML is whatever you make of it

While YAML is simplistic on the surface there is a tremendous amount of depth to the YAML ecosystem. There are domain experts who know how to use YAML to solve challenging problems. There is tooling for extending and preprocessing YAML, in a similar way to how TypeScript is compiled to JavaScript, but instead producing YAML. And there are so many tools that consume YAML and implement rich logic.
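To make the preprocessing idea concrete, here is a minimal sketch in Python of a tiny generator that expands a short list of names into a repetitive config document. Everything in it is hypothetical: the function name render_services and the fields name, replicas and port are made up for illustration and do not come from any particular tool. To stay dependency free it emits JSON, which YAML 1.2 parsers are designed to accept; a real tool would more likely use a YAML library to produce block style output.

```python
import json


def render_services(names, replicas=2):
    """Expand a short list of names into a repetitive config structure.

    The field names here are hypothetical, chosen only for illustration.
    """
    return {
        "services": [
            {"name": name, "replicas": replicas, "port": 8000 + i}
            for i, name in enumerate(names)
        ]
    }


config = render_services(["api", "worker"])

# Emit JSON rather than block-style YAML so the sketch needs no
# third-party library; YAML 1.2 is designed to accept JSON documents.
document = json.dumps(config, indent=2)
print(document)
```

The point is only that a few lines of higher level code can stamp out many lines of config, which is the same relationship TypeScript has to the JavaScript it compiles to.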

You can even write your own tooling. For example, a while back I wrote an interpreter to execute YAML. It takes input like this:

items:
  $for:
    $index: "iterationNumber"
    $start: 1
    $end: 31
    $delta: 1
    $each:
      index: "{{ @iterationNumber }}"
      mod3:
        $math:
          $expression: "{{ @index }} % 3 == 0"
      mod5:
        $math:
          $expression: "{{ @index }} % 5 == 0"
      $return:
        $branch:
          $basedOn: "{{ @mod3 }}:{{ @mod5 }}"
          $if:
            true:true: "Fizzbuzz"
            true:false: "Fizz"
            false:true: "Buzz"
            false:false: "{{ @iterationNumber }}"
$return:
  $join:
    $target: "items"
    $delimiter: ", "

My interpreter executes the instructions in this YAML to solve the classic “fizz buzz” coding interview test. I mostly created this as a troll project to poke some fun at the “fizz buzz” interview challenge by solving it using YAML instead of a lower-level programming language. You can find the source code for this project on GitHub.

Just like my interpreter for YAML there are numerous pieces of software out there that consume YAML and execute different flows and commands based on that YAML input. They are essentially “interpreters” that change the behavior of the machine in response to YAML input.
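As a rough illustration of what such an “interpreter” does, here is a minimal sketch in Python. It is not the code of any project mentioned here, and the $-prefixed instruction names ($ref, $mod, $eq, $if, $then, $else, $for) are a hypothetical mini-language invented for this example. The program is written as a nested Python dict, which is roughly what a YAML parser hands back after loading a document, so the sketch runs without a YAML library. It also assumes that $end is exclusive, which is a choice made for this sketch only.

```python
def evaluate(node, scope):
    """Recursively evaluate a node of the hypothetical mini-language."""
    if not isinstance(node, dict):
        return node  # plain values (numbers, strings) evaluate to themselves

    if "$ref" in node:
        # Look up a named value, such as the current loop index.
        return scope[node["$ref"]]
    if "$mod" in node:
        left, right = (evaluate(arg, scope) for arg in node["$mod"])
        return left % right
    if "$eq" in node:
        left, right = (evaluate(arg, scope) for arg in node["$eq"])
        return left == right
    if "$if" in node:
        branch = "$then" if evaluate(node["$if"], scope) else "$else"
        return evaluate(node[branch], scope)
    if "$for" in node:
        spec = node["$for"]
        # Assumption for this sketch: $end is exclusive, like Python's range().
        return [
            evaluate(spec["$each"], {**scope, spec["$index"]: i})
            for i in range(spec["$start"], spec["$end"])
        ]
    raise ValueError(f"unknown instruction: {sorted(node)}")


# Roughly what a YAML parser would produce for a small looping document.
program = {
    "$for": {
        "$index": "n",
        "$start": 1,
        "$end": 6,
        "$each": {
            "$if": {"$eq": [{"$mod": [{"$ref": "n"}, 2]}, 0]},
            "$then": "even",
            "$else": {"$ref": "n"},
        },
    }
}

result = evaluate(program, {})
print(result)  # [1, 'even', 3, 'even', 5]
```

The data on its own does nothing. It is the evaluate function that gives each key its meaning, which is the same division of labor as between any source file and the interpreter or compiler that consumes it.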

So YAML is what you make of it. Yes, you can use it as a static thing that just describes a data structure, but you can also use it to program systems. This is something that infrastructure engineers do every day when they write CloudFormation YAML, Kubernetes manifest YAML, Ansible playbook YAML, etc. These engineers are writing code in a programming language, one that is very abstracted and specialized, but a real programming language nonetheless.

Conclusion

I think that it is very uncool to disparage the work of devs working at a different level of abstraction by saying the language they work in is not “real”. I believe that it is better for everyone to give each other the respect they deserve for the expertise they have at using the tools they work with.

I believe that all languages are ultimately interpreted through many layers of abstraction, and YAML is no different in this respect than the numerous other less abstracted languages you can write code in.

Programming is the act of providing instructions to a machine to change its behavior, so in this respect YAML is just as real a programming language as any other.