I’m something of a programming languages obsessive. I love learning new programming languages, almost irrespective of anything I could actually build using them. I love seeing how the same basic problems get solved, over and over again, in different ways. I’m fascinated by the formal grammar of programming languages and how it can be so similar and yet so different from natural language. And I like how new language paradigms force us to retool our thought processes.
At the same time, sometimes I actually want to get things done. One of my other continuing obsessions is the Game of Hex, and its related game Y. Oh, I don’t actually play these games that often. But I like figuring out ways to teach computers to play them. They’re relatively simple to implement, which makes them ideal testing grounds for different machine learning methods. And I’ve lately been playing around with implementing Monte Carlo Tree Search for the game of Y, and, by extension, Hex (the latter being equivalent to a special case of the former). My go-to language is Python – not necessarily my favorite language, but the one I’m most comfortable with and can do the most in without looking anything up. But MCTS requires running thousands of simulations each turn, and pure Python is notoriously slow at that sort of thing. So it was time for me to turn to something faster. And these days, there are more choices than ever.
A Systems Programming Bestiary
The type of languages I’m looking for are usually called “systems” programming languages, in that they’re considered suitable for low-level programming like device drivers and even operating systems. They tend to be compiled, strongly typed, and often involve manual memory management.
If you’re a fan of functional languages, OCaml could be considered a systems programming language (there’s a guide and everything). But it’s probably a bit high-level to really qualify, and if you really want optimized code you’re probably going to want to use more of its not-purely-functional capabilities, in which case, what’s the point? Fans of Ruby will appreciate Crystal. People who really want a weird Pascal / Python hybrid will love Nim. Fans of Google who like simple syntax and don’t need fancy language capabilities will really like Go.
All of those languages are garbage collected[1], meaning that you don’t have to worry about allocating and deallocating memory. Instead, every once in a while, a garbage collection routine will run that will automatically deallocate any objects that aren’t being used. This is usually a good thing, as it prevents a whole lot of errors that at best eat up your memory and at worst cause horrific security vulnerabilities. But if you’re doing anything time-crucial, such as real time audio or video processing, the time during which execution pauses so that the garbage collector can run can be very noticeable.
Which brings us to the languages with manual memory management. The big behemoth of low-level, manually managed programming languages is C. If you use an operating system or a programming language – which you do – chances are it was either written in C, or written in something that was written in C. This is a language with almost no guard rails. You can declare an array of length 10 and ask for the 6,254th element. You can free a piece of memory and then gleefully continue to use a pointer to it. You can use variables that haven’t been initialized, giving you keen insight into whatever random garbage or closely-guarded secrets lurk in your memory. I’ve never really done much pure C programming, so for all I know in some of these cases you’ll at least get a compiler warning. But if you want to badly enough, you can do all kinds of nonsense, and in many cases you can do it accidentally, sometimes with dire consequences, sometimes going unnoticed for years. Most of these peculiarities date back to the days when compilers already strained the computer’s scarce memory and CPU cycles, and it just wasn’t feasible to guard against them. Compilers just had to trust that the programmer knew what they were doing.
The most successful of C’s object-oriented successors is C++. In addition to bringing to the C world the object-oriented paradigm that was all the rage thirty-five years ago (and still dominant even though people constantly complain about it), it gave us a number of safeguards. Object constructors and destructors made it easier to ensure that resources were initialized and discarded cleanly. You could still do the ridiculous C things, but most C++ developers are never tempted to do so since they have much easier and safer tools at their disposal. Over the years, many additions to the language’s core and standard libraries have made C++ even safer and easier, but you can still go hog-wild ridiculous if you really want to.
C’s contemporary, Pascal, also had manual memory management, but didn’t allow quite as many transgressions as C. Eventually the language would add its own object-oriented capabilities and be seen as a competitor to C++. Perhaps unfairly, it never quite shook the stigma that it was a “teaching” language, to be discarded in favor of a “real” language (C) once you graduated CS 101.
The desire for a language that wasn’t garbage collected but also didn’t allow the egregious memory errors of C led to something of an explosion of new languages. There’s D, and Zig, and of course everyone’s favorite darling these days, Rust. Somehow almost forgotten in the rush is Ada, Pascal’s super-safe cousin, which found a home in the Department of Defense and embedded systems, and which implemented some of Rust’s nicer features years before.
Back to the Mother Church
After considering many of these languages, I finally decided to implement my dinky little game AI in C++. This may not be an obvious choice: I’m not doing any signal processing or other super-sensitive work that would be bothered by a garbage collector, after all, so why not go with something easier, like Crystal or Go? Or, if I’m bound and determined to do manual memory management, why not do what all the cool kids are doing and learn Rust?
In some sense, it didn’t really matter, because it’s not as though I was picking the one language I’d have to use for the rest of my life. If I chose C++ and became disillusioned with it, I could always learn Rust later. I’m constantly learning new programming languages. It always irks me somewhat when I see a forum post asking “which language should I learn in 2022?” which usually means something more like “what language are all the high-paying jobs in right now?” The real advice is, learn how to program; if you know how to think about programming problems, the language is just details. I started my first job as a Java developer having never written one line of Java, and I had written very little C# when I began using it professionally. And none of it mattered, because I knew how to program.
The first serious programming language (i.e., not BASIC) that I learned was Pascal, which was the language of most introductory programming courses in the 1980s and 1990s. Having learned enough Pascal to understand how ALGOL-style languages worked, I decided to teach myself C. At the time I was mainly using UNIX operating systems, which are thoroughly entwined with the C programming language. And while I’m sure purists will balk at this comparison, C and Pascal were basically the same programming language with different syntax. (There was even a nifty UNIX utility, p2c, that would translate Pascal to C; some of my classmates didn’t even use a real Pascal compiler and chose instead to do their homework in p2c.) However, these were the days of Turbo Pascal, which had recently added object-oriented features to the language, and I wanted that same power in the C world. And the language that did the most Turbo Pascal-like job of adding object orientation to C was C++.[2]
I learned a decent amount of C++ and used it for a few dinky little programming projects before I decided that it was too much work for the payoff. There was too much boilerplate code you had to write to get things to work, and to make a program do anything useful you had to either re-invent the wheel or install a third-party library. And managing any sort of dependencies in those days was a headache. You had to do a lot of fiddling around with Makefiles, and that still wouldn’t guarantee your program would compile on another computer. (Automake wasn’t really a thing yet.) So I pretty much decided I didn’t have the patience for programming and all but gave up. It wasn’t until several years later, when I discovered an early version of the Python programming language, that I seriously began programming again. Python was billed as coming “with batteries included,” having an enormous standard library that greatly increased the amount you could do without installing dependencies. And it wasn’t compiled, so there was no fiddly Makefile to worry about. You just slapped your code in a file and ran it. That was all the impetus I needed to get back to programming: getting the tedious bits out of the way.
From Python I went on to Java and then C#, with shorter excursions into stranger languages along the way. But all the while I felt like I was missing out on something. I was living on the fringes. All of these other languages had compilers or interpreters that were originally written in C.[3] If you wanted to do anything that required real optimization, you’d use extensions written in C. C was, in fact, the lingua franca of computing: any two languages that talked to each other probably did so via a C API. UNIX was basically “what if C were an operating system?” It was as if the history of all other programming languages since C were a series of responses to the question “how can we bring the benefits of C to people who don’t like to program in C?” There were lots of sects and schisms, but C was the Mother Church.
Actually, C was more like the Orthodox Church, and C++ was the Roman Catholic Church: the former sought to preserve tradition by making as little change as possible, while the latter changed what was needed to keep the tradition relevant in a changing world. All those other languages jettisoned tradition like so much weighty baggage with the hopes of starting over, thus making the same mistakes over again in new and exciting ways. Only C and C++ allowed one to experience what Chesterton referred to as “the thrilling romance of orthodoxy”[4]: upholding an ancient tradition when the fashionable thing is to discard it and start anew, only to be vindicated when those fashionable new trends are themselves discarded. Sure, I could pour all my energy into learning Rust, but what would I do in a decade when people are clamoring to invent new languages to fix Rust’s mistakes? A modern C++ developer – so long as they do not content themselves with learning only modern C++, and don’t shy away from the parts of the language inherited from C – could today write code that would run on machines from the late 1970s, and that will almost certainly still work with compilers that are produced thirty years from now.
Open the windows and let in the fresh air
Coming back to C++ after so long in the wilderness was not nearly as terrifying as I thought it would be. Many changes had happened that made the experience quite nice. The Standard Library, while not nearly as batteries-included as Python’s, nevertheless now includes a lot of helpful functions and data structures that were missing back in the ’90s. Smart pointers, while I’m sure not as nice and safe as Rust’s borrow checker, helped prevent a lot of the blow-off-your-entire-leg memory errors that C++ had been criticized for. A lot of the features that had made Java and C# so much more developer friendly, like range-based for loops and type deduction, were now part of C++; the C++20 standard even includes a lot of features that are somewhat cutting edge in other languages. You do still have to be careful without automatic garbage collection, and you have to have a good understanding of what a pointer is, but modern C++ isn’t the order of magnitude more complex than Java that it’s popularly portrayed to be.
You do still have to compile a lot of individual files to object files, and use a separate program to link them, which means some sort of build system is necessary for all but the tiniest projects. There are more choices now, however. Autotools, which came out about the time I gave up on C++, is now looking pretty long in the tooth; the cool kids now mostly seem to be using CMake, although I ultimately elected to use the less popular but (at first glance) better-documented Meson. (And no, I don’t consider my choice of a newer and less popular build system to contradict my whole embrace of computing tradition; for one, the build system isn’t part of the actual language, and for another, build systems in general are pretty new in the grand scheme of things.) It seems like you are still responsible for finding packages and installing them to your system yourself (though at least one dependency manager has gained some traction), but that’s not as big of a problem in the C world, since the package managers for Linux, BSD, and MacOS all allow you to install most of the important C/C++ libraries to your system.
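For what it’s worth, a Meson build file for a small project can be very short; a hypothetical sketch (the project and file names here are made up, not from my actual repository):

```meson
project('y-mcts', 'cpp', default_options: ['cpp_std=c++17'])
executable('y-mcts', ['main.cpp', 'mcts.cpp', 'board.cpp'])
```

Meson generates the fiddly dependency tracking and incremental rebuild rules that I once hand-rolled in Makefiles.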
Tooling was another area where I did not have to struggle as much as I thought. For most languages these days, you either need to install an IDE for that language, or use some sort of language server that works with your editor of choice. I was pleasantly surprised to find that when I started writing C++ code in emacs, I got warnings about most errors right off the bat, before I even compiled. Granted, I use doom-emacs rather than off-the-shelf emacs, so there is probably some extension I don’t know about doing the heavy lifting here, but even so, it would be just an ordinary emacs extension, not a separate language server. That’s a lot less than I had to do to get helpful hints for F# or Rust in my editor.
One of the benefits of using a language that’s had such a large userbase for so long is that there’s plenty of good-quality documentation out there, and when that fails, a simple Web search usually turns up plenty of good results that answer your specific question. I know the time will probably come, but so far I have yet to be truly stumped by a C++ error message or unexpected behavior that a quick search couldn’t fix.
There are some downsides, too. While C++ does sometimes deprecate features and syntax, it still takes backwards-compatibility pretty seriously, which means there are probably five ways to do everything, four of which are wrong. You do have to be really careful to ensure that the solution to your problem that you just found on Stack Overflow isn’t dated to four standards ago, because there’s a good chance something has been added to the language since then that helps prevent problems that the other method was found to have. Even so, it’s nice knowing you don’t have to refactor all your code just because a new version of the language came out. And it’s even nicer to know that there are so many people willing to approach problems with a mindset of “what can we add to the language to keep people from making this mistake?” rather than “we obviously designed the language wrong in the first place, let’s either make some breaking changes or give up and design a new language”.
I’m still going to use Python, both because it’s the language I’ve used continuously for the longest, and because it’s what my current job requires. And I’ll keep learning new languages. But unless I’ve got some pressing need to interoperate with the .NET or JVM ecosystems, I doubt I’ll reach for Java or C# very often. Now that I’ve discovered the well-kept secret that C++ isn’t actually very difficult, there’s not as much of a reason to mess around with managed memory virtual machines. It really does feel like I’ve come home.
1. Nim, technically, lets you choose which kind of garbage collector you want, including “none at all”. But I’d say most code written in Nim is probably garbage-collected.
2. I was dimly aware, at the time, of Objective-C, which was the “C-plus-Smalltalk” to C++’s “C-plus-Simula-67”. This would have been a much bigger conceptual leap for someone used to the Simula-inspired Turbo Pascal. Plus the only people who seemed to be using it in those days of Steve Jobs’ vacation from Apple were NeXT, and almost nobody owned an actual NeXT workstation.
3. I haven’t fact-checked this claim, but I’d be shocked to discover I was wrong.
4. I promise I’m not as much of a G. K. Chesterton fan as the fact that I’ve quoted him in two separate blog posts might suggest. I think he was wrong about a lot of things, but the man could definitely come up with a quotable turn of phrase.
Last modified on 2022-11-08