Python Is Not Object-Oriented

Recently I had a technical interview in which I was asked a rather strange question in order to probe the extent of my Python knowledge. It consisted of a code sample similar to the following:

# Comments after each print() statement show the output
# (And no, they were not in the original problem)

class Foo:
    bar = 4

x = Foo()
y = Foo()

print(x.bar) # 4
print(y.bar) # 4
print(Foo.bar) # 4

x.bar = 5

print(x.bar) # 5
print(y.bar) # 4
print(Foo.bar) # 4

Foo.bar = 12

print(x.bar) # 5
print(y.bar) # 12
print(Foo.bar) # 12

z = Foo()
print(z.bar) # 12

I was asked to predict what values would be printed at each point. To be honest, I was kind of surprised by this question. For one thing, it didn’t make much sense in context: this was a remote interview, and I was not asked to share my screen, so I could easily have copied and pasted it into a REPL to get the answer. (Maybe they wanted to weed out people who didn’t know how to use the REPL?) For another, it was roughly equivalent to asking a prospective auto mechanic, “What would happen if you replaced your car’s transmission fluid with Mrs. Butterworth’s Pancake Syrup?” Such a situation would never even come up on the job unless you were doing something very, very wrong. In fact, most other object-oriented languages won’t even let you do what this example does. The fact that Python does allow it made me realize why I have always been vaguely bothered by the language’s object-oriented features: Python is not, actually, object-oriented.

We begin by defining a Foo class, with a class variable bar that is initialized to 4. Then we instantiate two members of Foo and query “their” .bar members, which both equal 4, as expected. (I put “their” in scare-quotes because it is not strictly true that each of these objects has a member called bar; the real answer is, “it depends,” which, as we shall see, is a problem.) Querying Foo.bar directly also yields 4.

Next we assign 5 to x.bar. Now when we query the values, we see that x.bar is 5, but y.bar and Foo.bar are both still 4. Makes some kind of sense: we changed a certain instance’s value, but we don’t expect the other instances’ values to change. As to whether or not we expected Foo.bar to change, well, I’ll get to that later.

Okay, what if we wanted to change the value of bar for all Foo instances, everywhere?

[Image: Kevin Uxbridge setting Husnock.alive to False]

Setting Foo.bar to 12 ensures that, predictably, Foo.bar now returns 12. And y.bar also returns 12. But x.bar still returns 5! And if we instantiate a new Foo object called z and query z.bar, it gives us 12. What the holy hell is going on here?

What is going on is that Python has both class variables and instance variables, and its syntax is designed to be maximally confusing between the two. When we defined Foo, we created a bar class variable and assigned it a value. Whenever we instantiate a new Foo instance and query its .bar member, Python says “This object doesn’t have a .bar member. What about the class? Oh, it has one. Let’s return it!”

When we say x.bar = 5, we are not re-assigning a value to x’s .bar member. We are creating a new member of x that, purely coincidentally, has the same name as a variable belonging to its class. This new value shadows Foo.bar, so that when anyone queries x.bar, they get that variable instead of the class variable.
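One way to watch the shadowing happen is to look at each object's own namespace with the built-in vars(), which returns the instance's __dict__. This is just a small sketch using the same Foo class as in the interview question; the variable names before, after, and so on are mine:

```python
class Foo:
    bar = 4

x = Foo()

# Before any assignment, x's own namespace is empty; bar lives on the class.
before = dict(vars(x))
on_class = "bar" in vars(Foo)

# Assignment creates a brand-new entry in x's own namespace.
x.bar = 5
after = dict(vars(x))

# Deleting the instance entry un-shadows the class variable.
del x.bar
restored = x.bar

print(before)    # {}
print(on_class)  # True
print(after)     # {'bar': 5}
print(restored)  # 4
```

Note that Foo.bar never changes at any point here; only x's private namespace gains and then loses an entry.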

This is absolutely mind-warpingly insane, and I have no idea what was going on in Guido’s brilliantly eccentric mind to make him think this was desirable behavior. It means that we have a reliable way to return a class variable – Foo.bar – but no simple way of reliably returning an instance variable. x.bar will return either an instance variable or a class variable, depending on what tortured history that instance has been through. The only way I can immediately think of to return only the instance variable is something like x.__dict__['bar'], which either returns the instance variable, or fails if one is not defined.
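If you really did need "instance variable or nothing" semantics, you could wrap that __dict__ lookup in a helper. The following is only a sketch: instance_only and _MISSING are names I made up, and it assumes ordinary instances that have a __dict__ (classes that define __slots__ without one will not work with vars()).

```python
_MISSING = object()  # sentinel so that None can be used as a real default


def instance_only(obj, name, default=_MISSING):
    """Return obj's own attribute `name`, ignoring class variables."""
    namespace = vars(obj)  # the instance's __dict__
    if name in namespace:
        return namespace[name]
    if default is _MISSING:
        raise AttributeError(
            f"{type(obj).__name__} instance has no attribute {name!r} of its own"
        )
    return default


class Foo:
    bar = 4


x = Foo()
y = Foo()
x.bar = 5

print(instance_only(x, "bar"))        # 5
print(instance_only(y, "bar", None))  # None
```

Calling instance_only(y, "bar") without a default raises AttributeError, even though y.bar happily returns 4.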

This is not how sane object-oriented languages behave. The following C# code, for example, will fail to compile:

var x = new Foo();

Foo.bar = 12;
Console.WriteLine(x.bar);

class Foo {
    public static int bar = 4;
}

Here we’re trying to do something similar to what we did in Python: define a class with a static (class) variable, instantiate the class, and query the variable. But since bar was declared as static, the compiler won’t let us refer to x.bar. We can’t confuse our bar instance variable with our bar static variable, because C# won’t let us define two variables with the same name within the same class, even if we make one static and the other not.

That’s why I said the example was like putting pancake syrup in your transmission: it violates Python best practices. The best way to write object-oriented Python code is to pretend it’s C# code: If you define a class variable, only reference it via <class name>.<variable name>. And for God’s sake, if you’re using a class variable, don’t ever define an instance variable with the same name.
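To see why this rule matters in practice, consider a class that tries to count its own instances. The Widget and BuggyWidget classes below are hypothetical examples of mine, not anything from the interview:

```python
class Widget:
    count = 0  # class variable, shared by every instance

    def __init__(self):
        # Referenced via the class name: updates the shared variable.
        Widget.count += 1


class BuggyWidget:
    count = 0

    def __init__(self):
        # Reads BuggyWidget.count (0), then writes a new *instance*
        # variable called count. The class variable is never touched.
        self.count += 1


Widget()
Widget()
BuggyWidget()
BuggyWidget()

print(Widget.count)       # 2
print(BuggyWidget.count)  # 0
```

The buggy version raises no error and produces no warning; every instance simply ends up with its own count of 1.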

If I were to re-design Python, I would make <instance name>.<variable name> refer to instance variables only, rather than “check for an instance variable, then return a class variable if you don’t find one”. Because you know what the latter behavior reminds me of? JavaScript. Original JavaScript was explicitly not object-oriented, but rather used prototype-based inheritance.¹ There were no classes; you defined an object, used that object as a prototype to define more objects, and modified them as you saw fit. If you reference a member of an object, JavaScript first looks in that object, and then, if the member isn’t there, looks at its prototype. It was only relatively late in the game that JavaScript actually added the class keyword and accompanying syntactic sugar to make life easier for developers more accustomed to object-oriented languages. But that’s all it was: syntactic sugar. Under the hood you still had the same old prototype-based system; JavaScript “classes” are just prototypes that you can use special OO-like syntax with.

Python’s object-oriented capabilities aren’t exactly prototype-based, but they resemble a prototype system more than they resemble “strong” OO languages like C# or Java. And Python wears the “syntactic sugar” nature of its classes on its sleeve. It reminds me in some ways of the early versions of C++ (or its predecessor, “C with Classes”), which were implemented as translators rather than native compilers: they turned C++ syntax into C code.

In standard C, if you want to define a new type, you would usually do so with a struct, which is a product type: it’s like several variables, each with their own name, all grouped together, with a convenient syntax for accessing them:

#include <stdio.h>

struct CartesianCoordinate {
    int x;
    int y;
};

int main(void) {
    struct CartesianCoordinate foo;

    foo.x = -7;
    foo.y = 4;

    /* x and y are ints, so print them with %d rather than %s */
    printf("(%d, %d)\n", foo.x, foo.y);
    return 0;
}

This already looks a lot like object-oriented programming: CartesianCoordinate is a struct that has two int members, x and y. You can create an instance foo of this type, and access its members with foo.x and foo.y. Again, this is C, not C++, but we’ve already got something that resembles classes and objects.

Of course, once you try to do anything nontrivial with CartesianCoordinate, the fact that it’s not a real class becomes quite apparent. You can’t define methods on it, for one thing. The best you could do would be to define a function that takes a CartesianCoordinate as one of its arguments. So instead of defining a distance_from_origin() method that you’d call via foo.distance_from_origin(), you’d define a distance_from_origin(struct CartesianCoordinate c) function that you’d call via distance_from_origin(foo).

If you wanted to make C feel a little more object-oriented, then, you could write a preprocessor that would let you define a class and methods on that class, but would output a struct definition and a set of functions that take an instance of that struct as their first argument. It would also translate method calls from foo.bar(...) to bar(foo, ...). This would get you something that feels very object-oriented indeed! Of course, it wouldn’t actually be OO. You wouldn’t have any of the nice abstraction and encapsulation that a real object-oriented language would get you. All members would be public, and nothing would stop you from defining a “method” that modifies any value in any instance of any “class”.

Sound familiar? Classes in Python really are just a fancy way of passing a struct to a bunch of functions that accept the struct as an argument – so much so that when you define a method, you actually have to give it a self argument. If you define a method with no arguments, Python won’t complain; it’ll even let you instantiate the accompanying class. But try to call it, and you’ll get something like this:

TypeError: Jorf.snorg() takes 0 positional arguments but 1 was given

That’s clear evidence that Python “methods” are just syntactic sugar over functions that take record types as their first argument.
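You can see the "function in method's clothing" directly by calling a method through its class and passing the instance yourself. Here is a quick sketch that reuses the CartesianCoordinate idea from the C example (the coordinates are changed to give a round answer) and my best guess at what Jorf looked like:

```python
import math


class CartesianCoordinate:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def distance_from_origin(self):
        return math.hypot(self.x, self.y)


foo = CartesianCoordinate(-3, 4)

# The usual method call...
via_method = foo.distance_from_origin()
# ...is the same as calling the plain function with foo as its first argument.
via_function = CartesianCoordinate.distance_from_origin(foo)

print(via_method, via_function)  # 5.0 5.0


class Jorf:
    def snorg():  # no self parameter
        return "snorg"


# Calling through an instance passes that instance as an argument
# the function never asked for.
try:
    Jorf().snorg()
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

Tellingly, Jorf.snorg() called on the class itself works fine, because in that case nothing is passed implicitly.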

We also don’t get real data abstraction. Python classes have no private members; the best you can do is give them a name starting with an underscore. This works like the prison in The State’s “Prison Break” sketch, which had a wide-open gate which the warden asked the inmates to consider “off limits, as a favor to me”.² Underscores signal to other developers to pretty please not use those variables or methods outside of that class, but don’t actually prevent you from doing so.
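Here is roughly what that wide-open gate looks like. The Account class is a hypothetical example of mine. I have also included the double-underscore form, which Python renames ("mangles") behind the scenes, but which is just as reachable once you know the mangled name:

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # single underscore: a polite request
        self.__pin = 1234        # double underscore: stored as _Account__pin


acct = Account(100)

# Nothing stops outside code from writing to the "private" member.
acct._balance = 1_000_000

# The mangled name is reachable from anywhere too.
mangled_pin = acct._Account__pin

print(acct._balance)  # 1000000
print(mangled_pin)    # 1234
```

Name mangling appears to exist mainly to avoid accidental name clashes in subclasses, not to enforce access control, so it does not change the overall picture.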

Python does have more OO (or OO-like) features than the trivial preprocessor example I’ve outlined. You get something like inheritance, and polymorphism, and generics. But those are really just more syntactic sugar, coupled with the fact that Python is a dynamic language that tries its best to let you pass any value to any function. Thus Python doesn’t care whether you’re calling a method on an instance of a base class or a subclass, because it’s a trivial matter to just pass a different object as the first argument to the function-in-method’s-clothing.
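As a small illustration of how little the method cares about what it receives, here is a sketch with made-up Animal, Dog, and Robot classes. Robot does not inherit from Animal at all, yet it can still be handed to Animal's "method" as the first argument:

```python
class Animal:
    def speak(self):
        return f"{self.name} makes a sound"


class Dog(Animal):
    def __init__(self, name):
        self.name = name


class Robot:  # deliberately unrelated to Animal
    def __init__(self, name):
        self.name = name


rex = Dog("Rex")
robbie = Robot("Robbie")

print(rex.speak())           # Rex makes a sound
print(Animal.speak(rex))     # Rex makes a sound
print(Animal.speak(robbie))  # Robbie makes a sound
```

In Python 3, Animal.speak is an ordinary function, so all it needs is an object that happens to have a name attribute.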

I still like Python, and the pseudo-OO that we get in Python is still better than the not-at-all OO we get in C. But it also allows you to do all sorts of crazy things that you really shouldn’t be able to do, and some of its syntactic sugar is more like syntactic salt. So I’m not going to go mixing class variables and instance variables of the same name, even if it lets me do that. I’m going to approach object-oriented Python as though it were C#, with the added caution that I don’t have a compiler that will prevent me from accidentally doing something stupid. And if I ever somehow become the next Benevolent Dictator for Life, I’m going to make some breaking changes.

¹ Some pedants will no doubt argue that prototype-based programming is even more deserving of the title “object-oriented,” and that C# and Java and such are really “class-oriented,” or “class-and-object-oriented”. Some pedants are very annoying.
² Astoundingly, neither a YouTube clip nor a .gif of this sketch seems to exist on the Web, or I would 100% be embedding it here.

Last modified on 2023-10-11