Object Oriented Programming: introduction#

Object-oriented programming (OOP) is a standard feature of many modern programming languages. Learning OOP “properly” would require one entire semester at least, and we will not try to cover all the details here. The specific objectives of these three OOP units are:

  • to learn the basic OOP concepts and syntax so that you are able to understand OOP code and can use libraries making use of OOP concepts (i.e., almost all Python libraries)

  • to become familiar with the terminology associated with OOP: classes, objects, attributes, methods, inheritance

  • to introduce simple examples where OOP is a useful paradigm, and try to raise your interest in its usage so that you can learn it by yourself when needed

Today’s OOP unit introduces the concept of “objects” in the Python language and shows how to create new kinds of objects on your own. In the next unit, we will learn the concept of class inheritance. Finally, we will learn in which circumstances objects can be useful, and when not.

Copyright notice: this chapter is vaguely inspired by RealPython’s beginner tutorial on OOP.

Introduction#

As stated on the OOP Wikipedia page, OOP is a “paradigm” which may or may not be supported by a specific programming language. Although Python is an OOP language at its core, it does not enforce the use of OOP. Matlab and R follow the same philosophy: they support OOP but do not force their users to use it. In fact, many of you will finish your master's thesis without having to write any OOP-specific code. You have, however, already made heavy use of Python objects (everything is an object in Python, remember?) and I argue that it is very important that you are able to understand the basics of OOP in order to make better use of Python.

In short, OOP is simply another way to structure your programs. Until now, you have written modules consisting of functions, sometimes with a short __main__ script which was itself calling one or more functions. OOP will add a new tool to your repertoire by allowing you to bundle data and behaviors into individual objects, possibly helping you to organize your code in a way that feels more natural and clear.

Let’s get started with some examples and new terminology! We will talk about the advantages and disadvantages of OOP in a later unit, once you are more familiar with its syntax.

Classes and objects#

Classes are used to create new user-defined structures that contain information about something. These “things” come with “services”, as you already know. Let’s define a new class called Cat:

class Cat:
    # Initializer
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

What are the new things in this code snippet?

  • First, the class name definition (class Cat). As per PEP 8, class names in Python should follow the “CapWords” convention.

  • The class contains a “function” called __init__, which indeed looks very much like a normal function. Here the __init__ function has three positional arguments: self (which has a special meaning as we are going to see), name and weight. These arguments are used to initialize the attributes of the same name. We will get back to this in the next section.

A class provides a new structure definition. It is a “blueprint” for how something should be defined, but it does not actually provide any real data content itself. To actually use the functionalities defined by the class you need to create a new instance of that class. Instantiating is a fancy term for creating a new, unique realization of a class (an object). Let’s go for it:

a = Cat('Grumpy', 4)
a
<__main__.Cat at 0x7b64953588f0>

We just created a new instance of the class Cat and assigned it to the variable a. An instance of a class is commonly called an object (this can be used as a synonym for “instance”). The variable a stores an instance of the class Cat:

# Ask if a is an instance of Cat or not
isinstance(a, Cat)
True

In fact, we just created a new datatype, called Cat:

type(a)
__main__.Cat

Every new instance of a class is unique, regardless of the values used to initialize it. Let’s create a new Cat with the same name and weight:

b = Cat('Grumpy', 4)
isinstance(b, Cat)
True

It is still a unique instance and is not a copy of a in any way:

a == b
False
b is a
False
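
Here is a minimal sketch of what “unique” means in practice, reusing the Cat class from above (the variable names first and second are only for illustration). Because Cat does not define its own equality rule, == falls back to comparing identity, which is why two cats with identical attributes still compare as unequal:

```python
class Cat:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight


first = Cat('Grumpy', 4)
second = Cat('Grumpy', 4)

# Same attribute values...
print(first.name == second.name, first.weight == second.weight)  # True True
# ...but two distinct objects in memory
print(first is second, id(first) == id(second))  # False False

# Changing one instance leaves the other untouched
second.weight = 5
print(first.weight, second.weight)  # 4 5
```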

Class/instance attributes#

The cat’s name and weight are called instance attributes and can be accessed with the dot syntax:

a.name
'Grumpy'
a.weight
4

A common synonym for the term “attribute” is “property”. The two terms are very close and you might find one or the other depending on who writes about them. Properties in Python are a special kind of attribute, but the difference is subtle and not relevant here.

Instance attributes are specific to the created object. They are often defined at instantiation:

b = Cat('Tiger', 5)
b.name
'Tiger'

Classes can also define class attributes, which are tied to a class but not to a specific instance:

class Cat:
    # Class attributes are defined at the class level
    speak = 'Meow'

    # Initializer
    def __init__(self, name, weight):
        # Instance attributes only make sense at instantiation
        self.name = name
        self.weight = weight
Cat.speak
'Meow'
a = Cat('Grumpy', 4)
a.speak
'Meow'

Careful! Both class and instance attributes are mutable. They can be changed from outside the class:

a.name = 'Roncheux'
a.speak = 'Miaou'
a.name
'Roncheux'
a.speak
'Miaou'

These changes are specific to the instance, and the class attribute remains unchanged:

Cat.speak
'Meow'
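
What happens behind the scenes deserves a short sketch (the variable names are illustrative). Assigning to a.speak does not modify the class attribute: it creates a new instance attribute of the same name, which hides the class attribute for this object only. Changing the attribute on the class itself, on the other hand, is visible from all instances that have not overridden it:

```python
class Cat:
    speak = 'Meow'

    def __init__(self, name, weight):
        self.name = name
        self.weight = weight


a = Cat('Grumpy', 4)
b = Cat('Tiger', 5)

a.speak = 'Miaou'  # creates an instance attribute on a only
print(a.speak, b.speak, Cat.speak)  # Miaou Meow Meow

# The instance attribute is stored on the object a, but not on b
print('speak' in vars(a), 'speak' in vars(b))  # True False

Cat.speak = 'Mew'  # changes the class attribute itself
print(a.speak, b.speak)  # Miaou Mew
```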

In comparison to other OO languages, Python is very “liberal” regarding attributes: in languages like Java, attributes are typically declared private and cannot be changed this way. In practice, attributes should not be changed by the users of a class unless they are documented as being “changeable”, in which case they become “properties”. More on this later.

Instance Methods#

If a class only had attributes, it would merely be a structure for data storage. Classes become useful when they add “services” to the data they store. These services are called methods, and their syntax has similarities with a function definition:

class Cat:
    # Class attribute
    speak = 'Meow'

    # Initializer
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    # Method
    def say_name(self):
        print('{}, {}!'.format(self.speak, self.name))

The biggest difference between methods and functions is that a method is tied to a class instance: this is made clear by the self argument, present in the method definition but not used when calling the method:

a = Cat('Kitty', 4)
a.say_name()
Meow, Kitty!
b = Cat('Grumpy', 3)
b.say_name()
Meow, Grumpy!

The self variable is implicit in the call above, and refers to the instance of the class which is calling the method. It might sound a little complicated at first, but you will get used to it: self is used to read and write instance attributes, and is the first argument to virtually any method defined in the class (there is one exception to this rule, and we will ignore it for today).
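
To make the role of self more concrete, here is a small sketch showing that a.say_name() is shorthand for calling the function stored on the class and passing the instance explicitly. The __self__ attribute used at the end is a standard attribute of bound methods; you will rarely need it, it is shown here only to illustrate the link between a method and its instance:

```python
class Cat:
    speak = 'Meow'

    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    def say_name(self):
        print('{}, {}!'.format(self.speak, self.name))


a = Cat('Kitty', 4)

a.say_name()     # usual call: Python passes a as self for us
Cat.say_name(a)  # equivalent call: we pass the instance explicitly

# The method obtained from an instance remembers which instance it belongs to
print(a.say_name.__self__ is a)  # True
```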

At this point, you may have noticed similarities between objects you have used before and the objects we just defined here. Let’s find an analogy:

import xarray as xr
a = xr.DataArray([1, 2, 3], name='my data variable')  # instantiating a class
assert isinstance(a, xr.DataArray)  # a is an instance of the DataArray class
print(type(a))  # xarray.core.dataarray.DataArray is the datatype
print(a.name)  # name is an instance attribute
print(a.mean())  # mean is an instance method
<class 'xarray.core.dataarray.DataArray'>
my data variable
<xarray.DataArray 'my data variable' ()> Size: 8B
np.float64(2.0)
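
The same vocabulary applies to Python's built-in types, which need no extra library. A short sketch (the variable names are arbitrary):

```python
z = complex(3, 4)              # instantiating the built-in class complex
print(isinstance(z, complex))  # True
print(type(z))                 # <class 'complex'>
print(z.real, z.imag)          # instance attributes: 3.0 4.0
print(z.conjugate())           # instance method: (3-4j)

s = 'grumpy'                   # a literal also creates an instance (of str)
print(s.upper())               # instance method: GRUMPY
```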

Are you confident about the meaning of all these terms? If not, I might have explained it in a way which is not the right one for you: you can use your google-skills to look for other tutorials. There are plenty of good tutorials on the web!

Extending attributes: the @property decorator#

We learned that “attributes” are data that describe some aspects of a class instance. Very often, instance methods will initialize and/or update these attributes at run time. Consider the following example:

class Cat:

    # Initializer
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    # Method
    def eat_food(self, food_kg):
        self.weight += food_kg
a = Cat('Grumpy', 4)
print('Weight before eating: {} kg'.format(a.weight))
a.eat_food(0.2)
print('Weight after eating: {} kg'.format(a.weight))
Weight before eating: 4 kg
Weight after eating: 4.2 kg

This is a simplified but very typical use for instance attributes: they will change in an object’s lifetime according to specific events. Now let’s suppose that you are working with scientists from the USA, and they would like to know the cat’s weight in pounds. One way to do so would be to compute it at instantiation:

class Cat:

    # Initializer
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
        self.weight_lbs = weight / 0.45359237

    # Method
    def eat_food(self, food_kg):
        self.weight += food_kg
a = Cat('Grumpy', 4)
a.weight_lbs
8.818490487395103

There is an obvious drawback to this method however: what if the cat eats food? Its weight in lbs will not be updated!

a.eat_food(0.2)
a.weight_lbs  # this is a problem
8.818490487395103

A possible way to deal with the issue would be to compute the pound weight on demand, i.e., to write a method to compute it:

class Cat:

    # Initializer
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    # Method
    def eat_food(self, food_kg):
        self.weight += food_kg

    def get_weight_lbs(self):
        return self.weight / 0.45359237
a = Cat('Grumpy', 4)
a.eat_food(0.2)
a.get_weight_lbs()
9.259415011764858

This is already much better (and accurate), but it somewhat hides the fact that the weight of a cat really is an attribute, no matter the unit. It should not be accessed with a get_ method. This is where a new syntax comes in handy:

class Cat:

    # Initializer
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    # Method
    def eat_food(self, food_kg):
        self.weight += food_kg

    @property
    def weight_lbs(self):
        return self.weight / 0.45359237

weight_lbs looks like a method (it computes something), but only in the class definition. For the user of the objects, the computation is hidden in an attribute call:

a = Cat('Grumpy', 4)
a.eat_food(0.2)
a.weight_lbs  # weight_lbs is an attribute!
9.259415011764858

This is a very useful pattern, and it is frequently used in Python. The @ syntax defines a “decorator”, and you might learn about decorators in a more advanced Python class.
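
For the curious, here is a short sketch of what the decorator does (none of this is needed to use properties). On the class, the name weight_lbs refers to a property object; on an instance, the function body is run again each time the attribute is read, which is why the value never gets out of date. The variable names before and after are only for illustration:

```python
class Cat:
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight

    @property
    def weight_lbs(self):
        return self.weight / 0.45359237


a = Cat('Grumpy', 4)

# On the class, weight_lbs is a property object, not a number
print(isinstance(Cat.weight_lbs, property))  # True

# On an instance, reading it runs the function and returns a float
before = a.weight_lbs
a.weight = 5  # change the underlying attribute directly
after = a.weight_lbs
print(before < after)  # True: the value was recomputed

# The computed value is not stored on the instance
print('weight_lbs' in vars(a))  # False
```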

Digging deeper into @property: setters#

Now, your colleagues from the USA can finally read the weight of the cat in pounds, and you programmed this without compromising the consistency of the cat’s weight, regardless of the unit used. However, your colleagues are still unhappy: they would like to be able to set the weight of the cat in pounds as well (these Americans really hate SI units). This is causing you trouble, because when they try to set it they get an error:

a.weight_lbs = 9  # this generates an attribute error
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[32], line 1
----> 1 a.weight_lbs = 9  # this generates an attribute error

AttributeError: property 'weight_lbs' of 'Cat' object has no setter

This is expected, because nothing in the class definition above can set an attribute named weight_lbs: it is a passive, “read only” property. Here again, your creative mind came up with an elegant solution:

class Cat:

    # Initializer
    def __init__(self, name, weight, unit='kg'):
        self.name = name
        if unit == 'kg':
            self.weight = weight
        elif unit == 'lbs':
            self.set_weight_lbs(weight)
        else:
            raise ValueError('Unit not understood: {}'.format(unit))

    # Method
    def eat_food(self, food_kg):
        self.weight += food_kg

    @property
    def weight_lbs(self):
        return self.weight / 0.45359237

    # Set a new value for the weight!
    def set_weight_lbs(self, new_weight):
        self.weight = 0.45359237 * new_weight

With this new method, we are now able to consistently switch between units:

a = Cat('Grumpy', 4)
a.eat_food(0.2)
print(a.weight, a.weight_lbs)
4.2 9.259415011764858
a.set_weight_lbs(11)
print(a.weight, a.weight_lbs)
4.9895160700000005 11.0

This solution works, but is suboptimal. Here again, your USA colleagues have to use a method (obj.set_xxx()) where they could use a more intuitive attribute syntax (obj.xxx = new_value). For this reason, Python introduced a setter decorator, with a syntax inspired by @property:

class Cat:

    def __init__(self, name, weight, unit='kg'):
        self.name = name
        if unit == 'kg':
            self.weight = weight
        elif unit == 'lbs':
            self.weight_lbs = weight
        else:
            raise ValueError('Unit not understood: {}'.format(unit))

    def eat_food(self, food_kg):
        self.weight += food_kg

    @property
    def weight_lbs(self):
        return self.weight / 0.45359237

    # This is new!
    @weight_lbs.setter
    def weight_lbs(self, new_weight):
        self.weight = 0.45359237 * new_weight

With this new syntax, the USA colleagues will not see a difference between the “true attribute” (in SI units) and theirs:

a = Cat('Grumpy', 4)
a.eat_food(0.2)
print(a.weight, a.weight_lbs)  # read OK
4.2 9.259415011764858
a.weight_lbs = 11  # set OK
print(a.weight, a.weight_lbs)  # updated: nice!
4.9895160700000005 11.0
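
The class above also accepts pounds at instantiation, which the examples so far did not exercise. Here is a short sketch, with a second cat whose name and weight are made up for illustration, and with 'stone' as an arbitrary example of an unsupported unit. math.isclose is used for the comparisons because of floating-point rounding:

```python
import math


class Cat:
    def __init__(self, name, weight, unit='kg'):
        self.name = name
        if unit == 'kg':
            self.weight = weight
        elif unit == 'lbs':
            self.weight_lbs = weight  # goes through the setter below
        else:
            raise ValueError('Unit not understood: {}'.format(unit))

    @property
    def weight_lbs(self):
        return self.weight / 0.45359237

    @weight_lbs.setter
    def weight_lbs(self, new_weight):
        self.weight = 0.45359237 * new_weight


b = Cat('Tiger', 11, unit='lbs')
print(math.isclose(b.weight, 11 * 0.45359237))  # True: stored in kg
print(math.isclose(b.weight_lbs, 11))           # True: read back in lbs

# An unknown unit is rejected by the initializer
try:
    Cat('Tiger', 11, unit='stone')
except ValueError as err:
    print(err)  # Unit not understood: stone
```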

Private attributes and methods#

Sometimes it can be useful to define and use attributes or methods within a class, without providing them to users of the class: in some OO languages like Java or C++, they are called private (e.g. a private attribute or a private method). In Python, there is no specific syntax or language specification for this pattern: in other words, all class attributes and methods are public, visible and usable by anyone.

To allow the definition of private elements, however, the community agreed on a convention: all attributes and methods starting with an underscore (_) are meant to be private and do not belong to the public-facing (documented) interface of a class.

There are plenty of uses for private attributes or methods, but I will illustrate one of them below:

class Cat:

    # Initializer
    def __init__(self, name, language='en'):

        # Name of the cat: an instance attribute
        self.name = name

        # The cat's language
        self.speak = None  # we don't know yet what the cat speaks
        self._language = None  # This is a private attribute
        self.language = language

    # Methods
    def say_name(self):
        print('{}, {}!'.format(self.speak, self.name))

    def _decide_speech(self, value):
        # This is a private method
        # We use it to structure our code within the class
        if value == 'en':
            self.speak = 'Meow'
        elif value == 'fr':
            self.speak = 'Miaou'
        else:
            raise ValueError('Language not understood: {}'.format(value))

    # Property getter
    @property
    def language(self):
        return self._language

    # Property setter
    @language.setter
    def language(self, value):
        self._language = value
        self._decide_speech(value)

The design of this class comes from the addition of a new functionality: users can now set the language of their cat, and the cat’s speech will change accordingly. See the following:

a = Cat('Grumpy')
a.say_name()  # the default language is English
a.language = 'fr'  # we set it to French
a.say_name()  # now our cat speaks French!
Meow, Grumpy!
Miaou, Grumpy!

Line 3 in the code snippet above motivates the use of a setter: when language is updated, we want the speak attribute to be updated too. We therefore need a dedicated method (def language(self, value)) that we hide behind an attribute. This comes with a drawback: where do we store the language value? We have to use a new container for it: the private attribute _language (by convention).
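
Keep in mind that the underscore is only a convention: Python will not stop anyone from touching _language directly. The sketch below (a trimmed-down version of the class above) shows why users of the class should not do it: writing to the private attribute bypasses the setter, and the object ends up in an inconsistent state.

```python
class Cat:
    def __init__(self, name, language='en'):
        self.name = name
        self.speak = None
        self._language = None
        self.language = language

    def _decide_speech(self, value):
        if value == 'en':
            self.speak = 'Meow'
        elif value == 'fr':
            self.speak = 'Miaou'
        else:
            raise ValueError('Language not understood: {}'.format(value))

    @property
    def language(self):
        return self._language

    @language.setter
    def language(self, value):
        self._language = value
        self._decide_speech(value)


a = Cat('Grumpy')

a.language = 'fr'           # public interface: speak is updated too
print(a.language, a.speak)  # fr Miaou

a._language = 'en'          # private attribute: nothing prevents this...
print(a.language, a.speak)  # en Miaou  <- ...but speak is now out of sync
```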

Our final class#

I attempt to summarize all the concepts we learned above into a single class. Please read through it carefully:

class Cat:
    # Class attributes
    purr = 'RRRrrrRRRrrr'  # same for all cats, regardless of their language

    # Initializer
    def __init__(self, name, weight, unit='kg', language='en'):

        # Name of the cat: an instance attribute
        self.name = name

        # The cat's weight in two possible units
        if unit == 'kg':
            self.weight = weight
        elif unit == 'lbs':
            self.weight_lbs = weight
        else:
            raise ValueError('Unit not understood: {}'.format(unit))

        # The cat's language
        self.speak = None  # we don't know yet what the cat speaks
        self._language = None  # private attribute
        self.language = language

    # Methods
    def say_name(self):
        print('{}, {}!'.format(self.speak, self.name))

    def eat_food(self, food, unit='kg'):
        if unit == 'lbs':
            # convert to kg
            food = 0.45359237 * food
        self.weight += food

    # Private method
    def _decide_speech(self, value):
        if value == 'en':
            self.speak = 'Meow'
        elif value == 'fr':
            self.speak = 'Miaou'
        else:
            raise ValueError('Language not understood: {}'.format(value))

    # Property getter
    @property
    def weight_lbs(self):
        return self.weight / 0.45359237

    # Property setter
    @weight_lbs.setter
    def weight_lbs(self, new_weight):
        self.weight = 0.45359237 * new_weight

    @property
    def language(self):
        return self._language

    @language.setter
    def language(self, value):
        self._language = value
        self._decide_speech(value)
# Instantiating an object from the class "Cat"
a = Cat('Grumpy', 4)
# Calling an instance method
a.say_name()
Meow, Grumpy!
# Changing an attribute also changes the behavior of the object
a.language = 'fr'
a.say_name()
Miaou, Grumpy!
# Class attributes are available for anyone to read
a.purr
'RRRrrrRRRrrr'
# An action on an object might change its instance attributes
a.eat_food(0.3, unit='lbs')
a.weight
4.136077711
# Some attributes are passive and computed only when needed.
# The user doesn't see the difference:
a.weight_lbs
9.118490487395103

Take home points#

  • Python is an object-oriented programming language but does not enforce the definition of classes in your own programs. However, a basic understanding of the core concepts of OOP is a strong asset and allows you to make better use of Python’s capabilities.

  • We defined a lot of new concepts today: classes, objects, instances, instance methods, instance attributes, class attributes, the @property decorator, and its companion setter decorator (@<name>.setter). They are all important! You will have to revise these concepts step by step, possibly by making use of external resources. The web has plenty of good beginner-level OOP tutorials: I recommend having a look at at least one of them.