Python Object-Oriented Programming (OOP): Tutorial
Object-oriented programming is a widely used approach to writing powerful applications. As a data scientist, you will be required to write applications to process your data, among a range of other things. In this tutorial, you will discover the basics of object-oriented programming in Python: how to create a class, instantiate objects, add attributes to a class, define methods, and pass arguments to methods.
Object-oriented programming has some advantages over other programming paradigms. Development can be faster and cheaper, with better software maintainability. This, in turn, leads to higher-quality software, which is also extensible with new methods and attributes. The learning curve is, however, steeper, and the concept may be too complex for beginners. Computationally, OOP software can also be slower and use more memory, since more code tends to be written and objects add overhead.
Object-oriented programming is based on the imperative programming paradigm, which uses statements to change a program's state. It focuses on describing how a program should operate. Examples of imperative programming languages are C, C++, Java, Go, Ruby and Python. This stands in contrast to declarative programming, which focuses on what the computer program should accomplish, without specifying how. Examples are database query languages like SQL and XQuery, where one only tells the computer what data to query from where, but not how to do it.
OOP uses the concept of objects and classes. A class can be thought of as a 'blueprint' for objects. These can have their own attributes (characteristics they possess), and methods (actions they perform).
An example of a class is the class Dog. Don't think of it as a specific dog, or your own dog. We're describing what a dog is and can do, in general. Dogs usually have a name and age; these are instance attributes. Dogs can also bark; this is a method.
When you talk about a specific dog, you would have an object in programming: an object is an instantiation of a class. This is the basic principle on which object-oriented programming is based. So my dog Ozzy, for example, belongs to the class Dog. His attributes are name = 'Ozzy' and age = 2. A different dog will have different attributes.
OOP in Python
Python is a great programming language that supports OOP. You will use it to define a class with attributes and methods, which you will then call. Python offers a number of benefits compared to other programming languages like Java, C++ or R. It's a dynamic language, with high-level data types. This means that development happens much faster than with Java or C++. It does not require the programmer to declare types of variables and arguments. This also makes Python easier to understand and learn for beginners, its code being more readable and intuitive.
If you're new to Python, be sure to take a look at DataCamp's Intro to Python for Data Science course.
How to create a class
To define a class in Python, you can use the class keyword, followed by the class name and a colon. Inside the class, an __init__ method is defined with def. This is the initializer that Python runs when you instantiate an object. It's similar to a constructor in Java. Python does not strictly require __init__, but you will almost always define one. It takes at least one argument: self, which refers to the object itself. Inside the method, the pass keyword is used for now, because Python expects a statement there. Remember to use correct indentation!
class Dog:
    def __init__(self):
        pass
self in Python is equivalent to this in C++ or Java.
To instantiate an object, type the class name, followed by a pair of parentheses. You can assign this to a variable to keep track of the object.
ozzy = Dog()
And print it with print(ozzy):
<__main__.Dog object at 0x111f47278>
Adding attributes to a class
After printing ozzy, it is clear that this object is a dog. But you haven't added any attributes yet. Let's give the Dog class a name and age, by rewriting it:
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age
You can see that the function now takes two arguments after self: name and age. These then get assigned to self.name and self.age respectively. You can now create a new ozzy object, with a name and age:
ozzy = Dog("Ozzy", 2)
To access an object's attributes in Python, you can use the dot notation. This is done by typing the name of the object, followed by a dot and the attribute's name.
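As a minimal sketch of the dot notation, the snippet below repeats the Dog class from above so that it runs on its own, then reads each attribute of ozzy separately:

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age


ozzy = Dog("Ozzy", 2)

# Object name, a dot, then the attribute name
print(ozzy.name)  # Ozzy
print(ozzy.age)   # 2
```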
This can also be combined in a more elaborate sentence:
print(ozzy.name + " is " + str(ozzy.age) + " year(s) old.")
Ozzy is 2 year(s) old.
Define methods in a class
Now that you have a Dog class, it does have a name and age which you can keep track of, but it doesn't actually do anything. This is where instance methods come in. You can rewrite the class to now include a bark() method. Notice how the def keyword is used again, as well as the self argument.
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print("bark bark!")
The bark method can now be called using the dot notation, after instantiating a new ozzy object. The method should print "bark bark!" to the screen. Notice the parentheses (round brackets) in .bark(). These are always used when calling a method. They're empty in this case, since the bark() method does not take any arguments.
ozzy = Dog("Ozzy", 2)
ozzy.bark()
Recall how you printed ozzy earlier? The code below now implements this functionality in the Dog class, with the doginfo() method. You then instantiate some objects with different properties, and call the method on them.
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print("bark bark!")

    def doginfo(self):
        print(self.name + " is " + str(self.age) + " year(s) old.")

ozzy = Dog("Ozzy", 2)
skippy = Dog("Skippy", 12)
filou = Dog("Filou", 8)

ozzy.doginfo()
skippy.doginfo()
filou.doginfo()

Ozzy is 2 year(s) old.
Skippy is 12 year(s) old.
Filou is 8 year(s) old.
As you can see, you can call the doginfo() method on objects with the dot notation. The response now depends on which Dog object you are calling the method on.
Since dogs get older, it would be nice if you could adjust their age accordingly. Ozzy just turned 3, so let's change his age.
ozzy.age = 3
It's as easy as assigning a new value to the attribute. You could also implement this as a birthday() method in the Dog class:
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print("bark bark!")

    def doginfo(self):
        print(self.name + " is " + str(self.age) + " year(s) old.")

    def birthday(self):
        self.age += 1
ozzy = Dog("Ozzy", 2)
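To see birthday() in action, here is a minimal sketch. It repeats a trimmed-down version of the Dog class (only __init__ and birthday()) so that it runs on its own, and prints the age before and after the call:

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        # Increase this dog's age by one year
        self.age += 1


ozzy = Dog("Ozzy", 2)
print(ozzy.age)  # 2

ozzy.birthday()
print(ozzy.age)  # 3
```

Wrapping the change in a method keeps the rule for how a dog ages in one place, instead of repeating the assignment wherever it is needed.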
Passing arguments to methods
You would like your dogs to have a buddy. This should be optional, since not all dogs are as sociable. Take a look at the setBuddy() method below. It takes self, as per usual, and buddy as arguments. In this case, buddy will be another Dog object. Set the self.buddy attribute to buddy, and the buddy.buddy attribute to self. This means that the relationship is reciprocal; you are your buddy's buddy. In this case, Filou will be Ozzy's buddy, which means that Ozzy automatically becomes Filou's buddy. You could also set these attributes manually, instead of defining a method, but that would require more work (writing 2 lines of code instead of 1) every time you want to set a buddy. Notice that in Python, you don't need to specify the type of the argument. If this were Java, it would be required.
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def bark(self):
        print("bark bark!")

    def doginfo(self):
        print(self.name + " is " + str(self.age) + " year(s) old.")

    def birthday(self):
        self.age += 1

    def setBuddy(self, buddy):
        self.buddy = buddy
        buddy.buddy = self
You can now call the method with the dot notation, and pass it another Dog object. In this case, Ozzy's buddy will be Filou:
ozzy = Dog("Ozzy", 2)
filou = Dog("Filou", 8)
ozzy.setBuddy(filou)
If you now want to get some information about Ozzy's buddy, you can use the dot notation twice: first to refer to Ozzy's buddy, and a second time to refer to its attribute, as in ozzy.buddy.name and ozzy.buddy.age.
Notice how this can also be done for Filou, with filou.buddy.name and filou.buddy.age.
The buddy's methods can also be called. The self argument that gets passed to doginfo() is now ozzy.buddy, which is filou. Calling ozzy.buddy.doginfo() prints:
Filou is 8 year(s) old.
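The chained attribute access described above can be sketched as follows. One detail here is an assumption and not part of the class shown earlier: __init__ sets self.buddy = None, so that a dog without a buddy can be checked safely instead of raising an AttributeError.

```python
class Dog:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        # Assumption (not in the class above): start without a buddy
        self.buddy = None

    def doginfo(self):
        print(self.name + " is " + str(self.age) + " year(s) old.")

    def setBuddy(self, buddy):
        # Reciprocal relationship: each dog becomes the other's buddy
        self.buddy = buddy
        buddy.buddy = self


ozzy = Dog("Ozzy", 2)
filou = Dog("Filou", 8)
skippy = Dog("Skippy", 12)

ozzy.setBuddy(filou)

print(ozzy.buddy.name)       # Filou
print(ozzy.buddy.age)        # 8
print(filou.buddy.name)      # Ozzy
ozzy.buddy.doginfo()         # Filou is 8 year(s) old.
print(skippy.buddy is None)  # True
```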
Example: OOP in Python for finance
An example of where object-oriented programming in Python might come in handy is our Python For Finance: Algorithmic Trading tutorial. In it, Karlijn explains how to set up a trading strategy for a stock portfolio. The trading strategy is based on the moving average of a stock price. If signals['short_mavg'][short_window:] > signals['long_mavg'][short_window:] is fulfilled, a signal is created. This signal is a prediction for the stock's future price change. In the code below, you'll see that there is first an initialization, followed by the moving average calculation and signal generation. Since this is not object-oriented code, it's just one big chunk that gets executed at once. Notice that we're using aapl in the example, which is Apple's stock ticker. If you wanted to do this for a different stock, you would have to rewrite the code.
import numpy as np
import pandas as pd

# aapl is assumed to be a DataFrame of Apple's price data with a 'Close' column

# Initialize
short_window = 40
long_window = 100
signals = pd.DataFrame(index=aapl.index)
signals['signal'] = 0.0

# Create short simple moving average over the short window
signals['short_mavg'] = aapl['Close'].rolling(window=short_window, min_periods=1, center=False).mean()

# Create long simple moving average over the long window
signals['long_mavg'] = aapl['Close'].rolling(window=long_window, min_periods=1, center=False).mean()

# Create signals (assigning through .loc avoids pandas' chained-assignment pitfall)
signals.loc[signals.index[short_window:], 'signal'] = np.where(signals['short_mavg'][short_window:] > signals['long_mavg'][short_window:], 1.0, 0.0)

# Generate trading orders
signals['positions'] = signals['signal'].diff()

# Print `signals`
print(signals)
In an object-oriented approach, you only need to write the initialization and signal generation code once. You can then create a new object for each stock you want to calculate a strategy on, and call the generate_signals() method on it. Notice that the OOP code is very similar to the code above, with the addition of self.
class MovingAverage():
    def __init__(self, symbol, bars, short_window, long_window):
        self.symbol = symbol
        self.bars = bars
        self.short_window = short_window
        self.long_window = long_window

    def generate_signals(self):
        signals = pd.DataFrame(index=self.bars.index)
        signals['signal'] = 0.0

        # Use self.bars here; a bare `bars` is not defined inside the method
        signals['short_mavg'] = self.bars['Close'].rolling(window=self.short_window, min_periods=1, center=False).mean()
        signals['long_mavg'] = self.bars['Close'].rolling(window=self.long_window, min_periods=1, center=False).mean()

        # Assigning through .loc avoids pandas' chained-assignment pitfall
        signals.loc[signals.index[self.short_window:], 'signal'] = np.where(signals['short_mavg'][self.short_window:] > signals['long_mavg'][self.short_window:], 1.0, 0.0)

        signals['positions'] = signals['signal'].diff()
        return signals
You can now simply instantiate an object, with the parameters you want, and generate signals for it.
apple = MovingAverage('aapl', aapl, 40, 100)
print(apple.generate_signals())
Doing this for another stock becomes very easy. It's just a matter of instantiating a new object with a different stock symbol.
microsoft = MovingAverage('msft', msft, 40, 100)
print(microsoft.generate_signals())
You now know how to declare classes and methods, instantiate objects, set their attributes and call instance methods. These skills will come in handy during your future career as a data scientist. If you want to expand on the key concepts that you need to work further with Python, be sure to check out our Intermediate Python for Data Science course.
With OOP, your code will grow in complexity as your program gets larger. You will have different classes, subclasses, objects, inheritance, instance methods, and more. You'll want to keep your code properly structured and readable. To do so, it is advised to follow design patterns. These are reusable solutions that serve as guidelines to avoid bad design. Each one addresses a specific problem that often recurs in OOP, and describes a solution to that problem, which can then be used repeatedly. These OOP design patterns can be classified in several categories: creational patterns, structural patterns and behavioral patterns. An example of a creational pattern is the singleton, which should be used when you want to make sure that only one instance of a class can be created. An iterator, which is used to loop over all objects in a collection, is an example of a behavioral pattern. A great resource for design patterns is oodesign.com. If you're more into books, I recommend reading Design Patterns: Elements of Reusable Object-Oriented Software.
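To make the two patterns named above a little more concrete, here is a minimal sketch. The Kennel class is hypothetical and not part of this tutorial's Dog example. It shows one common way to write a singleton in Python (overriding __new__), and supports iteration by defining __iter__. Other implementations of both patterns exist.

```python
class Kennel:
    """Hypothetical singleton: every call to Kennel() returns the same object."""

    _instance = None

    def __new__(cls):
        # Create the single instance on first use, then keep returning it
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.dogs = []
        return cls._instance

    def add(self, name):
        self.dogs.append(name)

    def __iter__(self):
        # Iterator pattern: let callers loop over the kennel directly
        return iter(self.dogs)


first = Kennel()
second = Kennel()
first.add("Ozzy")
second.add("Filou")

print(first is second)  # True

for name in first:
    print(name)  # Ozzy, then Filou
```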