
Automated Machine Learning with Auto-Keras

Learn about automated machine learning and how it can be done with auto-keras.

Machine learning is no longer an uncommon term these days, because organizations like DataCamp, Coursera, Udacity, and many more are constantly working on how efficiently and flexibly they can bring machine learning education to everyone. Thanks to their platforms, it is really easy nowadays to get started in this field with almost no prerequisites. Meanwhile, the term Automated Machine Learning is making a lot of headlines on popular Data Science education forums, with organizations like Google, H2O.ai, and others doing commendable work in this area. It is not yet as common a topic as machine learning itself, because machine learning already deals with automation; so the question that naturally comes to mind first is - "Can machine learning itself be automated?"

You will find the answers to many questions like these in this tutorial. This tutorial comprises the following:

  • Understanding a standard Machine Learning pipeline
  • How can the Machine Learning pipeline be automated?
  • Introduction to automated machine learning
  • Python libraries for automated machine learning
  • Introduction to auto-keras
  • A case study of AutoML using auto-keras
  • Further reading on the topic

Let's get started.


Source: IBM Data Hub

Understanding a standard Machine Learning pipeline:

When you are solving a problem as a Data Scientist, your standard workflow looks something like the following:

  1. Data Collection
  2. Data Preprocessing
  3. Initialize a machine learning model that may be suitable for the problem
  4. Train the model
  5. Test the model
  6. Tweak the parameters of the model
  7. Again test the model
  8. Communicate the results
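To make the workflow concrete, here is a minimal sketch of steps 3 to 7 using scikit-learn and a built-in toy dataset. The model and hyperparameter choices here are purely illustrative assumptions, not part of any particular project:

```python
# A minimal sketch of steps 3-7 of the pipeline, using scikit-learn.
# The dataset, model, and hyperparameter values are illustrative choices.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Steps 1-2: collect and preprocess (here, a built-in dataset scaled to [0, 1])
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 3: initialize a model that may be suitable for the problem
model = LogisticRegression(max_iter=1000)

# Steps 4-5: train the model, then test it
model.fit(X_train, y_train)
baseline = accuracy_score(y_test, model.predict(X_test))

# Steps 6-7: tweak a parameter and test again
model = LogisticRegression(C=0.5, max_iter=1000)
model.fit(X_train, y_train)
tweaked = accuracy_score(y_test, model.predict(X_test))

# Step 8: communicate the results
print(f"baseline accuracy: {baseline:.3f}, tweaked accuracy: {tweaked:.3f}")
```

Notice that steps 6 and 7 repeat for every new parameter value you want to try - which is exactly where the time goes.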

The second step, Data Preprocessing, is very broad in scope because it is essentially one of the most time-consuming tasks in the above pipeline, and it includes many subtasks such as Data Cleaning, Data Transformation, Feature Selection, etc. Steps 3 to 7 cover just one machine learning model. A good practitioner will certainly not stop after just one model; they will run their experiments on different models to compare the results and finally decide on the best model for the problem. So here comes another very time-consuming task - deciding which model to choose.

The following quote regarding the debugging process in a machine learning task could not be more apt, and it is imperative to keep in mind:

"Debugging for machine learning happens in two cases: 1) your algorithm doesn't work or 2) your algorithm doesn't work well enough.[...] Very rarely does an algorithm work the first time and so this ends up being where the majority of time is spent on building algorithms." - S. Zayd Enam

So, by now, you should have a fair idea of what a standard machine learning engineering task looks like. Ultimately, you will have to decide which model is the best for the problem at hand, and you will also need to explain why it is the best. At times, you will face situations where the number of possibilities to try is just too large, but the deadline is not far away. Getting the essence of the problem? Let's find out more.

How can the Machine Learning pipeline be automated?

You will continue this section with the notion of deciding on the best model for a problem, in a situation where the number of possibilities to try and the time left before the deadline pull against each other.

At the very beginning of this tutorial, you encountered the question "Can machine learning itself be automated?" This question is not at all silly. Even the great Sebastian Raschka, in one of his interviews, described Automated Machine Learning as "the automation of automating automation".

Revisit step 6 from the standard workflow of a data science task you just studied - tweak the parameters of the model. Say you have finished preparing the data for the further steps, and you have just initialized a classifier $X$. Now, say $X$ takes 5 different hyperparameters. You will have to try the same classifier $X$ with different sets of hyperparameter values, which is definitely not an easy task to execute. Now comes the more troubling part. After trying out various combinations, you discover that the results are not good enough. So you decide to test four more classifiers (each with 6 different hyperparameters). Can you imagine how time-consuming this can be? Even after that, what if you do not get good results? Investigating that will be nothing but another heavily time-consuming process.
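A bit of arithmetic shows how quickly this search space explodes. The paragraph above gives only the number of hyperparameters, so assume - purely for illustration - that each hyperparameter has 3 candidate values:

```python
# How quickly a manual hyperparameter search explodes.
# Assumption (not from the text above): each hyperparameter has 3 candidate values.
values_per_hyperparameter = 3

# Classifier X with 5 hyperparameters: one training run per combination
runs_for_x = values_per_hyperparameter ** 5
print(runs_for_x)  # 243

# Four more classifiers, each with 6 hyperparameters
runs_for_rest = 4 * values_per_hyperparameter ** 6
print(runs_for_rest)  # 2916

print(runs_for_x + runs_for_rest)  # 3159 training runs in total
```

Over three thousand training runs, each of which may take minutes to hours - and that is with only 3 values per hyperparameter.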

So the very idea of Automated Machine Learning comes from this problem.

"If numerous machine learning models must be built, using a variety of algorithms and a number of differing hyperparameter configurations, then this model building can be automated, as can the comparison of model performance and accuracy." - KDNuggets

You now have a reason why the term Automated Machine Learning is making a lot of headlines these days on the popular Data Science education forums. You will now study more about automated machine learning in the next section.

Introduction to automated machine learning:

The task of tuning hyperparameters for different machine learning models is highly time-consuming in nature. In more Computer Science-specific terms, hyperparameter tuning is a search process, and in this case the search can be extremely exhaustive. So what if this process could itself be automated? Well, that is essentially what automated machine learning does. "Automated machine learning is a direct solution to the shortage of data scientists as it can drastically increase the performance and productivity of data scientists by speeding up work cycles, improving model accuracy, and ultimately, even possibly replace the need for data scientists." - Automated Machine Learning for Internet of Things
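To see what "automating the search" means in its simplest form, here is a sketch of random search - one basic ingredient of AutoML systems (real systems use smarter strategies such as Bayesian optimization, and the model, dataset, and parameter grids below are illustrative assumptions):

```python
# A minimal sketch of automated hyperparameter search: random search.
# The machine, not you, iterates over candidate configurations.
import random

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
random.seed(0)

best_score, best_params = 0.0, None
for _ in range(10):  # try 10 random configurations instead of an exhaustive grid
    params = {
        "max_depth": random.choice([4, 8, 16, None]),
        "min_samples_split": random.choice([2, 5, 10]),
    }
    model = DecisionTreeClassifier(**params, random_state=0)
    score = cross_val_score(model, X, y).mean()  # 5-fold cross-validated accuracy
    if score > best_score:
        best_score, best_params = score, params

print(best_score, best_params)
```

Libraries like Auto-Keras push this idea much further, searching over whole model architectures rather than a fixed parameter grid.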

You now have enough knowledge about automated machine learning, and you are ready to see it in action. But first, let's see what some of the widely used Python libraries for doing automated machine learning are.

Python libraries for automated machine learning:

There are many Python libraries available for performing automated machine learning. All of them attempt to achieve more or less the same goal, that of automating the machine learning process. Following are some of the most widely used Python libraries for automated machine learning:

  • Auto-Keras
  • auto-sklearn
  • TPOT
  • H2O AutoML

Each of these libraries has its own approach to tackling the "automation of automating automation". But for this tutorial, you will use Auto-Keras. Why wait then? Let's do it.

Introduction to auto-keras:

"Auto-Keras is an open source software library for automated machine learning."(Source) It is being developed by DATA Lab at Texas A&M University and community contributors. According to the official site of auto-keras- "The ultimate goal of this automated machine learning is to provide easily accessible deep learning tools to domain experts with limited data science or machine learning background. Auto-Keras provides functions to automatically search for architecture and hyperparameters of deep learning models."

Source: Auto-Keras

To install auto-keras, just run the following command.

Note: Currently, Auto-Keras is only compatible with Python 3.6.

!pip install autokeras
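Because of that Python 3.6 restriction, it is worth checking your interpreter version before installing. A small sketch (not from the Auto-Keras docs):

```python
import sys

# Auto-Keras (at the time of writing) targets Python 3.6 only,
# so inspect the running interpreter version before installing.
major, minor = sys.version_info[:2]
print(f"Running Python {major}.{minor}")
if (major, minor) != (3, 6):
    print("Warning: this Auto-Keras release officially supports Python 3.6 only.")
```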

Now that you have successfully installed Auto-Keras, it's time for some quick implementation.

A case study of AutoML using auto-keras:

For this case study, you will use the very popular MNIST dataset. keras has this dataset built-in. So, you don't need to download it separately. You will start off by loading the ImageClassifier module of auto-keras. You will also load the MNIST dataset from keras module.

from keras.datasets import mnist
from autokeras import ImageClassifier

You loaded the MNIST dataset from keras.datasets module and you also imported ImageClassifier from auto-keras. You will now separate out the dataset into train and test splits.

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(x_train.shape + (1,)) # (1,) denotes the channels, which is 1 in this case
x_test = x_test.reshape(x_test.shape + (1,)) # (1,) denotes the channels, which is 1 in this case
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
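The reshape step above only appends a channel axis to the array's shape. You can see exactly what it does with a small dummy array (a stand-in here, not the real MNIST data):

```python
import numpy as np

# A small stand-in for x_train: MNIST images load as (num_samples, 28, 28).
dummy = np.zeros((5, 28, 28))

# Appending (1,) to the shape adds an explicit single-channel axis,
# which image models generally expect: (num_samples, 28, 28, 1).
reshaped = dummy.reshape(dummy.shape + (1,))
print(reshaped.shape)  # (5, 28, 28, 1)
```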

You have separated out the train and test splits, and now you will fit ImageClassifier with x_train and y_train. You will test its performance on x_test and y_test.

# Instantiate the ImageClassifier class
clf = ImageClassifier(verbose=True, augment=False)
# Fit the train set to the image classifier
clf.fit(x_train, y_train, time_limit=12 * 60 * 60)
clf.final_fit(x_train, y_train, x_test, y_test, retrain=True)
# Summarize the results
y = clf.evaluate(x_test, y_test)
print(y * 100)

It's as simple as that. Just 4-5 lines of code, and you are done with a quick experiment. Well, it is not really that quick: the above code takes a considerable amount of time to execute. A decent hardware configuration for running Deep Learning experiments will undoubtedly help you. Google Colab is also an excellent starting point for this.

Let's now learn more about the parameters that you used in the above piece of code. You are going to refer to the documentation of auto-keras for this; below are the relevant excerpts from the documentation:

  • In the ImageClassifier():
    • verbose: A boolean of whether the search process will be printed to the output.
    • augment: A boolean value indicating whether the data needs augmentation. If not defined, it will use the value of Constant.DATA_AUGMENTATION, which is True by default.
  • In the fit() method:
    • time_limit: The time limit for the search in seconds.
  • final_fit(): Final training after the best architecture is found.
    • retrain: A boolean of whether to reinitialize the weights of the model.

Auto-Keras is an evolving library and still in its pre-release version. According to the official site, it supports the following major modules:

  • supervised: The base class for all supervised tasks.
  • bayesian: A GaussianProcessRegressor for bayesian optimization.
  • search: Base class of all searcher classes. Every searcher class can override its search function to implement its strategy.
  • graph: A class representing the neural architecture graph of a Keras model. Graph extracts the neural architecture graph from a Keras model. Each node in the graph is an intermediate tensor between layers, and each layer is an edge in the graph. Notably, multiple edges may refer to the same layer (e.g., an Add layer adds two tensors into one tensor, so it is related to two edges).
  • preprocessor: A class that can format data. This class provides ways to transform data's classification label into a vector.
  • model_trainer: A class that is used to train the model. This class can train a PyTorch model with the given data loaders. The metric, loss_function, and model must be compatible with each other. Please see the details in the Attributes.

Ending notes:

You have made it to the end. In this tutorial, you studied the process of machine learning in general and learned how it can be automated. You took a quick look at the libraries that are available for performing AutoML. You used auto-keras and saw what kind of high-level abstraction it offers and how easy it is to use.

This tutorial might give you the negative notion that AutoML will replace many data scientists once it becomes foolproof. Really? Give it another thought. AutoML actually frees data scientists from the burden of the iterative process of selecting the best model for a problem. In the meantime, a data scientist can focus more on the data itself, which is more essential. This interview of Randy Olson covers some beautiful insights on this topic. Make sure you give it a read, and you will feel motivated again. If you ever feel that you are unsure of the work that a machine learning practitioner does, just check this post out.

The field of automated machine learning is really making progress. For example, NAS (Neural Architecture Search) is an algorithm that searches for the best neural network architecture. Following are some links to resources that demonstrate state-of-the-art use cases of AutoML:


If you are interested in learning more about Deep Learning, take DataCamp's "Deep Learning in Python" course.
