
Alteryx Tutorial: A Comprehensive Hands-On Guide for Data Analytics

Dive into our detailed Alteryx tutorial and learn how this powerful data analytics tool can transform your data handling experience. This guide covers everything from installation to advanced workflow automation in Alteryx, making it the perfect resource for beginners and seasoned data professionals alike.
Updated Jan 2024  · 11 min read

Imagine a tool that allows you to access, clean, test, combine, analyze, and output data much more easily than SQL, Microsoft Excel, or similar tools…

Now, stop imagining. Alteryx is that tool.

Namely, Alteryx is a powerful data analytics and ETL tool that enables teams to build data processes efficiently in a repeatable, less error-prone, and less risky way.

In this tutorial, we will look at what Alteryx is and then dive into a hands-on approach to how to use it.

Let’s start by properly defining what Alteryx is.

What is Alteryx?

We’ve got a full guide covering what Alteryx is. However, in brief, Alteryx is a data analytics and visualization tool designed to simplify advanced analytics automation and make it accessible to all data professionals.

To be more specific, Alteryx is a tool that enables users to prepare, blend, and analyze data from various sources without extensive coding knowledge as a prerequisite. Leveraging the drag-and-drop interface, users can create complex workflows by integrating their data from various sources, cleaning and transforming it before performing advanced analytics and visualization.

The main benefit of adopting Alteryx to create your workflows is that it reduces manual effort by automating your data analytics processes. These workflows can be saved and reused later, which makes tasks such as data processing and analytics easier to replicate. Automation also reduces the risk of human error that comes with manual data manipulation.

We’ll be getting hands-on for the remainder of the tutorial; follow along to help make the lesson stick.

Installing Alteryx

The installation process is simple; follow these steps to install Alteryx on your desktop.

Step 1

Navigate to the Alteryx website. Select “Products” from the menu, and navigate to “Alteryx Analytics Cloud Platform” under the “Platform Overview” section. Click on it to be taken to the next screen.

Step 2

Select the “Start Free Trial” option. This will open a page where you can opt to start a “Desktop Trial” or a “Cloud Trial.” For our tutorial, we will use the “Desktop Trial,” so select that option. Note that the free trial is valid for 30 days.

The free trial page for the Alteryx Analytics Cloud Platform

Step 3

Fill in the details about yourself on the next page, then select “Submit.” This will start the download.

Step 4

Open the .exe to start the setup when the download is complete. There will be two options on the screen: select the typical download and click next. This will complete the setup and begin the installation. You’ll be prompted to accept a user license – read through before you do so – and choose where you want to save the program on your system. Pick what’s best for you.

The setup page for Alteryx

Step 5

Run Alteryx to open the platform. Upon opening, a prompt will come up to request your Alteryx Designer Activation. Select “Start Free Trial” and insert your email.

The Alteryx Designer platform and activation pop-up.

Step 6

You’ll be asked for your details to get your trial activation code. Fill them in, then select “activate,” and voila!

Trial activation form

You’re now ready to start solving.

Workflow Canvas

The Alteryx Workflow Canvas is marked in red

Workflows are built in the Workflow Canvas area. For clarity, a workflow is a series of tools that each perform a function to process data. Relative paths to data sources can be saved within the workflow, which lets you share it with teammates, for example by saving it in a shared drive.

Note: each workflow is saved as a YXMD file type.

Building Your First Alteryx Workflow

When you open the Alteryx Designer interface, a Workflow will be initiated by default, but let's assume this doesn't happen for you. To build a new workflow, navigate to “File” at the top left-hand corner and select “New Workflow.” This will create a tab in the Workflow Canvas for your new workflow.

Creating a new workflow

As stated above, a workflow is a series of connected tools performing different data-processing functions.

To begin building your workflow, drag the action you would like to perform from the palette and place it onto the canvas.

Selecting the Input Data tool

To connect a tool to your existing workflow, drag it from the palette onto the canvas and place it near the output anchor of another tool. It’s also possible to drag the output anchor from your existing tool to your recently added tool, so pick whichever feels most natural.

Connecting a tool; note this raises an error with the Input Data tool because no input data was defined.

Connections move in a downstream direction, which could either be from left to right or from top to bottom. It’s all based on the workflow layout you select in the Workflow Configuration window.

Some tools accept multiple inputs, and a gray input anchor indicates that the input is optional. Lastly, any tool with an output anchor can feed multiple downstream streams.
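
One way to picture these connection rules is to treat a workflow as a directed graph in which data flows from each tool to the tools downstream of it. The Python sketch below only illustrates that idea; it is not how Alteryx stores or executes workflows, and the tool names and the `downstream` mapping are made up for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a tool, each value lists the tools
# connected to its output anchor. One output can feed several streams.
downstream = {
    "Input Data": ["Browse", "Data Cleansing"],
    "Data Cleansing": ["Formula"],
    "Formula": [],
    "Browse": [],
}

def run_order(downstream):
    """Return the tools in an order where every tool comes after its inputs."""
    # TopologicalSorter expects node -> predecessors, so invert the mapping.
    upstream = {tool: set() for tool in downstream}
    for tool, targets in downstream.items():
        for target in targets:
            upstream.setdefault(target, set()).add(tool)
    return list(TopologicalSorter(upstream).static_order())

order = run_order(downstream)
print(order)
```

Whatever order is printed, "Input Data" comes first and "Formula" comes after "Data Cleansing", which is the downstream direction described above.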

Data Preparation in Alteryx

Data preparation, or pre-processing as it’s sometimes called, is the act of manipulating raw data into a form that can be readily and accurately analyzed or used as input into a machine learning model.

People rarely celebrate this aspect of being a data professional. Still, it takes up a significant amount of time and is one of the key components of successful data analytics and machine learning.

Alteryx makes it extremely simple to perform data preparation. With a few clicks, you can acquire your data, clean it, perform joins, and implement transformations.

Let’s go through the process of each step.

Data acquisition

Before we can start preparing our data, we must first acquire it — this can be from various sources such as a cloud data warehouse or data lake.

For our example, we will use one of the sample datasets in Alteryx. To do this, start by dragging the “Input Data” tool from the palette onto the canvas. This will open a configuration bar on the left-hand side.

Select Set Up a Connection > Files > Alteryx Database (.yxdb) > TutorialData.yxdb

Acquiring data from the Alteryx database

Once the data has been pulled into Alteryx, analysts and data scientists would typically begin their examinations and data profiling to better understand the data at their disposal.

We can do this by dragging the “Browse” tool into the canvas, connecting it to the anchor of the input data, and then running the workflow.

Now, you can select various columns from the preview window to view the data quality in that column.

Browsing the input data

Notice that there is a value with trailing whitespace in our data's “Last” name column.

The length statistics of the “Last” column in our data.

We can handle this with some data cleansing.
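
For readers who want to see what this profiling check amounts to, here is a minimal Python sketch that counts values ending in whitespace. It is an illustration only, not what the Browse tool runs internally; the function name and the sample values are assumptions made for the example.

```python
def count_trailing_whitespace(values):
    """Count strings that end with at least one whitespace character."""
    return sum(1 for v in values if isinstance(v, str) and v != v.rstrip())

# Hypothetical sample standing in for the "Last" column.
last_names = ["Smith", "Jones ", "Nguyen", "Garcia"]

print(count_trailing_whitespace(last_names))  # 1
```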

Data cleansing

Data cleansing is the act of cleaning poorly structured data to improve its quality. It involves procedures such as:

  • Correcting entry errors
  • Handling missing data
  • Masking sensitive or confidential information
  • Handling duplicates or outliers

To perform data cleansing in Alteryx, drag the “Data Cleansing” tool from the palette and connect it to the output anchor of your input data.

In the configuration area, deselect every option except the “Last” column and “Leading and Trailing Whitespace” under the “Remove Unwanted Characters” heading.

Next, run the workflow to execute the command.

The configuration for the Data Cleansing tool

To check if the cleansing was performed correctly, click the “Browse” tool and select the “Last” column from the preview window.

Checking values with trailing whitespace after execution

Notice the “Values with Trailing Whitespace” parameter says “0,” meaning the action was successful.
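
The same cleanse-then-verify pattern can be sketched in Python. This is a rough equivalent under stated assumptions, not the Data Cleansing tool's implementation: the `strip_whitespace` helper and the sample rows are hypothetical, and only the chosen column is modified, mirroring the configuration above.

```python
def strip_whitespace(records, column):
    """Return copies of the records with leading/trailing whitespace removed from one column."""
    cleaned = []
    for record in records:
        updated = dict(record)  # copy so the input rows are left untouched
        value = updated.get(column)
        if isinstance(value, str):
            updated[column] = value.strip()
        cleaned.append(updated)
    return cleaned

# Hypothetical rows; only the "Last" column is cleaned.
rows = [
    {"First": " Ana", "Last": "Jones "},
    {"First": "Ben", "Last": "  Smith"},
]
cleaned_rows = strip_whitespace(rows, "Last")

# Re-run the profiling check: how many values still end in whitespace?
remaining = sum(1 for r in cleaned_rows if r["Last"] != r["Last"].rstrip())
print(remaining)  # 0
```

Note that the "First" column keeps its leading space, because it was not selected for cleansing.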

Data Transformation in Alteryx

Data can come in various shapes, sizes, and structures. Sometimes, it may be ready to dive straight in with analysis, but that’s typically in data competitions. Data is messy in the real world, and it’s your responsibility, as a data professional, to format it so that it can be queried to derive meaningful insights.

The common data transformations are:

  • Pivoting
  • Setting data types
  • Aggregations

The transformation we’re going to do is convert each user’s date of birth (DOB) to their age. To do this in Alteryx, drag the “Formula” tool from the palette and connect it to the output anchor of the “Data Cleansing” tool.

In the configuration panel, you’ll be told to “select a column.” Select “Add a column” from this drop-down list and title it “Age.”

To calculate a person’s age, we must subtract their Birth Date from the current date. To do this, enter “DateTimeDiff” where it says, “Enter Expression here.”

Replace “dt1” with “DateTimeToday()” and “dt2” with “[Birth Date].” The “u” in the expression stands for units; the unit we’re working with in this instance is years, so replace the “u” with that. The finished expression should read DateTimeDiff(DateTimeToday(), [Birth Date], "years").

The last thing you must do is change the Data type to “Int16.”

How your configuration box should look
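
To make the date arithmetic concrete, here is a small Python sketch that computes age as completed years between a birth date and a reference date. It is an analogy to the Formula tool step, not a reimplementation of DateTimeDiff; the `age_in_years` function is hypothetical, and Alteryx may treat edge cases such as leap-day birthdays differently.

```python
from datetime import date

def age_in_years(birth_date, today):
    """Completed years between birth_date and today."""
    years = today.year - birth_date.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Fixed "today" so the example is reproducible; in practice pass date.today().
print(age_in_years(date(1990, 6, 15), date(2024, 1, 10)))  # 33
print(age_in_years(date(1990, 6, 15), date(2024, 6, 15)))  # 34
```

The argument order matters here just as it does in the expression: the earlier date (the birth date) is subtracted from the later one.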

Great! Now you know how to set up a basic workflow in Alteryx.

Automating Workflows with Alteryx

Workflow automation is the use of software to complete tasks without the need for human input. It’s a tool frequently used in business to speed up processes and reduce the need for manual work and repetitive tasks.

We can automate workflows in Alteryx with Batch Macros and Scheduling.

Batch macros

Batch processing refers to a method used by computers to periodically complete high-volume, repetitive data jobs, typically when compute resources are experiencing low demand. We can perform batch processing in Alteryx using a batch macro.

A batch macro is a tool that runs multiple times in a workflow and creates an output after each run. The macro runs once for each record (or a selected group of records) in the data, and the inputs can be configured to be used in every run or only in specific runs.
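
The "run once per group and collect each output" behavior can be sketched in plain Python. This is a conceptual illustration only, not Alteryx's macro engine; `run_batches`, `total_for_region`, and the sales records are all hypothetical names and data.

```python
from collections import defaultdict

def run_batches(records, group_key, process):
    """Call `process` once per group of records and collect every run's output."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    outputs = []
    for key, batch in groups.items():  # one "run" per group
        outputs.append(process(key, batch))
    return outputs

# Hypothetical data and per-batch logic: total sales for each region.
sales = [
    {"region": "North", "amount": 120},
    {"region": "South", "amount": 80},
    {"region": "North", "amount": 30},
]

def total_for_region(region, batch):
    return {"region": region, "total": sum(r["amount"] for r in batch)}

results = run_batches(sales, "region", total_for_region)
print(results)
# [{'region': 'North', 'total': 150}, {'region': 'South', 'total': 80}]
```

Here the grouping field plays the role of the value that decides which records belong to each run.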

Creating a Batch macro is simple. Navigate to the Workflow Configuration tab and select “Workflow” from the headings. Under the Workflow heading, you’ll see a “Type” sub-heading – change the selection to Macro, and select “Batch Macro” from the dropdown list.

Setting up Batch macro

Once the workflow is saved as a Batch macro, each tool in the workflow will receive a lightning bolt anchor, and only interface tools can connect to them.

Scheduling workflows

It’s also possible to schedule workflows, applications, or packages in Alteryx. Scheduling is the act of assigning resources to perform tasks automatically at a designated frequency, date, and time.
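
As a rough illustration of what "frequency, date, and time" means in practice, the Python sketch below lists the first few run times of a recurring schedule. It does not use or represent the Alteryx scheduler; the `next_runs` function and the example schedule are assumptions made purely for illustration.

```python
from datetime import datetime, timedelta

def next_runs(start, every, count):
    """Return the first `count` run times, beginning at `start` and repeating every `every`."""
    return [start + i * every for i in range(count)]

# Hypothetical schedule: daily at 06:30, starting 1 March 2024.
first_run = datetime(2024, 3, 1, 6, 30)
for run_time in next_runs(first_run, timedelta(days=1), 3):
    print(run_time.isoformat())
# 2024-03-01T06:30:00
# 2024-03-02T06:30:00
# 2024-03-03T06:30:00
```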

Users can decide where they would like their scheduled workflows to run, but it’s usually based on their company's configuration. The two options are:

  • Alteryx Server: Schedule to your company's Server or a controller.
  • Designer plus Desktop Automation (Scheduler): Schedule to your computer.

Scheduling a workflow in Alteryx is quite straightforward. Simply open the workflow you would like to schedule and select the “Add Workflow to Schedule” icon that is next to the “Run” icon at the top of the canvas.

The “Add Workflow to Schedule” icon

You can also schedule a workflow by navigating to “Options” and selecting “Schedule Workflow.”

Alteryx Best Practices

Like any tool, Alteryx has a set of best practices to ensure you produce the best outcomes. We will look at five, but you can check out the PDF containing 24 Best Practices shared in the Alteryx community if you want to know more.

1. Remove all browse tools

The browse tool is extremely helpful during the development phase since it enables users to review the entire dataset from a connected tool. However, once you’ve completed your workflow, browse tools are no longer useful, for two reasons:

  1. They clutter the overview of the workflow.
  2. They create temporary .yxdb (Alteryx database) files, which slows processing.

2. Select correct data types & variable names

When you implement this best practice, you’ll have a good overview of your variables, and you will save time since you won’t be attempting to perform invalid transformations; for example, you will not attempt to perform a numeric operation on a string.
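
The kind of failure this best practice avoids is easy to reproduce in any typed environment. The Python sketch below is an analogy rather than Alteryx behavior: the `add_tax` function and the sample value are hypothetical, and Alteryx reports such problems through its own errors and warnings.

```python
def add_tax(amount, rate=0.2):
    """Return amount plus tax; only valid for numeric input."""
    return amount + amount * rate

raw_value = "100"  # a number that was read in as text

try:
    add_tax(raw_value)
except TypeError as error:
    print("Invalid operation on a string:", type(error).__name__)

# Converting to a numeric type first makes the operation valid.
print(add_tax(float(raw_value)))
```

Setting the correct type once, close to the input, means every tool downstream can rely on it.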

3. Documentation using descriptive titles

Be careful to document your workflows with descriptive titles. There are a number of reasons to do this; for example, it makes a workflow much easier to hand over to a client or colleague, and easier to pick up again yourself later.

If workflows depend on one another, there is plenty of value in numbering them. Numbered titles make it much easier to decode the dependencies and understand what’s going on without needing to open each workflow.

4. Deal with errors and warnings immediately

Dealing with errors and warnings as soon as they occur is an Alteryx best practice. The reason it’s so important is that it allows you to catch errors in your logic early before they corrupt your workflow.

Note that Alteryx displays errors by adding an exclamation mark under the tool where the error occurred. However, conversion errors and warnings can be found by observing the tool reference, displayed in parentheses after the tool name in the results pane.

5. Investigate data using subsamples

Limit the number of records you use when you are initially building out your workflow. This is important because it speeds up processing, thus saving you valuable time – especially when working with a large dataset.

To set a record limit, navigate to the Configuration window in the “Input Data” tool and specify a value.

The place to specify a record limit in your Input Data tool
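
The idea behind a record limit, reading only the first N records while you develop, can be sketched in Python as follows. This is an illustration of the concept rather than how the Input Data tool works; `read_limited` and the in-memory sample data are hypothetical.

```python
import csv
import io
from itertools import islice

def read_limited(lines, record_limit):
    """Parse CSV text and return at most `record_limit` data rows."""
    reader = csv.DictReader(lines)
    # islice stops after record_limit rows, so the rest is never parsed.
    return list(islice(reader, record_limit))

# Hypothetical in-memory data standing in for a large input file.
sample = io.StringIO("id,name\n1,Ana\n2,Ben\n3,Cleo\n4,Dev\n")

subset = read_limited(sample, record_limit=2)
print([row["name"] for row in subset])  # ['Ana', 'Ben']
```

Remember to remove or raise the limit before running the finished workflow on the full dataset.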

This will help you reach your objectives much faster.

Conclusion

Alteryx is a data analytics and visualization tool that was created to simplify advanced analytics automation and increase its accessibility to data professionals. Namely, users can leverage Alteryx to prepare, blend, and analyze data from various sources without coding knowledge. The main benefit of the tool is that users can reduce manual effort by building workflows that automate their analytics processes.

In this hands-on tutorial, we covered:

  • How to install Alteryx
  • Data preparation
  • Building your first workflow
  • Automating your workflow
  • Best practices

Thanks for reading!


Author
Kurtis Pykes