Cursus
Whether you're tackling a complex project or a mundane, time-intensive task, GitHub Copilot can streamline your coding efforts. In this tutorial, you'll discover how GitHub Copilot works, explore its key features, and learn how it can significantly enhance your productivity and coding efficiency. Let's dive in.
The Understanding GitHub blog post is a great resource for those who are completely new to GitHub, consider reading it first if that's your case.
What is GitHub Copilot?
GitHub Copilot is a groundbreaking AI-based programming assistant launched by GitHub in 2021. It uses OpenAI's Codex model, which is a descendant of the GPT models that focused its training on a diverse range of programming languages and coding contexts. For this reason, GitHub Copilot is thought to be more capable than ChatGPT for code-writing tasks.
The most striking feature of GitHub Copilot is its seamless integration into popular development environments, such as Visual Studio Code. By embedding directly into the code editor, GitHub Copilot acts as a pair programmer that offers real-time suggestions, code completions, and recommendations, all of which we will explore in-depth.
Exploring GitHub Copilot Features
GitHub Copilot has many great time-saving features. Here is a list of the most important ones that can speed up the lifecycle of your data science project.
Chat interface in your editor
GitHub Copilot integrates a ChatGPT-like chat interface directly into your IDE, eliminating the need to switch between your editor and external websites to correct code.
Let's look at the following example: Here we ask GitHub Copilot to generate the code required to translate a text from English to Italian using the OpenAI model.
GitHub Copilot sidebar in VS Code IDE
GitHub Copilot for the command line interface
For those who wok in the terminal, GitHub Copilot in the CLI provides a chat-like interface within the command line. Here, we have asked Github Copilot to explain the command sudo apt-get
.
GitHub Copilot inside the command line interface
GitHub Copilot for docs
GitHub Copilot can provide AI-generated answers by sourcing information directly from documentation, which can save a lot of time and effort.
GitHub Copilot for Docs
AI-powered pull requests
It's no surprise that GitHub Copilot integrates with GitHub. GitHub Copilot provides a feature to describe the changes in a GitHub repository and review them for a pull request.
GIF based on video taken from GitHub Next
How to Get Started Using GitHub Copilot
Now that we have explored GitHub Copilot's impressive features, let's learn how to set up and use it in Visual Studio Code. To do this, we first need to take care of two administrative tasks: We need to install Visual Studio Code and sign up for and install GitHub Copilot.
Installing Visual Studio Code
We install VS Code by visiting the Visual Studio Code website and following the instructions. The website includes how-to videos if you have trouble.
Downloading Visual Studio Code
Signing Up for GitHub and Installing GitHub Copilot
To install GitHub Copilot, we first need to create a GitHub account. If you're looking to test GitHub Copilot without a long-term commitment, consider opting for the free 30-day trial.
Signing Up for GitHub
Setting up GitHub Copilot with Visual Studio Code
We then enter Visual Studio Code and install two extensions from the marketplace: GitHub Copilot and GitHub Copilot Chat. You just need to press the “Install” button and sign in to GitHub.
Setting up GitHub Copilot
Using GitHub Copilot Inside Visual Studio Code
To test GitHub Copilot, we will use the Seoul Bike Sharing dataset, one of many curated datasets through DataLab. Our objective is to predict the number of public bikes rented per hour in Seoul’s bike-sharing system based on weather information, such as temperature, humidity, wind speed, and other variables.
Using GitHub Copilot to import data into VS Code
Let's start by importing a CSV file and viewing the first five rows. GitHub Copilot gets to work right away by autofilling the suggested CSV. We press "Tab" to accept its suggestions.
Importing data into Github Copilot
Using GitHub Copilot to display a plot
As a next step, we choose to create a visualization. A correlation matrix with a heatmap is as good a choice as any to illustrate GitHub Copilot's intelligence. We see that GitHub Copilot not only writes the code for our correlation matrix but also finishes our sentence when we make the request.
Writing code for a plot in Github Copilot
Oops - we have obtained an error because we didn’t remove the categorical variables from the correlation matrix, which is a common mistake. We can fix this error by adding a comment and a new piece of code. GitHub Copilot finds the correct columns to remove, correcting the error.
Displaying a plot in Github Copilot
Using GitHub Copilot to prepare data for training
After exploring the data, it’s time to preprocess it before training our model. For this exercise, we choose an ordinary least squares linear regression. To do this, we need to encode the categorical variables using one-hot encoding.
As we type our request, GitHub Copilot makes predictions and code suggestions. It even started to work with us as we had second thoughts about including one of our variables.
Preparing data in Github Copilot
GitHub Copilot helps us with all of the necessary steps in our workflow, including choosing our independent variables, splitting our data into training and test sets, and cleaning our data to be ready for our model.
Creating a train / test split in Github Copilot
Using GitHub Copilot to evaluate our model
The last phase of our mini-project is to train and evaluate our linear regression model. GitHub Copilot helps us find model statistics on our training data and then evaluates the model's performance on the testing set.
Training our model in Github Copilot
Evaluating our model in Github Copilot
The mean squared error is higher than we expected, which makes us consider another model. We switch to a random forest model, to test the output, and we see the error is much lower than before. If you want to explore these models in much greater detail, check out our Machine Learning Fundaments in Python course.
Viewing model statistics in Github Copilot
Exploring Github Copilot Plans and Pricing
There are three different GitHub Copilot’s plans available depending on your needs.
- Copilot Individual is the least expensive plan. It allows you to use GitHub Copilot in an IDE or on the command line. It’s free for students and teachers. All the features covered in our tutorial are included in this plan.
- Copilot Business is a subscription appropriate for business purposes. It allows access to GitHub Copilot’s services as a member of the organization.
- Copilot Enterprise is the most complete plan for larger enterprise accounts that need additional customization.
Github Copilot pricing structure
GitHub Copilot Alternatives
Let's now explore some compelling alternatives to GitHub Copilot. The following three companies are all at the forefront of the generative AI revolution and provide generative AI solutions to assist with code creation.
The DataLab IDE
- DataLab: DataCamp's very own DataLab is an AI-enabled notebook. Simply attach the data source, ask the AI what you need, and get insights. The required notebooks are already installed. DataLab is perfect for novices looking to learn as well as business professionals needing to leverage AI to create compelling presentations for decision-makers.
- TabNine: TabNine is an alternative that also provides AI code completions and AI chat agents and it works with many popular IDEs.
- SonarQube: SonarQube is geared towards software development. With SonarQube, developers would upload data and use SonarQube to receive AI-assisted and quality-assured code.
Would you like to get GitHub certified? Check out our comprehensive guide on the different GitHub certifications!
Conclusion
We've just finished a complete data science project in a matter of minutes using GitHub Copilot. It proves to be a useful asset for speeding up all aspects of the data science workflow, everything from displaying plots to model building with a train/test workflow.
If you found this tutorial helpful and want to get started with GitHub Copilot, we recommend DataCamp's video Pair Programming with GitHub Copilot. Another course, GitHub concepts, will be a great companion course, especially if you feel unsure about GitHub.
Thanks for reading!