Speed Up Your Process Using the Workspace AI Assistant
Discover the power of our AI Assistant. Get started with exciting prompts that will supercharge your data workflow!
The sample dataset we'll use here consists of orders made with a UK-based online retailer from December 2010 to December 2011. Source of dataset.
To get started with the AI Assistant, follow these steps:
- Hover over the space between cells and add a new cell by clicking the "plus" icon or the line.
- Type in your first prompt.
- Click on "Ask AI" or press the return key.
1. Automatically Handle All Your Package Imports
To perform a machine learning classification task, you will need to import the following packages:
    import pandas as pd
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
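To see how these imports fit together, here is a minimal sketch of a classification workflow: split, scale, train, evaluate. It uses synthetic data from scikit-learn's `make_classification` as a stand-in, which is an assumption for illustration only; the retail dataset has no ready-made target column, so the features and labels here are not from the original workspace.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption: not the online retail dataset)
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

# Hold out 25% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train a logistic regression classifier and score it on the held-out rows
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
print(f"Test accuracy: {accuracy:.2f}")
```

Fitting the scaler on the training rows alone keeps information from the test set out of the model, which is why `transform` rather than `fit_transform` is used on `X_test`.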
2. Build Beautiful Visualizations
Try this prompt:
    import pandas as pd
    import plotly.express as px

    # Load the dataset
    df = pd.read_csv("online_retail.csv")

    # Convert the InvoiceDate column to datetime
    df['InvoiceDate'] = pd.to_datetime(df['InvoiceDate'])

    # Filter the data for the year 2011
    df_2011 = df[df['InvoiceDate'].dt.year == 2011]

    # Group the data by month and calculate the total sales
    monthly_sales = df_2011.groupby(df_2011['InvoiceDate'].dt.month)['Quantity'].sum().reset_index()

    # Create the plot
    fig = px.bar(monthly_sales, x='InvoiceDate', y='Quantity', labels={'InvoiceDate': 'Month', 'Quantity': 'Sales'})

    # Show the plot
    fig.show()
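To make the filter-and-group step easier to follow, the sketch below runs the same pandas logic on a few hypothetical rows instead of the CSV file. The four rows are invented for illustration; only the `InvoiceDate` and `Quantity` column names come from the code above.

```python
import pandas as pd

# Hypothetical rows standing in for online_retail.csv
df = pd.DataFrame({
    "InvoiceDate": [
        "2010-12-01 08:26",
        "2011-01-04 10:00",
        "2011-01-15 12:30",
        "2011-02-03 09:15",
    ],
    "Quantity": [6, 10, 5, 8],
})

# Parse the timestamps, keep 2011 only, then sum Quantity per calendar month
df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
df_2011 = df[df["InvoiceDate"].dt.year == 2011]
monthly_sales = (
    df_2011.groupby(df_2011["InvoiceDate"].dt.month)["Quantity"]
    .sum()
    .reset_index()
)
print(monthly_sales)
```

After `reset_index()`, the `InvoiceDate` column holds month numbers rather than dates, which is why the plot relabels it as "Month". Since the code sums `Quantity`, the "Sales" axis counts units sold rather than revenue.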
    SELECT Country, COUNT(*) AS Purchase_Count
    FROM online_retail
    GROUP BY Country
    ORDER BY Purchase_Count DESC
    LIMIT 3;
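If you want to try this query outside the workspace, one option is Python's built-in `sqlite3` module. The sketch below is an illustration under assumptions: the in-memory table, its two-column schema, and the sample rows are all hypothetical, and the workspace itself may run SQL against a different engine.

```python
import sqlite3

# In-memory database with a hypothetical, simplified online_retail table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE online_retail (InvoiceNo TEXT, Country TEXT)")

# Made-up rows: 4 from the United Kingdom, 3 Germany, 2 France, 1 Spain
rows = (
    [(f"A{i}", "United Kingdom") for i in range(4)]
    + [(f"B{i}", "Germany") for i in range(3)]
    + [(f"C{i}", "France") for i in range(2)]
    + [("D0", "Spain")]
)
conn.executemany("INSERT INTO online_retail VALUES (?, ?)", rows)

# Same query as above: count rows per country and keep the top three
query = """
    SELECT Country, COUNT(*) AS Purchase_Count
    FROM online_retail
    GROUP BY Country
    ORDER BY Purchase_Count DESC
    LIMIT 3;
"""
top_countries = conn.execute(query).fetchall()
conn.close()

print(top_countries)
```

Note that `COUNT(*)` counts rows, so on the real dataset each line item of an invoice would count as one purchase.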
Summary of Analysis
In this workspace, we performed a machine learning classification task using the Logistic Regression algorithm. We imported the necessary packages such as pandas, numpy, and sklearn. We also split the data into training and testing sets, scaled the features using StandardScaler, trained the model, and evaluated its accuracy.
Additionally, we built a beautiful visualization using the plotly.express library. We loaded the "online_retail.csv" dataset, converted the InvoiceDate column to datetime, filtered the data for the year 2011, and grouped the data by month to calculate the total sales. We then created a bar plot to visualize the monthly sales.
Lastly, we executed a SQL query to select the top three countries with the highest purchase count from the "online_retail" table.
Overall, this workspace showcased the use of various data analysis techniques, including machine learning, data visualization, and SQL queries.
5. Format Your Code
Directly below the code cell that follows, try this prompt:
    # Update the cell above to follow PEP 8 standards

    import pandas as pd
    import plotly.express as px

    # Load the dataset
    df = pd.read_csv("online_retail.csv")

    # Convert the InvoiceDate column to datetime
    df['InvoiceDate'] = pd.to_datetime(df['InvoiceDate'])

    # Filter the data for the year 2011
    df_2011 = df[df['InvoiceDate'].dt.year == 2011]

    # Group the data by month and calculate the total sales
    monthly_sales = df_2011.groupby(df_2011['InvoiceDate'].dt.month)['Quantity'].sum().reset_index()

    # Create the plot
    fig = px.bar(monthly_sales, x='InvoiceDate', y='Quantity', labels={'InvoiceDate': 'Month', 'Quantity': 'Sales'})

    # Show the plot
    fig.show()
result=5+5;print(result)
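For reference, a PEP 8-style version of the one-line cell above might look like the following. This is a hand-written sketch of the expected result, not the Assistant's actual output, which may differ in comments or layout.

```python
# One statement per line, with spaces around the operators
result = 5 + 5
print(result)  # prints 10
```

PEP 8 discourages compound statements joined with semicolons and recommends a single space on each side of assignment and arithmetic operators.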
Looking for more prompts to try? The following tutorial has more: 10 Ways to Speed Up Your Analysis With the Workspace AI Assistant