Skip to content
E-Commerce Data
  • AI Chat
  • Code
  • Report
  • E-Commerce Data

    This dataset consists of orders made in different countries from December 2010 to December 2011. The company is a UK-based online retailer that mainly sells unique all-occasion gifts. Many of its customers are wholesalers.

    Not sure where to begin? Scroll to the bottom to find challenges!

    import pandas as pd
    
    pd.read_csv("online_retail.csv")

    Data Dictionary

    VariableExplanation
    InvoiceNoA 6-digit integral number uniquely assigned to each transaction. If this code starts with letter 'c' it indicates a cancellation.
    StockCodeA 5-digit integral number uniquely assigned to each distinct product.
    DescriptionProduct (item) name
    QuantityThe quantities of each product (item) per transaction
    InvoiceDateThe day and time when each transaction was generated
    UnitPriceProduct price per unit in sterling (pound)
    CustomerIDA 5-digit integral number uniquely assigned to each customer
    CountryThe name of the country where each customer resides

    Source of dataset.

    Citation: Daqing Chen, Sai Liang Sain, and Kun Guo, Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining, Journal of Database Marketing and Customer Strategy Management, Vol. 19, No. 3, pp. 197-208, 2012 (Published online before print: 27 August 2012. doi: 10.1057/dbm.2012.17).

    Don't know where to start?

    Challenges are brief tasks designed to help you practice specific skills:

    • 🗺️ Explore: Negative order quantities indicate returns. Which products have been returned the most?
    • 📊 Visualize: Create a plot visualizing the profits earned from UK customers weekly, monthly, and quarterly.
    • 🔎 Analyze: Are order sizes from countries outside the United Kingdom significantly larger than orders from inside the United Kingdom?

    Scenarios are broader questions to help you develop an end-to-end project for your portfolio:

    You are working for an online retailer. Currently, the retailer sells over 4000 unique products. To take inventory of the items, your manager has asked you whether you can group the products into a small number of categories. The categories should be similar in terms of price and quantity sold and any other characteristics you can extract from the data.

    You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.