
DataCamp Workshop: Using Feature Stores for Managing Feature Engineering in Python

Set up a Free FeatureByte Tutorial Cloud Account

  1. In your browser, open tutorials.featurebyte.com/tutorial/sign-up, enter your details, and click the Sign Up button.
  2. Check your email inbox and open the email from FeatureByte asking you to verify your account. Click on the Verify Email link.
  3. Once again, check your email inbox for an email from FeatureByte and copy the API token you were sent.
  4. Store your API token in an environment variable named API_TOKEN.
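
Step 4 can be checked from Python before going any further. The snippet below is a minimal sketch, not part of the FeatureByte SDK: load_api_token is a hypothetical helper name, and it only assumes the variable is called API_TOKEN as described above.

```python
import os


def load_api_token(env=os.environ, name="API_TOKEN"):
    """Return the token stored under `name`, or fail with a clear message.

    `load_api_token` is a hypothetical helper for this workshop,
    not a FeatureByte function.
    """
    # strip stray whitespace that often sneaks in when pasting from an email
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {name} is not set")
    return token
```

If the variable is not set yet, one option in a notebook is to assign it for the current session with os.environ["API_TOKEN"] = "..." in a cell that you do not save or share.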

Install the FeatureByte Library

!pip install -U featurebyte==0.4.2

Import FeatureByte Libraries

FeatureByte exposes an API, and its Python SDK lets us work with that API from a notebook.

Load the featurebyte library and register your API token so the SDK can connect to your FeatureByte tutorial account

# library imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
import os
# import only the datetime class; a separate "import datetime" would be shadowed by this line
from datetime import datetime

# load the featurebyte SDK
import featurebyte as fb

# load your API token from the API_TOKEN environment variable (never hard-code it in the notebook)
api_token = os.environ["API_TOKEN"]

# register your API token
fb.register_tutorial_api_token(api_token)
# this script requires version 0.4.1 or higher
print("FeatureByte Version: " + fb.version)
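
The version requirement in the comment above can also be enforced rather than only printed. This is a rough sketch under stated assumptions: version_tuple and meets_minimum are hypothetical helper names, and the version string is assumed to look like "0.4.2", optionally with a suffix such as "rc1".

```python
def version_tuple(version):
    """Turn a string such as '0.4.2' into a comparable tuple such as (0, 4, 2)."""
    parts = []
    for piece in version.split(".")[:3]:
        # keep only the leading digits, so '1rc1' is read as 1
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # pad short versions such as '0.4' to three components
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum="0.4.1"):
    """Return True when the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)
```

In the notebook, meets_minimum(fb.version) could then guard the remaining cells, for example by raising an error when it returns False.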

Load a new catalog

# get the helper functions to create a pre-built catalog
from credit_card_catalogs import *

# get the helper functions to plot individual examples
from plot_helper import *

# create a new catalog for this tutorial
catalog = create_demo_credit_card_catalog()

# get the views
[bank_customer_view, state_details_view, credit_card_view, card_transactions_view, 
    card_fraud_status_view, card_transaction_groups_view, purchases_view, interest_payments_view, 
    fraud_reports_view] = get_credit_card_views(catalog)

Case Study: Credit Card Fraud

  • A bank provides credit cards to its customers
  • The bank wants to identify transactions that are likely to be fraudulent
  • They have historical data about the customers, credit cards, transactions, and fraud

The challenge is to quickly create a diverse yet intuitive range of features that might be helpful in predicting which transactions are fraudulent.

The Data Model

List the Entities and Relationships

Test Yourself: List all the entities in the catalog
https://docs.featurebyte.com/0.3/reference/featurebyte.api.catalog.Catalog.list_entities/

entities = catalog.list_entities()
print(entities)