DataCamp Workshop: Using Feature Stores for Managing Feature Engineering in Python
Set up a Free FeatureByte Tutorial Cloud Account
- In your browser, open tutorials.featurebyte.com/tutorial/sign-up, enter your details, and click the Sign Up button.
- Check your email inbox and open the email from FeatureByte asking you to verify your account. Click on the Verify Email link.
- Once again, check your email inbox for an email from FeatureByte and copy the API token you were sent.
- Store your API token in an environment variable named API_TOKEN.
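As a minimal sketch of the last step, the token can be read from the environment with a clear error if it is missing. The helper name `read_api_token` is hypothetical and is not part of the FeatureByte SDK; only the variable name API_TOKEN comes from the instructions above.

```python
import os


def read_api_token(var_name: str = "API_TOKEN") -> str:
    """Return the token stored in an environment variable (hypothetical helper)."""
    # Treat a missing variable and a blank value the same way
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Environment variable {var_name} is not set or is empty."
        )
    return token
```

Reading the token this way keeps it out of the notebook itself, so the notebook can be shared without exposing the credential.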
Install the FeatureByte Library
!pip install -U featurebyte==0.4.2
Import FeatureByte Libraries
FeatureByte exposes an API, and a Python SDK is available to interface with it.
Load the featurebyte library and connect to the FeatureByte tutorial cloud instance.
# library imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import math
import os
from datetime import datetime
# load the featurebyte SDK
import featurebyte as fb
# load your API token from the API_TOKEN environment variable
api_token = os.environ["API_TOKEN"]
# register your API token
fb.register_tutorial_api_token(api_token)
# this script requires version 0.4.1 or higher
print("FeatureByte Version: " + fb.version)
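The version requirement noted in the comment above could also be checked in code rather than by eye. The following is a rough sketch only: `parse_version` and `meets_minimum` are hypothetical helpers, not SDK functions, and they assume the version is a plain dotted string such as "0.4.2".

```python
def parse_version(version: str) -> tuple:
    """Turn '0.4.2' into (0, 4, 2), ignoring non-numeric suffixes such as 'rc1'."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '1.0' so comparisons use three components
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.4.1") -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

In the notebook this might be called as `meets_minimum(fb.version)`, assuming `fb.version` is a string as the print statement suggests. Comparing tuples of integers avoids the pitfall of string comparison, where "0.10.0" would sort before "0.4.1".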
Load a new catalog
# get the helper functions to create a pre-built catalog
from credit_card_catalogs import *
# get the helper functions to plot individual examples
from plot_helper import *
# create a new catalog for this tutorial
catalog = create_demo_credit_card_catalog()
# get the views
[bank_customer_view, state_details_view, credit_card_view, card_transactions_view,
card_fraud_status_view, card_transaction_groups_view, purchases_view, interest_payments_view,
fraud_reports_view] = get_credit_card_views(catalog)
Case Study: Credit Card Fraud
- A bank provides credit cards to its customers
- The bank wants to identify transactions that are likely to be fraudulent
- They have historical data about the customers, credit cards, transactions, and fraud
The challenge is to quickly create a diverse yet intuitive range of features that might be helpful in predicting which transactions are fraudulent.
The Data Model
List the Entities and Relationships
Test Yourself: List all the entities in the catalog
https://docs.featurebyte.com/0.3/reference/featurebyte.api.catalog.Catalog.list_entities/
entities = catalog.list_entities()
print(entities)