Skip to content
pip install kagglehub
import kagglehub
import pandas as pd
import sqlite3

path = kagglehub.dataset_download("naufalhawarinor/farmhill")

csv_file = f"{path}/farmhill_cleann.csv"

data = pd.read_csv(csv_file)

# Create SQLite in-memory database and load data into SQL
conn = sqlite3.connect(':memory:')  # Create in-memory SQLite database
data.to_sql('farmhill', conn, index=False, if_exists='replace')

print("Data loaded into SQLite database.")
  1. Carilah Customer dengan total serapan lebih dari 500kg!
query = """
SELECT 
    Customer,
    SUM("Jumlah Real (Kg)") AS Total_Serapan
FROM 
    farmhill
GROUP BY 
    Customer
HAVING 
    Total_Serapan > 500
ORDER BY 
    Total_Serapan DESC;
"""
result = pd.read_sql_query(query, conn)
print(result)
  1. Buatlah Query untuk menghitung rata-rata harga per varian per kategori item!
query = """
SELECT 
    "Kategori Item",
    "Nama Item",
    ROUND(AVG(Harga), 0) AS Rata_Rata_Harga
FROM 
    farmhill
GROUP BY 
    "Kategori Item", 
    "Nama Item";
"""
result = pd.read_sql_query(query, conn)
pd.set_option('display.max_rows', None)  # Menampilkan semua baris
pd.set_option('display.max_columns', None)  # Menampilkan semua kolom
pd.set_option('display.width', None)  # Menyesuaikan lebar tampilan
print(result)
  1. Buatlah Query untuk menghitung total penjualan (Net) dan total invoice per customer!
query = """
SELECT 
    Customer,
    COUNT(DISTINCT "No. Invoice") AS Total_Invoice,
    SUM("Total (Net)") AS Total_Penjualan_Net
FROM 
    farmhill
GROUP BY 
    Customer
ORDER BY
	Total_Penjualan_Net DESC;
"""
result = pd.read_sql_query(query, conn)
pd.set_option('display.max_rows', None)  # Menampilkan semua baris
pd.set_option('display.max_columns', None)  # Menampilkan semua kolom
pd.set_option('display.width', None)  # Menyesuaikan lebar tampilan
print(result)
  1. Buatlah skema database yang baik berdasarkan Data Penjualan!

Column NameCriteria
Transaction_IDNominal. The unique identifier for a transaction.
Invoice_NoNominal. The invoice number associated with the transaction.
Customer_IDDiscrete. The unique identifier for the customer involved in the transaction. Must be a positive integer.
SalesPerson_IDDiscrete. The unique identifier for the salesperson involved in the transaction.
Location_IDDiscrete. The unique identifier for the location where the transaction occurred.
Product_IDDiscrete. The unique identifier for the product sold in the transaction.
DateNominal. The date the transaction occurred in the format 'YYYY-MM-DD'.
WeekNominal. The week of the month when the transaction occurred (e.g., 'Week 1', 'Week 2').
Day_NameNominal. The name of the day on which the transaction occurred (e.g., 'Monday', 'Tuesday').
Quantity_KgContinuous. The amount of product sold in kilograms. Must be a positive float.
Price_KgContinuous. The price per unit of the product sold. Must be a positive float.
Total_BrutoContinuous. The total price before discounts or returns. Calculated as Quantity_Kg * Price_Kg.
Return_KgContinuous. The amount of product returned by the customer in kilograms. Must be a positive float.
Net_KgContinuous. The net amount of product sold after returns. Calculated as Quantity_Kg - Return_Kg.
Net_PriceContinuous. The net price per unit after discounts. Must be a positive float.
Total_NetContinuous. The total net price after returns and discounts. Calculated as Net_Kg * Net_Price.