Skip to content
FarmHill - Naufal Hawari
pip install kagglehubimport kagglehub
import pandas as pd
import sqlite3
path = kagglehub.dataset_download("naufalhawarinor/farmhill")
csv_file = f"{path}/farmhill_cleann.csv"
data = pd.read_csv(csv_file)
# Create SQLite in-memory database and load data into SQL
conn = sqlite3.connect(':memory:') # Create in-memory SQLite database
data.to_sql('farmhill', conn, index=False, if_exists='replace')
print("Data loaded into SQLite database.")- Carilah Customer dengan total serapan lebih dari 500kg!
query = """
SELECT
Customer,
SUM("Jumlah Real (Kg)") AS Total_Serapan
FROM
farmhill
GROUP BY
Customer
HAVING
Total_Serapan > 500
ORDER BY
Total_Serapan DESC;
"""
result = pd.read_sql_query(query, conn)
print(result)- Buatlah Query untuk menghitung rata-rata harga per varian per kategori item!
query = """
SELECT
"Kategori Item",
"Nama Item",
ROUND(AVG(Harga), 0) AS Rata_Rata_Harga
FROM
farmhill
GROUP BY
"Kategori Item",
"Nama Item";
"""
result = pd.read_sql_query(query, conn)
pd.set_option('display.max_rows', None) # Menampilkan semua baris
pd.set_option('display.max_columns', None) # Menampilkan semua kolom
pd.set_option('display.width', None) # Menyesuaikan lebar tampilan
print(result)- Buatlah Query untuk menghitung total penjualan (Net) dan total invoice per customer!
query = """
SELECT
Customer,
COUNT(DISTINCT "No. Invoice") AS Total_Invoice,
SUM("Total (Net)") AS Total_Penjualan_Net
FROM
farmhill
GROUP BY
Customer
ORDER BY
Total_Penjualan_Net DESC;
"""
result = pd.read_sql_query(query, conn)
pd.set_option('display.max_rows', None) # Menampilkan semua baris
pd.set_option('display.max_columns', None) # Menampilkan semua kolom
pd.set_option('display.width', None) # Menyesuaikan lebar tampilan
print(result)- Buatlah skema database yang baik berdasarkan Data Penjualan!
| Column Name | Criteria |
|---|---|
| Transaction_ID | Nominal. The unique identifier for a transaction. |
| Invoice_No | Nominal. The invoice number associated with the transaction. |
| Customer_ID | Discrete. The unique identifier for the customer involved in the transaction. Must be a positive integer. |
| SalesPerson_ID | Discrete. The unique identifier for the salesperson involved in the transaction. |
| Location_ID | Discrete. The unique identifier for the location where the transaction occurred. |
| Product_ID | Discrete. The unique identifier for the product sold in the transaction. |
| Date | Nominal. The date the transaction occurred in the format 'YYYY-MM-DD'. |
| Week | Nominal. The week of the month when the transaction occurred (e.g., 'Week 1', 'Week 2'). |
| Day_Name | Nominal. The name of the day on which the transaction occurred (e.g., 'Monday', 'Tuesday'). |
| Quantity_Kg | Continuous. The amount of product sold in kilograms. Must be a positive float. |
| Price_Kg | Continuous. The price per unit of the product sold. Must be a positive float. |
| Total_Bruto | Continuous. The total price before discounts or returns. Calculated as Quantity_Kg * Price_Kg. |
| Return_Kg | Continuous. The amount of product returned by the customer in kilograms. Must be a positive float. |
| Net_Kg | Continuous. The net amount of product sold after returns. Calculated as Quantity_Kg - Return_Kg. |
| Net_Price | Continuous. The net price per unit after discounts. Must be a positive float. |
| Total_Net | Continuous. The total net price after returns and discounts. Calculated as Net_Kg * Net_Price. |