Skip to content
SQL筆記
PostgreSQL Summary Stats and Window Functions Note
視窗函數
章節-Fetching, ranking, and paging
這裡筆記中的內容,是依照PostgreSLQ 為主設計的
目前介紹的函數有
- LAG(參照欄位,格數) :以相對位置往前=PRECEDING幾格,並將參數置入於刮弧中。 用法如下
DataFrameas
df
variable
SELECT
f.title,
r.imdb_score,
LAG(r.imdb_score,1) OVER(ORDER BY r.imdb_score DESC) --前一個分數較高的作品
FROM cinema.reviews AS r
JOIN cinema.films AS f
ON r.id = f.id
WHERE r.imdb_score IS NOT NULL -- 有未被評分的電影
ORDER BY r.imdb_score DESC
LIMIT 10;- LEAD() :以相對位置往後=FOLLOWINGP幾格,並將參數置入於刮弧中。
DataFrameas
df
variable
- FIRSR_VALUE() :取表頭的第一個值。
- LAST_VALUE :(ORDER BY OXOX RANG EBETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING):取表格的最後一個值,並且要加上後續的補充參數,確保整張表格都被包含。
- ROW_NUMBERS():用來依照列數,給與對應的數字。
- RANK():依序排名,佔缺額 i.e. 12345名,234同分,則排名為12225。
- DENSE_RANK():連續號碼排名 i.e. 12345,234同分,則排名為12223。
- NTILE() :用分頁來理解最簡單,同時也能將所有查詢結果分最接近平分的分數,可以作為區分四分位數的方法之一。
聚合視窗函數應用
章節-Aggregate window functions and frames
結合另一段課程章節裡的 聚合函數 可以讓視窗函數有更多的篩選能力
目前新學的函數為結合上一段 Fretching的技巧,建立『Frames』在選定的框架中
可以更靈活的與表格的各區塊比對
視窗函數之外更多的功能
章節-Beyond window functions
本篇有超多不合理的解釋跟錯誤 以回覆如下
In the course discribtion says: "Pivot it by Year to get the following reshaped, cleaner table."
| Gender | 2008 | 2012 |
|---|---|---|
| Men | AUS | FRA |
| Women | RUS | USA |
And I enter the code below (I know is not correct answer)
AS ct (Gender VARCHAR, Year VARCHAR, Country VARCHAR)
to get
| Gender | Year | Country |
|---|---|---|
| Men | AUS | FRA |
| Women | RUS | USA |
Submit this answer can still get throug the course. I think is not gonna be one of the answer, right?
我寫了以下內容回覆
In the course of Beyond window functions : Pivoting and Pivoting with ranking, checker work wired.
In Pivoting course description says:
"Pivot it by Year to get the following reshaped, cleaner table."
| Gender | 2008 | 2012 |
|--------|------|------|
| Men | AUS | FRA |
| Women | RUS | USA |
And I enter the code below
(I know is not correct answer)
```SQL
AS ct (Gender VARCHAR,
Year VARCHAR,
Country VARCHAR)
```
to get
| Gender | Year | Country |
|--------|------|---------|
| Men | AUS | FRA |
| Women | RUS | USA |
Submit this answer can still get through the course.
I think is not gonna be one of the answer, right?
And more in Pivoting with ranking
Says "You'll need to count the gold medals each country has earned, produce the ranks of each country by medals earned, then pivot the table to this shape."
So, in the query I use COUNT(medal) to collect how many medals each country earned.
But the answer checker says I need to use COUNT(*) means count every columns?
Though, filter work well and result will be the same, but why we use COUNT(*) instead of COUNT(medal).
And because of first question makes me not really trust the hint provide by checker.
Is there anyone can help me to lean SQL properly?
題外補充
晚上回家看書,多得到一個新的知識 :: INTEGER 是指將後續得到的值,轉為整數形態的意思 但奇怪的部分是這個功能是突然跑出來的
Pivoting
示範完整語法如下
DataFrameas
df
variable
-- 這裡是怪異的 Pivoting 轉置功能註解
-- 下方是轉置的啟動語句,意思是說如果沒有轉置的功能要利用EXTENSION的功能
CREATE EXTENSION IF NOT EXISTS tablefunc;
-- 將要轉換的查詢利用 CROSSTAB 並用($$ $$)來包住內容,而這部分應該是指啟用 EXTENSION中的CROSSTAB
SELECT * FROM CROSSTAB($$
-- 這裡選用CTE先建立一個查詢,以便後續欄位與值更好篩選
WITH Country_Awards AS (
SELECT
Country,
Year,
COUNT(*) AS Awards
FROM Summer_Medals
WHERE
Country IN ('FRA', 'GBR', 'GER')
AND Year IN (2004, 2008, 2012)
AND Medal = 'Gold'
GROUP BY Country, Year)
-- 進入主查詢
SELECT
Country,
Year,
RANK() OVER
(PARTITION BY Year
ORDER BY Awards DESC) :: INTEGER AS rank
FROM Country_Awards
ORDER BY Country ASC, Year ASC;
-- 這裡以上都是包含在 $$)中被轉換的主要內容
-- Fill in the correct column names for the pivoted table
$$) AS ct
/*
是否綁定一定要AS為ct不確定,但下方是呈現需要轉置的欄位將呈現什麼樣子
上方查詢得到的資料是以國家、年度、排名的一張表格如下一個區塊
*/
(Country VARCHAR,
"2004" INTEGER,
"2008" INTEGER,
"2012" INTEGER)
Order by Country ASC;如果想看到這些國家在相同的時間裡各自的排名
就需要轉置,將年份作為欄位名稱、排名作為值
最終得到查詢結果如下
country 2004 2008 2012 FRA 2 3 3 GBR 3 2 1 GER 1 1 2
繼續上課新章節,介紹了GROUP BY 子句的新功能 ROLLUP()以及 CUBE()都是緊接在ODER BY之後
兩種子句都將產生NULL值,因為都會新增一個小計後的數值
ROLLUP():是用來將指定的欄位收合,DataFrameas
df
variable
-- 版本一,可以獲得每個國家的獎牌種類與數量
-- Count the medals per gender and medal type
SELECT
country,
gender,
medal AS Medal_type,
COUNT(*) AS Awards
FROM medals.Summer_Medals
WHERE
Year = 2012
AND Country = 'RUS'
-- Get all possible group-level subtotals
GROUP BY country, Gender, medal
ORDER BY Gender ASC, Medal ASC;如果加入了ROLLUP會變成以下這樣