SQL

Software Design SQL50本ノックやってみた。

環境構築編

購入の価値ありです。

 

 

もくじ

SQL1. SELECT

 

postgres=# SELECT * FROM payment;

 payment_id | customer_id | staff_id | rental_id | amount |        payment_date
------------+-------------+----------+-----------+--------+----------------------------
      16050 |         269 |        2 |         7 |   1.99 | 2007-01-24 21:40:19.996577
      16051 |         269 |        1 |        98 |   0.99 | 2007-01-25 15:16:50.996577
      16052 |         269 |        2 |       678 |   6.99 | 2007-01-28 21:44:14.996577
      16053 |         269 |        2 |       703 |   0.99 | 2007-01-29 00:58:02.996577
      16054 |         269 |        1 |       750 |   4.99 | 2007-01-29 08:10:06.996577
      16055 |         269 |        2 |      1099 |   2.99 | 2007-01-31 12:23:14.996577
      16056 |         270 |        1 |       193 |   1.99 | 2007-01-26 05:10:14.996577
      16057 |         270 |        1 |      1040 |   4.99 | 2007-01-31 04:03:42.996577
      16058 |         271 |        1 |      1096 |   8.99 | 2007-01-31 11:59:15.996577
      16059 |         272 |        1 |        33 |   0.99 | 2007-01-25 02:47:17.996577


(略)




postgres=# SELECT * FROM customer;

 customer_id | store_id | first_name  |  last_name   |                  email                   | address_id | activebool | create_date |     last_update     | active
-------------+----------+-------------+--------------+------------------------------------------+------------+------------+-------------+---------------------+--------
           1 |        1 | MARY        | SMITH        | MARY.SMITH@sakilacustomer.org            |          5 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           2 |        1 | PATRICIA    | JOHNSON      | PATRICIA.JOHNSON@sakilacustomer.org      |          6 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           3 |        1 | LINDA       | WILLIAMS     | LINDA.WILLIAMS@sakilacustomer.org        |          7 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           4 |        2 | BARBARA     | JONES        | BARBARA.JONES@sakilacustomer.org         |          8 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           5 |        1 | ELIZABETH   | BROWN        | ELIZABETH.BROWN@sakilacustomer.org       |          9 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           6 |        2 | JENNIFER    | DAVIS        | JENNIFER.DAVIS@sakilacustomer.org        |         10 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           7 |        1 | MARIA       | MILLER       | MARIA.MILLER@sakilacustomer.org          |         11 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           8 |        2 | SUSAN       | WILSON       | SUSAN.WILSON@sakilacustomer.org          |         12 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           9 |        2 | MARGARET    | MOORE        | MARGARET.MOORE@sakilacustomer.org        |         13 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
          10 |        1 | DOROTHY     | TAYLOR       | DOROTHY.TAYLOR@sakilacustomer.org        |         14 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1

(略)

 

 

SQL2. 条件を指定する WHERE

 

SELECT
  payment_id,
  customer_id
FROM
  payment
WHERE
  customer_id = 1;


 payment_id | customer_id
------------+-------------
      16677 |           1
      16678 |           1
      18495 |           1
      18496 |           1
      18497 |           1
      18498 |           1
      18499 |           1
      18500 |           1
      18501 |           1
      22680 |           1


(略)

 

 

SQL3. cutomerテーブルの名前(first_name)がKELLYのレコードを抽出

 

SELECT
  first_name,
  last_name
FROM
  customer
WHERE
  first_name = 'KELLY';


 first_name | last_name
------------+-----------
 KELLY      | TORRES
 KELLY      | KNOTT
(2 rows)

 

 

SQL4. customerテーブルから、名字(last_name)がKNOTTで名前がKELLYの顧客を抽出する

 

SELECT
  first_name,
  last_name
FROM
  customer
WHERE
  first_name = 'KELLY'
  AND last_name = 'KNOTT'
;



 first_name | last_name
------------+-----------
 KELLY      | KNOTT
(1 row)

 

 

 

SQL5. customerテーブルから、名前がKELLYもしくはMARIAの顧客を抽出する

 

SELECT
  first_name,
  last_name
FROM
  customer
WHERE
  first_name = 'KELLY'
OR
  first_name = 'MARIA'
;



 first_name | last_name
------------+-----------
 MARIA      | MILLER
 KELLY      | TORRES
 KELLY      | KNOTT
(3 rows)

 

 

SQL6. customerテーブルから、名前がKELLYやMARIA以外の顧客を抽出する

 

SELECT
  first_name,
  last_name
FROM
  customer
WHERE
  NOT (
    first_name = 'KELLY'
    OR
    first_name = 'MARIA'
  )
;


 first_name  |  last_name
-------------+--------------
 MARY        | SMITH
 PATRICIA    | JOHNSON
 LINDA       | WILLIAMS
 BARBARA     | JONES
 ELIZABETH   | BROWN
 JENNIFER    | DAVIS
 SUSAN       | WILSON
 MARGARET    | MOORE
 DOROTHY     | TAYLOR
 LISA        | ANDERSON
 NANCY       | THOMAS
 KAREN       | JACKSON

(略)

 

 

SQL7. OR条件を列挙して記述できるIN句。cutomerテーブルから、名前がAARON, ADAM, ANNの顧客を抽出する

 

SELECT
  first_name,
  last_name
FROM
  customer
WHERE
  first_name IN ('AARON', 'ADAM', 'ANN')
;


 first_name | last_name
------------+-----------
 ANN        | EVANS
 ADAM       | GOOCH
 AARON      | SELBY
(3 rows)

 

 

SQL8. 比較演算子 paymentテーブルから支払額(amount)が6.99ドル以上のレコードを抽出する

 

SELECT
  payment_id,
  amount
FROM
  payment
WHERE
  amount >= 6.99
;


 payment_id | amount
------------+--------
      16052 |   6.99
      16058 |   8.99
      16060 |   6.99
      16061 |   6.99
      16073 |  10.99
      16074 |   6.99
      16082 |   6.99
      16086 |   6.99
      16087 |   6.99
      16092 |   6.9

(略)

 

 

SQL9. paymentテーブルから、支払額(amount)が0.99ドル以外のテーブルを抽出

 

SELECT
  payment_id,
  amount
FROM
  payment
WHERE
  amount != 0.99
;


 payment_id | amount
------------+--------
      16050 |   1.99
      16052 |   6.99
      16054 |   4.99
      16055 |   2.99
      16056 |   1.99
      16057 |   4.99
      16058 |   8.99

 

SQL11. rentalテーブルのreturn_dateがNULLではないレコードを抽出する

 

SELECT
  rental_id,
  return_date
FROM
  rental
WHERE
  return_date IS NOT NULL
;


 rental_id |     return_date
-----------+---------------------
         2 | 2005-05-28 19:40:33
         3 | 2005-06-01 22:12:39
         4 | 2005-06-03 01:43:41
         5 | 2005-06-02 04:33:21
         6 | 2005-05-27 01:32:07
         7 | 2005-05-29 20:34:53
         8 | 2005-05-27 23:33:46

(略)

 

 

SQL12. BETWEEN演算子 customテーブルから、顧客IDが11から13の顧客をBETWEENを使って抽出

 

SELECT
  customer_id,
  first_name,
  last_name
FROM
  customer
WHERE
  customer_id BETWEEN 11 AND 13
;



 customer_id | first_name | last_name
-------------+------------+-----------
          11 | LISA       | ANDERSON
          12 | NANCY      | THOMAS
          13 | KAREN      | JACKSON
(3 rows)

 

SQL13. LIKE演算子 filmテーブルのdescriptionにAmazingが含まれているレコードを抽出

 

SELECT
  title,
  description
FROM
  film
WHERE
  description LIKE '%Amazing%'
;


         title         |                                                          description
-----------------------+--------------------------------------------------------------------------------------------------------------------------------
 ANNIE IDENTITY        | A Amazing Panorama of a Pastry Chef And a Boat who must Escape a Woman in An Abandoned Amusement Park
 ANONYMOUS HUMAN       | A Amazing Reflection of a Database Administrator And a Astronaut who must Outrace a Database Administrator in A Shark Tank
 BRANNIGAN SUNRISE     | A Amazing Epistle of a Moose And a Crocodile who must Outrace a Dog in Berlin
 BUCKET BROTHERHOOD    | A Amazing Display of a Girl And a Womanizer who must Succumb a Lumberjack in A Baloon Factory
 BULWORTH COMMANDMENTS | A Amazing Display of a Mad Cow And a Pioneer who must Redeem a Sumo Wrestler in The Outback
 CARRIE BUNCH          | A Amazing Epistle of a Student And a Astronaut who must Discover a Frisbee in The Canadian Rockies
 CASABLANCA SUPER      | A Amazing Panorama of a Crocodile And a Forensic Psychologist who must Pursue a Secret Agent in The First Manned Space Station
 CELEBRITY HORN        | A Amazing Documentary of a Secret Agent And a Astronaut who must Vanquish a Hunter in A Shark Tank
 CHAMPION FLATLINERS   | A Amazing Story of a Mad Cow And a Dog who must Kill a Husband in A Monastery
 CLASH FREDDY          | A Amazing Yarn of a Composer And a Squirrel who must Escape a Astronaut in Australia
 CLONES PINOCCHIO      | A Amazing Drama of a Car And a Robot who must Pursue a Dentist in New Orleans
 DEEP CRUSADE          | A Amazing Tale of a Crocodile And a Squirrel who must Discover a Composer in Australia
 EARLY HOME            | A Amazing Panorama of a Mad Scientist And a Husband who must Meet a Woman in The Outback

(略)

 

SQL14. NOT LIKE filmテーブルのdescriptionにAmazingが含まれていないレコードの抽出

 

SELECT
  title,
  description
FROM
  film
WHERE
  description NOT LIKE '%Amaziong%'
;


            title            |                                                            description
-----------------------------+------------------------------------------------------------------------------------------------------------------------------------
 ACADEMY DINOSAUR            | A Epic Drama of a Feminist And a Mad Scientist who must Battle a Teacher in The Canadian Rockies
 ACE GOLDFINGER              | A Astounding Epistle of a Database Administrator And a Explorer who must Find a Car in Ancient China
 ADAPTATION HOLES            | A Astounding Reflection of a Lumberjack And a Car who must Sink a Lumberjack in A Baloon Factory
 AFFAIR PREJUDICE            | A Fanciful Documentary of a Frisbee And a Lumberjack who must Chase a Monkey in A Shark Tank
 AFRICAN EGG                 | A Fast-Paced Documentary of a Pastry Chef And a Dentist who must Pursue a Forensic Psychologist in The Gulf of Mexico
 AGENT TRUMAN                | A Intrepid Panorama of a Robot And a Boy who must Escape a Sumo Wrestler in Ancient China
 AIRPLANE SIERRA             | A Touching Saga of a Hunter And a Butler who must Discover a Butler in A Jet Boat
 AIRPORT POLLOCK             | A Epic Tale of a Moose And a Girl who must Confront a Monkey in Ancient India

(略)

 

 

SQL15. COUNT() paymentテーブルの総件数を求める

 

SELECT COUNT(*) FROM payment;

 count
-------
 16049
(1 row)

 

 

SQL16. DISTINCT 重複を除外。 paymentテーブルから、支払いを行ったユニークな顧客IDを求める

SELECT
  DISTINCT customer_id
FROM
  payment
;


 customer_id
-------------
         251
         106
         120
         285
         264
         497
         452
         496
         455
         209
(略)

 

 

SQL17. paymentテーブルから、支払いを行ったユニークな顧客数を求める

SELECT
  COUNT(DISTINCT customer_id)
FROM
  payment
;

 count
-------
   599
(1 row)

 

SQL18. ORDER BY 順序を指定する。 customerテーブルから、顧客の名字を昇順に出力する

 

SELECT
  customer_id,
  last_name
FROM
  customer
ORDER BY
  last_name
;


 customer_id |  last_name
-------------+--------------
         505 | ABNEY
         504 | ADAM
          36 | ADAMS
          96 | ALEXANDER
         470 | ALLARD
          27 | ALLEN
         220 | ALVAREZ
          11 | ANDERSON
         326 | ANDREW
         183 | ANDREWS
         449 | AQUINO
         368 | ARCE
         560 | ARCHULETA
         188 | ARMSTRONG
(略)

 

SQL20. LIMIT句。 customerテーブルから、最近登録された顧客3名を抽出する

 

SELECT
  first_name,
  last_name
FROM
  customer
ORDER BY
  customer_id DESC
LIMIT 3
;


 first_name | last_name
------------+-----------
 AUSTIN     | CINTRON
 WADE       | DELVALLE
 FREDDIE    | DUGGAN
(3 rows)

 

SQL21. GROUP BY句。paymentテーブルから、これまでの累計で支払い回数が多い顧客の上位3人の顧客IDを抽出する

 

SELECT
  customer_id,
  COUNT(*) AS payment_count
FROM
  payment
GROUP BY
  customer_id
ORDER BY
  payment_count DESC
LIMIT 3
;


 customer_id | payment_count
-------------+---------------
         148 |            46
         526 |            45
         236 |            42
(3 rows)

 

SQL22. ROUND句。paymentテーブルの売上金額(amount)をドルから円に変換して、”109″のように小数点以下を四捨五入した形で抽出。1ドル110円とする。

 

SELECT
  ROUND(amount * 110) AS amount_yen
FROM
  payment
LIMIT 3
;


 amount_yen
------------
        219
        109
        769
(3 rows)

 

 

SQL23. CONCAT句。paymentテーブルの売上金額(amount)をドルから円に変換して、”109yen”のように小数点以下を四捨五入して単位をつける。

 

SELECT
  CONCAT(
    ROUND(amount * 110),
    'yen'
  ) AS amount_yen
FROM
  payment
LIMIT 3
;

 amount_yen
------------
 219yen
 109yen
 769yen
(3 rows)

 

 

 

結合

 

SQL24. LEFT JOIN句。paymentテーブルにcustomer_idでひも付くcustomerテーブルを結合し、payment_id, last_name, first_nameカラムのデータを抽出する

 

SELECT
  payment_id,
  last_name,
  first_name
FROM
  payment
  LEFT JOIN customer
    ON payment.customer_id
      = customer.customer_id
;

または

SELECT
  payment_id,
  last_name,
  first_name
FROM
  payment
  LEFT JOIN customer
    USING(customer_id)
;


 payment_id |  last_name   | first_name
------------+--------------+-------------
      16050 | WALTERS      | CASSANDRA
      16051 | WALTERS      | CASSANDRA
      16052 | WALTERS      | CASSANDRA
      16053 | WALTERS      | CASSANDRA
      16054 | WALTERS      | CASSANDRA
      16055 | WALTERS      | CASSANDRA
      16056 | CURTIS       | LEAH
      16057 | CURTIS       | LEAH
      16058 | NEAL         | PENNY
      16059 | CALDWELL     | KAY
      16060 | CALDWELL     | KAY
      16061 | CALDWELL     | KAY
      16062 | CALDWELL     | KAY
      16063 | LOWE         | PRISCILLA
(略)

 

SQL25. INNER JOIN句。paymentテーブルから、顧客名がBRIAN WYMANの支払いデータを抽出する

 

SELECT
  payment_id,
  payment.customer_id,
  amount
FROM
  payment
  INNER JOIN customer
    ON payment.customer_id
      = customer.customer_id
WHERE
  first_name = 'BRIAN'
  AND last_name = 'WYMAN'
;



 payment_id | customer_id | amount
------------+-------------+--------
      16160 |         318 |   9.99
      17400 |         318 |   2.99
      17401 |         318 |   2.99
      17402 |         318 |   0.99
      17403 |         318 |   7.99
      20001 |         318 |   2.99
      25744 |         318 |   4.99
      25745 |         318 |   2.99
      25746 |         318 |   8.99
      25747 |         318 |   0.99
      25748 |         318 |   0.99
      25749 |         318 |   5.99
(12 rows)

 

SQL26. WHERE >GROUP BY > HAVINGの優先順位。 filmテーブル, film_categoryテーブル, categoryテーブルから、カテゴリ名ごとに映画の作品数を集計し、65件件以上のものを抽出する

 

SELECT
  category.name AS name,
  COUNT(category.name) AS film_cnt
FROM
  film
  INNER JOIN film_category
    USING(film_id)
  INNER JOIN category
    USING(category_id)
WHERE
  category.name IN(
    'Sports',
    'Games',
    'Travel'
  )
GROUP BY
  category.name
HAVING
  COUNT(category.name) > 60
ORDER BY
  film_cnt DESC
;


  name  | film_cnt
--------+----------
 Sports |       74
 Games  |       61
(2 rows)

 

SQL27. CASE句。paymentテーブルで、支払い額が5を超える場合はexpensive, 1を超える場合はmodest, そうでなければcheapとして一覧表示する

 

SELECT
  payment_id,
  amount,
  CASE
    WHEN amount > 5 THEN 'expensive'
    WHEN amount > 1 THEN 'modest'
    ELSE 'cheap'
  END AS price_range
FROM
  payment
;


 payment_id | amount | price_range
------------+--------+-------------
      16050 |   1.99 | modest
      16051 |   0.99 | cheap
      16052 |   6.99 | expensive
      16053 |   0.99 | cheap
      16054 |   4.99 | modest
      16055 |   2.99 | modest
      16056 |   1.99 | modest
      16057 |   4.99 | modest
      16058 |   8.99 | expensive
      16059 |   0.99 | cheap
      16060 |   6.99 | expensive

(略)

 

 

SQL28. POSIX正規表現。filmテーブルのdescriptionに入っているThoughtfulまたは、Insightfulのレコード数を求める。

 

PostgreSQLの場合

SELECT
  COUNT(*)
FROM
  film
WHERE
  description ~ '(Thou|Insi)ghtful'
;

 

MySQL

SELECT
  COUNT(*)
FROM
  film
WHERE
  description REGEXP ~ '(Thou|Insi)ghtful'
;

 

正規表現

  • ~ 正規表現に一致し、大文字小文字の区別を行う
  • ~* 正規表現に一致し、大文字小文字の区別をしない
  • !~ 正規表現に一致しない、大文字小文字の区別を行う
  • !~* 正規表現に一致しない、大文字小文字の区別をしない

 

 

集計関数

  • COUNT レコード数
  • SUM 合計
  • AVG 平均値
  • MAX 最大値
  • MIN 最小値

 

●SUM

SELECT
  SUM(amount) AS total_sales
FROM
  payment
;


 total_sales
-------------
    67416.51
(1 row)

 

●SUM + GROUP BY

SELECT
  customer_id,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  customer_id
;


 customer_id | total_sales
-------------+-------------
         251 |      120.69
         106 |      100.77
         120 |      143.68
         285 |      135.74
         264 |       98.75
         497 |      129.72
         452 |      107.68
         496 |       88.79
         455 |       87.76

 

SQL29. paymentテーブルから、支払い額上位5名の顧客データを抽出する

SELECT
  customer_id,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  customer_id
ORDER BY
  total_sales DESC
LIMIT 5
;


 customer_id | total_sales
-------------+-------------
         526 |      221.55
         148 |      216.54
         144 |      195.58
         137 |      194.61
         178 |      194.61
(5 rows)

 

 

日付の条件指定

 

CAST(変換したいカラム AS 変換後のデータ型)

SQL30. CAST句。paymentテーブルから、日付ごとの売上金額を集計する。

SELECT
  CAST(payment_date AS DATE) AS p_date,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  p_date
ORDER BY
  p_date
;
   p_date   | total_sales
------------+-------------
 2007-01-24 |       86.81
 2007-01-25 |      568.61
 2007-01-26 |      743.30
 2007-01-27 |      708.27
 2007-01-28 |      793.10
 2007-01-29 |      655.42
 2007-01-30 |      622.41
 2007-01-31 |      646.51
 2007-02-14 |      152.63
 2007-02-15 |     1376.49

(略)

成功!

 

SELECT
  payment_date,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  payment_date
ORDER BY
  payment_date
;


        payment_date        | total_sales
----------------------------+-------------
 2007-01-24 21:21:56.996577 |        2.99
 2007-01-24 21:22:59.996577 |        2.99
 2007-01-24 21:32:05.996577 |        3.99
 2007-01-24 21:33:07.996577 |        4.99
 2007-01-24 21:33:47.996577 |        6.99
 2007-01-24 21:36:33.996577 |        0.99
 2007-01-24 21:40:19.996577 |        1.99
 2007-01-24 22:00:12.996577 |        4.99
(略)

payment_dateがtimestamp型のため、
日付ごとでは集計が上手くいかない!!!!

 

SQL31. EXTRACT句。paymentテーブルから、月別の売上金額を集計する。

SELECT
  EXTRACT(YEAR FROM payment_date) As YYYY,
  EXTRACT(MONTH FROM payment_date) AS mm,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  yyyy,
  mm
ORDER BY
  mm
;


 yyyy | mm | total_sales
------+----+-------------
 2007 |  1 |     4824.43
 2007 |  2 |     9631.88
 2007 |  3 |    23886.56
 2007 |  4 |    28559.46
 2007 |  5 |      514.18
(5 rows)

 

または、

●LEFT句 文字列から文字数分の文字を左から切り出す関数。

LEFT(文字列, 文字数)

SELECT
  LEFT(
    CAST(payment_date AS VARCHAR),
    7
  ) AS yyyymm,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  yyyymm
ORDER BY
  yyyymm
;


 yyyymm  | total_sales
---------+-------------
 2007-01 |     4824.43 
 2007-02 |     9631.88
 2007-03 |    23886.56
 2007-04 |    28559.46
 2007-05 |      514.18
(5 rows)

 

 

 

SQL32. 日付の条件指定。paymentテーブルから、2007年1月の売上データを抽出する。

 

●よくある間違い

SELECT
  SUM(amount) AS total_sales
FROM
  payment
WHERE
  payment_date >= '2007-01-01'
  AND payment_date <= '2007-01-31'
;


 total_sales
-------------
     4177.92 ←少ない。SQL31参照
(1 row)

2007-01-31のデータが含まれていない結果になっています。

 

●理由
WHERE payment_date <= ‘2007-01-31’
これは、2007-01-31 00:00:00までのデータを取得するから。

 

●正しい方法①

SELECT
  SUM(amount) AS total_sales
FROM
  payment
WHERE
  payment_date >= '2007-01-01'
  AND payment_date < '2007-02-01'
;


 total_sales
-------------
     4824.43
(1 row)

 

●正しい方法② BETWEENを利用する

SELECT
  SUM(amount) AS total_sales
FROM
  payment
WHERE
  CAST(payment_date AS DATE)
    BETWEEN '2007-01-01' AND '2007-01-31'
;


 total_sales
-------------
     4824.43
(1 row)

 

●正しい方法③

SELECT
  SUM(amount) AS total_sales
FROM
  payment
WHERE
  payment_date >= '2007-01-01'
  AND CAST(payment_date AS DATE) <= '2007-01-31'
;


 total_sales
-------------
     4824.43
(1 row)

 

●正しい方法④ EXTRACT

SELECT
  SUM(amount) AS total_sales
FROM
  payment
WHERE
  EXTRACT(YEAR FROM payment_date) = 2007
  AND EXTRACT(MONTH FROM payment_date) = 1
;


 total_sales
-------------
     4824.43
(1 row)

 

複数のクエリを組み合わせた抽出 FROM, IN, EXISTS

 

SQL33. paymentテーブルから顧客IDごとに累計売上を合計し、1顧客あたりの平均売上、最低売上、最高売上を求める。

 

●過程① 顧客IDごとに累計売上を合計する

SELECT
  customer_id,
  SUM(amount) AS total_sales
FROM
  payment
GROUP BY
  customer_id
;
 customer_id | total_sales
-------------+-------------
         251 |      120.69
         106 |      100.77
         120 |      143.68
         285 |      135.74
         264 |       98.75

これでcustomer_id毎の合計売上は出る。

 

●過程② ①で作った顧客IDごとに売上を合計したテーブルがcustomer_paymentというテーブル名で存在すると想像しながら書く。

SELECT
  AVG(total_sales),
  MIN(total_sales),
  MAX(total_sales)
FROM
  customer_payment
;


ERROR:  relation "customer_payment" does not exist
LINE 6:   customer_payment

※当然エラーになるが、イメージはつかめる

 

●過程①と②を組み合わせる。

SELECT
  AVG(total_sales),
  MIN(total_sales),
  MAX(total_sales)
FROM
  (SELECT
    customer_id,
    SUM(amount) AS total_sales
  FROM
    payment
  GROUP BY
    customer_id
  ) AS customer_payment
;


         avg          |  min  |  max
----------------------+-------+--------
 112.5484307178631052 | 50.85 | 221.55
(1 row)

 

●JOINとサブクエリの使い分けなるべくインデックスが働くJOINを利用する。
サブクエリの場合は

  • サブクエリの結果テーブルにはインデックスが働かない。
  • ネストが深くなり、可読性が低くなる
    →重くなりがち、可読性が低くなるサブクエリの欠点を解消するWITHを利用する。

 

 

SQL34. customerテーブルとpayment_p2007_05テーブルからJOINを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する

SELECT
  last_name
FROM
  customer AS c
  INNER JOIN payment_p2007_05 AS p
    ON c.customer_id = p.customer_id
;

  last_name
-------------
 BROWN
 MOORE
 ANDERSON
 WHITE
 HARRIS
 HARRIS
 CLARK
 RODRIGUEZ
 LEWIS
 YOUNG
 HERNANDEZ

(略)

一見上手くいったかのように見えるが、
2回購入した人は2回でる。DISTINCTを使って重複を削除することもできるが、
JOINを繋げていくケースでは
JOINするごとにどんどん行が膨れ上がって、
処理時間も膨大になる。

SQL35ではこれをINを使って解決する。

 

SQL35. INを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する。

SELECT
  last_name
FROM
  customer
WHERE
  customer_id IN
    (SELECT
      customer_id
    FROM
      payment_p2007_05
  )
;


  last_name
-------------
 BROWN
 MOORE
 ANDERSON
 WHITE
 HARRIS
 CLARK
 RODRIGUEZ

条件はcolumn IN (query)という形になり、
queryのところにSELECTクエリがかかれます。

ここでは5月に支払いのあったcsutomer_idを抽出しており、
その結果を値として親クエリではINを用いて絞りこみを行っています。

 

別のテーブルに存在する値だけを抽出するEXISTS

●EXISTS句の特徴

  • ある外部キーが他のテーブルに存在するかを判定
  • 一般的にJOINからDISTINCTを行うのに比べ高速である
  • SELECT句のカラムは結果に影響せず、慣例的に1が使われることが多い
  • これまでの条件演算と同様にNOT EXISTSとして、否定条件で使うこともできる。

 

●INとEXISTSのサブクエリの実行結果について

IN()のサブクエリ単体で実行結果が返るが、
EXISTS()のサブクエリは文法的にエラーの為実行結果は返らない。

●INとEXISTSの使い分け

  • サブクエリの結果が小さくなる場合はINを使う
  • 親クエリの結果が小さくなる場合はEXISTSを使う

 

SQL36. payment_p2007_05テーブルから、EXISTSを用いて、2007年5月に支払いのあった顧客のlast_nameを抽出する

 

SELECT
  last_name
FROM
  customer AS c
WHERE
  EXISTS (
    SELECT
      1
    FROM
      payment_p2007_05 AS p
    WHERE
      c.customer_id = p.customer_id
  )
;


  last_name
-------------
 BROWN
 MOORE
 ANDERSON
 WHITE
 HARRIS
 CLARK
 RODRIGUEZ

 

 

問い合わせの結合 UNION, INTERSECT, EXCEPT

 

●結果の和集合をとる UNION
2つ以上のクエリから、その結果の和集合(どちらか、もしくは両方に含まれるもの)
を取り出したい時はUNIONで繋ぎます。

 

SQL37. 1月と5月の支払い履歴(payment_p2007_01/05テーブル)から、どちらかに含まれるcustomer_idを抽出する

SELECT
  DISTINCT customer_id
FROM
  payment_p2007_01
UNION
SELECT
  DISTINCT customer_id
FROM
  payment_p2007_05
;


 customer_id
-------------
         251
         106
         120
         264
         497
         452
         496
         455
         209
         276
         577
         550
(略)

 

●結果の積集合をとる INTERSECT

同様に積集合(両方に含まれるもの)をとりたい場合は、INTERSECT

SQL38. 1月と5月の支払い履歴(payment_p2007_01/05テーブル)から、両方に含まれるcustomer_idを抽出する

SELECT
  DISTINCT customer_id
FROM
  payment_p2007_05
INTERSECT
SELECT
  DISTINCT customer_id
FROM
  payment_p2007_01
;


 customer_id
-------------
         251
         120
         337
         227
         497
         190
         452

(略)

 

 

SQL39. 1, 2, 3月の支払い履歴(payment_p2007_01/02/03テーブル)から、全てに含まれるcustomer_idを抽出する。

 

SELECT
  DISTINCT customer_id
FROM
  payment_p2007_01
INTERSECT
SELECT
  DISTINCT customer_id
FROM
  payment_p2007_02
INTERSECT
  SELECT
    DISTINCT customer_id
FROM
  payment_p2007_03
;


 customer_id
-------------
         251
         106
         120
         497
         452
         496
         455
         209
         276

(略)

 

●結果の差集合をとる EXCEPT

SELECT
  DISTINCT customer_id
FROM
  payment_p2007_05
EXCEPT
SELECT
  DISTINCT customer_id
FROM
  payment_p2007_01
;


 customer_id
-------------
         264
         410
          80
         175
          15
         585
         579
         568
         335
         229
         336
         178
         512
         188
         521
         592
         527
         163
          41
(19 rows)

 

 

SQL40. +1。 1月には支払いがあって、5月には支払いがなかったcustomer_idの抽出

SELECT
  DISTINCT customer_id
FROM
  payment_p2007_01
EXCEPT
SELECT
  DISTINCT customer_id
FROM
  payment_p2007_05
;


 customer_id
-------------
         106
         455
         209
         276
         577
         599
         151

(略)

(381 rows)

 

 

SQL41. payment_p2007_01/05テーブルから、1月には支払いがあって、5月には支払いがなかったcustomer_idを昇順で3件抽出する。

 

SELECT
  DISTINCT customer_id
FROM
  payment_p2007_01
EXCEPT
SELECT
  DISTINCT customer_id
FROM
  payment_p2007_05
ORDER BY
  customer_id ASC
LIMIT 3
;



 customer_id
-------------
           1
           2
           3
(3 rows)

 

 

複雑なサブクエリを簡略化するWITH ※MySQLでは8.0から利用可能

 

●WITH
WITHを使えば、実行結果を1つの仮想テーブルとして扱うことが出来ます。

 

SQL42. WITH句を使って、payment_p2007_01テーブルから1月に7回以上支払いのあったアクティブなcustomerのemailを抽出する

WITH loyal_customer AS (
  SELECT
    customer_id,
    COUNT(*) AS cnt
  FROM
    payment_p2007_01
  GROUP BY
    customer_id
  HAVING
    COUNT(*) >=7
)
SELECT
  email
FROM
  customer AS c
  INNER JOIN loyal_customer AS lc
    ON c.customer_id = lc.customer_id
WHERE c.active = 1
;


              email
----------------------------------
 EDNA.WEST@sakilacustomer.org
 SUE.PETERS@sakilacustomer.org
 LESLIE.SEWARD@sakilacustomer.org
(3 rows)

 

●WITHは2つ以上でも使える

WITH table1 AS (
  query
),
table2 AS (
  query
)

 

 

 

部分的に集計関数を適用する ウィンドウ関数

●ウィンドウ関数

構文
function_name() OVER (ORDER BY カラム名 DESC|ASC)

SQL44. ウィンドウ関数を使って、payment\p2007_01テーブルとcustomer_listテーブルから1月の利用回数が多かった顧客をその順位と一緒に表示する。

 

SELECT
  cl.name,
  COUNT(*) AS cnt,
  RANK() OVER (
    ORDER BY COUNT(*) DESC
  ) AS ranking
FROM
  payment_p2007_01 AS p
  INNER JOIN customer_list AS cl
    ON p.customer_id = cl.id
GROUP BY
  cl.name
;


         name          | cnt | ranking
-----------------------+-----+---------
 SUE PETERS            |   8 |       1
 EDNA WEST             |   7 |       2
 LESLIE SEWARD         |   7 |       2
 MINNIE ROMERO         |   6 |       4
 BILLY POULIN          |   6 |       4
 CASSANDRA WALTERS     |   6 |       4
 NAOMI JENNINGS        |   6 |       4
 HEATHER MORRIS        |   6 |       4
 COURTNEY DAY          |   6 |       4
 RUTH MARTINEZ         |   6 |       4
 ENRIQUE FORSYTHE      |   6 |       4
 VICKIE BREWER         |   6 |       4
 GERALDINE PERKINS     |   6 |       4
 GERTRUDE CASTILLO     |   6 |       4
 JO FOWLER             |   5 |      15
 TIM CARY              |   5 |      15
 BETTY WHITE           |   5 |      15

(略)

 

●範囲(パーティション)を指定する PARTITION

SQL44. payment_p2007_01テーブルとcustomer_listテーブルから、1月の顧客の利用回数順位を国別に表示する

SELECT
  cl.id,
  cl.country,
  COUNT(*) AS cnt,
  RANK() OVER (
    PARTITION BY cl.country
    ORDER BY COUNT(*) DESC
  ) AS rank
FROM
  payment_p2007_01 AS p
  INNER JOIN customer_list AS cl
    ON p.customer_id = cl.id
GROUP BY
  cl.id, cl.country
;


 id  |                country                | cnt | rank
-----+---------------------------------------+-----+------
 176 | Algeria                               |   5 |    1
  69 | Algeria                               |   2 |    2
 441 | Algeria                               |   1 |    3
 528 | Angola                                |   5 |    1
 383 | Angola                                |   2 |    2
 381 | Anguilla                              |   3 |    1
 359 | Argentina                             |   3 |    1
  89 | Argentina                             |   3 |    1
 322 | Argentina                             |   2 |    3
 405 | Argentina                             |   2 |    3
 107 | Argentina                             |   2 |    3
 445 | Argentina                             |   2 |    3
 331 | Argentina                             |   2 |    3
  24 | Argentina                             |   2 |    3
 530 | Argentina                             |   1 |    9
 243 | Argentina                             |   1 |    9
 560 | Argentina                             |   1 |    9
 219 | Argentina                             |   1 |    9
 159 | Austria                               |   4 |    1
 173 | Austria                               |   2 |    2
 266 | Austria                               |   2 |    2
 334 | Azerbaijan                            |   2 |    1
 333 | Azerbaijan                            |   1 |    2
 590 | Bahrain                               |   1 |    1
  21 | Bangladesh                            |   3 |    1
 332 | Bangladesh                            |   2 |    2

(略)

 

 

●ウィンドウ関数で平均を求める AVG

構文
AVG(カラム名)

SELECT
  cl.id,
  cl.country,
  COUNT(*) AS cnt,
  ROUND(AVG(COUNT(*)) OVER (
    PARTITION BY cl.country
  ), 2) AS avg_pay,
  RANK() OVER (
    PARTITION BY cl.country
    ORDER BY COUNT(*) DESC
  ) AS rank
FROM
  payment_p2007_01 AS p
  INNER JOIN customer_list AS cl
    ON p.customer_id = cl.id
GROUP BY
  cl.id, cl.country
;


 id  |                country                | cnt | avg_pay | rank
-----+---------------------------------------+-----+---------+------
 176 | Algeria                               |   5 |    2.67 |    1
  69 | Algeria                               |   2 |    2.67 |    2
 441 | Algeria                               |   1 |    2.67 |    3
 528 | Angola                                |   5 |    3.50 |    1
 383 | Angola                                |   2 |    3.50 |    2
 381 | Anguilla                              |   3 |    3.00 |    1
  89 | Argentina                             |   3 |    1.83 |    1
 359 | Argentina                             |   3 |    1.83 |    1
 331 | Argentina                             |   2 |    1.83 |    3
 445 | Argentina                             |   2 |    1.83 |    3
  24 | Argentina                             |   2 |    1.83 |    3
 107 | Argentina                             |   2 |    1.83 |    3
 322 | Argentina                             |   2 |    1.83 |    3
 405 | Argentina                             |   2 |    1.83 |    3
 243 | Argentina                             |   1 |    1.83 |    9
 530 | Argentina                             |   1 |    1.83 |    9
 560 | Argentina                             |   1 |    1.83 |    9
 219 | Argentina                             |   1 |    1.83 |    9
 159 | Austria                               |   4 |    2.67 |    1
 266 | Austria                               |   2 |    2.67 |    2
 173 | Austria                               |   2 |    2.67 |    2
 334 | Azerbaijan                            |   2 |    1.50 |    1

(略)

 

 

●ウィンドウ関数で合計を求める SUM

SQL46. ノック45のクエリを変更し、平均回数の代わりに国ごとの合計利用回数を表示する

SUM(COUNT(*))を用いて、国ごとの合計利用回数を並べて表示させます。

SELECT
  cl.id,
  cl.country,
  COUNT(*) AS cnt,
  SUM(COUNT(*)) OVER (
    PARTITION BY cl.country
  ) AS total_pay,
  RANK() OVER (
    PARTITION BY cl.country
    ORDER BY COUNT(*) DESC
  ) AS rank
FROM
  payment_p2007_01 AS p
  INNER JOIN customer_list AS cl
    ON p.customer_id = cl.id
GROUP BY
  cl.id, cl.country
;



 id  |                country                | cnt | total_pay | rank
-----+---------------------------------------+-----+-----------+------
 176 | Algeria                               |   5 |         8 |    1
  69 | Algeria                               |   2 |         8 |    2
 441 | Algeria                               |   1 |         8 |    3
 528 | Angola                                |   5 |         7 |    1
 383 | Angola                                |   2 |         7 |    2
 381 | Anguilla                              |   3 |         3 |    1
  89 | Argentina                             |   3 |        22 |    1
 359 | Argentina                             |   3 |        22 |    1
 331 | Argentina                             |   2 |        22 |    3
 445 | Argentina                             |   2 |        22 |    3
  24 | Argentina                             |   2 |        22 |    3
 107 | Argentina                             |   2 |        22 |    3
 322 | Argentina                             |   2 |        22 |    3
 405 | Argentina                             |   2 |        22 |    3
 243 | Argentina                             |   1 |        22 |    9
 530 | Argentina                             |   1 |        22 |    9
 560 | Argentina                             |   1 |        22 |    9
 219 | Argentina                             |   1 |        22 |    9
 159 | Austria                               |   4 |         8 |    1
 266 | Austria                               |   2 |         8 |    2
 173 | Austria                               |   2 |         8 |    2
 334 | Azerbaijan                            |   2 |         3 |    1
 333 | Azerbaijan                            |   1 |         3 |    2
 590 | Bahrain                               |   1 |         1 |    1
  21 | Bangladesh                            |   3 |         5 |    1
 332 | Bangladesh                            |   2 |         5 |    2
 144 | Belarus                               |   2 |         2 |    1
 455 | Bolivia                               |   2 |         3 |    1
 431 | Bolivia                               |   1 |         3 |    2
 246 | Brazil                                |   5 |        50 |    1
  77 | Brazil                                |   5 |        50 |    1
 551 | Brazil                                |   5 |        50 |    1
 307 | Brazil                                |   4 |        50 |    4

 

●累積比率の為の累積回数

 

累積比率 = 累積利用回数 / 全体の利用回数累積比率 = 累積利用回数 / 全体の利用回数

  • ROWS BETWEENで指定できる値
  • UNBOUNDED PRECEDING パーティションの最初の行
  • n PRECEDING n行前
  • CURRENT_ROW 現在の行
  • n FOLLOWING n行後
  • UNBOUNDED FOLLOWING パーティションの最後の行

 

SQL47. 1月の国別の利用回数を降順に抽出し、累積回数と合わせて表示する。

SELECT
  cl.country,
  COUNT(*) AS count,
  SUM(COUNT(*)) OVER (
    ORDER BY COUNT(*) DESC
    ROWS BETWEEN
      UNBOUNDED PRECEDING
      AND CURRENT ROW
  ) AS cumulative_count
FROM
  payment_p2007_01 AS p
  INNER JOIN customer_list AS cl
    ON p.customer_id = cl.id
GROUP BY
  cl.country
ORDER BY
  count DESC
;


                country                | count | cumulative_count
---------------------------------------+-------+------------------
 India                                 |   111 |              111
 China                                 |   109 |              220
 United States                         |    70 |              290
 Japan                                 |    65 |              355
 Mexico                                |    63 |              418
 Russian Federation                    |    60 |              478
 Brazil                                |    50 |              528
 Philippines                           |    38 |              566
 Indonesia                             |    31 |              597
 Nigeria                               |    29 |              626
 Turkey                                |    27 |              653
 South Africa                          |    24 |              677
 Argentina                             |    22 |              699
 Germany                               |    20 |              719
 Taiwan                                |    16 |              735
 United Kingdom                        |    16 |              751
 Iran                                  |    16 |              767
 Spain                                 |    15 |              782
 Saudi Arabia                          |    13 |              795
 Colombia                              |    13 |              808
 South Korea                           |    12 |              820
 Pakistan                              |    12 |              832
 Ukraine                               |    11 |              843
 Poland                                |    11 |              854
 Israel                                |    11 |              865
 Italy                                 |    11 |              876
 Vietnam                               |     9 |              885
 Peru                                  |     9 |              894
 Egypt                                 |     9 |              903
 Congo, The Democratic Republic of the |     8 |              911
 France                                |     8 |              919
 Austria                               |     8 |              927
 Venezuela                             |     8 |              935

 

●累積比率

SQL48. payment_p2007_01テーブルとcustomer_listテーブルから1月の国別の利用比率を降順に並べ、累積で表示する

SELECT
  cl.country,
  ROUND(
    SUM(COUNT(*)) OVER (
      ORDER BY COUNT(*) DESC
      ROWS BETWEEN
        UNBOUNDED PRECEDING
        AND CURRENT ROW
    ) / SUM(COUNT(*)) OVER (),
    2
  ) AS cumulative_percent
FROM
  payment_p2007_01 AS p
  INNER JOIN customer_list AS cl
    ON p.customer_id = cl.id
GROUP BY
  cl.country
ORDER BY
  COUNT(*) DESC
;


                country                | cumulative_percent
---------------------------------------+--------------------
 India                                 |               0.10
 China                                 |               0.19
 United States                         |               0.25
 Japan                                 |               0.31
 Mexico                                |               0.36
 Russian Federation                    |               0.41
 Brazil                                |               0.46
 Philippines                           |               0.49
 Indonesia                             |               0.52
 Nigeria                               |               0.54
 Turkey                                |               0.56
 South Africa                          |               0.59
 Argentina                             |               0.60
 Germany                               |               0.62
 Taiwan                                |               0.64
 United Kingdom                        |               0.65
 Iran                                  |               0.66
 Spain                                 |               0.68
 Saudi Arabia                          |               0.69

(略)

 

●ウィンドウ関数で移動平均を求める

移動平均とは主に時系列データなどに対し、
ある一定区間ごとの平均値を、、区間をずらしながら求めるもの。

例えば曜日ごとの変動の大きいデータに対して7日間移動平均を計算することで、
曜日変動を抑えた傾向をみることが出来ます。

 

今回はpaymentテーブルを使い、4月6日から12日の利用回数の3日移動平均を見てましょう。
まずは当該っ期間について、日別の利用回数を計算します。

 

SQL49. paymentテーブルを使い、2007年4月6日から12日の日別利用回数を集計する。

SELECT
  CAST(payment_date AS DATE) AS d,
  COUNT(*)
FROM
  payment AS p
WHERE
  CAST(payment_date AS DATE)
    BETWEEN '2007-04-06' AND '2007-04-12'
GROUP BY
  d
ORDER BY
  d ASC
;


     d      | count
------------+-------
 2007-04-06 |   486
 2007-04-07 |   472
 2007-04-08 |   516
 2007-04-09 |   514
 2007-04-10 |   482
 2007-04-11 |   468
 2007-04-12 |   452
(7 rows)

 

SQL50. paymentテーブルを使い、2007年4月6日から12日の利用回数の3日移動平均を計算する。

SELECT
  CAST(payment_date AS DATE) AS d,
  COUNT(*),
  ROUND(AVG(COUNT(*)) OVER (
    ORDER BY
      CAST(payment_date AS DATE) ASC
    ROWS BETWEEN
      2 PRECEDING
      AND CURRENT ROW
  ), 2) AS moving_avg
FROM
  payment AS p
WHERE
  CAST(payment_date AS DATE)
    BETWEEN '2007-04-06' AND '2007-04-12'
GROUP BY
  d
ORDER BY
  d ASC
;


     d      | count | moving_avg
------------+-------+------------
 2007-04-06 |   486 |     486.00
 2007-04-07 |   472 |     479.00
 2007-04-08 |   516 |     491.33
 2007-04-09 |   514 |     500.67
 2007-04-10 |   482 |     504.00
 2007-04-11 |   468 |     488.00
 2007-04-12 |   452 |     467.33
(7 rows)

 

 

 

 

Amazonおすすめ

iPad 9世代 2021年最新作

iPad 9世代出たから買い替え。安いぞ!🐱 初めてならiPad。Kindleを外で見るならiPad mini。ほとんどの人には通常のiPadをおすすめします><

コメントを残す

メールアドレスが公開されることはありません。 * が付いている欄は必須項目です

日本語が含まれない投稿は無視されますのでご注意ください。(スパム対策)