Intermediate-SQL

代写BA,DS和A&F专业的R和python作业请私信

文章目录

1. We’ll take the CASE

1.1 We’l take the CASE (video)

1.2 Basic CASE statements

What is your favorite team?

The European Soccer Database contains data about 12,800 matches from 11 countries played between 2011-2015! Throughout this course, you will be shown filtered versions of the tables in this database in order to better explore their contents.

In this exercise, you will identify matches played between FC Schalke 04 and FC Bayern Munich. There are 2 teams identified in each match in the hometeam_id and awayteam_id columns, available to you in the filtered matches_germany table. ID can join to the team_api_id column in the teams_germany table, but you cannot perform a join on both at the same time.

However, you can perform this operation using a CASE statement once you’ve identified the team_api_id associated with each team!

Instruction 1

  • Select the team’s long name and API id from the teams_germany table.
  • Filter the query for FC Schalke 04 and FC Bayern Munich using IN, giving you the team_api_IDs needed for the next step.
SELECT
  -- Select the team long name and team API id
  team_long_name,
  team_api_id
FROM teams_germany
-- Only include FC Schalke 04 and FC Bayern Munich
WHERE 
  team_long_name IN ('FC Schalke 04', 
                     'FC Bayern Munich');

Instruction 2

  • Create a CASE statement that identifies whether a match in Germany included FC Bayern Munich, FC Schalke 04, or neither as the home team.
  • Group the query by the CASE statement alias, home_team.
-- Identify the home team as Bayern Munich, Schalke 04, or neither
SELECT 
  CASE WHEN hometeam_id = 9789 THEN 'FC Schalke 04'
  WHen hometeam_id = 9823 THEN 'FC Bayern Munich'
  ELSE 'Other' END AS home_team,
  COUNT(id) AS total_matches
FROM matches_germany
-- Group by the CASE statement alias
GROUP BY home_team;

1.3 CASE statements comparing column values

Barcelona is considered one of the strongest teams in Spain’s soccer league.

In this exercise, you will be creating a list of matches in the 2011/2012 season where Barcelona was the home team. You will do this using a CASE statement that compares the values of two columns to create a new group – wins, losses, and ties.

In 3 steps, you will build a query that identifies a match’s winner, identifies the identity of the opponent, and finally filters for Barcelona as the home team. Completing a query in this order will allow you to watch your results take shape with each new piece of information.

The matches_spain table currently contains Barcelona’s matches from the 2011/2012 season, and has two key columns, hometeam_id and awayteam_id, that can be joined with the teams_spain table. However, you can only join teams_spain to one column at a time.

Instruction 1

  • Select the date of the match and create a CASE statement to identify matches as home wins, home losses, or ties.
SELECT 
  -- Select the date of the match
  date,
  -- Identify home wins, losses, or ties
  CASE WHEN home_goal > away_goal THEN 'Home win!'
  WHEN home_goal < away_goal THEN 'Home loss :(' 
  ELSE 'Tie' END AS outcome
FROM matches_spain;

Instruction 2:

  • Left join the teams_spain table team_api_id column to the matches_spain table awayteam_id. This allows us to retrieve the away team’s identity.
  • Select team_long_name from teams_spain as opponent and complete the CASE statement from Step 1.
SELECT 
  m.date,
  --Select the team long name column and call it 'opponent'
  t.team_long_name AS opponent, 
  -- Complete the CASE statement with an alias
  CASE WHEN m.home_goal > away_goal THEN 'Home win!'
  WHEN m.home_goal < away_goal THEN 'Home loss :('
  ELSE 'Tie' END AS outcome
FROM matches_spain AS m
-- Left join teams_spain onto matches_spain
LEFT JOIN teams_spain AS t
ON m.awayteam_id = t.team_api_id;

Instruction 3

  • Complete the same CASE statement as the previous steps.
  • Filter for matches where the home team is FC Barcelona (id = 8634).
SELECT 
  m.date,
  t.team_long_name AS opponent,
  -- Complete the CASE statement with an alias
  CASE WHEN m.home_goal > away_goal THEN 'Barcelona win!'
  WHEN m.home_goal < away_goal THEN 'Barcelona loss :(' 
  ELSE 'Tie' END AS outcome 
FROM matches_spain AS m
LEFT JOIN teams_spain AS t 
ON m.awayteam_id = t.team_api_id
-- Filter for Barcelona as the home team
WHERE m.hometeam_id = 8634; 

1.4 CASE statements comparing two column values part 2

Instruction

  • Complete the CASE statement to identify Barcelona’s away team games (id = 8634) as wins, losses, or ties.
  • Left join the teams_spain table team_api_id column on the matches_spain table hometeam_id column. This retrieves the identity of the home team opponent.
  • Filter the query to only include matches where Barcelona was the away team.
-- Select matches where Barcelona was the away team
SELECT  
  m.date,
  t.team_long_name AS opponent,
  CASE WHEN m.home_goal < m.away_goal ThEN'Barcelona win!'
  WHEN m.home_goal > m.away_goal THEN'Barcelona loss :(' 
  ELSE 'Tie' END AS outcome
FROM 
  matches_spain AS m
-- Join teams_spain to matches_spain
LEFT JOIN 
  teams_spain AS t 
ON 
  m.hometeam_id = t.team_api_id
WHERE 
  m.awayteam_id = 8634;

1.5 In CASE things get more complex (video)

1.6 In CASE of rivalry

Barcelona and Real Madrid have been rival teams for more than 80 years. Matches between these two teams are given the name El Clásico (The Classic). In this exercise, you will query a list of matches played between these two rivals.

You will notice in Step 2 that when you have multiple logical conditions in a CASE statement, you may quickly end up with a large number of WHEN clauses to logically test every outcome you are interested in. It’s important to make sure you don’t accidentally exclude key information in your ELSE clause.

In this exercise, you will retrieve information about matches played between Barcelona (id = 8634) and Real Madrid (id = 8633). Note that the query you are provided with already identifies the Clásico matches using a filter in the WHERE clause.

Instruction 1

  • Complete the first CASE statement, identifying Barcelona or Real Madrid as the home team using the hometeam_id column.
  • Complete the second CASE statement in the same way, using awayteam_id.
SELECT 
  date,
  -- Identify the home team as Barcelona or Real Madrid
  CASE WHEN hometeam_id = 8634 THEN 'FC Barcelona' 
  ELSE 'Real Madrid CF' END AS home,
  -- Identify the away team as Barcelona or Real Madrid
  CASE WHEN awayteam_id = 8634 THEN 'FC Barcelona' 
  ELSE 'Real Madrid CF' END AS away
FROM 
  matches_spain
WHERE 
  (awayteam_id = 8634 OR hometeam_id = 8634) AND
  (awayteam_id = 8633 OR hometeam_id = 8633);

Instruction 2

  • Construct the final CASE statement identifying who won each match. Note there are 3 possible outcomes, but 5 conditions that you need to identify.
  • Fill in the logical operators to identify Barcelona or Real Madrid as the winner.
SELECT 
  date,
  CASE WHEN hometeam_id = 8634 THEN 'FC Barcelona' 
  ELSE 'Real Madrid CF' END as home,
  CASE WHEN awayteam_id = 8634 THEN 'FC Barcelona' 
  ELSE 'Real Madrid CF' END as away,
  -- Identify all possible match outcomes
  CASE WHEN home_goal > away_goal AND 
            hometeam_id = 8634 
  THEN 'Barcelona win!'
  WHEN home_goal > away_goal AND 
       hometeam_id = 8633 
  THEN 'Real Madrid win!'
  WHEN home_goal < away_goal AND 
       awayteam_id = 8634 
  THEN 'Barcelona win!'
  WHEN home_goal < away_goal AND 
       awayteam_id = 8633 
  THEN 'Real Madrid win!'
  ELSE 'Tie!' END AS outcome
FROM 
  matches_spain
WHERE (awayteam_id = 8634 OR hometeam_id = 8634) AND    
      (awayteam_id = 8633 OR hometeam_id = 8633);

1.7 Filtering your CASE statement

Let’s generate a list of matches won by Italy’s Bologna team! There are quite a few additional teams in the two tables, so a key part of generating a usable query will be using your CASE statement as a filter in the WHERE clause.

CASE statements allow you to categorize data that you’re interested in – and exclude data you’re not interested in. In order to do this, you can use a CASE statement as a filter in the WHERE statement to remove output you don’t want to see.

Here is how you might set that up:

SELECT *
FROM table
WHERE 
    CASE WHEN a > 5 THEN 'Keep'
         WHEN a <= 5 THEN 'Exclude' END = 'Keep';

In essence, you can use the CASE statement as a filtering column like any other column in your database. The only difference is that you don’t alias the statement in WHERE.
Instruction 1

  • Identify Bologna’s team ID listed in the teams_italy table by selecting the team_long_name and team_api_id.
-- Select team_long_name and team_api_id from team
SELECT
  team_long_name,
  team_api_id
FROM teams_italy
-- Filter for team name
WHERE team_long_name = 'Bologna';

Instruction 2

  • Select the season and date that a match was played.
  • Complete the CASE statement so that only Bologna’s home and away wins are identified.
-- Select the season and date columns
SELECT 
  season,
  date,
  -- Identify when Bologna won a match
  CASE WHEN hometeam_id = 9857 AND 
            home_goal > away_goal THEN 'Bologna Win'
  WHEN awayteam_id = 9857 AND 
       away_goal > home_goal THEN 'Bologna Win' 
  END AS outcome
FROM matches_italy;

Instruction 3

  • Select the home_goal and away_goal for each match.
  • Use the CASE statement in the WHERE clause to filter all NULL values generated by the statement in the previous step.
-- Select the season, date, home_goal, and away_goal columns
SELECT 
  season,
  date,
  home_goal,
  away_goal
FROM 
  matches_italy
WHERE 
  -- Exclude games not won by Bologna
  CASE WHEN hometeam_id = 9857 AND 
            home_goal > away_goal THEN 'Bologna Win'
  WHEN awayteam_id = 9857 AND 
       away_goal > home_goal THEN 'Bologna Win' 
  END IS NOT NULL;

1.8 CASE WHEN with aggregate functions (video)

1.9 COUNT using CASE WHEN

Instruction 1

  • Create a CASE statement that identifies the id of matches played in the 2012/2013 season. Specify that you want ELSE values to be NULL.
  • Wrap the CASE statement in a COUNT function and group the query by the country alias.
SELECT 
  c.name AS country,
  -- Count games from the 2012/2013 season
  COUNT(CASE WHEN m.season = '2012/2013' 
        THEN m.id 
        ELSE NULL END) AS matches_2012_2013
FROM country AS c
LEFT JOIN match AS m
ON c.id = m.country_id
-- Group by country name alias
GROUP BY country;

Instruction 2

  • Create 3 CASE WHEN statements counting the matches played in each country across the 3 seasons.
  • END your CASE statement without an ELSE clause.
SELECT 
  c.name AS country,
  -- Count matches in each of the 3 seasons
  COUNT(CASE WHEN m.season = '2012/2013' 
        THEN m.id END) AS matches_2012_2013,
  COUNT(CASE WHEN m.season = '2013/2014' 
        THEN m.id END) AS matches_2013_2014,
  COUNT(CASE WHEN m.season = '2014/2015' 
        THEN m.id END) AS matches_2014_2015
FROM country AS c
LEFT JOIN match AS m
ON c.id = m.country_id
-- Group by country name alias
GROUP BY country;

1.10 COUNT and CASE WHEN with multiple conditions

Instruction

  • Create 3 CASE statements to “count” matches in the '2012/2013', '2013/2014', and '2014/2015' seasons, respectively.
  • Have each CASE statement return a 1 for every match you want to include, and a 0 for every match to exclude.
  • Wrap the CASE statement in a SUM to return the total matches played in each season.
  • Group the query by the country name alias.
SELECT 
	c.name AS country,
    -- Sum the total records in each season where the home team won
	SUM(CASE WHEN m.season = '2012/2013' AND m.home_goal > m.away_goal 
        THEN 1 ELSE 0 END) AS matches_2012_2013,
 	SUM(CASE WHEN m.season = '2013/2014' AND m.home_goal > m.away_goal  
        THEN 1 ELSE 0 END) AS matches_2013_2014,
	SUM(CASE WHEN m.season = '2014/2015' AND m.home_goal > m.away_goal 
        THEN 1 ELSE 0 END) AS matches_2014_2015
FROM country AS c
LEFT JOIN match AS m
ON c.id = m.country_id
-- Group by country name alias
GROUP BY country;

1.11 Calculating percent with CASE and AVG

Instruction

  • Create 3 CASE statements to COUNT the total number of home team wins, away team wins, and ties. This will allow you to examine the total number of records. You will convert this to an AVG in the next step.
SELECT 
  c.name AS country,
  -- Sum the home wins, away wins, and ties in each country
  COUNT(CASE WHEN m.home_goal > m.away_goal 
        THEN m.id END) AS home_wins,
  COUNT(CASE WHEN m.home_goal < m.away_goal 
        THEN m.id END) AS away_wins,
  COUNT(CASE WHEN m.home_goal = m.away_goal 
        THEN m.id END) AS ties
FROM country AS c
LEFT JOIN matches AS m
ON c.id = m.country_id
GROUP BY country;

Instruction 2

  • Calculate the percentage of matches tied using a CASE statement inside AVG.
  • Fill in the logical operators for each statement. Alias your columns as ties_2013_2014 and ties_2014_2015, respectively.
SELECT 
  c.name AS country,
  -- Calculate the percentage of tied games in each season
  AVG(CASE WHEN m.season='2013/2014' AND 
                m.home_goal = m.away_goal THEN 1
	  WHEN m.season='2013/2014' AND 
	       m.home_goal != m.away_goal THEN 0 END) AS ties_2013_2014,
  AVG(CASE WHEN m.season='2014/2015' AND 
                m.home_goal = m.away_goal THEN 1
	  WHEN m.season='2014/2015' AND 
	       m.home_goal != m.away_goal THEN 0 END) AS ties_2014_2015
FROM country AS c
LEFT JOIN matches AS m
ON c.id = m.country_id
GROUP BY country;

Instruction 3

  • The previous “ties” columns returned values with 14 decimal points, which is not easy to interpret. Use the ROUND function to round to 2 decimal points.
SELECT 
  c.name AS country,
  -- Round the percentage of tied games to 2 decimal points
  ROUND(AVG(CASE WHEN m.season='2013/2014' AND 
                      m.home_goal = m.away_goal THEN 1
			WHEN m.season='2013/2014' AND 
			      m.home_goal != m.away_goal THEN 0
			END),2) AS pct_ties_2013_2014,
  ROUND(AVG(CASE WHEN m.season='2014/2015' AND 
                      m.home_goal = m.away_goal THEN 1
			WHEN m.season='2014/2015' AND 
			      m.home_goal != m.away_goal THEN 0
			END),2) AS pct_ties_2014_2015
FROM country AS c
LEFT JOIN matches AS m
ON c.id = m.country_id
GROUP BY country;

2. Short and Simple Subqueries

2.1 WHERE are the Subqueries? (video)

2.2 Filtering using scalar subqueries

Instruction 1:

  • Calculate triple the average home + away goals scored across all matches. This will become your subquery in the next step. Note that this column does not have an alias, so it will be called ?column? in your results.
-- Select the average of home + away goals, multiplied by 3
SELECT 
	3 * AVG(home_goal + away_goal)
FROM matches_2013_2014;

Instruction 2:

SELECT 
	-- Select the date, home goals, and away goals scored
    date,
	home_goal,
	away_goal
FROM  matches_2013_2014
-- Filter for matches where total goals exceeds 3x the average
WHERE (home_goal + away_goal) > 
       (SELECT 3 * AVG(home_goal + away_goal)
        FROM matches_2013_2014); 

2.3 Filtering using a subquery with a list

Instruction:

  • Create a subquery in the WHERE clause that retrieves all unique hometeam_ID values from the match table.
  • Select the team_long_name and team_short_name from the team table. Exclude all values from the subquery in the main query.
SELECT 
	-- Select the team long and short names
	team_long_name,
	team_short_name
FROM team 
-- Exclude all values from the subquery
WHERE team_api_id NOT IN
     (SELECT DISTINCT hometeam_ID  FROM match);

2.4 Filtering with more complex subquery conditions

Instruction:

  • Create a subquery in WHERE clause that retrieves all hometeam_ID values from match with a home_goal score greater than or equal to 8.
  • Select the team_long_name and team_short_name from the team table. Include all values from the subquery in the main query.
SELECT
	-- Select the team long and short names
	team_long_name,
	team_short_name
FROM team
-- Filter for teams with 8 or more home goals
WHERE team_api_id IN
	  (SELECT hometeam_ID 
       FROM match
       WHERE home_goal >= 8);

2.5 Subqueries in FROM (video)

2.6 Joining Subqueries in FROM

  • Create the subquery to be used in the next step, which selects the country ID and match ID (id) from the match table.
  • Filter the query for matches with greater than or equal to 10 goals.

Instruction 1:

SELECT 
	-- Select the country ID and match ID
	country_id, 
    id 
FROM match
-- Filter for matches with 10 or more goals in total
WHERE (home_goal + away_goal) >= 10;

Instruction 2:

SELECT
	-- Select country name and the count match IDs
    c.name AS country_name,
    COUNT(sub.id) AS matches
FROM country AS c
-- Inner join the subquery onto country
-- Select the country id and match id columns
INNER JOIN (SELECT country_id, id 
           FROM match
           -- Filter the subquery by matches with 10+ goals
           WHERE (home_goal + away_goal) >= 10) AS sub
ON c.id = sub.country_id
GROUP BY country_name;

2.7 Building on Subqueries in FROM

Instruction :

SELECT
	-- Select country, date, home, and away goals from the subquery
    country,
    date,
    home_goal,
    away_goal
FROM 
	-- Select country name, date, and total goals in the subquery
	(SELECT c.name AS country, 
     	    m.date, 
     		m.home_goal, 
     		m.away_goal,
           (m.home_goal + m.away_goal) AS total_goals
    FROM match AS m
    LEFT JOIN country AS c
    ON m.country_id = c.id) AS subq
-- Filter by total goals scored in the main query
WHERE total_goals >= 10;

2.8 Subqueries in SELECT (video)

2.9 Add a subquery to the SELECT clause

Instruction:

  • In the subquery, select the average total goals by adding home_goal and away_goal.
  • Filter the results so that only the average of goals in the 2013/2014 season is calculated.
  • In the main query, select the average total goals by adding home_goal and away_goal. This calculates the average goals for each league.
  • Filter the results in the main query the same way you filtered the subquery. Group the query by the league name.
SELECT 
	l.name AS league,
    -- Select and round the league's total goals
    ROUND(AVG(m.home_goal + m.away_goal), 2) AS avg_goals,
    -- Select & round the average total goals for the season
    (SELECT round(AVG(home_goal + away_goal), 2) 
     FROM match
     WHERE season = '2013/2014') AS overall_avg
FROM league AS l
LEFT JOIN match AS m
ON l.country_id = m.country_id
-- Filter for the 2013/2014 season
WHERE season = '2013/2014'
GROUP BY league;

2.10 Subqueries in Select for Calculations

Instruction:

  • Select the average goals scored in a match for each league in the main query.
  • Select the average goals scored in a match overall for the 2013/2014 season in the subquery.
  • Subtract the subquery from the average number of goals calculated for each league.
  • Filter the main query so that only games from the 2013/2014 season are included.
SELECT
	-- Select the league name and average goals scored
	l.name AS league,
	ROUND(AVG(m.home_goal + m.away_goal),2) AS avg_goals,
    -- Subtract the overall average from the league average
	ROUND(AVG(m.home_goal + m.away_goal) - 
		(SELECT AVG(home_goal + away_goal)
		 FROM match 
         WHERE season = '2013/2014'),2) AS diff
FROM league AS l
LEFT JOIN match AS m
ON l.country_id = m.country_id
-- Only include 2013/2014 results
WHERE m.season = '2013/2014'
GROUP BY l.name;

2.11 Subqueries everywhere! And best practices! (video)

2.12 ALL the Subqueries EVERYWHERE

Instruction:

  • Extract the average number of home and away team goals in two SELECT subqueries.
  • Calculate the average home and away goals for the specific stage in the main query.
  • Filter both subqueries and the main query so that only data from the 2012/2013 season is included.
  • Group the query by the m.stage column.
SELECT 
	-- Select the stage and average goals for each stage
	m.stage,
    ROUND(AVG(m.home_goal + m.away_goal),2) AS avg_goals,
    -- Select the average overall goals for the 2012/2013 season
    ROUND((SELECT AVG(home_goal + away_goal) 
           FROM match 
           WHERE season = '2012/2013'),2) AS overall
FROM match AS m
-- Filter for the 2012/2013 season
WHERE m.season = '2012/2013'
-- Group by stage
GROUP BY stage;

2.13 Add a subquery in FROM

Instruction:

  • Calculate the average home goals and average away goals from the match table for each stage in the FROM clause subquery.
  • Add a subquery to the WHERE clause that calculates the overall average home goals.
  • Filter the main query for stages where the average home goals is higher than the overall average.
  • Select the stage and avg_goals columns from the s subquery into the main query.
SELECT 
	-- Select the stage and average goals from the subquery
	s.stage,
	ROUND(s.avg_goals,2) AS avg_goals
FROM 
	-- Select the stage and average goals in 2012/2013
	(SELECT
		 stage,
         AVG(home_goal + away_goal) AS avg_goals
	 FROM match
	 WHERE season = '2012/2013'
	 GROUP BY stage) AS s
WHERE 
	-- Filter the main query using the subquery
	s.avg_goals > (SELECT AVG(home_goal + away_goal) 
                    FROM match WHERE season = '2012/2013');

2.14 Add a subquery in SELECT

Instruction:

  • Create a subquery in SELECT that yields the average goals scored in the 2012/2013 season. Name the new column overall_avg.
  • Create a subquery in FROM that calculates the average goals scored in each stage during the 2012/2013 season.
  • Filter the main query for stages where the average goals exceeds the overall average in 2012/2013.
SELECT 
	-- Select the stage and average goals from s
	s.stage,
    ROUND(s.avg_goals,2) AS avg_goal,
    -- Select the overall average for 2012/2013
    (SELECT AVG(home_goal + away_goal) FROM match WHERE season = '2012/2013') AS overall_avg
FROM 
	-- Select the stage and average goals in 2012/2013 from match
	(SELECT
		 stage,
         AVG(home_goal + away_goal) AS avg_goals
	 FROM match
	 WHERE season = '2012/2013'
	 GROUP BY stage) AS s
WHERE 
	-- Filter the main query using the subquery
	s.avg_goals > (SELECT AVG(home_goal + away_goal) 
                    FROM match WHERE season = '2012/2013');

3. Correlated Queries, Nested Queries, and Common Table Expressions

3.1 Correlated Subqueries (video)

3.2 Basic Correlated Subqueries

Instruction:

  • Select the country_id, date, home_goal, and away_goal columns in the main query.
  • Complete the AVG value in the subquery.
  • Complete the subquery column references, so that country_id is matched in the main and subquery.
SELECT 
	-- Select country ID, date, home, and away goals from match
	main.country_id,
    date,
    main.home_goal, 
    main.away_goal
FROM match AS main
WHERE 
	-- Filter the main query by the subquery
	(home_goal + away_goal) > 
        (SELECT AVG((sub.home_goal + sub.away_goal) * 3)
         FROM match AS sub
         -- Join the main query to the subquery in WHERE
         WHERE main.country_id = sub.country_id);

3.3 Correlated subquery with multiple conditions

  • Select the country_id, date, home_goal, and away_goal columns in the main query.
  • Complete the subquery: Select the matches with the highest number of total goals.
  • Match the subquery to the main query using country_id and season.
  • Fill in the correct logical operator so that total goals equals the max goals recorded in the subquery.

Instruction 1:

SELECT 
	-- Select country ID, date, home, and away goals from match
	main.country_id,
    main.date,
    main.home_goal,
    main.away_goal
FROM match AS main
WHERE 
	-- Filter for matches with the highest number of goals scored
	(home_goal + away_goal) = 
        (SELECT MAX(sub.home_goal + sub.away_goal)
         FROM match AS sub
         WHERE main.country_id = sub.country_id
               AND main.season = sub.season);

3.4 Nested subqueries (video)

3.5 Nested simple subqueries

Instruction 1:

  • Complete the main query to select the season and the max total goals in a match for each season. Name this max_goals.
  • Complete the first simple subquery to select the max total goals in a match across all seasons. Name this overall_max_goals.
  • Complete the nested subquery to select the maximum total goals in a match played in July across all seasons.
  • Select the maximum total goals in the outer subquery. Name this entire subquery july_max_goals.
SELECT
	-- Select the season and max goals scored in a match
	season,
    MAX(home_goal + away_goal) AS max_goals,
    -- Select the overall max goals scored in a match
   (SELECT MAX(home_goal + away_goal) FROM match) AS overall_max_goals,
   -- Select the max number of goals scored in any match in July
   (SELECT MAX(home_goal + away_goal) 
    FROM match
    WHERE id IN (
          SELECT id FROM match WHERE EXTRACT(MONTH FROM date) = 07)) AS july_max_goals
FROM match
GROUP BY season;

3.6 Nest a subquery in FROM

Instruction 1:

  • Generate a list of matches where at least one team scored 5 or more goals.
-- Select matches where a team scored 5+ goals
SELECT
	country_id,
    season,
	id
FROM match
WHERE home_goal > 5 OR away_goal > 5;

Instruction 2:

  • Turn the query from the previous step into a subquery in the FROM statement.
  • COUNT the match ids generated in the previous step, and group the query by country_id and season.
-- Count match ids
SELECT
    country_id,
    season,
    count(id) AS matches
-- Set up and alias the subquery
FROM (
	SELECT
    	country_id,
    	season,
    	id
	FROM match
	WHERE home_goal >= 5 OR away_goal >= 5) AS subquery
-- Group by country_id and season
GROUP BY country_id, season;

Instruction 3:

  • Finally, declare the same query from step 2 as a subquery in FROM with the alias outer_s. Left join it to the country table using the outer query’s country_id column.
  • Calculate an AVG of high scoring matches per season in the main query.
SELECT
	c.name AS country,
    -- Calculate the average matches per season
	AVG(id) AS avg_seasonal_high_scores
FROM country AS c
-- Left join outer_s to country
LEFT JOIN (
  SELECT country_id, season,
         COUNT(id) AS matches
  FROM (
    SELECT country_id, season, id
	FROM match
	WHERE home_goal >= 5 OR away_goal >= 5) AS inner_s
  -- Close parentheses and alias the subquery
  GROUP BY country_id, season ) AS outer_s
ON c.id = outer_s.country_id
GROUP BY country;

3.7 Common Table Expressions (video)

3.8 Clean up with CTEs

Instruction:

  • Complete the syntax to declare your CTE.
  • Select the country_id and match id from the match table in your CTE.
  • Left join the CTE to the league table using country_id.
-- Set up your CTE
WITH match_list AS (
    SELECT 
  		country_id, 
  		id
    FROM match
    WHERE (home_goal + away_goal) >= 10)
-- Select league and count of matches from the CTE
SELECT
    l.name AS league,
    COUNT(match_list.id) AS matches
FROM league AS l
-- Join the CTE to the league table
LEFT JOIN match_list ON l.id = match_list.country_id
GROUP BY l.name;

3.9 Organizing with CTEs

Instruction:

  • Declare your CTE, where you create a list of all matches with the league name.
  • Select the league, date, home, and away goals from the CTE.
  • Filter the main query for matches with 10 or more goals.
-- Set up your CTE
WITH match_list AS (
  -- Select the league, date, home, and away goals
    SELECT 
  		l.name AS league, 
     	m.date, 
  		m.home_goal, 
  		m.away_goal,
       (m.home_goal + m.away_goal) AS total_goals
    FROM match AS m
    LEFT JOIN league as l ON m.country_id = l.id)
-- Select the league, date, home, and away goals from the CTE
SELECT league, date, home_goal, away_goal
FROM match_list
-- Filter by total goals
WHERE total_goals >= 10;

3.10 CTEs with nested subqueries

Instruction 1:

  • Declare a CTE that calculates the total goals from matches in August of the 2013/2014 season.
  • Left join the CTE onto the league table using country_id from the match_list CTE.
  • Filter the list on the inner subquery to only select matches in August of the 2013/2014 season.
-- Set up your CTE
WITH match_list AS (
    SELECT 
  		country_id,
  	   (home_goal + away_goal) AS goals
    FROM match
  	-- Create a list of match IDs to filter data in the CTE
    WHERE id IN (
       SELECT id
       FROM match
       WHERE season = '2013/2014' AND EXTRACT(MONTH FROM date) = 8))
-- Select the league name and average of goals in the CTE
SELECT 
	l.name,
    AVG(match_list.goals)
FROM league AS l
-- Join the CTE onto the league table
LEFT JOIN match_list ON l.id = match_list.country_id
GROUP BY l.name;

3.11 Deciding on techniques to use (video)

3.12 Get team names with a subquery

Instruction 1:

  • Create a query that left joins team to match in order to get the identity of the home team. This becomes the subquery in the next step.
SELECT 
	m.id, 
    t.team_long_name AS hometeam
-- Left join team to match
FROM match AS m
LEFT JOIN team as t
ON m.hometeam_id = team_api_id;

Instruction 2:

  • Add a second subquery to the FROM statement to get the away team name, changing only the hometeam_id. Left join both subqueries to the match table on the id column.
SELECT
	m.date,
    -- Get the home and away team names
    hometeam,
    awayteam,
    m.home_goal,
    m.away_goal
FROM match AS m
-- Join the home subquery to the match table
LEFT JOIN (
  SELECT match.id, team.team_long_name AS hometeam
  FROM match
  LEFT JOIN team
  ON match.hometeam_id = team.team_api_id) AS home
ON home.id = m.id
-- Join the away subquery to the match table
LEFT JOIN (
  SELECT match.id, team.team_long_name AS awayteam
  FROM match
  LEFT JOIN team
  -- Get the away team ID in the subquery
  ON match.awayteam_id = team.team_api_id) AS away
ON away.id = m.id;

3.13 Get team names with correlated subqueries

Instruction 1:

  • Using a correlated subquery in the SELECT statement, match the team_api_id column from team to the hometeam_id from match.
SELECT
    m.date,
   (SELECT team_long_name
    FROM team AS t
    -- Connect the team to the match table
    WHERE t.team_api_id = m.hometeam_id) AS hometeam
FROM match AS m;

Instruction 2:

SELECT
    m.date,
    (SELECT team_long_name
     FROM team AS t
     WHERE t.team_api_id = m.hometeam_id) AS hometeam,
    -- Connect the team to the match table
    (SELECT team_long_name
     FROM team AS t
     WHERE t.team_api_id = m.awayteam_id) AS awayteam,
    -- Select home and away goals
     home_goal,
     away_goal
FROM match AS m;

3.14 Get team names with CTEs

Instruction 1:

  • Select id from match and team_long_name from team. Join these two tables together on hometeam_id in match and team_api_id in team.
SELECT 
	-- Select match id and team long name
    m.id, 
    t.team_long_name AS hometeam
FROM match AS m
-- Join team to match using team_api_id and hometeam_id
LEFT JOIN team AS t 
ON t.team_api_id = m.hometeam_id;

Instruction 2:

  • Declare the query from the previous step as a common table expression. SELECT everything from the CTE into the main query. Your results will not change at this step!
-- Declare the home CTE
WITH home AS (
	SELECT m.id, t.team_long_name AS hometeam
	FROM match AS m
	LEFT JOIN team AS t 
	ON m.hometeam_id = t.team_api_id)
-- Select everything from home
SELECT *
FROM home;

Instruction 3:

  • Let’s declare the second CTE, away. Join it to the first CTE on the id column.
  • The date, home_goal, and away_goal columns have been added to the CTEs. SELECT them into the main query.
WITH home AS (
  SELECT m.id, m.date, 
  		 t.team_long_name AS hometeam, m.home_goal
  FROM match AS m
  LEFT JOIN team AS t 
  ON m.hometeam_id = t.team_api_id),
-- Declare and set up the away CTE
away AS (
  SELECT m.id, m.date, 
  		 t.team_long_name AS awayteam, m.away_goal
  FROM match AS m
  LEFT JOIN team AS t 
  ON m.awayteam_id = t.team_api_id)
-- Select date, home_goal, and away_goal
SELECT 
	home.date,
    home.hometeam,
    away.awayteam,
    home.home_goal,
    away.away_goal
-- Join away and home on the id column
FROM home
INNER JOIN away
ON home.id = away.id;

3.15 Which technique to use?

4. Window Functions

4.1 It’s OVER (video)

4.2 The match is OVER

Instruction 1:

  • Select the match ID, country name, season, home, and away goals from the match and country tables.
  • Complete the query that calculates the average number of goals scored overall and then includes the aggregate value in each row using a window function.
SELECT 
	-- Select the id, country name, season, home, and away goals
	m.id, 
    c.name AS country, 
    m.season,
	m.home_goal,
	m.away_goal,
    -- Use a window to include the aggregate average in each row
	AVG(m.home_goal + m.away_goal) OVER() AS overall_avg
FROM match AS m
LEFT JOIN country AS c ON m.country_id = c.id;

4.3 What’s OVER here?

Instruction:

  • Select the league name and average total goals scored from league and match.
  • Complete the window function so it calculates the rank of average goals scored across all leagues in the database.
  • Order the rank by the average total of home and away goals scored.
SELECT 
	-- Select the league name and average goals scored
	l.name AS league,
    AVG(m.home_goal + m.away_goal) AS avg_goals,
    -- Rank each league according to the average goals
    RANK() OVER(ORDER BY AVG(m.home_goal + m.away_goal)) AS league_rank
FROM league AS l
LEFT JOIN match AS m 
ON l.id = m.country_id
WHERE m.season = '2011/2012'
GROUP BY l.name
-- Order the query by the rank you created
ORDER BY league_rank;

4.4 Flip OVER your results

Instruction 1:

  • Complete the same parts of the query as the previous exercise.
  • Complete the window function to rank each league from highest to lowest average goals scored.
  • Order the main query by the rank you just created.
SELECT 
	-- Select the league name and average goals scored
	l.name AS league,
    AVG(m.home_goal + m.away_goal) AS avg_goals,
    -- Rank leagues in descending order by average goals
    RANK () OVER(ORDER BY AVG(m.home_goal + m.away_goal) DESC) AS league_rank
FROM league AS l
LEFT JOIN match AS m 
ON l.id = m.country_id
WHERE m.season = '2011/2012'
GROUP BY l.name
-- Order the query by the rank you created
ORDER BY league_rank;

4.5 OVER with a PARTITION (video)

4.6 PARTITION BY a column

Instruction 1:

  • Complete the two window functions that calculate the home and away goal averages. Partition the window functions by season to calculate separate averages for each season.
  • Filter the query to only include matches played by Legia Warszawa, id = 8673.
SELECT
	date,
	season,
	home_goal,
	away_goal,
	CASE WHEN hometeam_id = 8673 THEN 'home' 
		 ELSE 'away' END AS warsaw_location,
    -- Calculate the average goals scored partitioned by season
    AVG(home_goal) OVER (PARTITION BY season) AS season_homeavg,
    AVG(away_goal) OVER (PARTITION BY season) AS season_awayavg
FROM match
-- Filter the data set for Legia Warszawa matches only
WHERE 
	hometeam_id = 8673 
    OR awayteam_id = 8673
ORDER BY (home_goal + away_goal) DESC;

4.7 PARTITION BY multiple columns

  • Construct two window functions partitioning the average of home and away goals by season and month.
  • Filter the data set by Legia Warszawa’s team ID (8673) so that the window calculation excludes all teams who did not play against them.

Instruction:

SELECT 
	date,
	season,
	home_goal,
	away_goal,
	CASE WHEN hometeam_id = 8673 THEN 'home' 
         ELSE 'away' END AS warsaw_location,
	-- Calculate average goals partitioned by season and month
    AVG(home_goal) OVER(PARTITION BY season, 
         	EXTRACT(month FROM date)) AS season_mo_home,
    AVG(away_goal) OVER(PARTITION BY season, 
            EXTRACT(month FROM date)) AS season_mo_away
FROM match
WHERE 
	hometeam_id = 8673 
    OR awayteam_id = 8673
ORDER BY (home_goal + away_goal) DESC;

4.8 Sliding Windows (video)

4.9 Slide to the left

  • Complete the window function by:
    • Assessing the running total of home goals scored by FC Utrecht.
    • Assessing the running average of home goals scored.
    • Ordering both the running average and running total by date.

Instruction:

SELECT 
	date,
	home_goal,
	away_goal,
    -- Create a running total and running average of home goals
    SUM(home_goal) OVER(ORDER BY date 
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
    AVG(home_goal) OVER(ORDER BY date 
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_avg
FROM match
WHERE 
	hometeam_id = 9908 
	AND season = '2011/2012';

4.10 Slide to the right

Instruction 1:

  • Complete the window function by:
    • Assessing the running total of home goals scored by FC Utrecht.
    • Assessing the running average of home goals scored.
    • Ordering both the running average and running total by date, descending.
SELECT 
	-- Select the date, home goal, and away goals
	date,
    home_goal,
    away_goal,
    -- Create a running total and running average of home goals
    SUM(home_goal) OVER(ORDER BY date 
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
    AVG(home_goal) OVER(ORDER BY date 
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_avg
FROM match
WHERE 
	awayteam_id = 9908 
    AND season = '2011/2012';

4.11 Bringing it all together (video)

4.12 Setting up the home team CTE

Instruction:

  • Create a CASE statement that identifies each match as a win, lose, or tie for Manchester United.
  • Fill out the logical operators for each WHEN clause in the CASE statement (equals, greater than, less than).
  • Join the tables on home team ID from match, and team_api_id from team.
  • Filter the query to only include games from the 2014/2015 season where Manchester United was the home team.
SELECT 
	m.id, 
    t.team_long_name,
    -- Identify matches as home/away wins or ties
	CASE WHEN m.home_goal > m.away_goal THEN 'MU Win'
		WHEN m.home_goal < m.away_goal THEN 'MU Loss'
        ELSE 'Tie' END AS outcome
FROM match AS m
-- Left join team on the home team ID and team API id
LEFT JOIN team AS t 
ON m.hometeam_id = t.team_api_id
WHERE 
	-- Filter for 2014/2015 and Manchester United as the home team
	season = '2014/2015'
	AND t.team_long_name = 'Manchester United';

4.13 Setting up the away team CTE

Instruction:

  • Complete the CASE statement syntax.
  • Fill out the logical operators identifying each match as a win, loss, or tie for Manchester United.
  • Join the table on awayteam_id, and team_api_id.
SELECT 
	m.id, 
    t.team_long_name,
    -- Identify matches as home/away wins or ties
	CASE WHEN m.home_goal > m.away_goal THEN 'MU Loss'
		WHEN m.home_goal < m.away_goal THEN 'MU Win'
        ELSE 'Tie' END AS outcome
-- Join team table to the match table
FROM match AS m
LEFT JOIN team AS t 
ON m.awayteam_id = t.team_api_id
WHERE 
	-- Filter for 2014/2015 and Manchester United as the away team
	season = '2014/2015'
	AND t.team_long_name = 'Manchester United';

4.14 Putting the CTEs together

Instruction:

  • Declare the home and away CTEs before your main query.
  • Join your CTEs to the match table using a LEFT JOIN.
  • Select the relevant data from the CTEs into the main query.
  • Select the date from match, team names from the CTEs, and home/ away goals from match in the main query.
-- Set up the home team CTE
WITH home AS (
  SELECT m.id, t.team_long_name,
	  CASE WHEN m.home_goal > m.away_goal THEN 'MU Win'
		   WHEN m.home_goal < m.away_goal THEN 'MU Loss' 
  		   ELSE 'Tie' END AS outcome
  FROM match AS m
  LEFT JOIN team AS t ON m.hometeam_id = t.team_api_id),
-- Set up the away team CTE
away AS (
  SELECT m.id, t.team_long_name,
	  CASE WHEN m.home_goal > m.away_goal THEN 'MU Win'
		   WHEN m.home_goal < m.away_goal THEN 'MU Loss' 
  		   ELSE 'Tie' END AS outcome
  FROM match AS m
  LEFT JOIN team AS t ON m.awayteam_id = t.team_api_id)
-- Select team names, the date and goals
SELECT DISTINCT
    m.date,
    home.team_long_name AS home_team,
    away.team_long_name AS away_team,
    m.home_goal,
    m.away_goal
-- Join the CTEs onto the match table
FROM match AS m
LEFT JOIN home ON m.id = home.id
LEFT JOIN away ON m.id = away.id
WHERE m.season = '2014/2015'
      AND (home.team_long_name = 'Manchester United' 
           OR away.team_long_name = 'Manchester United');

4.15 Add a window function

Instruction 1:

  • Set up the CTEs so that the home and away teams each have a name, ID, and score associated with them.
  • Select the date, home team name, away team name, home goal, and away goals scored in the main query.
  • Rank the matches and order by the difference in scores in descending order.
-- Set up the home team CTE
WITH home AS (
  SELECT m.id, t.team_long_name,
	  CASE WHEN m.home_goal > m.away_goal THEN 'MU Win'
		   WHEN m.home_goal < m.away_goal THEN 'MU Loss' 
  		   ELSE 'Tie' END AS outcome
  FROM match AS m
  LEFT JOIN team AS t ON m.hometeam_id = t.team_api_id),
-- Set up the away team CTE
away AS (
  SELECT m.id, t.team_long_name,
	  CASE WHEN m.home_goal > m.away_goal THEN 'MU Loss'
		   WHEN m.home_goal < m.away_goal THEN 'MU Win' 
  		   ELSE 'Tie' END AS outcome
  FROM match AS m
  LEFT JOIN team AS t ON m.awayteam_id = t.team_api_id)
-- Select columns and and rank the matches by date
SELECT DISTINCT
    m.date,
    home.team_long_name AS home_team,
    away.team_long_name AS away_team,
    m.home_goal, m.away_goal,
    RANK() OVER(ORDER BY ABS(home_goal - away_goal) DESC) as match_rank
-- Join the CTEs onto the match table
FROM match AS m
LEFT JOIN home ON m.id = home.id
LEFT JOIN away ON m.id = away.id
WHERE m.season = '2014/2015'
      AND ((home.team_long_name = 'Manchester United' AND home.outcome = 'MU Loss')
      OR (away.team_long_name = 'Manchester United' AND away.outcome = 'MU Loss'));
  • 1
    点赞
  • 5
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值