Q1 [0 points] (q1_sample):
The purpose of this query is to make sure that the formatting of your output matches exactly the formatting of our auto-grading script.
Details: List all types of work ordered by type ascendingly.
Answer: Here's the correct SQL query and expected output:
SELECT
name
FROM
work_type
ORDER BY
name;
Q2 [5 points] (q2_long_name):
List works with longest name of each type.
Details: For each work type, find works that have the longest names. There might be cases where there is a tie for the longest names - in that case, return all of them. Display work names and corresponding type names, and order it according to work type (ascending) and use work name (ascending) as tie-breaker.
SELECT
a.name,
work_type.name
FROM
( SELECT MAX( LENGTH( name ) ), name, type FROM work GROUP BY type ) AS a,
work_type
WHERE
a.type = work_type.id
ORDER BY a.type ASC;
标准答案:
select work.name,
work_type.name
from work
inner join (
select max(length(work.name)) as max_length,
work.type as type
from work
group by work.type
) as newtable on newtable.max_length = length(work.name)
and work.type = newtable.type
inner join work_type on work.type = work_type.id
order by work.type asc,
work.name asc;
Q3 [5 points] (q3_old_music_nations):
List top 10 countries with the most classical music artists (born or started before 1850) along with the number of associated artists.
Details: Print country and number of associated arists before 1850. For example, Russia|191
. Sort by number of artists in descending order.
SELECT
area.name,
COUNT( 1 ) AS number
FROM
artist,
area
WHERE
area.id = artist.area
AND artist.begin_date_year < 1850
GROUP BY
artist.area
ORDER BY
number DESC
LIMIT 10;
标准答案:
select a2.name,
count(*) as c
from artist a1
inner join area a2 on a1.area = a2.id
where begin_date_year < 1850
group by a1.area
order by c desc
limit 10;
Q4 [10 points] (q4_dubbed_smash):
List the top 10 dubbed artist names with the number of dubs.
Details: Count the number of distinct names in artist_alias
for each artist in the artist
table, and list only the top ten who's from the United Kingdom and started after 1950 (not included). Print the artist name in the artist
table and the number of corresponding distinct dubbed artist names in the artist_alias
table.
SELECT
artist.name,
COUNT( 1 ) AS number
FROM
artist
INNER JOIN artist_alias ON artist.id = artist_alias.artist
WHERE
begin_date_year > 1950
AND area IN ( SELECT id FROM area WHERE name = 'United Kingdom' )
GROUP BY
artist.name
ORDER BY
number DESC
LIMIT 10;
标准答案:
Select artist.name,
count(distinct artist_alias.name) as num
From artist
inner join artist_alias on artist.id = artist_alias.artist
Where artist.begin_date_year > 1950
and area = 221
Group by artist.id
Order by num desc
Limit 10;
Q5 [10 points] (q5_vinyl_lover):
List the distinct names of releases issued in vinyl format by the British band Coldplay.
Details: Vinyl format includes ALL vinyl dimensions excluding VinylDisc
. Sort the release names by release date ascendingly.
SELECT DISTINCT
"release".name
FROM
artist_credit_name,
"release",
release_info,
medium,
medium_format
WHERE
artist_credit_name.name = 'Coldplay'
AND medium_format.name LIKE '%Vinyl'
AND artist_credit_name.artist_credit = "release".artist_credit
AND "release".id = release_info."release"
AND "release".id = medium."release"
AND medium.format = medium_format.id
ORDER BY
release_info.date_year,
release_info.date_month,
release_info.date_day;
标准答案:
select distinct r1.name as rname
from artist_credit_name a1
inner join artist_credit a2 on a1.artist_credit = a2.id
inner join release r1 on a2.id = r1.artist_credit
inner join release_info r2 on r1.id = r2.release
inner join medium m1 on r1.id = m1.release
inner join medium_format m2 on m1.format = m2.id
where a1.name = 'Coldplay'
and m2.name like '%Vinyl'
order by date_year,
date_month,
date_day;
Q6 [10 points] (q6_old_is_not_gold):
Which decades saw the most number of official releases? List the number of official releases in every decade since 1900. Like 1970s|57210
.
Details: Print all decades and the number of official releases. Releases with different issue dates or countries are considered different releases. Print the relevant decade in a fancier format by constructing a string that looks like this: 1970s
. Sort the decades in decreasing order with respect to the number of official releases and use decade (descending) as tie-breaker. Remember to exclude releases whose dates are NULL
.
SELECT
decade || 's' AS decades,
COUNT( 1 ) AS num
FROM
(
SELECT
release_info.date_year / 10 * 10 AS decade
FROM
release_info,
"release"
WHERE
release_info."release" = "release".id
AND "release".status = 1
AND release_info.date_year >= 1900
)
GROUP BY
decade
ORDER BY
num DESC,
decade DESC
标准答案:
Select decade,
count(*) as cnt
from (
select (CAST((date_year / 10) as int) * 10) || 's' as decade
from release
inner join release_info on release.id = release_info.release
where release.status = 1
and date_year >= 1900
)
Group by decade
Order by cnt desc,
decade desc;
Q7 [15 points] (q7_release_percentage):
List the month and the percentage of all releases issued in the corresponding month all over the world in the past year. Display like 2020.01|5.95
.
Details: The percentage of releases for a month is the number of releases issued in that month devided by the total releases in the past year from 07/2019 to 07/2020, both included. Releases with different issue dates or countries are considered different releases. Round the percentage to two decimal places using ROUND()
. Sort by dates in ascending order.
WITH num ( date_year, date_month ) AS (
SELECT
date_year,
date_month
FROM
"release"
INNER JOIN release_info ON "release".id = release_info."release"
WHERE
( date_year = 2019 AND date_month > 6 )
OR ( date_year = 2020 AND date_month < 8 )
) SELECT CAST
( date_year AS varchar ) || '.' || ( CASE WHEN date_month < 10 THEN '0' ELSE '' END ) || CAST ( date_month AS varchar ) AS date,
ROUND( COUNT( * ) * 100.0 / ( SELECT COUNT( * ) FROM num ), 2 )
FROM
num
GROUP BY
date
ORDER BY
date ASC;
标准答案:
with past_year_release (year, month) as (
select date_year,
date_month
from release_info r1
inner join release r2 on r1.release = r2.id
where (
(
date_year = 2019
and date_month >= 7
)
or (
date_year = 2020
and date_month <= 7
)
)
)
select cast(year as varchar) || '.' || (
case
when month < 10 then '0'
else ''
end
) || cast(month as varchar) as date,
round(
count(*) * 100.0 / (
select count(*)
from past_year_release
),
2
)
from past_year_release
group by date
order by date;
Q8 [15 points] (q8_collaborate_artist):
List the number of artists who have collaborated with Ariana Grande.
Details: Print only the total number of artists. An artist is considered a collaborator if they appear in the same artist_credit with Ariana Grande. The answer should include Ariana Grande herself.
SELECT
COUNT( DISTINCT artist )
FROM
artist_credit_name
WHERE
artist_credit IN ( SELECT artist_credit FROM artist_credit_name WHERE name = 'Ariana Grande' );
标准答案:
Select count(distinct artist)
From artist_credit_name
Where artist_credit in (
select artist_credit
from artist_credit_name
where name = 'Ariana Grande'
);
Q9 [15 points] (q9_dre_and_eminem):
List the rank, artist names, along with the number of collaborative releases of Dr. Dre and Eminem among other most productive duos (as long as they appear in the same release) both started after 1960 (not included). Display like [rank]|Dr. Dre|Eminem|[# of releases]
.
Details: For example, if you see a release by A, B, and C, it will contribute to three pairs of duos: A|B|1
, A|C|1
, and B|C|1
. You will first need to calculate a rank of these duos by number of collaborated releases (release with artist_credit shared by both artists) sorted descendingly, and then find the rank of Dr. Dre
and Eminem
. Only releases in English are considered. Both artists should be solo artists. All pairs of names should have the alphabetically smaller one first. Use artist names (asc) as tie breaker.
Hint: Artist aliases may be used everywhere. When doing aggregation, using artist ids will ensure you get the correct results. One example entry in the rank list is 9|Benj Pasek|Justin Paul|27
标准答案:
with duos_list (id1, id2, count) as (
select a1.artist as id1,
a2.artist as id2,
count(*) as c
from artist_credit_name a1
inner join artist_credit_name a2 on a1.artist_credit = a2.artist_credit
inner join release r on a2.artist_credit = r.artist_credit
inner join artist a3 on a1.artist = a3.id
inner join artist a4 on a2.artist = a4.id
inner join artist_type a5 on a3.type = a5.id
inner join artist_type a6 on a4.type = a6.id
inner join language l on r.language = l.id
where a3.name < a4.name
and a5.name = "Person"
and a6.name = "Person"
and l.name = 'English'
and a3.begin_date_year > 1960
and a4.begin_date_year > 1960
group by a1.artist,
a2.artist
)
select *
from (
select row_number () over (
order by count desc,
a1.name,
a2.name
) as rank,
a1.name as name1,
a2.name as name2,
count
from duos_list d
inner join artist a1 on d.id1 = a1.id
inner join artist a2 on d.id2 = a2.id
)
where name1 = 'Dr. Dre'
and name2 = 'Eminem';
Q10 [15 points] (q10_around_the_world):
Concat all dubbed names of The Beatles using comma-separated values(like "Beetles, fab four
").
Details: Find all dubbed names of artist "The Beatles
" in artist_alias and order them by id (ascending). Print a single string containing all the dubbed names separated by commas.
标准答案:
with c as (
select row_number() over (
order by c.id asc
) as seqnum,
c.name as name
from artist_alias c
join artist on c.artist = artist.id
where artist.name = 'The Beatles'
),
flattened as (
select seqnum,
name as name
from c
where seqnum = 1
union all
select c.seqnum,
f.name || ', ' || c.name
from c
join flattened f on c.seqnum = f.seqnum + 1
)
select name
from flattened
order by seqnum desc
limit 1;