03 聚合分组查询
聚合函数
SQL123 SQL类别高难度试卷得分的截断平均值
牛客的运营同学想要查看大家在SQL类别中高难度试卷的得分情况
请你帮她从exam_record数据表中计算所有用户完成SQL类别高难度试卷得分的截断平均值(去掉一个最大值和一个最小值后的平均值)
SELECT tag,difficulty,ROUND((S-A-B)/(N-2),1) AS clip_avg_score FROM
(SELECT tag,difficulty,COUNT(score) AS N,MAX(score) AS A,MIN(score) AS B,SUM(score) AS S
FROM exam_record AS ER
JOIN examination_info AS EI ON EI.exam_id=ER.exam_id
WHERE tag='SQL' AND difficulty='hard'
ORDER BY score) AS T
SQL124 统计作答次数
有一个试卷作答记录表exam_record,请从中统计出总作答次数total_pv、试卷已完成作答数complete_pv、已完成的试卷数complete_exam_cnt
SELECT
COUNT(exam_id) AS total_pv,
SUM(IF(submit_time IS NOT NULL,1,0)) AS complete_pv,
COUNT(DISTINCT IF(submit_time IS NOT NULL,exam_id, NULL)) AS complete_exam_cnt
FROM exam_record
SQL125 得分不小于平均分的最低分
请从试卷作答记录表中找到SQL试卷得分不小于该类试卷平均得分的用户最低得分
SELECT MIN(score) AS min_score_over_avg FROM exam_record
JOIN examination_info ON exam_record.exam_id=examination_info.exam_id
WHERE score>=
(SELECT AVG(score) AS A FROM exam_record AS ER
JOIN examination_info AS EI ON ER.exam_id=EI.exam_id
WHERE tag='SQL'
GROUP BY tag)
AND tag='SQL'
分组查询
SQL126 平均活跃天数和月活人数
用户在牛客试卷作答区作答记录存储在表exam_record中
请计算2021年每个月里试卷作答区用户平均月活跃天数avg_active_days和月度活跃人数mau
SELECT
CONCAT(SUBSTR(submit_time,1,4),SUBSTR(submit_time,6,2)) AS month,
ROUND(COUNT(DISTINCT uid,DAY(submit_time))/COUNT(DISTINCT uid),2) AS avg_active_days,
COUNT(DISTINCT uid) AS mau
FROM exam_record
WHERE submit_time IS NOT NULL
AND YEAR(submit_time)=2021
GROUP BY month
SQL127 月总刷题数和日均刷题数
现有一张题目练习记录表practice_record
请从中统计出2021年每个月里用户的月总刷题数month_q_cnt 和日均刷题数avg_day_q_cnt(按月份升序排序)以及该年的总体情况
SELECT
DATE_FORMAT(submit_time,'%Y%m') AS submit_month,
COUNT(submit_time) AS month_q_cnt,
ROUND(COUNT(submit_time)/MAX(DAY(last_day(submit_time))),3) AS avg_day_q_cnt
FROM practice_record
WHERE YEAR(submit_time)=2021
GROUP BY submit_month
UNION ALL
SELECT
'2021汇总' AS submit_month,
COUNT(*) AS month_q_cnt,
ROUND(COUNT(*) /31 ,3) AS avg_day_q_cnt
FROM practice_record
WHERE year(submit_time) = '2021'
ORDER BY submit_month
SQL128 未完成试卷数大于1的有效用户
现有试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)
请统计2021年每个未完成试卷作答数大于1的有效用户的数据(有效用户指完成试卷作答数至少为1且未完成数小于5),输出用户ID、未完成试卷作答数、完成试卷作答数、作答过的试卷tag集合,按未完成试卷数量由多到少排序
SELECT
uid,
SUM(IF(submit_time IS NULL,1,0)) AS incomplete_cnt,
SUM(IF(submit_time IS NOT NULL,1,0)) AS complete_cnt,
GROUP_CONCAT(DISTINCT CONCAT_WS(':', DATE(start_time), tag) SEPARATOR ';') AS detail
FROM exam_record AS ER
LEFT JOIN examination_info AS EI ON ER.exam_id=EI.exam_id
WHERE YEAR(start_time)=2021
GROUP BY uid
HAVING incomplete_cnt BETWEEN 2 AND 4
AND complete_cnt>=1
ORDER BY incomplete_cnt DESC
GROUP_CONCAT
将 GROUP BY 产生的同一个分组中的值连接起来,返回一个字符串结果
GROUP_CONCAT( [DISTINCT] 要连接的字段 [ORDER BY ] [SEPARATOR '分隔符'] )
group_concat只有与group by语句同时使用才能产生效果 所以使用 GROUP_CONCAT必须对源数据进行分组,否则所有数据会被合并成一行
CONCAT_WS
将多个字符串连接成一个字符串,可以指定分隔符
CONCAT_WS(separator, str1, str2, ...)
04 多表查询
嵌套子查询
SQL129 月均完成试卷数不小于3的用户爱作答的类别
请从表中统计出 “当月均完成试卷数”不小于3的用户们爱作答的类别及作答次数,按次数降序输出
SELECT tag, COUNT(exam_id) AS tag_cnt
FROM exam_record
JOIN examination_info USING(exam_id)
WHERE uid IN (
SELECT uid
FROM exam_record
WHERE score IS NOT NULL
GROUP BY uid
HAVING COUNT(score)/COUNT(DISTINCT DATE_FORMAT(start_time, "%Y%m"))>=3
)
GROUP BY tag
ORDER BY tag_cnt DESC
SQL130 试卷发布当天作答人数和平均分
请计算每张SQL类别试卷发布后,当天5级以上的用户作答的人数uv和平均分avg_score,按人数降序,相同人数的按平均分升序
SELECT exam_id,COUNT(DISTINCT uid) AS uv,ROUND(AVG(score),1) AS avg_score
FROM exam_record
WHERE
(exam_id, DATE(start_time)) IN
(SELECT exam_id, DATE(release_time) FROM examination_info WHERE tag = "SQL")
AND uid IN ( SELECT uid FROM user_info WHERE level>5 )
GROUP BY exam_id
ORDER BY uv DESC,avg_score
SQL131 作答试卷得分大于过80的人的用户等级分布
统计作答SQL类别的试卷得分大于过80的人的用户等级分布,按数量降序排序(保证数量都不同)
SELECT level,COUNT(uid) AS level_cnt
FROM user_info
WHERE uid IN
(SELECT uid FROM exam_record JOIN examination_info USING(exam_id)
WHERE tag='SQL' AND score>80)
GROUP BY level
ORDER BY level_cnt DESC
合并查询
SQL132 每个题目和每份试卷被作答的人数和次数
请统计每个题目和每份试卷被作答的人数和次数,分别按照"试卷"和"题目"的uv&pv降序显示
SELECT * FROM
(SELECT exam_id AS tid,COUNT(DISTINCT uid) AS uv,COUNT(uid) AS pv
FROM exam_record
GROUP BY tid
ORDER BY uv DESC,pv DESC) AS E
UNION
SELECT * FROM
(SELECT question_id AS tid,COUNT(DISTINCT uid) AS uv,COUNT(uid) AS pv
FROM practice_record
GROUP BY tid
ORDER BY uv DESC,pv DESC) AS Q
SQL133 分别满足两个活动的人
假使以前我们有两拨运营活动,分别给每次试卷得分都能到85分的人(activity1)、至少有一次用了一半时间就完成高难度试卷且分数大于80的人(activity2)发了福利券
请写出一个SQL实现:输出2021年里,所有每次试卷得分都能到85分的人以及至少有一次用了一半时间就完成高难度试卷且分数大于80的人的id和活动号,按用户ID排序输出
SELECT DISTINCT uid,'activity1' AS activity FROM exam_record
WHERE uid NOT IN ( SELECT uid FROM exam_record WHERE score<85)
AND YEAR(submit_time)=2021
UNION
SELECT DISTINCT uid,'activity2' AS activity FROM exam_record
JOIN examination_info USING(exam_id)
WHERE YEAR(submit_time)=2021
AND TIMESTAMPDIFF(SECOND,start_time,submit_time)<=1800
AND difficulty='hard'
AND score>=80
ORDER BY uid
连接查询
SQL134 满足条件的用户的试卷完成数和题目练习数
请你找到高难度SQL试卷得分平均值大于80并且是7级的红名大佬,统计他们的2021年试卷总完成次数和题目总练习次数,只保留2021年有试卷完成记录的用户。结果按试卷完成数升序,按题目练习数降序
SELECT
H.uid,
COUNT(DISTINCT J.id) AS exam_cnt,
COUNT(DISTINCT N.id) AS question_cnt
FROM
(SELECT UI.uid FROM exam_record AS ER
JOIN user_info AS UI ON ER.uid=UI.uid
JOIN examination_info AS EI ON ER.exam_id=EI.exam_id
WHERE EI.tag='SQL' AND EI.difficulty='hard'
AND UI.level=7
AND YEAR(ER.submit_time)=2021
GROUP BY uid
HAVING AVG(ER.score)>80) AS H
LEFT JOIN exam_record AS J ON H.uid=J.uid
AND YEAR(J.submit_time)=2021
LEFT JOIN practice_record AS N ON H.uid=N.uid
AND YEAR(N.submit_time)=2021
GROUP BY uid
ORDER BY exam_cnt,question_cnt DESC
SQL135 每个6/7级用户活跃情况
请统计每个6/7级用户总活跃月份数、2021年活跃天数、2021年试卷作答活跃天数、2021年答题活跃天数,按照总活跃月份数、2021年活跃天数降序排序
SELECT
UI.uid,
COUNT(DISTINCT LEFT(T,6)) AS act_month_total,
COUNT(DISTINCT IF(LEFT(T,4)='2021' ,RIGHT(T,4),NULL)) AS act_days_2021,
COUNT(DISTINCT IF(LEFT(T,4)='2021' AND tag='B',RIGHT(T,4),NULL)) AS act_days_2021_exam,
COUNT(DISTINCT IF(LEFT(T,4)='2021' AND tag='A',RIGHT(T,4),NULL)) AS act_days_2021_question
FROM
(
SELECT uid,DATE_FORMAT(submit_time,'%Y%m%d') AS T,'A' AS tag FROM practice_record
UNION ALL
SELECT uid,DATE_FORMAT(start_time,'%Y%m%d') AS T,'B' AS tag FROM exam_record
) AS UT
RIGHT JOIN user_info AS UI ON UI.uid=UT.uid
WHERE UI.level>=6
GROUP BY uid
ORDER BY act_month_total DESC ,act_days_2021 DESC