IT 面試中,數據庫的相關問題基本上屬於必考問題,而其中關於sql語句也是經常考察的一個重要知識點。
下面介紹下sql語句中一個比較重要的操作group by,他的重要行一方面體現在他的理解困難度,一方面體現應用中的長見性。
首先,給出一個studnet學生表:
CREATE TABLE `student` ( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(30) DEFAULT NULL, `sex` tinyint(1) DEFAULT '0', `score` int(10) NOT NULL, `dept` varchar(10) DEFAULT NULL, PRIMARY KEY (`id`) ) ENGINE=MyISAM AUTO_INCREMENT=8 DEFAULT CHARSET=utf8
添加一些測試數據:
mysql> select * from student where id<10; +----+------+------+-------+---------+ | id | name | sex | score | dept | +----+------+------+-------+---------+ | 1 | a | 1 | 90 | dev | | 2 | b | 1 | 90 | dev | | 3 | b | 0 | 88 | design | | 4 | c | 0 | 60 | sales | | 5 | c | 0 | 89 | sales | | 6 | d | 1 | 100 | product | +----+------+------+-------+---------+
給出需求,寫出sql:
給出各個部門最高學生的分數。
要想得到各個部門學生,首先就要分組,按照部門把他們分組,然後在各個部門中找到分數最高的就可以了。
所以sql語句為:
mysql> select *, max(score) as max from student group by dept order by name; +----+------+------+-------+---------+------+ | id | name | sex | score | dept | max | +----+------+------+-------+---------+------+ | 1 | a | 1 | 90 | dev | 90 | | 3 | b | 0 | 88 | design | 88 | | 4 | c | 0 | 60 | sales | 89 | | 6 | d | 1 | 100 | product | 100 | +----+------+------+-------+---------+------+ 4 rows in set (0.00 sec)
這只是個簡單的例子,我們可以再把這個例子復雜化,比如分數最高的必須是女生,即sex列值必須為1才挑選出,這時的sql語句應該為:
mysql> select *,max(score) as max from student group by dept having sex='1' order by name; +----+------+------+-------+---------+------+ | id | name | sex | score | dept | max | +----+------+------+-------+---------+------+ | 1 | a | 1 | 90 | dev | 90 | | 6 | d | 1 | 100 | product | 100 | +----+------+------+-------+---------+------+ 2 rows in set (0.46 sec)這裡我們沒有用where語句而是用了having,這裡簡單說明一下,因為我們的條件是在分組後進行的,其實分組前挑選出sex='1',然後再按照dept部門分組,也是可行的,這裡就要看題目是怎麼要求的:
mysql> select *,max(score) as max from student where sex='1' group by dept order by name; +----+------+------+-------+---------+------+ | id | name | sex | score | dept | max | +----+------+------+-------+---------+------+ | 1 | a | 1 | 90 | dev | 90 | | 6 | d | 1 | 100 | product | 100 | +----+------+------+-------+---------+------+ 2 rows in set (0.05 sec)
查詢出的結果時一致的,如果把選擇條件改為必須部門所有人的分數之和大於150才能把分數最高的部門的人列出來,這裡就必須使用having了,因為 having 裡面可以使用聚合函數sum,並且也必須分完組我們才能得到這個組的總分數,才能比較是否該值大於150:
mysql> select *,max(score) as max from student group by dept having sum(score)>150 order by name; +----+------+------+-------+---------+------+ | id | name | sex | score | dept | max | +----+------+------+-------+---------+------+ | 1 | a | 1 | 90 | dev | 90 | | 6 | d | 1 | 100 | product | 100 | +----+------+------+-------+---------+------+ 2 rows in set (0.00 sec)
額外增加一個例子,比如我要選出不重復的部門,我們可以使用
mysql> select distinct dept from student; +---------+ | dept | +---------+ | dev | | design | | sales | | product | +---------+ 4 rows in set (0.02 sec)
但是如果我們還要列出他的id等一些其他信息,我們如果這樣:
mysql> select name,distinct dept from student; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'distinct dept from student' at line 1
這是不行的,因為distinct只能放到開始位置,如果:
mysql> select distinct dept,name from student; +---------+------+ | dept | name | +---------+------+ | dev | a | | dev | b | | design | b | | sales | c | | product | d | | product | m | +---------+------+ 6 rows in set (0.00 sec)為什麼沒有達到預期的效果,因為distinct 作用到了2個字段上,這時,我們就需要groub by 出場了。
mysql> select dept,name from student group by dept; +---------+------+ | dept | name | +---------+------+ | design | b | | dev | a | | product | d | | sales | c | +---------+------+ 4 rows in set (0.00 sec)
按照dept分組,自然就達到去重的目的了。所以有時候如果我們碰到了一個問題很難解決,比如用distinct去重,並帶上其他列值,我們就需要嘗試換個思路,可能答案自然就找到了。