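When duplicate data turns up in MySQL, we sometimes need to deal with it. Below I introduce some ways of handling duplicate data in MySQL, including deleting duplicates, sorting data without duplicates, querying for duplicates, and so on.
Today I worked on a feature that needed the first ten rows of a table in sorted order, with no duplicate values among them. For example, suppose table a contains the following data: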
mysql> select * from a;
+----+----------+
| id | user |
+----+----------+
| 1 | zhangsan |
| 2 | lisi |
| 3 | wangwu |
| 4 | zhangsan |
| 5 | zhaosi |
| 6 | wangwu |
| 7 | lisi |
| 8 | lisi |
| 9 | zhaosi |
+----+----------+
9 rows in set (0.00 sec)
We need the four users with the largest ids, with no user repeated. Given the data and the requirement above, the result we want is:
zhaosi
lisi
wangwu
zhangsan
The usual approach does not work here, for example:
mysql> select * from a order by id desc limit 4;
+----+--------+
| id | user |
+----+--------+
| 9 | zhaosi |
| 8 | lisi |
| 7 | lisi |
| 6 | wangwu |
+----+--------+
4 rows in set (0.00 sec)
This result contains duplicate users, so the DISTINCT keyword is needed:
mysql> select distinct user from a order by id desc limit 4;
+----------+
| user |
+----------+
| zhaosi |
| wangwu |
| lisi |
| zhangsan |
+----------+
4 rows in set (0.00 sec)
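Ideally lisi and wangwu would be swapped, because lisi's largest id is 8 while wangwu's largest id is 6. This may be caused by lisi also having a row with id 2, so let's delete the row with id 2 and try again: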
mysql> delete from a where id=2;
Query OK, 1 row affected (0.02 sec)
mysql> select * from a;
+----+----------+
| id | user |
+----+----------+
| 1 | zhangsan |
| 3 | wangwu |
| 4 | zhangsan |
| 5 | zhaosi |
| 6 | wangwu |
| 7 | lisi |
| 8 | lisi |
| 9 | zhaosi |
+----+----------+
8 rows in set (0.00 sec)
mysql> select distinct user from a order by id desc limit 4;
+----------+
| user |
+----------+
| lisi |
| zhaosi |
| wangwu |
| zhangsan |
+----------+
4 rows in set (0.00 sec)
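The result changed, which suggests that the earlier row with the lower id was indeed influencing the sort order.
Although this statement returns the right four users, the ordering is still not ideal: the four users with the largest ids are found, but they are not returned in order of id (zhaosi, whose largest id is 9, comes after lisi, whose largest id is 8).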
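One way to make the ordering explicit is to group by user and sort the groups by MAX(id). This is only a sketch: it assumes that "largest id" means each user's largest id, and the alias max_id is introduced purely for illustration. Unlike SELECT DISTINCT ... ORDER BY id, it does not depend on which row's id MySQL happens to associate with each user, and it should also run when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5), a mode under which the DISTINCT query above is rejected because id is not in the select list.

```sql
-- One row per user, ordered by that user's largest id.
-- max_id is an illustrative alias, not a column of table a.
SELECT `user`, MAX(`id`) AS max_id
FROM a
GROUP BY `user`
ORDER BY max_id DESC
LIMIT 4;
```

Run against the original nine rows of a, this should return zhaosi (9), lisi (8), wangwu (6) and zhangsan (4), the order asked for at the start.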
Example 1: test data
/* Table structure */
DROP TABLE IF EXISTS `t1`;
CREATE TABLE IF NOT EXISTS `t1`(
`id` INT(1) NOT NULL AUTO_INCREMENT,
`name` VARCHAR(20) NOT NULL,
`add` VARCHAR(20) NOT NULL,
PRIMARY KEY(`id`)
)Engine=InnoDB;
/* Insert test data */
INSERT INTO `t1`(`name`,`add`) VALUES
('abc',"123"),
('abc',"123"),
('abc',"321"),
('abc',"123"),
('xzy',"123"),
('xzy',"456"),
('xzy',"456"),
('xzy',"456"),
('xzy',"789"),
('xzy',"987"),
('xzy',"789"),
('ijk',"147"),
('ijk',"147"),
('ijk',"852"),
('opq',"852"),
('opq',"963"),
('opq',"741"),
('tpk',"741"),
('tpk',"963"),
('tpk',"963"),
('wer',"546"),
('wer',"546"),
('once',"546");
SELECT * FROM `t1`;
+----+------+-----+
| id | name | add |
+----+------+-----+
| 1 | abc | 123 |
| 2 | abc | 123 |
| 3 | abc | 321 |
| 4 | abc | 123 |
| 5 | xzy | 123 |
| 6 | xzy | 456 |
| 7 | xzy | 456 |
| 8 | xzy | 456 |
| 9 | xzy | 789 |
| 10 | xzy | 987 |
| 11 | xzy | 789 |
| 12 | ijk | 147 |
| 13 | ijk | 147 |
| 14 | ijk | 852 |
| 15 | opq | 852 |
| 16 | opq | 963 |
| 17 | opq | 741 |
| 18 | tpk | 741 |
| 19 | tpk | 963 |
| 20 | tpk | 963 |
| 21 | wer | 546 |
| 22 | wer | 546 |
| 23 | once | 546 |
+----+------+-----+
23 rows in set (0.00 sec)
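Find the row with the smallest id in each group of duplicates (id column only)
/* Find the row with the smallest id in each group of duplicates (id column only) */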
SELECT MIN(`id`) AS `id`
FROM `t1`
GROUP BY `name`,`add`
HAVING COUNT(1) > 1;
+------+
| id |
+------+
| 1 |
| 12 |
| 19 |
| 21 |
| 6 |
| 9 |
+------+
6 rows in set (0.00 sec)
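When checking the data it can help to see which values are duplicated and how often, not just the ids. The variant below is a small sketch of the same GROUP BY ... HAVING query; the aliases min_id and cnt are added here for illustration only.

```sql
/* Illustrative variant: min_id and cnt are aliases added for this sketch */
SELECT `name`, `add`, MIN(`id`) AS min_id, COUNT(*) AS cnt
FROM `t1`
GROUP BY `name`, `add`
HAVING COUNT(*) > 1
ORDER BY min_id;
```

With the test data above this should report six groups: (abc, 123) and (xzy, 456) occur three times each, and the other four groups occur twice each.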
Find all duplicate rows
/* Find all duplicate rows */
SELECT `t1`.*
FROM `t1`,(
SELECT `name`,`add`
FROM `t1`
GROUP BY `name`,`add`
HAVING COUNT(1) > 1
) AS `t2`
WHERE `t1`.`name` = `t2`.`name`
AND `t1`.`add` = `t2`.`add`;
+----+------+-----+
| id | name | add |
+----+------+-----+
| 1 | abc | 123 |
| 2 | abc | 123 |
| 4 | abc | 123 |
| 6 | xzy | 456 |
| 7 | xzy | 456 |
| 8 | xzy | 456 |
| 9 | xzy | 789 |
| 11 | xzy | 789 |
| 12 | ijk | 147 |
| 13 | ijk | 147 |
| 19 | tpk | 963 |
| 20 | tpk | 963 |
| 21 | wer | 546 |
| 22 | wer | 546 |
+----+------+-----+
14 rows in set (0.00 sec)
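On MySQL 8.0 or later, which supports window functions, the same rows can be found without joining back to a grouped subquery. The following is a sketch under that version assumption; the names counted and cnt are hypothetical and exist only in this example.

```sql
/* Assumes MySQL 8.0+ (window functions).
   counted and cnt are illustrative names. */
SELECT `id`, `name`, `add`
FROM (
    SELECT `id`, `name`, `add`,
           COUNT(*) OVER (PARTITION BY `name`, `add`) AS cnt
    FROM `t1`
) AS counted
WHERE cnt > 1
ORDER BY `id`;
```

Each row carries the size of its own (name, add) group in cnt, so keeping only rows where cnt is greater than 1 leaves every member of every duplicate group.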
Find the duplicate rows other than the one with the smallest id
/* Find the duplicate rows other than the one with the smallest id */
SELECT `t1`.*
FROM `t1`,(
SELECT MIN(`id`) AS `id`,`name`,`add`
FROM `t1`
GROUP BY `name`,`add`
HAVING COUNT(1) > 1
) AS `t2`
WHERE `t1`.`name` = `t2`.`name`
AND `t1`.`add` = `t2`.`add`
AND `t1`.`id` <> `t2`.`id`;
+----+------+-----+
| id | name | add |
+----+------+-----+
| 2 | abc | 123 |
| 4 | abc | 123 |
| 7 | xzy | 456 |
| 8 | xzy | 456 |
| 11 | xzy | 789 |
| 13 | ijk | 147 |
| 20 | tpk | 963 |
| 22 | wer | 546 |
+----+------+-----+
8 rows in set (0.00 sec)
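The rows returned by this last query are the ones that would have to go if we wanted to keep a single copy of each (name, add) pair. A minimal sketch of such a delete, using MySQL's multi-table DELETE syntax with a self-join, might look like the following. The aliases dup and keep are hypothetical, and because the statement is destructive it is worth trying on a copy of the table first.

```sql
/* Delete every duplicate except the row with the smallest id in its group.
   dup and keep are illustrative aliases for two references to t1. */
DELETE `dup`
FROM `t1` AS `dup`
JOIN `t1` AS `keep`
  ON  `dup`.`name` = `keep`.`name`
  AND `dup`.`add`  = `keep`.`add`
  AND `dup`.`id`   > `keep`.`id`;
```

A row is deleted whenever another row with the same name and add has a smaller id, so with the test data this should remove the eight rows listed above and leave 15 rows in t1.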