
Deleting Duplicate Rows in SQL While Keeping One

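When working with SQL, we often need to delete duplicate data without deleting all of it: one row from each group of duplicates has to stay. In this example, two rows count as duplicates when they agree on the two columns a2 and a3 (in practice the same approach extends to more columns).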

Example: table A

|a1|a2|a3|
|---|---|---|
|1|1|1|
|2|1|2|
|3|2|2|
|4|2|2|
|5|3|3|
|6|2|2|

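In the table above, the rows with a1 = 3, 4, and 6 have the same (a2, a3) values, so we keep one of them and delete the others. The procedure is as follows: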

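  1. Select the duplicated groups. Only grouped columns and aggregates are selected, so the query also runs when MySQL's ONLY_FULL_GROUP_BY mode is enabled:

SELECT MIN(a1) AS a1, a2, a3, COUNT(*)
FROM A
GROUP BY a2, a3
HAVING COUNT(*) > 1;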

The result is:

|a1|a2|a3|COUNT(*)|
|---|---|---|---|
|3|2|2|3|
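
To try this step without a MySQL server, the grouping query can be run on the sample data. The following is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory SQLite database as a stand-in for MySQL (an assumption, not part of the original setup); the names `first_a1`, `cnt`, and `duplicate_groups` are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table A (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a1 INTEGER PRIMARY KEY, a2 INTEGER, a3 INTEGER)")
conn.executemany(
    "INSERT INTO A (a1, a2, a3) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# One row per duplicated (a2, a3) pair: its smallest a1 and the group size.
duplicate_groups = conn.execute(
    """
    SELECT MIN(a1) AS first_a1, a2, a3, COUNT(*) AS cnt
    FROM A
    GROUP BY a2, a3
    HAVING COUNT(*) > 1
    """
).fetchall()

print(duplicate_groups)  # [(3, 2, 2, 3)]
```

Only the pair (2, 2) appears more than once, so a single group is reported.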
  2. Use IN to find the rows that belong to those groups:

SELECT *
FROM A
WHERE (a2, a3) IN
(SELECT a2, a3
FROM A
GROUP BY a2, a3
HAVING COUNT(*) > 1);

The result is:

|a1|a2|a3|
|---|---|---|
|3|2|2|
|4|2|2|
|6|2|2|

From here the rest is straightforward.

  3. Select the rows to delete:

SELECT *
FROM A
WHERE (a2, a3) IN
(SELECT a2, a3
FROM A
GROUP BY a2, a3
HAVING COUNT(*) > 1)
AND a1 NOT IN
(SELECT MIN(a1)
FROM A
GROUP BY a2, a3
HAVING COUNT(*) > 1);
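
To sanity-check this selection, the query can be run on the sample data. The sketch below again uses Python's built-in `sqlite3` module with an in-memory SQLite database as a stand-in for MySQL (an assumption; the row-value `IN` needs SQLite 3.15 or newer). It also compares the query with a shorter form that only works if `a1` is unique and non-null, which the original does not state and is assumed here.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table A (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a1 INTEGER PRIMARY KEY, a2 INTEGER, a3 INTEGER)")
conn.executemany(
    "INSERT INTO A (a1, a2, a3) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# The selection from step 3: rows in a duplicated group, except the one
# with the smallest a1 in that group.
rows_to_delete = conn.execute(
    """
    SELECT a1, a2, a3
    FROM A
    WHERE (a2, a3) IN
          (SELECT a2, a3 FROM A GROUP BY a2, a3 HAVING COUNT(*) > 1)
      AND a1 NOT IN
          (SELECT MIN(a1) FROM A GROUP BY a2, a3 HAVING COUNT(*) > 1)
    ORDER BY a1
    """
).fetchall()

# Shorter form (assumes a1 is unique and non-null): a row in a group of
# one is its own minimum, so keeping every group's MIN(a1) is enough.
simplified = conn.execute(
    """
    SELECT a1, a2, a3
    FROM A
    WHERE a1 NOT IN (SELECT MIN(a1) FROM A GROUP BY a2, a3)
    ORDER BY a1
    """
).fetchall()

print(rows_to_delete)  # [(4, 2, 2), (6, 2, 2)]
print(simplified)      # [(4, 2, 2), (6, 2, 2)]
```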

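This selects every duplicate except the row with the smallest a1 in each group, which is the row we keep.
Note that you cannot simply change SELECT to DELETE at this point. If you do, MySQL reports the following error:

You can't specify target table 'A' for update in FROM clause

The error means that, within a single statement, you cannot modify a table while also selecting from that same table in a subquery. So the statement needs a small change.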

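  4. Delete the rows. The SQL statements are as follows:

-- Create the intermediate table
CREATE TABLE F (a1 INTEGER, a2 INTEGER, a3 INTEGER);
-- Copy the rows to be deleted into the intermediate table
INSERT INTO F (a1, a2, a3)
SELECT a1, a2, a3
FROM A
WHERE (a2, a3) IN
(SELECT a2, a3
FROM A
GROUP BY a2, a3
HAVING COUNT(*) > 1)
AND a1 NOT IN
(SELECT MIN(a1)
FROM A
GROUP BY a2, a3
HAVING COUNT(*) > 1);
-- Delete from A the rows recorded in the intermediate table
DELETE FROM A WHERE a1 IN (SELECT a1 FROM F);
-- Drop the intermediate table once it is no longer needed
DROP TABLE F;
SELECT * FROM A;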

The result is:

|a1|a2|a3|
|---|---|---|
|1|1|1|
|2|1|2|
|3|2|2|
|5|3|3|

Done.
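
As an alternative to the intermediate table, a commonly used workaround in MySQL is to wrap the subquery in a derived table, so that the DELETE no longer reads A directly. The following is a minimal sketch of that idea, not the method used in this article; the aliases `keep_a1` and `keepers` are illustrative names. It is run here through Python's `sqlite3` on an in-memory database (an assumption), so it only confirms which rows remain. SQLite does not raise the MySQL error in the first place, so the sketch does not demonstrate MySQL's behavior.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table A (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a1 INTEGER PRIMARY KEY, a2 INTEGER, a3 INTEGER)")
conn.executemany(
    "INSERT INTO A (a1, a2, a3) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# Keep the smallest a1 of every (a2, a3) group and delete everything else.
# The inner SELECT is wrapped in a derived table (alias "keepers").
DELETE_SQL = """
DELETE FROM A
WHERE a1 NOT IN (
    SELECT keep_a1
    FROM (SELECT MIN(a1) AS keep_a1 FROM A GROUP BY a2, a3) AS keepers
)
"""

cur = conn.execute(DELETE_SQL)
deleted = cur.rowcount  # number of rows removed by the DELETE
conn.commit()

remaining = conn.execute("SELECT a1, a2, a3 FROM A ORDER BY a1").fetchall()
print(deleted)    # 2
print(remaining)  # [(1, 1, 1), (2, 1, 2), (3, 2, 2), (5, 3, 3)]
```

This form keeps the smallest a1 of every group, including groups of one, so it assumes a1 is unique and non-null.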

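Note: if you do not need to keep one row from each group, things are much simpler and a single statement is enough. In MySQL the subquery still has to be wrapped in a derived table (aliased dup here); otherwise the statement selects from the table it is deleting from and raises the same "can't specify target table" error:

DELETE FROM A WHERE (a2, a3) IN (SELECT a2, a3 FROM (SELECT a2, a3 FROM A GROUP BY a2, a3 HAVING COUNT(*) > 1) AS dup);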
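
To see what is left when every member of a duplicated group is removed, the statement can be run on the sample data. This is a minimal sketch using Python's built-in `sqlite3` module with an in-memory SQLite database as a stand-in for MySQL (an assumption; row-value `IN` needs SQLite 3.15 or newer). The derived-table alias `dup` is an illustrative name, and the wrapping is only needed for MySQL, not for SQLite.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table A (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a1 INTEGER PRIMARY KEY, a2 INTEGER, a3 INTEGER)")
conn.executemany(
    "INSERT INTO A (a1, a2, a3) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 2), (5, 3, 3), (6, 2, 2)],
)

# Delete every row whose (a2, a3) pair occurs more than once; no row from
# a duplicated group survives.
conn.execute(
    """
    DELETE FROM A
    WHERE (a2, a3) IN (
        SELECT a2, a3
        FROM (SELECT a2, a3 FROM A GROUP BY a2, a3 HAVING COUNT(*) > 1) AS dup
    )
    """
)
conn.commit()

remaining_rows = conn.execute("SELECT a1, a2, a3 FROM A ORDER BY a1").fetchall()
print(remaining_rows)  # [(1, 1, 1), (2, 1, 2), (5, 3, 3)]
```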

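This article is adapted from "SQL删除重复数据,只保留一行" (Deleting Duplicate Rows in SQL While Keeping One).

It is intended only as a personal study summary and follows the CC 4.0 BY-SA license. If it infringes your rights, please contact me and it will be removed.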