[adsense id=”0514458240″ width=”468″ height=”60″]
MySQL offers a nice feature which both gets rid of duplicates and adds a unique/primary key. The syntax for the MySQL command is ALTER IGNORE TABLE table_name which allows you to either add a unique key or a primary key.
Here is an example:
mysql> create table emp(emp_id int, name varchar(100), salary decimal(5,0)); Query OK, 0 rows affected (0.01 sec) mysql> insert into emp(emp_id, name, salary) values (1,'Ron',1000),(2,'Tom',1100),(3,'Dick',1000),(3,'Harry',1250); Query OK, 4 rows affected (0.01 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> select * from emp; +--------+-------+--------+ | emp_id | name | salary | +--------+-------+--------+ | 1 | Ron | 1000 | | 2 | Tom | 1100 | | 3 | Dick | 1000 | | 3 | Harry | 1250 | +--------+-------+--------+ 4 rows in set (0.00 sec) mysql> alter table emp add constraint primary key (emp_id); ERROR 1062 (23000): Duplicate entry '3' for key 'PRIMARY' mysql> alter ignore table emp add constraint primary key (emp_id); Query OK, 4 rows affected, 1 warning (0.03 sec) Records: 4 Duplicates: 1 Warnings: 1 mysql> select * from emp; +--------+------+--------+ | emp_id | name | salary | +--------+------+--------+ | 1 | Ron | 1000 | | 2 | Tom | 1100 | | 3 | Dick | 1000 | +--------+------+--------+ 3 rows in set (0.00 sec)
To identify the entries that have a duplicate we can issue the following SQL.
mysql> drop table emp; Query OK, 0 rows affected (0.01 sec) mysql> create table emp(emp_id int, name varchar(100), salary decimal(5,0)); Query OK, 0 rows affected (0.01 sec) mysql> insert into emp(emp_id, name, salary) values (1,'Ron',1000),(2,'Tom',1100),(3,'Dick',1000),(3,'Harry',1250); Query OK, 4 rows affected (0.01 sec) Records: 4 Duplicates: 0 Warnings: 0 mysql> select * from emp; +--------+-------+--------+ | emp_id | name | salary | +--------+-------+--------+ | 1 | Ron | 1000 | | 2 | Tom | 1100 | | 3 | Dick | 1000 | | 3 | Harry | 1250 | +--------+-------+--------+ 4 rows in set (0.00 sec) mysql> select emp_id -> from emp -> group by emp_id -> having count(*) > 1; +--------+ | emp_id | +--------+ | 3 | +--------+ 1 row in set (0.00 sec)