How To Remove Duplicates From a Table with Postgres

Written by Sean Behan on Tue Oct 08th 2013

Let's create a table that will hold some dummy data.

> create table fruits (id serial primary key, name varchar, color varchar);

> insert into fruits (name, color) values ('apple', 'red');
> insert into fruits (name, color) values ('apple', 'green');
> insert into fruits (name, color) values ('banana', 'yellow');
> insert into fruits (name, color) values ('banana', 'yellow');

We can inspect the table and see that we have 2 entries for banana, when we only want 1.

> select * from fruits;
 id |  name  | color  
----+--------+--------
  1 | apple  | red
  2 | apple  | green
  3 | banana | yellow
  4 | banana | yellow
(4 rows)

To delete the first entry we use the USING clause, which is Postgres specific.

> delete from fruits using fruits f
where fruits.name = f.name
and fruits.color = f.color
and fruits.id < f.id;

All duplicates have been removed!

> select * from fruits;
 id |  name  | color  
----+--------+--------
  1 | apple  | red
  2 | apple  | green
  4 | banana | yellow
(3 rows)

Tagged with..
#Postgres #databases #data

Just finishing up brewing up some fresh ground comments...

How To Remove Duplicates From a Table with Postgres

SB