Wednesday, May 6, 2009

How to Remove Duplicates in a string in SQL

I was working on a Functionality and came i cross duplicates in a Database table Field, so i thought that i will just put this string in a arralist and remove the Duplicates from there , but i thought how Arraylist are slow i thought mybe it will be good to do it here in sql, while googling i came across a very interesting post by Pinal Dave and the post originally written by Ashish Jain . you can Find it here http://blog.sqlauthority.com/2009/01/15/sql-server-remove-duplicate-entry-from-comma-delimited-string-udf/

Here is How it was done,

Create a UDF(User Derfined Function)


alter FUNCTION dbo.DistinctList
(
@List VARCHAR(MAX),
@Delim CHAR
)
RETURNS
VARCHAR(MAX)
AS
BEGIN
DECLARE @ParsedList TABLE
(
Item VARCHAR(MAX)
)
DECLARE @list1 VARCHAR(MAX), @Pos INT, @rList VARCHAR(MAX)
SET @list = LTRIM(RTRIM(@list)) + @Delim
SET @pos = CHARINDEX(@delim, @list, 1)
WHILE @pos > 0
BEGIN
SET @list1 = LTRIM(RTRIM(LEFT(@list, @pos - 1)))
IF @list1 <> ''
INSERT INTO @ParsedList VALUES (CAST(@list1 AS VARCHAR(MAX)))
SET @list = SUBSTRING(@list, @pos+1, LEN(@list))
SET @pos = CHARINDEX(@delim, @list, 1)
END
SELECT @rlist = COALESCE(@rlist+',','') + item
FROM (SELECT DISTINCT Item FROM @ParsedList) t
RETURN @rlist
END

And Run your Fuction like this


SELECT dbo.DistinctList(MY_FIELD,' ') DistinctList FROM MYTABLE

This Function takes two parameters the first one is the duplicated string, in this case i have used my field in mytable. e.g if you have

"12345612345" and run this on your field ,you will end up with

"123456"

Thanks

Vuyiswa Maseko