Saturday, July 05, 2008 3:31 PM,
In this post I'll show just how easy it is to make your SQL Server 2005 database support Regular Expressions through what is known as SQL CLR
Just Fire up Visual Studio and create a new library project (I called it TextFunctions)
Add a new Class and Call it Regular Expressions and simply paste in the following Code:
using Microsoft.SqlServer.Server;
using System.Text.RegularExpressions;
using System.Data.SqlTypes;
public class RegularExpressions
{
[Microsoft.SqlServer.Server.SqlFunction(IsDeterministic=true, IsPrecise=true)]
public static SqlBoolean RegExMatch(SqlString input, SqlString pattern)
{
if (input.IsNull || pattern.IsNull) //nulls dont qualify for a match
return SqlBoolean.False;
//Use the static IsMatch method. This is more performant than creating a
// new instance of Regex as the static method also caches the last expressions we used.
return Regex.IsMatch(input.Value, pattern.Value, RegexOptions.IgnoreCase);
}
[Microsoft.SqlServer.Server.SqlFunction(IsDeterministic = true, IsPrecise = true)]
public static SqlString RegExReplace(SqlString input, SqlString pattern, SqlString replacement)
{
if (input.IsNull || pattern.IsNull || replacement.IsNull)
return SqlString.Null;
return new SqlString(Regex.Replace(input.Value, pattern.Value, replacement.Value, RegexOptions.IgnoreCase));
}
}
In this example we have two functions: one to Match a Regular expression pattern and one to perform a Replace based on a regular expression pattern.
To enable our SQL Server database to make use of the above functions we need to follow the steps below:
- Compile the library for Release
- CLR Enable Sql Server with:
sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE
Reference the assembly and register the functions:
CREATE ASSEMBLY TextFunctions FROM 'c:\Development\Testing\CLR\TextFunctions\bin\Release\TextFunctions.dll'
GO
CREATE Function RegExMatch(@Input NVARCHAR(512),@Pattern NVARCHAR(127))
RETURNS BIT
EXTERNAL NAME TextFunctions.RegularExpressions.RegExMatch
GO
CREATE Function RegExReplace(@Input NVARCHAR(512),@Pattern NVARCHAR(127), @Replacement NVARCHAR(512))
RETURNS NVARCHAR(512)
EXTERNAL NAME TextFunctions.RegularExpressions.RegExReplace
GO
Now we're ready to put our new functions to the test:
select dbo.RegExMatch('me@mymail.com','^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$')
--Verifies an email address
SELECT dbo.RegExMatch('12354', '\d')
--Verifies number Returns 1
SELECT dbo.RegExReplace('Poem Title (Author)', '.*?\((.*?)\).*', '$1')
-- Returns the match within the parenthesis, returns: Author (Useful for splitting one column in two)
SELECT EmailAddress, dbo.RegExMatch(EmailAddress,'^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$')
FROM Users
ORDER BY 2
--Checks your users table for Invalid Email addresses,
--at the top of the results all users with invalid email addresses
Regular expressions can come in really handy especially in data cleansing/transformation operations in your database.
Here are some further pointers to boost creativity with regular expressions:
Email Validation with Regular Expressions
Regular Expressions Cheat Sheet
RegExLib - Library of Regular Expressions for .Net