T-SQL RegEx Commands in SQL Server

This article explores T-SQL RegEx commands in SQL Server for searching data under various conditions.

Introduction

We store data in multiple formats or data types in SQL Server tables. Suppose you have a column that contains alphanumeric string data. We use the LIKE logical operator to search for specific characters in the string and retrieve the matching rows. For example, in the Employee table, we might want to return only the employees whose names start with the character A.

We can define specific patterns in the LIKE operator and filter results based on specific conditions. Strictly speaking, LIKE supports wildcard patterns (%, _, [ ] and [^ ]) rather than full regular expressions, but these patterns are often referred to as T-SQL RegEx functions. In this article, we will use the term T-SQL RegEx functions for them.
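As a minimal sketch of the Employee scenario described above, the following script returns the names that start with A. The table variable @Employee and its rows are made up for illustration and are not part of AdventureWorks. The explicit collation on the column is an assumption added only to pin case-insensitive matching regardless of your database default.

```sql
-- Hypothetical Employee data in a table variable so the script is self-contained.
DECLARE @Employee TABLE (Name varchar(50) COLLATE Latin1_General_CI_AS);

INSERT INTO @Employee (Name)
VALUES ('Alice'), ('Bob'), ('Andrew'), ('Carla');

-- LIKE wildcards: %    any string of zero or more characters
--                 _    any single character
--                 [ ]  any single character in the set or range
--                 [^ ] any single character NOT in the set or range
SELECT Name
FROM @Employee
WHERE Name LIKE 'A%'   -- the letter A followed by anything
ORDER BY Name;
-- Returns Alice and Andrew.
```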

We can have multiple types of regular expressions:

  • Alphabetic RegEx
  • Numeric RegEx
  • Case Sensitivity RegEx
  • Special Characters RegEx
  • RegEx to Exclude Characters

Prerequisite

In this article, we will use the AdventureWorks sample database. Execute the following query to get all product descriptions:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription];

product descriptions table

Let's explore T-SQL RegEx in the following examples.

Example 1: Filter results for descriptions starting with character A or L

Suppose we want to get product descriptions starting with the character A or L. We can use the format [XY]% in the LIKE operator.

Execute the following query and observe that the output contains only rows whose first character is A or L:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[AL]%'

Filter results for description starting with character A or L
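To make the [XY]% format easier to try without AdventureWorks, here is a small sketch on made-up rows. The table variable @Product, its sample descriptions, and the explicit case-insensitive collation are assumptions for illustration only. It also shows that a character set in the first position behaves like two prefix searches joined with OR.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('All-occasion value bike'),
       ('Lightweight butted aluminum frame'),
       ('Chromoly steel'),
       ('Aluminum alloy cups');

-- Set form: the first character is A or L.
SELECT Description
FROM @Product
WHERE Description LIKE '[AL]%';

-- The same filter written as two plain prefixes.
SELECT Description
FROM @Product
WHERE Description LIKE 'A%' OR Description LIKE 'L%';
-- Both queries return the three rows that start with A or L.
```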

Example 2: Filter results for descriptions with first character A and second character L

In the previous example, we filtered results for the starting character A or L. Suppose we want descriptions that start with the characters AL. We can use the T-SQL RegEx [X][Y]% in the LIKE operator.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A][L]%'

In the output, we get only records with first character A and second character L.

Filter results for description with first character A and second character L

We can specify more characters as well to filter records. The following query returns descriptions that start with the three characters ALL:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A][L][L]%'

filter results using T-SQL RegEX
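A bracket that holds a single character matches exactly that character, so [A][L]% should behave the same as the plain prefix AL%. The sketch below compares the forms row by row; the @Product table variable, its rows, and the explicit collation are made up for illustration.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Aluminum alloy cups'),
       ('All-occasion value bike'),
       ('Alloy rims'),
       ('Affordable light gear');

-- 1 means the pattern matches the row, 0 means it does not.
SELECT Description,
       CASE WHEN Description LIKE '[A][L]%'    THEN 1 ELSE 0 END AS BracketForm,
       CASE WHEN Description LIKE 'AL%'        THEN 1 ELSE 0 END AS PlainForm,
       CASE WHEN Description LIKE '[A][L][L]%' THEN 1 ELSE 0 END AS ThreeChars
FROM @Product;
-- BracketForm and PlainForm agree on every row.
-- ThreeChars is 1 only for 'All-occasion value bike' and 'Alloy rims'.
```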

Example 3: Filter results for descriptions with a starting character between A and D

In the previous example, we specified a particular starting character to filter the results. We can specify a character range using the [X-Z]% format.

The following query returns descriptions whose starting character is between A and D:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A-D]%'

Filter results for description and starting character between A and D

Similarly, we can specify a condition for each character position. For example, the query below searches for the following:

  • The first character should be between A and D
  • The second character should be between F and I

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A-D][F-I]%'

In the output, you can see that every row in the result set satisfies both conditions.

specify multiple conditions
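The range format can be checked on a handful of rows. This is only a sketch: the @Product table variable, the sample descriptions, and the explicit case-insensitive collation are assumptions, chosen so the expected rows are easy to verify by eye.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Affordable light gear'),
       ('Alloy rims'),
       ('Chromoly steel'),
       ('Designed for racers'),
       ('Entry level frame'),
       ('Big wheels');

-- First character between A and D: every row except 'Entry level frame'.
SELECT Description
FROM @Product
WHERE Description LIKE '[A-D]%';

-- First character between A and D, second character between F and I:
-- 'Affordable light gear' (f), 'Chromoly steel' (h) and 'Big wheels' (i).
SELECT Description
FROM @Product
WHERE Description LIKE '[A-D][F-I]%';
```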

Example 4: Filter results for descriptions with an ending character between G and S

In the previous examples, we filtered the data on the starting characters. We might want to filter on the ending character as well.

In the previous examples, note the position of the percentage (%) wildcard. We specified the percentage character at the end of the search pattern.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A-D][F-I]%'

In the following query, we moved the percentage character to the beginning of the search pattern. It looks for rows that satisfy the following condition:

  • The ending character should be between G and S

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '%[G-S]'

In the output, we get the rows whose ending character satisfies our search condition.

Filter results for description and ending character between G and S
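One detail worth checking with an ending-character pattern is that the last character is tested literally, so a description that ends with a period does not match even when its last letter is in the range. A minimal sketch on made-up rows (the @Product table variable, its data, and the collation are assumptions):

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Alloy rims'),          -- ends with s
       ('Entry level frame'),   -- ends with e
       ('Stout design'),        -- ends with n
       ('Sturdy alloy.');       -- ends with a period

-- Last character between G and S.
SELECT Description
FROM @Product
WHERE Description LIKE '%[G-S]';
-- Returns 'Alloy rims' and 'Stout design'.
```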

Example 5: Filter results for descriptions with starting letters AF and ending character S

Let's make it a bit more complex. We want to search using the following conditions:

  • The starting characters should be A (first) and F (second)
  • The ending character should be S

Execute the following query; in the output, we can see that every row satisfies our requirement:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[A][F]%[S]'

Filter results for description starting letters AF and ending character S
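The sketch below runs the same start-and-end pattern on a few made-up rows, including a three-character value, to show that % may also match zero characters. The @Product table variable, its rows (the value 'Afs' is artificial), and the collation are assumptions for illustration.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Affordable light gears'),  -- starts with Af, ends with s
       ('Affordable light gear'),   -- starts with Af, ends with r
       ('Alloy rims'),              -- ends with s, but second character is l
       ('Afs');                     -- A, F, nothing in between, S

SELECT Description
FROM @Product
WHERE Description LIKE '[A][F]%[S]';
-- Returns 'Affordable light gears' and 'Afs'.
```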

Example 6: Filter results for descriptions with starting letters excluding A to T

In the following example, we do not want output rows whose first character is between A and T. We can exclude characters using the [^X-Y] format in the LIKE operator.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[^A-T]%'

In the output, no row has a first character between A and T.

Filter results for description starting letters excluding A to T
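Note that an exclusion set admits every character outside the range, not only the remaining letters, so rows that start with a digit or punctuation also pass. The following sketch illustrates this on made-up rows; the @Product table variable, its data, and the collation are assumptions.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Unisex long-sleeve jersey'),
       ('Versatile shorts'),
       ('Wide-link design'),
       ('Top-of-the-line'),
       ('7-speed cassette'),
       ('Alloy rims');

-- Anything whose first character is NOT between A and T:
-- the U, V and W rows, plus the row that starts with the digit 7.
SELECT Description
FROM @Product
WHERE Description LIKE '[^A-T]%';

-- If only the letters U to Z are wanted, a positive range is stricter.
SELECT Description
FROM @Product
WHERE Description LIKE '[U-Z]%';
```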

Example 7: Filter results for descriptions with a specific pattern

In the example below, we want to filter records using the following conditions:

  • The first character should be R or S – [R-S]
  • Any combination of characters can follow the first character – %
  • We require the P character – [P]
  • It should be immediately followed by the I character – [I]
  • Any other characters can follow the previous condition – %

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[R-S]%[P][I]%'

Filter results for description with a specific pattern
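It is easy to confuse [P][I] (a P followed by an I) with [PI] (a single character that is either P or I). The sketch below contrasts the two on made-up rows; the @Product table variable, its data, and the collation are assumptions for illustration.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Replacement pipe'),       -- contains "pi"
       ('Sport shoe, clipless'),   -- contains p and i, but never "pi"
       ('Rear spindle'),           -- contains "pi"
       ('Top pick');               -- contains "pi" but starts with T

-- Two brackets: P immediately followed by I, somewhere after the first character.
SELECT Description
FROM @Product
WHERE Description LIKE '[R-S]%[P][I]%';
-- Returns 'Replacement pipe' and 'Rear spindle'.

-- One bracket: a single P or I anywhere after the first character.
SELECT Description
FROM @Product
WHERE Description LIKE '[R-S]%[PI]%';
-- Additionally returns 'Sport shoe, clipless'.
```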

Example 8: Case sensitive search using T-SQL RegEx functions

By default, with a case-insensitive collation such as the one this database uses, we do not get case sensitive results. For example, the following queries return the same result set:

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[r-s]%[P][i]%'

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like '[R-S]%[P][I]%'

Case sensitive search using T-SQL RegEx functions

We can perform a case sensitive search in the following two ways:

  1. Database collation setting: Each database in SQL Server has a collation. Right-click on the database and, in the properties page, you can see the collation.

    database collation

    Here, SQL_Latin1_General_CP1_CI_AS gives the database case insensitive behaviour. We can change this to a case sensitive collation, but it is not a simple solution and might create issues for your existing queries. It is not recommended unless you explicitly require a case sensitive collation.

    Instead, we can use a column collation with T-SQL RegEx functions to perform a case sensitive search.

    Create table Characters
      (Alphabet char(1)
      )
      Go
      Insert into Characters values ('A')
      Insert into Characters values ('a')
      Go
    

    In the table, we have the letter A in uppercase and lowercase. If we run the following select statement, it returns both the uppercase and the lowercase letter:

    SELECT * from Characters 
      where Alphabet like '[A]%'
    

    sample data

    Suppose we want only the uppercase letter in the result. We can apply a collation to the column as in the following query:

    select * from Characters 
      where Alphabet COLLATE Latin1_General_BIN  like '[A]%'
    

    It returns only the uppercase letter A in the output.

    case sensitive search

    Similarly, the following query returns only the lowercase letter in the output:

    select * from Characters 
      where Alphabet COLLATE Latin1_General_BIN  like '[a]%'
    

    case sensitive search

  2. We can combine the collation with a T-SQL RegEx function to match a specific mix of uppercase and lowercase characters.

    We want the following output:

    • The first character should be uppercase character C
    • The second character should be lowercase character h
    • The rest of the characters can be in any letter case

    SELECT [Description]
      FROM [AdventureWorks].[Production].[ProductDescription]
      where [Description] COLLATE Latin1_General_BIN  like '[C][h]%'
    

    case sensitive search for both upper and lowercase
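The effect of the collation can be seen side by side on three spellings of the same value. This is a sketch under stated assumptions: the @Product table variable and its rows are made up, and the column is declared with an explicit case-insensitive collation to stand in for the database default. It also shows that, under the binary collation used in this article, character ranges are compared by code point, so [A-Z] and [a-z] separate uppercase from lowercase.

```sql
-- Hypothetical sample rows in a case-insensitive column.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('Chromoly steel'), ('chromoly steel'), ('CHROMOLY STEEL');

-- Case-insensitive comparison: all three rows match.
SELECT Description
FROM @Product
WHERE Description LIKE '[C][h]%';

-- Binary collation applied in the predicate: only 'Chromoly steel' matches.
SELECT Description
FROM @Product
WHERE Description COLLATE Latin1_General_BIN LIKE '[C][h]%';

-- Ranges under the binary collation: one uppercase letter, then one lowercase letter.
SELECT Description
FROM @Product
WHERE Description COLLATE Latin1_General_BIN LIKE '[A-Z][a-z]%';
-- Again only 'Chromoly steel'.
```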

Example 9: Use T-SQL Regex to Find Text Rows that Contain a Number

We can also find rows whose text contains a number. For example, we want to filter the results to rows that begin with a digit from 0 to 9.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like  '[0-9]%'

Use T-SQL Regex to Find Text Rows that Contain a Number

Similar to the characters, we can also specify digits for different positions. In the following example, we want the first digit to be from 1 to 5 and the second digit to be from 0 to 9.

SELECT [Description]
  FROM [AdventureWorks].[Production].[ProductDescription]
  where [Description] like  '[1-5][0-9]%'

search for a number pattern
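The queries above match a digit only at the start of the text. To find a digit anywhere in the text, the digit range can be wrapped in % on both sides. The sketch below shows the three variants on made-up rows; the @Product table variable, its data, and the collation are assumptions for illustration.

```sql
-- Hypothetical sample rows; the collation pins case-insensitive matching.
DECLARE @Product TABLE (Description varchar(100) COLLATE Latin1_General_CI_AS);

INSERT INTO @Product (Description)
VALUES ('10-speed rear derailleur'),
       ('7-speed cassette'),
       ('Holds 2 bottles'),
       ('Chromoly steel'),
       ('65 cm frame');

-- Starts with a digit: the 10-speed, 7-speed and 65 cm rows.
SELECT Description FROM @Product WHERE Description LIKE '[0-9]%';

-- First digit 1 to 5, second digit 0 to 9: only '10-speed rear derailleur'.
SELECT Description FROM @Product WHERE Description LIKE '[1-5][0-9]%';

-- A digit anywhere in the text: every row except 'Chromoly steel'.
SELECT Description FROM @Product WHERE Description LIKE '%[0-9]%';
```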

Example 10: Use T-SQL Regex to Find valid email IDs

Let's explore a practical scenario for the RegEx function. We have a customer table that holds customer email addresses, and we want to identify the valid email addresses in the user data. Sometimes, users make a typo and enter @@ instead of the @ character.

First, create the sample table and insert some email addresses into it in different formats.

CREATE TABLE TSQLREGEX(
     Email VARCHAR(1000)
  )
 
  Insert into TSQLREGEX values('raj@gmail.com')
  Insert into TSQLREGEX values('HSDFX@gmail.com')
  Insert into TSQLREGEX values('JHKHKO.PVS@gmail.com')
  Insert into TSQLREGEX values('ABC@@gmail.com')
  Insert into TSQLREGEX values('ABC.DFG.LKF#@gmail.com')

Execute the following select statement with the T-SQL RegEx function; it filters out the invalid email addresses. Note that the pattern is a basic format check rather than full email validation.

Select * from TSQLREGEX where email
  LIKE '%[A-Z0-9][@][A-Z0-9]%[.][A-Z0-9]%'

The following invalid email addresses do not appear in the list:

  • ABC@@gmail.com
  • ABC.DFG.LKF#@gmail.com

Use T-SQL Regex to Find valid email ID's
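The pattern above still accepts some malformed values, for example an address with two separate @ characters or one that contains a space. One possible way to tighten the check, sketched here under stated assumptions, is to add NOT LIKE conditions. The @Customer table variable, the last two sample addresses, and the explicit collation are hypothetical additions; the first five addresses are the ones inserted earlier.

```sql
-- Hypothetical stand-in for the TSQLREGEX table; the collation pins case-insensitive matching.
DECLARE @Customer TABLE (Email varchar(1000) COLLATE Latin1_General_CI_AS);

INSERT INTO @Customer (Email)
VALUES ('raj@gmail.com'),
       ('HSDFX@gmail.com'),
       ('JHKHKO.PVS@gmail.com'),
       ('ABC@@gmail.com'),
       ('ABC.DFG.LKF#@gmail.com'),
       ('raj@home@gmail.com'),    -- hypothetical: two separate @ characters
       ('raj smith@gmail.com');   -- hypothetical: embedded space

SELECT Email
FROM @Customer
WHERE Email LIKE '%[A-Z0-9][@][A-Z0-9]%[.][A-Z0-9]%'  -- pattern from the article
  AND Email NOT LIKE '%@%@%'                          -- reject more than one @
  AND Email NOT LIKE '% %';                           -- reject spaces
-- Returns raj@gmail.com, HSDFX@gmail.com and JHKHKO.PVS@gmail.com.
```

Even with the extra conditions, this remains a rough filter; it is a sketch of how LIKE conditions can be combined, not a complete email validator.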

Conclusion

In this article, we explored T-SQL RegEx functions for performing searches using various conditions. You should be familiar with them so that you can search based on specific requirements.

Translated from: https://www.sqlshack.com/t-sql-regex-commands-in-sql-server/
