2017-06-04 86 views
2

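SQL Server: check for upper or lower case after a specific character

I have data like 'John is my name; Ram is my name; Adam is my name'.

My rule is that the first letter after every ; should be an uppercase letter.

How can I select all the values that satisfy this rule?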

+0

Which version of SQL Server? – Shnugo

+0

@Shnugo Microsoft SQL Server 2012 - 11.0.5058.0 (X64) –

+1

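This is going to be an ugly problem, especially if the number of semicolon-separated terms is unknown. A better solution would be to normalize the data and put each name/sentence in a separate record. –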

Answers

1

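The other answers show how to transform the rows into something that matches your pattern.

If you just want to select the rows that match the pattern you described, you can use patindex() or like with a case-sensitive collation (or use collate to apply one).

This assumes that, in addition to the rule that the letter after each semicolon must be uppercase, the first letter of the value should also be uppercase. If that is not the case, just remove the first clause in the where.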

select *
from t
where patindex('[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', val collate latin1_general_cs_as) = 1
    and patindex('%; [^ABCDEFGHIJKLMNOPQRSTUVWXYZ]%', val collate latin1_general_cs_as) = 0

Test setup:

create table t (id int not null identity(1,1),val varchar(256)) 
insert into t values 
('John is my name; Ram is my name; Adam is my name') 
,('john is my name; ram is my name; adam is my name') 

rextester demo: http://rextester.com/DBGIS10645

The query above returns:

+----+--------------------------------------------------+
| id | val                                              |
+----+--------------------------------------------------+
| 1  | John is my name; Ram is my name; Adam is my name |
+----+--------------------------------------------------+
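
The answer mentions like as an alternative to patindex(). A minimal sketch of that variant, assuming the same table t and column val from the test setup and the same case-sensitive collation, might look like this:

```sql
-- Sketch only: the same filter written with LIKE / NOT LIKE.
-- The uppercase letters are listed explicitly because a range such as [A-Z]
-- can also match lowercase letters under a non-binary collation.
select *
from t
where val collate latin1_general_cs_as like '[ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'
    and val collate latin1_general_cs_as not like '%; [^ABCDEFGHIJKLMNOPQRSTUVWXYZ]%'
```

Like the patindex() version, this only looks at a character that directly follows a semicolon and exactly one space.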
2

You might go with an XML trick like this:

DECLARE @YourString VARCHAR(100)='John is my name; Ram is my name; Adam is my name'; 
WITH Splitted AS 
(
    SELECT CAST('<x>' + REPLACE((SELECT REPLACE(@YourString,'; ','$$SplitHere$$') AS [*] FOR XML PATH('')),'$$SplitHere$$','</x><x>')+ '</x>' AS XML) AS Casted 
) 
,DerivedTable AS 
(
    SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS PartNr 
      ,x.value(N'text()[1]',N'nvarchar(max)') AS Part 
    FROM Splitted 
    CROSS APPLY Casted.nodes(N'/x') AS X(x) 
) 
SELECT PartNr 
     ,Part 
     ,CASE WHEN ASCII(LEFT(Part,1)) BETWEEN ASCII('A') AND ASCII('Z') THEN 1 ELSE 0 END AS FirstIsCapital 
FROM DerivedTable; 

PartNr Part            FirstIsCapital
-------------------------------------
1      John is my name 1
2      Ram is my name  1
3      Adam is my name 1

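I don't know what your final goal is... Do you want to find the parts where the first letter is not a capital, using the result of this split? Or to enforce your rule?

But:
The best approach is to use this to correct your design and move these parts into a 1:n related side table.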
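
As a rough illustration of that design advice, here is a minimal sketch of a 1:n side table. The table, column, and constraint names (Entry, EntryPart, CK_EntryPart_FirstIsCapital) are hypothetical and not taken from the question:

```sql
-- Hypothetical schema: one row per part instead of one delimited string.
create table Entry (
    EntryId int not null identity(1,1) primary key
);

create table EntryPart (
    EntryId int not null references Entry(EntryId),
    PartNr  int not null,
    Part    varchar(256) not null,
    primary key (EntryId, PartNr),
    -- With one part per row, the rule becomes a per-row constraint.
    -- Under a binary collation, [A-Z] matches only the uppercase letters A to Z.
    constraint CK_EntryPart_FirstIsCapital
        check (left(Part, 1) collate Latin1_General_BIN like '[A-Z]')
);
```

With a layout like this the rule is checked when a row is written, so no string splitting is needed at query time.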

+0

Test it with the following source string: 'John is my name; Ram is my name; Adam is my name' –

+0

@BogdanSahlean, then I would use 'L/RTRIM()'... The real flaw is the storage format... Any code that works around it will be a hack... – Shnugo

+0

'TRIM' was introduced in SQL Server 2017. What if the source string is '!John is my name;!Ram is my name;!Adam is my name;'Hola!'? –

1

A bit of an ugly solution, but you can give it a try...

Declare @str nvarchar(max) = 'John is my name; Ram is my name; Adam is my name' 

Declare @xml as xml 
Set @xml = cast(('<X>'+replace(@str,';' ,'</X><X>')+'</X>') as xml) 
-- Split on ';' via XML; on SQL Server 2016 and later you can use STRING_SPLIT instead
Select * from (
    Select RowN = Row_Number() over (order by (SELECT NULL)),
           LTrim(RTrim(N.value('.', 'nvarchar(MAX)'))) as value
    FROM @xml.nodes('X') as T(N)
) a
Where unicode(substring(a.[value],1,1)) = unicode(upper(substring(a.[value],1,1)))

The idea is to split the string and compare Unicode values to see whether the first character of each part is uppercase or not.
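
The code comment above mentions STRING_SPLIT as an alternative. A minimal sketch of that variant follows, assuming SQL Server 2016 or later with database compatibility level 130 or higher (so it would not run on the asker's SQL Server 2012). The sample string, with a lowercase third part, is an assumption for illustration:

```sql
-- Sketch only: split with STRING_SPLIT and keep parts that start with A-Z.
declare @str nvarchar(max) = N'John is my name; Ram is my name; adam is my name';

select ltrim(rtrim(s.value)) as part
from string_split(@str, N';') as s
-- A binary collation makes [A-Z] match uppercase letters only.
where left(ltrim(s.value), 1) collate Latin1_General_BIN like N'[A-Z]';
```

For this sample string only the parts starting with John and Ram should pass the filter. Note that STRING_SPLIT does not guarantee the order of the returned rows.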

+0

This solution has a big flaw: if your @str contains forbidden characters, it will break... – Shnugo

+0

An example? I didn't get your point –

+0

Just try it with 'John is my name; Jim & Tom are my friends' – Shnugo

1

Note: the standard practice is to do this in the application, using C#/VB [.NET].

[1] Solution:

DECLARE @Source NVARCHAR(100) = N'john is my name; Ram is my name; adam is my name' 

SELECT z.Sentence 
FROM (VALUES (CONVERT(XML, N'<root><i>' + REPLACE(@Source, N';', N'</i><i>;') + N'</i></root>'))) AS x(XmlCol) 
CROSS APPLY x.XmlCol.nodes(N'/root/i') AS y(XmlCol) 
CROSS APPLY (VALUES(y.XmlCol.value('(text())[1]', 'NVARCHAR(100)'))) AS z(Sentence) 
WHERE SUBSTRING(z.Sentence, NULLIF(PATINDEX('%[a-z]%', z.Sentence), 0), 1) LIKE '%[a-z]%' COLLATE Latin1_General_BIN 
ORDER BY ROW_NUMBER() OVER(ORDER BY y.XmlCol) 

In this case, the result will be:

john is my name 
; adam is my name 

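[2] If you are trying to capitalize the first letter of every sentence, then I would use the following solution (see the comments at the end of each line):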

DECLARE @Source NVARCHAR(100) = N'john is my name; ram is my name; adam is my name' 

SELECT (
    SELECT u.NewSentence AS '*' 
    FROM (VALUES (CONVERT(XML, N'<root><i>' + REPLACE(@Source, N';', N';</i><i>') + N'</i></root>'))) AS x(XmlCol) -- Converts the source string into XML. Every ; acts as a sentence delimiter. The result looks like <root><i>john...;</i><i> ram ....</i>...</root>
    CROSS APPLY x.XmlCol.nodes(N'/root/i') AS y(XmlCol) -- Decomposes the XML into one node per sentence
    CROSS APPLY (VALUES(y.XmlCol.value('(text())[1]', 'NVARCHAR(100)'))) AS z(Sentence) -- Extracts each sentence as NVARCHAR(100)
    CROSS APPLY (VALUES(PATINDEX('%[a-z]%', z.Sentence))) AS t(FirstLetterIndex) -- Finds the index of the first letter
    CROSS APPLY (VALUES(IIF(t.FirstLetterIndex > 0, STUFF(z.Sentence, t.FirstLetterIndex, 1, UPPER(SUBSTRING(z.Sentence, t.FirstLetterIndex, 1))), z.Sentence))) AS u(NewSentence) -- Replaces the first letter with its capitalized version, UPPER(...)
    ORDER BY ROW_NUMBER() OVER(ORDER BY y.XmlCol) -- Keeps the sentences in their original order within the source string
    FOR XML PATH('') -- Concatenates all sentences back into one string
) 

For example, if the source string is N'john is my name; ram is my name; adam is my name' then the result will be N'John is my name; Ram is my name; Adam is my name'.

Demo

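Note: this solution (like every other solution based on XML shredding) works only if the source string does not contain XML reserved characters (such as <). Let me know if that is your case.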
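
One way around the reserved-character limitation, borrowed from the XML split answer earlier in this thread, is to swap the delimiter for a placeholder, let FOR XML PATH('') escape the text, and only then insert the tags. The sketch below assumes the placeholder $$SplitHere$$ never occurs in the data. The sample string and variable names are illustrative:

```sql
-- Sketch only: XML split that tolerates characters such as & and <.
declare @str nvarchar(max) = N'John is my name; Jim & Tom are my friends';
declare @marker nvarchar(20) = N'$$SplitHere$$';

-- Replace the delimiter first: escaped entities such as &amp; contain a ';'
-- themselves, so splitting on ';' after escaping would break them.
declare @escaped nvarchar(max) =
    (select replace(@str, N';', @marker) as [*] for xml path(''));

declare @xml xml =
    cast(N'<X>' + replace(@escaped, @marker, N'</X><X>') + N'</X>' as xml);

select ltrim(rtrim(N.value('text()[1]', 'nvarchar(max)'))) as part
from @xml.nodes('/X') as T(N);
```

Reading the node text back with value() unescapes the entities, so the & in the sample string should come back unchanged.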

+0

Just a note: my XML shredding has no problem with reserved characters... Searching for the first letter with 'PATINDEX('%[a-z]%')' seems overly complicated... – Shnugo

+0

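I believe FOR XML ... adds some overhead. I mentioned that the OP should let me know if the source string contains such characters. Nothing says that the first character of each sentence has to be a letter. It could be a space, or there could be 100 spaces before the first letter. –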

1

You can create a function like this.

CREATE FUNCTION SPLITTER (
    @textData NVARCHAR(MAX),
    @Delimeter NVARCHAR(MAX))
RETURNS @RtnValue TABLE (Data NVARCHAR(MAX))
AS
BEGIN
    DECLARE @index INT
    DECLARE @data NVARCHAR(MAX)
    DECLARE @firstCharacter NCHAR(1)

    SET @index = CHARINDEX(@Delimeter, @textData)

    WHILE (@index > 0)
    BEGIN
        -- Take the part before the delimiter and trim surrounding spaces
        SET @data = LTRIM(RTRIM(SUBSTRING(@textData, 1, @index - 1)))
        SET @firstCharacter = SUBSTRING(@data, 1, 1)
        -- Keep the part if its first character equals its own uppercase form
        IF UNICODE(@firstCharacter) = UNICODE(UPPER(@firstCharacter))
            INSERT INTO @RtnValue (Data) SELECT @data

        -- Drop the processed part and its delimiter (DATALENGTH/2 = length in characters)
        SET @textData = SUBSTRING(@textData, @index + DATALENGTH(@Delimeter)/2, LEN(@textData))
        SET @index = CHARINDEX(@Delimeter, @textData)
    END

    -- Handle the last part, trimmed like the others
    SET @data = LTRIM(RTRIM(@textData))
    SET @firstCharacter = SUBSTRING(@data, 1, 1)
    IF UNICODE(@firstCharacter) = UNICODE(UPPER(@firstCharacter))
        INSERT INTO @RtnValue (Data) SELECT @data

    RETURN
END

Use it like this:

SELECT * FROM SPLITTER('John is my name; Ram is my name; Adam is my name', ';')

1

You could grab a copy of NGrams8K and do this:

-- note that I made the 3rd item start with lower-case 
DECLARE @YourString VARCHAR(100)='John is my name; Ram is my name; adam is my name'; 

WITH D(n) AS 
(
    SELECT 0 UNION ALL SELECT position 
    FROM dbo.NGrams8k(@yourstring,1) WHERE token = ';' 
), 
TOKEN(token) AS 
(
    SELECT LTRIM(SUBSTRING(@YourString, N+1, 
      ISNULL(NULLIF(CHARINDEX(';', @YourString, N+1),0), 101)-(N+1))) 
    FROM D 
) 
SELECT token, 
     FirstLetterIsCaptial = IIF(ASCII(SUBSTRING(token,1,1)) BETWEEN 65 AND 90, 1, 0) 
FROM TOKEN; 

Results:

token              FirstLetterIsCaptial
------------------ --------------------
John is my name    1
Ram is my name     1
adam is my name    0