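IMO, it is the design approach that makes this hard. Just because you allow users to assign tags does not mean the tags must be stored as a single delimited list of words. You can normalize the structure into something like: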
Create Table Posts (Id ... not null primary key)
Create Table Tags(Id ... not null primary key, Name ... not null Unique)
Create Table PostTags
(PostId ... not null References Posts(Id)
, TagId ... not null References Tags(Id))
Now your problem becomes simple:
Select T.Id, T.Name, Count(*) As TagCount
From PostTags As PT
Join Tags As T
On T.Id = PT.TagId
Group By T.Id, T.Name
Order By Count(*) Desc
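To make the result concrete, here is a minimal sketch that fills the normalized tables with a few rows. The column types (`int`, `nvarchar(50)`), the composite primary key on PostTags, and the sample values are assumptions made for illustration; the schema above leaves the types elided.

```sql
-- Hypothetical types and sample data; only the table and column names come from the answer.
Create Table Posts (Id int not null Primary Key);
Create Table Tags (Id int not null Primary Key, Name nvarchar(50) not null Unique);
Create Table PostTags
    ( PostId int not null References Posts(Id)
    , TagId int not null References Tags(Id)
    , Primary Key (PostId, TagId)  -- assumed: stops a post from getting the same tag twice
    );

Insert Into Posts (Id) Values (1), (2), (3);
Insert Into Tags (Id, Name) Values (1, N'sql'), (2, N'tsql');
-- 'sql' is attached to three posts, 'tsql' to one.
Insert Into PostTags (PostId, TagId) Values (1, 1), (2, 1), (3, 1), (1, 2);
```

With this data, the count query returns `sql` with a TagCount of 3, followed by `tsql` with a TagCount of 1.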
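If you insist on storing the tags as delimited values, then the only solution is to split the values on their delimiter by writing a custom Split function and then do your count. At the bottom is an example of a Split function. With it, your query would look like this (using a comma delimiter):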
Select Tag.Value, Count(*) As TagCount
From Posts As P
Cross Apply dbo.Split(P.Tags, ',') As Tag
Group By Tag.Value
Order By Count(*) Desc
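As an alternative to a hand-written function, SQL Server 2016 and later (database compatibility level 130 or higher) ship a built-in `STRING_SPLIT` function. This is a different technique from the custom Split function in this answer; the sketch below assumes the same `Posts.Tags` column and a single-character delimiter, which is all `STRING_SPLIT` accepts.

```sql
-- Sketch using the built-in STRING_SPLIT instead of dbo.Split.
-- STRING_SPLIT exposes its output in a column named "value".
Select Tag.value As Tag, Count(*) As TagCount
From Posts As P
Cross Apply String_Split(P.Tags, ',') As Tag
Group By Tag.value
Order By Count(*) Desc;
```

Note that `STRING_SPLIT` does not trim whitespace and returns an empty string for adjacent delimiters, so a list such as `'sql, tsql'` yields `' tsql'` with a leading space unless you trim it yourself.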
Split function:
Create Function [dbo].[Split]
(
@DelimitedList nvarchar(max)
, @Delimiter nvarchar(2) = ','
)
RETURNS TABLE
AS
RETURN
(
With CorrectedList As
(
Select Case When Left(@DelimitedList, DataLength(@Delimiter)/2) <> @Delimiter Then @Delimiter Else '' End
+ @DelimitedList
+ Case When Right(@DelimitedList, DataLength(@Delimiter)/2) <> @Delimiter Then @Delimiter Else '' End
As List
, DataLength(@Delimiter)/2 As DelimiterLen
)
, Numbers As
(
Select TOP (Coalesce(Len(@DelimitedList),1)) Row_Number() Over (Order By c1.object_id) As Value
From sys.objects As c1
Cross Join sys.columns As c2
)
Select CharIndex(@Delimiter, CL.List, N.Value) + CL.DelimiterLen As Position
, Substring (
CL.List
, CharIndex(@Delimiter, CL.List, N.Value) + CL.DelimiterLen
, Case
When CharIndex(@Delimiter, CL.List, N.Value + 1)
- CharIndex(@Delimiter, CL.List, N.Value)
- CL.DelimiterLen < 0 Then Len(CL.List)
Else CharIndex(@Delimiter, CL.List, N.Value + 1)
- CharIndex(@Delimiter, CL.List, N.Value)
- CL.DelimiterLen
End
) As Value
From CorrectedList As CL
Cross Join Numbers As N
Where N.Value < Len(CL.List)
And Substring(CL.List, N.Value, CL.DelimiterLen) = @Delimiter
)
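A quick way to check the function is to call it on a literal string. The input below is a made-up example, not data from the question:

```sql
-- Hypothetical input; dbo.Split is the function defined above.
Select S.Position, S.Value
From dbo.Split(N'sql,tsql,sql-server', N',') As S;
```

This returns three rows: `sql`, `tsql`, and `sql-server`. Position is the 1-based offset of each value inside the internally padded list (which has a delimiter prepended), so the values here are 2, 6, and 11 rather than offsets into the original string.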
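You need to use string manipulation to convert each sentence into a set of words. If you create a table-valued function that takes a string and outputs a table of words, you can then use 'myData CROSS APPLY myFunction(myTable.sentance)' and then use GROUP BY to count everything. Exactly what rules are needed to break out an individual word, I'll leave to you or others :) – MatBailie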