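The real problem is not the query but the schema, specifically the clustered indexes. The comment ordering requirements are ambiguous as you defined them (is it only 5 per answer?). I interpreted the requirements as "pull 5 comments per post (answer or question), giving preference to upvoted ones, then to newer ones". I know this is not how SO displays comments, but you have to define your requirements more carefully.
Here is my query: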
declare @postId int;
-- ? is a placeholder for the post_id of the question
set @postId = ?;
-- the question itself plus all of its answers
with cteQuestionAndReponses as (
    select post_id
    from Posts
    where post_id = @postId
    union all
    select post_id
    from Posts
    where parent_id = @postId)
select * from
cteQuestionAndReponses p
-- total number of non-deleted comments per post
outer apply (
    select count(*) as CommentsCount
    from Comments c
    where is_deleted = 0
    and c.post_id = p.post_id) as cc
-- top 5 non-deleted comments per post: most upvoted first, then newest
outer apply (
    select top(5) *
    from Comments c
    where is_deleted = 0
    and p.post_id = c.post_id
    order by upvotes desc, date desc
) as c;
I have about 14K posts and 67K comments in my test tables, and the query returns the posts in 7 ms:
Table 'Comments'. Scan count 12, logical reads 50, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Posts'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 7 ms.
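Output in this format is what SQL Server prints when I/O and time statistics are enabled for the session. A minimal sketch for reproducing such measurements, assuming these two session settings are what produced the figures above (the answer does not say so explicitly):

```sql
-- Assumption: run in the same session, before executing the query shown earlier.
set statistics io on;    -- prints per-table scan counts and logical/physical reads
set statistics time on;  -- prints CPU time and elapsed time

-- ... run the query under test here ...

set statistics io off;
set statistics time off;
```

The numbers will vary with the data and with whether the pages are already cached, so a first run may show physical reads that later runs do not.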
Here is the schema I tested with:
create table Posts (
    post_id int identity (1,1) not null
    , content varchar(max) not null
    , parent_id int null -- (null for questions, question_id for answer)
    , constraint fkPostsParent_id
        foreign key (parent_id)
        references Posts(post_id)
    , constraint pkPostsId primary key nonclustered (post_id)
);
create clustered index cdxPosts on
    Posts(parent_id, post_id);
go
create table Comments (
    comment_id int identity(1,1) not null
    , body varchar(max) not null
    , is_deleted bit not null default 0
    , post_id int not null
    , upvotes int not null default 0
    , date datetime not null default getutcdate()
    , constraint pkComments primary key nonclustered (comment_id)
    , constraint fkCommentsPostId
        foreign key (post_id)
        references Posts(post_id)
);
create clustered index cdxComments on
    Comments (is_deleted, post_id, upvotes, date, comment_id);
go
And here is my test data generation:
-- questions: one per row of master..spt_values
insert into Posts (content)
select 'Lorem Ipsum'
from master..spt_values;
-- answers: 0 to 9 per question
insert into Posts (content, parent_id)
select 'Ipsum Lorem', post_id
from Posts p
cross apply (
    -- abs() keeps the TOP argument non-negative; checksum() can return negative values
    select top(abs(checksum(newid(), p.post_id)) % 10) Number
    from master..spt_values) as r
where parent_id is NULL;
-- comments: 0 to 9 per post
insert into Comments (body, is_deleted, post_id, upvotes, date)
select 'Sit Amet'
    -- about 5% deleted comments
    , case when abs(checksum(newid(), p.post_id, r.Number)) % 100 >= 95 then 1 else 0 end
    , p.post_id
    -- 0 to 9 upvotes
    , abs(checksum(newid(), p.post_id, r.Number)) % 10
    -- comments up to 1 year old
    , dateadd(minute, -abs(checksum(newid(), p.post_id, r.Number) % 525600), getutcdate())
from Posts p
cross apply (
    select top(abs(checksum(newid(), p.post_id)) % 10) Number
    from master..spt_values) as r;
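As a quick sanity check after running the generator, the row counts can be compared with the volumes quoted for the test tables (about 14K posts and 67K comments). This is only a sketch: the exact numbers differ from run to run because the generator relies on newid(), and they depend on how many rows master..spt_values holds on a given server. The column aliases are illustrative names, not part of the schema.

```sql
-- Total posts (questions plus answers)
select count(*) as PostsCount from Posts;

-- Questions only: rows without a parent
select count(*) as QuestionsCount from Posts where parent_id is null;

-- Total comments and how many of them are flagged as deleted
select count(*) as CommentsCount
     , sum(case when is_deleted = 1 then 1 else 0 end) as DeletedCount
from Comments;
```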
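@Mark: SO is set up so that questions and answers live in the same table. – 2009-12-16 23:03:10
SO has questions, answers and comments. What is a "post"? Are they questions? Answers? Both? How do I know which posts belong to which question? – 2009-12-16 23:03:56
@OMG Ponies, OK, I didn't know that. – 2009-12-16 23:04:33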