2011-07-12

How to extract data from a Microsoft Content Management Server (MCMS) database

I need to extract a large amount of data (> 1000 pages) from a Microsoft Content Management Server (MCMS) database for use in a Sitecore website.

I can see two main options:

  1. Migrate the data into a new, simplified database and display the information on the new website.

  2. Convert the MCMS solution to SharePoint and use the SharePoint connector module available for Sitecore to display this information.

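I would prefer the first route, because there are currently no plans to use SharePoint for data/content management in the future, and I would like this information stored in a simple SQL Server database so that it is easier to search.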

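I have had a look at the database in question and think the main tables I will be interested in are Node, NodePlaceholder and NodePlaceholderContent, but I am struggling to find what I expect. Can anyone out there explain the schema of this database to me? Or will I run into problems if I try to migrate the data this way?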

Answer


I have just gone through a similar process of exporting content pages from MCMS 2002 (migrating to WordPress).

I am not saying this is the 100% correct way to get the data, but it worked for me.

Below is the process I followed to get the page content out of the database.

As you have already seen, most of the data is stored in Node and NodePlaceholderContent.

1) To get an idea of what the Node table holds, you can look at the content grouped by type:

SELECT 
    [Type] 
    ,CASE [Type] 
     WHEN  1 THEN 'Server' 
     WHEN  4 THEN 'Channel' 
     WHEN  16 THEN 'Post/Page' 
     WHEN  64 THEN 'Resource Gallery' 
     WHEN 256 THEN 'Resource Gallery Item (images/documents)' 
     WHEN 16384 THEN 'Template Gallery' 
     WHEN 65536 THEN 'Template' END as [Description] 
    ,COUNT([Type]) as [Count] 
FROM  dbo.Node 
GROUP BY [Type] 
ORDER BY [Count] DESC 

2) Pages (and posts, which are covered further down) have Type = 16, but to get only pages (and not posts) we need to filter by IsShortcut = 0

SELECT * FROM dbo.Node WHERE [Type] = 16 AND IsShortcut = 0 

3) I only wanted published pages, so filter by ApprovalStatus = 1

-- Get all published pages 
SELECT * 
FROM dbo.Node WHERE [Type] = 16 
AND IsShortcut = 0 
AND ApprovalStatus = 1 

4) Next, determine who created and who last modified each page (with user names)

-- Get published pages & author/editor 
SELECT 
    [page].Id 
    ,[page].NodeGuid 
    ,[page].Name 
    ,[created].Username as 'CreatedBy' 
    ,[page].CreatedWhen 
    ,[modified].Username as 'ModifiedBy' 
    ,[page].ModifiedWhen 
FROM  dbo.Node [page] 
-- add JOIN on created by user 
INNER JOIN dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId 
-- add JOIN on modified by user 
INNER JOIN dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId 
WHERE [page].[Type] = 16 
AND [page].IsShortcut = 0 
AND [page].ApprovalStatus = 1 

5) Next, work out the hierarchy, which we do using Node.ParentGUID

SELECT 
    [page].Id 
    ,[page].NodeGuid 
    ,[page].Name 
    ,[pageParent].Name -- add page parent Name 
    ,[created].Username as 'CreatedBy' 
    ,[page].CreatedWhen 
    ,[modified].Username as 'ModifiedBy' 
    ,[page].ModifiedWhen 
FROM  dbo.Node [page] 
INNER JOIN dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId 
INNER JOIN dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId 
-- add JOIN on Node using ParentGUID 
INNER JOIN dbo.Node [pageParent] ON [pageParent].NodeGUID = [page].ParentGUID 
WHERE [page].[Type] = 16 
AND [page].IsShortcut = 0 
AND [page].ApprovalStatus = 1 

This query showed me that the parent node of each page is either Folders or Archive Folder.

6) Go up one more level (get the parent of the parent)

SELECT 
    [page].Id 
    ,[page].NodeGuid 
    ,[page].Name 
    ,[pageParent].Name 
    ,[pageParent2].Name -- add parent of parent name 
    ,[created].Username as 'CreatedBy' 
    ,[page].CreatedWhen 
    ,[modified].Username as 'ModifiedBy' 
    ,[page].ModifiedWhen 
FROM  dbo.Node [page] 
INNER JOIN dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId 
INNER JOIN dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId 
INNER JOIN dbo.Node [pageParent] ON [pageParent].NodeGUID = [page].ParentGUID 
-- add another JOIN on Node using ParentGUID (parent of parent) 
INNER JOIN dbo.Node [pageParent2] ON [pageParent2].NodeGUID = [pageParent].ParentGUID 
WHERE [page].[Type] = 16 
AND [page].IsShortcut = 0 
AND [page].ApprovalStatus = 1 

The parent of the parent is the Server (root level), so my conclusion at this point is that if a page's parent is:

  • Folders - then it is an active page
  • Archive Folder - then it is a previous version of another page

I only want active pages, so I will join on the Folders parent only.
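If the rows from the step 6 query are pulled into a script first, the same Folders / Archive Folder split can be done in code. This is only a rough Python sketch: `split_by_parent` is a made-up helper, `ParentName` is a hypothetical alias for `[pageParent].Name` (the query above leaves that column unaliased), and the sample rows are invented.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def split_by_parent(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split page rows into (active, archived) by the name of their parent node.

    Assumes each row has a 'ParentName' key, a hypothetical alias for
    [pageParent].Name in the step 6 query.
    """
    active: List[Row] = []
    archived: List[Row] = []
    for row in rows:
        if row["ParentName"] == "Folders":
            active.append(row)          # current version of a page
        elif row["ParentName"] == "Archive Folder":
            archived.append(row)        # previous version of another page
    return active, archived


# Invented sample rows, shaped like a subset of the step 6 result set.
sample_rows = [
    {"Name": "welcome", "ParentName": "Folders"},
    {"Name": "welcome", "ParentName": "Archive Folder"},
    {"Name": "contact", "ParentName": "Folders"},
]

active_pages, archived_pages = split_by_parent(sample_rows)
print(len(active_pages), "active,", len(archived_pages), "archived")
```

Doing the filter in SQL, as in the next step, is simpler when only active pages are needed; a split like this is mainly useful if the archived versions should be kept somewhere as well.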

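7) Now for the markup. In our MCMS templates there is only one placeholder region. The NodePlaceholder table identifies the name of the placeholder, which is useful if your templates have more than one placeholder region. For simplicity, I will join on NodePlaceholderContent only.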

SELECT 
    [page].Id 
    ,[page].NodeGuid 
    ,[page].Name 
    /* remove parent names */ 
    ,[created].Username as 'CreatedBy' 
    ,[page].CreatedWhen 
    ,[modified].Username as 'ModifiedBy' 
    ,[page].ModifiedWhen 
    ,html.PropValue as 'HTML' -- add the markup 
FROM  dbo.Node [page] 
INNER JOIN dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId 
INNER JOIN dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId 
-- change alias to "folders" 
INNER JOIN dbo.Node [folders] ON [folders].NodeGUID = [page].ParentGUID AND [folders].Name = 'Folders' 
-- join on PlaceholderContent to get the HTML 
-- this table will also have references to any static files contained in the page (such as images) so we filter those out by PropName = 'HTML' 
INNER JOIN dbo.NodePlaceholderContent html ON html.NodeId = [page].Id AND html.PropName = 'HTML' 
WHERE [page].[Type] = 16 
AND [page].IsShortcut = 0 
AND [page].ApprovalStatus = 1 
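Since the question asks for a simple, searchable database, here is a rough Python sketch of loading rows shaped like the result set above into a flat table and searching the markup. Everything in it is an assumption for illustration: the `pages` table, the helpers `load_pages` and `search_pages`, and the sample rows are made up, and an in-memory SQLite database stands in for whatever target database is actually used.

```python
import sqlite3
from typing import List, Tuple

# Invented rows in the column order of the query above:
# (Id, NodeGuid, Name, CreatedBy, CreatedWhen, ModifiedBy, ModifiedWhen, HTML)
PageRow = Tuple[int, str, str, str, str, str, str, str]

sample_rows: List[PageRow] = [
    (101, "guid-101", "welcome", "alice", "2009-01-05", "bob", "2010-03-02",
     "<p>Welcome to the site</p>"),
    (102, "guid-102", "contact", "alice", "2009-02-11", "alice", "2009-02-11",
     "<p>Contact us by phone</p>"),
]


def load_pages(conn: sqlite3.Connection, rows: List[PageRow]) -> None:
    """Create a flat 'pages' table (hypothetical schema) and insert the rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS pages (
               id INTEGER PRIMARY KEY,
               node_guid TEXT,
               name TEXT,
               created_by TEXT,
               created_when TEXT,
               modified_by TEXT,
               modified_when TEXT,
               html TEXT)"""
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


def search_pages(conn: sqlite3.Connection, term: str) -> List[str]:
    """Return the names of pages whose markup contains the term."""
    cur = conn.execute(
        "SELECT name FROM pages WHERE html LIKE ? ORDER BY name",
        ("%" + term + "%",),
    )
    return [row[0] for row in cur.fetchall()]


conn = sqlite3.connect(":memory:")  # stand-in for the real target database
load_pages(conn, sample_rows)
print(search_pages(conn, "phone"))
```

In a real migration the rows would come from the MCMS database rather than a hard-coded list, and the markup would probably need cleaning before it is searched, since a plain substring match also hits tag and attribute names.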

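8) At this point I got a little stuck trying to determine where a page lives in the system (i.e. its relative path, or which channel it lives in). Going back to steps 1 and 2, Type = 16 can be a post or a page (they are not the same thing, but they are related). So now we join our pages to the post records to determine the path.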

After some googling I stumbled upon this excerpt from Microsoft Content Management Server 2002: A Complete Guide, which really helped me get the rest of the way (and identified the Node.Type enum).

SELECT 
    [page].Id 
    ,[page].NodeGuid 
    ,[page].Name 
    ,[post].DisplayName as 'Title' -- add page Title from the post record 
    /* parent names omitted: only [folders] is joined in this query */ 
    ,[created].Username as 'CreatedBy' 
    ,[page].CreatedWhen 
    ,[modified].Username as 'ModifiedBy' 
    ,[page].ModifiedWhen 
    ,html.PropValue as 'HTML' 
FROM  dbo.Node [page] 
INNER JOIN dbo.ClientAccount [created] ON [created].UserId = [page].CreatedByUserId 
INNER JOIN dbo.ClientAccount [modified] ON [modified].UserId = [page].ModifiedByUserId 
INNER JOIN dbo.Node [folders] ON [folders].NodeGUID = [page].ParentGUID AND [folders].Name = 'Folders' 
INNER JOIN dbo.NodePlaceholderContent html ON html.NodeId = [page].Id AND html.PropName = 'HTML' 
-- join using followGUID to get the posting 
INNER JOIN dbo.Node [post] ON [post].FollowGUID = [page].NodeGUID 
WHERE [page].[Type] = 16 
AND [page].IsShortcut = 0 
AND [page].ApprovalStatus = 1 

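9) The last step is to keep going up the post's parent hierarchy, which results in multiple LEFT JOINs walking up the ParentGUID chain. This query uses those LEFT JOINs to give a visual representation of the hierarchy.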

SELECT 
    CASE WHEN postParent9.Name IS NULL THEN '' ELSE postParent9.Name + ' > ' END + 
    CASE WHEN postParent8.Name IS NULL THEN '' ELSE postParent8.Name + ' > ' END + 
    CASE WHEN postParent7.Name IS NULL THEN '' ELSE postParent7.Name + ' > ' END + 
    CASE WHEN postParent6.Name IS NULL THEN '' ELSE postParent6.Name + ' > ' END + 
    CASE WHEN postParent5.Name IS NULL THEN '' ELSE postParent5.Name + ' > ' END + 
    CASE WHEN postParent4.Name IS NULL THEN '' ELSE postParent4.Name + ' > ' END + 
    CASE WHEN postParent3.Name IS NULL THEN '' ELSE postParent3.Name + ' > ' END + 
    CASE WHEN postParent2.Name IS NULL THEN '' ELSE postParent2.Name + ' > ' END + 
    CASE WHEN postParent1.Name IS NULL THEN '' ELSE postParent1.Name + ' > ' END + 
    page.Name as [Path] 
    ,page.Name + '.htm' as [PageName] 
    ,post.DisplayName as [PageTitle] 
    ,CASE page.[Type] 
     WHEN  1 THEN 'Server' 
     WHEN  4 THEN 'Channel' 
     WHEN  16 THEN 'Post/Page' 
     WHEN  64 THEN 'Resource Gallery' 
     WHEN 256 THEN 'Resource Gallery Item (images/documents)' 
     WHEN 16384 THEN 'Template Gallery' 
     WHEN 65536 THEN 'Template' END as [Type] 
    ,page.CreatedWhen as 'Created' 
    ,page.ModifiedWhen as 'Modified' 
    ,html.PropValue as 'HTML' 
FROM  dbo.Node page 
INNER JOIN dbo.Node folders ON folders.NodeGUID = page.ParentGUID AND folders.Name = 'Folders' 
INNER JOIN dbo.NodePlaceholderContent html ON html.NodeId = page.Id AND html.PropName = 'HTML' 
INNER JOIN dbo.Node post ON post.FollowGUID = page.NodeGUID AND post.IsShortcut = 1 
LEFT JOIN dbo.Node postParent1 ON postParent1.NodeGuid = post.ParentGUID 
LEFT JOIN dbo.Node postParent2 ON postParent2.NodeGuid = postParent1.ParentGUID 
LEFT JOIN dbo.Node postParent3 ON postParent3.NodeGuid = postParent2.ParentGUID 
LEFT JOIN dbo.Node postParent4 ON postParent4.NodeGuid = postParent3.ParentGUID 
LEFT JOIN dbo.Node postParent5 ON postParent5.NodeGuid = postParent4.ParentGUID 
LEFT JOIN dbo.Node postParent6 ON postParent6.NodeGuid = postParent5.ParentGUID 
LEFT JOIN dbo.Node postParent7 ON postParent7.NodeGuid = postParent6.ParentGUID 
LEFT JOIN dbo.Node postParent8 ON postParent8.NodeGuid = postParent7.ParentGUID 
LEFT JOIN dbo.Node postParent9 ON postParent9.NodeGuid = postParent8.ParentGUID 
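The nine fixed LEFT JOINs cap the path at nine levels. If the channel tree might be deeper, one alternative is to export NodeGuid, ParentGUID and Name for every node and walk the chain in code. Below is a minimal Python sketch of that idea, not something taken from MCMS itself: `build_path`, the sample GUIDs and the sample names are all made up for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (NodeGuid, ParentGUID, Name); ParentGUID is None for the root node.
NodeRow = Tuple[str, Optional[str], str]

# Invented sample data standing in for rows exported from dbo.Node.
sample_nodes: List[NodeRow] = [
    ("g-root", None, "Server"),
    ("g-chan", "g-root", "Channels"),
    ("g-news", "g-chan", "News"),
    ("g-post", "g-news", "welcome"),
]

nodes_by_guid: Dict[str, NodeRow] = {row[0]: row for row in sample_nodes}


def build_path(guid: str, nodes: Dict[str, NodeRow], separator: str = " > ") -> str:
    """Walk ParentGUID links upward from guid and return 'root > ... > node'.

    The starting node's own name is included as the last element.
    Unknown GUIDs end the walk, and a 'seen' set guards against cycles.
    """
    names: List[str] = []
    seen = set()
    current: Optional[str] = guid
    while current is not None and current in nodes and current not in seen:
        seen.add(current)
        _, parent_guid, name = nodes[current]
        names.append(name)
        current = parent_guid
    return separator.join(reversed(names))


print(build_path("g-post", nodes_by_guid))
```

This trades one large query for a small export plus a loop, which works at any depth. For a tree that is known to be shallow, the fixed joins above are just as good and keep everything in SQL.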

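By the way, my task did not involve exporting resource gallery content (images, documents, etc.), but if you do need those, there should be enough information here to get you off to a good start.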

I hope this helps someone else migrating from MCMS 2002...
