2016-12-05 87 views
2

我需要估计数据库大小的先决条件,所以我想了解SQL Server如何在示例波纹管中存储数据。估计SQL Server中的表大小

在我的SQL Server数据库中,我有一个名为InfoComp至极表包含4行:

IdInfoComp : Integer Not Null (PK) 
IdDefinition : Integer Not Null (FK) 
IdObject : Integer Not Null (FK) 
Value : NVarChar(Max) Not Null 

我想估计表的大小。在实际使用,我可以得到存储在Value平均长度与此SQL查询:

SELECT AVG(Value) FROM InfoComp 
Result : 8 

所以,我的计算似乎是(字节):

(Size(IdInfoComp) + Size(IdDefinition) + Size(IdObject) + AVG Size(Value)) * Rows count 

(4 + 4 + 4 + ((8 * 2) + 2)) * NbRows 

但是,当我试图在实际情况下应用此计算,这是错误的。在我的情况下,我有3,250,273行,因此结果应该是92 MB,但是MS SQL Report说:

(数据)147 888 KB(索引)113 072 KB和(保留)261 160 KB。

我在哪里错了?

+3

我不确定,但一个快速的谷歌搜索带来了一个官方的MSDN页面,有更具体的说明...也许尝试吗? https://msdn.microsoft.com/en-us/library/ms175991.aspx(看起来也许它是计算索引的大小,这与整个数据足迹完全不同) – jleach

+0

我实际上正在尝试。但我有一些差异(近10MB)。 – Cedric

+1

而不是平均水平,你可以将“价值”的总和加起来。 –

回答

2

试试这个......让我接近。我使用msdn文章来创建。您可以设置行数。这将会执行db中的每个表,包括索引。目前还不会处理列存储,也不会处理关系。它只会将行数估算应用于每个表格。

/*Do NOT change this section*/ 
GO 
CREATE TABLE RowSizes (TypeName VARCHAR(30), TableName VARCHAR(255), IndexName VARCHAR(255), Null_Bitmap SMALLINT, VariableFieldSize BIGINT, FixedFieldSize BIGINT, Row_Size BIGINT, LOBFieldSize BIGINT); 
CREATE TABLE LeafSizes (TypeName VARCHAR(30), TableName VARCHAR(255), IndexName VARCHAR(255), Row_Size BIGINT, Rows_Per_Page BIGINT, Free_Rows_Per_Page BIGINT, Non_Leaf_Levels BIGINT, Num_Leaf_Pages BIGINT, Num_Index_Pages BIGINT, Leaf_space_used_bytes BIGINT); 
GO 
CREATE PROCEDURE dbo.cp_CalcIndexPages 
    @IndexType VARCHAR(20) 
AS 
BEGIN 
    DECLARE @IndexName VARCHAR(255) 
     , @TableName varchar(255) 
     , @Non_Leaf_Levels bigint = 127 
     , @Rows_Per_Page bigint = 476 
     , @Num_Leaf_Pages bigint =10000; 

    WHILE EXISTS(SELECT TOP 1 1 FROM dbo.LeafSizes WHERE TypeName = @IndexType AND Num_Index_Pages = 0)-- AND IndexName = 'PK_ProcessingMessages') 
    BEGIN 
     SELECT TOP 1 @IndexName = IndexName 
      , @TableName = TableName 
      , @Non_Leaf_Levels = Non_Leaf_Levels 
      , @Rows_Per_Page = Rows_Per_Page 
      , @Num_Leaf_Pages = Num_Leaf_Pages 
     FROM dbo.LeafSizes 
     WHERE TypeName = @IndexType 
      AND Num_Index_Pages = 0; 

     DECLARE @Counter INT = 1 
      , @Num_Index_Pages INT = 0; 

     WHILE @Counter <= @Non_Leaf_Levels 
     BEGIN 
      BEGIN TRY 

      SELECT @Num_Index_Pages += ROUND(CASE WHEN @Num_Leaf_Pages/POWER(@Rows_Per_Page, @Counter) < CONVERT(FLOAT, 1) THEN 1 ELSE @Num_Leaf_Pages/POWER(@Rows_Per_Page, @Counter) END, 0) 
      END TRY 

      BEGIN CATCH 
       SET @Num_Index_Pages += 1 
      END CATCH 

      SET @Counter += 1 
     END 

     IF @Num_Index_Pages = 0 
      SET @Num_Index_Pages = 1; 

     UPDATE dbo.LeafSizes 
     SET Num_Index_Pages = @Num_Index_Pages 
      , Leaf_space_used_bytes = 8192 * @Num_Index_Pages 
     WHERE TableName = @TableName 
      AND IndexName = @IndexName; 

    END 
END 
GO 
/*Do NOT change above here*/ 

--Set parameters here 
DECLARE @NumRows INT = 1000000 --Number of rows for estimate 
    ,@VarPercentFill money = .6; --Percentage of variable field space used to estimate. 1 will provide estimate as if all variable columns are 100% full. 


/*Do not change*/ 
WITH cte_Tables AS (--Get Tables 
    SELECT o.object_id, s.name+'.'+o.name AS ObjectName 
    FROM sys.objects o 
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id 
    WHERE type = 'U' 
), cte_TableData AS (--Calculate Field Sizes 
    SELECT o.ObjectName AS TableName 
     , SUM(CASE WHEN t.name IN ('int', 'bigint', 'tinyint', 'char', 'datetime', 'smallint', 'date') THEN 1 ELSE 0 END) AS FixedFields 
     , SUM(CASE WHEN t.name IN ('int', 'bigint', 'tinyint', 'char', 'datetime', 'smallint', 'date') THEN c.max_length ELSE 0 END) AS FixedFieldSize 
     , SUM(CASE WHEN t.name IN ('varchar') THEN 1 ELSE 0 END) AS VariableFields 
     , SUM(CASE WHEN t.name IN ('varchar') THEN c.max_length ELSE 0 END)*@VarPercentFill AS VariableFieldSize 
     , SUM(CASE WHEN t.name IN ('xml') THEN 1 ELSE 0 END) AS LOBFields 
     , SUM(CASE WHEN t.name IN ('xml') THEN 10000 ELSE 0 END) AS LOBFieldSize 
     , COUNT(1) AS TotalColumns 
    FROM sys.columns c 
    INNER JOIN cte_Tables o ON o.object_id = c.object_id 
    INNER JOIN sys.types t ON c.system_type_id = t.system_type_id 
    GROUP BY o.ObjectName 
), cte_Indexes AS (--Get Indexes and size 
    SELECT s.name+'.'+o.name AS TableName 
     , ISNULL(i.name, '') AS IndexName 
     , i.type_desc 
     , i.index_id 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.key_ordinal > 0 THEN 1 ELSE 0 END) AS FixedFields 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.key_ordinal > 0 THEN tc.max_length ELSE 0 END) AS FixedFieldSize 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.key_ordinal > 0 THEN 1 ELSE 0 END) AS VariableFields 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.key_ordinal > 0 THEN tc.max_length ELSE 0 END)*@VarPercentFill AS VariableFieldSize 
     , SUM(CASE WHEN t.name IN ('xml') AND c.key_ordinal > 0 THEN 1 ELSE 0 END) AS LOBFields 
     , SUM(CASE WHEN t.name IN ('xml') AND c.key_ordinal > 0 THEN 10000 ELSE 0 END) AS LOBFieldSize 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.is_included_column > 0 THEN 1 ELSE 0 END) AS FixedIncludes 
     , SUM(CASE WHEN t.name IN ('tinyint','smallint', 'int', 'bigint', 'char', 'datetime', 'date') AND c.is_included_column > 0 THEN 1 ELSE 0 END) AS FixedIncludesSize 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.is_included_column > 0 THEN 1 ELSE 0 END)*@VarPercentFill AS VariableIncludes 
     , SUM(CASE WHEN t.name IN ('varchar') AND c.is_included_column > 0 THEN tc.max_length ELSE 0 END) AS VariableIncludesSize 
     , COUNT(1) AS TotalColumns 
    FROM sys.indexes i 
    INNER JOIN sys.columns tc ON i.object_id = tc.object_id 
    INNER JOIN sys.index_columns c ON i.index_id = c.index_id 
     AND c.column_id = tc.column_id 
     AND c.object_id = i.object_id 
    INNER JOIN sys.objects o ON o.object_id = i.object_id AND o.is_ms_shipped = 0 
    INNER JOIN sys.schemas s ON o.schema_id = s.schema_id 
    INNER JOIN sys.types t ON tc.system_type_id = t.system_type_id 
    GROUP BY s.name+'.'+o.name, ISNULL(i.name, ''), i.type_desc, i.index_id 
) 
INSERT RowSizes 
SELECT 'Table' AS TypeName 
    , n.TableName 
    , '' AS IndexName 
    , 2 + ((n.FixedFields+n.VariableFields+7)/8) AS Null_Bitmap 
    , 2 + (n.VariableFields * 2) + n.VariableFieldSize AS Variable_Data_Size 
    , n.FixedFieldSize 
    /*FixedFieldSize + Variable_Data_Size + Null_Bitmap*/ 
    , n.FixedFieldSize + (2 + (n.VariableFields * 2) + (n.VariableFieldSize)) + (2 + ((n.FixedFields+n.VariableFields+7)/8)) + 4 AS Row_Size 
    , n.LOBFieldSize 
FROM cte_TableData n 
UNION 
SELECT i.type_desc 
    , i.TableName 
    , i.IndexName 
    , 0 AS Null_Bitmap 
    , CASE WHEN i.VariableFields > 0 THEN 2 + (i.VariableFields * 2) + i.VariableFieldSize + 4 ELSE 0 END AS Variable_Data_Size 
    , i.FixedFieldSize 
    /*FixedFieldSize + Variable_Data_Size + Null_Bitmap if not clustered*/ 
    , i.FixedFieldSize + CASE WHEN i.VariableFields > 0 THEN 2 + (i.VariableFields * 2) + i.VariableFieldSize + 4 ELSE 0 END + 7 AS Row_Size 
    , i.LOBFieldSize 
FROM cte_Indexes i 
WHERE i.index_id IN(0,1) 
UNION 
SELECT i.type_desc 
    , i.TableName 
    , i.IndexName 
    , CASE WHEN si.TotalColumns IS NULL THEN 2 + ((i.FixedFields+i.VariableFields+i.VariableIncludes+i.FixedIncludes+8)/8) 
      ELSE 2 + ((i.FixedFields+i.VariableFields+i.VariableIncludes+i.FixedIncludes+7)/8) 
     END AS Null_Bitmap 
    , CASE WHEN si.TotalColumns IS NULL THEN 2 + ((i.VariableFields + 1) * 2) + (i.VariableFieldSize + 8) 
      ELSE 2 + (i.VariableFields * 2) + i.VariableFieldSize 
     END AS Variable_Data_Size 
    , CASE WHEN si.TotalColumns IS NULL THEN si.FixedFieldSize 
      ELSE i.FixedFieldSize + si.FixedFieldSize 
     END AS FixedFieldSize 
    /*FixedFieldSize + Variable_Data_Size + Null_Bitmap if not clustered*/ 
    , CASE WHEN si.TotalColumns IS NULL THEN i.FixedFieldSize + (2 + ((i.VariableFields + 1) * 2) + (i.VariableFieldSize + 8)) + (2 + ((i.TotalColumns+8)/8)) + 7 
      ELSE i.FixedFieldSize + (2 + (i.VariableFields * 2) + i.VariableFieldSize) + (2 + ((i.TotalColumns+7)/8)) + 4 
     END AS Row_Size 
    , i.LOBFieldSize 
FROM cte_Indexes i 
LEFT OUTER JOIN cte_Indexes si ON i.TableName = si.TableName AND si.type_desc = 'CLUSTERED' 
WHERE i.index_id NOT IN(0,1) AND i.type_desc = 'NONCLUSTERED'; 

--SELECT * FROM RowSizes 

/*Calculate leaf sizes for tables and HEAPs*/ 
INSERT LeafSizes 
SELECT r.TypeName 
    , r.TableName 
    ,'' AS IndexName 
    , r.Row_Size 
    , 8096/(r.Row_Size + 2) AS Rows_Per_Page 
    , 8096 * ((100 - 90)/100)/(r.Row_Size + 2) AS Free_Rows_Per_Page 
    , 0 AS Non_Leaf_Levels 
    /*Num_Leaf_Pages = Number of Rows/(Rows_Per_Page - Free_Rows_Per_Page) OR 1 if less than 1*/ 
    , CASE WHEN @NumRows/((8096/(r.Row_Size + 2)) - (8096 * ((100 - 90)/100)/(r.Row_Size + 2))) < 1 
      THEN 1 
      ELSE @NumRows/((8096/(r.Row_Size + 2)) - (8096 * ((100 - 90)/100)/(r.Row_Size + 2))) 
     END AS Num_Leaf_Pages 
    , 0 AS Num_Index_Pages 
    /*Leaf_space_used = 8192 * Num_Leaf_Pages*/ 
    , 8192 * CASE WHEN @NumRows/((8096/(r.Row_Size + 2)) - (8096 * ((100 - 90)/100)/(r.Row_Size + 2))) < 1 
       THEN 1 
       ELSE @NumRows/((8096/(r.Row_Size + 2)) - (8096 * ((100 - 90)/100)/(r.Row_Size + 2))) 
      END + (@NumRows * LOBFieldSize) AS Leaf_space_used_bytes 
FROM RowSizes r 
WHERE r.TypeName = 'Table' 
ORDER BY TypeName, TableName; 

/*Calculate leaf sizes for CLUSTERED indexes*/ 
INSERT LeafSizes 
SELECT r.TypeName 
    , r.TableName 
    , r.IndexName 
    , r.Row_Size 
    , 8096/(r.Row_Size + 2) AS Rows_Per_Page 
    , 0 AS Free_Rows_Per_Page 
    , 1 + ROUND(LOG(8096/(r.Row_Size + 2)), 0)*(l.Num_Leaf_Pages/(8096/(r.Row_Size + 2))) AS Non_Leaf_Levels 
    , l.Num_Leaf_Pages 
    , 0 AS Num_Index_Pages 
    , 0 AS Leaf_space_used_bytes 
FROM RowSizes r 
INNER JOIN LeafSizes l ON r.TableName = l.TableName AND l.TypeName = 'Table' 
WHERE r.TypeName = 'CLUSTERED'; 

PRINT 'CLUSTERED' 
EXEC dbo.cp_CalcIndexPages @IndexType = 'CLUSTERED' 

/*Calculate leaf sizes for NONCLUSTERED indexes*/ 
INSERT LeafSizes 
SELECT r.TypeName 
    , r.TableName 
    , r.IndexName 
    , r.Row_Size 
    , 8096/(r.Row_Size + 2) AS Rows_Per_Page 
    , 0 AS Free_Rows_Per_Page 
    , 1 + ROUND(LOG(8096/(r.Row_Size + 2)), 0)*(l.Num_Leaf_Pages/(8096/(r.Row_Size + 2))) AS Non_Leaf_Levels 
    , l.Num_Leaf_Pages 
    , 0 AS Num_Index_Pages 
    , 0 AS Leaf_space_used_bytes 
FROM RowSizes r 
INNER JOIN LeafSizes l ON r.TableName = l.TableName AND l.TypeName = 'Table' 
WHERE r.TypeName = 'NONCLUSTERED'; 

PRINT 'NONCLUSTERED' 
EXEC dbo.cp_CalcIndexPages @IndexType = 'NONCLUSTERED' 

SELECT * 
FROM dbo.LeafSizes 
--WHERE TableName = 'eligibility.clientrequest' 

SELECT TableName 
    , @NumRows AS RowsPerTable 
    , @VarPercentFill*100 AS VariableFieldFillFactor 
    , SUM(CASE WHEN TypeName = 'Table' THEN Leaf_space_used_bytes ELSE 0 END)/1024/1024 AS TableSizeMB 
    , SUM(Leaf_space_used_bytes)/1024/1024 AS SizeWithIndexesMB 
FROM LeafSizes 
--WHERE TableName = 'eligibility.clientrequest' 
GROUP BY TableName 
ORDER BY TableName; 


GO 
/*Cleanup when done*/ 
DROP PROCEDURE dbo.cp_CalcIndexPages; 
DROP TABLE dbo.RowSizes; 
DROP TABLE dbo.LeafSizes; 
0

不幸的是,我不能说,为什么你的计算是错误的,因为是关于如何创建表和数据库是如何配置没有足够的信息。所以我会尽量共同回答,并且您会收到小费。

您应该首先知道任何SQL Server数据库的大小大于或等于model数据库的大小。这是因为model数据库是新数据库的模板,因此每次执行CREATE DATABASE语句时都会复制它。

数据库中的所有信息都存储在磁盘上的8 KB页面中。有许多类型的页面。其中一些(如分配图和元数据)用于内部目的,而其他用于存储数据。

表大小取决于磁盘上组织的数据(是否具有聚簇索引),列类型和数据压缩。索引的大小取决于索引表上唯一索引的存在,索引的级数,填充因子等。

正如我之前所说,一切都存储在页面和数据中。 SQL Server具有行内数据的页面,行溢出数据的页面以及LOB数据的页面。数据页面由三个主要部分组成:页眉,数据行和数据偏移量数组。

页眉占用每个数据页的前96个字节,其余组件留下8,096个字节。行偏移量数组是存储在页面末尾的一个2字节条目块。输入计数存储在标题中并称为时隙计数。

标题和行偏移量数组之间的区域是存储数据行的区域。每行由两部分组成:固定大小部分和可变长度部分。

数据行的结构是:

  • 状态位A,1个字节
  • 状态位B,1个字节
  • 固定长度尺寸(FSIZE),2个字节
  • 定点长度数据(FDATA),FSIZE - 4
  • 编号列(NCOL)的,2个字节
  • NULL位图,天花(NCOL/8)
  • 编号存储在行(VarCount)可变长度列的,2个字节
  • 变量列偏移阵列(VarOffset),2 * VarCount
  • 可变长度数据(VARDATA),VarOff [VarCount] - (FSIZE + 4 +天花板(NCOL/8)+ 2 * VarCount)

索引行存储在相同的方式,数据行。

并非我在这里解释的所有东西,但我希望这将帮助您了解SQL Server使用分配空间的目的。另外,您应该记住,数据库文件按照FILEGROWTH选项指定的大小增长,这可能导致实际大小大于估计的大小。

也看看Microsoft SQL Server 2012 Internals的书,并阅读如何Estimate the Size of a Database。它可能会对你有趣。