代码改变世界

微软BI 之SSAS 系列 - 多维数据集维度用法之一 引用维度 Referenced Dimension

2014-01-04 14:58  BIWORK  阅读(5294)  评论(11编辑  收藏  举报

在 CUBE 设计过程中有一个非常重要的点就是定义维度与度量值组关系,维度的创建一般在前,而度量值组一般来源于一个事实表。当维度和度量值组在 CUBE 中定义完成之后,下一个最重要的动作就是定义两者之间的关系。在前面几篇文章中也已经看到了如何将度量值组和维度通过哪些维度属性进行关联的操作,但是那些关联通常都是 Regular 类型的,属于直接的普通的关联。

但是实际上除了 Regular 之外,还有 Fact, Referenced, Many to Many 和 Data Mining(数据挖掘类不在此讨论)。

今天在这篇文章中讲解 Referenced 和 Fact 两种关系的定义。

使用的测试代码如下 -

USE BIWORK_SSIS
GO
SET NOCOUNT ON
IF OBJECT_ID('FactInternetSalesReason','U') IS NOT NULL
DROP TABLE FactInternetSalesReason 

IF OBJECT_ID('FactInternetSales','U') IS NOT NULL
DROP TABLE FactInternetSales

IF OBJECT_ID('FactResellerSales','U') IS NOT NULL
DROP TABLE FactResellerSales

IF OBJECT_ID('DimEmployee','U') IS NOT NULL
DROP TABLE DimEmployee

IF OBJECT_ID('DimDate','U') IS NOT NULL
DROP TABLE DimDate 

IF OBJECT_ID('DimProduct','U') IS NOT NULL
DROP TABLE DimProduct 

IF OBJECT_ID('DimSalesReason','U') IS NOT NULL
DROP TABLE DimSalesReason 

IF OBJECT_ID('DimProductSubcategory','U') IS NOT NULL
DROP TABLE DimProductSubcategory

IF OBJECT_ID('DimProductCategory','U') IS NOT NULL
DROP TABLE DimProductCategory

IF OBJECT_ID('DimReseller','U') IS NOT NULL
DROP TABLE DimReseller 

IF OBJECT_ID('DimCustomer','U') IS NOT NULL
DROP TABLE DimCustomer 

IF OBJECT_ID('DimGeography','U') IS NOT NULL
DROP TABLE DimGeography 
GO
  
CREATE TABLE DimDate
(
    DateKey INT PRIMARY KEY,
    ShortDateName NVARCHAR(12) NOT NULL,
    FullDateName NVARCHAR(20)  NOT NULL, 
    DayNumberOfWeek TINYINT NOT NULL,
    DayNameOfWeek NVARCHAR(10) NOT NULL,
    DayNumberOfMonth TINYINT NOT NULL,
    DayNumberOfYear SMALLINT NOT NULL, 
    WeekNumberOfYear TINYINT NOT NULL,
    IsWeekend NVARCHAR(7) NOT NULL, 
    IsLeapYear BIT NOT NULL,
    MonthKey INT NOT NULL,
    MonthNumberOfYear TINYINT NOT NULL,
    MonthNameOfYear NVARCHAR(10) NOT NULL,
    MonthNameWithYear NVARCHAR(20) NOT NULL, 
    CalendarQuarterKey INT NOT NULL, 
    CalendarQuarterNumber TINYINT NOT NULL, 
    CalendarQuarterNameWithYear NVARCHAR(20) NOT NULL, 
    CalendarSemesterNumber TINYINT NOT NULL,
    CalendarYearKey SMALLINT NOT NULL, 
    CalendarYearName NVARCHAR(20) NOT NULL, 
    FiscalQuarterKey INT,
    FiscalQuarterNumber TINYINT NOT NULL, 
    FiscalQuarterName NVARCHAR(20),
    FiscalSemester TINYINT NOT NULL,
    FiscalYearKey SMALLINT NOT NULL, 
    FiscalYearName NVARCHAR(20),
)

DECLARE @StartDate DATETIME
DECLARE @EndDate DATETIME

SELECT @StartDate = '2005-01-01',
       @EndDate = '2013-12-31'

WHILE (@StartDate <= @EndDate)
BEGIN
    INSERT INTO DimDate 
    (
        DateKey,
        ShortDateName,
        FullDateName, 
        DayNumberOfWeek,
        DayNameOfWeek,
        DayNumberOfMonth,
        DayNumberOfYear, 
        WeekNumberOfYear,
        IsWeekend,
        IsLeapYear, 
        MonthKey,
        MonthNumberOfYear,
        MonthNameOfYear, 
        MonthNameWithYear,  
        CalendarQuarterKey,
        CalendarQuarterNumber,
        CalendarQuarterNameWithYear,
        CalendarSemesterNumber,
        CalendarYearKey, 
        CalendarYearName,  
        FiscalQuarterNumber, 
        FiscalSemester,
        FiscalYearKey 
    )
    SELECT CAST(CONVERT(VARCHAR(8),@StartDate,112) AS INT) AS 'DateKey',
           CONVERT(VARCHAR(20), @StartDate,106) AS 'ShortDateName',
           CONVERT(VARCHAR(2),DATENAME(DD,@StartDate)) 
                    + ' ' 
                    + DATENAME(MM,@StartDate) 
                    + ' ' 
                    + CONVERT(CHAR(4), DATEPART(YY,@StartDate)) AS 'FullDateName', -- 1 July 2005  
           DATEPART(DW,@StartDate) AS 'DayNumberOfWeek',
           DATENAME(DW,@StartDate) AS 'DayNameOfWeek',
           DATENAME(DD,@StartDate) AS 'DayNumberOfMonth',
           DATENAME(DY,@StartDate) AS 'DayNumberOfYear',
           DATEPART(WW,@StartDate) AS 'WeekNumberOfYear',
           CASE WHEN DATEPART(DW,@StartDate) IN (1,7)  
                    THEN 'Weekend'
                ELSE 'Weekday' 
           END AS 'IsWeekend',
           CASE WHEN ((YEAR(@StartDate) % 4 = 0) AND (YEAR(@StartDate) % 100 != 0 OR YEAR(@StartDate) % 400 = 0))
                    THEN 1
                ELSE 0
           END AS 'IsLeapYear', 
           DATEPART(YY,@StartDate) * 100 + DATEPART(MM,@StartDate) AS 'MonthKey', -- 200507
           DATEPART(MM,@StartDate) AS 'MonthNumberOfYear',
           DATENAME(MM,@StartDate) AS 'MonthNameOfYear',
           DATENAME(MM,@StartDate) + ' ' + CONVERT(CHAR(4),DATEPART(YY,@StartDate)) AS 'MonthNameWithYear',  -- July 2005  
           DATEPART(YY,@StartDate) * 100 + DATEPART(QQ,@StartDate) AS 'CalendarQuarterKey',  -- 200503
           DATEPART(QQ,@StartDate) AS 'CalendarQuarterNumber',
           'CY ' + CONVERT(CHAR(4),DATEPART(YY,@StartDate)) 
                 + ' Qtr ' 
                 + CONVERT(CHAR(1), DATEPART(QQ,@StartDate)) AS 'CalendarQuarterNameWithYear', -- CY 2005 Qtr 3  
           CASE WHEN DATEPART(MM,@StartDate) BETWEEN 1 AND 6
                    THEN 1
                ELSE 2
           END AS 'CalendarSemester',
           DATEPART(YY,@StartDate) AS 'CalendarYearKey',  
           'CY ' + CONVERT(CHAR(4),DATEPART(YY,@StartDate)) AS 'CalendarYearName', -- CY 2005 
           CASE WHEN DATEPART(MM,@StartDate) BETWEEN 1 AND 6
                     THEN DATEPART(QQ,@StartDate) + 2
                ELSE DATEPART(QQ,@StartDate) - 2
           END AS 'FiscalQuarter',
           CASE WHEN DATEPART(MM,@StartDate) BETWEEN 1 AND 6
                    THEN 2
                ELSE 1
           END AS 'FiscalSemester',
           CASE WHEN DATEPART(MM,@StartDate) BETWEEN 1 AND 6
                    THEN DATEPART(YY,@StartDate) 
                ELSE DATEPART(YY,@StartDate) + 1
           END AS 'FiscalYear' 
    
           UPDATE DimDate 
           SET FiscalQuarterKey = FiscalYearKey * 100 + FiscalQuarterNumber,  -- 200601
               FiscalYearName = 'FY ' + CONVERT(CHAR(4), FiscalYearKey),   -- FY 2006
               FiscalQuarterName =  'FY ' + CONVERT(Char(4), FiscalYearKey) + ' Qtr ' + CONVERT(CHAR(1), FiscalQuarterNumber) -- FY 2006 Qtr 1      
           WHERE DateKey = CONVERT(INT,CONVERT(VARCHAR(8),@StartDate,112))
        
    SET @StartDate = @StartDate + 1
END

SELECT EmployeeKey,
       ParentEmployeeKey,
       EmployeeNationalIDAlternateKey,
       CASE WHEN ISNULL(MiddleName,'') = ''  THEN FirstName +' '+ LastName  
            ELSE FirstName +' '+ MiddleName +' '+LastName
       END AS FullName,
       Title 
INTO DimEmployee
FROM AdventureWorksDW2012.dbo.DimEmployee

SELECT SalesReasonKey,
       SalesReasonAlternateKey,
       SalesReasonName,
       SalesReasonReasonType
INTO DimSalesReason
FROM AdventureWorksDW2012.dbo.DimSalesReason

SELECT SalesOrderNumber,
       SalesOrderLineNumber,
       SalesReasonKey
INTO FactInternetSalesReason
FROM AdventureWorksDW2012.dbo.FactInternetSalesReason

SELECT ProductKey,
       ProductAlternateKey,
       ProductSubcategoryKey,
       EnglishProductName,
       StandardCost,
       Color,
       SafetyStockLevel,
       ListPrice,
       Class,
       Size,
       StartDate,
       EndDate,
       [Status],
       ProductAlternateKey + ' (' + CONVERT (Char(10), StartDate, 120) + ')' AS ProductID
INTO DimProduct
FROM AdventureWorksDW2012.dbo.DimProduct

SELECT ProductSubcategoryKey,
       ProductSubcategoryAlternateKey,
       EnglishProductSubcategoryName,
       ProductCategoryKey 
INTO DimProductSubcategory
FROM AdventureWorksDW2012.dbo.DimProductSubcategory

SELECT ProductCategoryKey,
       ProductCategoryAlternateKey,
       EnglishProductCategoryName
INTO DimProductCategory
FROM AdventureWorksDW2012.dbo.DimProductCategory 

SELECT ProductKey,
       OrderDateKey,
       EmployeeKey,
       ResellerKey,
       SalesOrderLineNumber,
       SalesOrderNumber,
       UnitPrice,
       ProductStandardCost,
       SalesAmount,
       CarrierTrackingNumber,
       CustomerPONumber,
       SalesOrderNumber+'_'+CONVERT(VARCHAR(10),SalesOrderLineNumber) AS 'SalesOrderLineID'
INTO FactResellerSales
FROM AdventureWorksDW2012.dbo.FactResellerSales

SELECT ProductKey,
       OrderDateKey,
       DueDateKey,
       ShipDateKey,
       CustomerKey,
       SalesOrderNumber,
       SalesOrderLineNumber,
       OrderQuantity,
       UnitPrice,
       SalesAmount
INTO FactInternetSales
FROM AdventureWorksDW2012.dbo.FactInternetSales  

SELECT GeographyKey, 
       City,
       StateProvinceCode,
       StateProvinceName,
       CountryRegionCode,
       EnglishCountryRegionName
INTO DimGeography 
FROM AdventureWorksDW2012.dbo.DimGeography

SELECT CustomerKey,
       GeographyKey,
       CustomerAlternateKey,
       Title,
       CASE WHEN ISNULL(MiddleName,'') = ''  THEN FirstName +' '+ LastName  
            ELSE FirstName +' '+ MiddleName +' '+LastName
       END AS CustomerName 
INTO DimCustomer        
FROM AdventureWorksDW2012.dbo.DimCustomer

SELECT ResellerKey,
       GeographyKey,
       ResellerAlternateKey,
       ResellerName
INTO DimReseller
FROM AdventureWorksDW2012.dbo.DimReseller

------------------------------------------------------------------------------
-- Add Primary Key Constraint
------------------------------------------------------------------------------  
ALTER TABLE DimEmployee 
ADD CONSTRAINT PK_EmployeeKey PRIMARY KEY CLUSTERED (EmployeeKey)

ALTER TABLE DimSalesReason 
ADD CONSTRAINT PK_SalesReasonKey PRIMARY KEY CLUSTERED (SalesReasonKey)

ALTER TABLE DimProduct 
ADD CONSTRAINT PK_ProductKey PRIMARY KEY CLUSTERED(ProductKey)

ALTER TABLE DimProductSubcategory 
ADD CONSTRAINT PK_SubcategoryKey PRIMARY KEY CLUSTERED(ProductSubcategoryKey)

ALTER TABLE DimProductCategory 
ADD CONSTRAINT PK_CategoryKey PRIMARY KEY CLUSTERED(ProductCategoryKey)

ALTER TABLE DimGeography 
ADD CONSTRAINT PK_GeographyKey PRIMARY KEY CLUSTERED(GeographyKey)

ALTER TABLE DimReseller 
ADD CONSTRAINT PK_ResellerKey PRIMARY KEY CLUSTERED(ResellerKey)

ALTER TABLE DimCustomer 
ADD CONSTRAINT PK_CustomerKey PRIMARY KEY CLUSTERED(CustomerKey)

ALTER TABLE FactInternetSales 
ADD CONSTRAINT PK_InternetSales PRIMARY KEY CLUSTERED(SalesOrderNumber,SalesOrderLineNumber)

ALTER TABLE FactResellerSales 
ADD CONSTRAINT PK_ResellerSales PRIMARY KEY CLUSTERED(SalesOrderNumber,SalesOrderLineNumber)

ALTER TABLE FactInternetSalesReason 
ADD CONSTRAINT PK_InternetSalesReason PRIMARY KEY CLUSTERED(SalesOrderNumber,SalesOrderLineNumber,SalesReasonKey)
------------------------------------------------------------------------------
-- Add Primary Key Constraint
------------------------------------------------------------------------------
ALTER TABLE DimCustomer
ADD CONSTRAINT FK_Customer_GeographyKey FOREIGN KEY(GeographyKey) REFERENCES DimGeography(GeographyKey)

ALTER TABLE DimReseller
ADD CONSTRAINT FK_Reseller_GeographyKey FOREIGN KEY(GeographyKey) REFERENCES DimGeography(GeographyKey)

ALTER TABLE DimEmployee
ADD CONSTRAINT FK_Employee_ParentEmployeeKey FOREIGN KEY(ParentEmployeeKey) REFERENCES DimEmployee(EmployeeKey)

ALTER TABLE FactResellerSales
ADD CONSTRAINT FK_Reseller_EmployeeKey FOREIGN KEY(EmployeeKey) REFERENCES DimEmployee(EmployeeKey)

ALTER TABLE FactResellerSales
ADD CONSTRAINT FK_Reseller_ProductKey FOREIGN KEY(ProductKey) REFERENCES DimProduct(ProductKey)

ALTER TABLE FactResellerSales
ADD CONSTRAINT FK_Reseller_OrderDateKey FOREIGN KEY(OrderDateKey) REFERENCES DimDate(DateKey) 

ALTER TABLE FactResellerSales
ADD CONSTRAINT FK_Reseller_ResellerKey FOREIGN KEY(ResellerKey) REFERENCES DimReseller(ResellerKey)

ALTER TABLE FactInternetSales
ADD CONSTRAINT FK_Internet_ProductKey FOREIGN KEY(ProductKey) REFERENCES DimProduct(ProductKey)

ALTER TABLE FactInternetSales
ADD CONSTRAINT FK_Internet_OrderDateKey FOREIGN KEY(OrderDateKey) REFERENCES DimDate(DateKey)

ALTER TABLE FactInternetSales
ADD CONSTRAINT FK_Internet_ShipDateKey FOREIGN KEY(ShipDateKey) REFERENCES DimDate(DateKey)

ALTER TABLE FactInternetSales
ADD CONSTRAINT FK_Internet_DueDateKey FOREIGN KEY(DueDateKey) REFERENCES DimDate(DateKey) 

ALTER TABLE FactInternetSales
ADD CONSTRAINT FK_Internet_CustomerKey FOREIGN KEY(CustomerKey) REFERENCES DimCustomer(CustomerKey) 

ALTER TABLE DimProduct 
ADD CONSTRAINT FK_Product_SubcatetoryKey FOREIGN KEY (ProductSubcategoryKey) REFERENCES DimProductSubcategory (ProductSubcategoryKey)

ALTER TABLE DimProductSubcategory 
ADD CONSTRAINT FK_Subcategory_CatetoryKey FOREIGN KEY (ProductCategoryKey) REFERENCES DimProductCategory (ProductCategoryKey)

ALTER TABLE FactInternetSalesReason 
ADD CONSTRAINT FK_InternetSalesReason_Order FOREIGN KEY (SalesOrderNumber,SalesOrderLineNumber) 
REFERENCES FactInternetSales (SalesOrderNumber,SalesOrderLineNumber)

ALTER TABLE FactInternetSalesReason 
ADD CONSTRAINT FK_InternetSalesReason_OrderReason FOREIGN KEY (SalesReasonKey) REFERENCES DimSalesReason (SalesReasonKey)

SELECT * FROM DimEmployee 
SELECT * FROM DimDate 
SELECT * FROM DimProduct
SELECT * FROM DimProductSubcategory
SELECT * FROM DimProductCategory
SELECT * FROM DimCustomer
SELECT * FROM DimReseller
SELECT * FROM DimGeography
SELECT * FROM DimSalesReason

SELECT * FROM FactResellerSales
SELECT * FROM FactInternetSales
SELECT * FROM FactInternetSalesReason

SELECT TOP 20 
       ProductKey,  -- 产品 KEY
       SalesOrderLineID, -- 订单明细编号
       SalesOrderNumber, -- 订单编号
       SalesOrderLineNumber, -- 订单详细编号
       UnitPrice, -- 单位价格
       ProductStandardCost, -- 标准成本
       SalesAmount, -- 销售额
       CarrierTrackingNumber, -- 承运人跟踪号,
       CustomerPONumber -- 采购订单号
FROM FactResellerSales
View Code

新建 SSIS 项目并创建数据源和数据源视图,在前面的几篇博客中已经讲到了 -

SSAS 系列 - 基于雪花模型的维度设计 - 主要已经处理了 DimProduct, DimProductSubcategory, DimProductCategory 这几张表。

SSAS 系列 - 自定义的日期维度设计 - 主要处理 DimDate 表。

SSAS系列 - 关于父子维度的设计 - 主要处理 DimEmployee 表。

SSAS 系列 - 实现Cube 以及角色扮演维度,度量值格式化和计算成员的创建 - 基于以上这些维度的度量值,角色扮演维度等概念。

在这个数据源视图中,新加入了 DimCustomer, DimReseller, DimGeography , DimSalesReason, FactInternetSalesReason 这几个维度表。

那么现在假设我需要直接通过 Geography 地理纬度直接聚合访问 Reseller Sales 和 Internet Sales 度量值应该如何实现?

其实这个我们完全可以借鉴 SSAS 系列 - 基于雪花模型的维度设计 这篇文章中的有关 Product Category, Product Subcategory, Product 这种层次结构的设计而应用到 DimGeography 和 DimCustomer 或者 DimGeography 和 DimReseller 的关系中,一样可以实现通过 Geography 直接聚合访问度量值。

但是今天我们将换一种方式来实现,直接通过定义维度和度量值关联的 Referenced 关系来实现。

先快速创建好 Geography, Customer 和 Reseller 这几个维度创建好,只有先创建好了维度才能定义与度量值组之间的关系。

Geography 层次结构。

Customer 维度非常简单,只包含了一个非键属性 Geography Key, Reseller 维度也是一样创建的。

在多维数据集 Cube 的维度设计面板中选择添加新的维度 - Customer, Geography 和 Reseller。

根据数据源视图可以直接看到 FactInternetSales 和 FactResellerSales 是分别引用了 DimCustomer 的 CustomerKey 和 DimReseller 的 ResellerKey。所以当添加完这几个维度的时候,在维度用法中可以看到 Customer 维度和Reseller 维度是已经自动关联上了度量值组,并且它们之间的关系就是最简单的 Regular 关系。

可以观察数据源视图 FactInternetSales 是通过 DimCustomer 间接的关联到 DimGeography 上的,因此我们可以说:

度量值组 Internet Sales "引用了" Geography 维度,Internet Sales 的 "引用维度 Reference Dimension" 是 Geography。

是通过 “中间维度” Customer 维度引用的。

需要引用到"引用维度上的属性" Reference dimension attribute 是 GeographyKey。

通过"中间维度的属性" Intermediate dimension attribute 是 GeographyKey。

编辑 Geography 和 Internet Sales 交叉的区域,定义它们之间的关联关系,在这里选择 Referenced 引用关系。

然后对照下面的操作完成关联,其实下面的这些关联动作就是实现了上述关联描述。

Materialize 具体化 - 默认选项。它是指我们这种引用关系一旦被具体化,那么维度成员和事实数据之间的关联关系将成员多维数据集物理存储结构中的一部分。效率上是提高了,但是处理的时间可能会增多。 如果不选择具体化,那么每次查询的时候多维数据集需要根据中间维度关联到引用维度然后再计算聚合值,效率上会变得很低。因为此时,多维数据集的物理存储结构中将只会保存事实数据和中间维度的关联关系。

下图同样建立了 Reseller Sales 与 Reseller 维度之间的引用关联关系。

记住引用关联关系的图标,以后打开已有项目时通过图标就知道维度和度量值组是什么关联方式了。

部署并浏览数据,直接通过 Geography 维度来浏览 Internet Sales Amount 和 Reseller Sales Amount 度量值的数据。

更多 BI 文章请参看 BI 系列随笔列表 (SSIS, SSRS, SSAS, MDX, SQL Server) 如果觉得这篇文章看了对您有帮助,请帮助推荐,以方便他人在 BIWORK 博客推荐栏中快速看到这些文章。