多种存储过程分页方法的速度对比

一.TempTable(临时表)

The methods in both articles could be optimized with just the Primary Key data copied to the temp table and then doing the join with the main query. Therefore, the essence of this method would be the following

CREATE
TABLE #Temp (
ID int
IDENTITY
PRIMARY
KEY,
PK  /* here goes PK type */
)
INSERT
INTO #Temp SELECT PK FROM
Table
ORDER
BY SortColumn
SELECT ... FROM
Table
JOIN #Temp temp ON
Table.PK = temp.PK ORDER
BY temp.ID 

WHERE ID > @StartRow AND ID < @EndRow

The method can be optimized further by copying the rows to the temp table until the end paging row is reached (SELECT TOP EndRow...), but the point is that in the worst case – for a table with 1 million records you end up with 1 million records in a temp table as well. Considering all this and having looked upon the results in the article above, I decided to discard this method from my tests.

二.Asc-Desc

This method uses default ordering in a subquery and then applies the reverse ordering. The principle goes like this

DECLARE @temp TABLE (
PK  /* PK Type */
NOT
NULL
PRIMARY
)
INSERT
INTO @temp
SELECT
TOP @PageSize PK FROM (
SELECT
TOP (@StartRow + @PageSize)
PK,
SortColumn /*If sorting column is defferent from the PK, SortColumn must 

be fetched as well, otherwise just the PK is necessary */
ORDER
BY SortColumn /* default order – typically ASC */)
ORDER
BY SortColumn /* reversed default order – typically DESC */
SELECT ... FROM
Table
JOIN @Temp temp ON
Table.PK = temp.PK 

ORDER
BY SortColumn /* default order */

三.RowCount

The base logic of this method relies on the SQL SET ROWCOUNT expression to both skip the unwanted rows and fetch the desired ones:

DECLARE @Sort /* the type of the sorting column */
SET
ROWCOUNT @StartRow
SELECT @Sort = SortColumn FROM
Table
ORDER
BY SortColumn
SET
ROWCOUNT @PageSize
SELECT ... FROM
Table
WHERE SortColumn >= @Sort ORDER
BY SortColumn

四.SubQuery

There are 2 more methods I’ve taken into consideration, and they come from different resources. The first one is well known triple query or the SubQuery method. The most thorough approach is the one I’ve found in the following article
http://www.winnetmag.com/Article/ArticleID/40505/40505.html
Although you'll need to be subscribed, a .zip file with the SubQuery stored procedure variations is available. The Listing_04.SELECT_WITH_PAGINGStoredProcedure.txt file contains the complete generalized dynamic SQL. I used a similar generalization logic with all other stored procedures in this text. Here is the principle followed by the link to the whole procedure (I shortened the original code a bit, because a recordcount portion was unnecessary for my testing purposes).

SELECT ... FROM
Table
WHERE PK IN
(SELECT
TOP @PageSize PK FROM
Table
WHERE PK NOT
IN
(SELECT
TOP @StartRow PK FROM
Table
ORDER
BY SortColumn)
ORDER
BY SortColumn)
ORDER
BY SortColumn

五.Cursor(游标)

I’ve found the last method while browsing through the Google groups, you can find the original thread here. This method uses a server-side dynamic cursor. A lot of people tend to avoid cursors, they usually have poor performance because of their non-relational, sequential nature. The thing is that paging IS a sequential task and whatever method you use you have to somehow reach the starting row. In all the previous methods this is done by selecting all rows preceding the starting row plus the desired rows and then discarding all the preceding rows. Dynamic cursor has the FETCH RELATIVE option which does the “magic” jump. The base logic goes like this

DECLARE @PK /* PK Type */
DECLARE @tblPK TABLE (
PK /* PK Type */
NOT
NULL
PRIMARY
KEY
)
DECLARE PagingCursor CURSOR DYNAMIC READ_ONLY FOR
SELECT @PK FROM
Table
ORDER
BY SortColumn
OPEN PagingCursor
FETCH RELATIVE @StartRow FROM PagingCursor INTO @PK
WHILE @PageSize > 0
AND @@FETCH_STATUS = 0
BEGIN
INSERT @tblPK(PK) VALUES(@PK)
FETCH NEXT FROM PagingCursor INTO @PK
SET @PageSize = @PageSize - 1
END
CLOSE PagingCursor
DEALLOCATE PagingCursor
SELECT ... FROM
Table
JOIN @tblPK temp ON
Table.PK = temp.PK 

ORDER
BY SortColumn

Generalization of Complex Queries

As pointed out before, all the procedures are generalized with dynamic SQL, thus, in theory, they can work with any kind of complex query. Here is a complex query sample that works with Northwind database.

SELECT Customers.ContactName AS Customer,
Customers.Address + ', ' + Customers.City + ', ' + 

Customers.Country AS Address,
SUM([Order Details].UnitPrice*[Order Details].Quantity) AS


[Total money spent]
FROM Customers
INNER
JOIN Orders ON Customers.CustomerID = Orders.CustomerID
INNER
JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID
WHERE Customers.Country <> 'USA'
AND Customers.Country <> 'Mexico'
GROUP
BY Customers.ContactName, Customers.Address, Customers.City, 

Customers.Country
HAVING (SUM([Order Details].UnitPrice*[Order Details].Quantity))>1000
ORDER
BY Customer DESC, Address DESC

The paging stored procedure call that returns the second page looks like this

EXEC ProcedureName
/* Tables */
'Customers
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
INNER JOIN [Order Details] ON Orders.OrderID = [Order Details].OrderID',
/* PK */
'Customers.CustomerID',
/* ORDER BY */
'Customers.ContactName DESC, Customers.Address DESC',
/* PageNumber */
2,
/* Page Size */
10,
/* Fields */
'Customers.ContactName AS Customer,
Customers.Address + '', '' + Customers.City + '', '' + Customers.Country 

AS Address,
SUM([Order Details].UnitPrice*[Order Details].Quantity) AS [Total money spent]',
/* Filter */
'Customers.Country <> ''USA'' AND Customers.Country <> ''Mexico''',
/*Group By*/
'Customers.CustomerID, Customers.ContactName, Customers.Address, 

Customers.City, Customers.Country
HAVING (SUM([Order Details].UnitPrice*[Order Details].Quantity))>1000'

Note that in the original query, aliases are used in the ORDER BY clause. You can't do that in paging procedures, because the most time-consuming task in all of them is skipping rows preceding the starting row. This is done in various ways, but the principle is not to fetch all the required fields at first, but only the PK column(s) (in case of RowCount method the sorting column), which speeds up this task. All required fields are fetched only for the rows that belong to the requested page. Therefore, field aliases don't exist until the final query, and sorting columns have to be used earlier (in row skipping queries).

The RowCount procedure has another problem, it is generalized to work with only one column in the ORDER BY clause. The same goes for Asc-Desc and Cursor methods, though they can work with several ordering columns, but require that only one column is included in the PK. I guess this could be solved with more dynamic SQL, but in my opinion it is not worth the fuss. Although these situations are highly possible, they are not that frequent. Even if they are, you can always write a separate paging procedure following the principles above.

具体请参看:
http://www.codeproject.com/KB/aspnet/PagingLarge.aspx

在旧版的 SQL Server 2000 和其它厂牌的数据库，大多未内建「排序或自动给号」的函数；程序员在撰写分页的 Stored Procedure 时，可能还得用 IDENTITY 自动增号 (流水号) 的功能，先建立一个存放「临时行号 (自动增号、流水号)」的「临时数据表 (Temporary Table)」；但此种大量建立「临时数据表」的做法，会影响 DB server 的 performance，并非「分页」处理的最佳解法。而在 SQL Server 2005 中，已新增了一个专门用来「排序和自动给号」的 ROW_NUMBER 函数，可对已从数据库撷取的数据，再赋予一个「自动编号」的「字段；列 (column)」，且听说性能颇优，也更有利于 ASP.NET 分页的处理。
我们若在 SQL Server 2005 的 Northwind 数据库中，执行下列的 SQL 语句 (取自 SQL Server 在线丛书)：

WITH 暂存表 AS
(SELECT OrderID, CustomerID, OrderDate, ROW_NUMBER() OVER(ORDER BY OrderID DESC) AS 字段编号
FROM Orders)
SELECT * FROM 暂存表 WHERE 字段编号 BETWEEN 5 AND 13;

则可由 ROW_NUMBER 函数模拟的临时数据表中，取得我们写分页时，所需要的某个范围内的数据记录笔数。以上图 2 来说，即只撷取 DESC 反向排序后、5 至 13 号的这九笔记录。且这种 WITH 的 T-SQL 新语法，又称为「一般数据表表达式 (CTE, common_table_expression)」，也是与 ROW_NUMBER 函数搭配，撰写分页程序的精要所在，在 SQL Server 在线丛书也有相关介绍。
版工还在网络上的论坛 [5]，看到别人提供的语法，先用 ROW_NUMBER 函数取得数据的顺序编号，再用 WHERE 条件过滤：
SELECT TOP(分页大小) *
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY 排序条件) AS RowNo FROM 数据表
) AS T
WHERE RowNo > ((目前页数 - 1) * 分页大小)

假设我们的 GridView 每页要显示 10 笔记录，user 目前在 GridView 的第 20 页，当他单击「下一页」或第 21 页的页码时，就去 SQL Server 2005 撷取第 201 ~ 210 笔记录，在 Stored Procedure 里即执行下列 SQL 语句。撷取结果如下图 3 所示：
SELECT TOP(10) *
FROM
(
SELECT OrderID, CustomerID, OrderDate, ROW_NUMBER() OVER (ORDER BY OrderID DESC) AS 字段编号 FROM Orders
) AS 暂存表
WHERE 字段编号 > ((21 - 1) * 10)
但以 ROW_NUMBER 函数撰写分页的话，亦要考虑系统以后无法更换数据库的问题；且用 ROW_NUMBER 写好的 Stored Procedure，在其它 project 中，也无法重复使用于他牌的数据库，或旧版的 SQL Server。就如同 ADO.NET 2.0 中，有新增一些针对 SQL Server 2005 可提升 performance 的 .NET 数据处理语法，但使用前应先评估，系统日后是否有移植或维护上的问题。

/*基于SQL SERVER 2005 */
CREATE PROCEDURE [dbo].[Zhzuo_GetItemsPage2005]
    @PageIndex INT,      /*页面索引,0为第一页*/
    @PageSize INT,      /*每页记录数*/
    @RecordCount INT OUT,     /*总记录数*/
    @PageCount INT OUT
AS /*获取记录数*/
SELECT @RecordCount = COUNT(*) FROM Production.Product
/*计算页面数据*/
SET @PageCount = CEILING(@RecordCount * 1.0 / @PageSize)
/* 基于SQL SERVER 2005 */
SELECT SerialNumber,ProductID,Name FROM
(SELECT ProductID,Name,ROW_NUMBER() OVER (ORDER BY ProductID DESC) AS SerialNumber FROM Production.Product ) AS T
WHERE T.SerialNumber > (@PageIndex * @PageSize) and T.SerialNumber <= ((@PageIndex+1) * @PageSize)
第三个存储过程使用2005下新的功能，实现的分页存储过程功能更加简单明了，而且更加容易理解。注意这里的ProductID为主键,根据ProductID进行排序生成ROW_NUMBER，通过ROW_NUMBER来确定具体的页数。

利用select top 和 select max(列键)

create procedure proc_paged_with_selectMax --利用select top and select max(列)
(
    @pageIndex int, --页索引
    @pageSize int    --页记录数
)
as
begin
set nocount on;
    declare @timediff datetime
    declare @sql nvarchar(500)
    select @timediff=Getdate()
    set @sql='select top '+str(@pageSize)+' * From tb_TestTable where(ID>(select max(id) From (select top '+str(@pageSize*@pageIndex)+' id From tb_TestTable order by ID) as TempTable)) order by ID'
    execute(@sql)
    select datediff(ms,@timediff,GetDate()) as 耗时
set nocount off;
end

在使用SQL Server（SQL Server 2000）进行排序我们经常使用的是TOP关键字，但是使用TOP有一个问题，就是TOP后面只支持数值而不支持数值型的变量，这样就没有办法在存储过程中来支持这种动态排序的要求。或者使用Exec来执行构造SQL来执行，但这样执行的效率低而且不够灵活。

实际上，在SQL Server 2000中我们完全可以使用ROWCOUNT关键字解决这个问题。

ROWCOUNT关键字作用是可以直接指定需要返回记录集的行数。

1、使用ROWCOUNT查询前100行记录。

DECLARE @rc INT

SET @rc = 100
SET ROWCOUNT @rc

SELECT * FROM emp

使用TOP可以得到同样的结果

SELECT TOP 100 FROM emp

2、在INSERT INTO..SELECT中使用ROWROUNT。

DECLARE @rc INT

SET @rc = 100
SET ROWCOUNT @rc

INSERT INTO cust (cname)
        SELECT cname=emp_name FROM emp

3、在执行UPDATE和DELETE时使用ROWCOUNT。

      因为UPDATE和DELETE无法直接使用ORDER BY语法，如果使用ROWCOUNT，将按照主键顺序从前往后操作。

DECLARE @rc INT
SET @rc    = 100

SET ROWCOUNT @rc
DELETE FROM emp

不过也有解决办法，只要能够使用ORDER BY关键字就可以了，比如说直接用含ORDER BY的子句，或者先使用ORDER BY语法把需要操作的标识列存为一个临时表或表变量，然后再操作语句中使用IN或EXISTS关键字。

DECLARE @rc INT

SET @rc = 100
SET ROWCOUNT @rc

DECLARE @tmp TABLE(ID INT)
INSERT INTO @tmp
SELECT ID FROM emp ORDER BY cid[ASC/DESC]

DELETE FROM emp WHERE ID IN (SELECT ID FROM @tmp )

4、对于ROWCOUNT的设置是与Session有关的。如果占用了一个Session，那么对应的这个数据库Session将使用最近一次设置的ROWCOUNT，直到Session结束或者修改了ROWCOUNT。

5、在用户自定义函数中不能使用ROWCOUNT。

6、取消ROWCOUNT设置。

使用这样的语句即可取消ROWCOUNT了，因为如果不取消之后所有的查询返回的结果集行数都会受此影响。

SET ROWCOUNT 0

我不知道为什么在联机帮助中说，写存储过程的时候应该注意尽量避免使用ROWCOUNT，而建议使用TOP。难道MS不知道TOP关键后面的数字不能为变量吗？也许MS是出于担心开发者忘记了取消ROWCOUNT而影响正常的实现。

8、总结

有了ROWCOUNT关键字后就可以非常方便的实现变量形式的排序问题了。

posted on 2008-09-16 23:33 webenvoy 阅读(495) 评论(0) 收藏举报

刷新页面返回顶部

华娱网络

多种存储过程分页方法的速度对比

导航

公告