【Weekly SQLpassion Newsletter】Actual Number of Rows are not always accurate
Today’s blog posting is quite an interesting one, because I want to show you a concrete example where the Actual Number of Rows in an Actual Execution Plan is WRONG! Yes, you have read correctly: I’m talking here about the Actual Number of Rows, and not about the Estimated Number of Rows, which is always somewhat off, because it is only estimated during Cardinality Estimation.
Counting rows in the wrong way
A few weeks ago I worked with a customer who had a really interesting phenomenon: SQL Server returned a lot of rows, but when they looked into the Execution Plan, the Actual Number of Rows was a lot higher. You don’t trust me? Look at the following Actual Execution Plan within SQL Server Management Studio:
Within SQL Server Management Studio, SQL Server returned 110561 rows here, but the Actual Number of Rows within the Actual Execution Plan is much higher: 118968!
Wow, this is quite an interesting behavior. And I don’t think that this behavior is by design. The problem lies in a Filtered Non-Clustered Columnstore Index, which was introduced with SQL Server 2016. You can create a Filtered Non-Clustered Columnstore Index in an OLTP (Online Transaction Processing) workload to support so-called Real-Time Data Analytics scenarios. Instead of creating a dedicated Data Warehouse database, you perform your analytics workload directly on your OLTP tables, which can also be changed concurrently.
This means that a Non-Clustered Columnstore Index is also updatable since SQL Server 2016. And in that behavior lies the problem: when SQL Server calculates and returns the Actual Number of Rows, the deleted rows from the Non-Clustered Columnstore Index are also counted. Therefore the Actual Number of Rows in the Execution Plan is higher than the real row count that is returned from the query.
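To see where SQL Server keeps track of rows deleted from a columnstore index, a query along the following lines may help. It is only a sketch: it assumes the repro table Sales.SalesOrderDetail2 from the script in this post exists and has not been dropped yet, and it uses the sys.internal_partitions catalog view that is available since SQL Server 2016. Whether the row counts shown there account exactly for the difference seen in the Execution Plan is an assumption I have not verified.

```sql
-- Sketch only: list the internal rowsets (delete buffer, delete bitmap, delta store)
-- behind the columnstore index of the repro table.
-- Assumes Sales.SalesOrderDetail2 exists and SQL Server 2016 or later.
SELECT
    i.name AS index_name,
    ip.internal_object_type_desc,  -- e.g. COLUMN_STORE_DELETE_BUFFER, COLUMN_STORE_DELETE_BITMAP
    ip.row_group_id,
    ip.rows
FROM sys.internal_partitions AS ip
INNER JOIN sys.indexes AS i
    ON i.object_id = ip.object_id
    AND i.index_id = ip.index_id
WHERE ip.object_id = OBJECT_ID('Sales.SalesOrderDetail2')
GO
```

Run it once before and once after the DELETE statement to compare the two result sets.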
The following T-SQL code shows a simple scenario with which you can reproduce this behavior.
-- Create a table copy
SELECT * INTO Sales.SalesOrderDetail2 FROM Sales.SalesOrderDetail
GO

-- Create a Non-Clustered ColumnStore Index for the "cold" data partition
CREATE NONCLUSTERED COLUMNSTORE INDEX idx_ColdData ON Sales.SalesOrderDetail2(ProductID)
WHERE ModifiedDate < '20140601'
GO

-- Let's delete some rows
DELETE FROM Sales.SalesOrderDetail2
WHERE ModifiedDate >= '20140501' AND ModifiedDate < '20140601'
GO

-- These rows are not logically deleted
SELECT * FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID('Sales.SalesOrderDetail2')
GO

-- Uses the Non-Clustered ColumnStore Index to access the data
SELECT ProductID FROM Sales.SalesOrderDetail2
WHERE ModifiedDate < '20140601'
GO

-- Clean up
DROP TABLE Sales.SalesOrderDetail2
GO
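If you want to compare both numbers without clicking through the graphical plan, something like the following sketch could be run before the clean-up step. It assumes Sales.SalesOrderDetail2 still exists. SET STATISTICS XML returns the Actual Execution Plan as XML, where the ActualRows attribute of the RunTimeCountersPerThread elements carries the number reported per operator. The COUNT_BIG query is my own addition for cross-checking and may get a different plan than the original SELECT.

```sql
-- Sketch only: run before the DROP TABLE, while Sales.SalesOrderDetail2 still exists.

-- Return the Actual Execution Plan as XML together with the result set
SET STATISTICS XML ON
GO

SELECT ProductID FROM Sales.SalesOrderDetail2
WHERE ModifiedDate < '20140601'
GO

SET STATISTICS XML OFF
GO

-- Cross-check: count the rows that qualify for the same predicate
SELECT COUNT_BIG(*) AS ReturnedRows FROM Sales.SalesOrderDetail2
WHERE ModifiedDate < '20140601'
GO
```

Compare the ActualRows value on the Columnstore Index Scan operator in the plan XML with the ReturnedRows value from the second query.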
Summary
Never, ever trust anyone, especially not a piece of software! I find this behavior quite funny, because until I encountered this specific scenario I always thought that the Actual Number of Rows is calculated on the fly during Query Execution. But it seems (at least in combination with a Non-Clustered Columnstore Index) that this is not always the case.
Thanks for your time,
-Klaus