【SQLpassion Performance Tuning Training Plan】 - Week 12: Parallel Execution Plans
Parallel(并行; vt. 使…与…平行) Execution Plans
In this installment of the SQLpassion Performance Tuning Training Plan I want to talk in more detail about Parallel Execution Plans in SQL Server. Executing a query with a parallel execution plan means that multiple threads are used by SQL Server to perform the necessary operators from the execution plan. In the first step I will give you a general introduction to the most common operators used in a parallel execution plan, and afterwards we talk in more detail about how SQL Server decides if a parallel plan make sense.
Parallel Operators
The most common misconception(n. 误解;错觉;错误想法) about parallel execution plans is that only a given set of threads (e.g. 8 threads on an 8-core machine) is used for the whole plan. That's not really true, because SQL Server assigns(v. 分配,分派) multiple worker threads to all the operators that are parallelism-aware. This means that a larger parallel plan can consume a huge amount of threads. SQL Server distinguishes(vt. 区分;辨别;) between two kinds of operators in a parallel plan: parallelism-aware operators, and so-called Exchange Operators:
- A lot of the traditional operators are able to perform their work with multiple threads, they are parallelism-aware: Index Scan, Index Seek, Nested Loop, Hash Join, Sort, etc.
- Exchange Operators are used to distribute(vt. 分配;散布;分开;把…分类) and merge rows between multiple threads in a parallel execution plan.
SQL Server implements the following 3 Exchange Operators:
- Distribute Streams: Used to transition(n. 过渡;转变;) from a single-threaded region to a multi-threaded region(n. 区域; 地区;范围;部位) in a parallel plan
- Repartition(vt. 再分配,重新分配) Streams: Used to redistribute rows between threads (e.g. when the upfront(adj. 预付的;在前面的;正直的,坦率的) operator is a parallel Hash Join)
- Gather Streams: Used to transition from a multi-threaded region to a single-threaded region in a parallel plan
When you look at a parallel execution plan you will always see a combination of both kinds of operators. Every execution plan must produce a single-threaded result, therefore you will always find at the end of a parallel execution plan a Gather Streams operator.

If the Query Optimizer produces a parallel plan also depends if you are using query constructs that prevents a parallel plan, like:
- T-SQL and SQLCLR UDFs
- Build-In functions like OBJECT_ID(), ERROR_NUMBER(), @@TRANCOUNT
There are also multiple query constructs which are forcing a serial zone(n. 地带;地区;) in your parallel plan:
- System Table Scans
- Sequence Functions
- Backward Scans
- Recursive(adj. [数] 递归的;循环的) Queries
- TVFs
- TOP
The less serial zones you have in your parallel plan, the faster your query will be. Think about that the next time when you write your queries.
When to go parallel?
Every execution plan gets a so-called Cost Factor assigned by SQL Server. The cost factor is just a simple number that tells SQL Server how expensive an execution plan is. The higher the number, the higher the associated(adj. 关联的;联合的) costs for running that execution plan.
SQL Server has a configuration option called Cost-Threshold(n. 入口;门槛;开始;极限;临界值) for Parallelism, which defines the cost factor at which the Query Optimizer also considers about parallel plans. By default that configuration option is set to 5, which means that a query with a higher cost factor will be executed in parallel as long as parallelism is possible.
When a parallel plan is compiled(adj. 编译的) by the Query Optimizer, the option Max Degree of Parallelism (MAXDOP) defines how many threads are used for every parallel operator in the execution. As I have already said earlier, every operator within the parallel execution plan runs with multiple threads, not only the complete plan. Of course, threads can be shared and reused by upfront operators in the parallel execution plan. The following picture shows you the two configuration options on the SQL Server Instance level.

By default the MAXDOP option is set to 0. Therefore SQL Server will by default parallelize a query across all CPU cores that are assigned to SQL Server. This can lead to performance problems if you are dealing with a NUMA system (Non(adv. 非,不 n. 投反对票的人,反对票) Uniform(adj. 统一的;一致的;相同的;) Memory Access). A good best practice is to limit the MAXDOP option to the number of cores that you have within one NUMA node (incl(abbr. 包括(including,included);包括的(inclusive);包围(enclosure)). cores that are coming from Hyperthreading(超线程)). SQL Server will then alsoensure that parallel plans stay within a NUMA node.
Summary
In this installment of the SQLpassion Performance Tuning Training Plan I have talked about Parallel Execution Plans in SQL Server. You have learned about the various operators that are involved(adj. 有关的;卷入的;复杂的v. 涉及;使参与;) in parallel plans, and how you can configure parallelism in SQL Server. If you are interested in more details about parallel plans, I also suggest that you read the following online articles about it:
- https://www.simple-talk.com/sql/learn-sql-server/understanding-and-using-parallelism-in-sql-server/
- http://sqlblog.com/blogs/adam_machanic/archive/2010/05/24/sql-university-parallelism-week-introduction.aspx
With this newsletter the 3rd month of the SQLpassion Performance Tuning Training Plan is already over, and in the next month I will be talking about Statistics in SQL Server.
Stay tuned!
浙公网安备 33010602011771号