Relational Engine之Server Architecture
Server Architecture
Figure "Server
Architecture" illustrates the main components of the relational engine
portion of SQL Server. The illustrated components can be organized into
three groupings of subsystems.
Server Architecture
Open Data Services(ODS)
ODS manages the network: it
listens for new connections, cleans up failed connections, acknowledges
"attentions" (cancellations of commands), coordinates threading
services to SQL Server, and returns result sets, messages, and status
values back to the client by using TDS.
Compilation components: On the left are the components that address compiling queries.
Command parser handles language events raised by ODS. It checks for
proper syntax and translates Transact-SQL commands into an internal
format that can be operated on. This internal format is known as a query tree. If the parser doesn't recognize the syntax, a syntax error is immediately raised and identifies where the error occurred.
The optimizer is the responsibility of the query optimizer to take the
nonprocedural request expressed in SQL and translate it into a set of
disk I/Os, filtering, and other procedural logic that efficiently
satisfies the request.(optimizer只负责非程序化的SQL语句,即严格意义上的SQL、DML语句)< /font>
Execution components:
On the right-hand side is the execution infrastructure. This is really
a much smaller set of facilities. Once the compilation components have
finished their job, they have created something that can be directly
executed with just a few services thrown in.
SQL Manager:
In the middle of the figure is something called the SQL Manager. It
controls the flow of everything inside the SQL Server. RPC messages are
handled by the SQL Manager. The pseudo system stored procedures are
also logically a part of the SQL Manager. SQL statements typically
coming in as TDS SQL Language messages are processed directly from the
compilation side. Results are sent back out by components in the
execution engine calling ODS to format the TDS results messages.(SQL Manager负责处理程序化的TSQL语句,即严格意义上的Transact-SQL或TSQL。也包括对其它的RPC events以及pseudo system stored procedures的处理)
Expression service.
The Expression Services library is a component that does data
conversion, predicate evaluation (filtering), and arithmetic
calculations. It also works with the ODS layer to format output results
into TDS messages.
For example:SELECT @MyQty=Qty/10 FROM MyTable
The expression manager copies the value of qty from the row set returned by the storage engine, multiplies it by 10, and stores the result in @myqty.
Other components:
Catalog services component,
which handles data such definition statements as CREATE TABLE, CREATE
VIEW, and so forth. Catalog services also handles system tables,
materializing those that are really pseudo tables. The catalog services
component is located primarily in the relational engine, but actually
about one-third of it operates within the sphere of the storage engine,
so it is treated as a shared component.
User Mode Scheduler (UMS):
SQL Server's own internal scheduler for fibers and threads. There is a
very sophisticated internal mechanism for scheduling how work is
assigned to either fibers or threads, depending on how you've
configured the server, and allows SQL Server to do the proper load
balancing across processors on an SMP system. The UMS also keeps SQL
Server from thrashing by running too many threads concurrently.
Finally, there are the system procedures(指那些非pseudo system stored
procedures) that people are familiar with; logically they are part of
the relational engine. These system procedures are treated as part of
the server because their purpose is to expose primitive server
capabilities, like the system tables, at a higher and more appropriate
level for application use. 应用程序使用这些存储过程,可以避免系统表变化时应用程序变得不可用。
Client/Server Interactions When Processing SQL
Ad hoc(non-parameterized SQL) & default result set:
Here is an example of an ODBC call:(There is an almost direct
equivalent of this call for OLE-DB, which we won't look at because the
processing is practically identical to the ODBC call.)
SQLExecDirect(hstmt, "SELECT * FROM parts where partid = 7", SQL_NTS)
This is a classic example of ad hoc SQL. Clients all provide some
notion of cursors, so one of the questions that must be asked by the
client internally is what kind of result set or what kind of cursor is
the programmer asking for. The fastest type is what the documentation
calls a default result set. This type of cursor has also been historically called a firehouse cursor
and sometimes it isn't even thought of as a cursor at all. After the
SQL request is sent to the server, the server starts sending results
back to the client and won't stop sending results until the client has
consumed the entire set. This is like a giant firehouse pumping data
out at the client.
RPC(parameterized SQL):
Once the client has determined that it's a default result set, the next
step is to determine if there are any parameter markers. One of the
options when using this SQLExecDirect call in ODBC (and its equivalent
in OLE-DB) is that, instead of supplying a specific value like 7 in the
WHERE clause, you can pass in a parameter marker by replacing the
constant with a question mark, as shown here:
SQLExecDirect(hstmt, "SELECT * FROM parts where partid = ?", SQL_NTS)
Note that you must separately provide the actual value of the parameter.
The client needs to know if there are any parameter markers present in
this SQL statement, or is it true ad hoc. That will affect what the
client does with this statement internally and determines what is
actually sent as messages to the SQL Server. In the case where there is
no question mark, it is clear that the client simply wants to send this
request as SQL Language TDS messages, and then the client will sit at
the end of the firehose and take the results back. The client can then
return the result to the application based on the application's
parameters. The client's internal processing choices can be a little
obscure in terms of what you request through the ODBC or OLE DB APIs.
For example, an application program doesn't directly request a default
result set. Instead, in ODBC, if you ask for a cursor that is read-only
and forward-only and gives you one row at a time, that defines it to be
a firehose cursor (a default result set) as far as the client internals
goes.
There is one main problem with a firehose cursor. The client can't send any other SQL statements down to the server until it has consumed all the rows. Because the result set may have a very large number of rows, some applications won't work well with firehose cursors.
Prior to SQL Server version 7.0, the SQLExecDirect call would be
processed in much the same way whether or not parameter markers were
substituted for the constant. If you specified a parameter marker, the
client would actually take the value that you supplied through a
different call and plug it in where the question mark was. The new
statement with the substituted value was then sent down as an ad hoc
SQL statement. There was no benefit of using parameterized SQL at the
server. In SQL Server 7.0,
however, if parameter markers are used with SQLExecDirect, the TDS
message sent down to SQL Server isn't a SQL language message. Instead,
it's sent down to server using the sp_executesql procedure, so it's an
RPC as far as the TDS protocol is concerned. At the client, the result
is basically the same. The client will get the firehose of data back.
Scrollable cursor: If
you don't want this firehose of data back, you can always use a block
or a scrollable cursor. In this case, the flow becomes very different.
A call is made to the sp_cursoropen entry point (one of these
pseudo-stored procedures) passing in the SQL text. The sp_cursoropen
manipulates the SQL to add additional logic to enable it to scroll, it
potentially redirects some results into a temp table, and then it gives
a response with a handle to the cursor indicating that the cursor is
now open. Still outside of the programmer's control, the client calls
sp_cursorfetch, brings down one or more rows to the client that are
then returned to the user application. The client can also use
sp_cursor to reposition the cursor, or change certain statistics. When
you're done processing the cursor, the client will call sp_cursorclose.
Fast forward-only cursor:
Let's examine a simple case where we're just returning one row to the
client. In the default result set case, you have one round trip of
messages from the client to the server, and back again. There is the
SQL message (or sp_executesql) going down to the server, and then the
results coming back. In the case of a (nonfirehose) cursor for the same
one row, you see what you've traditionally seen with SQL Server. There
is a round-trip to do the open, a round-trip to do the fetch, and a
round-trip to do the close. The process has used three times as many
messages as the default result set would have used. In SQL Server 7.0,
there is something called a fast forward-only cursor, which uses the
same cursor infrastructure. It doesn't behave like a firehose, because
it doesn't require that you process all the result rows prior to
sending any additional SQL messages. So if you bring back five rows and
there is still more data, you can still send an update down to the
server.
A fast forward-only cursor is faster on the server than a regular
cursor, and it lets you specify two additional options. One is called autofetch and one is called autoclose.
Autofetch will return the first set of rows as part of the response
message to the open. Autoclose automatically closes the cursor after
the last row is read. Because it is forward-only and read-only, you
can't scroll back. SQL Server simply passes a message back with that
last set of data saying that the cursor is closed. If you're using fast
forward-only cursors, you can get the communication down to the same
one round-trip in messages for small numbers of rows. If you have large
numbers of rows, you're at least only paying the additional cost for
each block of rows. Cursor processing has gotten a lot closer to that
default result set if you use fast forward-only cursors.
Flow of Client/server interactions
The Prepare/Execute Model
In
addition to the execute direct model (invoked in ODBC with
SQLExecDirect), there is another execution model exposed in ODBC and
OLE-DB, called the prepare/execute model. Defining the SQL to be
executed is done as a separate step from actually executing the SQL.
Here's an example in ODBC:
SQLPrepare(hstmt, "SELECT * FROM parts where partid = ?", SQL_NTS)
SQLExecute(hstmt)
SQLExecute(hstmt)
In 7.0, there are two pseudo system stored procedures that provide a
native interface. For the prepare call, we again take a look at what
kind of cursor it is, and then we either call sp_prepare or
sp_cursorprepare. That does the compilation part of processing the SQL
or the store procedure, but doesn't actually execute the plan. Instead,
the pseudo system stored procedure returns a handle to the plan. Now
your application can repeatedly re-execute the SQL, passing in
different parameter values, for example, without needing to recompile.
In SQL Server 7.0, the prepare/execute method is a native feature of
SQL Server. After the SQL statement has been prepared, it is executed.
In the case of default result sets, that is just by the application
programmer calling sp_execute with the handle supplied from the prepare
operation, and then the statement runs. In the case of cursors, it
looks exactly like the other cursor processing and, in fact, it has the
same characteristics, including allowing autofetch and autoclose if the
cursor is fast forward-only.
Prepare / Execute Model
Calling Stored Procedures
Stored procedures are generally invoked from ODBC and OLE-DB by sending
a SQL statement down to the SQL Server that uses the ODBC canonical
CALL syntax to call a procedure. It might look something like this:
SQLExecDirect(hstm, "{call addorder(?)}", SQL_NTS)
In the case of a default result set, it's a simple flow because this is
what RPC messages were originally intended for. The client sends an RPC
message to the server and gets back the results coming from the
procedure. If it's a cursor, it's a little more complicated. The client
calls sp_cursoropen, like with any other cursor. Sp_cursoropen has some
logic built into it to detect whether the stored procedure contains
only a single SELECT statement. If so, a cursor is opened on that
SELECT. If it's not a single SELECT statement in the procedure, then
the client gets back a message with an indicator that says "we opened
this for you, but we're going to stream the results back to you as a
firehose and you can present that to the user."
Calling stored procedures
The SQL Manager
The SQL Manager is the driving force in a lot of the server processing. It's really at the heart of the server. The SQL Manager deals with all of the requests to run stored procedures.
It manages the procedure cache, it has the pseudo system stored
procedures, and it's involved in auto parameterization of ad hoc
queries, which we'll discuss shortly.
Typically, the SQL manager is invoked with an RPC message,
as you ask SQL Server to do some work for you. However, when a SQL
language statement comes in through a SQL message and goes into the
compilation side of the engine, the SQL manager is also involved. It can be involved when a procedure or a batch has an EXEC statement in it,
because the EXEC is actually calling the SQL manger. If the SQL
statement passes the autoparameterization template, then the SQL
manager is called to parameterize the queries. It's also called when ad
hoc queries need to be placed in the cache.
<Microsoft SQL Server Query Processor Internals and Architecture>
浙公网安备 33010602011771号