SQL Server essence

  博客园 :: 首页 :: 新随笔 :: 联系 :: 订阅 订阅 :: 管理 ::
  37 随笔 :: 8 文章 :: 2 评论 :: 0 引用

2010年1月19日 #

By Peter He, 2010/01/15 (first published: 2008/09/01)

Total article views: 22571 | Views in the last 30 days: 8176
 Rate this |   Join the discussion |  Briefcase |   Print
Local Temporary Table and Table Variable

There are lots of articles, blogs, forums discussed the differences and similarities between table variables and local temporary tables (ref. as temp tables hereafter). E.g.:

A summarization of table variable vs. temp table is attached in the appendix. Instead of repeating those well known facts, I'll focus on several differences that are hardly mentioned so far between table variable and temp table definitions.

XML Collection

The AdventureWorks sample database has an xml collection HumanResources.HRResumeSchemaCollection, which is used by the [Resume] column in the table HumanResources.JobCandidate.

The following script queries all the resumes into a table variable:

USE AdventureWorks
GO
DECLARE @TestXmls TABLE
(JobCandidateID int NOT NULL, [Resume] xml(HumanResources.HRResumeSchemaCollection) PRIMARY KEY CLUSTERED(JobCandidateID))

INSERT @TestXmls (JobCandidateID,[Resume])

SELECT JobCandidateID,[Resume] FROM HumanResources.JobCandidate
SELECT * FROM @TestXmls

GO

USE AdventureWorks
GO

It works perfectly. However, if you change the table variable to a temp table:

CREATE TABLE #TestXmls (JobCandidateID int NOT NULL,
  [Resume] xml(HumanResources.HRResumeSchemaCollection) 
  PRIMARY KEY CLUSTERED(JobCandidateID))

You will get the following error:

Msg 6314, Level 16, State 1, Line 2

Collection specified does not exist in metadata : 'HumanResources.HRResumeSchemaCollection'

This is because a temp table is created in TempDB, which does not have the xml collection HumanResources.HRResumeSchemaCollection. Though you can create the xml collection in the TempDB, it's not practical because:

  • TempDB is recreated whenever SQL server restarts;
  • TempDB is shared by all user/system databases and it is easily to have naming conflicts;

The work around is to use un-typed xml in the temporary table.

CREATE TABLE #TestXmls (JobCandidateID int NOT NULL,
  [Resume] xml PRIMARY KEY CLUSTERED(JobCandidateID))

INSERT #TestXmls (JobCandidateID,[Resume])

SELECT JobCandidateID,[Resume] FROM HumanResources.JobCandidate
SELECT * FROM #TestXmls
GO

User Defined DataTypes and User Defined Types (UDTs)

A user defined data type and UDT (CLR type) is similar to xml collection. It only resides in the database it is created. The AdventureWorks sample database has a user defined data type dbo.name, whose native data type is nvarchar(50). The following script queries all the address type names:

USE AdventureWorks
GO
DECLARE @TestUDTs TABLE (AddressTypeID int NOT NULL,
  [Name] dbo.Name NOT NULL PRIMARY KEY CLUSTERED(AddressTypeID))

INSERT @TestUDTs(AddressTypeID,[Name])

SELECT AddressTypeID,[Name]
 FROM Person.AddressType
SELECT * FROM @TestUDTs

It works fine. If you change to use temp table:

CREATE TABLE #TestUDTs (AddressTypeID int NOT NULL,
[Name] nvarchar(50) NOT NULL PRIMARY KEY CLUSTERED(AddressTypeID))

It will fail with the following error:

Msg 2715, Level 16, State 7, Line 1

Column, parameter, or variable #2: Cannot find data type dbo.Name.

The work around is to use the native data type of the user defined data type:

CREATE TABLE #TestUDTs (AddressTypeID int
NOT NULL,
[Name] nvarchar(50) NOT NULL PRIMARY KEY CLUSTERED(AddressTypeID))

UDTs are also scoped to a single database. If you use UDTs in a table variable, they are used in the same way as native data types. However since there is no equivalent native data type, if you need to use a UDT in a temp table, the UDT must be registered in tempdb the same way as for a user database, i.e. you need to CREATE the ASSEMLY and CREATE the TYPE in the TempDB. Due to the reasons mentioned in XML Collection

Collation

The collations for the string type (char, nchar, varchar, nvarchar) columns, if not given explicitly, will take the database collation of TempDB in a temp table, and inherit the collation of the current user database in a table variable. If not handled correctly, string comparing will fail.

Let's first find out the collation of the TempDB:

SELECT collation_name FROM sys.databases
WHERE database_id = DB_ID('tempdb')

To list all available collations:

SELECT * FROM fn_helpcollations()

Now let's create a test database, remember to use a different collation than the TempDB:

CREATE DATABASE CollateTest COLLATE Albanian_BIN
GO
USE CollateTest
GO

Create a table and populate some data:

CREATE TABLE dbo.Test1 (C2 int NOT NULL, C1 nvarchar(100) NOT NULL PRIMARY KEY CLUSTERED(C2))

GO

INSERT dbo.Test1 (C1,C2) SELECT N'R1',100 UNION ALL SELECT N'R2',200

GO

 

Now use a table variable to join with the table:

DECLARE @TestCollations TABLE (RID int identity not null, [Name] nvarchar(100) NOT NULL
    PRIMARY KEY CLUSTERED(RID))

INSERT @TestCollations ([Name]) VALUES (N'R2')
INSERT @TestCollations ([Name]) VALUES (N'R3')

SELECT A.* FROM dbo.Test1 A, @TestCollations B WHERE A.C1=B.[Name]
GO

It works fine. If you change to use temp table:

CREATE TABLE #TestCollations (RID int identity not null, [Name] nvarchar(100) NOT NULL
    PRIMARY KEY CLUSTERED(RID))

INSERT #TestCollations ([Name]) VALUES (N'R2')
INSERT #TestCollations ([Name]) VALUES (N'R3')

SELECT A.* FROM dbo.Test1 A, #TestCollations B WHERE A.C1=B.[Name]
GO

You get the following error:

Msg 468, Level 16, State 9, Line 6

Cannot resolve the collation conflict between "Latin1_General_CI_AI" and "Albanian_BIN" in the equal to operation.

There are two workarounds. One is to force the collation conversion in the string comparing:

SELECT A.* 
 FROM dbo.Test1 A, #TestCollations B 
  WHERE A.C1=B.[Name] collate database_default

However it's a little bit annoying if you have lots of string comparing in your database. A better solution is to explicitly give the collation for the string column in the temp table:

IF object_id('tempdb..#TestCollations') IS NOT NULL
 DROP TABLE #TestCollations
GO
CREATE TABLE #TestCollations (
  RID int identity not null, 
  [Name] nvarchar(100) collate database_default NOT NULL 
  PRIMARY KEY CLUSTERED(RID))

INSERT #TestCollations ([Name]) VALUES (N'R2')
INSERT #TestCollations ([Name]) VALUES (N'R3')

SELECT A.* FROM dbo.Test1 A, #TestCollations B WHERE A.C1=B.[Name]
GO

The "collate database_default" clause in the column definition forces SQL Server to take current database(CollateTest)'s collation instead of TempDB's for the temp table.

If a SQL Server hosts multiple databases, or you need to restore external databases to the server, you can easily run into the collation conflict issue if the user databases use temp tables and string columns do not declare collations properly.

Conclusion

If you use table variables, you are free to use user defined data types, user defined types (UDTs), and xml collections that are defined in your database. You don't need to worry about collation conflicts. If you have to use a temp table due to various reasons, please keep in mind the following best practices:

  • Always attach "collate database_default" clause for string columns in a temp table.
  • Always use un-typed xml for xml columns in a temp table.
  • Always use the equivalent native data type for a user defined data type in a temp table.
  • If you need to use UDTs in a temp table, you have to register the type in TempDB.

 Appendix

Table Variable vs. Temp Table Summary:

Feature

Table Variable

Temp Table

Note

Table Name

Max 128 characters

Max 116 characters

 

Data Storage

In memory and TempDB

TempDB

 

Meta Data

In memory

TempDB

A table variable inherits current database settings and can use the registered UDTs, user defined data types, and xml collections in the database. A temp table inherits the settings of TempDB and cannot use the types created in the user database if the same types do not exist in the TempDB.

Scope

Current batch

Current session

Temp tables created in a stored procedure (SP) can be referenced by dynamic queries in the SP, sub SPs, triggers fired by the affected tables of the SP.

Constraints

Allowed

Allowed

For table variables, since no DDL is allowed, constraints can not be created in separate DDL statements.

DDL

Not allowed

Allowed.

E.g. create Index on the temp table.

Concurrent

Supported

Supported

Constraints and Indexes with explicit name in a temp table cause duplicate name error.

Statistics

Not supported

Supported

Estimated row number in execution plan for table variable is always 1

Parallel execution plan

Supported only for select

Supported

Parallel query execution plans are not generated for queries that modify table variables.

Transaction and Locking

Not participated

Participated

Data in table variable is not affected if the transaction is rolled back

Cause Recompile

No

Yes

Temp Table creation causes SPs/batches to recompile

SELECT INTO <t>

Not supported

Supported

 

INSERT <t> EXEC

Not supported

Supported

 

Use

UDFs, Stored procedures, Triggers, Batches

Stored procedures, Triggers, Batches

 Temp tables can't be used in UDFs.

By Peter He, 2010/01/15 (first published: 2008/09/01)

Total article views: 22571 | Views in the last 30 days: 8176
posted @ 2010-01-19 17:54 天蝎 阅读(252) 评论(1) 编辑

2010年1月7日 #

By Ed Swiedler, 2010/01/06

Total article views: 1765 | Views in the last 30 days: 1765
 Rate this |   Join the discussion |  Briefcase |   Print

Applications running in production frequently, at least more so than we are prone to admit, have problems. Diagnosing these problems can be a nightmare when we do not have access to information regarding the error. I have seen several solutions, but none that would allow me to log all activity in a central location. The one that comes most quickly to mind, is to use a table in a database to contain all of the messages.

The problem for me was that the table would grow very quickly to a point at which it was unmanageable. I could partition the table to increase the speed of both insert and query operations, but I would still have to manage the database (backup, etc.). I did like the idea of inserting a row into a table, however, as this made it simple to log activity by merely inserting the information into a table.

I also wanted to write the log messages to text files that would exist in a location that was readily available to me to be able to research problems. Typically, there are no shares defined on database servers that are available to developers

I finally hit on what I believe is a rather elegant solution: to overcome the growth of the table, I decided to use Service Broker.

Service Broker is a delivery system for routing messages from a Queue to a Service. This routing begins by what is known as the Initiator beginning a conversation with Service Broker and specifying the Target Service with which it wants to communicate. The Target Service can then send a response to the Initiator by sending a message back to the Initiator Service. By defining a Target Queue associated with the Target Service, the effect is that the Message is placed in the Target Queue. The processing of the messages in the Target Queue can be handled directly by SQL Server by specifying a stored procedure to call whenever a message appears on the queue.

The problem I had with using a stored procedure to process the messages was that I wanted to be able to specify a location for the log files and have flexibility in the format of the written messages. After a little reading, I decided that I could create a Windows Service that would query the Target Queue and process the messages. This meant that I would have control over the location of the log files and the format of the messages.

The tasks I needed to complete in order to make this solution work were:

  • Define a table in the database of your choice. If you will be logging messages from multiple servers, creating a linked server in each of the other servers can simplify the insertion from these servers into the table.
  • Define an INSTEAD trigger for the table that would handle the conversation with the Service Broker service.
  • Define a Windows Service that would RECEIVE those messages and write them to a log file.

The sequence of events is:

  1. The application senses a condition that requires logging information.
  2. The application inserts the pertinent information into the log table.
  3. The trigger for the table fires and begins the conversation with the Service Broker Target Service.
  4. The Service Broker service adds the message to the Target Queue.
  5. The Windows Service receives the message and writes the formatted information to the appropriate log file.

That's the guts of it. The details follow.

Message Type

This defines what kind of messages will be handled. The only specification here is that they will all be XML and verified by Service Broker as valid.

CREATE MESSAGE TYPE [LogRequest] 
AUTHORIZATION [dbo]
VALIDATION = WELL_FORMED_XML

Message Contract

This defines what type of message will be handled and who will send it. We are only interested in sending messages from the client, so we specify the initiator as the sender.


CREATE CONTRACT [LogContract]
AUTHORIZATION [dbo]
(
[LogRequest] SENT BY INITIATOR
)

Queues

Initiator Queue

This defines the queue for the initiator. It is only used to send messages back to the sender, which we are not doing, but it is good practice to define one in case you ever decide to do so.

CREATE QUEUE [dbo].[LogInitiatorQueue]

Target Queue

This defines the queue into which the sender will put the messages.

CREATE QUEUE [dbo].[LogTargetQueue] 
WITH
STATUS = ON ,
RETENTION = OFF
ON [PRIMARY]

Services

Initiator Service


This defines the service for retrieving the messages from the initiator queue.

CREATE SERVICE [LogInitiatorService] 
AUTHORIZATION [dbo]
ON QUEUE [dbo].[LogInitiatorQueue]

Target Service


This defines the service that will handle the messages from the target queue.

CREATE SERVICE [LogTargetService] 
AUTHORIZATION [dbo]
ON QUEUE [dbo].[LogTargetQueue]
(
[LogContract]
)

Log Table


This table is where the client will insert the messages.

CREATE TABLE [dbo].[LogData]( 
 [MachineName] [varchar(50)] NOT NULL,
[ProcessName] [varchar](50) NOT NULL,
[ProcessMessage] [varchar](max) NOT NULL,
[LogTime] [datetime] NULL
) ON [PRIMARY]

The messages will never actually be inserted into the table. Rather, they will be re-routed to the target service by the use of a trigger that is executed instead of allowing the data to be inserted.

CREATE TRIGGER [dbo].[LogData_Insert] 
ON [dbo].[LogData]
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON; DECLARE
@MessageBody XML,
@Handle UNIQUEIDENTIFIER,
@RequestMsg XML SELECT
@RequestMsg =
(
SELECT
*
FROM
Inserted
FOR XML PATH ('Row'), ROOT('LOG'), TYPE
); BEGIN TRANSACTION BEGIN DIALOG @Handle
FROM SERVICE
[LogInitiatorService]
TO SERVICE
N'LogTargetService'
ON CONTRACT
[LogContract]
WITH
ENCRYPTION = OFF;SEND ON CONVERSATION @Handle
MESSAGE TYPE
[LogRequest](@RequestMsg); COMMIT TRANSACTION END

Windows Service

The service will be a Windows Service that will constantly be trying to receive messages from the Target Queue. The standard plumbing for a Windows Service will not be discussed here. The central code that performs the function of receiving the messages from the queue is as follows:

  1. Create a connection to the database containing the Service Broker Target Queue.
  2. Create a command for the connection that will receive the next message on the Target Queue. This command will wait forever for a message to be available.
  3. Create an XML document from the character string received from the execution of the command. This allows us to retrieve the information within the message. Remember that the messages on the Target Queue are XML.
  4. Format the information in the message.
  5. Determine the log file to which you want to write the message. In my case, I am writing all log files by each computer in its own folder. Each log file will be named using the ProcessName within the message.
  6. If log files are being created for each day, or some other period, check to se if the current file needs to be moved o an archive location.
  7. Write the formatted message to the log file.

Connection

using ( SqlConnection Conn = new SqlConnection ( this.LogConnectionString ) )
{
Conn.Open ();

Command

using ( SqlCommand Cmd = Conn.CreateCommand () )
{
Cmd.CommandText = "ProcessLogQueue";

The contents of the stored procedure follow:

BEGIN TRANSACTION 

WAITFOR (RECEIVE TOP(1) 
 CONVERT (XML, CONVERT (VARCHAR(MAX), message_body) ) 
FROM 
 LogTargetQueue 

COMMIT TRANSACTION

Cmd.CommandType = System.Data.CommandType.StoredProcedure;
Cmd.CommandTimeout = 0; // no timeout

while ( !this.m_Terminate ) // looping until the service is stopped
{
string Response = Cmd.ExecuteScalar ().ToString (); // execute the command

if ( Response.Length > 0 )
{
XmlDocument Doc = new XmlDocument ();
Doc.LoadXml ( Response );
XmlNode RootNode = Doc.SelectSingleNode ( "LOG" );
XmlNode RowNode = RootNode.SelectSingleNode ( "Row" );
string ProcessName = RowNode.SelectSingleNode ( "ProcessName" ).InnerText;
string MachineName = RowNode.SelectSingleNode ( "MachineName" ).InnerText; string MachineFolder = Path.Combine ( this.LogFolder, MachineName );

if ( !Directory.Exists ( MachineFolder ) )
{
try
{
Directory.CreateDirectory ( MachineFolder );
}
catch ( Exception Ex )
{
// log the failure to a logfile for the Windows Service
return;
}
}

// create the name of the log file

string FileName = Path.Combine ( MachineFolder, ProcessName + ".log" );

try
{
this.CheckLog ( MachineName, FileName ); // does the log file nee to be archived?
}
catch ( Exception Ex )
{
// log the failure to a logfile for the Windows Service
return;
}

using ( StreamWriter SW = new StreamWriter ( FileName, true ) )
{
SW.WriteLine ( string.Format ( "{0} {1}", RowNode.SelectSingleNode ( "LogTime" ).InnerText, RowNode.SelectSingleNode ( "ProcessMessage" ).InnerText ) );
}
}
}
}
}
}
catch ( SqlException Ex )
{
// log the failure to a logfile for the Windows Service and quit
}
catch ( System.Threading.ThreadAbortException )
{
// we have been ordered to quit
}
catch ( Exception Ex )
{
// log the failure to a logfile for the Windows Service and quit
}
}

private void CheckLog ( string MachineName, string FileName )
{

FileInfo FI = new FileInfo ( FileName );

if ( FI.Exists )
{
DateTime Today = Convert.ToDateTime ( DateTime.Now.ToShortDateString () );

if ( FI.LastWriteTime < Today )
{
string ArchiveFileName = Path.GetFileName ( FileName ).Replace ( ".log", FI.LastWriteTime.ToString ( "yyyyMMdd" ) + ".log" );
string MachineFolder = Path.Combine ( this.ArchiveFolder, MachineName );

if ( !Directory.Exists ( MachineFolder ) )
{
Directory.CreateDirectory ( MachineFolder );
}

FI.MoveTo ( Path.Combine ( MachineFolder, ArchiveFileName ) );
}
}
}

An example of the XML produced by the trigger on the log table follows:

<LOG>
<Row>
<ProcessName>MyProcess</ProcessName>
<ProcessMessage>This is the message</ProcessMessage>
<LogTime>2009-12-20T11:47:47.677</LogTime>
<MachineName>MyMachine</MachineName>
</Row>
</LOG>

The essentials here are that we are requesting the next entry from the queue and will wait forever for it. Once we have a message, we extract the Machine Name and Process Name and build a file name from them, using the Machine Name as the folder name. If the file exists, we check to see when a message was last written. If it was not today, we rename it and move it to the archive folder.

Client

As was indicated earlier, the Client, either a stored procedure or any Win32 application, will quite simply insert the message into the database table and move on. As an example, suppose you have an order processing system and, at the end of the process you want to log the number of orders processed by each customer in a batch. You would collect the summary information in a temporary table and run the following code. By doing this, you have started the process described at the beginning of this document.

Note that, if the ProcessMessage is NULL, the Service Broker conversation will fail.

 INSERT INTO Logging.dbo.LogData
(
MachineName,
ProcessName,
ProcessMessage
)
SELECT
CAST(SERVERPROPERTY('MachineName') AS varchar),
'ProcessCustomerOrders',
'Customer: ' + CAST(CutomerID AS varchar) + ' Orders: '+ CAST(OrderCount AS varchar)
FROM
@CustomerOrders

Conclusion

In the end, everyone will have their own preferences for this. Being an old main-frame nut, I am very used to scanning logs. The accumulation of logging information into text files makes it very easy for me to locate problems or trace activity without a lot of effort. As an aside, if you find yourself frequently looking at file content, I advise you to take a look at V (available at http://www.fileviewer.com/). It is hands-down the best utility I have ever seen. It includes the ability to tail a file to watch as it changes.

By Ed Swiedler, 2010/01/06

Total article views: 1765 | Views in the last 30 days: 1765
posted @ 2010-01-07 16:29 天蝎 阅读(112) 评论(0) 编辑

2010年1月6日 #

Written By: Alan Cranfield -- 1/4/2010 -- 4 comments -- printer friendly -- become a member

 

Problem
Partitioning, introduced with SQL 2005 Enterprise Edition, is a godsend for working with very large tables. It provides the means to effectively manage and scale your data at a time when tables are growing exponentially, but maintenance windows are either shrinking or non-existent as with many 24/7 online environments.  In this tip I will cover some data partitioning myths and truths you should know about.

Solution
Having worked with large partitioned databases for a few years, I'd like to share some of my lessons learned. Its important to understand what partitioning is and also what it is not.


Enterprise Edition

First off, I must stress that table and index partitioning is an Enterprise feature in SQL 2005 and remains an Enterprise Feature in SQL 2008.  There is no getting around this. Having said that I have found that partitioning can be the most compelling motivating factor for shops to go with Enterprise Edition over Standard. The headaches, maintenance outages, weekend work et al associated with maintaining Very Large Databases can often be eliminated or minimized with a well thought out partitioning scheme. If you have VLDBs then Enterprise Edition is a no brainer as it will also give you these additional features to complement your partitioning - Asynchronous Database Mirroring, Online Index Rebuilds, backup compression (SQL 2008).

All the Enterprise features are also in Developer edition, so installing Developer edition on your workstation is a good way to test and play with Enterprise features. Often features that are Enterprise now could be in Standard down the road so you need to keep up to speed as a SQL Server professional.

You can't restore a database with partitioning to a non-Enterprise Edition instance. Important to bear this in mind.  Any partition functions and schemes would need to be removed prior to backing up the database and restoring it to a non Enterprise edition instance.

Processed 208 pages for database 'myDB', file 'myDB' on file 1.

Processed 4 pages for database 'myDB', file 'myDB_log' on file 1.

 

Msg 3167, Level 16, State 1, Line 1

RESTORE could not start database 'myDB'.

Msg 3013, Level 16, State 1, Line 1

 

RESTORE DATABASE is terminating abnormally.

Msg 905, Level 21, State 1, Line 1

Database 'myDB' cannot be started in this edition of SQL Server because it contains a partition function 'myRangePF1'. Only Enterprise edition of SQL Server supports partitioning.

 

Partitioned Views offer similar properties to partitioned tables, but do not require Enterprise Edition. See here for an overview on partitioned views. Partitioned views was a surprisingly effective but complicated way of 'partitioning' data in SQL 2000 and still works just as well in SQL 2005 and 2008.


Myth 1: Partitioning is a "Scale-Out" solution

Partitions cannot span servers or instances. Partitions have to be in the same instance and in the same database. Partitioning therefore is a scale-up solution. A scale-out solution for SQL Server can be implemented through distributed partitioned views hosted on Federated Database Servers. This is an advanced solution that I have yet to encounter in my career (I would like to hear from anyone who has actually implemented this in a production environment).


Myth 2: Partitions must be created on different filegroups

The partition scheme definition defines on which filegroup a partition resides. It's a common misconception that you have to spread your partitions out among multiple filegroups. Most code examples seem to use multiple filegroups and I think this is where this myth stems from. The only reason, in my opinion, that you would want multiple filegroups is if those filegroups reside on physically separate drives and you were doing this to improve your I/O for large range queries. I have not seen any performance benefit to having multiple filegroups located on the same drive.

Only split your partitions among multiple filegroups if you have a compelling reason to do so. That reason should be increased performance or manageability.


Myth 3: To partition a non-partitioned table you will need to drop and recreate it

Not true. You can partition an existing table by creating a clustered index (or rebuilding an existing clustered index) on your new Partition Scheme. This will effectively partition your data as the leaf level of a clustered index is essentially the data.  The example below creates a partitioned clustered index on a partition scheme PScheme_Day.

CREATE CLUSTERED INDEX idx ON tblPartitioned(SQLCreated) 
WITH DROP_EXISTING
ON PScheme_Day(SQLCreated)
 

 


Myth 4: Partitioning an existing table is a strictly offline operation

It's true that rebuilding or creating a clustered index is indeed an offline operation. Your table will not be available for querying during this operation. However, partitioning is an Enterprise feature, so we have the online index rebuild feature available to use. The ONLINE = ON option allows us to still query the table while under the covers the partitioning operation is going on. SQL Server does this by using an internal snapshot of the data. Obviously, there is a performance hit and I don't recommend you do this during a busy time but if you have a requirement for 24x7 availability then this is a possible solution.

CREATE CLUSTERED INDEX idx ON tblPartitioned(SQLCreated) 
WITH (DROP_EXISTING = ON, ONLINE = ON)
ON PScheme_Day(SQLCreated)

 


Myth 5: SWITCH'ing partitions OUT or IN in only a few seconds

You'll often read that that the reason partitioning operations like SWITCH IN and SWITCH OUT are so fast and efficient is that they are "metadata" only operations meaning that there is no actual data movement but only the pointers or internal references to the data get changed.

SWITCH'ing partitions OUT or IN is a truly "meta data" operation in that although the partitioned data has magically moved from the partitioned table to the SWITCH table or vice versa there hasn't actually been any movement of data on the disk or inside the data file. However, in my experience, I have found that with highly transactional partitioned tables the SWITCH operation can get blocked or cause blocking itself. This is due to the ALTER TABLE...SWITCH operation requiring a schema modify lock on both the source and target tables.

To get around this potential blocking issue I always set a timeout before any switch operation so that if the ALTER TABLE...SWITCH does start to cause blocking it will only be for as long as the timeout. I will then re-issue the ALTER TABLE...SWITCH statement repeatedly with the timeout until the operation goes through successfully. In this way the partition maintenance task will not interfere with the OLTP function of the database:

In the example below I set a 30 second timeout for the SWITCH to complete. If it timeouts I will repeat until it completes successfully.

SET LOCK_TIMEOUT 30000
ALTER TABLE myPartionedTable SWITCH PARTITION 2 TO SwitchTable
Msg 1222, Sev 16, State 56, Line 1 : Lock request time out period exceeded. 

 


Myth 6: Altering a partition function is a metadata only operation

In practice you may find out that a MERGE or SPLIT operation may take much much longer than the few seconds expected. Altering a partition function is an offline operation and can also result in movement of data on disk and so become extremely resource intensive.

As long as the 2 partitions you are MERGEing together are empty then there will be no data movement and it will be a metadata only operation. If you have a sliding window and you are MERGEing the oldest 2 partitions together then a rule to follow is to always make sure both partitions are empty (SWITCHed OUT) before you MERGE. 

To avoid or minimize data movement when SPLITing a partition function always ensure that you know beforehand how many rows are in the underlying partition that is being SPLIT and how many rows would fall on each side of the new boundary. Armed with this information you should be able to determine beforehand the amount of data movement to expect depending on whether you have defined your partition function using RANGE RIGHT or LEFT.

Be aware that more than one table or index can reside on the same partition scheme or use the same partition function. When you run ALTER PARTITION FUNCTION it will affect all of these tables and indexes in a single transaction.


Myth 7: Partitioned tables improve query performance

If your query is written in such a way that it can read only the partitions it needs the data from then you will get partition elimination and therefore an equivalent performance improvement. If your query does not join or filter on the partition key then there will be no improvement in performance over an unpartitioned table i.e. no partition elimination. In fact, a query that hits a partitioned table has the potential to be even slower than than an unpartitioned table even if both tables have the same index defined. This is due to the fact that each partition in a partitioned table is actually its own b-tree which means that a partitioned index seek will need to do one seek per partition as opposed to one seek per table for an unpartitioned index seek.

Both tables below have a nonclustered index defined on the commodity column. The index on table tblPartitioned has been created on the partition scheme to align it with the clustered index but as we can see from the below this causes each partition to be scanned because the partition key is actually on a different column. 

select count(*)
from tblPartitioned
where commodity = 'BOFRB'
 
select count(*)
from tblUnPartitioned
where commodity = 'BOFRB'
 
 
-----------
371
Table 'tblPartitioned'. Scan count 54, logical reads 109, physical reads 0, read-ahead reads 0, 
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 
-----------
371
Table 'tblUnPartitioned'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, 
lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 
 

The graphical query plan clearly shows the difference in cost between the two queries. The top partitioned table query

 

If we also filter on the partition key column we get the desired performance improvement over the equivalent query on the unpartitioned table. If you are partitioning tables in an existing database then its a good idea to test all your queries for performance. Don't assume that once the data is partitioned that your queries will improve. Queries may need to be updated to include the new partition key in the where clause or join to take advantage of the performance benefits of partitioning.


Myth 8: Partitioned tables ease maintenance of VLDBs and Very Large Tables

Partitioning can be effectively used to break up a very large table into multiple partitions based on a column or partitioning key. However, there can be significant management overhead in maintaining partitioned tables. The features of partitioned tables that give you the ease of management can also conspire against you to make your maintenance a nightmare!

Thinking of each partition as its own table will help you understand how best to approach your maintenance. Index rebuilds can be quicker because you can rebuild just those partitions that you identify as being fragmented. Conversely, full partitioned Index rebuilds will take longer as each partition is its own b-tree.

Note that old style DBCC SHOWCONTIG command does not work for partitioned indexes. You need to familiarize yourself with the more powerful DMV sys.dm_db_index_physical_stats. This DMV returns fragmentation information at a partition level and provides you the necessary information to make a decision on which partitions require rebuilding or reorganizing.

-- to return fragmentation information on partitioned indexes
SELECT
 object_name(a.object_id) AS object_name,
 a.index_id,
 b.name,
 b.type_desc,
 a.partition_number,
 a.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL , NULL, 'LIMITED') a
JOIN sys.indexes b on a.object_id = b.object_id and a.index_id = b.index_id
order by object_name(a.object_id), a.index_id, b.name, b.type_desc, a.partition_number

Once fragmentation has been identified we can rebuild the index statement for just that single partition:

--Rebuild only partition 11.
ALTER INDEX IX_alert_events_timestamp
ON dbo.alert_events
REBUILD Partition = 11;
GO

If you're partitioning your data on a date field and you have a sliding window and you are only ever adding data to the right most partitions then you can save unnecessary overhead by not checking for fragmentation of older partitions where you already know they are not fragmented. sys.dm_db_index_physical_stats DMV allows you to specify a partition number. This greatly improves IO over full index scans and has the potential to shorten maintenance job durations.


Conclusion

Well designed database partitioning gives the DBA the opportunity to effectively work with massive amounts of data while at the same time providing the same level of performance as a small database. Having a sound partition management and maintenance plan will free up valuable DBA time to concentrate on more challenging problems and opportunities.

 

Next Steps

Readers Who Read This Tip Also Read Comment or Ask Questions About This Tip Twitter This Tip!
posted @ 2010-01-06 11:39 天蝎 阅读(143) 评论(0) 编辑

2009年12月21日 #

出处:http://blog.csdn.net/wenpingblog/archive/2009/12/18/5032829.aspx

 

什么是SHELL呢?下面是我平日资料累计中的一点内容,肯定来源于网上(无从考证了),共享于大家(感谢此文原作者的共享),正文开始如下:

====================================================

在介绍 shell 是甚幺东西之前,不妨让我们重新检视使用者与计算机系统的关系。

我们知道计算机的运作不能离开硬件,但使用者却无法直接对硬件作驱动,
硬件的驱动只能透过一个称为"操作系统(Operating System)"的软件来控管,
事实上,我们每天所谈的 linux ,严格来说只是一个操作系统,我们称之为"核心(kernel)"。
然而,从使用者的角度来说,使用者也没办法直接操作 kernel ,
而是透过 kernel 的"外壳"程序,也就是所谓的 shell ,来与 kernel 沟通。
这也正是 kernel 跟 shell 的形像命名关系。如图:
图(FIXME)

从技术角度来说,shell 是一个使用者与系统的互动界面(interface),
主要是让使用者透过命令行(command line)来使用系统以完成工作。
因此,shell 的最简单的定义就是---命令解译器(Command Interpreter):
* 将使用者的命令翻译给核心处理,
* 同时,将核心处理结果翻译给使用者。

每次当我们完成系统登入(log in),我们就取得一个互动模式的 shell ,也称为 login shell 或 primary shell。
若从行程(process)角度来说,我们在 shell 所下达的命令,均是 shell 所产生的子行程。这现像,我们暂可称之为 fork 。
如果是执行脚本(shell script)的话,脚本中的命令则是由另外一个非互动模式的子 shell (sub shell)来执行的。
也就是 primary shell 产生 sub shell 的行程,sub shell 再产生 script 中所有命令的行程。
(关于行程,我们日后有机会再补充。)

这里,我们必须知道:kernel 与 shell 是不同的两套软件,而且都是可以被替换的:
* 不同的操作系统使用不同的 kernel ,
* 而在同一个 kernel 之上,也可使用不同的 shell 。
在 linux 的预设系统中,通常都可以找到好几种不同的 shell ,且通常会被列于如下档案里:
/etc/shells
不同的 shell 有着不同的功能,且也彼此各异、或说"大同小异"。
常见的 shell 主要分为两大主流:
sh:
burne shell (sh)
burne again shell (bash)
csh:
c shell (csh)
tc shell (tcsh)
korn shell (ksh)
(FIXME)

大部份的 Linux 系统的预设 shell 都是 bash ,其原因大致如下两点:
* 自由软件
* 功能强大
bash 是 gnu project 最成功的产品之一,自推出以来深受广大 Unix 用户喜爱,
且也逐渐成为不少组织的系统标准。

===================================================================

上文讲述了什么是SHELL,SHELL是什么。那么Oracle运行在那一个层面呢?也是SHELL之上。Oracle并未直接访问UNIX Kernel,而是通过SHELL来支撑其运行环境。

没SHELL行吗?

posted @ 2009-12-21 14:24 天蝎 阅读(79) 评论(0) 编辑

2009年12月10日 #

Original link:http://www.sqlskills.com/blogs/bobb/post/Plan-freezing-and-other-plan-guide-enhancements-in-SQL-Server-2008.aspx

 

One of the nifty new SQL Server 2005 features was called "plan forcing". You could acheive plan forcing by using the USE PLAN query hint but this made such a hideous query that the better way to do this is to use a plan_guide. A plan_guide is a named database object (like table or view) that associates a query hint with a particular query.

In SQL Server 2008 there have been some enhancements made for plan_guides and plan forcing. The biggest change is that you can create a plan from the plan cache directly, rather than using the rather long set of steps that you needed in SQL Server 2005. You only need a few lines of boilerplate code, setting up and invoking sp_create_plan_guide_from_cache, and the plan guide is automatically created. See SQL Server 2008 Nov 2007 for details and copious examples. The concept of creating a plan guide from a query plan already in cache is known as "plan freezing".

You can also validate that a plan guide will still be used (after dropping an index, for example) by using the new sys.fn_validate_plan_guide function. In addition, its easy to see if a Plan Guide is being used or not with the profiler trace events "Plan_guide_successful" and "Plan_guide_unsuccessful", located under the "Performance" category. These can also be monitored over time with two new perfmon counters "Guided Plan Executions/sec" and, what has to be the most humorous perfmon counter "Misguided Plan Executions/sec".

One other plan guide enhancement that doesn't seem to be mentioned in BOL is that you can use plan guides for INSERT, UPDATE, DELETE, or even MERGE statements in SQL Server 2008. This feature works in the SQL Server 2008 Nov CTP (CTP5).

As always the standard disclaimer is "query hints are (usually) evil" but if you're desperate enough to need plan forcing or plan freezing, its good to know that its there.

One reason why I've seen people use plan freezing is as a hedge against plan instability that can result from what's known as "parameter sniffing". Parameter sniffing is usually a good thing; the query processor sniffs the parameter values the first time you execute a parameterized stored procedure and uses the statistics based on those parameters accordingly. However, if your "first time execution" was based on uncommon parameter values, you're now stuck with a suboptimal plan. Plan freezing can be use to freeze a plan that's most optimal and "override" parameter sniffing. You just have to catch such a plan in the cache.

The drawback is, of course, that you are stuck with the same plan even if the statistics change. I'm sometimes accused of liking plan guides and plan forcing too much, but perhaps it's because I'd oftentimes get a call when parameter sniffing (or a service pack: yes, it does happen every now and again) changed to a query plan that is suboptimal. For those trouble calls (ie, "my query just starting running 4 times slower today and the plan is different from yesterday") plan_guides/forcing let you change the plan without changing the underlying code. When the fix arrives, you just disable the plan guide to let the optimizer go on its merry way.

posted @ 2009-12-10 11:55 天蝎 阅读(106) 评论(0) 编辑

2009年12月1日 #

摘要: ProblemI have several objects, all in the same schema. Because of this, ownership chaining is working, as described in this previous tip. However, I don't want ownership chaining to be on, but I need ...阅读全文
posted @ 2009-12-01 14:59 天蝎 阅读(122) 评论(0) 编辑

2009年11月18日 #

摘要: 谈到自动化测试,一般就会提到测试工具。许多人觉得使用了一、两个测试工具就是实现了测试自动化,这种理解是不对的,至少是片面的。的确,测试工具的使用是自动化测试的一部分工作,但“用测试工具进行测试”不等于“自动化测试”。那什么是“自动化测试”? 半自动化测试过程,算不算自动化测试?是否可以为“自动化测试”给...阅读全文
posted @ 2009-11-18 17:41 天蝎 阅读(226) 评论(0) 编辑

摘要: 存储过程一方面减少网络数据流量,提高数据处理效率;另一方面在数据库层面进行事务处理和控制比在应用程序中使用分布式事务更高效、占用的资源更少;还有一个就是灵活性问题,设计良好的存储过程,当业务逻辑或规则发生变化时,可以不用修改软件代码,只需修改存储过程即可,这对于用户和软件开发商来讲,可以降低维护成本和升级花费;另外,扩展开来还有一个系统健壮性和安全性问题,在非电信级商业网络环境中(比如:大多数企业...阅读全文
posted @ 2009-11-18 16:32 天蝎 阅读(23) 评论(0) 编辑

摘要: SOA是Service Oriented Architecture的简称,中文译为“面向服务的体系架构”。与传统技术相比,SOA从业务着眼,基于标准化的技术手段,以“服务”为基本元素来构建或整合适合于各行业应用需求的信息系统,提高信息系统的开发效率、充分整合和复用IT资源、并使信息系统能灵活快速的响应业务变化需求。  从技术的角度来看,SOA对软件开...阅读全文
posted @ 2009-11-18 16:19 天蝎 阅读(79) 评论(0) 编辑

2009年9月3日 #

摘要: Orignal link: http://www.simple-talk.com/sql/database-administration/the-dba-as-detective-troubleshooting-locking-and-blocking/In this article, taken from Chapter 5 of his great new book, SQL Server T...阅读全文
posted @ 2009-09-03 14:23 天蝎 阅读(320) 评论(0) 编辑