[Study Note] NHibernate In Action 20100526

5. Transactions, concurrency, and caching

5.1 Understanding database transactions

ACID - Atomicity, consistency, isolation and durability

atomic: several operations are grouped together as a single indivisible unit.
isolation: transactions allow multiple users to work concurrently with the same data without compromising the integrity and correctness of the data; a particular transaction shouldn't be visible to and shouldn't influence other concurrently running transactions.
consistency: any transaction works with a consistent set of data and leaves the data in a consistent state when the transaction completes.
durability: once a transaction completes, all changes made during that transaction become persistent and aren't lost even if the system subsequently fails.

user transaction - conversation: database operations occur in several batches, alternating with user interaction.

caching strategies need to be balanced to also allow for consistent and durable transactions.

Databases implement the notion of a unit of work as a database transaction (sometimes called a system transaction).

committed | rolled back

transaction demarcation

5.1.1 ADO.NET and Enterprise Services/COM+ transactions

Without Enterprise Services: the ADO.NET API

BeginTransaction()

Commit()

Rollback()

COM+ automatic transaction-processing service

With Enterprise Services, the automatic transaction-processing service

5.1.2 The NHibernate ITransaction API

session.BeginTransaction()

Transaction.Commit()

make sure the session is closed at the end, to ensure that the ADO.NET connection is released and returned to the connection pool.

ITransaction.WasCommitted

the ISession has to be immediately closed and discarded (not reused) when an exception occurs.

flushing: a process you trigger automatically when you commit through the NHibernate ITransaction API.
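A minimal sketch of how the calls in this section usually fit together, assuming an ISessionFactory that was configured elsewhere. The names TransactionSketch, RunInTransaction and the work delegate are hypothetical, chosen only for illustration.

```csharp
using System;
using NHibernate;

public static class TransactionSketch
{
    // Runs 'work' inside one database transaction on a fresh session.
    // 'factory' is assumed to be an ISessionFactory built elsewhere.
    public static void RunInTransaction(ISessionFactory factory, Action<ISession> work)
    {
        ISession session = factory.OpenSession();
        ITransaction tx = null;
        try
        {
            tx = session.BeginTransaction();
            work(session);
            tx.Commit();      // flushes the session, then commits
        }
        catch (Exception)
        {
            // Roll back if a transaction was started but never committed.
            if (tx != null && !tx.WasCommitted)
            {
                tx.Rollback();
            }
            throw;            // the session is discarded below, never reused
        }
        finally
        {
            session.Close();  // releases the ADO.NET connection back to the pool
        }
    }
}
```

A caller would pass the unit of work as a delegate, for example one that loads an object and changes a property; the session never outlives the call, so it cannot be reused after an exception.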

5.1.3 Flushing the session

The NHibernate ISession implements transparent write-behind.

NHibernate flushes occur:

When an ITransaction is committed
Sometimes before a query is executed
When the application calls ISession.Flush() explicitly

flush modes:

FlushMode.Auto - The default. Enables the behavior just described.
FlushMode.Commit - Specifies that the session won't be flushed before query execution (it will be flushed only at the end of the database transaction).
FlushMode.Never - Specifies that only explicit calls to Flush() result in synchronization of session state with the database.
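As a rough sketch, the flush mode can be changed per session through the ISession.FlushMode property. The class and method names below are hypothetical, and the session is assumed to be open already.

```csharp
using NHibernate;

public static class FlushModeSketch
{
    // Switches an open session to FlushMode.Commit for one transaction.
    public static void WorkWithCommitFlushMode(ISession session)
    {
        // Queries no longer trigger a flush; only the commit does.
        session.FlushMode = FlushMode.Commit;

        using (ITransaction tx = session.BeginTransaction())
        {
            // ... load, modify and query objects here ...
            tx.Commit(); // pending changes are written to the database at this point
        }
    }
}
```

With FlushMode.Commit, a query run in the middle of the transaction may not see changes that are still held only in the session, which is the trade-off for fewer round trips.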

5.1.4 Understanding connection-release modes

NHibernate.ConnectionReleaseMode:

OnClose - The session releases the connection when it's closed. This was the only mode available in NHibernate 1.0.
AfterTransaction - The connection is released as soon as the transaction completes.

Note that you can use the Disconnect() method of the ISession interface to force the release of the connection (without closing the session) and the Reconnect() method to tell the session to obtain a new connection when needed.

hibernate.connection.release_mode = auto | on_close | after_transaction

5.1.5 Understanding isolation levels

transaction isolation - from the point of view of each concurrent transaction, it appears no other transactions are in progress.

locking

multiversion concurrency control

Isolation issues

Lost update - Two transactions both update a row, and then the second transaction aborts, causing both changes to be lost. This occurs in systems that don't implement any locking. The concurrent transactions aren't isolated.
Dirty read - One transaction reads changes made by another transaction that hasn't yet been committed. This is dangerous, because those changes may later be rolled back.
Unrepeatable read - A transaction reads a row twice and reads different state each time. For example, another transaction may have written to the row, and committed, between the two reads.
Second lost updates problem - This is a special case of an unrepeatable read. Imagine that two concurrent transactions both read a row, one writes to it and commits, and then the second writes to it and commits. The changes made by the first writer are lost. This problem is also known as last write wins.
Phantom read - A transaction executes a query twice, and the second result set includes rows that weren't visible in the first result set. (It need not be exactly the same query.) This situation is caused by another transaction inserting new rows between the execution of the two queries.

Isolation Levels

Read uncommitted - Permits dirty reads but not lost updates. One transaction may not write to a row if another uncommitted transaction has already written to it. But any transaction may read any row. This isolation level may be implemented using exclusive write locks.
Read committed - Permits unrepeatable reads but not dirty reads. This may be achieved using momentary shared read locks and exclusive write locks. Reading transactions don't block other transactions from accessing a row. But an uncommitted writing transaction blocks all other transactions from accessing the row.
Repeatable read - Permits neither unrepeatable reads nor dirty reads. Phantom reads may occur. This may be achieved using shared read locks and exclusive write locks. Reading transactions block writing transactions (but not other reading transactions), and writing transactions block all other transactions.
Serializable - Provides the strictest transaction isolation. It emulates serial transaction execution, as if transactions had been executed one after another, serially, rather than concurrently. Serializability may not be implemented using only row-level locks; another mechanism must prevent a newly inserted row from becoming visible to a transaction that has already executed a query that would return the row.

5.1.6 Choosing an isolation level

nothing is carved in stone

choose an isolation level in an NHibernate application

First, eliminate the read uncommitted isolation level.

Second, most applications don't need serializable isolation.

preferred transaction isolation level is read-committed mode.

5.1.7 Setting an isolation level

ReadUncommitted - Read-uncommitted isolation
ReadCommitted - Read-committed isolation
RepeatableRead - Repeatable-read isolation
Serializable - Serializable isolation
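The four names above are members of the System.Data.IsolationLevel enumeration. One way to apply a level to a single transaction is the ISession.BeginTransaction overload that takes an isolation level; a small sketch, with a hypothetical helper class and method:

```csharp
using System.Data;
using NHibernate;

public static class IsolationSketch
{
    // Starts a transaction on an already-open session with an explicit isolation level.
    public static ITransaction BeginReadCommitted(ISession session)
    {
        return session.BeginTransaction(IsolationLevel.ReadCommitted);
    }
}
```

Whether the requested level is honored exactly depends on the database and its ADO.NET driver.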

5.1.8 Using pessimistic locking

locking is a mechanism that prevents concurrent access to a particular item of data.

A pessimistic lock is a lock that is acquired when an item of data is read and that is held until transaction completion.

ITransaction tx = session.BeginTransaction();
Category cat = session.Get<Category>(catId, LockMode.Upgrade);
cat.Name = "New Name";
tx.Commit();

LockMode.None - Don't go to the database unless the object isn't in either cache.
LockMode.Read - Bypass both levels of the cache, and perform a version check to verify that the object in memory is the same version that currently exists in the database.
LockMode.Upgrade - Bypass both levels of the cache, do a version check (if applicable), and obtain a database-level pessimistic upgrade lock, if that is supported.
LockMode.UpgradeNoWait - The same as Upgrade, but use a SELECT ... FOR UPDATE NOWAIT, if that is supported. This disables waiting for concurrent lock releases, thus throwing a locking exception immediately if the lock can't be obtained.
LockMode.Write - The lock is obtained automatically when NHibernate writes to a row in the current transaction (this is an internal mode; you can't specify it explicitly)

Load(), Get() use LockMode.None

ISession.Lock() and a detached object use LockMode.Read

... let the professional DBA decide which transactions require pessimistic locking once the application is up and running. This decision should depend on subtle details of the interactions between different transactions and can't be guessed up front.

5.2 Working with conversations

conversation: a broader notion of the unit of work

conversation - long transaction, user transaction, application transaction, or business transaction

5.2.1 An example scenario

Last commit wins
First commit wins - optimistic locking
Merge conflicting updates - optimistic locking

5.2.2 Using managed versioning

Managed versioning relies on either a version number that is incremented or a timestamp that is updated to the current time, every time an object is modified.

The version number is just a counter value - it doesn't have any useful semantic value.

Recommend: new projects use a numeric version and not a timestamp

NHibernate increments the version number whenever an object is dirty.

optimistic-lock="false"
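To make the version check concrete, here is a small in-memory simulation of the idea behind managed versioning. It is not NHibernate code: VersionedStore and its members are hypothetical stand-ins for a table with a version column, and TryUpdate mimics an UPDATE whose WHERE clause also compares the version that was read.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for a table with a version column (hypothetical, for illustration only).
public class VersionedStore
{
    // id -> (version, name)
    private readonly Dictionary<int, KeyValuePair<int, string>> rows =
        new Dictionary<int, KeyValuePair<int, string>>();

    public void Insert(int id, string name)
    {
        rows[id] = new KeyValuePair<int, string>(1, name); // new rows start at version 1
    }

    public int VersionOf(int id) { return rows[id].Key; }

    public string NameOf(int id) { return rows[id].Value; }

    // Mimics: UPDATE ... SET NAME = ?, VERSION = VERSION + 1 WHERE ID = ? AND VERSION = ?
    // Returns false when the row changed after 'expectedVersion' was read.
    public bool TryUpdate(int id, int expectedVersion, string newName)
    {
        KeyValuePair<int, string> row = rows[id];
        if (row.Key != expectedVersion)
        {
            return false; // stale data: someone else committed first
        }
        rows[id] = new KeyValuePair<int, string>(expectedVersion + 1, newName);
        return true;
    }
}
```

If two users both read the row at version 1, the first TryUpdate succeeds and moves the row to version 2, while the second fails because its expected version is stale. In NHibernate the failing case surfaces as a StaleObjectStateException rather than a boolean.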

5.2.3 Optimistic and pessimistic locking compared

An optimistic approach always assumes that everything will be OK and that conflicting data modifications are rare.

A pessimistic approach blocks concurrent data access immediately and forces execution to be serialized.

the duration of a pessimistic lock in NHibernate is a single database transaction.

5.2.4 Granularity of a session

The scope of object identity
The granularity of database transactions and conversations

The NHibernate ISession instance defines the scope of object identity.

The NHibernate ITransaction instance matches the scope of a database transaction.

session-per-request
session-per-request-with-detached-objects
session-per-conversation / long session

5.2.5 Other ways to implement optimistic locking

If you need optimistic locking for detached objects, you must use a version number or timestamp.

alternative implementation of optimistic locking checks the current database state against the unmodified values of persistent properties at the time the object was retrieved (or the last time the session was flushed).

optimistic-lock="all"
optimistic-lock="dirty" (dynamic-update="true")

slower, more complex, and less reliable

5.3 Caching theory and practice

most applications should be designed so that it's possible to achieve acceptable performance without the use of a cache.

caching can have an enormous impact on performance.

A cache keeps a representation of current database state close to the application either in memory or on the disk of the server machine. The cache is essentially merely a local copy of the data; it sits between your application and the database.

> The application performs a lookup by identifier (primary key).
> The persistence layer resolves an association lazily.

5.3.1 Caching strategies and scopes

Three main types of cache:

Transaction scope - Attached to the current unit of work, which may be an actual database transaction or a conversation. It's valid and used as long as the unit of work runs. Every unit of work has its own cache.
Process scope - Shared among many (possibly concurrent) units of work or transactions. Data in the process-scope cache is accessed by concurrently running transactions, obviously with implications on transaction isolation. A process-scope cache may store the persistent instances themselves in the cache, or it may store just their persistent state in some disassembled format.
Cluster scope - Shared among multiple processes on the same machine or among multiple machines in a cluster. It requires some kind of remote process communication to maintain consistency. Caching information has to be replicated to all nodes in the cluster.

Persistence layers may provide multiple levels of caching.

cache miss - a cache lookup for an item that isn't contained in the cache

The type of cache used by persistence layer affects the scope of object identity (the relationship between .NET object identity and database identity).

Note: coming back to this book after two months, it feels like a lifetime has passed :)

Caching and Object Identity

A transaction-scope cache is a good fit for persistence mechanisms that provide transaction-scope object identity.

In the case of the process-scope cache, objects retrieved may be returned by value. Instead of storing and returning instances, the cache contains tuples of data. Each unit of work first retrieves a copy of the state from the cache (a tuple) and then uses that to construct its own persistent instance in memory.

In the case of POCO-oriented persistence solutions like NHibernate, objects are always passed remotely by value.

The cluster-scope cache handles identity the same way as the process-scope cache; they each store copies of data and pass that data to the application so that units of work can create their own instances from it. In NHibernate terms, they’re both second-level caches, the main difference being that a cluster-scope cache can be distributed across several computers if needed.

the first-level transaction scope cache is always on and is mandatory.

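In my current project, caching is basically left to run unmanaged. I probably won't need a cluster-scope cache for now; I'd like to see how the first two cache types are actually used in ASP.NET applications.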

Caching and Transaction Isolation

A process- or cluster-scope cache makes data retrieved from the database in one unit of work visible to another unit of work.

the cache is allowing cached data to be shared among different units of work, multiple threads, or even multiple computers.

First, if more than one application is updating the database, then you shouldn’t use process-scope caching, or you should use it only for data that changes rarely and may be safely refreshed by a cache expiry.

Second, if an application is designed to scale over several machines, it should support clustered operation and use a cluster-scope (distributed) cache.

Third, applications that share access to their databases with other legacy applications shouldn’t use any kind of cache beyond the mandatory transaction-scope cache.

Not every cache implementation respects all transaction isolation levels, and it’s critical to find out what is required.

A full ORM solution lets you configure second-level caching separately for each class. Good candidate classes for caching are classes that represent

Data that rarely changes
Noncritical data
Data that is local to the application and not shared

Bad candidates for second-level caching are

Data that is updated often
Financial data
Data that is shared with a legacy application

reference data is an excellent candidate for caching with a process or cluster scope, and any application that uses reference data heavily will benefit greatly if that data is cached.

A small number of instances
Each instance referenced by many instances of another class or classes
Instances rarely (or never) updated

It seems to me that lookup (dictionary) tables are exactly this kind of reference data.

5.3.2 The NHibernate cache architecture

NHibernate has a two-level cache architecture.

The first-level cache is the ISession.

The second-level cache in NHibernate is pluggable and may be scoped to the process or cluster. Use of the second-level cache is optional and can be configured on a per-class and per-association basis.

Using the First-Level Cache

The session cache ensures that when the application requests the same persistent object twice in a particular session, it gets back the same (identical) .NET instance.

The session cache helps avoid unnecessary database traffic. More important, it ensures the following:

The persistence layer isn’t vulnerable to stack overflows in the case of circular references in a graph of objects.
There can never be conflicting representations of the same database row at the end of a database transaction.
Changes made in a particular unit of work are always immediately visible to all other code executed inside that unit of work.
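The identity guarantee can be observed directly. The sketch below assumes an open session and a mapped persistent class passed as TEntity; the helper class and method names are hypothetical.

```csharp
using NHibernate;

public static class SessionCacheSketch
{
    // Loads the same identifier twice in one session and compares the references.
    public static bool SameInstanceTwice<TEntity>(ISession session, object id)
        where TEntity : class
    {
        TEntity first = session.Get<TEntity>(id);  // may go to the database
        TEntity second = session.Get<TEntity>(id); // resolved from the session cache
        // Also true when no such row exists, because both lookups then return null.
        return object.ReferenceEquals(first, second);
    }
}
```

Two different sessions would each build their own instance for the same row, so the comparison only holds inside a single ISession.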

My current application calls flush() on every Application_End, which seems to be a problem: it causes too many database accesses.

Managing the First-Level Cache

ORM isn’t suitable for mass-update (or mass-delete) operations. If you have a use case like this, a different strategy is almost always better: call a stored procedure in the database, or use direct SQL UPDATE and DELETE statements for that particular use case.

The NHibernate Second-Level Cache

The NHibernate second-level cache has process or cluster scope; all sessions share the same second-level cache. The second-level cache has the scope of an ISessionFactory.

Persistent instances are stored in the second-level cache in a disassembled form.

cache policies – caching strategies and physical cache providers

The cache policy involves setting the following:

Whether the second-level cache is enabled
The NHibernate concurrency strategy
The cache expiration policies (such as expiration or priority)

The cache is usually useful only for read-mostly classes.

Built-in Concurrency Strategies

A concurrency strategy is a mediator; it’s responsible for storing items of data in the cache and retrieving them from the cache. … it also defines the transaction isolation semantics for that particular item.

Three built-in concurrency strategies are available, representing decreasing levels of strictness in terms of transaction isolation:

Read-write – Maintains read-committed isolation, using a timestamping mechanism.
Nonstrict-read-write – Makes no guarantee of consistency between the cache and the database.
Read-only – Suitable for data that never changes. Use it for reference data only.

Choosing a Cache Provider

Hashtable – Not intended for production use. It only caches in memory and can be set using its provider: NHibernate.Cache.HashtableCacheProvider (available in NHibernate.dll)
SysCache – Relies on System.Web.Caching.Cache for the underlying implementation. NHibernate.Caches.SysCache.SysCacheProvider in NHibernate.Caches.SysCache.dll. This provider should only be used with ASP.NET web applications.
Prevalence – Makes it possible to use the underlying Bamboo.Prevalence implementation as a cache provider. NHibernate.Caches.Prevalence.PrevalenceCacheProvider in NHibernate.Caches.Prevalence.dll.

Setting up caching therefore involves two steps

Look at the mapping files for your persistent classes, and decide which cache-concurrency strategy you’d like to use for each class and each association.
Enable your preferred cache provider in the NHibernate configuration, and customize the provider-specific settings.

5.3.3 Caching in practice

A collection cache holds only the identifiers of the associated item instances. If you require the instances themselves to be cached, you must enable caching of the “Item” class.

Understanding Cache Regions

NHibernate keeps different classes/collections in different cache regions. A region is a named cache: a handle by which you can reference classes and collections in the cache-provider configuration and set the expiration policies applicable to that region.

use the NHibernate configuration property hibernate.cache.region_prefix to specify a root region name for a particular ISessionFactory.

Setting Up a Local Cache Provider

System.Web.Caching.CacheItemPriority

<?xml version="1.0" ?>
<configuration>
  <configSections>
    <section name="syscache" type="NHibernate.Caches.SysCache.SysCacheSectionHandler, NHibernate.Caches.SysCache" />
  </configSections>
  <syscache>
    <cache region="Category" expiration="36000" priority="5" />
    <cache region="Bid" expiration="300" priority="1" />
  </syscache>
</configuration>

Using a Distributed Cache

It isn’t necessarily wrong to use a purely local (non-cluster-aware) cache provider in a cluster.

MemCache - NHibernate.Caches.MemCache.MemCacheProvider in NHibernate.Caches.MemCache.dll
NCache – commercial
Microsoft Velocity – commercial

you can centralize cache configuration in hibernate.cfg.xml

For cluster cache providers, it may be better to set the NHibernate configuration option hibernate.cache.use_minimal_puts to true.

Controlling the Second-Level Cache

NHibernate loads the cache provider and starts using the second-level cache only if you have any cache declarations in your mapping files or XML configuration files. If you comment them out, the cache is disabled.

SessionFactory.Evict();
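A short sketch of programmatic eviction from the second-level cache, using the Evict and EvictCollection methods of ISessionFactory. The helper class, the method name and its parameters are hypothetical; the entity type and collection role would come from your own mappings.

```csharp
using System;
using NHibernate;

public static class SecondLevelCacheSketch
{
    // 'entityType' is a mapped class; 'collectionRole' is a mapped collection role name.
    public static void EvictExamples(ISessionFactory factory, Type entityType,
                                     object id, string collectionRole)
    {
        factory.Evict(entityType, id);           // remove one cached instance
        factory.Evict(entityType);               // remove all cached instances of the class
        factory.EvictCollection(collectionRole); // remove cached collections for that role
    }
}
```

These calls only touch the second-level cache; objects already loaded into an open session stay in that session's first-level cache.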

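This study note dragged on for more than two months before I finished it; at least I "picked it up" again. I hope I can finish the book this time (NHibernate in Action probably needs more than one read).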

posted on 2010-07-27 00:42 zhaorui
