存储过程是罪恶

关于存储过程我一直认为应该仅仅在必须的时候采用,但是很多人不这么看.我发现本文表达了我的心声,不如用它来传播我的想法更好.

Stored Procedures are EVIL

存储过程是罪恶

By Tony Marston

By Tony Marston

3rd September 2006

3rd September 2006

A lot of developers are taught to use database stored procedures, triggers and database constraints at every possible opportunity, and they cannot understand why an old dinosaur like me should choose to take an opposite view. The reason can be summed up quite simply:

很多开发者被教导使用存储过程触发器和数据库约束,只要可能都应该用,但是他们不能理解一个老家伙(比如我)的看法是相反的.原因简单如下:

You only know what you have been taught, whereas I know what I have learned.
你只知道你被教的,而我知道我学过的
I was weaned on file systems and databases which did not have any facilities for stored procedures and triggers, so I learned how to build applications without them. When such facilities became available my colleagues and I still never used them for practical reasons:
我工作于没有任何存储过程和触发器的数据库和文件系统,因此我掌握了如何在没有这些服务的情况下构建应用的方法,当有了这样的服务后,我和同事依然不会用它,因为:
It meant learning a new language, and we didn't have the time.

这意味着需要学习新的语言,我没有这个时间

It meant taking longer to implement and maintain, therefore cost more to develop. This is an important consideration for a software house which can only win business by providing cost-effective solutions.
这意味着需要很久时间实现和维护,因此需要更多时间开发.对于一个软件生产者,需要提供成本有效的解决方案来说,这是非常重要的
There was no advantage in doing so, so why bother?
这样做没有什么好处,何必庸人自扰?
Our golden rule was:

我的黄金法则是:

Use stored procedures and triggers only when it is an absolutely necessity.
只有在绝对必要的情况下才使用存储过程和触发器
This is in total conflict with the attitude of today's wet-behind-the-ears tenderfoot greenhorns who seems to think:
和这个法则完全相左的是:
Use stored procedures and triggers at every possible opportunity simply because you can.

只要有可能就用存储过程和触发器,原因就在于你会做

Amongst the arguments in favour of stored procedures are:
挑几个和存储过程有关的争论:
Stored procedures are not as brittle as dynamic SQL
存储过程不像动态sql那么脆弱
Some people argue that putting ad-hoc SQL in your business layer (BL) code is not that good. Agreed, but who said that the only alternative is stored procedures? Why not have a DAL that generates the SQL query at runtime based on information passed to it by the BL? It is correct to say that small changes to the database can have severe impacts on the application. However, changes to a relational model will always have an impact on the application that targets that model: add a non-nullable column to a table and you will see what I mean. You can use stored procedures or ad-hoc queries, you have to change the calling code to make sure that column gets a value when a new row is inserted. For Ad-hoc queries, you change the query, and you're set. For stored procedures, you have to change the signature of the stored procedure, since the INSERT/UPDATE procs have to receive a value for the new column. This can break other code targeting the stored procedure as well, which is a severe maintenance issue. A component which generates the SQL on the fly at runtime doesn't suffer from this: it will for example receive an entity which has to be saved to the database, that entity contains the new field, the SQL is generated and the entity is saved. No maintenance problems. With a stored procedure this wouldn't be possible.
有些人辩解说把ad-hoc sql(嵌入代码中的sql)放到那些业务层内是不好的.同意,可是,谁说ad-hoc sql是存储过程的唯一替代方案?为什么不采用DAL方法,根据运行时传递到业务层的信息来动态构建sql?这样做的化,数据库的小改动会导致对应用产生几个影响,这样说是正确的.然而,改变关系模型总是会影响到应用程序:添加一个非空列到表内,你就能明白我的意思.你可以使用存储过程或者ad-hoc 查询,你必须修改调用代码以便保证在插入新行的时候列有一个值.对应ad-hoc 查询,你修改查询就可以搞定;对于存储过程会导致几个维护问题.一个用来动态生成sql的组件就不会因此受困:它会接受一个要保存到数据库的实体,这个实体包含了新的字段,sql自动生成,然后实体会被保存.没有维护问题.采用存储过程,那就不可能没有问题了.
Stored procedures are more secure
存储过程更加安全
This is a common argument that many people echo without realising that it became defunct when role-based security was made available. A good DBA defines user-roles in the database, and users are added to those roles and rights are defined per role, not per user. This way, it is easy to control which users can insert / update and which users can for example select or delete or have access to views in an easy way.
持有这样看法的人没有认识到,当基于角色的安全可以使用后,这样的想法就应该灭绝了.好的 ,dba定义数据库的角色,让用户关联角色,而不是直接对用户进行授权,这样,控制用户插入,更新,选择,删除,或者视图访问就是很容易的事情了.
With a view it is possible to control which data is accessed on a column basis or row basis. This means that if you want user U to select only 2 or so columns from a table, you can give that user access to a view, not the underlying table. The same goes for rows in one or more tables. Create a view which shows those rows, filtering out others. Give access rights to the view, not the table, obviously using user-roles. This way you can limit access to sensitive data without having to compromise your programming model because you have to move to stored procedures.

采用视图可以控制数据到列和行的权限.这意味着如果你想用户U只能查询某个表的2个列的话,你可以给用户访问一个视图的权限,而不是表本身.对于访问行也是一样的.创建一个视图仅仅显示这些行,过滤掉其他,这样就行了.使用角色给用户赋予访问视图的权限,而不是访问表的权限.这样你就不必为了限制访问敏感数据而必须切换编程模型到存储过程了.

It is also said that stored procedures are more secure because they prevent SQL injection attacks. This argument is false for the simple reason that it is possible to have a stored procedure which concatenates strings together and therefore open itself up to sql injection attacks (generally seen in systems which use procedures and have to offer some sort of general search routine), while the use of parameterized queries removes this vulnerability as no value can end up as being part of the actually query string.
也有人说因为存储过程阻断了sql装入而比较安全.其实这是错误的,因为存储过程内部会采用字符串连接,因此这个问题依然存在(通常是使用存储过程提供通用的查询例程),唯有使用参数化的查询,并且不把它作为最后构造的字符串的一部分才能够去掉这个弱点
Stored procedures are more efficient
存储过程更加有效率
The execution of SQL statements in stored procedures may have been faster than with dynamic SQL in the early days of database systems, but that advantage has all but disappeared in the current versions. In some cases a stored procedure may even be slower than dynamic SQL, so this argument is as dead as a Dodo.
在早年存储过程的运行效率可能比动态sql更快,到现如今这个效果已经消失了,某些情况下存储过程比起动态sql会更慢,因此这个争论已经死翘翘了.
Performance should not be the first question. My belief is that most of the time you should focus on writing maintainable code. Then use a profiler to identify hot spots and then replace only those hot spots with faster but less clear code. The main reason to do this is because in most systems only a very small proportion of the code is actually performance critical, and it's much easier to improve the performance of well factored maintainable code.
性能不应该是首要的问题.我的信仰是,大部分情况下你应该集中精力于可维护的代码.采用性能评估工具验证性能热点,然后用更快(不那么清晰)的代码来替代.关键在于大部分系统都只有少数的小模块是需要更高性能的.在很好维护的代码中要解决这个问题是小菜一碟.
While stored procedures may run faster, they take longer to build, test, debug and maintain, therefore this extra speed comes at a price. If the same function can be performed inside application code at an acceptable speed, what is the advantage of spending more money to make it run faster at a more-than-acceptable speed? It is OK to use stored procedures when you absolutely need a performance gain, but until then they're nothing but premature optimization.

存储过程可能运行的更快,但是它的代价是需要花费更多的时间去构建,测试,调试和维护,因此这样的额外的速度是需要付出的.如果同样的函数可以可接受的速度运行,花费很多时间去让它更加可以接受是值得的吗?为了你绝对需要的性能提升而采用存储过程是可以的,但是请等到绝对必要的时候再做吧.

The company has paid for them, so why not use them?
公司花了钱,干嘛不用?
A similar argument is that by not using what the company has paid for, you are effectively wasting the company's money. I'm sorry, but using something because it's there is just not good enough. If I can achieve something inside my application with application code, then I must be given a very good reason to move it out of my application and into the database. Believe it or not there are costs involved in moving logic from one place to another, and those costs must be offset by measurable benefits.

类似的说法是如果公司掏钱了,你不用,那么就是一种浪费.但是我不这么看,就是因为它在那里就要用是很扯淡的.如果我可以用代码来获得同样的特性,那么必须要有很好的原因,我才会把代码从应用程序转移到数据库.不管你信不信,反正我信了:从代码中移动业务逻辑到数据库是需要成本的,这些成本必须有可以度量的价值才行.

Application code or database code - it's still code, isn't it?
应用代码或者数据库代码-还不都是代码吗?
No it's not. Application code is built using a programming language whereas SQL is nothing more than a data manipulation language, and is therefore very limited in its scope. There is absolutely nothing that can be done in a stored procedure that cannot also be done in application code, but the converse is not true.
不是啊.应用代码是用编程语言来做的,sql却不过是数据管理语言,非常受限于它的作用范围,没有什么可以在存储过程做的却不能在编程语言做,反之则不成立
Amongst the arguments against stored procedures are:

反对存储过程的观点:

It mangles the 3 Tier structure

它破坏了3层架构

Instead of having a structure which separates concerns in a tried and trusted way - GUI, business logic and storage - you now have logic intermingling with storage, and logic on multiple tiers within the architecture. This causes potential headaches down the road if that logic has to change.
和清晰划分的UI,业务逻辑,存储相比,存储过程的使用会导致存储和逻辑混一,业务逻辑存在于多层次内,这会导致潜在的头痛,让业务逻辑的修改变得困难
Stored procedures are a maintenance problem
存储过程会导致维护问题
The reason for this is that stored procedures form an API by themselves. Changing an API is not that good, it will break a lot of code in some situations. Adding new functionality or new procedures is the "best" way to extend an existing API. A set of stored procedures is no different. This means that when a table changes, or behaviour of a stored procedure changes and it requires a new parameter, a new stored procedure has to be added. This might sound like a minor problem but it isn't, especially when your system is already large and has run for some time. Every system developed runs the risk of becoming a legacy system that has to be maintained for several years. This takes a lot of time, because the communication between the developer(s) who maintain/write the stored procedures and the developer(s) who write the DAL/BL code has to be intense: a new stored procedure will be saved fine, however it will not be called correctly until the DAL code is altered. When you have Dynamic SQL in your BL at your hands, it's not a problem. You change the code there, create a different filter, whatever you like and whatever fits the functionality to implement.
原因是存储过程通过它们自己建立了api.修改api是不好的,这会导致很多代码跟着修改.添加新的存储过程往往是扩展现存api的更好的方法.存储过程集合和api并不不同.这意味着当表修改,或者存储过程行为改变时,它需要新的参数,新的存储过程不得不添加进来.看起来是小问题,其实不然,尤其是你的系统很大,或者运行了些时日的情况下尤甚.每个开发出来的系统都可能会变成遗留系统,不得不维护几年的时间.这会花费很多时间,因为你的应用开发者和数据库dba之间的沟通必定是很麻烦的.新存储过程会更好的工作,因为它直到dal代码修改后才会被调用.如果你采用的是动态sql就不是问题.你就修改那里的代码,创建一个过滤器即可
Microsoft also believes stored procedures are over: it's next generation business framework MBF is based on Objectspaces, which generates SQL on the fly.
微软也相信存储过程是过时的了:它的下一代业务框架MBF是基于Objectspaces的,它运行时生成sql
Stored procedures take longer to test
存储过程的测试花费更多的时间
Business logic in stored procedures is more work to test than the corresponding logic in the application. Referential integrity will often force you to setup a lot of other data just to be able to insert the data you need for a test (unless you're working in a legacy database without any foreign key constraints). Stored procedures are inherently procedural in nature, and hence harder to create isolated tests and prone to code duplication. Another consideration, and this matters a great deal in a sizable application, is that any automated test that hits the database is slower than a test that runs inside of the application. Slow tests lead to longer feedback cycles.

业务逻辑放到存储过程中和放到代码中相比,前者需要花费更多的工作去完成测试。参照完整特征经常逼着你配置很多其他的数据,只是为了能够插入你需要的数据(除非你工作在没有外键约束的遗留数据库系统内)。存储过程天然就是过程化的,因此很难创建隔离的测试,并且非常倾向于导致代码重复。另外,这会大大的影响可伸缩的应用程序,和测试应用程序代码相比,测试数据库代码总是非常缓慢的。

BL in stored procedures does not scale

业务层放到存储过程会无法达到可伸缩性

If all the business logic is held in the database instead of the application then the database becomes the bottleneck. Once the load starts increasing the performance starts dropping. With business logic in the application it is easy to scale up simply by adding another processor or two, but that option is not readily available if all that logic is held in the database.

如果业务逻辑放到数据库内,而不是应用代码内,那么数据库将会成为瓶颈。一旦负载增大就可能会掉线。在应用代码内的业务逻辑更加容易通过添加一两个cpu来提升性能,而在存储过程内加cpu常常不管用。

If you have a system with 100's of distributed databases it is far more difficult to keep all those stored procedures and triggers synchronized than it is to keep the application code synchronized.

如果你的系统基于分布式数据库,保持全部存储过程同步就会比应用代码同步要难得动。

Stored procedures are not customisable

存储过程不是可定制的

This is a big issue if you want an application where the customer can insert their own business logic, or where different logic is required by different customers. Achieving this with application code is a piece of cake, but with database logic it is a can of worms.

这是一个大问题,如果你想要一个应用可以被客户插入自己的业务逻辑,或者对不同的客户需要不同的业务逻辑。在应用代码中做到这一点是小菜一碟,而数据库逻辑做同样的事情就像一罐蠕虫。

Database triggers are hidden from the application

数据库触发器对应用程序不可见

A big problem with database triggers is that the application does not know that they exist, therefore does not know whether they have run or not. This became a serious issue in one application (not written by me) which I was maintaining. A new DBA who was not aware of the existence of all these triggers did something which deactivated every trigger on the main database. The triggers were still there, they had not been deleted, but they had been turned off so did not fire and do what they were supposed to do. This mistake took several hours to spot and several days to fix.

应用程序是感知不到触发器的存在的,因此也不知道他们运行了还是没有,这是一个大问题。在一个我维护的应用(不是我写的)这成为一个严重的问题。一个新的dba不知道这些触发器是存在的 ,然后做了某些事停止了它们,不是被删除,而是被关闭,因此不能被触发,有些要做的事情没有做。这个错误持续了几个小时,然后花费了几天去修复。

Version Control

版本控制

It is easy to control all changes to application code by running it through a proper version control system, but those facilities do not exist for stored procedures and triggers. How much damage could be caused if a stored procedure were to get out of sync with the application code? How easy is it to check that the application is running with the correct versions? How much more difficult would it be if the application you were supporting was running on a remote site with nothing more than a dial-up connection?

想要管理应用程序代码简单不过,这样的系统很多。但是对于存储过程和触发器来说,这样的系统并不存在。如果存储过程代码和应用程序代码失去了同步会如何?要知道应用版本是否正确是多么的容易?如果你维护的应用所在的服务器只有一个拨号连接是多么的麻烦?

This is a reason why some teams avoid stored procedures like the plague - it eliminates an area of potentially disastrous screw-ups.
这就是为什么一些团队如同躲避瘟疫一样躲避存储过程 - 这样就避开了潜在的,危险的区域
Vendor lock-in

供应商锁定

You may think that this is not a problem if you build and maintain the applications for a single company where a change in database vendor is highly unlikely, but what happens should the company decide that their DBMS is no longer flavour of the month and they want to change to a different DBMS? This may be due to several factors, such as spiraling costs or poor performance, but when it happens you will find that a lot of code will have to be rewritten. Porting the data will be one exercise, but porting the stored procedures and triggers will be something else entirely. Now, if all that logic were held inside the application, how much simpler would it be?

你或许认为,如果你构建和维护的应用程序就是给一个单独的客户开发的,更换数据库是非常的不可能的,因此这不是什么问题。但是如果这个客户决定废弃他们的数据库,然后换一个不同的,那会发生什么事情?也也许就是因为几个要素,比如成本或者可怜的性能,如果这个事情发生了,就意味着大量的代码需要被改写。或许迁移数据还比较简单,但是存储过程和触发器可就不是这样了。现在如果所有的业务逻辑都在应用中,那得是多么简单?

Believe it or not there are people out there who write applications which are database-independent for the simple reason that the applications may be used by many different companies, and those many companies may not all use the same DBMS. Those that do use the same DBMS may not be using the same version, and stored procedures written for one version may not be compatible with another.

不管信不信,有不少人编写数据库独立的应用的唯一原因就是应用会被很多客户使用,这些客户可能使用不同的dbms。使用相同的dbms的客户可能使用不同的版本,为一个版本编写的存储过程可能不会和另一个版本并不兼容。

As far as I am concerned the use of stored procedures, database triggers and foreign key restraints is OPTIONAL, not MANDATORY, therefore I am free to exercise my option not to use them. That is my choice, and the software that I produce does not suffer in any way, therefore it cannot be defined as the wrong choice.

这样的话,我认为存储过程,触发器,外键是可选的 ,而不是强迫的,因此我可以自由的选择不去用它们。这是我的选择,我构建的软件不会在这些方面搞砸,因为不能以错误的选择来定义。

The web application framework that I have built using PHP does not use stored procedures, database triggers or foreign key constraints, yet it does not suffer from any lack of functionality. This is possible simply because I can do everything I want inside my application where it is instantly accessible and customisable. To those of you who instantly jump to the (wrong) conclusion that this must mean that I have to write a huge amount of duplicated SQL statements my answer is simple - I don't write any SQL statements at all, they are all generated dynamically at runtime. This is all due to the framework being built using the 3 Tier Architecture which has a clear separation of concerns:

我使用php构建的web开发框架不使用存储过程,触发器和外键,然而不会感到缺乏功能特性的痛苦。这样做就非常简单,因为我可以在应用中做任何事,可以随时访问和定制。对于那些立刻下(错误的)结论的人,这可能意味着我需要写大量的重复sql语句。对于这样的看法,我的回答非常简单 - 我不写如何sql语句,它们都是动态生成的。可以达到这样的效果,是因为我构建的框架使用了3层架构,清晰的隔离了几个层面:

There is a separate object in the Business Layer for each database table. This is where all business rules are applied as data passes from the Presentation Layer (UI), through the Business Layer to the Data Access Layer, and back again. The Business Layer does not have any direct communication with the database - this is all handled by the Data Access Layer.

每一个表对于业务层的一个独立对象。当数据从表示层(UI)传递过来的时候,全部的业务规则都在这里执行,通过业务层到数据访问层,然后再倒过来。业务层和数据库没有任何通讯-这些有数据访问层处理。

There is a single object in the Data Access Layer known as the Data Access Object (DAO). The DAO receives a request from the Business Layer and dynamically constructs and executes the SQL query string to satisfy that request. This implementation means that I can easily switch to another DBMS simply by switching to another DAO, and without having to change a single line of code in any Business Layer object. I currently have a separate DAO for MySQL, PostgreSQL and Oracle.

在数据访问层有一个单一的对象(DAO)。DAO 接受到从业务层来的请求,然后动态构建和执行sql查询字符串,满足这样的请求。这样的实现意味着我可以通过简单的更换到另一个DAO来支持另外一个dbms,而不必修改一行业务层的代码。我现在有mysql, PostgreSQL and Oracle的独立的DAO。

Referential integrity is also handled by standard code within the framework and requires no additional coding from any developer whatsoever. It uses information which is exported from the Data Dictionary which tells it what to do with every relationship, and the standard code in the framework simply performs the relevant processing. The advantage of this approach is that it is easy to amend or even turn off any of these rules at runtime, which makes the application infinitely more flexible.
参考完整性也有框架的标准代码来实现,不需要开发者一点点的额外代码。它使用一个数据词典,这个数据词典可以告诉它每一个关系,它简单的执行相关处理。这方法的好处是容易在运行时修改甚至关闭规则,从而让应用可以更加柔性。
All changes made to the database can be logged without using a single database trigger. How? By adding extra code into the DAO to write all relevant details out to the AUDIT database. This functionality is totally transparent to all the objects in the Business Layer, and they do not need any extra code to make it work.

对数据库的所有修改可以被记录,而不需要哪怕一个数据库触发器。怎么做到的?通过添加一些额外的代码到DAO就可以写下相关的细节到审计数据库内。这个特征对所有的业务层代码都是透明的。它们不需要一点点额外的代码就可以工作了。

References

参考文献

Stored procedures are bad, m'kay?

One, two, three, or n tiers? (page 3, 7th paragraph)

Domain Logic and SQL

Business Logic In Stored Procedures

Why I do not use Stored Procedures

Problems with using stored procedures

Stored Procedures v Parameterized Queries

The Myth Of Data Integrity

I inherited a database which contains sprocs (and a lot of other BL)

Good and Evil in the Garden of Stored Procedures

Attitudes towards Stored Procedures

Who needs Stored Procedures, anyways?

Stored Procedures vs ad-hoc SQL

The Worst Possible Way to use a Stored Procedure
posted @ 2014-03-13 09:57  天涯海客  阅读(503)  评论(4编辑  收藏  举报