[Repost] Benchmarking magic - Janpoem

Original: http://www.garfieldtech.com/blog/magic-benchmarks

The day is nearly upon us! Drupal 7 will open up developers to PHP 5 functionality when it is released next year. Already, there is talk of how, and if, to leverage PHP 5's object handling now that we don't need to deal with the weirdness of PHP 4's object model. Of course, because it's Drupal, our army of performance czars want to know just what the cost is for object handling, and especially advanced object magic like __get(), __call(), the ArrayAccess interface, and so forth.

So let's find out. :-)

Benchmarking methodology

The exact numbers in the following tests aren't particularly interesting. What's interesting is their relative value. All tests are run in a single script (available at the bottom of this post) on the following system:

Lenovo Thinkpad T61 on AC power
Intel Core2 Duo 2.2 GHz
2 GB RAM
Kubuntu 7.10 "Gutsy"
PHP 5.2.3

Because it's a fairly beefy system, all tests are run 2,000,000 times so that we have worthwhile numbers to compare. All times listed below are in seconds. Of course, any such tests will vary a bit between runs, and even between two tests in the same script. We're looking for overall trends here, not exact numbers, but it's important to keep in mind that micro-benchmarks are an inexact science. Also keep in mind that I don't know the internals of the PHP engine well at all, so my analysis is based on logical extrapolation, not actual knowledge of the PHP engine itself.

All tests use the following benchmarking mechanism:

<?php
error_reporting(E_ALL | E_STRICT);
define('ITERATIONS', 2000000);
...
$start = microtime(true);
for ($i=0; $i < ITERATIONS; ++$i) {
  // something here we're testing.
}
$stop = microtime(true);
echo "Test name: " . ($stop - $start) . " seconds". PHP_EOL;
?>
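
One way to package that mechanism for reuse is sketched below. This is not the script used for the numbers in this post: the function name bench() and the baseline subtraction are assumptions added for illustration, and the closure syntax needs PHP 5.3 or later (the tests here ran on 5.2.3). Going through a callable adds a call to every iteration, so the sketch times an empty closure first and subtracts it.

<?php
// Hypothetical helper, not part of the original benchmark script.
// Times $iterations calls of $body and subtracts the cost of calling
// an empty closure the same number of times.
function bench($name, $body, $iterations) {
  $empty = function () {};

  $start = microtime(true);
  for ($i = 0; $i < $iterations; ++$i) {
    $empty();
  }
  $baseline = microtime(true) - $start;

  $start = microtime(true);
  for ($i = 0; $i < $iterations; ++$i) {
    $body();
  }
  $elapsed = microtime(true) - $start;

  // Timer noise can make the difference slightly negative; clamp at zero.
  $net = max(0.0, $elapsed - $baseline);
  echo $name . ": " . $net . " seconds" . PHP_EOL;
  return $net;
}

$seconds = bench('strlen()', function () { strlen('hello'); }, 100000);
?>

The returned value is the net time in seconds, so results from several calls can be compared as ratios rather than as absolute numbers.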

Functions

For completeness, we'll test ordinary functions, too. First define a dummy function that we can call:

<?php
function foo($a) {
  return;
}
?>

And we'll call it 4 different ways:

<?php
foo(1);
$foo = 'foo';
$foo(1);
call_user_func('foo', 1);
call_user_func_array('foo', array(1));
?>

Results:

Literal function    1.218
Variable function    1.305
call_user_func()    2.734
call_user_func_array()    3.386

As others have noted before, call_user_func*() is slow: here it costs roughly 2.2x to 2.8x as much as a literal call. Unfortunately, it's also the main way to do function-level polymorphism in PHP.
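
As a rough illustration of the cheaper form: when the callback is a plain function name, a variable function gives the same run-time dispatch without going through call_user_func(). The function names render_html() and render_text() and the $renderers map are made up for this sketch.

<?php
// Hypothetical handlers; the names are for illustration only.
function render_html($text) { return '<p>' . $text . '</p>'; }
function render_text($text) { return $text . "\n"; }

$renderers = array('html' => 'render_html', 'text' => 'render_text');

// Pick the implementation at run time...
$renderer = $renderers['html'];

// ...and call it as a variable function.
$direct = $renderer('hi');

// The equivalent call_user_func() form, for comparison.
$indirect = call_user_func($renderer, 'hi');

echo $direct . PHP_EOL;
?>

Both calls return the same string; only the dispatch mechanism differs.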

Methods

Moving on to object-oriented code, our main interest here, let's look at three different ways of calling methods: directly, via __call(), and via __call() with a generic pass-through using call_user_func_array():

<?php
class TestCall {
  function normal() { return; }
  function __call($method, $args) { return; }
}
$t = new TestCall();
// ...
class TestCallSub {
  function normal() { return; }
  function bar() { return; }
  function __call($method, $args) {
    if ($method == 'foo') {
      return call_user_func_array(array($this, 'bar'), $args);
    }
    return;
  }
}
$s = new TestCallSub();
// Normal method
$t->normal();
// __call() overhead
$t->doesntExist();
// __call() overhead with generic dispatch
$s->foo();
?>

Results:

Native Method                1.095
Magic Method (__call())            3.018
Magic Method (with sub-function)    7.226

It looks like __call() is indeed not a speed demon, but not as slow as I previously thought. Rather, it is call_user_func_array() that is the real killer. Between the two of them, call_user_func_array() has more overhead than __call() does. Mixing them is a performance nightmare.
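
One possible way to avoid the combination, assuming the target method is known inside __call(), is to call it directly and pass the arguments with the ... unpacking operator. That operator needs PHP 5.6 or later, so it was not an option on the PHP 5.2.3 used for these tests, and this variant was not benchmarked here. The class name TestCallDirect is hypothetical.

<?php
// Hypothetical variant of TestCallSub; not part of the benchmark script.
class TestCallDirect {
  function bar($a = null) { return $a; }

  function __call($method, $args) {
    if ($method == 'foo') {
      // Direct call with argument unpacking instead of call_user_func_array().
      return $this->bar(...$args);
    }
    return null;
  }
}

$d = new TestCallDirect();
$result = $d->foo(42); // routed through __call() to bar()
?>

The __call() overhead itself remains; only the second layer of indirection is removed.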

Properties

PHP 5 also includes some magic overrides for properties: __get() and __set(). Let's see how expensive those are.

<?php
class TestGetSet {
  public $foo = 1;
  public function __get($var) {
    return $this->foo;
  }
  public function __set($var, $val) {
    $this->foo = $val;
  }
}
$t = new TestGetSet();
$t->foo;
$t->bar;
$t->foo = 1;
$t->bar = 1;
?>

Results:

Get Native Property        0.619
Get Magic Property (__get())    2.066
Set Native Property        0.752
Set Magic Property (__set())    2.623

Again, magic is expensive. Curiously, both __get() and __set() are just over 3x the cost of the native equivalent (about 3.3x and 3.5x, respectively) while __call() is only about 2.8x the cost.

Arrays

Another nifty feature of PHP 5 is the ArrayAccess interface, part of the Standard PHP Library. What does that cost us?

<?php
class TestArrayAccess implements ArrayAccess {
  private $properties = array();
  public $foo = 1;
  function __construct($array) {
    $this->properties = $array;
  }
  function offsetExists($offset) {
    return isset($this->properties[$offset]);
  }
  function offsetGet($offset) {
    return $this->properties[$offset];
  }
  function offsetSet($offset, $value) {
    $this->properties[$offset] = $value;
  }
  function offsetUnset($offset) {
    unset($this->properties[$offset]);
  }
}
$a = array('a' => 'A', 'b' => 'B', 'c' => 'C', 'd' => 'D');
$t = new TestArrayAccess($a);
// Get array property
$a['b'];
// Set array property
$a['b'] = 'B';
// Get object property
$t->foo;
// Set object property
$t->foo = 1;
// Get ArrayAccess property
$t['b'];
// Set ArrayAccess property
$t['b'] = 'B';
?>

Results:

Get Array Property        0.473
Set Array Property        0.655
Get Object Property        0.598
Set Object Property        0.733
Get ArrayAccess Property    2.379
Set ArrayAccess Property    3.030

So we can determine 3 things here. One, setting a variable is a bit more expensive than reading it, but not enormously so. That's not surprising. Two, arrays are very slightly faster than objects for just reading a public property directly, but again not by much and probably not enough to worry about (especially when there are plenty of more expensive operations, as we are finding). Three, the ArrayAccess interface eats your CPU.

At first that seems surprising, but consider that each array access must first detect that it's using the extra language magic, then call a method, and in our case that method is not just trivially returning as it did in the earlier tests. It's doing an array lookup and returning an actual value. Still, a 4.6x-5x increase in time feels high. It's even a bit more expensive than __get() and __set().
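
To make that dispatch visible, the sketch below counts how often array-style access lands in user-space methods. The class name CountingAccess and its $calls counter are hypothetical, and the type declarations assume PHP 8.0 or later rather than the PHP 5.2.3 used for the benchmarks.

<?php
// Hypothetical class that records each call made on its behalf.
class CountingAccess implements ArrayAccess {
  public $calls = array('get' => 0, 'set' => 0);
  private $properties = array();

  public function offsetExists(mixed $offset): bool {
    return isset($this->properties[$offset]);
  }
  public function offsetGet(mixed $offset): mixed {
    $this->calls['get']++;
    return $this->properties[$offset];
  }
  public function offsetSet(mixed $offset, mixed $value): void {
    $this->calls['set']++;
    $this->properties[$offset] = $value;
  }
  public function offsetUnset(mixed $offset): void {
    unset($this->properties[$offset]);
  }
}

$c = new CountingAccess();
$c['b'] = 'B';    // one offsetSet() call
$value = $c['b']; // one offsetGet() call
?>

Every bracket access on the object becomes a full method call, which is the cost the numbers above are measuring.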

Inheritance

What about simple inheritance? There are many ways to do polymorphism. So far we've determined that call_user_func_array() is a really lousy one from a performance perspective, and a wrapping function is going to cost an extra function call each time. What if we use inheritance for more traditional, "classic" polymorphism? Let's have a go.

<?php
class Base {
  public $test = 1;
  public function baseMethod() { return; }
  public function overrideMe() { return; }
}
class Child extends Base {
  public $child = 1;
  public function childMethod() { return; }
  public function overrideMe() { return; }
}
$b = new Base();
$c = new Child();
$b->test;
$c->test;
$c->child;
$b->baseMethod();
$c->baseMethod();
$c->childMethod();
$c->overrideMe();
?>

Results:

Get Base Property        0.617
Get Base Property from Child    0.611
Get Child Property        0.625
Get Base Method            1.185
Get Base Method from Child    1.142
Get Child Method        1.141
Override Child Method        1.124

Finally, some good news! Not really surprising news, either. When all is said and done, inheritance is basically free as far as CPU cycles go. Properties and methods of an object are inherited at creation time, not call time, so once the object is created it doesn't really matter how it was created. At least from a performance perspective, then, inheritance is not a concern.

Composition

Of course, the blogosphere has been hopping recently about how inheritance is evil and inflexible and composition is so much better and more flexible. The catch, though, is that composition does incur a run-time cost in terms of extra method calls. Let's see what that cost is.

<?php
class Used {
  public function myMethod() { return; }
}
class User {
  protected $used;
  public function __construct() {
    $this->used = new Used();
  }
  public function myMethod() { return $this->used->myMethod(); }
}
$u = new User();
$u->myMethod();
?>

Results:

Get Composed Method    2.232

Wrapping a method via composition roughly doubles the performance cost, which is exactly what we'd expect from adding one more method call to the stack. No surprises here, either. Consider that the cost of composition. At least it's cheaper than call_user_func_array(). :-)

Iterators

The last test we'll make involves iterators. SPL includes a huge collection of iterators, but we're only going to look at two of them. We'll compare iterating over a native array with an object that uses an internal iterator, using the Iterator interface, and one using an external Iterator via IteratorAggregate and ArrayIterator.

<?php
class Internal implements Iterator {
  protected $a;
  public function __construct(array $a) {
    $this->a = $a;
  }
  public function current() {
    return current($this->a);
  }
  public function key() {
    return key($this->a);
  }
  public function next() {
    return next($this->a);
  }
  public function rewind() {
    return reset($this->a);
  }
  public function valid() {
    return (current($this->a) !== FALSE);
  }
}
class External implements IteratorAggregate {
  protected $a;
  public function __construct(array $a) {
    $this->a = $a;
  }
  public function getIterator() {
    return new ArrayIterator($this->a);
  }
}
$a = array('A', 'B', 'C', 'D');
$internal = new Internal($a);
$external = new External($a);
foreach ($a as $item);
foreach ($internal as $item);
foreach ($external as $item);
?>

Results:

Iterate array            1.67
Iterate internal iterator    22.87
Iterate external iterator    6.06

Oh dear god make it stop! A trivially-simple internal iterator has a performance hit of more than an order of magnitude over a native array. An external iterator is cheaper, but still not cheap.

Let's consider why that is, though. Using the Iterator interface, we're forcing PHP to call into user-space 3-4 times per iteration. (I'm not sure of the exact internals, but foreach needs to call valid(), current(), and next() on each pass, plus key() if we're requesting it.) That's at least three method calls per iteration, not counting the behind-the-scenes engine code to make the magic work. Maybe it's not so surprising then. The external iterator is faster here because we're using the ArrayIterator object, provided by SPL and implemented entirely in C. If we used a user-space external iterator, I would expect results similar to those for the internal iterator.
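
To check which methods foreach actually invokes, the sketch below counts the calls made into a user-space iterator over a four-element array. The class name CountingIterator and its $calls counter are hypothetical, it tracks position with an integer index rather than the internal array pointer used in the benchmark class, and the return types assume PHP 8.0 or later.

<?php
// Hypothetical iterator that records every call foreach makes on it.
class CountingIterator implements Iterator {
  public $calls = array('rewind' => 0, 'valid' => 0, 'current' => 0, 'key' => 0, 'next' => 0);
  protected $a;
  protected $i = 0;

  public function __construct(array $a) { $this->a = array_values($a); }

  public function rewind(): void { $this->calls['rewind']++; $this->i = 0; }
  public function valid(): bool { $this->calls['valid']++; return $this->i < count($this->a); }
  public function current(): mixed { $this->calls['current']++; return $this->a[$this->i]; }
  public function key(): mixed { $this->calls['key']++; return $this->i; }
  public function next(): void { $this->calls['next']++; $this->i++; }
}

$it = new CountingIterator(array('A', 'B', 'C', 'D'));
foreach ($it as $item);
print_r($it->calls);
?>

For the four-element array this records one rewind(), five valid() calls (the last one ends the loop), four current() calls, four next() calls, and no key() calls, since the loop does not ask for keys.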

The moral of the story here is, as always, C is faster than PHP. The more you can do in C, the faster your code will be. (Hey, that rhymes!) If possible, use IteratorAggregate and ArrayIterator over an internal iterator. If that's not possible for some reason, say you're iterating over some external resource like a file handle or database result set, be aware that it's going to cost you.

Summary

So what have we learned? We've learned that there is no such thing as a free lunch, unless you're getting it from your parents. (Amazing how programming parallels real life, isn't it?) All of PHP 5's advanced object-oriented features have a cost, and sometimes that cost is non-trivial.

Does that mean we should avoid using them? Of course not! Magic methods, iterators, ArrayAccess, and the like make solving certain types of problems far easier and faster for the programmer. In many cases, throwing more CPU at the code is cheaper than writing more, clunkier, harder-to-maintain code. And if the advanced features are not used in critical sections of the program, you may not even notice the difference. These benchmarks should be used as guidelines only; moving your database server from the same computer to a dedicated database box will likely yield a bigger performance boost than expunging all traces of __get() from your code, and will almost certainly cost far less to do.

There's one other important observation that we haven't really mentioned. One of the big complaints about PHP 4's object model was that it was dog slow compared to procedural code. Well, whatever the truth of that, it is no longer the case. Calling a function and calling a method are virtually identical in cost, at least under modern versions of PHP. Polymorphic code can even be faster if using inheritance over composition or function-level composition (or god-forbid call_user_func_array()), although as always beware the inheritance trap. As with anything else, use wisely.

The raw data is available below, as is a graph of the results courtesy of OpenOffice.org Calc. I've left the internal iterator out of the graph as it would visually throw everything else off. The complete benchmark script used is available below as well.

Results (2 million iterations)
Operation	Seconds
Literal function	1.218
Variable function	1.305
call_user_func()	2.734
call_user_func_array()	3.386
Native Method	1.095
Magic Method (__call())	3.018
Magic Method (with sub-function)	7.226
Get Native Property	0.619
Get Magic Property (__get())	2.066
Set Native Property	0.752
Set Magic Property (__set())	2.623
Get Array Property	0.473
Set Array Property	0.655
Get Object Property	0.598
Set Object Property	0.733
Get ArrayAccess Property	2.379
Set ArrayAccess Property	3.030
Get Base Property	0.617
Get Base Property from Child	0.611
Get Child Property	0.625
Get Base Method	1.185
Get Base Method from Child	1.142
Get Child Method	1.141
Override Child Method	1.124
Get Composed Method	2.232
Iterate array	1.67
Iterate internal iterator	22.87
Iterate external iterator	6.06
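
Since the relative values matter more than the raw seconds, the short sketch below turns a few rows of the table above into ratios. The pairings and labels are my own choice for illustration; the timings are copied from the table.

<?php
// Timings copied from the results table (seconds per 2 million iterations).
// Each entry is array(feature under test, native baseline).
$pairs = array(
  '__call() vs native method'    => array(3.018, 1.095),
  '__get() vs native get'        => array(2.066, 0.619),
  '__set() vs native set'        => array(2.623, 0.752),
  'ArrayAccess get vs array get' => array(2.379, 0.473),
  'ArrayAccess set vs array set' => array(3.030, 0.655),
  'Internal iterator vs array'   => array(22.87, 1.67),
  'External iterator vs array'   => array(6.06, 1.67),
);

$ratios = array();
foreach ($pairs as $label => $pair) {
  // Ratio of the feature's time to its baseline, to one decimal place.
  $ratios[$label] = round($pair[0] / $pair[1], 1);
  echo $label . ': ' . $ratios[$label] . 'x' . PHP_EOL;
}
?>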

posted on 2009-12-12 16:35 Janpoem
