[nycphp-talk] Copy-on-write

Hans Zaunere hans at nyphp.org
Fri Oct 31 15:13:47 EST 2003



George Webb wrote:

> Hi!  This is a followup to Hans's comment about passing variables to
> functions (from 'Passing Arrays' thread, Tuesday afternoon).
> 
> 	So is it true that, even though PHP documentation says
> variables passed to functions are, by default, passed-by-value,
> really they are passed by *reference* -- BUT -- only when an actual
> change (i.e. write) is attempted on the variable is a "copy"
> made?

Yes.  And no.  Perhaps this helps:

http://www.zend.com/zend/art/ref-count.php

Basically (and I know there are some Zend Engine coders here, so feel free to correct me), the Zend Engine uses references internally: every variable in PHP land (anything starting with '$') is a reference to an underlying C data structure called a zval.  Each zval can have any number of other names in PHP land, or aliases (think symbolic links), which you create explicitly with the ampersand.  However, the Zend Engine will also create these references automatically at times, for instance to implement copy-on-write when arguments are passed into functions.  More or less, references live at the Zend Engine level; aliases live in PHP land.
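
To make the alias idea concrete, here is a minimal sketch of what is visible from PHP land.  The function names modify_copy() and modify_alias() are made up for illustration; the refcounting itself happens inside the engine and is not shown.

```php
<?php
// Plain assignment: $b behaves as an independent copy of $a.
$a = 'original';
$b = $a;
$b = 'changed';

// Explicit alias with the ampersand: $d is a second name for $c's value.
$c = 'original';
$d = &$c;
$d = 'changed';

// Default (by-value) parameter: the caller's variable is not affected.
function modify_copy($arg)
{
    $arg = 'changed';
}

// By-reference parameter: $arg is an alias of the caller's variable.
function modify_alias(&$arg)
{
    $arg = 'changed';
}

$e = 'original';
modify_copy($e);

$f = 'original';
modify_alias($f);

echo "a=$a c=$c e=$e f=$f\n";   // prints "a=original c=changed e=original f=changed"
```

Only the ampersand cases ($d and modify_alias) change the original; everything else behaves as pass-by-value, whatever the engine does underneath.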

> 	This seems like a really cool trick to try to maximize the
> convenience of duplicating a variable's data only when it is actually
> necessary to do so.  If no change is actually made to the data,
> then there is no need to duplicate it; the function can simply
> use the original variable -- because it is only *reading* it.

Exactly.

> 	Is this really how PHP 4 works?  If so, I'm somewhat impressed!

And you can illustrate this easily with a simple script that reads 10 MB of data and then passes it into a function.  If the function changes even a single byte, you'll see the memory usage of the process double (or more).  Otherwise, it stays around 10 MB, plus the overhead of PHP itself.
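
A rough sketch of such a script follows.  It assumes memory_get_usage() is available in your build, it generates the data with str_repeat() rather than reading a file, and the helper names read_only() and write_one_byte() are made up for illustration.  Exact numbers will vary by PHP version and platform.

```php
<?php
// Reads the argument without modifying it, then reports how much
// memory is in use beyond $baseline.
function read_only($data, $baseline)
{
    $length = strlen($data);          // a read: no copy is needed
    return memory_get_usage() - $baseline;
}

// Changes a single byte, which forces the engine to copy the
// string before writing to it.
function write_one_byte($data, $baseline)
{
    $data[0] = 'y';                   // a write: triggers the copy
    return memory_get_usage() - $baseline;
}

$size = 10 * 1024 * 1024;             // roughly 10 MB
$data = str_repeat('x', $size);
$baseline = memory_get_usage();       // measured with $data already allocated

$read_delta  = read_only($data, $baseline);
$write_delta = write_one_byte($data, $baseline);

printf("read-only call added: %d bytes\n", $read_delta);
printf("one-byte write added: %d bytes\n", $write_delta);
printf("caller's first byte:  %s\n", $data[0]);
```

The read-only call should add next to nothing, while the one-byte write should add about the full size of the data.  The caller's copy is untouched either way.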

> 	Does anyone know if PHP 5 will do this silent optimization
> as well?

Yes, and I believe there will be improvements to it.  Especially important will be the implementation of copy-on-write, or reference counting, for objects.  PHP 4 *does not* do this (i.e., objects are truly passed by value: they are copied).

> 	As an offshoot topic, is it really good that the documentation
> hides this important performance issue, by simply stating that
> function variables are by default pass-by-value?  I would think that,
> as the prior thread shows, users *do* need to know this.

I think this topic is essential to understand.  So essential, maybe it should be queued up as a PHundamentals?

H



