php hints

This document was shared by chx in #drupal.hu irc channel. Original was found on github, but unfortunally it's deleted, so I don't know who write it.

I've just saved it for me unpublished, but now, I'm publishing and add some style.
--------------------------------------------------------------------------

This document discusses difficult traps and pitfalls in PHP, and how to avoid, work around, or at least understand them.

If you're brand new to PHP, read [[Articles/PHP Intro]] first.

All of it! PHP is a Terrible Language! LOL!

PHP has more (and more severe) design flaws than most modern languages. Despite this, Facebook has built a 50-billion-dollar enterprise on it.

If this is going to be a problem, you should take some time and really reflect on your career choices.

usort(), uksort(), and uasort() are Slow

This family of functions is extremely slow. You should avoid them if at all possible. Instead, build an array which contains surrogate keys that are naturally sortable with a function that uses native comparison (e.g., sort(), asort(), ksort(), or natcasesort()). Sort this array instead, and use it to reorder the original array.

In our environment, you can often do this easily with array_psort(). You also need to account for i18n and locale-aware collation, which array_psort() will handle for you.

array_intersect() and array_diff() are Also Slow

These functions are much slower for even moderately large inputs than array_intersect_key() and array_diff_key(), because they can not make the assumption that their inputs are unique scalars as the key varieties can. Strongly prefer the key varieties.

array_uintersect() and array_udiff() are Definitely Slow Too

These functions have the problems of both the usort() family and the array_diff() family. Don't use them.

array_merge() in a Loop is Incredibly Slow

If you merge arrays like this:

$result = array();
foreach ($list_of_lists as $one_list) {
  $result = array_merge($result, $one_list);
}

...your program takes O(N^2) runtime because it generates N intermediate arrays. You should use array_merge_fast(). But be CAREFUL, don't use array_merge_fast() in a loop either! This is equally wrong:

$result = array();
foreach ($list_of_lists as $one_list) {
  $result = array_merge_fast(array($result, $one_list));
}

Instead you should do:
$result = array_merge_fast($list_of_lists); // Note that this is not within a loop!

isset(), empty() and Truthiness

A value is "truthy" if it evaluates to true in an if clause:

  $value = something();
  if ($value) {
    // Value is truthy.
  }

If a value is not truthy, it is "falsey". These values are falsey in PHP:

  null      // null
  0         // integer
  0.0       // float
  "0"       // string
  ""        // empty string
  false     // boolean
  array()   // empty array

Disregarding some bizarre edge cases, all other values are truthy. Note that because "0" is falsey, this sort of thing (intended to prevent users from making empty comments) is wrong in PHP:

  if ($comment_text) {
    make_comment($comment_text);
  }

This is wrong because it prevents users from making the comment "0". THIS COMMENT IS TOTALLY AWESOME AND I MAKE IT ALL THE TIME SO YOU HAD BETTER NOT BREAK IT!!! A better test is probably strlen().

In addition to truth tests with if, PHP has two special truthiness operators which look like functions but aren't: empty() and isset(). These operators help deal with undeclared variables.

In PHP, there are two major cases where you get undeclared variables -- either you directly use a variable without declaring it:

  function f() {
    if ($not_declared) {
      // ...
    }
  }

...or you index into an array with an index which may not exist:

  function f(array $mystery) {
    if ($mystery['stuff']) {
      // ...
    }
  }

When you do this, PHP issues a warning. Avoid these warnings by using empty() and isset() to do tests that are safe to apply to undeclared variables.

empty() evaluates truthiness exactly opposite of if(). isset() returns true for everything except null. This is the truth table:

VALUE if() empty() isset()
null false true false
0 false true true
0.0 false true true
"0" false true true
"" false true true
false false true true
array() false true true
EVERYTHING ELSE true false true

The value of these operators is that they accept undeclared variables and do not issue a warning. Specifically, if you try to do this you get a warning:

  if ($not_previously_declared)         // PHP Notice:  Undefined variable!

But these are fine:

  if (empty($not_previously_declared))  // No notice, returns true.
  if (isset($not_previously_declared))  // No notice, returns false.

So, isset() really means is_declared_and_is_set_to_something_other_than_null(). empty() really means is_falsey_or_is_not_declared(). Thus:

  • If a variable is known to exist, test falsiness with if (!$v), not empty(). In particular, test for empty arrays with if (!$array). There is no reason to ever use empty() on a declared variable.
  • When you use isset() on an array key, like isset($array['key']), it will evaluate to "false" if the key exists but has the value null! Test for index existence with array_key_exists().

Put another way, use isset() if you want to type "if ($value !== null)" but are testing something that may not be declared. Use empty() if you want to type "if (!$value)" but you are testing something that may not be declared.

foreach() Does Not Create Scope

Variables survive outside of the scope of foreach(). More problematically, references survive outside of the scope of foreach(). This code mutates $array because the reference leaks from the first loop to the second:

  $array = range(1, 3);
  echo implode(',', $array); // Outputs '1,2,3'
  foreach ($array as &$value) {}
  echo implode(',', $array); // Outputs '1,2,3'
  foreach ($array as $value) {}
  echo implode(',', $array); // Outputs '1,2,2'

Avoid using foreach-by-reference. If you do opt to use it, unset the reference after the loop:

  foreach ($array as &$value) {
    // ...
  }
  unset($value);

unserialize() is Incredibly Slow on Large Datasets

The performance of unserialize() is nonlinear in the number of zvals you unserialize, roughly O(N^2).

zvals approximate time
10000 5ms
100000 85ms
1000000 8,000ms
10000000 72 billion years

var_export() Loves to Kill Babies

If you try to var_export() an object that contains recursive references, your program will terminate. You have no chance to stop this from happening. Avoid var_export() unless you are certain you have only simple data. You can use print_r() or var_dump() to display complex variables.

json_encode() Gives Up Very Easily

PHP's json_encode() completely discards data if it has an invalid UTF-8 subsequence. Use fb_json_encode() (and fb_json_decode()) to avoid these problems.

call_user_func() Breaks References

If you use call_use_func() to invoke a function which takes parameters by reference, the variables you pass in will have their references broken and will emerge unmodified. That is, if you have a function that takes references:

 
function add_one(&$v) {
  $v++;
}

...and you call it with call_user_func():

  $x = 41;
  call_user_func('add_one', $x);

...$x will not be modified. The solution is to use call_user_func_array() and wrap the reference in an array:

  $x = 41;
  call_user_func_array(
    'add_one',
    array(&$x)); // Note '&$x'!

This will work as expected.

You Can't Throw From __toString()

If you throw from __toString(), your program will terminate uselessly and you won't get the exception.

An Object Can Have Any Scalar as a Property

Object properties are not limited to legal variable names:

  $property = '!@#$%^&*()';
  $obj->$property = 'zebra';
  echo $obj->$property;       // Outputs 'zebra'.

So, don't make assumptions about property names.

There is an (object) Cast

You can cast a dictionary into an object.

 
  $obj = (object)array('flavor' => 'coconut');
  echo $obj->flavor;      // Outputs 'coconut'.
  echo get_class($obj);   // Outputs 'stdClass'.

This is occasionally useful, mostly to force an object to become a Javascript dictionary (vs a list) when passed to fb_json_encode().

$arr['42'] refers to the same value as $arr[42]

So:

$arr = array();
$arr[42] = 'spoo';
echo $arr['42'];  // outputs "spoo"
echo $arr[' 42'];  // note leading space; "undefined index"

Do you really, really want to put an integer-like string key in an array? All is not lost! You can do it with an array cast, but good luck getting that value back.

$obj = new stdClass();
$zero = 0;
$obj->$zero = 'blub';
$arr = (array)$obj;
var_export($arr);  // outputs "array( '0' => 'blub' )". BUT:
echo $arr['0'];    // undefined index, fool!

There is a Builtin __PHP_Incomplete_Class Which Answers to No Master

See _warn_incomplete_object().

PHP Leaks Like a Sieve

Everything in PHP leaks a lot of memory. "Everything" does NOT mean "some rare library functions which you can avoid if you are careful", it means "basic array operations" and probably "comments". Do not build long-running PHP services.

Invoking "new" With an Argument Vector is Really Hard

If you have some $className and some $argv of constructor arguments and you want to do this:

  new $className($argv[0], $argv[1], ...);

...you'll probably invent a very interesting, very novel solution that is very wrong. Use newv().

Equality is not Transitive

This isn't terribly surprising since equality isn't transitive in a lot of languages, but the == operator is not transitive:

  $a = ''; $b = 0; $c = '0a';
  $a == $b; // true
  $b == $c; // true
  $c == $a; // false!

When either operand is an integer, the other operand is cast to an integer before comparison. Avoid this and similar pitfalls by using the === operator, which is transitive.

All 676 Letters in the Alphabet

This doesn't do what you'd expect it to do in C:

  for ($c = 'a'; $c <= 'z'; $c++) {
    // ...
  }

This is because the successor to 'z' is 'aa', which is "less than" 'z'. The loop will run for ~700 iterations until it reaches 'zz' and terminates. That is, $c will take on these values:

  a
  b
  ...
  y
  z
  aa // loop continues because 'aa' <= 'z'
  ab
  ...
  mf
  mg
  ...
  zw
  zx
  zy
  zz // loop now terminates because 'zz' > 'z'

Instead, use this loop:

  foreach (range('a', 'z') as $c) {
    // ...
  }

Ternary operator is left-associative

Unlike in C or Java, the ternary operator associates from left to right, not from right to left. So for example, the following code will not do what you want:

  $num = 0;
  $text = ($num === 0) ? 'zero' : ($num === 1) ? 'one' : 'many';
  // $text is now 'one' because the above call is equivalent to this:
  // $text = (($num === 0) ? 'zero' : ($num === 1)) ? 'one' : 'many';

Hozzászólás

A mező tartalma nem nyilvános.
  • Internal paths in double quotes, written as "internal:node/99", for example, are replaced with the appropriate absolute URL or relative path.
  • Engedélyezett HTML elemek: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd> <del> <img>
  • A webcímek és email címek automatikusan linkekké alakulnak.
  • A sorokat és bekezdéseket a rendszer automatikusan felismeri.
  • Engedélyezett HTML elemek: <a> <blockquote> <br> <cite> <code> <dd> <del> <div> <dl> <dt> <em> <li> <ol> <p> <span> <strong> <ul>
  • You can enable syntax highlighting of source code with the following tags: <code>, <blockcode>, <bash>, <c>, <cpp>, <drupal5>, <drupal6>, <java>, <javascript>, <mysql>, <php>, <python>, <ruby>, <sql>. The supported tag styles are: <foo>, [foo].
  • Minden email cím át lesz alakítva ember által olvasható módon, vagy (ha a JavaScript engedélyezett) ki lesz cserélve kattintható, de biztonságos hivatkozásra.
By submitting this form, you accept the Mollom privacy policy.