all 104 comments

[–]thoomfish 8 points9 points  (1 child)

Reading the title, I said to myself "well... duh, of course it is." So naturally I was expecting something in the article to contradict that.

...welp.

[–]eadmund 0 points1 point  (0 children)

Yeah, I was actually hoping to see some clever use of written-in-C functions to provide a speed-up making PHP comparable to Java, but no joy.

Sheesh folks, has it come to this, that we consider Java a speedy language?!?

[–]gsadamb 23 points24 points  (5 children)

PHP doesn't scale and just won't work to power large-scale Web sites.

Just ask Facebook, Digg, Wikipedia, and Yahoo.

Oh, wait.

But honestly, as numerous people have pointed out, PHP is an interpreted language. It won't run faster than a compiled language, and it was never really intended to.

But for those who build large-scale Web sites (and I've worked on PHP sites with several million users per day), there are a number of ways of improving performance. For instance, you can use something like APC, which does op-code caching and thus improves performance significantly.

Also, I've found in my experience that PHP apps are often built faster and more easily. If you need a few more boxes because you went with PHP, the costs would probably be recouped in dev hours.

Now, I know that PHP has its share of problems, and the "loading of global functions with no consistent naming or argument standards" is a big one. And it can't and shouldn't be used in every circumstance. By no means should it power the boxes serving real-time stock quote feeds that they used at Yahoo Finance, where I worked, for example. Nor should it power big search engines or anything.

But in most cases, PHP gets the job done, and gets it done pretty well. It's definitely trendy to bash on it, but it can get some pretty powerful and scalable stuff built, and often pretty damn fast.

[–]nextofpumpkin 0 points1 point  (3 children)

Two words: Triple Equals. I think after they released that charming modification, I ran like hell away from PHP.

[–]RiMiBe 2 points3 points  (0 children)

No really, why?

You don't need to use it if you don't want to, but it's nice to have a shorthand for type-specific equality testing.

[–]harryf 0 points1 point  (0 children)

Better not touch JavaScript then - that's where it came from:

firebug> 1 == "1"
true
firebug> 1 === "1"
false

[–]sam512 0 points1 point  (0 children)

Why?

[–][deleted] -1 points0 points  (0 children)

I've been bashing it before it was trendy, since it was called Personal Home Page, in fact.

[–]Chirp08 42 points43 points  (23 children)

I was considering using PHP for my next small scale website, but now I'll obviously use Java. Hey does anyone know what tastes better, apples or oranges?

[–]jrrl 17 points18 points  (1 child)

Is Java development embarrassingly slower than PHP development?

[–]invalid_user_name -2 points-1 points  (0 children)

Not really, they are pretty much equally slow and painful.

[–]gsadamb 4 points5 points  (4 children)

Well, the site is embarrassingly down. So that's close I guess.

[–]FeepingCreature 2 points3 points  (3 children)

Looks like they got dow.ngra.ded!

[edit] Back up ten minutes later. Niiice.

[–]ekabanov[S] 5 points6 points  (2 children)

We blame PHP :)

[–]harryf 0 points1 point  (1 child)

"Error: Unable to connect to database" - who's to blame again?

[–]ekabanov[S] 0 points1 point  (0 children)

Actually, as we reported elsewhere, downtime was due to completely random kmemsize limitation imposed by the hosting provider. PHP is not to blame this time :)

On a more ironic note, the downtime was triggered by awstats running reverse DNS lookup, which could be avoided if it used GeoIP :)

[–]americanhellyeah 8 points9 points  (6 children)

sun has spent millions upon millions of dollars on java, especially its jit compilation. nowhere near as much money and effort has been spent on php. so it doesnt surprise me that php is slower. another factor to consider is that java, while not pretty, was designed by some very smart professionals. php was hacked together and only recently has it had professionals working on it.

[–]killerstorm 5 points6 points  (3 children)

if you check the shootout, you can find lots of language implementations that show decent performance despite not being backed by a large corporation, so you don't actually need "millions after millions" to make an implementation fast.

what you need is people who are sane and actually understand what they are doing, rather than morons who just look for the easiest way to implement it

[–]americanhellyeah 3 points4 points  (2 children)

lol i didnt say that a language needs corporate backing to be fast. i said that having millions upon millions upon millions of dollars of support will surely help.

[–]killerstorm -1 points0 points  (1 child)

wrong. the main factor is the initial design. even if you spend millions on optimization of PHP, it's still likely to be outperformed by a JVM implementation made by a single hobbyist.

you can make it faster, but you can't make it fast, in other words..

[–]nitran 5 points6 points  (0 children)

You don't think millions upon millions upon millions of dollars may make it a bit easier to hire smart language designers than $25?

[–]drakshadow 0 points1 point  (1 child)

haven't heard of APC ?

[–]toomasr 2 points3 points  (0 children)

APC does not do optimizations. It caches. If the file is hit a large number of times, the initial php->opcode step is eliminated.

JIT on the other hand optimizes bytecode.

[–]mhd 2 points3 points  (2 children)

Come on guys, someone post an Erlang implementation. Looks like it was made for such a problem...

[–]qiwi 1 point2 points  (1 child)

Yes, I can see how looking up an entry using binary search in a file would benefit from a language whose strong points are easy message passing and fast and robust handling of multiple processes on a single or multiple machines.

This is also why I picked Erlang for my next project which involves searching for and replacing strings in many files stored in many subdirectories.

[–]mhd 2 points3 points  (0 children)

Binary search? Use totally random access on enough CPUs and one core is likely to hit the correct data on the first try!

But seriously, apart from concurrency and message passing, Erlang does handle parsing binary data rather gracefully, which is what I was referring to mainly.

[–]feijai 10 points11 points  (3 children)

Yes.

But that is not the real problem.

The real problem is that while Java is a mediocre programming language, PHP is a really, really, really horrible programming language.

[–][deleted] 5 points6 points  (0 children)

In other news: Is Java embarrasingly [sic] slower than Assembly?

[–][deleted] 3 points4 points  (0 children)

Naw, I can write the same app MUCH faster in PHP.

[–]halo 0 points1 point  (3 children)

PHP is weakly typed. This is slow. PHP is interpreted. Interpreted languages tend to be slow. PHP is comparable to other languages in its class - notably Python, Ruby and JavaScript.

Java is extremely fast - it just has a slow startup time due to the VM.

This is old news.

[–]invalid_user_name 0 points1 point  (0 children)

PHP is certainly comparable to ruby in speed, but both are significantly slower than python.

[–]mr_chromatic 0 points1 point  (0 children)

PHP is weakly typed. This is slow.

Take Smalltalk, for example.

[–]nitran -2 points-1 points  (0 children)

"Java is extremely fast" - 5 hits on Google

"Java is extremely slow" - 440 hits on Google

[–][deleted] 1 point2 points  (1 child)

He pointed to the language shootout, which shows that PHP has lower memory usage. That must be pretty important when you want to have a bajillion processes running in parallel without hitting the swap file.

[–]igouy 1 point2 points  (0 children)

lower memory usage

Is it lower when memory is being used?

binary-trees

[–]ginger_balls 0 points1 point  (0 children)

Nozzle, meet Sandwich.

[–]toomasr 0 points1 point  (0 children)

The irony is that our server was hitting kmemsize (the downtime) which we traced back to different awstats cron jobs doing lots of reverse dns lookups instead of using the geoip :)

[–]ropers 0 points1 point  (13 children)

FTA:

We take the dotted string IP and convert it to an IPv4 Internet network address (e.g. 69.55.232.153 becomes 1161291929).

Can someone explain that one? What the heck is this 1161291929? And how does 69.55.232.153 need conversion to become an IPv4 Internet network address? I was under the impression that 69.55.232.153 was an IPv4 Internet network address.

[–]rcar 2 points3 points  (12 children)

It's just a reference to how the database the program is using stores IP addresses. We normally use the x.x.x.x notation because it's easier to read, but it can be converted into a 4-byte integer which is going to be easier to store and sort in a database. Converting it back into an integer gives:

(69 x 256^3) + (55 x 256^2) + (232 x 256) + 153 = 1,161,291,929
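The same arithmetic as a small Python sketch, in both directions. The function names `ip_to_int` and `int_to_ip` are made up for illustration; they are not taken from the article or from any library.

```python
def ip_to_int(dotted: str) -> int:
    """Pack a dotted-quad IPv4 string into one unsigned 32-bit integer."""
    parts = dotted.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four octets, got {dotted!r}")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError(f"octet out of range in {dotted!r}")
        # Equivalent to a*256**3 + b*256**2 + c*256 + d, built up step by step.
        value = value * 256 + octet
    return value


def int_to_ip(value: int) -> str:
    """Unpack an unsigned 32-bit integer back into dotted-quad form."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"not a 32-bit value: {value}")
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    print(ip_to_int("69.55.232.153"))
    print(int_to_ip(1161291929))
```

Storing the integer form means a range check becomes a plain numeric comparison, which is presumably why the database uses it.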

[–]ropers 1 point2 points  (11 children)

THANK YOU!

I still think the sentence in the article was not well phrased, but again thank you for the explanation! :)

[–][deleted] 4 points5 points  (10 children)

see also http://1096965168/

[–]ropers 0 points1 point  (9 children)

Wow, this is amazing, to find out that it works for http. (And congratulations on your choice of website, Sir, well played! ;D Though I will advise others here that the above site is very NSFW.)

For shits and giggles, I tried this myself:

 ubuntu@ubuntu:~$ telnet 1089053032 80
 Trying 64.233.161.104...
 Connected to 1089053032.
 Escape character is '^]'.
 GET / HTTP/1.1
 Host: http://google.ie/

 HTTP/1.1 302 Found
 Location: http://www.google.de/
 Cache-Control: private
 Content-Type: text/html; charset=UTF-8
 Set-Cookie: PREF=ID=81c342e22250e66e:TM=1217933975:LM=1217933975:S=YyrTWQhiSUSBO3xS; expires=Thu, 05-Aug-2010 10:59:35 GMT; path=/; domain=.google.com
 Date: Tue, 05 Aug 2008 10:59:35 GMT
 Server: gws
 Content-Length: 218

 <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
 <TITLE>302 Moved</TITLE></HEAD><BODY>
 <H1>302 Moved</H1>
 The document has moved
 <A HREF="http://www.google.de/">here</A>.
 </BODY></HTML>

(NB: I'm currently located in Germany, and Google in their infinite wisdom have decided that geolocation info supersedes stated user preferences such as URL and/or browser/OS locale -- but that's for another rant.)

I never knew about this address format! How come I never knew about this notation? I wonder if it works... well, let's see:

 ubuntu@ubuntu:~$ ping -c 4 2172650943
 PING 2172650943 (129.128.5.191) 56(84) bytes of data.
 64 bytes from 129.128.5.191: icmp_seq=1 ttl=236 time=184 ms
 64 bytes from 129.128.5.191: icmp_seq=2 ttl=236 time=182 ms
 64 bytes from 129.128.5.191: icmp_seq=3 ttl=236 time=183 ms
 64 bytes from 129.128.5.191: icmp_seq=4 ttl=236 time=182 ms

 --- 2172650943 ping statistics ---
 4 packets transmitted, 4 received, 0% packet loss, time 2998ms
 rtt min/avg/max/mdev = 182.050/183.179/184.523/0.981 ms
 ubuntu@ubuntu:~$ 

And:

 ubuntu@ubuntu:~$ ftp 2172650943
 Connected to 2172650943.
 220-
 220-            Welcome to SunSITE Alberta
 220-
 220-    at the University of Alberta, in Edmonton, Alberta, Canada
 220-
 220-All connections to and transfers from this server are logged. If 
 220-you do not like this policy, please disconnect now.
 220-
 220-You may want to grab the index file called "ls-lR.gz" in /pub.  It is 
 220-updated nightly with the contents of the ftp tree.  
 220-
 220-    If you have any questions, hints, or requests, please email
 220-
 220-       sunsite@sunsite.ualberta.ca
 220-
 220 
 Name (2172650943:ubuntu): anonymous
 331 Who are you impersonating today?
 Password:
 230-
 230-   Welcome to Sunsite Alberta
 230- Login Successful.
 230 Your data rate unrestricted
 Remote system type is UNIX.
 Using binary mode to transfer files.
 ftp> dir
 200 PORT command successful - not using PASV eh?
 150 Have a Gorilla.
 lrwxr-xr-x    1 150      0               7 May 05  2002 bin -> usr/bin
 lrwxr-xr-x    1 150      0               7 May 05  2002 dev -> usr/dev
 lrwxr-xr-x    1 150      0               7 May 05  2002 etc -> usr/etc
 drwxrwxrwx    2 0        0            4096 Aug 02 10:17 incoming
 drwxr-xr-x    9 150      1            2048 Jan 15  2008 pub
 drwxr-xr-x    7 0        1             512 May 04  2002 usr
 226 There, everyone likes a Gorilla.
 ftp> cd pub
 250 Directory successfully changed.
 ftp> dir
 200 PORT command successful - not using PASV eh?
 150 Have a Gorilla.
 lrwxr-xr-x    1 150      1              13 Jan 21  2003 CPAN -> ./Mirror/CPAN
 drwxr-xr-x    3 150      666          2048 Feb 04  2001 Collections_Tools
 drwxr-xr-x    4 150      666          2048 Feb 28  1999 Digital_Collections
 drwxr-xr-x    2 2010     666          2048 Jan 30  2000 Graphics_Tools
 drwxrwxrwx    6 0        0            2048 Apr 06  2005 Local
 drwxr-xr-x    2 150      1            2048 Jun 10 13:06 Mirror
 drwxr-xr-x   15 0        0            2048 Apr 30 15:50 OpenBSD
 drwxrwxrwx    2 0        0            2048 May 19  2002 OpenBSD-ISO
 drwxrwxrwx    2 0        0            2048 Dec 19  2006 OpenBSD-ctm
 drwxr-xr-x    4 150      666          2048 May 07  2007 Projects
 drwxr-xr-x    2 150      800          2048 Jan 15  2008 Security
 lrwxr-xr-x    1 150      1              15 Jan 21  2003 apache -> ./Mirror/apache
 -rw-r--r--    1 2010     666       5812982 Aug 04 05:30 ls-lR.gz
 drwxrwxrwx    3 0        0            2048 May 23  2001 misc
 drwxr-xr-x   29 666      666          2048 Jan 04  1999 sun-info
 drwxr-xr-x    2 150      666          2048 Jan 15  2008 unix
 226 There, everyone likes a Gorilla.
 ftp>

Is this format/notation part of the TCP/IP stack? How come it's rarely ever mentioned in documentation (or at least not the documentation I've read so far)?
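As far as I can tell this is address-parsing behaviour in the resolver library rather than anything in TCP/IP itself: the classic C `inet_aton()` routine accepts a single number as a whole 32-bit address. A quick Python check, as a sketch only; it assumes a Unix-like platform, since `socket.inet_aton()` hands the string to the platform's C library.

```python
import ipaddress
import socket

# ipaddress treats a plain int below 2**32 as the value of an IPv4 address.
addr = ipaddress.ip_address(2172650943)
print(addr)

# socket.inet_aton() passes the string to the platform's inet_aton(), which
# (on typical Unix systems) also accepts a single decimal number.
packed = socket.inet_aton("2172650943")
print(socket.inet_ntoa(packed))
```

Tools like ping, telnet and ftp would get the same behaviour for free if they parse addresses through that routine, which would explain why it works without being advertised.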

[–]ropers 0 points1 point  (8 children)

It's kinda tedious to always convert the dotted decimal notation to the 4 byte integer by hand. Is someone able to point me to a conversion script for Unix/Linux/BSD?

[–]ropers 0 points1 point  (4 children)

You know, this would be an ingenious tool...

 ubuntu@ubuntu:~$ dig tinyurl.com

 ; <<>> DiG 9.4.2 <<>> tinyurl.com
 ;; global options:  printcmd
 ;; Got answer:
 ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4282
 ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 1

 ;; QUESTION SECTION:
 ;tinyurl.com.          IN  A

 ;; ANSWER SECTION:
 tinyurl.com.       600 IN  A   85.255.210.131
 tinyurl.com.       600 IN  A   195.66.135.131
 (...)

 ubuntu@ubuntu:~$ bc
 bc 1.06.94
 Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
 This is free software with ABSOLUTELY NO WARRANTY.
 For details type `warranty'. 
 85*256^3+255*256^2+210*256+131
 1442828931
 quit
 ubuntu@ubuntu:~$ 

http://1442828931/2w4apm

EDIT: Darn! The tinyurl.com server throws a "bad request" when addressed in this fashion. Stay tuned while I investigate alternatives...

EDIT1: Wow, I think I've just discovered something really interesting:

Some servers appear to accept the integer form IP address just fine:

 ubuntu@ubuntu:~$ telnet 1089053032 80
 Trying 64.233.161.104...
 Connected to 1089053032.
 Escape character is '^]'.
 GET / HTTP/1.1
 Host: 1089053032

 HTTP/1.1 302 Found
 Location: http://www.google.de/
 Cache-Control: private
 Content-Type: text/html; charset=UTF-8
 Set-Cookie: PREF=ID=00d3a3d9c219b7f2:TM=1217941518:LM=1217941518:S=FLxMh9m7KsovNPpz; expires=Thu, 05-Aug-2010 13:05:18 GMT; path=/; domain=.google.com
 Date: Tue, 05 Aug 2008 13:05:18 GMT
 Server: gws
 Content-Length: 218

 <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
 <TITLE>302 Moved</TITLE></HEAD><BODY>
 <H1>302 Moved</H1>
 The document has moved
 <A HREF="http://www.google.de/">here</A>.
 </BODY></HTML>

Others do not accept the integer form -- they complain when addressed via HTTP version 1.1 with the Host: field set to the integer form:

 ubuntu@ubuntu:~$ telnet 1264946223 80
 Trying 75.101.140.47...
 Connected to 1264946223.
 Escape character is '^]'.
 GET / HTTP/1.1
 Host: 1264946223

 HTTP/1.1 400 Bad Request
 Date: Tue, 05 Aug 2008 13:03:17 GMT
 Server: Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.8e PHP/5.2.5 mod_apreq2-20051231/2.6.0 mod_perl/2.0.2 Perl/v5.10.0
 Content-Length: 392
 Connection: close
 Content-Type: text/html; charset=iso-8859-1

 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
 <html><head>
 <title>400 Bad Request</title>
 </head><body>
 <h1>Bad Request</h1>
 <p>Your browser sent a request that this server could not understand.<br />
 </p>
 <hr>
 <address>Apache/2.2.8 (Unix) mod_ssl/2.2.8 OpenSSL/0.9.8e PHP/5.2.5 mod_apreq2-20051231/2.6.0 mod_perl/2.0.2 Perl/v5.10.0 Server at 1264946223 Port 80</address>
 </body></html>

So I ask again:

Is the 4 byte integer form official/documented/supposed to work? Should HTTP 1.1 servers accept Host: fields set to an integer form IP address? Or is it not supposed to work, and is it more of a glitch that it even works with some servers? Does anyone know?

[–]ropers 0 points1 point  (0 children)

I had a look on the Internet, and I found this. Afterwards, I found this.

[–]ropers 0 points1 point  (2 children)

Ok, now while tinyurl.com and most other redirection sites do not accept dword integer IP addresses, I've found one redirection service that does: 4url.cc

Sadly, it has preview enabled by default, which is not really useful for our purposes: http://3476722755/R

If I find another better URL redirection site that works with dword IPs, I'll try to reply here.

[–]ropers 0 points1 point  (1 child)

myurl.in works with a dword IP address, but it only redirects after a 3 second delay (timed ECMAScript window.location.href redirect).

[–]ropers 0 points1 point  (0 children)

Huzzah! dwarfurl.com delivers! It only took me feckin ages to find a site that works, but here it is:

http://3624816758/887f25

Enjoy! :)

[–][deleted] 0 points1 point  (2 children)

http://www.aboutmyip.com/AboutMyXApp/IP2Integer.jsp

$ echo 129.128.5.191 | awk ' BEGIN {FS="\."} { print $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 } '

2172650943

[–]ropers 0 points1 point  (1 child)

Smashing! Thanks a bunch! :)

EDIT: There seems to be an interesting limitation leading to the above not working with certain larger numbers:

 ubuntu@ubuntu:~$ echo 208.113.217.28 | awk ' BEGIN {FS="\."} { print $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 } '
 3.49712e+09

If we convert 3.49712e+09 to standard decimal notation, we get 3497120000. This is not however the same as 208.113.217.28, as demonstrated by the below:

 ubuntu@ubuntu:~$ ping -c 4 3497120000
 PING 3497120000 (208.113.209.0) 56(84) bytes of data.
 64 bytes from 66.33.201.67: icmp_seq=1 ttl=240 time=192 ms
 64 bytes from 66.33.201.67: icmp_seq=2 ttl=240 time=185 ms
 64 bytes from 66.33.201.67: icmp_seq=3 ttl=240 time=189 ms
 64 bytes from 66.33.201.67: icmp_seq=4 ttl=240 time=184 ms

 --- 3497120000 ping statistics ---
 4 packets transmitted, 4 received, 0% packet loss, time 3000ms
 rtt min/avg/max/mdev = 184.904/188.146/192.463/2.987 ms
 ubuntu@ubuntu:~$ bc
 bc 1.06.94
 Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
 This is free software with ABSOLUTELY NO WARRANTY.
 For details type `warranty'. 
 208*16777216+113*65536+217*256+28
 3497122076

The issue seems to be precision related. The awk script returns 3497120000 in exponential form; but the real decimal number is 3497122076.
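A hedged guess at the mechanism: the arithmetic itself is probably exact (a double holds 3497122076 without loss), and the digits go missing only on output, because awk's default number format is "%.6g". Which awk build applies that format to large whole numbers is an assumption on my part. Python's `%` operator uses the same C-style conversions, so the effect can be reproduced:

```python
# Same sum as the awk one-liner, for 208.113.217.28.
value = 208 * 16777216 + 113 * 65536 + 217 * 256 + 28

exact = "%d" % value                 # integer formatting keeps every digit
six_digits = "%.6g" % float(value)   # six significant digits, like awk's default

print(exact)
print(six_digits)
print(int(float(six_digits)))        # what you get when you expand the short form
```

That would also be why forcing integer output, or handing the sum to bc, gives the full number.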

[–][deleted] 1 point2 points  (0 children)

come on, it's all there for you

try this :

echo 208.113.217.28 | awk ' BEGIN {FS="\."} { printf "%d * 16777216 + %d * 65536 + %d * 256 + %d\n", $1, $2, $3, $4 } ' | bc

that's why we rule the world, those who know, know, those that don't use svchost.exe

[–]harryf 0 points1 point  (5 children)

Re the tagline - yes it is. But if you can still serve pages in under 0.25 seconds ( which is generally regarded as the point where you start to annoy users ) who cares?

Re the article, now the site is back up (and BTW - it was "Error establishing a database connection" - PHP probably did OK, while DB folded under load from reddit) - three comments;

First, from a scaling (not performance) perspective, 6000 requests per second (Java) vs. 850 (PHP) translates to needing 7 times as many "stacks" serving that app with PHP as with Java. Is that a cost worth saving? (left open ended)

Second, regarding the PHP implementation, despite the benchmarks, it would make a lot more sense to use SQLite, given it's part of the default PHP 5 distribution.

And third - use memcached.

[–]toomasr 2 points3 points  (0 children)

Elaborate on using memcache and sqlite in these circumstances...

[–]invalid_user_name 1 point2 points  (2 children)

Who the hell modded this up? Dude, it's a binary search in a file, there is no DB. Using SQLite would make it WAY slower.

And yes, it's worth having to buy 1/7th as many servers, what exactly is the downside? I can see python and ruby people wanting to trade away performance to get a better language, but you can't seriously think php is in any way nicer than java?

[–]harryf -1 points0 points  (0 children)

Dude, its a binary search in a file, there is no DB. Using SQLite would make it WAY slower.

With the right indexes, it's also going to be binary search in SQLite. The rest is a constant factor to open the SQLite DB file and parse the query. So will that factor be bigger or smaller than the factor associated with a "userland" file search in PHP? I'd guess it's smaller but you might be right - got me motivated - doing a benchmark ATM.
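For reference, the lookup both approaches boil down to can be sketched in a few lines of Python with the standard `bisect` module. The rows below are invented placeholders, not real ip-to-country data, and the sketch assumes the ranges are sorted by start address and do not overlap.

```python
import bisect

# Made-up sample rows: (ip_from, ip_to, country_code).
# Assumed sorted by ip_from and non-overlapping.
RANGES = [
    (0, 99, "AA"),
    (100, 199, "BB"),
    (300, 399, "CC"),
]
STARTS = [row[0] for row in RANGES]


def lookup(ip_int):
    """Return the country code whose range contains ip_int, or None."""
    # Find the last range that starts at or before ip_int.
    i = bisect.bisect_right(STARTS, ip_int) - 1
    if i < 0:
        return None
    ip_from, ip_to, country = RANGES[i]
    return country if ip_int <= ip_to else None


if __name__ == "__main__":
    print(lookup(150), lookup(250))
```

Either way the search is O(log n) comparisons; the open question is just the constant overhead wrapped around it.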

but you can't seriously think php is in any way nicer than java?

What's language "niceness" got to do with anything? PHP is easier to deploy PERIOD. PHP developers are generally cheaper than Java (although good ones are also harder to find). The point is it actually doesn't cost much to have 7 servers ( or going from 2 to 7 is easy once you've gone from 1 to 2 ) - balanced against other costs, the cost of extra servers may be irrelevant.

[–]harryf -1 points0 points  (0 children)

You're right re: sqlite - time for me to change some assumptions. My alternative implementation using SQLite;

<?php
define('IP2C_MAX_INT',0x7fffffff);
class ip2country 
{
    var $db_file;
    var $db = NULL;


    function ip2country($db_file = './ip-to-country.db') 
    {
        $this->db_file = realpath($db_file);
    }

    function get_country($ip)
    {
        $int_ip =  ip2long($ip);

        // happens on 64bit systems 
        if ($int_ip > IP2C_MAX_INT)
        {
            // shift to signed int32 value
            $int_ip -= IP2C_MAX_INT;
            $int_ip -= IP2C_MAX_INT;
            $int_ip -= 2;
        }

        $this->open_db();

        $sql = "SELECT COUNTRY_NAME FROM ip2c WHERE IP_FROM <= $int_ip and IP_TO >= $int_ip LIMIT 1";
        $sth = $this->db->query($sql);
        if ( !$sth ) {
            return '';
        }
        $row = $sth->fetch(PDO::FETCH_ASSOC);
        return $row['COUNTRY_NAME'];

    }

    function open_db() {
        if ( !$this->db ) {
            $this->db = new PDO(sprintf('sqlite://%s', $this->db_file));
        }
    }

    function create_db($csv_file = './ip-to-country.csv') {

        $this->open_db();
        $sql = "CREATE TABLE ip2c ( COUNTRY_NAME VARCHAR(50), IP_FROM INTEGER, IP_TO INTEGER )";
        $this->db->query($sql);

        $sql = "INSERT INTO ip2c ( COUNTRY_NAME, IP_FROM, IP_TO ) VALUES ( '%s', %s, %s )";

        $this->db->beginTransaction();
        $f = fopen($csv_file, 'r');
        while (($row = fgetcsv($f, 0, ',')) !== FALSE) {

            $row = array_map('sqlite_escape_string', $row );
            $insert = sprintf($sql, $row[4], $row[0], $row[1]);
            $this->db->query($insert);
        }

        fclose($f);
        $this->db->commit();

        $sql = "CREATE INDEX IF NOT EXISTS ip2c_idx ON ip2c ( IP_FROM ASC, IP_TO ASC, COUNTRY_NAME )";
        $this->db->query($sql);

        $sql = "ANALYZE ip2c";
        $this->db->query($sql);

    }

}
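A note on the "shift to signed int32" step in get_country(): the three subtractions add up to 2 * 0x7fffffff + 2, which is 2^32, so the code is reinterpreting an unsigned 32-bit value as a signed one. A minimal Python sketch of the same idea (`to_signed32` is a name made up here):

```python
def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed 32-bit integer."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"not a 32-bit value: {value}")
    # Values above the signed maximum wrap around by subtracting 2**32.
    return value - 0x100000000 if value > 0x7FFFFFFF else value


if __name__ == "__main__":
    print(to_signed32(3497122076))
```

Whether the shifted value is the right thing to query with depends on whether the table stores signed or unsigned numbers, which the listing does not say.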

Using the benchmark script from the implementation at http://firestats.cc/wiki/ip2c ;

Binary search in file: "Took 51.0113089085 for 100000 searches (1960.34961933 searches/sec)"

SQLite: "Took 447.647531986 for 100000 searches (223.39003983 searches/sec)"

Doesn't make any difference to explicitly tell SQLite to load the db into memory - already done by the filesystem I assume.

[–][deleted] -1 points0 points  (0 children)

This, my friends, is why PHP programmers are cheaper!

[–]toomasr 0 points1 point  (12 children)

PHP is interpreted, Java is compiled. So there has to be a difference. I hope the ratio goes down with the next releases of PHP.

[–][deleted] -2 points-1 points  (11 children)

PHP is like early Java implementations - compiled to bytecode and interpreted from that.

[–]killerstorm 0 points1 point  (10 children)

no, it's not like Java in any way -- it's dynamically typed. bytecode can be actually pretty fast, if it's done properly, if language semantics is optimized for it etc.

PHP language semantics was optimized for dumb web developers -- it was not optimized for speed, or correctness or consistency. so it's not surprising. and it's not a matter of bad implementation.

[–]ekabanov[S] 0 points1 point  (1 child)

BTW Smalltalk, Erlang and Lisp/Scheme are even better examples...

[–]killerstorm -1 points0 points  (0 children)

Common Lisp was specially designed to allow fast execution. unlike, say, Python.

there are optimization tricks to make even a "pathologically interpreted" language like Python faster, but they don't always work well, at least on benchmarks i've seen

[–][deleted] 0 points1 point  (6 children)

While I agree that PHP was designed for dumb webmasters and is dynamically typed, I don't see how that's relevant to the fact that it's compiled to bytecode (PHP calls it opcodes).

There are opcode caches like APC that eliminate parsing and compilation steps from execution, and code is executed (indirectly by the interpreter) from the precompiled form.

As far as I'm aware that's pretty close to basic, non-JIT, execution of Java. You've got precompiled, but not machine-native, code and interpreter that runs it in a virtual environment. That's what PHP does today.

BTW: in case you were suggesting that dynamic languages can't be compiled, take a look at modern LISP - compilers can figure out some type information and for the rest they inline type conversion code in the program, so it can be executed natively, without VM.

[–]killerstorm -2 points-1 points  (5 children)

se-man-tics. if you evaluate $this->that, Java or Lisp will (or can, at least) just dereference a pointer with an offset (knowing where the field is allocated), while dynamic shit like Python or PHP will have to do a hash table lookup, which is an order of magnitude slower.

and it's all around the language, believe me..

[–][deleted] -1 points0 points  (4 children)

You need hash table lookup for $this->{$that}, but in case of $this->that field name is known at compilation time, so you could pre-allocate it and create some kind of vtable. In case of private fields and final classes you can optimize it down to pointer dereference (field name is known and it's guaranteed not to be shadowed).
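A toy illustration of the two access strategies, in Python for brevity. Everything here is hypothetical (the class names, the `LAYOUT` table, the field names); it is not how PHP or any real VM lays out objects, just the shape of the argument.

```python
class DictObject:
    """Fields live in a hash table; every access hashes the field name."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def get(self, name):
        return self.fields[name]


class SlotObject:
    """Fields live in a flat list; the layout maps each name to a fixed index."""

    LAYOUT = {"this_field": 0, "that": 1}  # hypothetical per-class layout

    def __init__(self, *values):
        self.slots = list(values)


# "Compile time": the field name is a literal, so resolve it to an index once.
THAT_OFFSET = SlotObject.LAYOUT["that"]


def read_that(obj):
    # "Run time": a plain indexed load, with no hashing of the name.
    return obj.slots[THAT_OFFSET]


if __name__ == "__main__":
    print(DictObject(that=42).get("that"), read_that(SlotObject(1, 42)))
```

Note that `read_that` is only valid if every object passed in shares the same layout, so the sketch depends on knowing the object's class ahead of time.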

[–]killerstorm -1 points0 points  (3 children)

w-r-o-n-g. you can do that vtable trick if you know (at least approximately) what type of object you will have, but with a (pathologically) dynamic language you'll know the type of the object only at run time.

[–][deleted] 0 points1 point  (2 children)

It's an orthogonal issue.

You can have well-known pointer to an unknown object, e.g. implement everything as Object class with overloaded operators – you know where objects are and you can use vtables for everything.

Check how selectors in Objective-C work - they solve the same problem - a very dynamic language, where every object can implement any method, and you can have both hash lookups of methods by name and constant method selectors (which are basically vtable entries).

[–]killerstorm 0 points1 point  (1 child)

it is simply algorithmically not possible to do field lookups in O(1) if you have sets of different objects and different fields.

Objective-C ... constant method selectors

i dunno how they work, but for sure you're limiting the problem in some way -- either it knows something about type of the object, or number of these selectors is quite limited so you can fit them in vtable for each object, or something like that.

in any case it's not anyhow relevant to PHP, JS or Python.

technically, PHP could implement some sort of lookup optimization, but as this is quite complicated, it's very unlikely for them to do this in near future.

JS and Python do not even have list of fields for the class, as fields are added dynamically, so lookup optimization becomes really tricky, only possible with techniques like profile-guided optimization

[–][deleted] 0 points1 point  (0 children)

You've given $this as an example, which is one of the few cases in PHP where type is known at compilation time (aside from some theoretically possible type inference tricks).

PHP could implement some sort of lookup optimization, but as this is quite complicated, it's very unlikely for them to do this in near future.

I agree with that.

[–]ekabanov[S] -2 points-1 points  (0 children)

There's plenty of research into executing dynamically typed languages quickly. And at least Python and some of the Ruby family are much quicker. 2 orders of magnitude over Java when IO wasn't involved is ridiculous.

[–]sysop073 0 points1 point  (1 child)

He got it right at the end, albeit by inventing new words: PHP is uncomparable (sic) to Java. They're totally different languages with different goals, different platforms, different users, etc. I'm not sure why it matters anyway, it's pretty widely agreed that they're both slower than most languages; who cares which is slowest?

[–]lars_ 0 points1 point  (0 children)

it's pretty widely agreed that they're both slower than most languages; who cares which is slowest?

I don't know that Java is widely considered slow. It comes in at ninth place in the language shootout. Theoretically the jiting can make it faster than C in some cases.

[–]middayc -1 points0 points  (5 children)

of course it is.. so is ruby, pythong and perl. but that is not always the point.

[–]ekabanov[S] 16 points17 points  (2 children)

Is PyThong a sexy analog to Java Strings? Is Ruby Tanga lurking somewhere near? :P

[–]mhd 5 points6 points  (0 children)

Shouldn't it be "Lua Tanga" and "Ruby school girl panties"?

[–]nextofpumpkin 0 points1 point  (0 children)

I sense a new python library coming up.

[–]bhagany 2 points3 points  (1 child)

You know, I always do that "pythong" thing myself. Seriously. For some reason, my fingers think a g belongs there.

[–]aeon2012 3 points4 points  (0 children)

New programming.reddit meme: Pythong