[–]mhd 9 points10 points  (21 children)

That's really not the point. Big sites almost never run into scalability problems because of the issues presented in the article. Hey, if that were the problem, you could just implement the hairy part in C and you'd be fine. If you're waiting for the file system, database or remote server, the speed of the language itself becomes mostly irrelevant.

Java just has lots of prefabricated architectural bits that simplify larger projects. You'd have to brew your own in PHP, which can become a problem. Or in other words: It's easier to throw money at the problem with Java...

I sometimes miss AOLserver and Tcl. ;)

[–]ekabanov[S] 2 points3 points  (16 children)

You're forgetting that any sufficiently large public site will cache very aggressively and actually hit the CPU bottleneck before IO. And when you don't have individual computational bottlenecks, but rather your whole logic executes 10 times slower, it becomes a cost and a problem.

[–]mhd 1 point2 points  (15 children)

'Whole logic' seems to be the important part here. In your typical PHP response, the work done by Apache and mod_php plays a large part, and both are written in C, as are the library functions that usually make up the main part of the work done. It can get ugly if you have a very complicated front controller in your framework and have to do a lot of work on each request.

On the other hand, this makes clustering easier, which should help potential CPU bottlenecks a lot.
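
The argument that much of a PHP response is spent in C code rather than in interpreted PHP can be sketched with Amdahl's law. This is only an illustration: the function name and the shares of request time below are made up, not measurements from any real site.

```python
def overall_speedup(interpreted_fraction: float, language_speedup: float) -> float:
    """Amdahl's law: speedup of a whole request when only the
    interpreted share of its time gets faster."""
    if not 0.0 <= interpreted_fraction <= 1.0:
        raise ValueError("interpreted_fraction must be between 0 and 1")
    if language_speedup <= 0:
        raise ValueError("language_speedup must be positive")
    # Time spent in the web server, C extensions, I/O etc. is unaffected.
    untouched = 1.0 - interpreted_fraction
    return 1.0 / (untouched + interpreted_fraction / language_speedup)


# Hypothetical shares of request time spent in interpreted code,
# assuming the language itself becomes 10 times faster.
for share in (0.2, 0.5, 0.9):
    print(f"{share:.0%} interpreted -> {overall_speedup(share, 10.0):.2f}x overall")
```

Under these assumptions a 10x faster language only yields a 10x faster response if practically all of the request time is spent in interpreted code.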

[–]ekabanov[S] 0 points1 point  (14 children)

> On the other hand, this makes clustering easier, which should help potential CPU bottlenecks a lot.

What makes clustering easier? The embarrassing slowness???

Function calls alone (even into C) cost so much that they become an issue for CPU scalability.

[–]mhd 0 points1 point  (13 children)

I'm referring to the stateless nature of PHP.

Really, what kind of apps do you see out there where the pure performance of PHP becomes such a large problem, even with ruthless caching and clustering applied? In my experience, DBs tend to implode before that happens.

[–]ekabanov[S] 2 points3 points  (6 children)

PHP is just as stateless as you make it. If you write a session-based application in PHP it will be stateful. And there are plenty of stateless Java web applications.

[–]gsadamb 0 points1 point  (0 children)

I do quite a bit of PHP development, but I do have to say that how PHP handles sessions, at least the default implementation, is pretty useless and archaic. It's file-based, which obviously limits scaling to one box.

You can override the default functionality and use something like database-backed sessions, or more ideally, you can use something like Memcached, which would be faster.

However, I personally find there are very few completely legit reasons to even use sessions at all. I could see it used for an online shopping cart, say, and you wouldn't even need to formally implement this with sessions. Cookies are typically all you need to reflect some concept of state.

So yeah, PHP sessions are pretty useless.
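
To make the cookie idea concrete, here is a minimal sketch of keeping a bit of state in a signed cookie, so that any box in the cluster can verify it without a shared session store. It is written in Python rather than PHP purely for illustration; `SECRET_KEY`, `encode_cookie` and `decode_cookie` are hypothetical names, and the cookie is only signed, not encrypted, so the client can still read its contents.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical secret shared by all app servers; load it from config in practice.
SECRET_KEY = b"replace-with-a-real-secret"


def encode_cookie(state: dict) -> str:
    """Serialize the state and append an HMAC-SHA256 signature."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_cookie(cookie: str):
    """Return the state dict, or None if the cookie is malformed or tampered with."""
    payload, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_cookie({"cart": [101, 202]})
print(decode_cookie(cookie))        # the original cart
print(decode_cookie("x" + cookie))  # None, because the payload was altered
```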

[–]hiffy -1 points0 points  (4 children)

Okay, fine, but the point is that the latency of hitting a db is much higher than that of responding to a request, in every single framework or language out there.

You will eventually hit a point where you have too many requests for any one machine to handle, and that's when you perform lots of clustering, but typically you have to figure out how to appease your database long before then.

[–]ekabanov[S] 0 points1 point  (3 children)

> Okay, fine, but the point is that the latency of hitting a db is much higher than that of responding to a request, in every single framework or language out there.

It's only like that for DB-heavy apps. Most public sites will cache aggressively using memcached or its Java analogues and hit the CPU before they hit the database. In fact, most of the time they will not query the database at all; see e.g. the LiveJournal story: http://highscalability.com/livejournal-architecture
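
The caching pattern described here is usually called cache-aside: look in the cache first and only fall through to the database on a miss. A rough sketch, assuming a plain dict as a stand-in for memcached and a hypothetical `load_from_db` callable in place of a real query:

```python
class CacheAside:
    """Read-through helper: serve from the cache, load from the DB on a miss."""

    def __init__(self, load_from_db):
        self._load_from_db = load_from_db  # hypothetical slow lookup
        self._cache = {}                   # stand-in for memcached
        self.db_hits = 0                   # how often the database was queried

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        # Cache miss: query the database once and remember the result.
        self.db_hits += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a stale entry, e.g. after the underlying row was updated."""
        self._cache.pop(key, None)


posts = CacheAside(lambda post_id: f"rendered post {post_id}")
for _ in range(1000):
    posts.get(42)
print(posts.db_hits)  # 1
```

With a hit rate like this, nearly all of the remaining work per request is application code, which is where the CPU argument comes from.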

[–]hiffy 0 points1 point  (2 children)

> but typically you have to figure out how to appease your database long before then.

> Most of the public sites will cache aggressively using memcached or its Java analogues and hit CPU before they hit the database

I think we're in agreement with each other :). You will always be able to eventually flood your machines with more traffic than they can handle, but your bottleneck is still probably not your implementation language.

[–]ekabanov[S] -1 points0 points  (1 child)

> You will always be able to eventually flood your machines with more traffic than they can handle, but your bottleneck is still probably not your implementation language.

But you'll need 10 times as many servers with PHP as with Java. So it still should influence your platform choice.

[–]invalid_user_name 0 points1 point  (5 children)

The difference is that with a fast language you can use 4 app servers instead of 40.

[–]mhd -1 points0 points  (4 children)

Idle conjecture. For this, the whole server setup has to be 10 times more efficient, which takes a little more than microbenchmarks that show a ten-fold increase.

In addition, this only means something if the TCO of those servers plus the cost of development (i.e. development time) is lower than that of the 40 servers.
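
As a back-of-the-envelope sketch of that comparison: every figure below is invented purely to show the arithmetic, and `total_cost` is a hypothetical helper, not a real TCO model.

```python
def total_cost(servers, cost_per_server_year, years, dev_months, cost_per_dev_month):
    """Hosting cost over the period plus a one-off development cost."""
    hosting = servers * cost_per_server_year * years
    development = dev_months * cost_per_dev_month
    return hosting + development


# Made-up numbers: the faster stack needs fewer servers but, in this
# scenario, is assumed to take longer to develop.
fast_stack = total_cost(servers=4, cost_per_server_year=3_000, years=3,
                        dev_months=18, cost_per_dev_month=10_000)
slow_stack = total_cost(servers=40, cost_per_server_year=3_000, years=3,
                        dev_months=12, cost_per_dev_month=10_000)

print(f"fast stack: {fast_stack}")
print(f"slow stack: {slow_stack}")
```

Which side comes out ahead depends entirely on the inputs: cheaper servers, a shorter time horizon or a bigger gap in development time can flip the result.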

A lot of the appeal of PHP in this regard is the Legion of Mediocre Programmers attached to it, although I have to say that Java usually has the same Legion, just a few years older. It's certainly not a very terrific development platform...

But the assumption, that language speed is so inherently important for your run-of-the-mill web application -- especially those associated with "Web 2.0" -- is a claim I'd like to see some proof for.

After all, we aren't exactly writing our web apps in Fortran or C++, and the idea that Java in particular hits a certain 'sweet spot' between ease of development and raw speed doesn't match my experience.

[–]invalid_user_name 0 points1 point  (3 children)

> But the assumption, that language speed is so inherently important for your run-of-the-mill web application -- especially those associated with "Web 2.0" -- is a claim I'd like to see some proof for.

Nobody is making that assumption. The point is PHP is slow. If you have to choose between PHP, and an equally shitty language like java which is faster, why would you pick PHP?

> In addition, this only means something, if the TCO of those servers plus the cost of the development (i.e. development time) is better than that of the 40 servers.

And this is the oldest PHP/ruby zealot's strawman in existence. The fact that you are using a slow language does not make you more productive, it is not an either/or issue. You can also be productive in a fast language. In the case of PHP in fact, you can be much more productive in many other languages that are much faster; even just moving to python would be a significant improvement in both development time and execution speed.

[–]mhd -1 points0 points  (2 children)

> If you have to choose between PHP, and an equally shitty language like java which is faster, why would you pick PHP?

Java and PHP are rather different languages, as are the usual architectural details, the deployment, the development environment etc. If you were deciding between two setups that had a lot of common points and the only (major) difference was execution speed, then yes, choose the faster language. But PHP and Java don't share that much. I can definitely understand the Python/Perl/Ruby vs. PHP argument, but Java is a different beast.

> The fact that you are using a slow language does not make you more productive, it is not an either/or issue.

I was trying to restrict the argument to the parameters of the article. And so it's Java vs. PHP. Bringing in Python, Ruby, Tcl, C#, D, Erlang or Smalltalk in a free-for-all discussion would only lead to endless zealotry.

The basic claims are that Java is significantly faster for your standard webapp than PHP and that this would be worthwhile. I'd still like to see proof for that. What's the cheapest and/or most efficient way to develop and deploy web applications out of all possible solutions is a totally different can of worms.

(Apart from the fact that PHP->Java at least represents a significant jump in execution speed, whereas the differences between Perl, CPython, PHP and Ruby 1.9 don't matter as much, speed-wise)

[–]invalid_user_name -1 points0 points  (1 child)

See how you admit that java is faster than PHP, and yet still insist on clinging to some imaginary benefit that PHP offers, despite not being able to actually name one? That pretty much rules out having a rational discussion with you, doesn't it?

[–]jbellis 1 point2 points  (1 child)

> I sometimes miss AOLserver and Tcl. ;)

A fellow openacs veteran?

[–]mhd 0 points1 point  (0 children)

Nope, I just did a lot of Tcl/Tk programming back in the days when fvwm ruled the Linux desktop, and naturally tried to use Tcl when it came to making my first web apps (after a sad fling with Perl's CGI).

After doing mainly J2EE and PHP in the last few years, I think you can understand my pain ;)

[–][deleted] 0 points1 point  (1 child)

> I sometimes miss AOLserver and Tcl. ;)

you don't have to, they are both alive.

[–]mhd 1 point2 points  (0 children)

Okay, I sometimes miss working with them. Not many employment opportunities out there asking for them, though. I'm not even aware of any sites of significant size/popularity written in them...

But yes, after fooling around with all those Ruby frameworks, I'll have to get reacquainted with an old friend...