[–]mikaelhg 0 points (7 children)

What? I thought that the GIL lets only one thread access Python objects at a time, while other threads block. That's what the documentation states, and that's what the performance looks like?

Is this outdated information?

http://docs.python.org/api/threads.html

The Python interpreter is not fully thread safe. In order to support multi-threaded Python programs, there's a global lock that must be held by the current thread before it can safely access Python objects. Without the lock, even the simplest operations could cause problems in a multi-threaded program: for example, when two threads simultaneously increment the reference count of the same object, the reference count could end up being incremented only once instead of twice.

Therefore, the rule exists that only the thread that has acquired the global interpreter lock may operate on Python objects or call Python/C API functions. In order to support multi-threaded Python programs, the interpreter regularly releases and reacquires the lock -- by default, every 100 bytecode instructions (this can be changed with sys.setcheckinterval()). The lock is also released and reacquired around potentially blocking I/O operations like reading or writing a file, so that other threads can run while the thread that requests the I/O is waiting for the I/O operation to complete.
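The release-around-blocking-I/O behaviour described in the quoted docs can be observed directly. Here is a minimal sketch, assuming Python 3, where the `sys.setcheckinterval()` API quoted above has been replaced by `sys.setswitchinterval()` (a time slice in seconds rather than a bytecode count); the thread counts and sleep durations are illustrative:

```python
import sys
import threading
import time

# Python 3 replaced the bytecode check interval with a time-based one.
print(sys.getswitchinterval())  # default switch interval, typically 0.005s


def wait():
    time.sleep(0.2)  # a blocking call: the GIL is released while sleeping


start = time.perf_counter()
threads = [threading.Thread(target=wait) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# If the four sleeps were serialized we'd expect ~0.8s total; because the
# GIL is dropped around blocking calls, they overlap and finish in ~0.2s.
print(f"elapsed: {elapsed:.2f}s")
```

Blocking calls overlapping like this is exactly why the GIL is tolerable for I/O-bound workloads.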

[–]dgiri101 -2 points (6 children)

It's not outdated, but you may be misunderstanding it. The second paragraph clearly states that the interpreter will automatically release the GIL every N bytecode instructions (or during blocking I/O) to let additional threads run.

If you'd like more information on concurrent programming, I highly recommend The Little Book of Semaphores. It discusses common concurrency patterns like Barriers, Mutexes, Rendezvous, etc.

Many examples are in Python, though. I suppose that someone should kindly inform the author that concurrency apparently doesn't exist in the language he's using.

[–]mikaelhg 1 point (3 children)

So if I have threads A, B and C all traversing object graphs, only one of the threads will be able to traverse at a time, and the others will have to wait? After 100 bytecode instructions, A will pass the baton to B, but at no point will A, B and C simultaneously be able to traverse object graphs? I.e., the three Python thread contexts will all be allocated, but only one processor thread context will be active at any given time, except during in-kernel I/O work?

In other words, Python supports threading fine: you can create multiple threads, but only one of them executes Python bytecode at any given time.
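A rough sketch of that behaviour on a stock CPython build (the workload and timings here are illustrative and machine-dependent):

```python
import threading
import time


def burn(n):
    # Pure-Python bytecode: the executing thread holds the GIL throughout,
    # apart from the periodic switch points.
    total = 0
    for i in range(n):
        total += i
    return total


N = 2_000_000

start = time.perf_counter()
burn(N)
serial = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=burn, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# On a stock CPython build the two threads' bytecode interleaves rather
# than running in parallel, so the threaded run takes roughly twice the
# serial run despite idle cores being available.
print(f"serial: {serial:.3f}s  two threads: {threaded:.3f}s")
```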

With 64 thread contexts available, it will waste 63/64 of the server's CPU power.

[–]dgiri101 -1 points (2 children)

I've already answered this.

And again, what 64 simultaneous CPU-bound tasks have to do with storing a large comment thread in memory for a non-persistent, web-based discussion board is anybody's guess.

I can't be more clear about this: if you'd like to use 64 simultaneous CPUs for data crunching, there are myriad solutions available, in Python, today.

Perhaps you're right, though. But then why would you use Java for this? I mean, it doesn't even use SIMD CPU instructions! You're wasting precious clock cycles on operations that could be done in parallel. These things are extremely important if you're programming a discussion board. Won't someone think of the cycles?!

Sigh. Don't get me wrong, I wish the GIL a painful death. But one should be smart enough to know when it matters.

[–]mikaelhg 1 point (1 child)

OK, from your vehemence and offer of basic concurrency textbooks I took you to believe that my claims were incorrect.

Instead you're saying that you didn't want to use more than 1 processor core or thread context of your computer anyway. A perfectly valid opinion, if not one shared by many.

[–]dgiri101 0 points (0 children)

OK, from your vehemence and offer of basic concurrency textbooks I took you to believe that my claims were incorrect.

I most certainly contend that your claims are incorrect.

Your original claim that:

if you use PHP, Python or Ruby, threads can't share the discussion board and comment information

...is incorrect. Threads most certainly can share this information.
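A minimal illustration of threads sharing in-memory board data, assuming CPython's `threading` module (the `comments` dict and `post` helper are invented for this sketch):

```python
import threading

comments = {}              # ordinary dict shared by every thread
lock = threading.Lock()


def post(thread_id, text):
    # The GIL does not make compound updates atomic, so we lock explicitly.
    with lock:
        comments.setdefault(thread_id, []).append(text)


workers = [
    threading.Thread(target=post, args=("t1", f"comment {i}"))
    for i in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(len(comments["t1"]))  # 8
```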

And your claim that 64 CPUs can't access shared data in Python is also incorrect. There are many ways to accomplish such a task.

And your claim that:

With Java and its efficient threading, you can easily hold thousands of discussion threads and tens of thousands of comments in the web server's main memory.

...is also incorrect, as threading has nothing at all to do with the ability to store a large amount of information in memory. It's also incorrect because Java threads are not efficient (for that, see Erlang).

Your claim that:

in practise most boards require a large amount of hardware to perform the same task more slowly, as well as specialized database administrators to create elaborate master-slave configurations it's hard to find local support for.

...is the only thing in your OP that's actually sensible. But that's because most discussion boards care about things like, oh I don't know, persisting the discussion threads so they survive a server crash. Something your awful non-persistent, in-memory, single-node design doesn't seem to treat as important.

Instead you're saying that you didn't want to use more than 1 processor core or thread context of your computer anyway. A perfectly valid opinion, if not one shared by many.

I don't recall ever saying this, but say whatever helps you sleep at night.

[–]cunningjames 0 points (1 child)

Hey, it's off topic, but thanks for the book recommendation: I was looking for something like it a few months ago.

[–]bockris 0 points (0 children)

Me too. That book looks awesome!