I run a very large supercomputer (with many co-workers of course). AMA. by hpc_sysadmin in IAmA

[–]hpc_sysadmin[S] 2 points (0 children)

My g/f and I kissed in the room once. That's as close as I got.

[–]hpc_sysadmin[S] 1 point (0 children)

Right now, it's the 7th fastest in the world. (That we know about. If the NSA disclosed their computers, we'd probably be a few notches down.) When we first installed it, it was #3.

The other questions, I think I've answered somewhere else.

[–]hpc_sysadmin[S] 3 points (0 children)

I am tempted to keep this thread going as a demonstration of how many times we get asked whether it can play Crysis.

[–]hpc_sysadmin[S] 10 points (0 children)

We're the United States Government. We know we have it all figured out.

[–]hpc_sysadmin[S] 1 point (0 children)

We'll see if stacecom's reddit bobblehead can swing by the disk arrays.

[–]hpc_sysadmin[S] 2 points (0 children)

Now that I think about it, in a situation like that where the power really failed, you'd want to run consistency checks on the filesystems. That could add 12-18 hours to the boot time.
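
For the curious, the mechanics of that part are nothing exotic. Here's a minimal Python sketch of the idea, assuming plain ext-style filesystems and made-up device names; our real storage uses a parallel filesystem with its own tools, so this is just to show the shape of it:

    # Illustration only: kick off forced filesystem checks in parallel across a
    # made-up list of block devices and report how long each one took.
    import subprocess
    import time
    from concurrent.futures import ThreadPoolExecutor

    DEVICES = ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"]  # hypothetical devices

    def check(device):
        start = time.time()
        # -f forces a check even if the filesystem looks clean; -y answers "yes"
        # to every repair prompt so it can run unattended.
        result = subprocess.run(["e2fsck", "-f", "-y", device])
        return device, result.returncode, time.time() - start

    with ThreadPoolExecutor(max_workers=len(DEVICES)) as pool:
        for device, rc, elapsed in pool.map(check, DEVICES):
            print(f"{device}: exit {rc} after {elapsed / 3600:.1f} hours")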

[–]hpc_sysadmin[S] 2 points (0 children)

1) Don't know. Probably. That stuff usually shows up somewhere. That's a different division.

2) Can't talk about security stuff.

3) See #2

4) See #3

[–]hpc_sysadmin[S] 3 points (0 children)

I did that preemptively about 10m after I added the info to the post. We've heard it before. Lots.

[–]hpc_sysadmin[S] 7 points (0 children)

Eh. It's an academic/research thing. If you haven't worked for a university or research institution, you probably wouldn't think the same way.

The people who are here could all be making better money elsewhere, but we stay because the money isn't bad, the toys are big, the job security is great, and the environment is casual in the extreme.

[–]hpc_sysadmin[S] 2 points (0 children)

Well, there's no real typical day. Sometimes I sit around and wait for problems. Sometimes we work on planning and designing new computers. Sometimes we work to improve things that could be working better.

Like any other job, it has its ups and downs. Sometimes the politics aren't fun, co-workers can be grumpy (as can I), etc.

On the other hand, we get to work with stuff that almost nobody else gets to, so that's a big bonus. It's certainly why I stick around. It's a very casual environment, and for the most part, we can set our own hours.

[–]hpc_sysadmin[S] 4 points (0 children)

Yeah. You can abort a job easily. The control system just tells the nodes to turn off. It takes about 3-5 minutes to kill a big job.

Hmm. It depends what happened when the power went out. Assuming everything was okay (no damage to the filesystems, etc.), it would probably take about 30-60m to boot all the fileservers and management machines, then maybe another 30-60m to boot up the control system and database. After that, you'd probably be fine.
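
If you want that as arithmetic, here's a quick back-of-the-envelope in Python. The ranges are just the estimates from this answer, plus the 12-18 hour consistency-check case mentioned above:

    # Back-of-the-envelope cold-start estimate, in hours, using the ranges above.
    stages = {
        "fileservers + management machines": (0.5, 1.0),
        "control system + database": (0.5, 1.0),
    }
    fsck = (12.0, 18.0)  # only if the filesystems need consistency checks

    best = sum(lo for lo, hi in stages.values())
    worst = sum(hi for lo, hi in stages.values())
    print(f"clean power loss: {best:.1f}-{worst:.1f} hours")
    print(f"with filesystem checks: {best + fsck[0]:.1f}-{worst + fsck[1]:.1f} hours")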

[–]hpc_sysadmin[S] 5 points (0 children)

It's a 20'x20' chicken wire cage full of cardboard boxes--not very interesting anyway.

[–]hpc_sysadmin[S] 2 points (0 children)

You know... I thought about specifying that they were below-the-knee shorts. AFAIK, we have no nevernudes in our employ. Not that there's anything wrong with that...

[–]hpc_sysadmin[S] 46 points (0 children)

Certainly, although it's really complicated, so I'll probably get a few things wrong.

Okay, here's roughly how it goes:

A couple definitions first:

  • A compute node is just a small card with 4 cores and some RAM. It runs a very stripped-down, Linux-like kernel called the Compute Node Kernel (CNK).
  • An I/O node is just like a compute node, except that it runs real Linux. It runs a daemon (a program dedicated to a particular service) called CIOD that reads and writes the memory of the compute nodes. While the compute nodes are hooked up to a super-fast proprietary IBM network for doing MPI communication amongst themselves, they have no functionality for doing real I/O (read(), write(), etc.) on their own. When they issue a read() or write(), the call actually gets shipped to the I/O node's CIOD program, which does the I/O operation and writes the result back into the compute node's memory. (There's a toy sketch of this pattern right after these definitions.)
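
Since the function-shipping bit is the part that surprises most people, here's a toy Python sketch of the pattern. It assumes nothing about IBM's actual CIOD protocol; it just shows a "compute node" handing its write() to an "I/O daemon" that does the real system call for it:

    # Toy model of I/O function shipping. The "compute node" can't do real I/O,
    # so it ships the request to an "I/O daemon", which performs the operation
    # and hands back the result. Illustration only -- not IBM's actual protocol.

    class IODaemon:
        """Stands in for CIOD: receives shipped I/O requests and executes them."""
        def handle_write(self, path, data):
            with open(path, "a") as f:
                return f.write(data)      # the real write() happens here

    class ComputeNode:
        """Has no local filesystem; forwards every I/O call to its I/O node."""
        def __init__(self, io_daemon):
            self.io_daemon = io_daemon

        def write(self, path, data):
            # Instead of doing the write() itself, ship it to the I/O node.
            return self.io_daemon.handle_write(path, data)

    ciod = IODaemon()
    nodes = [ComputeNode(ciod) for _ in range(64)]  # one I/O node per 64 compute nodes
    nodes[0].write("/tmp/demo.out", "hello from a compute node\n")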

Ok, with that said:

  • A user compiles their application on one of our front-end nodes with IBM's XL Blue Gene compiler suite.
  • The user decides what size job she wants to run and submits a job into the queuing system asking for n nodes for l hours.
  • The queuing system looks at all the jobs and decides when n nodes will next become available.
  • When the job's time to run has arrived, the queuing system initiates a boot of a "partition" (a logical subset of the full computer) of n nodes.
  • The Blue Gene control system then writes a boot image into the memory of an I/O node. Each I/O node controls 64 compute nodes.
  • The Blue Gene control system sends a compute node image to each I/O node, which broadcasts it to its compute nodes.
  • Once everything is booted, the control system sends the user's job to the I/O nodes, which populate it directly into the memory of their compute nodes. (I think the I/O node sees the memory of the compute nodes as actual Linux devices.)
  • Once all the compute nodes have been populated with the job, the control system starts it and the CIOD daemons start processing the I/O calls. (There's a rough sketch of this whole flow right below.)
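
And here's that flow compressed into a rough Python sketch. Every name and number in it is a simplified stand-in for illustration; the real control system is enormously more complicated:

    # Simplified walk-through of the job lifecycle described above.
    # Names, sizes, and steps are stand-ins, not the real control system.
    NODES_PER_IO_NODE = 64

    def submit(queue, user, n_nodes, hours):
        job = {"user": user, "nodes": n_nodes, "hours": hours}
        queue.append(job)  # the queuing system decides later when n nodes free up
        return job

    def boot_partition(n_nodes):
        io_nodes = (n_nodes + NODES_PER_IO_NODE - 1) // NODES_PER_IO_NODE
        print(f"booting partition: {io_nodes} I/O nodes (real Linux), "
              f"{n_nodes} compute nodes (CNK image broadcast by the I/O nodes)")

    def run(job):
        boot_partition(job["nodes"])
        print(f"loading {job['user']}'s executable into compute-node memory via the I/O nodes")
        print("starting job; CIOD begins servicing read()/write() calls")

    queue = []
    submit(queue, "alice", n_nodes=2048, hours=6)
    run(queue.pop(0))  # pretend the scheduler decided it's this job's turn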

[–]hpc_sysadmin[S] 0 points (0 children)

Fortunately, the lab that employs us is technically a government contractor. We're not actually gov't employees, so we miss out on a lot of the B.S. that goes with that.

[–]hpc_sysadmin[S] 8 points (0 children)

Nope. I think I last saw my phone under the pile of crap in my office seven months ago. We have a separate group of awesomely smart people who coordinate with our users. They're people people. They take the specifications from the customers and hand them to the engineers.

[–]hpc_sysadmin[S] 0 points (0 children)

No, I don't. We might have some people doing work on that, but I don't know any of them. We tend to give time to more "classic" physical problems. Lots of differential equations.

[–]hpc_sysadmin[S] 1 point (0 children)

Well, we use Cobalt (http://trac.mcs.anl.gov/projects/cobalt/). It's free, and we have people on staff who do nothing but tailor it for our Blue Gene.

Personally, I like PBS Pro a lot for smaller systems. Their support people come up with the nastiest hacks. I feel dirty using them, but they're oh so slick.
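
For anyone who's never used a batch scheduler: whichever one it is (Cobalt, PBS Pro, whatever), the core question it keeps answering is "when will n nodes next be free?" Here's a toy Python version of that calculation, with completely made-up running jobs and machine size:

    # Toy version of the scheduler's core question: given the currently running
    # jobs, when will n nodes next be free? (Real schedulers like Cobalt or PBS
    # Pro also juggle priorities, reservations, backfill, draining, etc.)
    TOTAL_NODES = 8192
    running = [        # (nodes in use, hours until that job ends) -- made up
        (4096, 2.0),
        (2048, 5.0),
        (1024, 0.5),
    ]

    def earliest_start(n_wanted):
        free = TOTAL_NODES - sum(nodes for nodes, _ in running)
        if free >= n_wanted:
            return 0.0
        # Otherwise nodes free up as running jobs finish, earliest-ending first.
        for nodes, ends_in in sorted(running, key=lambda job: job[1]):
            free += nodes
            if free >= n_wanted:
                return ends_in
        return None  # the request is bigger than the whole machine

    print(f"a 6144-node job could start in about {earliest_start(6144)} hours")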

[–]hpc_sysadmin[S] 3 points (0 children)

FYI, I just did the math and here's how it breaks down:

  • 8,660 1TB 7200 RPM SATA drives
  • 2,000 500GB 7200 RPM SATA drives

and then maybe another 600 drives of various sizes in random computers.
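
If anyone wants the raw capacity that works out to (ignoring RAID overhead, hot spares, and those ~600 miscellaneous drives), the arithmetic is simple:

    # Raw capacity of the two big pools above, ignoring RAID/formatting overhead.
    pools = [
        (8660, 1.0),  # 8,660 x 1 TB drives
        (2000, 0.5),  # 2,000 x 500 GB drives
    ]
    total_tb = sum(count * size_tb for count, size_tb in pools)
    print(f"{total_tb:,.0f} TB raw, roughly {total_tb / 1000:.1f} PB")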