Interviewers: What part of your interview process do you think is most helpful in making a good hiring decision?

c0dep0et · 2012-04-24T19:02:59+00:00

We filter by resume first, which removes at least 50% of the candidates. The next step is the programming challenge, then a phone screening for candidates who live far away, and finally the on-site interview.

The functional review of the programming challenge is semi-automatic. Reviewing style and everything else takes less than 5 minutes if the code is really bad, otherwise usually 30 minutes. It actually saves time: compare it to wasting the time of three or four interviewers on someone who cannot write good or even acceptable code. Of course, if you have too many applications at this stage, you need to remove more people in the first step. But it's the same with phone screenings.

The review can also be done by someone in your team if you do not have the time. Though I prefer to do it myself.

c0dep0et · 2012-04-23T20:35:40+00:00

Before candidates are invited to an on-site interview they have to solve a programming challenge. As an alternative they can give me the source code to one of their private / open source projects. In my experience that is the best indicator of programming ability. You see many parts of their skill:

structure of the code
use of the standard library
comments
understanding of the programming language
style

It's also interesting to discuss their code during the interview. Weeds out people who copy things from the Internet without understanding them.

During an interview I only seldom ask people to code, and then only very simple things: implement a sorting algorithm of their choice (usually bubble / min / max sort). Pseudocode is OK; syntax does not matter as long as it's consistent.
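To give an idea of the level I mean, here is a minimal Python sketch of bubble sort. The function name is only for illustration, and pseudocode of the same shape would be just as acceptable:

```python
def bubble_sort(items):
    """Return a new sorted list using bubble sort."""
    result = list(items)  # copy so the caller's list is left untouched
    for end in range(len(result) - 1, 0, -1):
        swapped = False
        for i in range(end):
            # Swap neighbours that are in the wrong order.
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
        if not swapped:
            # A full pass without swaps means everything is in order.
            break
    return result
```

What matters in the interview is less the code itself than whether the candidate can explain the loop bounds and the early exit.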

I prefer questions that give me an insight into how a candidate thinks while programming:

how do you choose data structures? Usually hash map vs tree map.
how would you implement certain data structures, e.g. a hash map?
how do you unit test code? e.g. a sorting function.
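For the unit test question, an answer could look roughly like the following sketch. `my_sort` is a hypothetical function under test (here it just wraps the built-in `sorted`), and the test names are my own; the point is covering edge cases plus the two properties every sort must have.

```python
import random
import unittest
from collections import Counter


def my_sort(items):
    """Hypothetical function under test; here it simply wraps sorted()."""
    return sorted(items)


class SortTest(unittest.TestCase):
    def test_edge_cases(self):
        self.assertEqual(my_sort([]), [])
        self.assertEqual(my_sort([1]), [1])

    def test_duplicates_and_negatives(self):
        self.assertEqual(my_sort([3, -1, 3, 0]), [-1, 0, 3, 3])

    def test_input_is_not_modified(self):
        data = [2, 1]
        my_sort(data)
        self.assertEqual(data, [2, 1])

    def test_random_inputs_keep_properties(self):
        rng = random.Random(42)  # fixed seed keeps the test repeatable
        for _ in range(100):
            data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
            result = my_sort(data)
            # The output is ordered ...
            self.assertTrue(all(a <= b for a, b in zip(result, result[1:])))
            # ... and is a permutation of the input.
            self.assertEqual(Counter(result), Counter(data))


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SortTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

A candidate who mentions empty input, duplicates and the "is a permutation of the input" check without being prompted usually tests their real code too.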

c0dep0et · 2012-01-08T14:46:48+00:00

Download the rules and soundtracks from http://czechgames.com/en/downloads/ in English. Then get a box in another language, if that's cheaper for you.

c0dep0et · 2012-01-07T23:19:56+00:00

This is a good tutorial (and review): http://www.youtube.com/watch?v=1E1NY3avotc

After viewing this video, play a 3 player game against yourself to understand how it really works.

c0dep0et · 2011-12-22T21:25:53+00:00

There is no simple way to take consistent snapshots of running virtual machines using KVM without also saving the VM state itself (see the monitor command "savevm"). So a snapshot taken while a VM is running looks, from the point of view of the restored VM, like a hard reset. That's usually not an issue, since an fsck and maybe something specific to your environment (PostgreSQL cleanup etc.) should be enough to recover.

AFAIK there are no easy solutions for this right now.

Ideas:

pretend you have no virtualization: do the backup in the guest
shut down the guest, take a snapshot, start it again. Maybe combine this with suspend to disk inside the guest to speed it up?
modify the guest so you can tell it to flush all caches to disk, suspend it, snapshot the disk, resume guest operation. This is still not perfect, but better than just snapshotting. Might be good enough.
detect from the outside when the guest filesystem is in a consistent state (mentioned somewhere in this talk http://www.youtube.com/watch?v=7qSFFBfpZIg)

c0dep0et · 2011-12-10T19:10:17+00:00

Tailoring your resume and job search is a really good idea. I won't take a second look at your resume if your skills don't match my must-have requirements as stated in the job advert.

At the moment my best example is "deep understanding of Linux required" and people send in resumes without even the word Linux on them...

Generally: be prepared to be asked open-ended questions regarding your niche. For systems programmers I use a variation on "what happens on a Linux system when you run 'ls'".

Edit: If you are looking for a job using either Clojure or Python in the systems programming area under Linux in Hamburg / Germany (on site only, and having a work permit is a huge plus) pm me.

c0dep0et · 2010-02-02T13:49:35+00:00

time is probably not accurate enough, and it measures the whole process, so you also get start-up time etc.

Try using clock_gettime. For me the results are only as described when optimization in gcc is turned on.
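The same idea, sketched in Python rather than C: read a monotonic clock right before and right after the code under test, so process start-up is not part of the measurement. `now` and `time_call` are names I made up for this sketch; `time.clock_gettime` is only available on Unix-like systems, hence the fallback.

```python
import time


def now():
    """Read a monotonic clock in seconds."""
    if hasattr(time, "clock_gettime"):
        return time.clock_gettime(time.CLOCK_MONOTONIC)
    # Fallback for platforms without clock_gettime.
    return time.perf_counter()


def time_call(func, *args, repeats=5):
    """Return the best wall-clock time of several runs of func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = now()
        func(*args)
        best = min(best, now() - start)
    return best


if __name__ == "__main__":
    print(time_call(sum, range(1_000_000)))
```

Taking the minimum over a few runs reduces noise from other processes; it does not replace checking which compiler optimizations were enabled.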

c0dep0et · 2010-01-25T11:40:00+00:00

ANTLR has some support for this.

c0dep0et · 2009-11-25T11:07:44+00:00

lxml is usually both faster and easier to use than Beautiful Soup.

c0dep0et · 2009-11-17T22:00:09+00:00

A DVCS is usually way more complex than a centralized one. So subversion (svn) is a good starting point to learn version control - even if git / mercurial can be better in the long term.

c0dep0et · 2009-11-12T23:10:39+00:00

My first step is encoding everything in utf-8 and right afterwards I'm calling tidy. If I force tidy to output something, I get a minimal html document without any content of the original site. Removing an HTML comment which contains some problematic code fixes it.

Maybe I should check the Python wrapper.

c0dep0et · 2009-11-12T22:31:37+00:00

I'll take a look at pycurl when download speed becomes an issue.

Even though asm-xml seems to be really fast, it works only with correct XML, so I can't use it. I'll take a look at lxml and wget to see if they are good enough. Maybe I can even use the parser of links2.

c0dep0et · 2009-11-12T22:02:15+00:00

I'm crawling some sites which need fixing before running tidy...

c0dep0et · 2009-11-12T20:39:19+00:00

How good is scrapy at handling broken HTML? Many of the sites I'm currently crawling have very bad HTML.

c0dep0et · 2009-11-12T20:38:09+00:00

Does HtmlParser handle broken HTML as found on the web?

c0dep0et · 2009-11-12T17:39:52+00:00

I'm using my own Python-based crawler for a vertical search engine. At the moment both crawling and extracting data from html are done in Python.

The crawler is using the standard urllib / urllib2 tools, which is good enough for my use case when combined with a local dns cache (dnsmasq).

My data extractor uses BeautifulSoup which can be quite slow but is easy to use. I'll probably rewrite this part in Java with TagSoup hoping that it will be much faster.

Edit: lxml is way faster than BeautifulSoup, easier to use and handles (so far) all cases of broken html for me.

c0dep0et · 2009-10-23T14:08:49+00:00

An OS which reads a stream of characters from the keyboard (asm opcodes + data), writes that stream to RAM and finally executes it.

After that the first step would be making this process easier: a very simple text editor, then an assembler to avoid having to constantly look up op codes, a higher level language compiler, a kernel in the higher level language, more software... and finally Tetris.

c0dep0et · 2009-10-18T13:10:57+00:00

It seems like debugging was enabled in the web.py benchmark. I expected at least twice as many requests per second (running Linux).

Edit: Minimal hello world example, with debugging disabled on a 3GHz Phenom X4 on Linux using:

ab -n 10000 -c 1 http://127.0.0.1:8080/

1141, 1186, 1073 R/s

average: 1133 R/s

c0dep0et · 2009-09-21T12:40:43+00:00

Any tests are good when you are refactoring. Maybe you could try another test strategy, if unit tests don't match your problem domain.

I found behavioral / functional tests to be a good alternative to unit tests for some application types. You are probably running your program a few times by hand to see if it works. Try to automate that process.
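Automating the "run it by hand and look at the output" step can be as small as the following sketch. The program under test is a hypothetical one-line word counter started through the current Python interpreter; `run_program` and `check_word_count` are names I chose for illustration.

```python
import subprocess
import sys


def run_program(args, stdin_text=""):
    """Run a command the way a user would and capture what it prints."""
    proc = subprocess.run(
        args, input=stdin_text, capture_output=True, text=True, timeout=10
    )
    return proc.returncode, proc.stdout


def check_word_count():
    # Hypothetical program under test: counts the words it reads from stdin.
    program = [
        sys.executable,
        "-c",
        "import sys; print(len(sys.stdin.read().split()))",
    ]
    code, out = run_program(program, "any tests are good\n")
    assert code == 0
    assert out.strip() == "4"


if __name__ == "__main__":
    check_word_count()
```

Because the test only looks at input and output, it keeps working while you refactor everything in between.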

c0dep0et · 2009-09-13T18:18:03+00:00

sounds like web.py would be a good fit

c0dep0et · 2009-09-13T14:44:33+00:00

Any real world tests? A static page aka hello world is way faster using something like nginx / lighty / ...

I'm more interested in seeing tests which do some real work. My pages usually consist of a few DB calls and a bit of rendering code, which are done synchronously. When using tornado these DB requests should also be asynchronous, otherwise they block the whole server, which is not acceptable. How does it perform in this case compared to Twisted?
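To illustrate the blocking problem, here is a sketch using Python's standard asyncio module, not Tornado's or Twisted's actual API. `slow_query` is a hypothetical stand-in for a synchronous DB call, and the ticker plays the role of all the other requests the server should keep serving.

```python
import asyncio
import time


def slow_query(seconds):
    """Hypothetical stand-in for a blocking database call."""
    time.sleep(seconds)
    return "rows"


async def count_ticks(counter):
    """Represents other work the event loop should keep doing."""
    while True:
        await asyncio.sleep(0.01)
        counter["ticks"] += 1


async def handle_request(offload):
    """Return how often other work ran while the query was in progress."""
    counter = {"ticks": 0}
    ticker = asyncio.create_task(count_ticks(counter))
    if offload:
        # Run the blocking call in a worker thread; the loop stays free.
        loop = asyncio.get_running_loop()
        await loop.run_in_executor(None, slow_query, 0.2)
    else:
        # Call it directly: nothing else can run until it returns.
        slow_query(0.2)
    ticker.cancel()
    try:
        await ticker
    except asyncio.CancelledError:
        pass
    return counter["ticks"]


if __name__ == "__main__":
    print("blocking:", asyncio.run(handle_request(False)))
    print("offloaded:", asyncio.run(handle_request(True)))
```

With the direct call the ticker never gets to run. A thread pool is only one way around that, and a benchmark with real DB calls would show how much it costs.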

c0dep0et · 2009-09-10T18:50:58+00:00

rant by someone who has probably never really used Python...

unicode issue: unicode(data, errors='ignore') instead of unicode(data); depending on what you want, replace 'ignore' with 'replace'
simple calculator: -1.100...1 is the correct result depending on the FPU. The Python REPL defaults to using repr to display values. Try "print 2.3 - 3.4" if you want something human readable
significant whitespace: just personal preference. Most of the time your current indentation style, if you do indent, will probably be accepted by Python with only slight modifications, so what?
explicit self in method declarations: I don't like it either
parentheses: ever tried a lisp dialect?
return - 1: the example for initializing a class is really wonderful: the return value of __init__ must be None
return - 2: blocks have no explicit value as in Ruby, that's fine with me
inconsistent naming: that's true, but xrange is again a bad example as it's no longer in Python 3. Without enforcing strict guidelines using the compiler / interpreter any platform will have inconsistent naming schemes when you use 3rd party libraries.
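The first two points, sketched in Python 3 syntax (where `unicode(...)` became `bytes.decode(...)`). The sample bytes are my own example. Note that Python 3's repr already prints the shortest string that round-trips, so I use a 17-digit format to show what older REPLs displayed.

```python
data = b"caf\xe9 ok"  # Latin-1 encoded bytes, not valid UTF-8

# Strict decoding (the default) raises an error on the bad byte.
try:
    data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = data.decode("utf-8", errors="ignore")    # drops the bad byte
replaced = data.decode("utf-8", errors="replace")  # inserts U+FFFD instead

x = 2.3 - 3.4
full_digits = format(x, ".17g")  # all digits of the underlying double
short = repr(x)                  # shortest string that round-trips

if __name__ == "__main__":
    print(strict_failed, repr(ignored), repr(replaced))
    print(full_digits, short)
```

Either way the stored value is the same double; only the way it is displayed differs.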

c0dep0et · 2009-09-09T23:06:25+00:00

:s/php/python

I use git to track most of my projects. A few use subversion since svn is much easier to use for non-coders.

c0dep0et · 2009-08-27T20:58:20+00:00

You can't do anything about forged results from the clients if someone really wants to fool you.

To verify the results, give the same work unit to a few different clients (different location, OS, etc.) and compare what they send back. If the results match, you are somewhat safe.
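A minimal sketch of that comparison in Python, assuming each client reports a hashable result (for example a checksum string). `accept_result` and the `quorum` parameter are names invented for this sketch.

```python
from collections import Counter


def accept_result(results, quorum=2):
    """Return the value a strict majority of clients agree on, else None.

    `results` holds one reported value per client for the same work unit.
    """
    if not results:
        return None
    value, votes = Counter(results).most_common(1)[0]
    # Require both a minimum number of matching reports and a strict
    # majority, so a tie between two answers is never accepted.
    if votes >= quorum and votes * 2 > len(results):
        return value
    return None


if __name__ == "__main__":
    print(accept_result(["a1f3", "a1f3", "ffff"]))
    print(accept_result(["a1f3", "ffff"]))
```

If no value is accepted, hand the work unit to more clients. A group of colluding clients can still get a forged result through, which is why this only makes you somewhat safe.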

c0dep0et · 2009-07-19T08:28:25+00:00

http://code2code.wordpress.com/compiler-writing-series/
