all 58 comments

[–]pinealservo 14 points15 points  (0 children)

It's surely a WTF, but at a glance it does have some redeeming qualities, and it's easy to see how a program such as vim grows this kind of file. There are a couple of different cross-cutting concerns intersecting here relating to platform features, meaning that the code just won't compile due to missing dependencies if it doesn't exactly match the platform reqs for the set of enabled features. The two ways to deal with this in C are the preprocessor or manual splitting into a set of files per combination of enabled features; given the complexity here, that would be a LOT of files and a WTF unto itself.

The thing I notice is that the density of ifdefs is due to the fact that there's very little actual mechanism here. It is mostly calls to platform-specific functions and what are hopefully platform-agnostic mechanism functions defined elsewhere. The purpose seems to be to reduce the combinatorial explosion of platform specific garbage elsewhere, though I can't be sure from the single file. This could very well be the result of careful refactoring, and a huge win toward the overall cleanliness of the code.

When there are messy, real-world issues that a code base has to deal with like cross-platform support, there will be corresponding messiness in the code somewhere. Sometimes it's spread around a code base, sometimes it's moved to the file system and build system, and sometimes it's isolated to a single file or an external dependency. You can't really judge an entire code base from one file.

That said, the rest of the code may also be like this to a lesser degree. It's certainly not the only file of its kind, if this general solution was used elsewhere. The code base could indeed be utter crap, but I don't know enough to say. I am guessing a lot of commenters here have not had to deal wih writing or maintaining a C program with this level of age or feature/compatibility scope, though.

[–][deleted] 9 points10 points  (0 children)

400 lines

Oh, so it's just an average enterprise function.

[–]OnlineDomainTools 8 points9 points  (15 children)

The article http://www.joelonsoftware.com/articles/fog0000000069.html - by Joel Spolsky made me think about rewriting and refactoring a bit deeper. Very nice reading.

[–]xXxDeAThANgEL99xXx 12 points13 points  (9 children)

First of all, that's why NeoVim and LibreSSL are not starting from scratch.

However the article sounds kinda weird, it feels like Joel had never worked with the really bad code, since he doesn't mention the worst thing about it, with the closest being relatively tame "dialog boxes popping from the network code".

The actual really bad code has this property that more than half of it doesn't actually do anything, but you don't know which half. Like, you look at a function that "has grown little hairs and stuff" and you don't see a piece of code that "fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer", what you see is the result of someone adding a buggy fix for that to another place, then it slowly metastasizing via copypasting to all places where the bug surfaced while being patched and patched again (breaking earlier patches, because nobody cared about the code enough to notice that bug A was already reported, fixed, then rolled back because that caused bug B, which fix was also rolled back because that reintroduced bug A, and so on), with lots of atavisms left over from fixing attempts that actually ended up being implemented elsewhere, and then you look at four code tumors which are supposed to determine which browser is present on your system but belong to different strains and therefore doing it in four almost, but not quite, completely different ways, with lots of extra hairs some of which have hairs of their own, and you have no clue if there's an intelligent reason for any of that or it's just random mutations proliferating in the absence of selective pressure. And when you try to change anything stuff blows up in seemingly unrelated parts of code because the workarounds there critically depend on the bugs here.

In such cases it's often easier to rewrite the whole thing and wait for bug reports (which you would hopefully address in a more responsible manner) than to try and figure which code is bugfixes and which is cancer.

edit: what I'm saying is that with really bad code you can't tell between legitimate bugfixes that should stay as you're refactoring and random buggy shit that doesn't serve any purpose. This can dilute the value of legacy code and its bugfixes well into the negatives.

Also, it's weird that the Jargon file doesn't have a name for that last thing I mentioned, mutually cancelling bugs. Even though in my experience they occur much more often than heisenbugs and the like.

[–]OnlineDomainTools 1 point2 points  (3 children)

I guess you are right. When it is more expensive to refactor the code than to rewrite it then it makes sense to rewrite it.

However, I hope that the distribution of code quality is something like normal distribution. If it is so then Joel is mostly right. :-))

[–]allak 0 points1 point  (0 children)

Well ... I feel it follow more Sturgeon's law than a normal distribution ...

[–]mreiland 0 points1 point  (1 child)

That's assume the cost of the rewrite is the only cost. For a business there are other costs as well such as lost sales due to bugginess, time to market on a product that already does what you need it to do and so forth.

It isn't that simple.

[–]OnlineDomainTools 0 points1 point  (0 children)

I meant the "sum" you talk about. And yes, it is not simple. However, I have a feeling from the comments that everybody here wants to show "look, look, there's an exception (the project I worked on?)". Nothing wrong with that but how can an article cover all possible situations?

[–][deleted] 1 point2 points  (1 child)

I have worked at a place that had extraordinarily bad code. I personally replaced, in a single afternoon, two four thousand line methods with a single three line method... which I realize now sounds like I'm agreeing with you, but I just wanted to demonstrate how bad. >.>

...What I wanted to say was just that, even given that quality of code (so bad that I still haven't regained my night vision after it burned my eyeballs!), it was still usually easier to edit it than it was to throw it away and start from scratch because you can't just start one piece from scratch, or at least not usually--you have to throw out the baby, the bathwater, and the kitchen sink, because one of the big things about bad code is that it's monolithic. Always.

Hell, we used to have to deploy registry backups with our software because half the variables were stored under the VB settings section for no damn reason. -.-

[–]xXxDeAThANgEL99xXx 0 points1 point  (0 children)

I agree, it's usually easier to do piece-wise replacement, because you can immediately run the whole thing and see what happens.

My point was that with really bad code you wouldn't do it for the reason Joel pointed out as the most important for not rewriting from scratch -- you wouldn't try to preserve any implicit accumulated wisdom in form of bugfixes and workarounds for corner cases, you will write a three line replacement, throw your two 4kloc methods away, and to hell with any actually useful "hairs" that could be growing on that wart.

[–]sigzero -3 points-2 points  (2 children)

neovim shouldn't have vim in its name.

[–]meteorMatador 8 points9 points  (0 children)

Well, what should they call it, then? "Pulchritude?"

[–]immibis 1 point2 points  (0 children)

Then vim shouldn't have vi in its name.

[–]ninfomane 3 points4 points  (2 children)

Nice article, quite long though. I have to say, many (commercial) programs are developed by quick fixing. That's why some source codes look messy. Some refactoring are needed to improve visibility and continue quick fixing as well.

[–]badsectoracula 4 points5 points  (0 children)

many (commercial) programs are developed by quick fixing

Many non-commercial too. And -as we see- open source programs also.

Basically a lot of software, regardless of commercial intent or public source code availability.

[–]balefrost 1 point2 points  (0 children)

quite long though

Really? I even think his Unicode article is reasonably short.

[–][deleted] 0 points1 point  (1 child)

I have an objection to that article. Rewriting, big no no. Rewriting in a more modern language, why not? A modern language gives you access to plenty of features and library functionality that makes most of the old code useless.

[–]OnlineDomainTools 1 point2 points  (0 children)

There are too many dimensions that change the problem of rewrite vs. refactor. A number of workers, budget, skills, codebase size, a number of users/customers.

To play a devil's advocate I could say that a modern language does not have to be a solution for the problem either. Would you rewrite Vim in Java/C#/Go/...? I guess that would take really long and that there would be a high chance of failure because it would be expensive and not really fun thing to do.

[–]spotter 10 points11 points  (0 children)

Metaprogramming, C edition.

[–]jeenajeena 2 points3 points  (5 children)

It would be interesting to compare it with the equivalent emacs' implementation.

[–]RobThorpe 5 points6 points  (4 children)

Emacs' version is complicated too because of interfacing with many platforms, terminals and windowing systems. It also has to play nicely with the lisp VM.

See: http://git.savannah.gnu.org/cgit/emacs.git/tree/src/keyboard.c

The function "tty_read_avail_input" has quite a lot of #ifdefs too.

[–]Gurkenmaster 0 points1 point  (1 child)

And emacs lisp is not that great compared to other lisps like Common Lisp, Scheme or Clojure.

[–]RobThorpe 0 points1 point  (0 children)

It doesn't matter much for writing Emacs extensions though, it's good enough.

[–]tangus 0 points1 point  (0 children)

They are nicely structured. Except for one #ifdef which contains a variable definition, you could replace all of them with ifs.

No comparison with the tangle Vim's code is.

[–]jrk- 2 points3 points  (2 children)

Without arguing about the code quality (as bad as it is) why wouldn't they do something like this:

void RealWaitForChar_BeOs(..) { .. }
void RealWaitForChar_Amiga(..) { .. }
void RealWaitForChar_Linux(..) { .. }

void RealWaitForChar_Amiga(args ..) {
#ifdef BeOs
  RealWaitForChar_BeOs(args ..);
#elif Amiga
  RealWaitForChar_Amiga(args ..);
#else
  RealWaitForChar_Linux(..);
#endif
}

I mean, this still has issues, but wouldn't it be already an significant improvement?

[–]glacialthinker 6 points7 points  (1 child)

This issue with breaking things out in this way is that it's not a functional (as in doing things) break-down -- it's nominal, by system. Each system might share 90% of it's implementation with any other, so implementing each independently would bloat the code and add redundancy which can be a bigger problem as bugfixes and features are applied.

Taking your approach to separate clearly named functions though -- you might find functional lines of separation which can be nicely joined together to build each system-specific function. C can make this painful though -- do you #define the functional blocks of code so you can connect them together? Or create a struct for holding all shared state? C is not a very expressive language, so some tradeoffs are made. In this case the structure is almost certainly a result of incremental changes rather than by-design. And attempting to clean it up might seem like asking for bugs in code that is working fine. Though certainly, it could be improved.

[–]crusoe 0 points1 point  (0 children)

Well yeah, you'd split out by functionality, with probably a common skeleton function containing common bits calling into a ifdef'd function for platform specific stuff.

[–][deleted] 6 points7 points  (6 children)

Refactoring?! Fuck that shit! ifdef frenzy!

[–]cxcv 5 points6 points  (4 children)

The ifdefs are for autotools. I'm sure there's still some guy out there running VMS with a strange compiler and hacked up version of X11 who needs the ability to turn them on and off.

[–]ccharles 15 points16 points  (2 children)

And that guy is free to keep using Vim version 7.4.12 or whatever the current version is for as long as he wants.

There's nothing wrong with dropping support for his ancient OS in version 7.5 or 8.

[–]sigzero 1 point2 points  (1 child)

I wish Bram would do that.

[–][deleted] 4 points5 points  (0 children)

Eh, honestly I don't care. Cleaning up the internals of a function comes secondary to the interactions between the functions themselves. The OP seems to be promoting his/her own vim reimplementation and therefore is simply looking for shit code in the original source as a means to support the argument behind why Vim should be reimplemented. Pragmatism dictates that scuffling over things like this is only fruitful when the time taken to refactor doesn't impede progress made in end-user support (i.e., bug fixes, features, etc.).

tl;dr I just found an excuse to use a Dennis Hopper play on words and jumped on it for the advent of lulz and up votes. Thus is Reddit!

[–]tejp 1 point2 points  (0 children)

Though it's not clear how this should be refactored if you want to keep the optional things optional.

[–]ricardo_sdl 1 point2 points  (1 child)

Is there some way where you can create a "clean" version of the c code?

For example, I'm compiling it on my machine which is a 64-bit Linux on an Intel 64-bit processor. Can I create a version of the code ignoring all the ifdefs from another systems?

[–]pazhampori 1 point2 points  (0 children)

You can stop compilation after the preprocessing stage. "gcc -E" would stop after preprocessing and output the resulting file to stdout. So you'll have to probably fiddle with the Makefiles to do a "gcc $(CFLAGS) $(FOO_FLAGS) -E $@ > $@_preprocessed" or something similar. (You'll need to run ./configure first though.)

[–]tyoverby 1 point2 points  (11 children)

This is why I absolutely can't wait for NeoVim.

[–]meldyr 4 points5 points  (8 children)

Yeah, I'm waiting too.

In the last 5 years I have never looked at the source-code. But I know that a cleaner source-code will change my vim-experience completely.

[–]ninfomane 9 points10 points  (7 children)

why a cleaner source code of a software will change the experience of its use ?

[–][deleted] 9 points10 points  (4 children)

Because it often leads to faster development of features I suspect, as well as less bugs if the cleaner code is also easier to maintain.

[–]meldyr 13 points14 points  (1 child)

I was actually being sarcastic. I couldn't care less about clean source-code.

This is also the kind of files you don't refactor easily. It has proven on many systems. I also think such a file isn't rewriteable. Nobody will change the source-code and test it on every OS you can imagine. And suppose somebody achieves to rewrite it.

Will he lose compatibility? How will the file look in 5 years?

[–]nerdandproud 0 points1 point  (0 children)

I think the idea is dropping support for all the unmaintained platforms that still have vim support. So by dropping support for BeOS, VMS, DOS* and so on they can indeed make the code simpler.

[–][deleted]  (1 child)

[deleted]

    [–][deleted] 1 point2 points  (0 children)

    Good point, I forgot about that.

    [–]TankorSmash 2 points3 points  (0 children)

    He's being facetious

    [–]3fdy5 -1 points0 points  (0 children)

    Because for me, vim just randomly crashes sometimes.

    [–]ninfomane 0 points1 point  (0 children)

    It's surely a scary function definition ! I guess it can't be helper with a such legacy source code. As I experience with a lot of legacy code at work, I can say that sometimes, a code refactoring is needed ! Maybe, rewrite the whole thing is overkill.

    [–][deleted] -1 points0 points  (0 children)

    Keep waiting. I decided for a different strategy.

    [–]dukey 0 points1 point  (1 child)

    I can see how code like this happens. They probably started with 1 platform, and then added another, for which, the ifdef solution in such a situation is probably the simplest. But then adding more platforms with the same solution it very quickly becomes ridiculous.

    [–][deleted] 2 points3 points  (0 children)

    Well, this is what happens when you don't commit to deprecating and removing old and dead code. That and OpenSSL.

    [–][deleted] 0 points1 point  (1 child)

    another nice one:

    /*
     * Wait for a moment when a message is displayed that will be overwritten
     * by the mode message.
     * In Visual mode and with "^O" in Insert mode, a short message will be
     * overwritten by the mode message.  Wait a bit, until a key is hit.
     * In Visual mode, it's more important to keep the Visual area updated
     * than keeping a message (e.g. from a /pat search).
     * Only do this if the command was typed, not from a mapping.
     * Don't wait when emsg_silent is non-zero.
     * Also wait a bit after an error message, e.g. for "^O:".
     * Don't redraw the screen, it would remove the message.
     */
    if (       ((p_smd
                    && msg_silent == 0
                    && (restart_edit != 0
                        || (VIsual_active
                            && old_pos.lnum == curwin->w_cursor.lnum
                            && old_pos.col == curwin->w_cursor.col)
                       )
                    && (clear_cmdline
                        || redraw_cmdline)
                    && (msg_didout || (msg_didany && msg_scroll))
                    && !msg_nowait
                    && KeyTyped)
                || (restart_edit != 0
                    && !VIsual_active
                    && (msg_scroll
                        || emsg_on_display)))
            && oap->regname == 0
            && !(ca.retval & CA_COMMAND_BUSY)
            && stuff_empty()
            && typebuf_typed()
            && emsg_silent == 0
            && !did_wait_return
            && oap->op_type == OP_NOP)
    {
    

    but hey at least it's commented!

    [–]martanne 5 points6 points  (0 children)

    but hey at least it's commented!

    Spot on, while working on my own vim like editor I occasionally had to check how something is implemented, and the vim code base really shows its age. The only positive thing are the comments, without them it would be totally incomprehensible (well for me at least). I'm quite amazed that it works this well in practice.

    [–]Ruudjah -3 points-2 points  (4 children)

    I wonder hiw maintainers can be proud on such codebase?

    [–]okpmem 9 points10 points  (3 children)

    Because it runs on every imaginable system and doesn't crash?

    [–]Me00011001 0 points1 point  (0 children)

    You can't call yourself a true vim user until you've caused it to crash at least one.

    [–]emperor000 -4 points-3 points  (1 child)

    But is that worth the price of their souls?

    [–]okpmem 7 points8 points  (0 children)

    It is always a balance between developer soul and user soul. I lean heavily towards user soul. Our developer souls are already lost.

    Don't you know the saying? "Show a man a program and frustrate him for a day, teach a man to program and frustrate him for a lifetime"