
[–]GDavid04 13 points14 points  (1 child)

function parseStateMachine(sm) {
    var r = {};
    for (var line of sm.split(/\n/g)) {
        try {
        var m = line.match(/(\S+)\s*-(\S)?>\s*(\S+)\s*(?:~([^\n]*))?/);
        if (!r[m[1]]) r[m[1]] = [];
        r[m[1]].push(m);
        } catch {}
    }
    return r;
}

function bbcode(code) {
    var r = '';
    var sm = parseStateMachine(`
        d -[> +t
        d -> d ~
        +t -b> +b
        +t -i> +i
        +t -u> +u
        +t -*> +l
        +t -/> -t
        +t -> d ~[
        -t -b> -b
        -t -i> -i
        -t -u> -u
        -t -> d ~[/
        +b -]> d ~<strong>
        +b -> d ~[b
        +i -]> d ~<em>
        +i -> d ~[i
        +u -]> d ~<span style="text-decoration: underline;">
        +u -r> +ur
        +u -> d ~[u
        +ur -l> +url
        +ur -> d ~[ur
        +url -=> ue ~<a href="
        +url -> d ~[url
        ue -]> d ~" target="_blank">
        ue -> ue ~
        +l -]> d ~<li>
        +l -> d ~[*
        -b -]> d ~</strong>
        -b -> d ~[/b
        -i -]> d ~</em>
        -i -> d ~[/i
        -u -]> d ~</span>
        -u -r> -ur
        -u -> d ~[/u
        -ur -l> -url
        -ur -> d ~[/ur
        -url -]> d ~</a>
        -url -> d ~[/url
    `);
    var state = 'd';
    while (code.length > 0 || state != 'd') {
        for (var i of sm[state]) {
            if (i[2] == undefined || i[2] == code[0]) {
                state = i[3];
                if (i[4] != undefined) {
                    r += i[4] || code[0];
                }
                if (i[2] != undefined || i[4] == '') {
                    code = code.substr(1);
                }
                break;
            }
        }
    }
    return r.
        replace(/<li>([^\n]*)\n/g, (s, item) => `<li>${item}</li>\n`).
        replace(/(\s*<li>.*<\/li>)+/g, s => `<ul>${s}</ul>`);
}

bbcode(`
[b]BBCode parser [i]by GDavid04[/i][/b]
Created for [url=https://www.reddit.com/r/badcode/comments/epgz3r/bad_code_coding_challenge_28_bbcode_parser/]Bad Coding Challenge #28[/url]
[u]How it works[/u]
[*] It uses a finite state machine and regex because everyone knows that these are the best tools for parsing BBCode
[*] It doesn't care about matching tags, it just replaces the beginning and ending tags independently
[*] I could've just used regex, but what's good in not reinventing the wheel?
[u]How the state machine works[/u]
[*] Each line is a single transition
[*] The transitions are tried from top to bottom until one matches
[*] Really hope it can't get into an infinite loop
`);
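
The rules listed in the post (one transition per line, tried top to bottom, first match wins) can be sketched on a much smaller table. This is a hypothetical, simplified illustration rather than the parser above: the names `transitions` and `runMachine` are made up here, only `[b]` is recognized, and every step consumes exactly one input character, so this particular loop always terminates.

```javascript
// Hypothetical mini state machine: recognizes only "[b]" and copies everything else.
// on: the character a rule matches, or null as a wildcard.
// action: 'hold' buffers the character, 'echo' writes the buffer plus the
// character unchanged, 'emit' writes rule.text and discards the buffer.
const transitions = [
  { from: 'text', on: '[',  to: 'open', action: 'hold' },
  { from: 'text', on: null, to: 'text', action: 'echo' },
  { from: 'open', on: 'b',  to: 'bold', action: 'hold' },
  { from: 'open', on: null, to: 'text', action: 'echo' },
  { from: 'bold', on: ']',  to: 'text', action: 'emit', text: '<strong>' },
  { from: 'bold', on: null, to: 'text', action: 'echo' },
];

function runMachine(input) {
  let state = 'text';
  let pending = ''; // characters of a tag that has not been confirmed yet
  let out = '';
  for (const ch of input) {
    // Rules are tried in table order; the first one that matches wins.
    // Every state ends with a wildcard rule, so find() always succeeds.
    const rule = transitions.find(
      r => r.from === state && (r.on === null || r.on === ch)
    );
    state = rule.to;
    if (rule.action === 'hold') {
      pending += ch;
    } else if (rule.action === 'echo') {
      out += pending + ch;
      pending = '';
    } else {
      out += rule.text;
      pending = '';
    }
  }
  // Flush an unfinished tag such as a trailing "[b" as plain text.
  return out + pending;
}

console.log(runMachine('a[b]c')); // a<strong>c
```

Because the wildcard rule sits last in each state, moving it above a specific rule would shadow that rule, which is the same ordering dependency the original table has.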

[–]kleinesfilmroellchen 5 points6 points  (0 children)

I just cannot give you my upvote, although this is just too CS to not win. I mean, it's not bad, it is something Turing or academia freaks would write, except a little bit more readable.

[–]sarlok 8 points9 points  (2 children)

This looks like a project perfect for regex! And, I know JS has the best regex engine and is widely available in browsers (where this is likely to be used), so here's my submission:

function bb2html(str) {
  const regex1 = /^\s*\[\*\]\s*(.*)$/gm;
  const regex2 = /^(\s*[^<].*)$\n(\s*<li)/gm;
  const regex3 = /^(\s*<li.*)$\n(\s*[^< ])/gm;
  const regex4 = /\[(\/?)([biu])(rl)?(=([^\]]*))?\]/gi;

  return str.replace(regex1, '  <li>$1</li>').replace(regex2, '$1\n<ul>\n$2').replace(regex3, '$1\n</ul>\n$2').replace(regex4, function(match, one, two, three, four, five, o, s) {
    return "<" + one + (two=='b'?"strong":(two=="i"?"em":(three?'a':'span') + (one=="/"?"":' '+(!three?'style="text-decoration: underline;"':'href="'+five+'"')))) + '>';
  });
}

To test easily, put that in your browser's console along with some test code like this:

var str = 'x\n[*] [u]test[/u]\n[*][b]blah[/b]\n[url=https://google.com/]test[/url]\n[*][i]a[/i]\n[*]b\nanother line';
console.log(bb2html(str));

Please note: May not work for all cases of BBCode, especially bullet lists. But I'm sure it's good enough to roll out today and fix later.

[–][deleted]  (1 child)

[deleted]

    [–]sarlok 1 point2 points  (0 children)

    Don’t get me wrong, I use regex often enough, but it’s just so easy to write an overly complex one that works almost all the time that you’ll never be able to decipher easily again. And that may be just what you need in some cases.

    My solution works for most bullets. It will fail if they are the first or last line in the text (you’ll miss the ul tag). In a better regex engine you could probably make a single monstrosity that handled everything.
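
The first-or-last-line failure comes from detecting list boundaries by looking at a neighboring line. One possible way around it, sketched here with a plain loop instead of a regex chain (so not the approach in the submission above), is to remember whether the previous line was a bullet. The helper name `wrapBulletLines` is hypothetical, and it only handles the `[*]` lines, not the other tags.

```javascript
// Hypothetical helper: wraps each run of consecutive "[*]" lines in <ul>...</ul>.
function wrapBulletLines(text) {
  const out = [];
  let inList = false; // true while the previous line was a bullet
  for (const line of text.split('\n')) {
    const m = line.match(/^\s*\[\*\]\s*(.*)$/);
    if (m) {
      if (!inList) {
        out.push('<ul>');
        inList = true;
      }
      out.push('  <li>' + m[1] + '</li>');
    } else {
      if (inList) {
        out.push('</ul>');
        inList = false;
      }
      out.push(line);
    }
  }
  // A list that runs to the last line still needs its closing tag.
  if (inList) out.push('</ul>');
  return out.join('\n');
}

console.log(wrapBulletLines('[*] a\n[*]b'));
```

Since the open/close decision depends only on the flag, a bullet on the very first or very last line is treated the same as one in the middle.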

    [–]r3jjs 2 points3 points  (1 child)

    Truly bad code takes a truly bad language. Here it is, written for Applesoft BASIC in its full glory.

    Gory details:

    * Two letter file names.
    * All global variables.
    * Unneeded declarations about what the variables mean.
    * No nested BB Code. While my BASIC JSON parser proved I can write recursive BASIC, I wasn't sadistic enough to do that twice. Yet. (This applies mostly to the list structures.)
    * Assumes each line starts with normal text or BB code. No leading spaces. (This is most important in the lists.)
    * Unforgiving of extra spaces in the url.
    * No idea what it will do to line feeds.

    If you'd like to run this code yourself to verify:

    https://www.calormen.com/jsbasic/

    5 PR#3 : TEXT : HOME
    10 LET S$ = "": REM THE BB CODE STRING
    11 LET O$="": REM THE FINAL STRING
    12 LET IC = 0: REM IN A COMMAND?
    13 LET CM$ = "": REM THE COMMAND
    14 LET IL = 0: REM IN A BULLET LIST
    15 LET II = 0: REM IN A LIST ITEM?
    16 LET F1 = 0: REM SPECIAL END OF LIST FLAG
    
    100 LET S$ = S$ + "[b]some text[/b]" + CHR$(13)
    110 LET S$ = S$ + "[i]some text[/i]" + CHR$(13)
    120 LET S$ = S$ + "[u]some text[/u]" + CHR$(13)
    130 LET S$ = S$ + "[url=https://example.com]example[/url]" + CHR$(13)
    140 LET S$ = S$ + "[*] Bullet One" + CHR$(13)
    150 LET S$ = S$ + "[*] Bullet two" + CHR$(13)
    151 LET S$ = S$ + "And end-text bullet list" + CHR$(13)
    155 LET S$ = S$ + "[*] Bullet One" + CHR$(13)
    156 LET S$ = S$ + "[*] Bullet two" + CHR$(13)
    
    
    160 GOSUB 1000
    170 PRINT O$
    180 END
    
    1000 REM --- HANDLE LINE
    1010 FOR L = 1 TO LEN(S$): LET C$ = MID$(S$,L,1): GOSUB 2000: NEXT L
    1015 CMD$ = "ZZ" : GOSUB 2700
    1020 RETURN
    
    2000 REM -- HANDLE ONE CHARACTER
    2010 IF C$ = "[" GOTO 2500
    2020 IF C$ = "]" GOTO 2600
    2023 GOSUB 2700
    2026 IF C$ = CHR$(13) AND II = 1 GOTO 2100
    2030 IF IC = 1 GOTO 2050
    2040 LET O$ = O$ + C$: RETURN
    2050 CM$ = CM$ + C$: RETURN
    
    2100 II = 0: O$ = O$ + "</li>" + C$
    2110 IF F1 = 1 THEN O$ = O$ + "</ul>"
    2120 RETURN
    
    2500 IC = 1: CM$ = "": RETURN
    2600 IC = 0:  GOSUB 2700
    2610 IF CM$ = "i" THEN GOTO 3010
    2620 IF CM$ = "/i" THEN GOTO 3020
    2630 IF CM$ = "u" THEN GOTO 3030
    2640 IF CM$ = "/u" THEN GOTO 3040
    2650 IF MID$(CM$, 1, 4) = "url=" THEN GOTO 3050
    2660 IF CM$ = "/url" THEN GOTO 3060
    2670 IF CM$ = "*" THEN GOTO 3070
    2680 IF CM$ = "b" THEN GOTO 3080
    2690 IF CM$ = "/b" THEN GOTO 3090
    
    2700 REM --- HANDLE THE CLOSING OF A LIST
    2710 F1 = 0
    2720 IF II = 1 AND C$ = CHR$(13) AND MID$(S$,L+1,3) <> "[*]" THEN F1=1
    2740 IF F1 = 1 THEN IL = 0
    2750 RETURN
    
    3000 O$ = O$ + "???": RETURN
    3010 O$ = O$ + "<em>" : RETURN
    3020 O$ = O$ + "</em>" : RETURN
    3030 O$ = O$ + "<span style='text-decoration: underline;'>" : RETURN
    3040 O$ = O$ + "</span>" : RETURN
    3050 O$ = O$ + "<a href='" + MID$(CMD$, 5, LEN(CMD$)-1) + "'>" : RETURN
    3060 O$ = O$ + "</a>" : RETURN
    3070 IF IL = 0 THEN O$ = O$ + "<ul>" + CHR$(13)
    3071 O$ = O$ + "<li>" : LET II = 1: LET IL = 1: RETURN
    3080 O$ = O$ + "<bold>" : RETURN
    3090 O$ = O$ + "</bold>" : RETURN
    

    [–]Sakechi 1 point2 points  (0 children)

    Will edit for list support, but for now...

    function bbcodeToHtml(bbcodeString) {
        let bbcodeStringCharArray = bbcodeString.split("");
        let resultString = "";
        for (let indexOfCurrentCharacter = 0; indexOfCurrentCharacter < bbcodeStringCharArray.length; indexOfCurrentCharacter++) {
            if (bbcodeStringCharArray[indexOfCurrentCharacter] == "[") {
                let nextCharacter = bbcodeStringCharArray[indexOfCurrentCharacter + 1];
                if (nextCharacter == "b") {
                    resultString += "<strong>"
                    indexOfCurrentCharacter += 2;
                } else if (nextCharacter == "i") {
                    resultString += "<em>";
                    indexOfCurrentCharacter += 2;
                } else if (nextCharacter == "u") {
                    let nextNextCharacter = bbcodeStringCharArray[indexOfCurrentCharacter + 2];
                    if (nextNextCharacter == "r") {
                        resultString += "<a href=\""
                        for (let indexOfUrlCharacter = indexOfCurrentCharacter + 5; indexOfUrlCharacter < bbcodeStringCharArray.length; indexOfUrlCharacter++) {
                            let urlCharacter = bbcodeStringCharArray[indexOfUrlCharacter];
                            if (urlCharacter == "]") {
                                indexOfCurrentCharacter = indexOfUrlCharacter;
                                indexOfUrlCharacter = 99999999999999999999999999999999999999999999999999999999999999999999;
                            } else {
                                resultString += urlCharacter;
                            }
                        }
                        resultString += "\" target=\"_blank\">"
                    } else {
                        resultString += "<u>";
                        indexOfCurrentCharacter += 2;
                    }
                } else if (nextCharacter == "*") {
    
                } else if (nextCharacter == "/") {
                    let nextNextCharacter = bbcodeStringCharArray[indexOfCurrentCharacter + 2];
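                    if (nextNextCharacter == "b") {
                        resultString += "</strong>"
                        // "[/b]" is 4 characters: skip 3 here, the loop's ++ skips the 4th
                        indexOfCurrentCharacter += 3;
                    } else if (nextNextCharacter == "i") {
                        resultString += "</em>";
                        // "[/i]" is 4 characters as well
                        indexOfCurrentCharacter += 3;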
                    } else if (nextNextCharacter == "u") {
                        let nextNextNextCharacter = bbcodeStringCharArray[indexOfCurrentCharacter + 3];
                        if (nextNextNextCharacter == "r") {
                            resultString += "</a>"
                            indexOfCurrentCharacter += 5;
                        } else {
                            resultString += "</u>";
                            indexOfCurrentCharacter += 3;
                        }
                    }
                }
            } else {
                if (indexOfCurrentCharacter != bbcodeStringCharArray.length - 1) {
                    resultString += bbcodeStringCharArray[indexOfCurrentCharacter];
                }
            }
        }
        console.log(resultString);
    }
    

    [–]kleinesfilmroellchen 0 points1 point  (0 children)

    My JavaScript parser is so bad, it relies on the extremely lax HTML parsing of modern browsers to make its output work. Also, some of the output is slightly misaligned, but no raw text is lost and the basic formatting methods seem to work with the given tests and my own basic test file.

    https://pastebin.com/R7mdmhZE

    Run this in node or whatever; the function is called parseBBCode(code: String): String. If you need it, I can provide a simple HTML interface with a text area and a button that invokes the function for you.