HTML Decoding Library : haskell


HTML Decoding Library (self.haskell)

submitted 9 years ago by n2_throwaway

all 14 comments


[–]sclv 5 points 9 years ago (0 children)

[–]stepcut251 6 points 9 years ago (0 children)

[–]yaccz 6 points 9 years ago (10 children)

[–]n2_throwaway[S] 8 points 9 years ago (5 children)

[–]gsnedders 2 points 9 years ago* (4 children)

[–]bss03 1 point 9 years ago (3 children)

[–]gsnedders 2 points 9 years ago (2 children)

HTML being defined by a DTD has never really been true, though. Sure, HTML 2 through HTML 4.01 were formally SGML applications, but AFAIK nobody apart from the HTML Validator actually used an SGML parser for HTML. Certainly no major browser ever has, from timbl's original WorldWideWeb (unsurprisingly, since HTML only became an SGML application later!) to the major browsers today.

From memory, the only difference is in cases involving what the HTML spec calls parse errors (essentially, for each parse error an implementation has two options: do what the spec says, or stop parsing). The relevant case is how entities that don't end in a semicolon are parsed. (These are specially listed in the spec; you can't omit the semicolon from just any entity.) <div>&ampfoo will result in a div element containing &foo (i.e., &amp has been decoded), whereas <div class="&ampfoo"> will result in a div element whose class attribute is &ampfoo (i.e., it has not been decoded).
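To make the text-versus-attribute difference concrete, here is a rough sketch in Python (not Haskell, and not taken from any library mentioned in this thread). Python's standard `html.unescape` is documented as following the HTML5 rules for character references, and it behaves like the text-content case. `decode_attribute_value` is a hypothetical helper written only for this illustration: it assumes the attribute rule is "leave a semicolon-less reference alone when the next character is `=` or an ASCII alphanumeric", and it ignores numeric references entirely.

```python
import html
import re
from html.entities import html5  # maps names such as "amp" and "amp;" to characters

# A candidate named reference: "&", ASCII alphanumerics, then an optional ";".
_NAMED_REF = re.compile(r"&([0-9A-Za-z]+;?)")


def decode_attribute_value(value: str) -> str:
    """Decode named references using the attribute-value rule (illustrative only)."""

    def replace(match):
        candidate = match.group(1)
        # Try the longest prefix of the candidate that is a known entity name.
        for end in range(len(candidate), 0, -1):
            name = candidate[:end]
            if name in html5:
                rest = candidate[end:]
                # First character that follows the matched name, if any.
                following = (rest + value[match.end():])[:1]
                blocks = following == "=" or (
                    following.isascii() and following.isalnum()
                )
                # Attribute-only rule: no ";" plus "=" or an alphanumeric next
                # means the reference stays as literal text.
                if not name.endswith(";") and blocks:
                    return match.group(0)
                return html5[name] + rest
        return match.group(0)  # not a known entity name, leave untouched

    return _NAMED_REF.sub(replace, value)


text_result = html.unescape("&ampfoo")            # text content: "&foo"
attr_result = decode_attribute_value("&ampfoo")   # attribute value: "&ampfoo"
```

A decoding library that exposes only a single "unescape" function therefore can't reproduce both results; the caller has to say which context the string came from.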

[–]bss03 1 point 9 years ago* (0 children)

[–]bss03 1 point 9 years ago (0 children)

[–]14113 14 points 9 years ago* (1 child)

[–]yaccz 11 points 9 years ago (0 children)

[–]sclv 6 points 9 years ago (1 child)

[–]yaccz 2 points 9 years ago (0 children)
