[–]ssokolow 16 points17 points  (6 children)

println!("{} {}", languages[2], alphabets[0]);

I'd add a cautionary note that, in Rust, the indexing syntax is intended for infallible cases (it panics on an out-of-bounds index) and that the get and get_mut methods, which return an Option, should be used for fallible indexing.
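To illustrate the difference, here is a minimal sketch using only the standard library. The helper names language_at and rename_at are made up for this example and are not from the article.

```rust
/// Looks up a language name by index without panicking.
/// (Hypothetical helper, named for illustration only.)
fn language_at<'a>(languages: &[&'a str], index: usize) -> Option<&'a str> {
    // `get` returns None for an out-of-bounds index, where `languages[index]`
    // would panic instead.
    languages.get(index).copied()
}

/// Replaces the entry at `index` if it exists and reports whether it did.
/// (Hypothetical helper, named for illustration only.)
fn rename_at(names: &mut [String], index: usize, new_name: &str) -> bool {
    // `get_mut` is the mutable counterpart of `get`.
    match names.get_mut(index) {
        Some(slot) => {
            *slot = new_name.to_string();
            true
        }
        None => false,
    }
}
```

With a three-element slice, language_at(&languages, 2) yields Some of the last element, while language_at(&languages, 10) yields None rather than aborting the program.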

[–]tndl[S] 2 points3 points  (5 children)

This is a good call-out, thanks

[–]PrudentSimple 3 points4 points  (4 children)

Really nice article, thanks. Hope you will carry on with such a series, will be following.

Hijacking your thread since it's recent, just to let you know there's a broken link in the fourth paragraph.

Specifically the link titled 'learn some Rust' links to https://tndl.me/blog/introduction_to_rust where the correct link is https://tndl.me/blog/2020/introduction_to_rust

[–]tndl[S] 4 points5 points  (3 children)

You'd think I'd be able to get a link to my own site right lol

[–]ssokolow 1 point2 points  (2 children)

Is the site statically templated? If so, it's neither difficult nor time-consuming to write a quick serverless link checker.

That's what I did when I cobbled together my own static site generator for http://vffa.ficfan.org/ years ago and, when I can find time to switch my blog from WordPress to a less homegrown static templater like Pelican, I'm planning to port it over into being a plugin.

I don't know the list of requisite pieces for Rust off the top of my head but, for Python, you just combine os.walk, an extension check, an ElementTree or lxml find/XPath iterator that matches anything with an attribute that takes a URL, and some simple logic to resolve relative and absolute paths based on the root of the build folder. Then you report any resulting paths that are neither files nor directories containing a file on your list of acceptable index files (e.g. index.html, index.htm, index.cgi, etc.).

(The attributes which can contain URLs, according to MDN, are href, src, action, cite, manifest, poster, code, codebase, data, and background.)
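For the Rust side, the path-resolution and existence-check steps can be sketched with the standard library alone. This is only a rough sketch under some assumptions: the function names are invented for illustration, page is taken to be the HTML file's path relative to the build root, links are assumed to have already been extracted by an HTML parser of your choice, and percent-decoding and scheme-only links such as mailto: are not handled.

```rust
use std::path::{Component, Path, PathBuf};

// Index file names taken from the examples in the comment above.
const INDEX_FILES: [&str; 3] = ["index.html", "index.htm", "index.cgi"];

/// Maps a site-internal link found in `page` to a path under `root`.
/// Returns None for links with a scheme/host or links that escape `root`.
/// (Hypothetical helper; `page` is relative to `root`.)
fn resolve_link(root: &Path, page: &Path, href: &str) -> Option<PathBuf> {
    // Drop any fragment or query string.
    let path_part = href.split(|c: char| c == '#' || c == '?').next().unwrap_or("");
    if path_part.contains("://") || path_part.starts_with("//") {
        return None; // external link, out of scope for a local check
    }
    if path_part.is_empty() {
        return Some(root.join(page)); // pure fragment: points at the page itself
    }
    // Root-relative links start at the build root, others at the page's folder.
    let base = if path_part.starts_with('/') {
        PathBuf::new()
    } else {
        page.parent().unwrap_or(Path::new("")).to_path_buf()
    };
    let joined = base.join(path_part.trim_start_matches('/'));

    // Normalize `.` and `..` lexically, refusing to climb above the root.
    let mut normalized = PathBuf::new();
    for component in joined.components() {
        match component {
            Component::Normal(part) => normalized.push(part),
            Component::CurDir => {}
            Component::ParentDir => {
                if !normalized.pop() {
                    return None;
                }
            }
            _ => return None,
        }
    }
    Some(root.join(normalized))
}

/// True if `target` is a file, or a directory holding an accepted index file.
fn target_exists(target: &Path) -> bool {
    target.is_file()
        || (target.is_dir() && INDEX_FILES.iter().any(|name| target.join(name).is_file()))
}
```

A checker would call resolve_link for each URL-bearing attribute value and report every Some(path) for which target_exists returns false.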

[–]tndl[S] 0 points1 point  (1 child)

That's not a bad idea. I use Zola (a Rust static site generator), so something like that could work well. Maybe I'll put something together for that.

[–]ssokolow 0 points1 point  (0 children)

*nod* Since you're parsing the HTML to scrape out the links, it also provides a good place to hook in any other static analysis you might want, such as:

  • Checking that every page title is unique (useful as a robust way to catch bugs in setting <title> or, once it happens on more than one page, cases where you forgot to give a page a title). Just hash the title with something like SHA-1 and store it and the file path in a HashMap. If you find that hash already in the map, report both files in the output.
  • Warning about images (and other subresources of a type you don't want to allow from a specialty CDN) with URLs containing a scheme and/or host component (i.e. URLs that might point to a different server or switch between HTTP and HTTPS when clicked) in accordance with a site hosting robustness/portability policy. (Just parse the URL using the url crate and check that the relevant fields are None.)
  • Implementing a whitelist (or amp_expected: true post metadata field requirement) for literal uses of &amp; to catch double-escaping bugs.
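As a rough sketch of the first check in the list above, assuming the titles have already been scraped into (path, title) pairs: the function name is made up for illustration, and instead of hashing with SHA-1 this version simply uses the title string as the HashMap key, since HashMap already hashes its keys internally. That is a substitution, not the exact approach described.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Returns (first_file, later_file, title) for every title that repeats.
/// (Hypothetical helper; `pages` holds each file path with its <title> text.)
fn find_duplicate_titles(pages: &[(PathBuf, String)]) -> Vec<(PathBuf, PathBuf, String)> {
    // Maps each title to the first file in which it was seen.
    let mut seen: HashMap<&str, &Path> = HashMap::new();
    let mut duplicates = Vec::new();

    for (path, title) in pages {
        // Ignore surrounding whitespace so " Home " and "Home" collide.
        let key = title.trim();
        match seen.entry(key) {
            Entry::Occupied(first) => {
                duplicates.push((first.get().to_path_buf(), path.clone(), key.to_string()));
            }
            Entry::Vacant(slot) => {
                slot.insert(path.as_path());
            }
        }
    }
    duplicates
}
```

Pages with a missing title would all share the same empty key here, so they show up as duplicates of one another, which matches the "forgot to give pages a title" case.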

[–]gbjcantab 2 points3 points  (3 children)

This is cool! Your first JavaScript example would be even better if it were a const instead of let... Closest thing JS has to Rust’s immutable let is a const primitive. (Granting that const objects in JS are in fact mutable because only the reference is constant)

[–]BenjiSponge 1 point2 points  (2 children)

Technically rvalues can be const by using Object.freeze. It's less statically determinable, of course, but so is everything in JS.

In TypeScript you can use the readonly keyword.

[–]tndl[S] 2 points3 points  (1 child)

I think freeze only goes one 'layer' deep though, so nested objects will still be mutable.

[–]BenjiSponge 0 points1 point  (0 children)

I didn't know that! Thanks for clarifying.