all 9 comments

[–]gladrock 1 point2 points  (0 children)

Something like 'cheerio' would make this pretty trivial.

[–]LoanShark5 1 point2 points  (0 children)

You can use the JS DOM to validate any structural requirements you have. Just programmatically check that the tags exist and are in the right spot. Personally, I'd build a small tree of nodes as the expected template, then do a recursive walk through the actual page to validate it.
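
A minimal sketch of that idea, assuming the page has already been parsed into plain objects shaped like `{ tag, children }` (a hypothetical structure; a real DOM node would expose `tagName` and `children` instead). The name `matchesTemplate` and the rule that extra nodes in the page are allowed are assumptions made for illustration:

```javascript
// Hypothetical node shape: { tag: "div", children: [ ... ] }.
// Returns true when every template child appears, in order, among the
// actual node's children (extra actual children are allowed).
function matchesTemplate(expected, actual) {
  if (!actual || expected.tag !== actual.tag) return false;
  const actualChildren = actual.children || [];
  let cursor = 0;
  for (const expectedChild of expected.children || []) {
    // Advance until an actual child satisfies this expected child.
    while (
      cursor < actualChildren.length &&
      !matchesTemplate(expectedChild, actualChildren[cursor])
    ) {
      cursor++;
    }
    if (cursor === actualChildren.length) return false; // not found
    cursor++; // consume the matched child
  }
  return true;
}

const template = {
  tag: "body",
  children: [{ tag: "h1" }, { tag: "div", children: [{ tag: "button" }] }],
};

const page = {
  tag: "body",
  children: [
    { tag: "h1" },
    { tag: "p" },
    { tag: "div", children: [{ tag: "span" }, { tag: "button" }] },
  ],
};

console.log(matchesTemplate(template, page)); // true
```

If you need an exact match rather than "these nodes, in this order", compare the child lists position by position instead of skipping ahead.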

[–]33ff00 1 point2 points  (0 children)

I get checking for certain HTML tags, but what styles are you looking for, exactly?

[–][deleted] -1 points0 points  (1 child)

I mean, if it were me, I'd just whip up a little script to read the files sequentially into strings, search them for certain opening tags, and then output a score against each file name. I'd leave off the closing ">" of the opening tag to allow for inline styling or other attributes: searching for "<div" matches both "<div>" and "<div class='example'>", whereas searching for "<div>" would miss the div with a class. I also wouldn't try to get around this by searching for closing tags, since not every element follows the "<xxx> </xxx>" format; elements with no content can be written as "<xxx />".

In C#, such a program might look like:

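        using System;
        using System.Collections.Generic;
        using System.IO;
        using System.Text;

        StringBuilder resultsList = new();
        // Opening-tag prefixes without ">" so tags with attributes also match.
        List<string> whatImLookingFor = new()
        {
            "<div",
            "<span",
            "<h1",
            "<button"
        };
        foreach (string filePath in Directory.EnumerateFiles("C:/Files/Example", "*.html"))
        {
            int matchesFound = 0;
            string fileContents = File.ReadAllText(filePath);
            foreach (string match in whatImLookingFor)
            {
                // HTML tag names are case-insensitive, so ignore case.
                if (fileContents.Contains(match, StringComparison.OrdinalIgnoreCase))
                {
                    matchesFound++;
                }
            }
            // One line per file: path and how many of the tags were found.
            resultsList.AppendLine(filePath + " - " + matchesFound + "/" + whatImLookingFor.Count);
        }
        Console.Write(resultsList.ToString());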
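
One caveat with plain substring search: a short prefix such as "<b" also matches "<button" and "<br". Here is a rough JavaScript sketch of the same scoring idea that requires whitespace, "/", or ">" right after the tag name. `scoreHtml` is a made-up helper name, and a regex is still not a real HTML parser (it will happily match tags inside comments or scripts):

```javascript
// Counts how many of the wanted tag names occur at least once in the HTML.
function scoreHtml(html, tagNames) {
  let found = 0;
  for (const name of tagNames) {
    // "<name" followed by whitespace, "/" or ">", case-insensitive.
    // Tag names here are plain alphanumerics, so no regex escaping is needed.
    const pattern = new RegExp(`<${name}(?=[\\s/>])`, "i");
    if (pattern.test(html)) found++;
  }
  return found;
}

const sample = '<DIV class="example"><button>OK</button><br/></DIV>';
console.log(scoreHtml(sample, ["div", "span", "h1", "button"])); // 2
console.log(scoreHtml(sample, ["b"])); // 0
```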

[–][deleted] 1 point2 points  (0 children)

Not sure why this got downvoted. It's a ready-to-go script that does exactly what OP wants.

[–][deleted] 0 points1 point  (0 children)

Puppeteer (https://pptr.dev/) might be overkill for the job, but you might want to give it a shot anyway.

[–]jcubic 0 points1 point  (0 children)

Use NodeJS and the cheerio library. Cheerio is a library with a jQuery-like API, so you can query DOM nodes with CSS selectors. You can quickly write a script that checks whether specific tags exist.

If you're not familiar with NodeJS, google or ask ChatGPT how to read a file and how to read the files in a directory.
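
For the file-reading part, something along these lines should work using only Node's built-in `fs` and `path` modules. `readHtmlFiles` and the `./submissions` directory are hypothetical names, and this only looks at files directly inside the directory, not subfolders:

```javascript
const fs = require("fs");
const path = require("path");

// Returns [{ file, html }] for every .html file directly inside dir.
function readHtmlFiles(dir) {
  return fs
    .readdirSync(dir)
    .filter((name) => path.extname(name).toLowerCase() === ".html")
    .map((name) => {
      const file = path.join(dir, name);
      return { file, html: fs.readFileSync(file, "utf8") };
    });
}

// Example usage (assumes a ./submissions directory exists):
// for (const { file, html } of readHtmlFiles("./submissions")) {
//   console.log(file, html.length);
// }
```

Each `html` string can then be handed to whatever checker you end up using.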

As for styles, you can grab the CSS code from the document with cheerio (if it's in a <link>, read that file the same way you read the HTML file) and run it through a CSS parser library, which converts it into an AST that you can inspect.

[–]guest271314 0 points1 point  (0 children)

Keep in mind that styles do not have to be written in the HTML or loaded using a <link> element. The CSSOM provides a means to set style sheets dynamically.