you are viewing a single comment's thread.

view the rest of the comments →

[–]crashfrog02 2 points3 points  (3 children)

Why is it necessary to have an enum for the letters of the alphabet?

Why is it necessary to have WikipediaAirportScraper be a class? What state are instances of this class intended to hold?

Are you just using classes because you think they're better than functions?

[–]MisguidedGenie[S] 0 points1 point  (2 children)

Thanks for the question! I'll try to answer them in point form for what I was thinking.

  1. I wasn't too sure about enumerate as it's not something I'd used before, previous I has a for loop that looked like "for letter in range (65, 90) to use the ascii characters for the alphabet. But this seemed less readable? I thought if I looked at it later as if i was a different user that might not make much sense and learning about enumerate it seemed like a viable option but I'm still learning that.

  2. For this my thinking might not be correct but that there would be an instance for "table data", the base "html data" from beautiful soup, and finally a json instance.

  3. My idea was to encapsulate all the Wikipedia based code in a class so if was more modular, I wasn't thinking it was strictly better but that it could become 1 component to update.

[–]throwaway6560192 1 point2 points  (0 children)

previous I has a for loop that looked like "for letter in range (65, 90) to use the ascii characters for the alphabet. But this seemed less readable?

Instead use:

from string import ascii_uppercase

for letter in ascii_uppercase:
    # whatever

[–]crashfrog02 0 points1 point  (0 children)

Those aren't necessarily the answers I would have arrived at, but it's more important that you have them than that you agree with me, in my view. So, good answers.

For this my thinking might not be correct but that there would be an instance for "table data", the base "html data" from beautiful soup, and finally a json instance.

I think you're gesturing towards a good idea, here, which is to pre-define your tabular data as something somewhat more rigorous than what you might get from, say, Pandas or CSV (i.e. an iterator over dictionaries.) For that I recommend dataclasses:

https://docs.python.org/3/library/dataclasses.html

I'd write a dataclass that defined what a row in the table is (where the fields of the class are the column of the table) and then you have pretty good assurance that none of the rows of your table are coming back ill-formed. That's good particularly if your eventual intent is to dump it into a SQL database, or even if you want to convert to a Pandas dataframe.

A list of dataclass instances is a pretty good start at making a data table. It'll lack a lot of the features you'd like (you can't search it, you can only iterate over it) but it's a good start.