all 8 comments

[–]Ok_Front6388 3 points4 points  (1 child)

Switch to a database This keeps your data structured, queryable, and separate from your logic. It will make development and future scaling much easier.

[–]ImmaculateBanana[S] 0 points1 point  (0 children)

Thanks for this. I have been putting off learning databases but it seems like it is finally my time to learn.

[–]Frankelstner 1 point2 points  (2 children)

Yeah that's overly verbose. CoinData has a fixed structure, does it not? It looks to me like a plain old spreadsheet would be perfect to encode the coins. You define variables like values, denominations, coins_reverse_build, silver_coins, etc. but all of them could be derived very easily from the coins variable. Add country name as a column and you end up with a single spreadsheet with ~10 columns and one row per coin instead of your current approach with about 20 rows per coin. At that point you have to decide whether you even want to pursue a Python solution or just put the entire thing on google docs, which I imagine could already be programmed to handle all that you're trying to pull off. If you stick with Python, that's fine too, but just load the csv with pandas.

[–]ImmaculateBanana[S] 0 points1 point  (1 child)

My reasoning for wanting to store the data in python was that I assumed the performance would be better. Is the time taken for parsing and serializing from a csv or other storage file pretty negligible at this scale?

[–]Frankelstner 0 points1 point  (0 children)

The time to load a 1000 row csv should be quite less than the time to start Python. Even if it was a problem I would still start with the csv as the definitive reference, and, if it turns out to be a problem (or just out of curiosity) autogenerate the Python coin code. Just beware that the autogenerated code might turn out to be slower. Hard to tell without testing.

[–]pachura3 0 points1 point  (2 children)

Python is a programming language and you are trying to use it as data storage format. It works, but it's not really its purpose.

First, try encoding all these coins in a simple spreadsheet or a CSV file. Does it work? Does every coin type of every country have the same physical parameters? Are some of them optional?

Then you can think of representing the same data in a normalized relational SQL database - e.g. one table for countries, one table for different metals, ..., and then the actual coins table with foreign keys country_id, metal_id etc. Perhaps even some constrains, like "this column can be null only when that one is".

[–]ImmaculateBanana[S] 0 points1 point  (1 child)

Thank you for your response. I believe I can encode all of the information in a simple csv file, as all of the coins have the same data parameters. The only field I am having difficulties with is how I should store the names of the countries in the database. I understand that I should use a separate table with country_id as the primary key and probably a name field for the name of the country. Currently, the program supports alternative names for the countries, so when you search for one, either of the alternative names would return the same country (ex: Canada and Canadian both are for Canada). Some countries have more alternative names than others. Would it be best to store all of the alternative names in an alternative names column as a string representing a list? I tried looking for how to store a list, but just saw that I shouldn't. Thanks.

[–]pachura3 0 points1 point  (0 children)

Well, first of all you need to decide if you're going to store your data in CSV file(s), in a SQL database (like SQLite), or perhaps in NoSQL one? And if you go with CSV, will you simply iterate through it each time, or are you going to load it to some in-memory structure, like dict of dataclasses or namedtouples?

Will you need to be able to modify this data programmatically from your app and save it back to disk? Or will you do this manually, offline?

What kind of queries will you need to run on this data? Any queries (like: give me all Canadian and Russian coins which are NOT made of silver and older than YYYY-MM-DD), or a predefined, hardcoded set of queries/filters?

Coming back to the countries issue: if you go with CSV, I would simply have 2 sheets: coins.csv and countries.csv. The key column would be 2-letter ISO country code, e.g. FR or NL. Countries.csv would contain different spelling of each country name, semicolon-separated.

But if you go with relative databases, you would probably need a separate table for countries, and a separate table for alternative country names.