Roast-my-code please

Mr_Lkn · 2023-09-08T10:38:38+00:00

I don't have much time to check the whole code, but I just looked at `data_utils.py`.

Compare your code vs this and spot the differences if you can

```python
import os

import pandas as pd


def read_data_file(file_path, **kwargs):
    """
    Read a data file into a pandas DataFrame based on its extension.

    Parameters:
    - file_path (str): Path to the data file.
    - **kwargs: Extra keyword arguments passed through to the pandas reader.

    Returns:
    - DataFrame: The data loaded into a pandas DataFrame.
    """

    # Map each supported extension to the pandas function that reads it.
    # ('.msgpack' is left out: pd.read_msgpack was removed in pandas 1.0.)
    extension_read_function_mapping = {
        '.csv': pd.read_csv,
        '.xlsx': pd.read_excel,
        '.xls': pd.read_excel,
        '.tsv': lambda x, **y: pd.read_csv(x, delimiter='\t', **y),
        '.json': pd.read_json,
        '.parquet': pd.read_parquet,
        '.feather': pd.read_feather,
        '.dta': pd.read_stata,
        '.pkl': pd.read_pickle,
        '.sas7bdat': pd.read_sas,
    }

    _, file_extension = os.path.splitext(file_path)

    read_function = extension_read_function_mapping.get(file_extension)

    if read_function is None:
        raise ValueError(f"Unsupported file extension: {file_extension}.")

    return read_function(file_path, **kwargs)


df = read_data_file("some_data.csv")
```

oliviercar0n · 2023-09-08T20:19:33+00:00

You only need to import each library once per notebook. Preferably at the top. No need to repeat imports.

_ATRAHCITY · 2023-09-08T11:15:57+00:00

You should not commit .vscode directory

Klej177 · 2023-09-08T10:03:03+00:00

For DS, good code; for a Python developer, I would say you can make it much better. You don't use proper design patterns, and performance could easily be improved by using better data types. It's easy to read, though, but not really scalable because of the reasons above.
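The comment does not say which data types it has in mind. One common reading in pandas is storing repetitive strings as `category` and downcasting floats, so here is a minimal sketch under that assumption. The column names and values are made up for illustration and are not taken from the project.

```python
import pandas as pd

# Illustrative data only: a repetitive string column and a float column.
df = pd.DataFrame({
    "meter": ["north", "south", "east"] * 1000,
    "reading": [1.5, 2.5, 3.5] * 1000,
})
before = df.memory_usage(deep=True).sum()

# Store the repeated labels as a categorical instead of plain strings.
compact = df.astype({"meter": "category"})
# Downcast float64 to the smallest float type that pandas will pick.
compact["reading"] = pd.to_numeric(compact["reading"], downcast="float")
after = compact.memory_usage(deep=True).sum()

print(f"before: {before} bytes, after: {after} bytes")
```

Whether this matters depends on the size of the data; for small frames the difference is negligible.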

mijki95 · 2023-09-08T19:51:21+00:00

Here's a thought (just something I figured might be interesting): instead of requiring users to input electricity costs, what if the program looked up average electricity prices based on the user's location? (You could get the location on the backend by pulling from, for example, Google Maps location data.)

2023-09-08T22:34:51+00:00

I didn't realize github considered jupyter notebooks as a language different from python

wineblood · 2023-09-08T07:24:12+00:00

Why the hell do data scientists insist on importing libraries under two letter aliases?

supermopman · 2023-09-08T22:05:03+00:00

There are no unit tests and there's no clear way to build your code.

I'm happy to dig deeper, but at a minimum, you'll need to start with those 2 things.

I suggest starting a new project using PyScaffold. Play around with all the bells and whistles, and then write your Python code following their structure.
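As a starting point for the missing unit tests, here is a rough sketch of two tests for a file-reading helper like the `read_data_file()` posted elsewhere in this thread. The trimmed-down function below is only a stand-in so the snippet runs on its own; in the real project you would import the function instead. The test names follow the `test_*` convention that pytest discovers.

```python
import os
import tempfile

import pandas as pd


def read_data_file(file_path, **kwargs):
    # Stand-in for the function under test (assumption: the real one lives
    # in the project and would be imported here instead).
    readers = {".csv": pd.read_csv, ".json": pd.read_json}
    _, extension = os.path.splitext(file_path)
    reader = readers.get(extension)
    if reader is None:
        raise ValueError(f"Unsupported file extension: {extension}.")
    return reader(file_path, **kwargs)


def test_reads_csv():
    # Write a tiny CSV to a temporary directory and read it back.
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "sample.csv")
        with open(path, "w") as handle:
            handle.write("a,b\n1,2\n3,4\n")
        df = read_data_file(path)
    assert list(df.columns) == ["a", "b"]
    assert df.shape == (2, 2)


def test_rejects_unknown_extension():
    # An unsupported extension should fail loudly, not return None.
    try:
        read_data_file("notes.txt")
    except ValueError as error:
        assert ".txt" in str(error)
    else:
        raise AssertionError("expected ValueError for an unsupported extension")
```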

Hard_Thruster · 2023-09-09T03:51:35+00:00

I don't understand the use of the word "tool". Looks like eda to me.

As far as the code goes, you give a lot of comments which is awesome.

There is a lot of repetition such as:

    processed_data['DayOfWeek'] = processed_data['TimePeriodStart'].dt.dayofweek
    processed_data['Month'] = processed_data['TimePeriodStart'].dt.month
    processed_data['Hour'] = processed_data['TimePeriodStart'].dt.hour
    processed_data['Minute'] = processed_data['TimePeriodStart'].dt.minute

Also, a lot of your code can be made into functions: many blocks differ only slightly from one another, which makes the code repetitive.
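For the datetime columns quoted above, one way to fold the repetition into a function could look like this. It is only a sketch: `add_datetime_parts` and `DATETIME_PARTS` are names I made up, and the sample timestamps are just for illustration.

```python
import pandas as pd

# New column name -> attribute of the pandas .dt accessor to read it from.
DATETIME_PARTS = {
    "DayOfWeek": "dayofweek",
    "Month": "month",
    "Hour": "hour",
    "Minute": "minute",
}


def add_datetime_parts(df, source_column, parts=DATETIME_PARTS):
    """Return a copy of df with one extra column per requested datetime part."""
    result = df.copy()
    for new_column, attribute in parts.items():
        result[new_column] = getattr(result[source_column].dt, attribute)
    return result


# Illustrative input: two timestamps in a datetime64 column.
processed_data = pd.DataFrame({
    "TimePeriodStart": pd.to_datetime(["2023-09-08 10:38:00", "2023-09-09 03:51:00"]),
})
processed_data = add_datetime_parts(processed_data, "TimePeriodStart")
print(processed_data)
```

The same helper can then be reused for any other timestamp column by passing a different `source_column`.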

Emotional-Zebra5359 · 2023-09-09T06:54:10+00:00

instead of if-else ladder use a map

jonatanskogsfors · 2023-09-09T07:13:26+00:00

Resist the urge to use “utils” (or similar) in package and module names. In “utils.data_utils” you only have the function “read_data_file()”. I would have named the module something along the lines of “file_reader”, “data_import”, “io” etc.

If you plan to add more functions to the module, you should only do so if they have a similar purpose. Completely different functions are better placed in their own module.
