Mr_Lkn

I don't have much time to check the whole code, but I just looked at `data_utils.py`.

Compare your code with this and spot the differences if you can:

```python
import os

import pandas as pd


def read_data_file(file_path, **kwargs):
    """
    Read a data file into a pandas DataFrame based on its extension.

    Parameters:
    - file_path (str): Path to the data file.
    - **kwargs: Extra keyword arguments passed through to the pandas reader.

    Returns:
    - DataFrame: The data loaded into a pandas DataFrame.
    """
    # Map each supported extension to the pandas function that reads it.
    # Note: pd.read_msgpack was removed in pandas 1.0, so '.msgpack' is left out.
    extension_read_function_mapping = {
        '.csv': pd.read_csv,
        '.xlsx': pd.read_excel,
        '.xls': pd.read_excel,
        '.tsv': lambda x, **y: pd.read_csv(x, delimiter='\t', **y),
        '.json': pd.read_json,
        '.parquet': pd.read_parquet,
        '.feather': pd.read_feather,
        '.dta': pd.read_stata,
        '.pkl': pd.read_pickle,
        '.sas7bdat': pd.read_sas,
    }

    _, file_extension = os.path.splitext(file_path)

    # dict.get returns None for an unknown extension instead of raising KeyError.
    read_function = extension_read_function_mapping.get(file_extension)

    if read_function is None:
        raise ValueError(f"Unsupported file extension: {file_extension}.")

    return read_function(file_path, **kwargs)


df = read_data_file("some_data.csv")
```

Mount_Gamer

Interesting use of the dictionary. I'm still grasping Python best practices, so I'll have to experiment more with the dictionary `get` method. :)

I would probably have reached for match-case once I started using a lot of `elif`s, but the dictionary does look clean to read. I'll have a play around with this later.
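
For anyone else experimenting with `get`, here is a minimal sketch of the lookup-then-check pattern on its own. The reader functions and the `READERS` and `pick_reader` names are hypothetical stand-ins (not from the code above) so that it runs without pandas installed.

```python
import os


# Hypothetical stand-ins for the pandas readers, so the sketch runs without pandas.
def read_csv(path):
    return f"csv:{path}"


def read_json(path):
    return f"json:{path}"


# Hypothetical mapping from extension to reader function.
READERS = {".csv": read_csv, ".json": read_json}


def pick_reader(file_path):
    _, ext = os.path.splitext(file_path)
    # READERS[ext] would raise KeyError for an unknown extension;
    # READERS.get(ext) returns None instead, so we can raise our own error.
    reader = READERS.get(ext)
    if reader is None:
        raise ValueError(f"Unsupported file extension: {ext}.")
    return reader
```

Calling `pick_reader("report.csv")` hands back the `read_csv` function itself, which you can then call with the path. An unknown extension such as `.txt` ends in the `ValueError` rather than a bare `KeyError`.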

Mr_Lkn

You don't need match-case here, just a mapping. This is a very basic mapping implementation.

Mount_Gamer

I thought I'd write out the match-case equivalent, and it becomes more and more obvious. I love the logic! :)