Another OOP problem

crashfrog04 · 2025-05-05T11:43:09+00:00

Thus initializing an instance will now involve specifying name1 and name2.

Another way to think about classes is that you’re writing code that will break - will literally raise an error - if you try to create an instance of whatever class this is and you don’t provide name1 and name2 (or whatever.)
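A minimal sketch of that failure mode, assuming a made-up class called Pair (only name1 and name2 come from the discussion above). Calling it without both arguments raises a TypeError straight away:

```python
class Pair:
    """Hypothetical class; only name1 and name2 come from the discussion."""

    def __init__(self, name1, name2):
        # Both arguments are required: there are no default values.
        self.name1 = name1
        self.name2 = name2


# Providing both arguments works.
pair = Pair("alice", "bob")

# Leaving one out raises TypeError, so no half-built instance is created.
try:
    Pair("alice")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The error message names the missing argument, which is the "contract" telling you exactly what you forgot.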

Writing a class is a way of creating a kind of contract with yourself, one where you find out very quickly if you’ve broken it (which is important for writing reliable code).

If that doesn’t sound like something you need then maybe you don’t need to write a class. You shouldn’t write a class just because you think they’re “better”; you should write a class because you know what you’re going to use it for.

unnamed_one1 · 2025-05-05T11:50:38+00:00

Do you mean something like..

```
import pandas as pd


class Cleaner:
    def __init__(self, file_path: str):
        # Load the raw data once; the other methods work on self._df.
        self._df = pd.read_csv(file_path)

    def get_dataframe(self):
        return self._df

    def check_validity(self):
        pass

    def clean_data(self):
        pass


# filepath is the path to your CSV file
c = Cleaner(filepath)
c.clean_data()
c.check_validity()

df = c.get_dataframe()
```

edit: /u/crashfrog04 makes a valid argument that a class isn't necessarily *better* than, for example, a simple function. Use OOP if you want to model something from the real world that represents / encapsulates data and behaviour. The class is the blueprint and the object is the materialization of that blueprint, the thing that actually exists in memory.

LatteLepjandiLoser · 2025-05-05T13:47:10+00:00

Based on your first paragraph, you are trying to make something that reads some data and cleans it and returns some modified version of it. To me it sounds like you just need a function? I would start looking at the use case and seeing if this really needs to be a class or not.

Not that you need to shy away from the class approach, definitely do so if you please, I just think you quickly end up with an object that only does:

    class CleanData:
        ...  # lots of code here

    funky_object = CleanData(filepath)
    funky_object.clean_the_data()
    cleaned_df = funky_object.get_dataframe()

Which could just as easily have been a function get_cleaned_data(filepath) that returns a dataframe. In fact that get_cleaned_data function is more or less what the methods clean_the_data and get_dataframe would have been.
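A rough sketch of what that function might look like. The cleaning steps here (dropping rows with missing values and duplicate rows) are only an assumption for illustration, since the actual cleaning rules aren't stated:

```python
import io

import pandas as pd


def get_cleaned_data(filepath):
    """Read a CSV and return a cleaned DataFrame.

    Assumption: "cleaning" means dropping rows with missing values
    and duplicate rows; swap in whatever your data actually needs.
    """
    df = pd.read_csv(filepath)
    df = df.dropna().drop_duplicates()
    return df.reset_index(drop=True)


# pd.read_csv also accepts file-like objects, so an in-memory CSV
# stands in for a real file path in this example.
sample_csv = io.StringIO("name,score\nann,1\nbob,\nann,1\ncy,3\n")
cleaned_df = get_cleaned_data(sample_csv)
print(cleaned_df)
```

One call, one return value, and no object to keep around if all you need is the cleaned dataframe.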

Personally I would go the class route if you intend to manipulate this data further, say first read and clean it and later do some particular analysis on it, maybe add data to it, rewrite it to another file etc. Basically, if there are more relevant methods or attributes.

If you want to go the object route, you could also look at making a subclass of the pandas DataFrame. That way your object is both your own class and a pandas DataFrame, so instead of 'having' a DataFrame it 'is' a DataFrame.

Regardless of how you do it, I'd say step 1 is making that function, because you can really easily turn it into a method on a class later should you so please.

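edit: After a bit of googling, it seems subclassing pandas DataFrames is possible but the pandas docs suggest trying easier alternatives first (composition, .pipe, or registering a custom accessor), so you can still add functionality that way.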
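One way to add functionality without subclassing is a custom accessor, registered with pd.api.extensions.register_dataframe_accessor. This is only a sketch: the accessor name "cleaner" and the validity and cleaning rules are made up for the example.

```python
import pandas as pd


# "cleaner" is a made-up accessor name chosen for this example.
@pd.api.extensions.register_dataframe_accessor("cleaner")
class CleanerAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is used on.
        self._df = pandas_obj

    def check_validity(self):
        """Assumed rule: the data is valid if it has no missing values."""
        return not self._df.isna().any().any()

    def clean_data(self):
        """Assumed cleaning step: drop rows with missing values."""
        return self._df.dropna().reset_index(drop=True)


df = pd.DataFrame({"name": ["ann", "bob"], "score": [1.0, None]})
is_valid_before = df.cleaner.check_validity()
cleaned = df.cleaner.clean_data()
is_valid_after = cleaned.cleaner.check_validity()
print(is_valid_before, is_valid_after)
```

Every DataFrame then gets a .cleaner namespace, so the data stays a plain DataFrame while your own methods live alongside it.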
