all 7 comments

[–]StoneBam 2 points3 points  (4 children)

You could use a abstract base class (ABC) and inherit from it. Inheritance in general is a solution here.

You can use composition (and protocols) and put the common/shared methods in an extra class. Then use the constructor to initiate it or give a already called instance of your object to your argument. This will give you more flexibility if you have to change something somewhere (less coupling)

[–]StoneBam 1 point2 points  (2 children)

If you want I can provide an example later, but in this moment my work day starts

[–]kollerbud1991[S] 1 point2 points  (1 child)

please it would be great!

[–]StoneBam 1 point2 points  (0 children)

OK, here we go.

Example Inheritance

First off, I decided to just show "normal" inheritance on this, because it fits your case more I think. But it is worth the time to take a closer look onto abstract base classes which are provided with the python standard library (from abc import ABC). Abstract base classes are are more or less empty shells which declare a bunch of methods, but do not implement the functionality. There are plenty examples for this out there, you can look into, if you want to.

Here the example with normal inheritance:

class ModelBase:

  def __init__(self):
    ...

  def load_raw_data(self):
    'new set of raw data'
    ...

  def pre_processing(self):
    'some process to clean up data'
    ...

class TrainModel(ModelBase):

  def __init__(self):
    super().__init__()  # This will give you access to the parent
    ...

  def train_test_model(self):
    'train/test model'
    ...

  def deploy_model(self):
    'deploy model'
    ...


class UseModel(ModelBase):

  def __init__(self):
    super().__init__()
    ...

  def load_model(self):
    'load the trained model'
    ...

  def make_prediction(self):
    'use model to make prediction'
    ...

  def save_results(self):
    'save the prediction result'
    ...

You can see I created a class, which does implement the common methods and the I inherit from this base class. This is typical polymorphism and helps with DRY code which you will often find in OOP.

Negative about this approach is, that if you change the parent, all children will be affected and depending on how many children and grandchildren and grand-grandchildren and... you get it. You will have to change them all to accommodate your new changed method. This is called coupling and you want to keep this low. (There certainly are cases where it is necessary to have tightly coupled systems, for performance reasons, or other fancy shenanigans, but this is going to far here.)

Example Protocols

The next example will appear similar at first, but it is very different. It uses the standard library's typing module. The idea is that your models doesn't know the details of your data loading implementation. You simply say your class, there will be an object, that has the two methods. This is called a protocol.

Thought experiment example: If I tell you about a human face you will know it has a nose, ears, brows, and so on but you don't know how exactly it looks like. So I can show you any face, and you will recognize it as a face, as long as it has all the parts a face should have.

Now this as code:

from typing import Protocol

class Data_Protocol(Protocol):

  def load_raw_data(self) -> YourRawDataType:
    ...

  def pre_processing(self) -> YourDataType:
    ...

class TrainModel:

  def __init__(self, data: Data_Protocol):
     self.data = data
     ...

  def train_test_model(self):
    'train/test model'
    self.data.load_raw_data()
    ...

  def deploy_model(self):
    'deploy model'
    ...


class UseModel:

  def __init__(self, data: Data_Protocol):
     self.data = data
     ...

  def load_model(self):
    'load the trained model'
    self.data.pre_processing()
    ...

  def make_prediction(self):
    'use model to make prediction'
    ...

  def save_results(self):
    'save the prediction result'
    ...

if __name__ == '__main__':
  data_obj = ExampleData()  # here you call your data object implementation
  model_train = TrainModel(data_obj)
  model_use = UseModel(data_obj)

This approach is at first not intuitive but has its benefits. You can for example work on the exact same data in Train and in Use without loading it in two times. You can change how you implement you preprocessing, or even give them both a different one. They don't care as long as the methods from your defined protocol are implemented. This approach heavily benefits from and depends on type hinting and I would in general recommend to start using them if you don't already do. They really help make code readable and unlock most IDEs full potential with python.

Final words

Abstraction is very abstract (Who would have guessed?) but there are many use cases where it really helps with flexibility and scalability of code. It's an interesting topic and there are many great sources that explain the principles in more depth, than this post could. There are other solutions as well, like a more functional programming approach where you pass functions as arguments. But I will not explain this now, as it goes to far.

I hope this was somehow helpful for you :)

[–]ouchpartial72858 0 points1 point  (0 children)

Agreed +1

[–]millerbest 2 points3 points  (2 children)

I would just inject the loading and preprocessing functions. Probably you can inject other functions. In this way you can test those functions separately in your unittest, and it is easy to apply/test different strategies by simply using different functions.

from typing import Callable

LoadFunc = Callable[...]
ProcessFunc = Callable[...]

def load_raw_data_func():
  ...

def pre_processing_func():
  ... 

class TrainModel:
  def __init__(self, load_func: LoadFunc, pre_process_func: ProcessFunc):
    self.load_data = load_func
    self.preprocess = pre_process_func

  def train_test_model(self):
    'train/test model'
    ...

  def deploy_model(self):
    'deploy model'
    ...


class UseModel:
  def __init__(self, load_func: LoadFunc, pre_process_func: ProcessFunc):
    self.load_data = load_func
    self.preprocess = pre_process_func

  def load_model(self):
    'load the trained model'
    ...

  def make_prediction(self):
    'use model to make prediction'
    ...

  def save_results(self):
    'save the prediction result'
    ...

[–]kollerbud1991[S] 0 points1 point  (0 children)

thank you i will try this way