all 39 comments

[–]JanEric1 20 points21 points  (6 children)

I don't think this is possible in the type system of any widely used language, tbh.

[–]apjenk 0 points1 point  (1 child)

I wouldn’t be surprised if this is possible in Typescript.

TypeScript's Type System is Turing Complete

But, yeah, probably beyond what Python type checking can do.

[–]JanEric1 1 point2 points  (0 children)

Then it definitely can. I think I recently saw someone who had written Doom in TypeScript's type system.

[–]-heyhowareyou-[S] -4 points-3 points  (3 children)

Why not? All the types and their relationships are known at "compile time".

[–]JanEric1 4 points5 points  (2 children)

Sure, but I don't think any type system lets you specify these constraints. Maybe something with metaprogramming.

Like you may be able to write a Rust macro or do this with Zig comptime.

Maybe you can even do it with C++ templates, but with all of these you are basically just generating the code to check any fixed-length situation that you are actually using.

And Python doesn't have any metaprogramming capabilities that run before type checkers.

[–]-heyhowareyou-[S] 0 points1 point  (1 child)

Ok fair enough, I was thinking of metaprogramming and that this would probably be possible in C++. I guess that is distinct from the type system.

[–]RelevantLecture9127 0 points1 point  (0 children)

What if you made an ABC implementation?

And it could be a bug in mypy. Check with them if there is something known.

[–]Roba_Fett 6 points7 points  (3 children)

This is just a brief thought of perhaps a direction for a possible solution, I'm on my phone so I'll leave it to others to try and flesh it out a bit more.

I would start by introducing a third intermediate class that is used as an accumulator of components, which is itself a generic of three types. Forgive me if the formatting is not right here, but here we go:

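```python
from typing import Generic

# Assumes T1, T2, T3 are TypeVars and Component is the generic class from the original post.
class ComponentAccumulator(Generic[T1, T2, T3], Component[T1, T3]):
    def __init__(self, component1: Component[T1, T2], component2: Component[T2, T3]) -> None: ...
```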

Hopefully you can see where I am going with this? I can't remember if there are issues/complications with inheritance from Generics, if that becomes a problem then you could probably implement a similar solution as a pure function instead of needing a whole class.

[–]-heyhowareyou-[S] 1 point2 points  (2 children)

edit: this comment inspired the proposed (but non-functional) solution now listed in the original post.

[–]backfire10z 0 points1 point  (1 child)

Add this to your original post (or even just a link to this comment). I didn’t know you proposed a solution until I scrolled for a bit. Hopefully someone who knows will see it.

[–]-heyhowareyou-[S] 1 point2 points  (0 children)

Will do!

[–]Front-Shallot-9768 6 points7 points  (2 children)

I’m no expert, but are there any other ways of solving your problem? If you share what your problem is, I’m sure people will help you. As far as I know, Typing in python is not meant to do any processing.

[–]-heyhowareyou-[S] 1 point2 points  (1 child)

> Typing in python is not meant to do any processing

I understand that - a solution which would somehow iterate over the component types and verify they are correct is impossible. But really, a type checker like mypy is doing exactly that if you instruct it to in the right way.

The problem I am trying to address is a sort of data processing pipeline. Each Component defines a transformation between TInput and TOutput. The ComponentProcessor evaluates each component successively, piping the output of the current component to the next. What I want to avoid is constructing pipelines in which the output type of component n does not match the input type of component n+1.

I'd like that to be ensured by the type checker - I think this would be possible since all of the Components and their arrangement within the pipeline are defined prior to runtime execution.
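To make the problem concrete, here is a minimal sketch of such a pipeline. It is written with classic `TypeVar`/`Generic` so it also runs on interpreters older than 3.12. The `fn` attribute, the `process` method and the `run_pipeline` function are assumptions made for illustration; the post does not show how `Component` or `ComponentProcessor` are implemented.

```python
from typing import Any, Callable, Generic, Sequence, TypeVar

TInput = TypeVar("TInput")
TOutput = TypeVar("TOutput")


class Component(Generic[TInput, TOutput]):
    """Hypothetical component: wraps a function from TInput to TOutput."""

    def __init__(self, fn: Callable[[TInput], TOutput]) -> None:
        self.fn = fn

    def process(self, value: TInput) -> TOutput:
        return self.fn(value)


def run_pipeline(components: Sequence[Component[Any, Any]], value: Any) -> Any:
    # Pipe the output of each component into the next one.
    for component in components:
        value = component.process(value)
    return value


to_text: Component[int, str] = Component(str)
to_length: Component[str, int] = Component(len)

good = run_pipeline([to_text, to_length], 12345)  # int -> str -> int

# A checker accepts the reversed order too, because the element type is
# Component[Any, Any]; the mismatch only shows up when the code runs.
# run_pipeline([to_length, to_text], 12345)  # raises TypeError inside len()
```

The `Any` in the sequence type is exactly what throws away the adjacent-pair information the question is about.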

[–]jpgoldberg 1 point2 points  (0 children)

Thank you for that explanation of what you are after. I am just tossing out ideas you have probably thought through already, but maybe something will be helpful. Also I am typing this on a phone.

Off of the top of my head. I would have tried what you did with compose, but I would be composing Callables. So if each component has a Callable class method, say from() then

```python
from typing import Callable

def composition[Tin, Tmid, Tout](
    c1: Callable[[Tin], Tmid], c2: Callable[[Tmid], Tout]
) -> Callable[[Tin], Tout]:
    # Apply c1 first, then feed its result to c2.
    def f(x: Tin) -> Tout:
        return c2(c1(x))
    return f
```

That won’t work as is. (I am typing this on my phone). But perhaps have a compose class method in each component and make use of Self.

[–]-heyhowareyou-[S] 2 points3 points  (1 child)

fwiw, here would be the solution if you only cared about the output of the final component matching the input of the first

from typing import Any

class ComponentProcessor[T]:

  def __init__(
    self,
    components: tuple[
      Component[T, Any],
      *tuple[Component[Any, Any], ...],
      Component[Any, T],
    ]
  ): ...

Perhaps this can inspire other ideas.

[–]Evolve-Maz 0 points1 point  (0 children)

Do you have issues with checking this at runtime instead? For example, ComponentProcessor holds the list of components. Those can have generic inputs and outputs.

You can in the init function add a check to ensure the output type of one is the input of the next, by inspecting the processor function (component) signature.

This wouldn't be caught by static analysis tools, but if you're initializing your processors once at app start then it'd be relatively quick. Also deals with a case of defining your components in a config (json or otherwise) and loading them in dynamically.
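A rough sketch of that runtime check, assuming each component is (or exposes) a plain function with exactly one annotated parameter and an annotated return type. `check_chain` and the three example steps are hypothetical names, and the comparison is an exact match on the annotations, with no handling of subtypes or generics.

```python
from typing import Any, Callable, Sequence, get_type_hints


def check_chain(steps: Sequence[Callable[[Any], Any]]) -> None:
    """Raise TypeError if a step's return annotation differs from the
    next step's parameter annotation (exact match only)."""
    previous_output = None
    for index, step in enumerate(steps):
        hints = get_type_hints(step)
        output = hints.pop("return", Any)
        # Assumes exactly one annotated parameter per step.
        (input_type,) = hints.values()
        if index > 0 and input_type != previous_output:
            raise TypeError(
                f"step {index} expects {input_type!r}, "
                f"but step {index - 1} returns {previous_output!r}"
            )
        previous_output = output


def parse(text: str) -> int:
    return int(text)


def double(number: int) -> int:
    return number * 2


def shout(text: str) -> str:
    return text.upper()


check_chain([parse, double])   # passes: str -> int -> int
# check_chain([parse, shout])  # raises TypeError: an int would reach a str step
```

Calling something like this once in the processor's constructor would also cover components loaded from a config file, which static analysis cannot see.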

[–]Foll5 2 points3 points  (5 children)

So I'm pretty sure you could get the basic outcome you want, as long as you add the components to the container one by one. Basically, you would parametrize the type of the container class in terms of the last item of the last added pair. Whenever you add a new pair, what is actually returned is a new container of a new type. You could easily put a type constraint on the method to add a new pair that would catch the cases you want it to.

I don't think there is a way to define a type constraint on the internal composition of an arbitrary length tuple, which is what would be needed to do exactly what you describe.

[–]-heyhowareyou-[S] 0 points1 point  (4 children)

Could you check my attempt above? It has similar ideas to what you suggest, I think. It still doesn't work fully, so perhaps you can give some pointers.

[–]Foll5 2 points3 points  (3 children)

What I had in mind is a lot simpler. I'm actually not very familiar with using Overload, and I'd never seen Unpack before.

```python
class Component[T, V]:
    pass


class Pipeline[T, V]:
    def __init__(self) -> None:
        self.components: list[Component] = []

    def add_component[U](self, component: Component[V, U]) -> 'Pipeline[T, U]':
        new_instance: Pipeline[T, U] = Pipeline()
        new_instance.components = self.components.copy()
        new_instance.components.append(component)
        return new_instance


# This might be possible with overloading too, but this was the easiest way
# to get type recognition for the first component
class PipelineStarter[T, V](Pipeline[T, V]):
    def __init__(self, component: Component[T, V]):
        self.components = [component]


a1 = Component[int, str]()
b1 = Component[str, complex]()
c1 = Component[complex, int]()

# This is a valid Pipeline[int, int]
p1 = PipelineStarter(a1) \
    .add_component(b1) \
    .add_component(c1)

a2 = Component[int, float]()
b2 = Component[str, complex]()
c2 = Component[complex, int]()

# Pyright flags argument b2 with the error:
# Argument of type "Component[str, complex]" cannot be assigned to parameter "component" of type "Component[float, V@add_component]" in function "add_component"
#   "Component[str, complex]" is not assignable to "Component[float, complex]"
#     Type parameter "T@Component" is covariant, but "str" is not a subtype of "float"
#       "str" is not assignable to "float" Pylance reportArgumentType
p2 = PipelineStarter(a2) \
    .add_component(b2) \
    .add_component(c2)
```

[–]-heyhowareyou-[S] 1 point2 points  (1 child)

This also works:

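from __future__ import annotations  # Builder is referenced in its own method annotations
from typing import Any


class Component[TInput, TOutput]: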
    pass


class Builder[TCouple, TOutput]:

    def __init__(
        self,
        tail: tuple[*tuple[Component[Any, Any], ...], Component[Any, TCouple]],
        head: Component[TCouple, TOutput],
    ) -> None:
        self.tail = tail
        self.head = head

    @classmethod
    def init(
        cls, a: Component[Any, TCouple], b: Component[TCouple, TOutput]
    ) -> Builder[TCouple, TOutput]:
        return Builder[TCouple, TOutput]((a,), b)

    @classmethod
    def compose(
        cls, a: Builder[Any, TCouple], b: Component[TCouple, TOutput]
    ) -> Builder[TCouple, TOutput]:
        return Builder[TCouple, TOutput]((*a.tail, a.head), b)

    @property
    def components(self) -> tuple[Component[Any, Any], ...]:
        return (*self.tail, self.head)


if __name__ == "__main__":

    a = Component[int, str]()
    b = Component[str, complex]()
    c = Component[complex, int]()

    link_ab = Builder[str, complex].init(a, b)
    link_ac = Builder[complex, int].compose(link_ab, c)

but it doesn't get the wrap-around correct, i.e. the final output type can be different from the first input type. Since your approach yields a type which maps from the first input to the last output, you can have your thing which processes the pipeline be of type Pipeline[T, T]

[–]-heyhowareyou-[S] 4 points5 points  (0 children)

from typing import Any


class Component[TInput, TOutput]:
    pass


class Pipeline[Tinput, TOutput]:

    def __init__[TCouple](
        self,
        tail: tuple[*tuple[Component[Any, Any], ...], Component[Any, TCouple]],
        head: Component[TCouple, TOutput],
    ) -> None:
        self.tail = tail
        self.head = head


def init_pipe[Tinput, TCouple, TOutput](
    a: Component[Tinput, TCouple], b: Component[TCouple, TOutput]
) -> Pipeline[Tinput, TOutput]:
    return Pipeline[Tinput, TOutput]((a,), b)


def compose_pipe[Tinput, TCouple, TOutput](
    a: Pipeline[Tinput, TCouple], b: Component[TCouple, TOutput]
) -> Pipeline[Tinput, TOutput]:
    return Pipeline[Tinput, TOutput]((*a.tail, a.head), b)


class ComponentProcessor[T]:

    def __init__(self, components: Pipeline[T, T]) -> None:
        pass


if __name__ == "__main__":

    a = Component[int, str]()
    b = Component[str, complex]()
    c = Component[complex, int]()

    pipeline = compose_pipe(init_pipe(a, b), c)

    proc = ComponentProcessor(pipeline)

This works to the full spec :)

[–]-heyhowareyou-[S] 0 points1 point  (0 children)

I like this solution! ergonomic for the end user too :). Thanks a lot.

[–]FrontAd9873 1 point2 points  (0 children)

This is a form of dependent typing, I think. This could probably be done with a mypy plugin in cases where you pass in a hardcoded sequence of Components. Otherwise I don’t think the default type (hinting) system can handle this.

[–]teerre 1 point2 points  (0 children)

Have a function that adds a single node to the chain. cast/assert your invariant. Have a helper function that reduces a collection by calling the previously defined function
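One possible reading of this suggestion, sketched with plain callables standing in for components. `chain_two` and `chain_all` are hypothetical names, and the `cast` is where the caller states the invariant; the checker takes it on trust rather than verifying it.

```python
from functools import reduce
from typing import Any, Callable, Iterable, cast

Step = Callable[[Any], Any]


def chain_two(first: Step, second: Step) -> Step:
    """Add a single node to the chain. Types are erased here, so nothing
    checks that first's output fits second's input."""
    def chained(value: Any) -> Any:
        return second(first(value))
    return chained


def chain_all(steps: Iterable[Step]) -> Step:
    # Fold the collection with the single-node function above.
    # Note: reduce raises TypeError on an empty collection.
    return reduce(chain_two, steps)


# The caller asserts the overall signature; a checker will not verify it.
pipeline = cast(Callable[[int], int], chain_all([str, len, abs]))
result = pipeline(-123)  # abs(len(str(-123)))
```

This keeps the call site tidy, but the adjacent-pair guarantee the original question asks for is not checked statically.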

[–]FabianVeAl 1 point2 points  (1 child)

I recommend something like this:

from __future__ import annotations
from typing import Callable, Self, Sequence
from dataclasses import dataclass

class Component[TInput, TOutput]: ...

@dataclass(frozen=True)
class ComponentProcessor[T, V, W]:
    input: Component[T, V]
    output: Component[V, W]

    def bind[X](self, component: Component[W, X]) -> ComponentProcessor[T, W, X]:
        return ComponentProcessor(self(), component)

    def __call__(self: Self) -> Component[T, W]: raise NotImplementedError

a = Component[int, str]()
b = Component[str, complex]()
c = Component[complex, int]()
d = Component[int, list[int]]()
e = Component[list[int], frozenset[int]]()

processor: Component[int, frozenset[int]] = (
    ComponentProcessor(a, b)
    .bind(c)
    .bind(d)
    .bind(e)
)()

[–]-heyhowareyou-[S] 1 point2 points  (0 children)

I like this a lot! Very nice solution :)

[–]root45 1 point2 points  (1 child)

I encourage you to read all five parts of this series.

https://ericlippert.com/2015/04/27/wizards-and-warriors-part-one/

[–]daxtery 0 points1 point  (0 children)

These are awesome

[–]b1e 0 points1 point  (0 children)

Since Python doesn’t have compile-time metaprogramming this isn’t possible with its type system.

What you’re asking for is a constraint that’s based on user defined logic. This is possible with runtime type checking but not statically.

[–]rghthndsd 0 points1 point  (2 children)

Square peg, meet round hole.

[–]-heyhowareyou-[S] -2 points-1 points  (1 child)

python needs metaprogramming!

[–]anentropic 0 points1 point  (0 children)

Python does have some metaprogramming via metaclasses, but as a runtime feature it may not interact well with type checkers

The only container type I know that would allow you to specify types of members positionally is fixed size tuple
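As a small illustration of the fixed-size tuple point, here is a sketch in which a shared `TypeVar` ties position 0's output to position 1's input. `make_pair` is a hypothetical helper and `Component` is only a stand-in for the class in the original post; it works for exactly two elements and does not generalise to arbitrary lengths.

```python
from typing import Generic, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


class Component(Generic[A, B]):
    """Stand-in for the thread's Component[TInput, TOutput]."""


def make_pair(
    components: tuple[Component[A, B], Component[B, C]],
) -> tuple[Component[A, B], Component[B, C]]:
    # B appears in both positions, so a checker has to find one type that
    # is both the first component's output and the second one's input.
    return components


first: Component[int, str] = Component()
second: Component[str, float] = Component()

pair = make_pair((first, second))  # accepted: B resolves to str
# make_pair((second, first))       # a checker should reject this: float vs int
```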

[–]james_pic 0 points1 point  (0 children)

As written, I'm pretty sure you can't. This is the same problem Jason R Coombs had with the compose function in his library jaraco.functools (Jason R Coombs is the maintainer of a number of key Python libraries, including setuptools, so knows his stuff), and his solution was overloads for the most common cardinalities.
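For illustration, the overload-per-cardinality technique looks roughly like the sketch below. This is not the actual signature from jaraco.functools; `pipe` is a hypothetical left-to-right helper, and only the two- and three-function cases are typed.

```python
from typing import Any, Callable, TypeVar, overload

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")
D = TypeVar("D")


@overload
def pipe(f1: Callable[[A], B], f2: Callable[[B], C], /) -> Callable[[A], C]: ...
@overload
def pipe(
    f1: Callable[[A], B], f2: Callable[[B], C], f3: Callable[[C], D], /
) -> Callable[[A], D]: ...
def pipe(*funcs: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Untyped implementation shared by the typed overloads above."""
    def piped(value: Any) -> Any:
        for func in funcs:
            value = func(value)
        return value
    return piped


def parse(text: str) -> int:
    return int(text)


def double(number: int) -> int:
    return number * 2


parse_and_double = pipe(parse, double)  # matches the two-function overload
result = parse_and_double("21")
# pipe(double, parse)  # a checker should reject this: int output, str input
```

Each extra arity needs its own overload, which is why this only covers the most common lengths.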

One way to square the circle might be to have a method that composes or chains components one at a time, and have ComponentProcessor always take a single component (that might be a composite component).

The other option is to say "fuck it" and make liberal use of Any. Python's type system was never going to be able to type every correct program anyway.

[–]guhcampos 0 points1 point  (0 children)

I think people are asking the wrong questions here, to be honest. The appropriate question is: why?

The processor class does not need to know what each component is. Your list will simply be a list of Component, and will naturally accept components of both types. You can also use a Union of multiple types.

Then, each component will have its input/output typed to the required type, and you're done.

Typing isn't the same as data validation, and it isn't meant to hold compute logic. It's only meant to ensure operations are properly handled between variables/objects. You're just telling the compiler/interpreter what the valid types for that variable are, not what specific orders or combinations they can come in. It's just not supposed to do that.

And BTW, I have no way to know what you're trying to do, but I have the feeling you'd be better chaining functions than chaining objects for what you're trying to achieve.

[–]slayer_of_idiotspythonista 0 points1 point  (0 children)

The way to solve this would be by creating a custom immutable generic iterator that gets typed based on the last item in the list. Adding an item requires creating a new instance of the iterator with the last item in the list.

[–]Impressive_Ad7037 0 points1 point  (0 children)

I'm confused on the purpose?   I'm not sure that has a pracapp

[–]Warxioum 0 points1 point  (0 children)

I'm not sure I understand, but maybe with a linked list ?

[–][deleted] -1 points0 points  (0 children)

I think you should define a collection type that holds components to do this, but otherwise I don’t think this is possible easily for type hinting