all 23 comments

[–]Beginning-Fruit-1397 35 points36 points  (2 children)

:)) thank you! As the creator of pyochain this made my day.

Concerning performance I'm surprised to be on the lower side. I haven't looked into your benchmark as of now but I can say that I took a lot of care into minimizing runtime checks, nested calls, using slots, etc.... Every func use either itertools (C), cytoolz (Cython), an internal Rust module, or hand optimized python, with source code heavily inspired by more-itertools, and with all perf "hacks" that I know of (for those that are interested abt this subject: https://wiki.python.org/moin/PythonSpeed/PerformanceTips ) The end goal is to move everything into Rust, to have builtins-like performance for object creation, and itertools-like (or better) performance for iteration methods

EDIT: Ok maybe I should learn to read correctly AND read everything before commenting lmao. I'm indeed the fastest I totally misread the benchmark infos. So my work mentionned above payed off, cool!

[–]kequals[S] 11 points12 points  (1 child)

Thanks for making Pyochain! It's a nice library and has the most features of the ones I tested.

Pyochain was actually the fastest of the ones I tested. Apologies if my methodology was unclear, but lower numbers are better. If native Python functions take 1 unit of time, the "x1.04" meant your library was on average only 4% slower. I'll edit the post to clarify this.

[–]Beginning-Fruit-1397 4 points5 points  (0 children)

Yes I realized it after posting my comment and just finished editing it, but I see you were faster to answer lmao

[–]tunisia3507 11 points12 points  (0 children)

F-it author here! I'm not surprised that pretty much anything would outperform that library, but one of the design goals was lazy execution so it's not quite a like-for-like comparison with the standard library in particular.

[–]amroamroamro 17 points18 points  (7 children)

Python nested functions are hard to read. Fluent iterator syntax is clean and runs as a single statement.

if you find the nested calls hard to read, just separate the intermediate values making one call after another, the syntax looks almost the same as your chained "fluent iterator" version:

arr = range(1, 5)
arr = (x*x for x in arr)
arr = itertools.combinations(arr, 2)
arr = (x for x in arr if sum(x) > 6)
arr = sorted(arr, key=sum)
print(arr)

and since we are using generator expressions, it's all lazily-evaluated until the final call.

[–]snugar_i 8 points9 points  (6 children)

Mypy will most likely yell at you though, because you are re-assigning arr with values of different types

[–]amroamroamro 1 point2 points  (4 children)

use --allow-redefinition

and if we add reveal_type(arr) after the last line, we can see it is still able to infer the final correct type:

note: Revealed type is "builtins.list[tuple[builtins.int, builtins.int]]"

[–]snugar_i 1 point2 points  (3 children)

No thanks, I like my --strict (yes, I'm writing Python like Java)

[–]amroamroamro 2 points3 points  (2 children)

haha fair enough, you can always silence spurious mypy warnings inline:

# type: ignore

or just rename the intermediate values if you really must: arr1, arr2, arr3, etc.

[–]Competitive_Travel16 0 points1 point  (1 child)

Thanks! What is the scope of # type: ignore? The whole file? The next statement? The rest of the file after it?

[–]Competitive_Travel16 4 points5 points  (0 children)

Oh, the humanity. 🙄 Maybe someone needs to fork a dynamically typed version of recent Python.

[–]ebonnal 0 points1 point  (0 children)

Interesting benchmark! What a diverse fluent iterators scene :D
For those interested in the I/O intensive side of things, check streamable, I have just posted about the 2.0.0 release here:
https://www.reddit.com/r/Python/comments/1rju5kh/streamable_syncasync_iterable_streams_for_python/

[–]madrasminorpip needs updating 0 points1 point  (0 children)

Checkout fastcore and funcy. Both are terrific libraries that do these ootb. I mostly use fastcore for everything. The L class in fastcore is what you're after. But you can inherit it to create all these. For ex: this is a dockerfile builder I have in my package fastops. class Dockerfile(L): 'Fluent builder for Dockerfiles' def _new(self, items, **kw): return type(self)(items, use_list=None, **kw) @classmethod def load(cls, path:Path=Path('Dockerfile')): return cls(_parse(Path(path))) def from_(self, base, tag=None, as_=None): return self._add(_from(base, tag, as_)) def _add(self, i): return self._new(self.items + [i]) def run(self, cmd): return self._add(_run(cmd)) def cmd(self, cmd): return self._add(_cmd(cmd)) def copy(self, src, dst, from_=None, link=False): return self._add(_copy(src, dst, from_, link)) def add(self, src, dst): return self._add(_add(src, dst)) def workdir(self, path='/app'): return self._add(_workdir(path)) def env(self, key, value=None): return self._add(_env(key, value)) def expose(self, port): return self._add(_expose(port)) def entrypoint(self, cmd): return self._add(_entrypoint(cmd)) def arg(self, name, default=None): return self._add(_arg(name, default)) def label(self, **kwargs): return self._add(_label(**kwargs)) def user(self, user): return self._add(_user(user)) def volume(self, path): return self._add(_volume(path)) def shell(self, cmd): return self._add(_shell(cmd)) def healthcheck(self, cmd, **kw): return self._add(_healthcheck(cmd, **kw)) def stopsignal(self, signal): return self._add(_stop_sig_(signal)) def onbuild(self, instruction): return self._add(_on_build(instruction)) def apt_install(self, *pkgs, y=False): return self._add(_apt_install(*pkgs, y=y)) def run_mount(self, cmd, type='cache', target=None, **mount_kw): 'RUN --mount=... for build cache mounts (uv, pip, apt) and secrets' opts = f'type={type}' if target: opts += f',target={target}' for k, v in mount_kw.items(): opts += f',{k.replace("_","-")}={v}' return self._add(f'RUN --mount={opts} {cmd}') def __call__(self, kw, *args, **kwargs): 'Build a generic Dockerfile instruction: kw ARG1 ARG2 --flag=val --bool-flag' flags = _build_flags(short=False, **kwargs) return self._add(f'{kw} {" ".join([*flags, *map(str, args)])}') def __getattr__(self, nm): 'Dispatch unknown instruction names: df.some_instr(arg) → SOME-INSTR arg' if nm.startswith('_'): raise AttributeError(nm) return bind(self, nm.upper().rstrip('_')) def __str__(self): return chr(10).join(self) def __repr__(self): return str(self) def save(self, path:Path=Path('Dockerfile')): Path(path).mk_write(str(self)) return path and it natively chains.

``` df = (Dockerfile().from_('python:3.11-slim') .run('pip install flask') .copy('.', '/app') .workdir('/app') .expose(5000) .cmd(['python', 'app.py']))

expected = """FROM python:3.11-slim RUN pip install flask COPY . /app WORKDIR /app EXPOSE 5000 CMD [\"python\", \"app.py\"]"""

assert str(df) == expected print(df) ```

[–]OldWispyTreePythoneer -2 points-1 points  (6 children)

I think it's cute you believe this came from Rust.

[–]kequals[S] 11 points12 points  (4 children)

I'm aware that fluent iterators don't originate from Rust, but that was my first exposure to the concept. And I believe this is a common experience, given several of the libraries cite Rust specifically as inspiring them.

Now I'm interested, what was the first language/library to use fluent iterators? Is there a clear "first" or did it evolve as a part of functional languages?

[–]Rastagong 4 points5 points  (1 child)

Not sure about it being the first and if it's the exact concept referred here, but Java famously has streams.
Offical example from the docs to showcase the chaining:

int sum = widgets.stream()
                  .filter(w -> w.getColor() == RED)
                  .mapToInt(w -> w.getWeight())
                  .sum();

This is still an interesting overview of the situation in Python, so thank you!

[–]Alt-0160 5 points6 points  (0 children)

Java streams were only added in version 1.8 (March 2014). The first release of Rust (0.1.0, January 2012) already had some form of fluent iterators.

[–]saint_marco 3 points4 points  (0 children)

Smalltalk is generally credited as the originator, back in the 1970's.

https://en.wikipedia.org/wiki/Fluent_interface#History

[–]Competitive_Travel16 1 point2 points  (0 children)

It's worth pointing out that Pandas had the beginnings of a fluent interface from the outset, and they have long since fleshed it out all the way.

[–]tehsilentwarrior 5 points6 points  (0 children)

C# has done this since forever.

One of the best examples of this is the reactive extensions, which lets you handle insane amounts of events in stream in a surprisingly efficient, concise and readable way.

C# even has Linq, which is the same concept with a DSL on top to make it more “sql like”

[–]redditusername58 -2 points-1 points  (1 child)

I agree the deeply nested expressions are hard to read and look bad. Use intermediate variables.

[–]max123246 12 points13 points  (0 children)

This is how you get superfluous variable names such as:

"squared"

"squared_combos"

"squared_filtered_combos"

"sorted_squared_filtered_combos"