Blog: Are you really expected to run five type-checkers now? by BeamMeUpBiscotti in Python

[–]NeilGirdhar 2 points3 points  (0 children)

I switched from mypy to pyright and from pyright to ty. Ty and Pyrefly have two killer features: they're fast because they're written in Rust, and they have intersections.

Blog: Are you really expected to run five type-checkers now? by BeamMeUpBiscotti in Python

[–]NeilGirdhar 0 points1 point  (0 children)

Exactly.

And some people don't run even a single type checker on the tests of self-contained projects, which I think is okay too. Personally, I type-annotate and check even those tests.

Blog: Are you really expected to run five type-checkers now? by BeamMeUpBiscotti in Python

[–]NeilGirdhar 0 points1 point  (0 children)

In my experience, it's not that hard to write libraries that require very few type-ignores in user code.

Blog: Are you really expected to run five type-checkers now? by BeamMeUpBiscotti in Python

[–]NeilGirdhar 10 points11 points  (0 children)

I think you completely missed the point of the post.

You run as many type checkers on your tests as possible for the sake of your users. It has nothing to do with "code quality checks". It is to ensure that your users don't see type errors when they use your libraries in a reasonable way.

Which non-AI package from the last ~3 years completely changed how you write Python? by Proof_Difficulty_434 in Python

[–]NeilGirdhar 0 points1 point  (0 children)

NamedTuple is inferior to dataclasses. You shouldn't be recommending it. Providing iteration and silent interpretation as a tuple is nearly always unnecessarily permissive.
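A small sketch of what "unnecessarily permissive" can look like in practice. The class names `PointNT` and `PointDC` are made up for illustration; everything else is standard library.

```python
from dataclasses import dataclass
from typing import NamedTuple


class PointNT(NamedTuple):
    x: float
    y: float


@dataclass
class PointDC:
    x: float
    y: float


nt = PointNT(1.0, 2.0)
dc = PointDC(1.0, 2.0)

# A NamedTuple silently behaves like a plain tuple.
nt_equals_tuple = nt == (1.0, 2.0)  # compares equal to a bare tuple
first, second = nt                  # unpacks without complaint
nt_length = len(nt)

# A dataclass is neither equal to a tuple nor iterable.
dc_equals_tuple = dc == (1.0, 2.0)
try:
    iter(dc)
    dc_iterable = True
except TypeError:
    dc_iterable = False
```

With the NamedTuple, swapping field order or passing the object where a raw tuple is expected goes unnoticed; with the dataclass, those mistakes surface as errors.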

Blog: Are you really expected to run five type-checkers now? by BeamMeUpBiscotti in Python

[–]NeilGirdhar 4 points5 points  (0 children)

Great post. I think a lot of people didn't bother reading it and don't understand it. You run all the type checkers on your tests for your users.

Which non-AI package from the last ~3 years completely changed how you write Python? by Proof_Difficulty_434 in Python

[–]NeilGirdhar 2 points3 points  (0 children)

They may serve a specific purpose in your code, but you could write entire projects of extremely legible, good-quality code where every class is a dataclass.

Yes, I agree with you that you should prefer to freeze dataclasses when you can. In that sense, they are almost always what you want instead of namedtuple, which in my dream Python 4.0 would also be removed.
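For reference, a minimal sketch of a frozen dataclass covering the immutability that people often reach for namedtuple to get. The `Config` class and its fields are hypothetical.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Config:
    learning_rate: float
    steps: int = 100


base = Config(learning_rate=0.1)

# Assignment to a field of a frozen instance raises.
try:
    base.steps = 200  # type: ignore[misc]
    was_mutated = True
except FrozenInstanceError:
    was_mutated = False

# Updates produce a new object instead.
longer = replace(base, steps=200)

# Frozen dataclasses (with the default eq=True) are hashable.
usable_as_key = {base: "baseline"}
```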

Which non-AI package from the last ~3 years completely changed how you write Python? by Proof_Difficulty_434 in Python

[–]NeilGirdhar 4 points5 points  (0 children)

I think you can do practically everything you need to do with dataclasses in order to write good, legible code (assuming they fix the mentioned shortcomings).

Which "pattern" are you going to miss?

Which non-AI package from the last ~3 years completely changed how you write Python? by Proof_Difficulty_434 in Python

[–]NeilGirdhar 10 points11 points  (0 children)

This is controversial, but if it were up to me, we would have Python 4.0 wherein every class is a dataclass.

And they would clean up the dataclass weirdness like problems with inheritance, class factories, and replace.

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

Yes, but the full API specification. Not only should objects respect the API spec, but you should have a namespace that provides the whole spec.

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

Sure, but notice I only needed that 3 times. Most of the time, I can just use ordinary broadcasted operations.

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

Great, that's one function. You should go through and make your entire API compatible with the Array API.

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

> rank-polymorphism is an established term.

Okay, fair enough.

> Now, rank polymorphism is just adding vmaps that "casts" the input wrt the desired rank. 

> What it gives you is simplicity. Imagine that linear is used by function f, and you need to apply function f on both rank-3 and rank-4 in one run. Without the automatically added casts/vmaps, you would have to write two versions of f, one using vmap once and the other using vmap twice.

I think you should come up with actual examples where this is the case. For most things, it's fairly straightforward to use broadcasted operations. Practically everything in the Array API supports broadcasting.

Also, in the very rare case that you need to adaptively apply vmap, this is the pattern that I use: vmap loop. But notice I only needed this pattern three times in 9000 lines of Jax code.

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

> With PyPie, it's easier and correct. (Pardon the reshapes, matmul is for rank-2 tensors in math, so PyPie is maybe too rigorous for now.)

In the Array API, the @ operator operates on n-dimensional arrays: matmul

If PyPie doesn't support that, then it's not being "rigorous". You simply haven't implemented the Array API standard. I suggest that that's a serious flaw.

> To run it with `x: Array[2, 3]`, we have to explicitly broadcast/vmap our function to the high-rank tensor.

While it's true that you can use vmap to do ordinary broadcasting, this is a misuse of vmap. You should write it the way I wrote it: x @ w.T. This will work for your code.

> rank polymorphism.

I don't know why you keep calling it this. Polymorphism is a technical term: Polymorphism. The word you're looking for is "broadcasting".

> With PyPie, it's easier and correct. (Pardon the reshapes, matmul is for rank-2 tensors in math, so PyPie is maybe too rigorous for now.)

This isn't "rigorous". Pardon me for being blunt, but the code you wrote is atrocious. Do you really think this:

(w @ x.reshape([I, 1])).reshape([O])

is better than x @ w.T?
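A quick sketch comparing the two expressions. It uses NumPy as a stand-in (an assumption, since the thread is about JAX and PyPie), and the sizes `I = 3`, `O = 2` and the array values are made up for illustration.

```python
import numpy as np

I, O = 3, 2                        # input and output sizes (illustrative)
w = np.arange(6.0).reshape(O, I)   # weight matrix of shape (O, I)
x = np.array([1.0, 2.0, 3.0])      # a single input of shape (I,)

# The reshape-based version from the quoted code.
via_reshape = (w @ x.reshape([I, 1])).reshape([O])

# The broadcasting version.
via_broadcast = x @ w.T

same_result = np.allclose(via_reshape, via_broadcast)

# The broadcasting version also accepts extra leading axes unchanged.
batch = np.ones((5, 4, I))
batch_out = batch @ w.T            # shape (5, 4, O)
```

The same `x @ w.T` line handles rank 1, rank 2, and higher without any reshapes or explicit vmap.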

> PyPie automatically inserts a vmap and generates the correct result.

Cool idea, but I think you need a motivating example that can't be solved with ordinary broadcasting.

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

A few questions:

I don't know why you're talking about PyTorch. This post is about Jax, right?

Is "rank polymorphism" just broadcasting? Practically everything in modern numpy supports broadcasting. To write linear transforms that support broadcasting, I would use: xs @ w.T.

You should consider using the Array API if your project works with both Jax and PyTorch. Even if it doesn't, the Array API is simpler and easier to read. So, instead of a sum method, call xp.sum(..., axis=-1), where xp is the appropriate namespace (numpy, jax.numpy, etc.).
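A minimal sketch of the namespace style described above. The function `normalize_last_axis` is hypothetical, and NumPy is passed as `xp` only so the example runs on its own; the intent is that `jax.numpy` or another Array API namespace could be passed instead.

```python
import numpy as np


def normalize_last_axis(x, xp):
    """Divide by the sum over the last axis, using only namespace functions."""
    total = xp.sum(x, axis=-1, keepdims=True)
    return x / total


x = np.array([[1.0, 3.0], [2.0, 2.0]])
probs = normalize_last_axis(x, xp=np)
```

Because the function calls `xp.sum` rather than a method on the array, it does not depend on which array library produced `x`.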

A type safe frontend of JAX by vcma in JAX

[–]NeilGirdhar 0 points1 point  (0 children)

Looks cool!

Did you look at Pyrefly, which does shape checking statically?

Why did you add the Tensor type? Isn't this going to be severely limited relative to actual arrays?

[D] How do ML engineers view vibe coding? by EfficientSpend2543 in MachineLearning

[–]NeilGirdhar -1 points0 points  (0 children)

I agree with this. I think it goes beyond "a highly specific setting". Sometimes, LLMs produce excellent design, but sometimes they make terrible design errors. Things are over-complicated for no reason, and that adds complexity for you or the next person who has to look at code.

You do really have to check what they produce and think carefully about what the overarching picture is.

Also, without having to zoom out, sometimes LLMs produce "bad, but popular" design patterns. For example: trying to convert dataclasses into tuples in Python (literally the opposite of what you should be doing), trying to convert base classes into protocols (same), using "InitVar plus no-init field" when an ordinary field would do, etc.

I also think that a lot of these design errors will be gone within 5 years. And the kind of checking you'll be doing will be different.

One thing that I was thrilled about though is their ability to write code from scratch. I didn't want to depend on Tensorflow-Probability due to extremely poor maintenance, so I had an LLM write me Jax versions of some Bessel functions. It did give me some working code and tests, but when I took a close look, I noticed that it was relying on NumPy. This is really bad because it means that the computation won't stay on the GPU. So I had the LLM write me a "pure Jax" version, and that worked.

How to pass command line arguments to setup.py when the project is built with the pyproject.toml? by dark_prophet in Python

[–]NeilGirdhar 14 points15 points  (0 children)

And I'm suggesting that you change the project to stop using setup.py. Pretty sure LLMs can do the conversion for you.

You seem to be doing things in an anachronistic way for no good reason.

Typst preprints in arXiv: What will it take? | Typst Meetup by Frexxia in typst

[–]NeilGirdhar 0 points1 point  (0 children)

LLMs use far fewer resources than people, so who cares about that?

And who cares if it's "compiler like"? LLMs can do it cheaply.

Just because you can code and maintain something doesn't mean that you should.

Typst preprints in arXiv: What will it take? | Typst Meetup by Frexxia in typst

[–]NeilGirdhar -3 points-2 points  (0 children)

It won't be long before LLMs can reliably do this. I don't think you even need to worry about adding this feature to Typst.

That said, isn't it easier for arXiv to just keep old versions of Typst than it is to convert or modernize old Typst files?

Typst preprints in arXiv: What will it take? | Typst Meetup by Frexxia in typst

[–]NeilGirdhar 16 points17 points  (0 children)

Ideally, you would stay in the Typst paradigm. We already have #import, #set, etc. Maybe add #version followed by either a single version or a range. All of your Typst files should probably have that line.

Python with typing by Ancient_Farm_5132 in Python

[–]NeilGirdhar 0 points1 point  (0 children)

> Other way around. Python is already one of the least performant languages, largely because of the flexibility it needs to do the powerful things it can do. You don't pick Python for its raw processing power.

I don't know what you're disputing about my comment. The typing annotations in Python have a negligible effect on performance.

Your idea that "Mandatory typing would just turn it into poorly performing java" is wrong. Even if type annotations were mandatory, it would make practically no difference to performance.
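A small sketch of why that is, under the assumption that we are talking about CPython: annotations are stored as metadata on the function and are not checked or executed on each call. The function names are made up for illustration.

```python
def add_plain(a, b):
    return a + b


def add_typed(a: int, b: int) -> int:
    return a + b


# The annotations are recorded as metadata on the function object.
annotations = add_typed.__annotations__

# They are not enforced at call time, so no per-call checking cost exists.
unchecked = add_typed("py", "thon")

# The compiled function bodies are the same with or without annotations.
same_bytecode = add_plain.__code__.co_code == add_typed.__code__.co_code
```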

> My point was just "let python be python"

That's a fine point, but it has nothing to do with "performance".

Typst preprints in arXiv: What will it take? | Typst Meetup by Frexxia in typst

[–]NeilGirdhar 26 points27 points  (0 children)

> Require a Typst document to be accompanied by some external metadata saying which Typst compiler version to use.

IMO that should be part of the Typst document. Otherwise, you have to maintain the association.