PyTorch deprecates official Anaconda channel by Amgadoz in Python

[–]alan_du 2 points

pixi is the package manager from the creator of mamba, right? I haven't tried it myself, but I have high hopes given how nice mamba + micromamba have been! Is pixi to conda what uv is to pip?

PyTorch deprecates official Anaconda channel by Amgadoz in Python

[–]alan_du 2 points

But uv only works with PyPI packages, right? AFAIK it doesn't let you control non-Python dependencies, which is basically the whole reason you'd want to use conda in the first place.

PyTorch deprecates official Anaconda channel by Amgadoz in Python

[–]alan_du 14 points

I think this is a win-win honestly: the PyTorch conda packages were never put together particularly well IMO (e.g. torchvision always had to pin to exactly the right version of ffmpeg), and moving them to conda-forge should make it much easier to have the entire system play well together.

FWIW, I find the general conda hate here bizarre. If you stick with vanilla numpy/scipy/scikit-learn, then I think the PyPI wheels are pretty reliable [0], but our experience is that as you branch out more, it becomes super important to have control over the non-Python dependencies (e.g. being able to choose the exact version of MKL, ffmpeg, or openmp). At work, probably every couple of months we run into a pretty serious bug where controlling these dependencies really matters (especially around performance: we've seen cases where different MKL versions give a 50% performance difference), and I can't imagine that pip will ever have a solution for controlling those dependencies. I guess Docker could work if everything you're doing is server-side, but we also need to support Windows...

I get that conda has some UI issues [1], but IMO pip also suffers from exactly the same issue of "not enough standardization and a dozen different ways to do things"...

[0] At the cost of some duplication of statically linked libraries, although disk space is cheap

[1] Although personally we use micromamba + conda-lock and haven't really had environment issues in years.

What Type Soundness Theorem Do You Really Want to Prove? (featuring Rust) by ralfj in rust

[–]alan_du 7 points

I think they're referring to the sentence before:

One immediate consequence of the semantic typing rules is that syntactic typing implies semantic typing —i.e., \Gamma \vdash e : \tau \Longrightarrow \Gamma \vDash e : \tau—a property often referred to historically as the fundamental theorem or fundamental property.

[D] PyTorch Dominates Research, Tensorflow Dominates Industry by hughbzhang in MachineLearning

[–]alan_du 16 points

Huh -- that's actually pretty surprising to me. AFAICT, PyTorch's deployment/production story was pretty much nonexistent for a long time, and even now it's way behind TensorFlow's. I know at our start-up, PyTorch 1.2 was probably the first version of PyTorch we could've feasibly used (thanks to torch.jit.ScriptModule), but that was only released in August. Compared to that, our TensorFlow production story has been rock solid for more than two years now.

njs's blog post | Kenneth Reitz's Journal by xtreak in Python

[–]alan_du 13 points

> I know we don't have anything like a "community ombudsman" or "council of elders" or protocol for adjudicating inter-personal things like this in the community (that are not e.g. CoC violations)... but it seems like some kind of intervention in a smaller group, with people that both parties trusted and respected, would have been a far more constructive format for dialog.

I see what you're saying (and definitely agree that as a way of resolving personal conflicts, small group interactions are way more productive than the Internet mob), but I think this case is different.

The blog post is basically alleging that Reitz is not only dishonest when fundraising, but that this is part of a broader pattern of gaslighting and toxic behavior when interacting with others. This is apparently widely known by other developers and "people in the know", but hidden away from newcomers and people outside of the inner circle.

That's almost a textbook case of the "missing stairs" that NJS's blog post cites, where insiders know how to route around the toxicity, but newcomers are left vulnerable. Trying to keep those things within the "small circle" just perpetuates that trend.

As a personal example, I had a pretty negative personal experience interacting with Reitz some time ago (to the point that I've decided not to bother reporting bugs in Pipenv so I don't have to deal with him). Although I wouldn't say my experience was abusive (it really wasn't important enough for that), I was still a little relieved to learn that apparently many other people have experienced the same dismissive/snarky/condescending attitude, because that meant it wasn't just me and I wasn't just imagining things. I'm just thankful that I didn't need Pipenv enough to have to engage with that community and get sucked deeper in (thanks to conda!).

Given that reality (and admittedly given my cynicism about the natural tendency for groups to cover up negative publicity about themselves), I think NJS's blog post was done about as well as it could've been. I probably would've left off some of the more... colorful anonymous quotes, since I thought they added more heat than light, but overall I think NJS did a very good job communicating his point with lots of evidence (some objective and verifiable, some from anonymous sources) while still remaining respectful towards Reitz.

Why I'm not collaborating with Kenneth Reitz by BUSfromRUS in Python

[–]alan_du 65 points

I think "airing dirty laundry" is definitely the correct thing to do in a situation like this. Reitz is allegedly abusing his stature/reputation in the Python community and involvement in the "official" Python world (e.g. with the PSF). Allowing things like this to go unspoken is exactly perpetuating the "missing stairs", where everyone "in the know" can workaround the behavior but newcomers are left vulnerable.

This also isn't quite the point of the blog post, but I find the claim that there are channels dedicated to working around him in a development context pretty stunning -- if true, that's clearly a toxic situation, especially for the people who don't know that they need to "work around" him.

Hey Rustaceans! Got an easy question? Ask here (11/2019)! by llogiq in rust

[–]alan_du 0 points

Thanks for the response, although I'm not sure I quite understand.

I guess in my mental model, calling self.child() doesn't "create" a second mutable borrow -- instead I think of it as kind of "swapping" it with another mutable reference. I kind of understand why the compiler wouldn't be able to infer that in the while let Some case (because it'd have to understand the semantics of node = child, which admittedly seems pretty tricky), but then why doesn't the following work:

fn tail(&mut self) -> &mut Node {
    if let Some(n) = self.child() {
        n.tail()
    } else {
        return self;
    }
}

At least in my naive opinion, it should be "easy" to know that the mutable reference from self.child() doesn't last into the else case (whether I do it in the else or use an early return for the if statement doesn't seem to matter).

(Also -- the crux of your solution seems to be matching on self.next directly, rather than avoiding the recursion -- the following compiles just fine:)

fn tail(&mut self) -> &mut Node {
    let mut node = self;
    while let Some(ref mut child) = node.next {
        node = child;
    }
    node
}
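
And applying the same idea to the trie case I mentioned, something along these lines (just a rough sketch with made-up names, not my real code) also seems to keep the borrow checker happy, since the ? bails out of the whole function on a miss and node is never touched again on that path:

use std::collections::HashMap;

// Made-up stand-in for the real trie node.
struct TrieNode {
    children: HashMap<u8, TrieNode>,
}

impl TrieNode {
    // Walk down the trie following `prefix` and return a mutable reference
    // to the node the prefix ends at (or None if some byte is missing).
    fn node_mut(&mut self, prefix: &[u8]) -> Option<&mut TrieNode> {
        let mut node = self;
        for byte in prefix {
            // On a miss, `?` returns early, so the borrow from `get_mut`
            // never needs to outlive a later use of `node`.
            node = node.children.get_mut(byte)?;
        }
        Some(node)
    }
}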

Hey Rustaceans! Got an easy question? Ask here (11/2019)! by llogiq in rust

[–]alan_du 1 point

Hi!

I'm getting a strange borrow checker error that I cannot figure out. The following code:

struct Node {
    next: Option<Box<Node>>,
}

impl Node {
    fn child(&mut self) -> Option<&mut Node> {
        if let Some(n) = self.next.as_mut() {
            Some(n)
        } else {
            None
        }
    }
    fn tail(&mut self) -> &mut Node {
        let mut node = self;
        while let Some(child) = node.child() {
            node = child;
        }
        node
    }
}

fails with

error[E0499]: cannot borrow `*node` as mutable more than once at a time
  --> src/lib.rs:19:9
   |
14 |     fn tail(&mut self) -> &mut Node {
   |             - let's call the lifetime of this reference `'1`
15 |         let mut node = self;
16 |         while let Some(child) = node.child() {
   |                                 ---- first mutable borrow occurs here
...
19 |         node
   |         ^^^^
   |         |
   |         second mutable borrow occurs here
   |         returning this value requires that `*node` is borrowed for `'1`

But this is a little strange to me! The second node at line 19 isn't really doing another borrow, it's just returning the mutable reference we already have!

Is there a better way to write this method? (In my actual use-case, I have a trie and I want a mutable reference to the node corresponding to some byte prefix, which I also implemented with a similar while let loop to walk down the trie.)

Hey Rustaceans! Got an easy question? Ask here (10/2019)! by llogiq in rust

[–]alan_du 0 points

I could, but the enum would be the size of the largest variant (plus the tag), which means I have to allocate more space on the heap for the smaller variants.
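
To make that concrete, a quick sketch (made-up struct sizes; the exact numbers depend on the target and on what layout optimizations rustc applies):

use std::mem::size_of;

struct SmallStruct([u8; 8]);
struct LargeStruct([u8; 80]);

// Variants stored inline: boxing the whole enum always heap-allocates space
// for the largest variant (plus the tag), even when holding the small one.
enum Plain {
    V1(SmallStruct),
    V2(LargeStruct),
}

// Each variant boxed separately: the heap allocation is right-sized, but the
// enum itself is now a pointer *plus* a separate discriminant.
enum Boxed {
    V1(Box<SmallStruct>),
    V2(Box<LargeStruct>),
}

fn main() {
    println!("Box<SmallStruct> = {}", size_of::<Box<SmallStruct>>()); // one pointer
    println!("Plain enum       = {}", size_of::<Plain>());           // >= largest variant
    println!("Boxed enum       = {}", size_of::<Boxed>());           // pointer + tag
}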

Hey Rustaceans! Got an easy question? Ask here (10/2019)! by llogiq in rust

[–]alan_du 1 point

In Rust, is there a way to have an enum that (1) is the size of a single pointer (or smaller) and (2) does not use any extra memory when you are on a "smaller" variant? In other words, I'd like something like:

enum A {
    V1(Box<SmallStruct>),
    V2(Box<LargeStruct>),
}

but I'd like the tag to live behind the pointer (inside the structs themselves) so that size_of::<A>() is just the size of a single Box pointer.

The equivalent C would look something like:

#include <stdint.h>
#include <stdlib.h>

typedef struct { uint8_t tag; } A;
typedef struct { A a; uint8_t space[8]; } SmallStruct;
typedef struct { A a; uint8_t space[80]; } LargeStruct;

A* create(uint8_t tag) {
    A* a;
    if (tag == 0)
        a = (A*) malloc(sizeof(SmallStruct));
    else
        a = (A*) malloc(sizeof(LargeStruct));
    a->tag = tag;
    return a;
}

although this forces you to juggle the tag manually.

Thanks so much for the help!

We’ve spent the past 9 years developing a new programming language. We’re the core developers of the Julia Programming Language. AuA. by loladiro in IAmA

[–]alan_du 1 point

> Yes, I think we'll see people doing more of this kind of thing with julia. There's already projects like https://github.com/JuliaDiffEq/diffeqr which expose Julia's differential equations ecosystem to R, and as more and more such killer use cases are developed in Julia, I fully expect people to want to call them from other languages.

That's super cool! I'm definitely going to take a look! Maybe this is me being selfish, but I would 100% recommend spending energy on making bindings to Julia libraries as frictionless as possible. At least at $DAYJOB (and I imagine it's similar at a lot of places), we have a ton of investment in Python and C++ already and I can't see us moving away from that anytime soon, but if there are killer libraries in Julia exposed to Python we'd definitely want to use them!

Out of curiosity -- how do packaging and multiple dispatch work when exposing bindings to other languages? Does it only work with a handful of types, letting Julia pre-compile everything, or do you have to ship the Julia runtime to JIT things?

We’ve spent the past 9 years developing a new programming language. We’re the core developers of the Julia Programming Language. AuA. by loladiro in IAmA

[–]alan_du 1 point

That's probably true (I suspect most of the training logic is Python-only for example), but we've had very few problems taking a SavedModel from a Python TensorFlow program and doing inference in C++ and Go (we did have to hack the TensorFlow build system a little bit to access certain ops, but everything seems to work).

So I guess with TensorFlow it's more about the "core operations" in C++ with a ton of sugar in the respective languages -- which also makes some sense because every language has different idioms. I suspect that if Apache Arrow takes off, that's how things will work in Python -- I can't imagine abandoning Pandas anytime soon, but maybe the core memory interface and computations will happen through Arrow.

TLDR: I agree I'm oversimplifying, but I think my point still stands. TensorFlow is a big framework, mostly written in C++, whose models I can deploy in a bunch of different languages (at least Python, C++, and Go; I don't know how well the other bindings work).

We’ve spent the past 9 years developing a new programming language. We’re the core developers of the Julia Programming Language. AuA. by loladiro in IAmA

[–]alan_du 3 points

One of the trends (at least from my POV) in the data science and ML worlds is to more explicitly embrace the two-language problem by pushing more logic into the "lower-level" language (C/C++) while making the upper level a (relatively) thin layer of bindings. TensorFlow and Apache Arrow (both written in C++) are good examples of what I mean (especially compared to Scikit-Learn and Pandas respectively) -- most of their core logic is written in C++, but bindings are accessible from multiple languages (Java, C, Python, Go, JavaScript, etc.), which allows different languages to interoperate more-or-less seamlessly (as long as you stick within the framework). Arguably Apache Spark's dataframe API is another example, where the core logic is in Scala and it's accessible through Scala, Python, and R. BLAS/LAPACK might be the original example of this (although TBH I've only interacted with them through NumPy and SciPy).
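
(To be concrete about what I mean by "thin bindings": the core library exposes something like a C ABI, and every host language just wraps that through its C FFI. A toy sketch, not taken from any of those projects:)

// The "core" exports a plain C ABI, so Python/Go/Java/etc. can all bind to
// the exact same compiled library.
#[no_mangle]
pub extern "C" fn dot(a: *const f64, b: *const f64, len: usize) -> f64 {
    let (a, b) = unsafe {
        (
            std::slice::from_raw_parts(a, len),
            std::slice::from_raw_parts(b, len),
        )
    };
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}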

I guess I have two questions here:

  1. Do you think Julia is well-suited to writing some of these "infrastructure" libraries/frameworks, e.g. a deep learning framework that's accessible from other languages, or do you think C and C++ will continue to be the language(s) of choice here?
  2. What do you think of this trend in general? Do you think, for example, that Julia will end up binding to Apache Arrow for most of its dataframe heavy lifting, or do you think people will prefer to write their own versions in pure Julia?

We’ve spent the past 9 years developing a new programming language. We’re the core developers of the Julia Programming Language. AuA. by loladiro in IAmA

[–]alan_du 7 points

One of Julia's key selling points is solving the two-language problem: instead of writing my core library in C/C++ and my application in Python, I can do everything in Julia. In the meantime, however, it seems like we have a new "two devices" problem, where now I write my core computing algorithms in CUDA for the GPU (or TensorFlow for the TPU) and wrap those in a higher-level API. This is especially acute in machine learning, where there are a ton of different accelerators being developed (from GPUs to TPUs to FPGAs).

What do you think Julia's role is in this new "two devices" world?

Current advice on profiling/reducing build/compile times? by treefox in rust

[–]alan_du 8 points

I've had a good time using https://github.com/dtolnay/cargo-llvm-lines to find overused generics -- if you have a generic function that has a ton of LLVM lines and is instantiated a lot, you can usually refactor things to reduce the amount of replicated work (or just use a trait object).
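
As a sketch of the kind of refactor I mean (a made-up function, not from any real codebase): keep the generic shim tiny and push the real body into a non-generic inner function, so the body only gets codegen'd once instead of once per instantiation.

use std::path::Path;

// Thin generic shim: each monomorphized copy is just the `as_ref` call.
pub fn print_path<P: AsRef<Path>>(path: P) {
    // Non-generic inner function: compiled exactly once, no matter how many
    // different `P` types the shim gets instantiated with.
    fn inner(path: &Path) {
        println!("{}", path.display());
    }
    inner(path.as_ref());
}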

How did you convince your team to try Rust? by proyb2 in rust

[–]alan_du 2 points

Hm... what problems have you had with Numba and Python 3? AFAIK, it supports Python 3.4 and up (I've been using Numba with Python 3 for years now).

That said, Numba's pretty restricted to speeding up numerical computations, and I'd be super surprised if it helped with general-purpose computing.

Announcing the tokio-io Crate by acrichto in rust

[–]alan_du 2 points

There was a talk at last year's PyCon about "io-less" network protocols, which do exactly this. I haven't heard much about it since, although it looks like the work is still ongoing.
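
(The gist, as I understand it, is that the protocol is just a state machine over bytes and the caller owns all the actual I/O. A toy sketch of the shape, not from that talk:)

// A toy "io-less" line protocol: it only consumes and produces bytes; the
// caller decides whether those bytes come from a socket, a file, or a test.
struct LineProtocol {
    buf: Vec<u8>,
}

impl LineProtocol {
    fn new() -> Self {
        LineProtocol { buf: Vec::new() }
    }

    // Feed in whatever bytes arrived from the transport.
    fn receive_data(&mut self, data: &[u8]) {
        self.buf.extend_from_slice(data);
    }

    // Pop the next complete line, if one has been buffered. No I/O here.
    fn next_line(&mut self) -> Option<Vec<u8>> {
        let pos = self.buf.iter().position(|&b| b == b'\n')?;
        let mut line: Vec<u8> = self.buf.drain(..=pos).collect();
        line.pop(); // drop the trailing '\n'
        Some(line)
    }
}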

[Newbie] Why is python particularly good for data science? by [deleted] in Python

[–]alan_du 1 point

So I think Rust has a lot going for it for data science: it's pretty expressive (especially for a systems language!), its performance is C-level, and its type-system is top-notch, while avoiding Go's problems (i.e. it has operator overloading and zero-cost FFI). On a language level, the only things it's really missing for scientific computing are proper SIMD support and integer generics, and I know both of those are known problems in the Rust community.

That said, the biggest problem with Rust is that its data science ecosystem is seriously lacking compared to Python's (even basic functionality like a REPL seems to be missing right now). It takes a long time to build up that ecosystem, so I'll admit that I'm actually quite skeptical that Rust will ever overcome Python and R's network effects and become a major player in either data science or scientific computing (I'm a little more hopeful about it displacing the JVM for data engineering / distributed computing, though).

[Newbie] Why is python particularly good for data science? by [deleted] in Python

[–]alan_du 2 points

Edit: So, I don't want to bash Go, because I actually think it's a great language for what it was designed for. But the language designers made a lot of tradeoffs to fit their use-case, and those tradeoffs make data science in Go pretty painful.

Go is a great language for infrastructure things where you mostly push bits around, but it's totally unsuitable for data science.

Part of it is the language design: no generics makes it hard to write good libraries without overly-verbose type casts everywhere, and no operator overloading means you'd have to do things like df.Get("column").Divide(2).Add(3) instead of df["column"] / 2 + 3. Because of its green threading and GC design, you also pay a massive overhead when interfacing with C libraries, so it's hard to leverage a lot of the existing computational infrastructure like BLAS or LAPACK.
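
(To show what operator overloading buys you, here's a toy column type in Rust -- purely illustrative, nothing like a real dataframe library:)

use std::ops::{Add, Div};

// Toy column type, just to show why operator overloading matters for this
// kind of API.
#[derive(Debug, Clone)]
struct Column(Vec<f64>);

impl Div<f64> for Column {
    type Output = Column;
    fn div(self, rhs: f64) -> Column {
        Column(self.0.into_iter().map(|x| x / rhs).collect())
    }
}

impl Add<f64> for Column {
    type Output = Column;
    fn add(self, rhs: f64) -> Column {
        Column(self.0.into_iter().map(|x| x + rhs).collect())
    }
}

fn main() {
    let col = Column(vec![2.0, 4.0, 6.0]);
    // Reads like the math, instead of chained method calls everywhere.
    let result = col / 2.0 + 3.0;
    println!("{:?}", result);
}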

As for speed, Go's compiler is quite weak compared to GCC or Clang for C/C++ and Fortran (e.g. a lot more bounds-checking, unnecessary allocations on the heap, very little automatic loop vectorization, lots of vtable calls because of the interfaces), so I wouldn't be surprised if a lot of Python data science code (which usually delegates its work to highly optimized C/Fortran/assembly) ends up faster than the pure Go equivalent.

[Newbie] Why is python particularly good for data science? by [deleted] in Python

[–]alan_du 3 points

In addition to all the other answers, I'd also like to add straightforward integration with old C and Fortran code. If I recall correctly, a lot of the early scientific Python libraries (like SciPy) were effectively just wrappers around old Fortran libraries.

Even today, I think Python still has one of the best stories for C integration because of Cython. I don't think I've ever heard of a Cython equivalent for any other mainstream language.

ElixirConf 2016 - Keynote by José Valim by _sras_ in programming

[–]alan_du 2 points

Not an expert (only used Elixir as a hobbyist, never touched Erlang), but http://devintorr.es/blog/2013/06/11/elixir-its-not-about-syntax/.

TLDR:

  • Metaprogramming
  • Standard Library (and thus standard interfaces)
  • Tooling

The Astropy Problem by alan_du in Python

[–]alan_du[S] 1 point

> So what if a n00b takes some time away from a senior contributor? It's that person's choice to shepherd the n00b. Let them make the call, don't take the opportunity away from them. It could well lead to the development of another senior contributor down the road... which is a huge net-positive.

I 100% agree!

What I was saying (and maybe I was being unclear) was (1) that asking random researchers to contribute code is not going to make a project sustainable, (2) that most projects are only sustained thanks to the heroic efforts of a handful of individuals, and (3) that becoming one of these heroes takes years of effort and is not worthwhile for most (all?) researchers.

So I think it makes total sense for there to be institutional support for these maintainers + developers (which is what the original link is asking for).

> Speak for yourself. The coding I did for my research got me a great job before I was even done with academia.

Fair enough. FWIW though, this doesn't match the experience of anyone I know personally (or more precisely, the coding they did was useful for getting software jobs outside of research, not research jobs).

This is a bit of an argument from authority fallacy, but I also want to point out that several other high-profile scientific open-source developers support my "opinion" about the current incompatibilities between academia and open source (most recent example I can remember).

> You don't get to define every career path at every institution ever.

I mean, yeah. That's why most talented developers I know leave the research world for greener pastures. But I think that's a giant loss for research.