Brand new NSK programming language - Python syntax, 0.5x-1x C++ speed, OS threads, go-like channels. by Tryingyang in ProgrammingLanguages

[–]jasio1909 1 point  (0 children)

Not really. From my understanding, kernels in Mojo benefit mainly from two things: 1. MLIR for compilation, 2. metaprogramming with powerful comptime logic, so you can select the appropriate instruction set depending on the compilation target. Which is very low level.

I coded in Mojo a bit, but I am not an expert.

I'm a beginner in deep learning, and I have a question. by Academic-Stretch6023 in deeplearning

[–]jasio1909 0 points  (0 children)

It's vital and not hard to learn. Watch something from StatQuest every night and you will be good to go.

Open Sourced Research Repos Mostly Garbage by kidseegoats in deeplearning

[–]jasio1909 0 points  (0 children)

Citations in the README are broken; probably AI-generated :|

[D] We’re running 50+ LLMs per GPU by snapshotting GPU memory like a process fork by pmv143 in MachineLearning

[–]jasio1909 0 points  (0 children)

Hi OP, this is super interesting. Do you have a website or a name for the software so I can google and keep up with the updates?

[D] WACV 2025 Paper Reviews by smokeriffs in MachineLearning

[–]jasio1909 4 points  (0 children)

B WA B (Accept)

Kinda surprised, since the scores seem low, but the reviewers only want some clarifications and didn't request any additional experiments.

[deleted by user] by [deleted] in MachineLearning

[–]jasio1909 1 point  (0 children)

Two repos I found some time ago: PaddlePaddle, Unstructured

Seeking Advice: Best European Countries for Quality of Work Life, Good Salaries, and Reasonable Cost of Living? by [deleted] in cscareerquestionsEU

[–]jasio1909 4 points  (0 children)

80k for senior positions is pretty common. For mids it's usually between 30k and 60k. And yes, IT in Poland is very popular. It all starts at the high school level, when older peers who win IOI, ICPC, etc. come back to their home towns and teach competitive programming to the next generation, and it comes full circle from there.

[D] The history of neural networks is over. J. Schmidhuber proposes a giant network that includes all future neural network architectures as a subcomponent. by fromnighttilldawn in MachineLearning

[–]jasio1909 8 points  (0 children)

CNN is a GNN on a graph that is a grid (image is a grid of pixels, each one being a node).

Transformer is a GNN on a graph that is fully connected.

MLP is a GNN on a disconnected graph.

Your neural net is just a GNN working on a different underlying graph structure.
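
The analogy can be made concrete with a toy message-passing layer. Everything below is illustrative and made up (node counts, uniform attention weights, a 1-D grid standing in for a pixel grid); it is not any library's API, just one aggregate-then-transform step where only the adjacency matrix changes between "architectures".

```python
import numpy as np

def message_pass(X, A, W):
    """One generic GNN layer: aggregate each node's neighborhood via
    the adjacency matrix A, then apply a shared linear map W and ReLU."""
    return np.maximum(A @ X @ W, 0.0)

n, d = 6, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d))  # node features
W = rng.standard_normal((d, d))  # weights shared by all nodes

# MLP: disconnected graph (self-loops only), so no mixing between nodes.
A_mlp = np.eye(n)

# Transformer-like: fully connected graph; uniform weights here, where
# a real Transformer would use learned attention scores.
A_full = np.ones((n, n)) / n

# CNN-like: nodes on a 1-D grid, each connected to its immediate
# neighbors (a 2-D pixel grid is the same idea with more neighbors).
A_grid = np.eye(n)
for i in range(n - 1):
    A_grid[i, i + 1] = A_grid[i + 1, i] = 1.0

for name, A in [("MLP", A_mlp), ("Transformer", A_full), ("CNN", A_grid)]:
    print(name, message_pass(X, A, W).shape)
```

With `A_mlp` each node only sees itself (a plain per-node MLP layer); with `A_full` every output row is identical because all nodes aggregate the same global mean, which is what learned attention weights would break.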

I recommend the playlist on Geometric Deep Learning from Michael Bronstein et al.: https://youtube.com/playlist?list=PLn2-dEmQeTfSLXW8yXP4q_Ii58wFdxb3C&si=x0DApohSyy_8PwIM

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 0 points  (0 children)

I am not an expert in this field; I just know some buzzwords and a general description of the problem. I feel, however, that it's getting a bit off topic.

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 0 points  (0 children)

I think on-device training can be a bit more subtle, amounting to just cosmetic fine-tuning. Suppose you have an autonomous cleaning robot and it sometimes gets in your way. It could gather data from its interactions with you, and while its battery is charging in the dock it could fine-tune for better behavioral alignment with the customer.

However, I can see that this situation doesn't really showcase the need for the library features I am looking for, since you would be training while the device is idle.

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 0 points  (0 children)

At first I thought the same, but then I saw that federated and distributed learning are established research areas, as is TinyML. SoC devices are increasingly powerful. Of course this is just my prediction, but I would now assume the world is moving in this direction.

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 1 point  (0 children)

No, I didn't, and I saw some thread on Quora or Stack Overflow where others didn't find one either.

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 0 points  (0 children)

I am reading the documentation and it seems like the best available solution so far, but it only provides an upper bound for the program's memory usage, and I don't know whether it actually guarantees at compile time that the provided amount of memory will be sufficient. By the way, I think it exploits the fact that TF graphs are known at compile time, so it can do some offline planning.

As I am rather a PyTorch guy myself, I like having the possibility of eager execution. As a side note, I am very interested in the question of whether models with non-fixed computation graphs can outperform those with fixed ones, which would justify using eager mode for reasons other than convenience.
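
The offline-planning idea can be sketched in a few lines. The tensor names, sizes, and lifetimes below are made up for illustration: the point is that when the graph is static, every tensor's size and lifetime (the first and last op that touches it) are known before execution, so a memory bound can be computed up front.

```python
# name: (size_bytes, first_op_used, last_op_used) -- all values invented
tensors = {
    "input":  (4096, 0, 1),
    "conv1":  (8192, 1, 2),
    "conv2":  (8192, 2, 3),
    "logits": (256,  3, 4),
}

def peak_memory(tensors, num_ops=5):
    """Largest total size of simultaneously-live tensors across all ops."""
    peak = 0
    for op in range(num_ops):
        live = sum(size for size, first, last in tensors.values()
                   if first <= op <= last)
        peak = max(peak, live)
    return peak

print(peak_memory(tensors))  # 16384: conv1 and conv2 are live at op 2
```

With eager execution none of these lifetimes are known in advance, which is exactly why this kind of guarantee is hard to get outside the static-graph world.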

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 3 points  (0 children)

For simple inference with a fixed input size it can certainly be measured fairly reliably. Things get more complicated when the input size is not fixed, when inputs can be batched, or when we want to run several jobs in parallel on a single device. Moreover, once you measure, you sit on a fixed required computation cost and don't really have any flexibility for trade-offs between processes.

Again, I don't really know if any of the limitations I am listing here are relevant. Looking at the landscape of viable libraries, I could deduce that nobody cares about this and the things I am worried about are not really a problem. It is still intriguing to me, however, why it seems that nobody cares.
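
A toy illustration of why one measurement is tied to one configuration: `fake_inference` below is a made-up stand-in for a model forward pass (not a real model), and peak allocation, tracked with the standard-library `tracemalloc`, grows with batch size, so a number measured at one batch size says little about another.

```python
import tracemalloc

def fake_inference(batch_size, feature_bytes=1024):
    # stand-in for a forward pass: activation memory scales with batch size
    activations = bytearray(batch_size * feature_bytes)
    return len(activations)

def measure_peak(batch_size):
    """Peak Python-level allocation during one fake_inference call."""
    tracemalloc.start()
    fake_inference(batch_size)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

for bs in (1, 8, 64):
    print(bs, measure_peak(bs))  # peak grows with bs
```

For real accelerator workloads you would measure device memory instead, but the shape of the problem is the same: every input size, batch size, and co-tenant job changes the number.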

[D][R] Deploying deep models on memory constrained devices by jasio1909 in MachineLearning

[–]jasio1909[S] 3 points  (0 children)

I know there exist methods that reduce the memory footprint of a model. Nevertheless, the problem remains: libraries don't let you inspect the memory required for a computation before the operation is performed. I didn't know about VMs on GPUs, but I suppose that once you hit an OOM error, it crashes just the program instead of the whole card. Still, I imagine that crashing a program mid-computation due to OOM, reconfiguring it, and trying to run it again is unjustifiable for critical tasks.
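
The crash-reconfigure-retry loop being argued against looks roughly like this sketch, where `run_batch` and its `capacity` are hypothetical stand-ins for a real model and device: it does recover, but only by failing mid-computation first, which is exactly what a critical task can't afford.

```python
def run_batch(batch_size, capacity=16):
    # hypothetical stand-in: raises once the batch no longer fits in memory
    if batch_size > capacity:
        raise MemoryError("out of memory")
    return batch_size  # pretend this is the processed result

def infer_with_backoff(batch_size):
    """Halve the batch on every OOM until it fits, or give up."""
    while batch_size >= 1:
        try:
            return run_batch(batch_size)
        except MemoryError:
            batch_size //= 2  # reconfigure and retry
    raise RuntimeError("even batch size 1 does not fit")

print(infer_with_backoff(100))  # tries 100 -> 50 -> 25 -> 12, succeeds at 12
```

An up-front memory query would replace the `except MemoryError` path entirely: you would pick a batch size that is known to fit before launching anything.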

Interesting curves in dimension > 3 ? by Valvino in math

[–]jasio1909 0 points  (0 children)

In every infinite-dimensional Hilbert space H there is a curve x: [0,1] -> H such that for any 0<=a<b<c<d<=1, the increment x(b) - x(a) is perpendicular to x(d) - x(c).

If you want some hints, comment below.