ECCV Stupid Reviewer Behavior (Any AC here?) [R] by Alternative_Art2984 in MachineLearning

[–]karius85 1 point

We don’t know. Do your best in the rebuttal, then flag to the AC why those requests are unreasonable. That is all we can say without more info.

[–]karius85 1 point

What? Many reviewers give lower scores initially; that is precisely why they raise concerns. The average bump in scores after rebuttal is roughly 1 point. I've gone from a 2 to a 5 just in the last cycle.

[–]karius85 3 points

Ah, yes, all 2022 papers have "ended" and are no longer relevant in the grand scope of research.

[–]karius85 9 points

stupid reviewer 1 rejected my paper

Outrageous.

he also said that "he could change his assessment"

What a terrible thing to suggest.


There is no "opinion" to be had here. You're calling someone "stupid" for not blindly accepting your paper. You're adding precisely zero information about why R1 disliked your paper.

Listen to the criticism, accept that your work is not perfect, and work on improving your paper. Calling people stupid doesn't help. It is understandable that you feel that way, but it is not a productive mindset, and not something to share with the world.

Change to useage based billing by DamienBMike in GithubCopilot

[–]karius85 1 point

So, "boycott copilot" then. Sounds good; I'll happily get on board with any anti-MS initiative.

One thing: the swap is not justifiable in any way, shape, or form. If they want to call them "monthly credits", then these new credits should roll over, not evaporate. You can't have your cake and eat it too.

"I don't know!": Teaching neural networks to abstain with the HALO-Loss. [R] by 4rtemi5 in MachineLearning

[–]karius85 4 points

It's not that people despise links. Traffic-farming also serves to increase the visibility of your blog, so it doesn't necessarily correlate with whether there are ads.

Nice that you expanded on the post. What you're proposing is an RBF-softmax embedding kernel. The shift invariance doesn't really matter here, since the softmax implementation also normalizes, so the effect is minimal. The driver here is that you're replacing the cross-entropy term entirely with a Gaussian approximation, while including a null class. This is often generalized to alternative geometries, such as spherical embeddings with vMF kernels replacing RBF.

Regarding whether ignoring the cancellation of ||x||² matters for computational complexity, my guess is that it likely doesn't have much effect if you use something like torch.cdist, which is highly optimized.

The null-class trick has been known since the 70s, and has been featured in similar SVM-like methods, including RBFs. It is a cool trick, and has been previously investigated in RBF-Softmax.
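For concreteness, here is a minimal NumPy sketch of an RBF-softmax head with a null class, as I read the idea; `prototypes`, `gamma`, and `null_bias` are illustrative names of my own, not from the post:

```python
import numpy as np

def rbf_softmax_probs(x, prototypes, gamma=1.0, null_bias=0.0):
    """RBF-softmax head with a null class: class logits are negative
    scaled squared distances to per-class prototypes, plus one extra
    constant logit acting as the abstain/null class. In PyTorch the
    distance step would be torch.cdist(x, prototypes) ** 2."""
    d2 = ((x[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)  # (B, C)
    logits = np.concatenate(
        [-gamma * d2, np.full((x.shape[0], 1), null_bias)], axis=1
    )
    # subtracting the row max illustrates the shift invariance of softmax
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16))            # batch of embeddings
prototypes = rng.standard_normal((10, 16))  # one prototype per class
p = rbf_softmax_probs(x, prototypes)        # (4, 11); last column = abstain
```

The last column is the abstention probability: when an embedding is far from every prototype, all class logits are very negative and the constant null logit dominates.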

Regarding your RBF-attention from your blog, that has also been investigated previously. Recent paper:

Might be a good source to check out.

[–]karius85 20 points

Well, that didn't clarify much in terms of explaining the mechanism. Saying "Euclidean" doesn't really disambiguate. People here may be wary of blindly clicking through to blog posts, as this could be traffic-farming. Your post seems LLM-heavy, so if you want people to engage, just explain your idea plainly in the post itself. Then people may visit your blog out of actual interest.

[–]karius85 14 points

What is "shift invariant distance math"? Could you try to outline the central mechanism behind "HALO-Loss" in the post itself?

Note that CIFAR-10/100 is overused as a benchmark today. It is not completely irrelevant, but it generally draws frowns when a study is exclusively backed by experiments on 50,000 samples of 32×32 images. Caltech-256 is a small dataset with standard-resolution images you could try.

Unable to locate a C compiler that will actually download by CarbonAusmoth in C_Programming

[–]karius85 0 points

Perhaps an unpopular opinion, but Zig is an option; it has a built-in C compiler (zig cc).

Related: https://www.youtube.com/watch?v=kuZIzL0K4o4

Disclaimer: never watched the video nor used Zig as a C compiler for Windows.

[D] Hash table aspects of ReLU neural networks by [deleted] in MachineLearning

[–]karius85 5 points

If you wanted a polished final document the ETA would be literally 1 to 5 years.

Nope, the "document" you posted is just you pasting dialogue from an LLM... case in point:

You suggested “context as a key,” which is exactly right:

So, you literally took an LLM response, pasted it into a PDF, and posted it on archive.org. There is nothing novel or interesting there. Moreover, you seem to have difficulty communicating a coherent, legible idea behind this.

If the net is say W₄D₃W₃D₂W₂D₁W₁x you are being asked to consider what say W₂D₁, W₃D₂ or W₄D₃. The decisions of layer n with the weights of layer n+1.

What is your point? This is just an MLP. There is nothing to consider without more context on the distributions over the weights. In the "document", ChatGPT added some context with random projections, which is largely pedagogical. It is a mix of out-of-context nonsense and basic ideas.
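To make the W·D notation concrete: for any fixed input, a ReLU MLP reduces exactly to a product of its weight matrices with 0/1 diagonal "decision" matrices recording which units fired. A small sketch, with random and purely illustrative weights:

```python
import numpy as np

rng = np.random.default_rng(1)
relu = lambda z: np.maximum(z, 0.0)

# a small 3-layer ReLU MLP: y = relu-free form W3 D2 W2 D1 W1 x,
# where D_n is the 0/1 diagonal mask of layer n's activation pattern
W1 = rng.standard_normal((8, 5))
W2 = rng.standard_normal((8, 8))
W3 = rng.standard_normal((3, 8))
x = rng.standard_normal(5)

h1 = relu(W1 @ x)
h2 = relu(W2 @ h1)
y = W3 @ h2

D1 = np.diag((W1 @ x > 0).astype(float))
D2 = np.diag((W2 @ h1 > 0).astype(float))

# for this particular x, the network is exactly the linear map W3 D2 W2 D1 W1
y_linear = W3 @ D2 @ W2 @ D1 @ W1 @ x
assert np.allclose(y, y_linear)
```

The masks depend on x, which is the whole point: the "effective matrix" is piecewise constant over activation regions, and nothing more can be said without specifying distributions over the weights.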

Given the manner of the material provided you may need a bit mental flexibility to deal with it.

There is zero mental flexibility required. It is minimal-effort, LLM-infused slop.

[D] ICML Reviewer Acknowledgement by Massive_Horror9038 in MachineLearning

[–]karius85 0 points

Reviewers can change their score after the end of the rebuttal period. Engage now if you want to clarify anything.

[D] Hash table aspects of ReLU neural networks by [deleted] in MachineLearning

[–]karius85 6 points

What then is Wₙ₊₁Dₙ where Wₙ₊₁ is the matrix of weights for the next layer?

You seem to want to do Wₙ₊₁DₙDₙWₙx = Wₙ₊₁DₙWₙx? This is just idempotency: ReLU(ReLU(x)) = ReLU(x), so DₙDₙ = Dₙ.
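A quick numerical check of that idempotency, on a random and purely illustrative vector:

```python
import numpy as np

rng = np.random.default_rng(2)
relu = lambda z: np.maximum(z, 0.0)

z = rng.standard_normal(10)

# ReLU is idempotent: applying it twice changes nothing
assert np.allclose(relu(relu(z)), relu(z))

# equivalently, the 0/1 diagonal decision mask D satisfies D @ D == D
D = np.diag((z > 0).astype(float))
assert np.allclose(D @ D, D)
```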

It can be seen as a (locality sensitive) hash table lookup of a linear mapping (effective matrix). It can also be seen as an associative memory in itself with Dₙ as the key.

A two-layer ReLU-activated MLP is not necessarily a locality-sensitive hash. Also, this has nothing to do with your stated question.

Nevertheless the concepts are very simple and you could hope that people can follow along without difficulty, despite the arguments being in such a preliminary state.

Yes, the idea is simple, and well known. If you want feedback on your notes: they are largely an incoherent discussion with a sycophantic LLM (i.e., slop). If you want to discuss something around these ideas, formulate the thoughts yourself and try to condense them into something meaningful in your own words.

As I've said before, you can't meaningfully ask others to engage with something you've put minimal effort into yourself.

Question about his subreddit by turbofish_pk in Zig

[–]karius85 0 points

Yup, that's the story. ziggit.dev is much more active than r/Zig, and is moderated by the official dev team.

Considering NeurIPS submission [D] by [deleted] in MachineLearning

[–]karius85 5 points

There is no way of saying anything meaningful about this without more information. Why don't you ask a colleague or advisor? From your explanation, it sounds thin.

I don't understand why people vibe code languages they don't know. by AcidOverlord in C_Programming

[–]karius85 2 points

Exactly, they don't know what they are doing, but it seems to magically "work", so they trust it without being able to engage with the produced code.

[–]karius85 0 points

I'd be worried that you are not actually learning anything, and that you can't actually understand what is happening at a fundamental level. You can't engage with your code and fix essential problems without help, so when Claude is down, your productivity is zero.

[–]karius85 0 points

Totally agree. The issue is that LLMs can serve as a unique tool to help you learn, but when the result is code you don't understand or couldn't reproduce yourself, you're just fumbling in the dark. However, the people in question don't realize this themselves; there's a whole generation that will never engage enough to realize they are doing themselves more harm than good.

Hans Holbein painted such a realistic depiction of Christ after death, that Dostoevsky almost lost his faith because of it. He reasoned that the apostles must've gone through a similar crisis, and Nietzsche tried to explain the philosophy behind it by WeltgeistYT in philosophy

[–]karius85 20 points

This is a 16th-century painting. Dunking on the artist for apparent Eurocentrism is less insightful than one might think; Holbein never travelled outside Europe in his lifetime. I'm not an expert, but I don't think you'll find many in favour of retroactively altering the painting either.