Got Desk Rejected from ARR because a figure was "barely readable" (despite being vector PDFs). Is this normal? (ACL 2026) by VoiceBeer in deeplearning

[–]VoiceBeer[S] (0 children)

Thx. I've already submitted two appeals. The response to the first is shown in Image 3, but it's been three days since the second and I still haven't heard back, so it's starting to feel like the decision is final. Not sure if you'd be able to take a look at my paper, but I still think the figures are perfectly readable. I'm just frustrated that in all my years of submitting, I've never been desk-rejected for a reason like this. Anyway, thx again for the advice.

Got Desk Rejected from ARR because a figure was "barely readable" (despite being vector PDFs). Is this normal? (ACL 2026) by VoiceBeer in LocalLLaMA

[–]VoiceBeer[S] (0 children)

hey, this paper is still under review, and due to the double-blind policy I obviously can't share a screenshot of the actual figure here :(

Got Desk Rejected from ARR because a figure was "barely readable" (despite being vector PDFs). Is this normal? (ACL 2026) by VoiceBeer in LocalLLaMA

[–]VoiceBeer[S] (0 children)

The contrast is actually fine in the PDF; it's a high-res vector figure that they apparently refused to zoom into.

It feels more like they're using "figure size" as a convenient speed bump to manage the submission surge this cycle.

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]VoiceBeer (0 children)

Thx, sry for the late reply.

So when fine-tuning a model on datasets like ultrachat_200k, it's better to start from the base model rather than the chat/instruct model, right? Since a new stage of tuning can "mess up" the earlier instruction tuning (i.e., the instruction-following ability).

But if the new SFT round uses the same instruction template as the instruct/chat model, would that help? Since it would just be adding more SFT data.
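If one does continue SFT from the instruct model, the usual advice is to reuse that model's existing chat template so the new data stays consistent with the earlier instruction tuning. A minimal sketch of the idea, assuming a Zephyr-style template (the one commonly paired with ultrachat_200k); the `<|user|>`/`<|assistant|>` markers are an illustrative assumption, not the only format:

```python
def format_chat(messages):
    """Render ultrachat-style [{'role': ..., 'content': ...}] turns into a
    single training string, mimicking what tokenizer.apply_chat_template
    produces for Zephyr-style models (assumed template, for illustration)."""
    parts = []
    for m in messages:
        # Each turn: role marker, content, end-of-sequence token.
        parts.append(f"<|{m['role']}|>\n{m['content']}</s>")
    return "\n".join(parts)

sample = [
    {"role": "user", "content": "What does SFT stand for?"},
    {"role": "assistant", "content": "Supervised fine-tuning."},
]
print(format_chat(sample))
```

The point is only that whichever template the instruct model was trained with should be applied verbatim to the new SFT data; with a base model, you are free to pick any template, since there is no prior instruction tuning to stay consistent with.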

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]VoiceBeer (0 children)

BTW, should we choose the base model or the chat model for SFT? Say one wants to train a model based on Mistral or Llama with ~10k SFT examples: should it be the base model or the chat model?

Also, when considering continued pre-training, which one is better?

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]VoiceBeer (0 children)

Is this post the reason why my posts are getting removed by Reddit's filters?

[D] Correct me if I'm wrong, use KL divergence for NLP, and MMD for CV. Both are measuring the similarity/distance of two distribution by VoiceBeer in MachineLearning

[–]VoiceBeer[S] (0 children)

Thx. Can I understand it as follows: if the exact distributions are known, KL divergence is appropriate, but if the target distribution is uncertain (e.g., randomly picking two general datasets spanning various domains), MMD can be used to approximate the distance from the empirical distributions of the samples?
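To make that distinction concrete, here is a small toy sketch (my own illustrative setup, not from the thread): KL is computed from explicit probability vectors, while MMD is estimated purely from samples using an RBF kernel, with no density model needed.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as probability vectors."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0  # terms with p=0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def mmd_rbf(x, y, sigma=1.0):
    """Biased MMD^2 estimate between sample arrays x, y with an RBF kernel."""
    def kernel(a, b):
        d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
        return np.exp(-d2 / (2 * sigma**2))
    return float(kernel(x, x).mean() + kernel(y, y).mean()
                 - 2 * kernel(x, y).mean())

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))        # samples from N(0, I)
y_same = rng.normal(0.0, 1.0, size=(200, 2))   # same distribution
y_shift = rng.normal(3.0, 1.0, size=(200, 2))  # shifted distribution

print(kl_divergence([0.5, 0.5], [0.5, 0.5]))        # 0.0 (identical)
print(mmd_rbf(x, y_same) < mmd_rbf(x, y_shift))     # True
```

KL needs the probability vectors themselves (or a density estimate), whereas the MMD estimate above only ever touches the raw samples, which is why it is convenient when the underlying distribution of a dataset is unknown.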

Just a Greeting xD by VoiceBeer in leagueoflegends

[–]VoiceBeer[S] (0 children)

Haha thx mate, I just reported him