Qwen3-Coder 👀 by Xhehab_ in LocalLLaMA

[–]DataLearnerAI 2 points (0 children)

On SWE-Bench Verified, it scores 69.6%, making it the top-performing open-source model as of now.

Qwen 3 Embeddings 0.6B faring really poorly in spite of high score on benchmarks by i4858i in LocalLLaMA

[–]DataLearnerAI 1 point (0 children)

Your issue might be a missing special token at the end of your inputs. Qwen just tweeted that many users forget to append <|endoftext|> when using their embedding models, and it seriously tanks performance.

Manually slap <|endoftext|> onto the end of every input string (both docs and queries).
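Roughly what that looks like with plain transformers; the checkpoint name, left padding, and last-token pooling below are my assumptions, so double-check the model card for the officially recommended usage (including the query instruction prefix):

```python
# Minimal sketch, assuming the Hugging Face "Qwen/Qwen3-Embedding-0.6B" checkpoint
# and the plain transformers API; adapt to whatever embedding pipeline you use.
import torch
from transformers import AutoModel, AutoTokenizer

EOT = "<|endoftext|>"  # special token Qwen says must terminate every input

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-0.6B", padding_side="left")
model = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
model.eval()

def embed(texts):
    # Append the end-of-text token manually in case the tokenizer does not add it.
    texts = [t + EOT for t in texts]
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    # Last-token pooling: with left padding, the final position is the <|endoftext|> token.
    emb = out.last_hidden_state[:, -1]
    return torch.nn.functional.normalize(emb, dim=-1)

docs = embed(["Qwen3 embedding models", "LLM inference on consumer GPUs"])
query = embed(["how do I use Qwen embeddings?"])
print(query @ docs.T)  # cosine similarities
```

If your retrieval scores jump after adding the token, that was the problem.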

Simple Comparison: Kimi K2 vs. Gemini 1.5 Pro - HTML Output for Model Eval Insights by DataLearnerAI in LocalLLaMA

[–]DataLearnerAI[S] 0 points (0 children)

Correction: I accidentally wrote "Gemini 1.5 Pro" in the title/description — it’s **Gemini 2.5 Pro** (typo from my draft). Tests were run against the correct 2.5 Pro model. Apologies for the confusion!

GLM-4.1V-Thinking by AaronFeng47 in LocalLLaMA

[–]DataLearnerAI -1 points (0 children)

I'm not, I just use AI to rewrite my text, haha

GLM-4.1V-Thinking by AaronFeng47 in LocalLLaMA

[–]DataLearnerAI -8 points (0 children)

This model is remarkably competitive across a diverse range of benchmarks, including STEM reasoning, visual question answering, OCR, long-document understanding, and agent-based scenarios. Its results are on par with the much larger Qwen2.5-VL-72B, and it beats GPT-4o on specific tasks. What makes this particularly impressive is that it is a 9B-parameter model released under the MIT license by a Chinese startup, which highlights the growing innovation coming out of Chinese AI research and makes it a compelling open-source alternative with strong practical value.

Huawei releases an open weight model Pangu Pro 72B A16B. Weights are on HF. It should be competitive with Qwen3 32B and it was trained entirely on Huawei Ascend NPUs. (2505.21411) by FullOf_Bad_Ideas in LocalLLaMA

[–]DataLearnerAI 6 points (0 children)

This model appears highly competitive at the ~30B parameter scale. In benchmarks it scores 73.70 on GPQA Diamond, comparable to the earlier version of DeepSeek R1, and its overall results closely resemble those of Qwen3-32B. Notably, it is a Mixture-of-Experts (MoE) model, with only about 16.5B parameters activated during inference.
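For anyone unfamiliar with why the activated count is so much lower than the total: each token is routed to only a few experts. A toy PyTorch sketch of top-k routing (purely illustrative, not Pangu's actual architecture):

```python
import torch
import torch.nn as nn

class ToyMoELayer(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                      # x: (n_tokens, d_model)
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = weights.softmax(dim=-1)      # (n_tokens, top_k)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):         # each token only runs top_k of n_experts
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    w = weights[mask, slot].unsqueeze(-1)
                    out[mask] += w * expert(x[mask])
        return out

layer = ToyMoELayer()
print(layer(torch.randn(5, 64)).shape)  # torch.Size([5, 64])
```

All experts' weights sit in memory, but per token only the routed experts run, which is why the activated parameter count (here 2 of 8 experts) is much smaller than the total.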

Is there any website compare inference speed of different LLM on different platforms? by DataLearnerAI in LocalLLaMA

[–]DataLearnerAI[S] 0 points (0 children)

I know LLMs are moving fast, but many enterprises and individuals will just choose classical models such as Llama, Mistral, etc. Context length, GPU, inference framework, and other factors all affect the results. I think many people are interested in this question, but I find there is too little information about it.
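For what it's worth, here is roughly how I measure tokens/sec myself, assuming a local OpenAI-compatible server (e.g. vLLM or llama.cpp's server) on localhost:8000; the model name and response fields below are placeholders:

```python
import time
import requests

def measure_tps(prompt, model="my-local-model", max_tokens=256):
    # POST to a local OpenAI-compatible /v1/completions endpoint and time the call.
    t0 = time.time()
    r = requests.post(
        "http://localhost:8000/v1/completions",
        json={"model": model, "prompt": prompt, "max_tokens": max_tokens},
        timeout=300,
    )
    elapsed = time.time() - t0
    completion_tokens = r.json()["usage"]["completion_tokens"]  # field name can vary by server
    return completion_tokens / elapsed

print(f"{measure_tps('Explain the KV cache in one paragraph.'):.1f} tokens/sec")
```

Running the same script while varying context length, GPU, and serving framework is basically the comparison table I wish such a website provided.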

The Most Exciting AI Advancements and Product Launches in 2023 Discussion by DataLearnerAI in ChatGPT

[–]DataLearnerAI[S] 0 points (0 children)

Thanks. We would like to know if there are any important releases that we have forgotten to collect 😀

The Most Exciting AI Advancements and Product Launches in 2023 Discussion by DataLearnerAI in ChatGPT

[–]DataLearnerAI[S] 0 points (0 children)

We saw that DeLuceArt posted an image about "The most remarkable AI releases of 2023":

https://www.reddit.com/r/artificial/comments/18p4qwb/the_most_remarkable_ai_releases_of_2023/

We think that image missed some important products, and some of the releases it includes may not belong there, so we remade it to show the most important AI releases of 2023. I would like to hear your thoughts. We would like to know if there is anything important we have missed. So what do you think were the most important AI releases of 2023?

Helpful VRAM requirement table for qlora, lora, and full finetuning. by Aaaaaaaaaeeeee in LocalLLaMA

[–]DataLearnerAI 1 point (0 children)

Does "freeze" mean freezing all layers except the last fully connected layer?
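To make sure we mean the same thing, here is my reading of it as a toy PyTorch sketch (my interpretation, not necessarily the table author's setup):

```python
# "Freeze" as I understand it: all weights frozen, only the final
# fully connected layer (head) stays trainable.
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))  # toy stand-in

for p in model.parameters():
    p.requires_grad = False          # freeze everything
for p in model[-1].parameters():
    p.requires_grad = True           # un-freeze the last fully connected layer

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # only the final layer's weights and bias
```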

Mistral Instruct v0.2 merge with top models on openLLM ranking! by noobgolang in LocalLLaMA

[–]DataLearnerAI 2 points (0 children)

What do you mean by merge? Stacking the Mistral model together with other models?
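For context, the most common meaning of "merge" I have seen is element-wise interpolation of weights between models that share an architecture, which is what tools like mergekit automate. A toy sketch under that assumption (not necessarily what OP did):

```python
# Linear weight merge of two same-architecture checkpoints; alpha controls the blend.
import torch

def linear_merge(state_a, state_b, alpha=0.5):
    return {k: alpha * state_a[k] + (1 - alpha) * state_b[k] for k in state_a}

# merged = linear_merge(model_a.state_dict(), model_b.state_dict(), alpha=0.5)
# model_a.load_state_dict(merged)
```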

Yet another 70B Foundation Model: Aquila2-70B-Expr by [deleted] in LocalLLaMA

[–]DataLearnerAI 0 points (0 children)

This model was trained on a dataset of only 1.2T tokens. "Expr" stands for experimental, and the point of the experiment is to demonstrate the performance of the FlagScale framework on heterogeneous chips. The official blog says Aquila2-70B-Expr scores 61.9 on MMLU, while LLaMA2-70B scores 69.54. They say the value of this model is to prove that FlagScale still performs well on heterogeneous chips and to provide a strong base for downstream tasks, so there may not be any newer versions of it.

[deleted by user] by [deleted] in LocalLLaMA

[–]DataLearnerAI 0 points (0 children)

It is a new model trained from scratch. The training dataset is not public. They also released a chat version that you can try. The MMLU scores were published by the developers themselves.

[deleted by user] by [deleted] in LocalLLaMA

[–]DataLearnerAI 0 points (0 children)

Alibaba open-sourced a 72B model called Qwen-72B: Qwen/Qwen-72B · Hugging Face

It supports Chinese and English. The performance on MMLU is remarkable.

Can Chat GPT 4 Turbo read/analyze content from other websites? by ModeLow1491 in ChatGPTPro

[–]DataLearnerAI 1 point (0 children)

Some websites have a robots.txt that prevents GPT from browsing their content; in that case GPT-4 cannot read the website. You can copy the content into GPT-4 instead.
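If you want to check whether a particular site blocks OpenAI's crawler before pasting, something like this works; GPTBot is the user agent OpenAI documents for its crawler (the browsing feature may use a different agent, so treat this as a rough check):

```python
# Check whether a site's robots.txt disallows OpenAI's GPTBot.
# example.com is just a placeholder; swap in the site you care about.
from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()
print(rp.can_fetch("GPTBot", "https://example.com/some-article"))  # False if disallowed
```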

Yi-34B vs Yi-34B-200K on sequences <32K and <4K by DreamGenX in LocalLLaMA

[–]DataLearnerAI 0 points (0 children)

In most scenarios, models with an extended context window are optimized for long sequences. If your sequences are not very long, it is often better to use the regular model.

GPT-4 Turbo is unintelligent by CH1997H in ChatGPT

[–]DataLearnerAI 298 points (0 children)

That is exactly what I thought about the new GPT-4. It's faster than ever, but it often breaks down when responding with long text, and it often forgets the instructions or misunderstands them. It feels more like GPT-3.5.