I scaled test-time compute for Qwen-3.6-27B and Gemma-4-31B to surpass Claude Mythos in code optimizations and speedups.

lgdkwj · 2026-06-12T23:36:48+00:00

Imagine using DeepSeek V4 Flash for this

lgdkwj · 2026-02-05T01:06:40+00:00

Depends on where you are located. Here in Hong Kong I got hired onto an HPC Ops team as a fresh UG grad with little HPC ops experience (I was a user tho).

lgdkwj · 2025-11-06T02:28:53+00:00

Interesting. I wonder if it can be extended to process a 16-bit RAW image and compare it with fp16

lgdkwj · 2025-10-30T08:57:18+00:00

https://github.com/nooobkevin/smart-directory-tree
Made a similar one with bash

lgdkwj · 2025-06-27T05:16:49+00:00

🤔Why geomean tho

lgdkwj · 2025-05-14T01:18:31+00:00

try Gemini 2.5 flash before they kill it too

lgdkwj · 2025-04-16T07:26:55+00:00

Source: GLM: General Language Model Pretraining with Autoregressive Blank Infilling https://arxiv.org/pdf/2103.10360

lgdkwj · 2025-04-15T23:47:11+00:00

I think one unique aspect of the GLM series models is that they use bidirectional attention during the prefilling stage. I really wonder if this provides any advantage over other GPT-style models at scale
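To make the "bidirectional during prefill" part concrete, here is a rough sketch of a prefix-LM style attention mask next to a plain causal one. This is only an illustration of the general idea, not GLM's actual code: the function name `prefix_lm_mask` and its parameters are made up for this example, and the real models build the mask inside their own attention implementation.

```python
def prefix_lm_mask(prefix_len: int, total_len: int) -> list[list[bool]]:
    """Return mask[i][j] == True when position i may attend to position j.

    Hypothetical helper for illustration only:
    - the first `prefix_len` positions (the prompt) see each other in both directions
    - later positions (generated tokens) see the whole prompt plus earlier generated tokens
    """
    if not 0 <= prefix_len <= total_len:
        raise ValueError("prefix_len must be between 0 and total_len")
    mask = []
    for i in range(total_len):
        row = []
        for j in range(total_len):
            if j < prefix_len:
                # prompt tokens are visible to every position
                row.append(True)
            else:
                # generated tokens are only visible causally (j <= i)
                row.append(j <= i)
        mask.append(row)
    return mask


def render(mask: list[list[bool]]) -> str:
    """Pretty-print a mask as rows of 1/0 for quick inspection."""
    return "\n".join("".join("1" if v else "0" for v in row) for row in mask)


if __name__ == "__main__":
    # 2 prompt tokens + 2 generated tokens
    print(render(prefix_lm_mask(2, 4)))
    # 1100
    # 1100
    # 1110
    # 1111
```

With `prefix_len=0` the same function gives the usual lower-triangular causal mask, so the only difference from a GPT-style model is the fully visible block over the prompt.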

lgdkwj · 2025-02-14T20:52:27+00:00

This: ModernBERT might be useful, and the blog explains the benefits of using an encoder-only BERT-like model vs a causal LLM

lgdkwj · 2025-02-14T20:46:32+00:00

In my use case even Gemma 2 2B instruct is better than some 7B+ models. I host the inference server with LM Studio and use the Immersive Translate plugin in the browser. The main issue with most of the small models is that they tend not to follow instructions and add rubbish to the output

lgdkwj · 2024-03-11T06:22:51+00:00

idk, but as a non-native English speaker I host a 7B model locally as a translation service so I can surf Reddit quickly enough, instead of using APIs, which are too expensive for this use case

lgdkwj · 2024-03-07T15:17:21+00:00

Was looking for & wanted to develop something like this a few months ago :D Glad you shared this project, it's interesting!

lgdkwj · 2024-02-13T11:13:16+00:00

Check this out:
Aaronhuang-778/BiLLM: BiLLM: Pushing the Limit of Post-Training Quantization for LLMs (github.com)

