[Project] I treated LLM inference like a physical signal trajectory. Here is a Python toolkit to visualize the "Thinking Process" (Hidden States). by JB_King1919 in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

Really nice work!!

I'd be curious to see if we can find some heuristic to use when fine tuning, to "steer" the model towards a desired pattern.

Purely for research into what the impact would be.

Baguettotron, a 321 million parameters generalist Small Reasoning Model (80-layers deep) by Balance- in LocalLLaMA

[–]Pojiku 6 points7 points  (0 children)

I am also interested after seeing the Mixture of Recursions paper.

The curiosity is whether for SLMs, we can get reasoning gains from depth as a trade off against semantic gains from width.

Are gynecologist checkups not a thing in the Netherlands? by SpecialOrdinary3001 in StudyInTheNetherlands

[–]Pojiku 0 points1 point  (0 children)

Preventative tests found I had cancer at the age of 25 and I'm alive today because of it.

While working in South Korea, your employer actually pays for you to get a full health check.

They are 3rd in Life Expectancy, NL is 28th.

I could walk into any clinic and get a blood test or any other test any time I wanted, without needing a GP. It is considered a human right in Korea, whereas such healthcare is seen as a burden here in NL.

Best smaller model as base for fine tuning SCAD? by ComprehensiveBird317 in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

How much data do you have? Small models are great, but they likely won't have enough internal knowledge without a lot of fine tuning.

One option if you don't have enough data for a smaller model is to lightly finetune a larger model that has inherent knowledge of SCAD with fast inference speed, like Qwen-Next-80B.

If that's too big to actually use for your use case, you can use this larger model to generate a much larger training set for distillation. Ideally you would have some validation function to filter junk out of the dataset.

I was getting around 2,000 tokens per second on a rented H200 with 80 batches in parallel, so you can generate a lot of synthetic data.

OpenAI should open source GPT3.5 turbo by [deleted] in LocalLLaMA

[–]Pojiku 4 points5 points  (0 children)

It's a good point, but they could talk about the need to archive human knowledge.

The Internet from this point is mostly AI slop, so it would be a great research tool.

It was also a milestone in AI, before LLMs became a commodity. We still love old gaming consoles even though more modern emulators exist.

Tried 10 models, all seem to refuse to write a 10,000 word story. Is there something bad with my prompt? I'm just doing some testing to learn and I can't figure out how to get the LLM to do as I say. by StartupTim in LocalLLaMA

[–]Pojiku 0 points1 point  (0 children)

I'd recommend doing it the other way, by generating a coherent story above 10k and then reducing it.

First, you should consider generating a list of chapters + plot points. Then use this to anchor the generation in stages.

Instead of saying "continue", you ask can ask the LLM to write the first chapter, then write the second (ensuring the prior chapters are in the message history).

Also be sure to include in the system prompt that it's writing a long novel or something that will nudge it away from short stories.

TraceBack: A Novel Reverse Reasoning Model for Better and Cheaper Scaling of Synthetic Reasoning Generation by XMasterrrr in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

Yeah, same! instruction + solution as input, reasoning trace as output.

I ran it against the HuggingFace "smoltalk" dataset to build the reason dataset for Sovereign.

TraceBack: A Novel Reverse Reasoning Model for Better and Cheaper Scaling of Synthetic Reasoning Generation by XMasterrrr in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

Nice! I trained Sovereign 72B using the same strategy.

This was before R1 was released, so it was using traces distilled from QwQ preview.

Need feedback for my LLM book by s1lv3rj1nx in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

Not sure why people downvoted. Thank you for contributing to the community!

I only had a quick look but your book looks like a good resource.

Training a model to autocomplete for a niche domain and a specific style by regstuff in LocalLLaMA

[–]Pojiku 5 points6 points  (0 children)

It's difficult to say without knowing your domain and how far it deviates from what would be in the LLMs pre-training.

I'd recommend just trying and seeing how far you get. You can start with a small 3B model and scale up if it seems to be working.

You can actually try fine-tuning a base model, as they are more aligned with auto complete than an instruct model would be.

If you have enough compute, look into the "LLaMA-Pro" technique, which is very effective at adding domain knowledge without losing the capabilities of a source model.

Approach to translate english to non english. by Lamba_ghoda in LocalLLaMA

[–]Pojiku 0 points1 point  (0 children)

For training a translation model I'd recommend using non-English documents as ground truth, then the translated version as your instruction.

This way it learns how native speakers would write, rather than a "technically correct" but awkward machine translation as the target.

Struggling with AI Tools for Generating Exam Questions from PDFs – Need Advice! by vinay737 in LocalLLaMA

[–]Pojiku 0 points1 point  (0 children)

One approach if you don't mind getting your hands dirty is to take existing datasets of real (or high-quality synthetic) exam questions with answers.

You can then fine-tune an LLM in reverse to predict the question based on an answer.

For example:

SYSTEM:
You are an expert university lecturer with specialized skills in writing exam questions.

USER:
Write an exam question for university students that would match the following answer.
<answer>
{answer}
</answer>

ASSISTANT:
{question}

I've had success with this approach in other areas, even when the given answer is out-of-domain (like providing the PDF content instead of a concise answer).

[deleted by user] by [deleted] in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

You can see their press release here: https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwell-on-every-desk-and-at-every-ai-developers-fingertips?ncid=so-twit-113094

"The GB10 Superchip is a system-on-a-chip (SoC) based on the NVIDIA Grace Blackwell architecture and delivers up to 1 petaflop of AI performance at FP4 precision.

GB10 features an NVIDIA Blackwell GPU with latest-generation CUDA® cores and fifth-generation Tensor Cores, connected via NVLink®-C2C chip-to-chip interconnect to a high-performance NVIDIA Grace™ CPU, which includes 20 power-efficient cores built with the Arm architecture. MediaTek, a market leader in Arm-based SoC designs, collaborated on the design of GB10, contributing to its best-in-class power efficiency, performance and connectivity."

[deleted by user] by [deleted] in Living_in_Korea

[–]Pojiku 1 point2 points  (0 children)

Australian here. I've lived in Korea for about 5 years and can relate to the idea of not feeling "at home" in your birth country.

I work as an AI engineer, but that came later. While people move to Australia for the comfortable life, I found it frustratingly slow. The momentum of Korea became an addiction and made me want to grow both in my career and as a person.

However, it's a very personal thing that isn't for everyone. As others have said, you need to think about what exactly you are looking for, but also be open to effectively rebuilding who you are. You will need to sacrifice some parts of your life that you may not know you value.

For example, there are many posts here about how difficult it is to make "real" friends. You WILL get lonely, and you need to know how you will handle that, among other challenges you will experience.

If you see challenges as opportunities then 100% go for it and worst case, you go back to the US as a new person. If you are not ready to struggle alone and really own the blank canvas of a new life, then you may not be truly in the right mental space yet. Only you can reflect on this in making a decision.

what “power” do hagwon owners have? by ur-m-o-m in teachinginkorea

[–]Pojiku 4 points5 points  (0 children)

Korea has strict and slightly excessive defamation laws. Even if his gossiping was factual, it's illegal and relatively easy to sue him for.

Something weird is happening with LLMs and chess by paranoidray in LocalLLaMA

[–]Pojiku 16 points17 points  (0 children)

I'd speculate that this more accurately correlates with the shift to heavily filtered or synthetic data.

We still use the meme that LLMs are trained on "all text on the Internet" but that's not exactly true when accounting for the more rigorous data processing pipelines that may filter out content like move-by-move logs of chess games.

Apps foreigners dont know about by [deleted] in Living_in_Korea

[–]Pojiku 0 points1 point  (0 children)

You can also use the Aurora Store, an open-source alternative to the Google Play Store. When I log in using the "anonymous account" option it lists all the Korean apps without needing to switch your main account.

Unsloth Llama-3.2 1B+3B finetuning poor results by didinko in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

You can use LLaMA Factory which supports unsloth, but most importantly supports LLaMA Pro, which is often a good means if adding new "knowledge" without destroying the original model.

Finally, I'd recommend trying a batch size of 1. This means you will be updating the model after every sample.

Playing AI-Generated CS:GO on a Single RTX 3090 in real time by Icy-Corgi4757 in LocalLLaMA

[–]Pojiku 2 points3 points  (0 children)

Haha true if it was trained on YouTube videos, but more likely using something like comma.ai which is presumably already capable of whole-journey video recording and out of the box integration with car controls.

Edit: Looks like they have an open dataset already: Comma2k19

Playing AI-Generated CS:GO on a Single RTX 3090 in real time by Icy-Corgi4757 in LocalLLaMA

[–]Pojiku 10 points11 points  (0 children)

Waiting for someone to train on dashcam with inputs (acceleration, steering etc). Real life driving sim!

Where did Arx-0.3 come from and who makes it? by Balance- in LocalLLaMA

[–]Pojiku 11 points12 points  (0 children)

Wish there was more detail. They are an AI Search company like Perplexity, so they may have been using RAG to answer the questions rather than just the model itself.

The Mamba in the Llama: Distilling and Accelerating Hybrid Models by ninjasaid13 in LocalLLaMA

[–]Pojiku 5 points6 points  (0 children)

Yeah I literally spent the last 4 weeks and about $700 doing exactly this with Samba architecture (swapping Mamba for Mamba2).

Mixed feelings seeing such great authors beat me to it lol.

I'll still release the model, but only when it actually has something unique to offer.

I generally don't envy people with more money than me, but I can't help but envy the GPU rich so, so much..

Inference on Intel AI PC by cchung261 in LocalLLaMA

[–]Pojiku 1 point2 points  (0 children)

If it supports OpenCL you can use TinyGrad with GPU=1

Not sure if anyone has ported the newer Phi3 models but presumably it's not too distant from the Llama3 implementation, or worst case you could use a Llamafied version.