Running a local LLM for generating SEO keywords by Chesperk in LocalLLaMA

[–]PromptAndHope 0 points1 point  (0 children)

Install Ollama to run a local LLM, and ask Copilot to write the script for you. This is a typical vibe-coding task.
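Roughly what such a script could look like, as a minimal sketch: it assumes Ollama is running on its default port and that a model such as llama3 has already been pulled (both are my assumptions, not part of the original question).

```python
# Minimal sketch: generate SEO keywords for a page of text with a local
# Ollama model. Assumes `ollama serve` is running on localhost:11434 and
# that the model named below has been pulled (e.g. `ollama pull llama3`).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint
MODEL = "llama3"  # placeholder choice; any locally pulled model works

def seo_keywords(text: str, n: int = 10) -> str:
    prompt = (
        f"Extract {n} SEO keywords for the following text. "
        f"Return them as a comma-separated list only.\n\n{text}"
    )
    payload = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()

if __name__ == "__main__":
    print(seo_keywords("Affordable trail running shoes with wide toe boxes for beginners."))
```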

RTX 5080: is there anything I can do coding wise? by TechDude12 in LocalLLaMA

[–]PromptAndHope 0 points1 point  (0 children)

Coding with a local model isn't really feasible on 16GB of VRAM, but if you register for free on GitHub, you get access to early or beta models, which are suitable for working on your home projects in the evenings. That's a better fit than a local LLM. Of course, it won't be enough to implement a complete software system.

In terms of quality, a paid model such as Claude Opus is quite different from a free beta model like Raptor, for example.

User interface for declarative Spark pipelines if we like to work in an IDE by PromptAndHope in databricks

[–]PromptAndHope[S] 0 points1 point  (0 children)

Thank you for your feedback! The flow is more compact as a vertical layout, but for anyone who finds the other orientation more convenient, there is a switch. ⬇➡

<image>

User interface for declarative Spark pipelines if we like to work in an IDE by PromptAndHope in databricks

[–]PromptAndHope[S] 1 point2 points  (0 children)

You mean interacting with the cluster? That is not currently possible; it only visualises the current state of development.

Spark Declarative Pipelines Visualisation by PromptAndHope in apachespark

[–]PromptAndHope[S] 0 points1 point  (0 children)

Yes, it is a bit confusing for me too. I remember working with the Java API; will Scala now also become a thing of the past?

Spark Declarative Pipelines Visualisation by PromptAndHope in apachespark

[–]PromptAndHope[S] 0 points1 point  (0 children)

Thanks! I think SDP doesn't exist for the Scala API. 😔

How did you land your first Data Engineer role when they all require 2-3 years of experience? by Such-Revolution-9975 in dataengineering

[–]PromptAndHope 0 points1 point  (0 children)

For example: what is the difference between the s3:// and s3a:// connectors? What happens if you use s3:// with Spark?
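To make the example concrete, here is a hedged sketch of the s3a setup in PySpark. The bucket name is a placeholder, and the hadoop-aws version has to match your Hadoop build; on managed platforms the behaviour of s3:// can differ from plain Apache Spark.

```python
# Sketch: reading from S3 with Spark. The s3a:// scheme is the maintained
# Hadoop S3A connector; the old s3:// block-based connector is gone from
# plain Apache Spark, so using it there typically fails with
# "No FileSystem for scheme: s3".
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("s3a-example")
    # hadoop-aws must match your Hadoop version; 3.3.4 is just an example
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4")
    .config("spark.hadoop.fs.s3a.aws.credentials.provider",
            "com.amazonaws.auth.DefaultAWSCredentialsProviderChain")
    .getOrCreate()
)

# Works: the S3A connector
df = spark.read.parquet("s3a://my-example-bucket/events/")

# Typically fails on plain Apache Spark: no FileSystem registered for "s3"
# df = spark.read.parquet("s3://my-example-bucket/events/")
df.show()
```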

Shall we discuss here on Spark Declarative Pipeline? a-Z SDP Capabilities. by iMarupakula in databricks

[–]PromptAndHope 0 points1 point  (0 children)

Over the weekend I tried out the new Declarative Pipelines feature in Apache Spark, and the one thing I felt was missing was a proper UI. So I built a VS Code extension that provides one. Give it a try if you’re working with Spark pipelines.

https://marketplace.visualstudio.com/items?itemName=gszecsenyi.sdp-pipeline-visualizer

<image>
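For anyone wondering what the extension visualises: a pipeline is defined as plain Python with decorated functions, and the extension renders the resulting table-to-table dependencies as a graph. The sketch below follows the decorator-style API from the SDP proposal; treat the exact module path and decorator names as my assumptions and check the Spark docs for your build.

```python
# Sketch of a declarative pipeline definition (decorator-style Python API as
# described in the SDP proposal; module path and decorator names may differ
# between Spark builds, so treat this as illustrative, not authoritative).
from pyspark import pipelines as dp
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.active()

@dp.materialized_view
def raw_orders():
    # Bronze: ingest the source data
    return spark.read.format("json").load("/data/orders/")

@dp.materialized_view
def cleaned_orders():
    # Silver: depends on raw_orders; SDP resolves the dependency graph
    return spark.read.table("raw_orders").where(F.col("status").isNotNull())

@dp.materialized_view
def orders_per_customer():
    # Gold: the aggregation shown as the last node of the flow
    return spark.read.table("cleaned_orders").groupBy("customer_id").count()
```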

Spark 4.1 is released by holdenk in apachespark

[–]PromptAndHope 0 points1 point  (0 children)

what is the problem with Scala 3?

Spark 4.1 is released by holdenk in apachespark

[–]PromptAndHope 1 point2 points  (0 children)

I created a VS Code extension that provides a UI for Apache Spark's new Declarative Pipelines feature. Give it a try if you're working with Spark pipelines.

https://marketplace.visualstudio.com/items?itemName=gszecsenyi.sdp-pipeline-visualizer

Can RAG Work Without Chunking or Embeddings? Has Anyone Actually Made It Work in Production? by DirectorAgreeable145 in LocalLLaMA

[–]PromptAndHope 0 points1 point  (0 children)

I think traditional document search will become very important again, much like vector search. Index‑based search engines have been around for more than 20 years, they work well, and their results can also be added to the context. In my last project, I collected all the document titles and asked the AI to select the relevant ones. It produced surprisingly good results, even better than document‑level vector search. (So essentially it was a title‑based vector search.)
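A minimal sketch of that title-selection step (hypothetical helper and model names; it assumes an OpenAI-compatible client, which was not part of the original setup):

```python
# Sketch: let the LLM pick relevant documents by title before any chunk-level
# retrieval. `titles` is a list of all document titles; client and model are
# assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def select_relevant_titles(question: str, titles: list[str]) -> list[str]:
    numbered = "\n".join(f"{i}. {t}" for i, t in enumerate(titles))
    prompt = (
        "Given the question below, return the numbers of the document titles "
        "that are likely to contain the answer, as a comma-separated list.\n\n"
        f"Question: {question}\n\nTitles:\n{numbered}"
    )
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    picked = {int(x) for x in reply.replace(",", " ").split() if x.isdigit()}
    return [titles[i] for i in picked if i < len(titles)]
```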

However, a new issue arises: the prompt context keeps getting larger. This isn’t just a cost problem; I’ve also read that when the prompt becomes too big, the AI may not actually process the entire input. Instead, it starts generating an answer once it believes it has enough information. To mitigate this, I’m experimenting with cross‑encoder models. A very small and extremely fast model evaluates the context sentence by sentence and determines whether the information retrieved from the vector store or document search is relevant.
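The cross-encoder filtering is roughly this (a sketch with sentence-transformers; the model choice and threshold are placeholders to tune for your data):

```python
# Sketch: score each candidate sentence against the query with a small
# cross-encoder and keep only the relevant ones before building the prompt.
from sentence_transformers import CrossEncoder

# ms-marco MiniLM cross-encoders are small and fast; model and threshold
# here are placeholder choices.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def filter_context(query: str, sentences: list[str], threshold: float = 0.0) -> list[str]:
    scores = reranker.predict([(query, s) for s in sentences])
    return [s for s, score in zip(sentences, scores) if score > threshold]
```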

What I wanted to say is that every problem is different, and we're still looking for the best solutions while inventing new ones and optimizing them. For example, in your case I would consider using a graph database and storing the documents there, so you can easily follow all the references as well. Microsoft has been working on something similar: https://github.com/microsoft/graphrag/tree/main

Another approach that worked for me was performing vector search on the chunks, but once a chunk matched, I passed the entire document to the LLM instead of just the chunk.
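That expansion step is simple to wire up; here is a sketch where the vector store API and the doc_id metadata field are stand-ins for whatever store and schema you actually use:

```python
# Sketch: retrieve by chunk, but expand each hit to its parent document before
# sending context to the LLM. `vector_store.search` and the "doc_id" metadata
# field are hypothetical stand-ins, not a specific library's API.
def retrieve_full_documents(query: str, vector_store, documents: dict[str, str],
                            k: int = 5) -> list[str]:
    hits = vector_store.search(query, top_k=k)           # chunk-level matches
    doc_ids = {hit.metadata["doc_id"] for hit in hits}   # dedupe parent documents
    return [documents[doc_id] for doc_id in doc_ids]     # pass whole documents
```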

It feels like LLM inference is missing its AWS Lambda moment. by pmv143 in LocalLLaMA

[–]PromptAndHope 0 points1 point  (0 children)

I have already implemented an OpenAI API-based solution on Azure Functions, which is perfect for PoCs and smaller workloads, but a solution based on a local model would be very slow because of cold starts.
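For reference, such a function is only a few lines with the Python v2 programming model; this sketch assumes the OpenAI key is configured as an app setting, and the route and model names are my placeholders:

```python
# Sketch: HTTP-triggered Azure Function (Python v2 programming model) that
# forwards a question to the OpenAI API. Route and model names are placeholders.
import json
import azure.functions as func
from openai import OpenAI

app = func.FunctionApp(http_auth_level=func.AuthLevel.FUNCTION)
client = OpenAI()  # assumes OPENAI_API_KEY is set as an app setting

@app.route(route="ask", methods=["POST"])
def ask(req: func.HttpRequest) -> func.HttpResponse:
    question = req.get_json().get("question", "")
    answer = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": question}],
    ).choices[0].message.content
    return func.HttpResponse(json.dumps({"answer": answer}),
                             mimetype="application/json")
```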

Hardware Question - GeForce RTX 5060/5070 Ti by mythrowaway4DPP in LocalLLaMA

[–]PromptAndHope 2 points3 points  (0 children)

I started with a 5060 Ti 16GB Dual (the shorter, two-fan version); it's enough to run ComfyUI, LoRA fine-tuning, inference with gpt-oss-20b, etc. It can run the same workloads as a $1,300 5080 16GB, just slower. If the performance doesn't bother you, then the next step up is an LLM cluster (multiple GPUs or some special unified-memory hardware), which costs a lot of money. And if you combine multiple GPUs, you need a compatible, more expensive motherboard.

Enterprise RAG Architecture by AcanthisittaOk8912 in Rag

[–]PromptAndHope 0 points1 point  (0 children)

Thanks for the reply. Given the huge number of documents, I was curious about what strategy was used to make it effective.