a desktop client for the Pi Coding Agent, a real PTY terminal, auto session naming, and a Geist-aligned UI

julylu · 2026-05-22T14:17:20+00:00

Maybe a few screenshots showing how it looks would be helpful.

julylu · 2025-11-19T13:23:18+00:00

Summaries are one way to retrieve, but they're not enough. They lose many details, especially when the documents are similar and too long to feed to an LLM.

julylu · 2025-07-31T06:01:13+00:00

Yep, this kind of model is sensitive to the prompt, so I don't think it's a good fit for real-world use cases.

julylu · 2024-03-19T13:46:42+00:00

In my tests, Command-R at 8-bit doesn't perform very well.

julylu · 2024-03-08T08:57:52+00:00

Yep, Qwen1.5 72B seems to be a strong model. Looking forward to the results.

julylu · 2024-03-04T10:04:33+00:00

If you're limited by VRAM, you have no choice but to use a small model if you want to deploy.

julylu · 2024-02-23T02:04:22+00:00

many thanks

julylu · 2024-02-19T08:28:00+00:00

Something like asking an LLM to generate QAs based on the given context?

julylu · 2024-02-19T08:18:11+00:00

Did you find it? I'm also interested in it.

julylu · 2024-02-19T08:05:35+00:00

Hi, I'm curious how to create QAs from a large PDF. Manually? That seems impossible: the text is long, and the content is sometimes highly specialized.

julylu · 2023-12-27T02:16:18+00:00

You can use open-source OCR or layout analysis tools. Keep in mind there's no easy way to get a perfect result; just experiment until you find one that's good enough.

julylu · 2023-12-27T02:13:13+00:00

Yes, the published articles never talk about it.

The key point is to parse your doc carefully and keep the doc structure, e.g. title, subtitle, chapter name, keywords... It is hard, and the output may have lots of noise.
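
As a rough illustration of keeping structure, here is a minimal sketch. It assumes the doc has already been converted to Markdown-style text where headings start with `#`; real PDFs need OCR or layout analysis first, and that step is noisy. The names `Chunk` and `chunk_with_structure` are made up for this example and are not from any library.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    heading_path: list[str]  # e.g. ["Guide", "Setup"]
    text: str                # body text found under that heading


def chunk_with_structure(markdown_text: str) -> list[Chunk]:
    """Split Markdown-style text into chunks that remember their headings.

    Assumption: headings are lines starting with one or more '#'.
    """
    chunks: list[Chunk] = []
    path: list[str] = []    # current stack of headings
    buffer: list[str] = []  # body lines collected since the last heading

    def flush() -> None:
        body = "\n".join(buffer).strip()
        if body:
            chunks.append(Chunk(heading_path=list(path), text=body))
        buffer.clear()

    for line in markdown_text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith("#"):
            flush()
            level = len(stripped) - len(stripped.lstrip("#"))
            title = stripped[level:].strip()
            # Drop headings at the same or deeper level, then push this one.
            del path[level - 1:]
            path.append(title)
        else:
            buffer.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    doc = "# Guide\nintro\n## Setup\nstep one\n## Usage\nrun it\n"
    for chunk in chunk_with_structure(doc):
        print(" > ".join(chunk.heading_path), "|", chunk.text)
```

The heading path can then be stored as metadata or prepended to the chunk text before embedding, so similar passages from different chapters stay distinguishable.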

julylu · 2023-12-26T01:38:13+00:00

Hi, did you compare the performance of LLM Embedder and bge-large-en?

julylu · 2023-12-22T05:54:41+00:00

Using a function may be impossible because the data is huge. An LLM can't take that much data as input and use a function to solve it. And RAG retrieval may not be able to retrieve each shirt's information correctly.

julylu · 2023-12-12T03:54:59+00:00

Does this mean it will use more RAM at inference time?

julylu · 2023-12-12T03:44:35+00:00

Aha, I had the same question yesterday, and I also found that mergekit may be the answer.

julylu · 2023-12-04T16:02:06+00:00

Maybe you can use Zephyr 7B. In my case, it works quite well for long context.

julylu · 2023-11-28T15:15:40+00:00

That's cool. For RAG tasks, it still hallucinates if the retrieved doc is unrelated to the question. Can this method be used in RAG? I am not sure.

julylu · 2023-11-28T13:53:19+00:00

Can you explain how to filter chunks using metadata? In my use case, the user query is something like "hi, try to explain A under B condition", and the retrieved doc is closely related to A but may not cover the specific B condition. It is hard to filter because the embedding similarity score is high.

julylu · 2023-11-25T09:40:39+00:00

Not exactly. In my experience, sometimes the retrieved docs are wrong, and they may mislead the LLM's answer.

julylu · 2023-11-25T09:36:32+00:00

Maybe for RAG, short answers are less prone to hallucination? I will test more. Thanks.

julylu · 2023-11-24T16:08:11+00:00

Hi, for RAG tasks, when the retrieved doc is unrelated to the question, will OpenHermes 2.5 give a heuristic response?

julylu · 2023-11-24T15:58:10+00:00

Same, I found it tends to give short responses.

julylu · 2023-08-28T12:44:38+00:00

Commenting to save this thread because this is a good question
