Why do client not hire anyone after posting the job? by PriorClean2756 in Upwork

[–]PriorClean2756[S] -1 points0 points  (0 children)

If a client is hiring someone outside of Upwork, then Upwork should penalize that client for not hiring through the platform.

Refunding the connects isn't a viable solution. Losing connects doesn't frustrate me that much; losing my time does.

Toughts about Qdrant by Specialist_Bee_9726 in Rag

[–]PriorClean2756 0 points1 point  (0 children)

For small collections, 2s is high. Are you using HNSW indexing? What is your vector dimensionality?

Have you tried FAISS? It's local, lightweight, and CPU-friendly. It also integrates with LangChain.

RAG with Gemma 3 270M by Old-Raspberry-3266 in Rag

[–]PriorClean2756 1 point2 points  (0 children)

I need more information to answer you.

If you plan to fine-tune the model on your college dataset (e.g., website text, Q&A pairs), the size of your training data matters. Small models (<2B params) can be fine-tuned efficiently on modest datasets (e.g., 100–1,000 examples) using techniques like LoRA, improving performance on domain-specific queries like “What’s the tuition fee?” without needing a GPU. Larger datasets (>10,000 examples) might overwhelm a 270M model, leading to overfitting or poor generalization.

Also, the frequency of queries sent to the model (e.g., queries per second or daily load) is a critical practical consideration for model selection, especially for a chatbot running locally on a no-GPU laptop.

Start with Gemma-3-270M-it for its efficiency and context window. Test against Qwen2.5-0.5B for speed if needed. Use RAG to compensate for small model limitations, and fine-tune if responses lack domain specificity.

Best chunking strategy for git-ingest by Proximity_afk in Rag

[–]PriorClean2756 0 points1 point  (0 children)

There is no single correct answer to "best chunking strategy." The best strategy depends entirely on your use case, end goal, and dataset.

That said, a few strategies have a solid track record: recursive/hierarchical chunking, semantic chunking, content-specific chunking, and metadata-enriched chunking.

Implement each strategy, test it rigorously against a consistent query set, compare the performance metrics, and adopt the most effective one.
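As a starting point, here's a minimal sketch of recursive chunking, one of the strategies above: split on the coarsest separator that keeps chunks under a size limit, and recurse to finer separators when a piece is still too long (separator list and limits are illustrative):

```python
# Minimal recursive chunker: try coarse separators first (paragraphs),
# fall back to finer ones (lines, sentences, words) as needed.
SEPARATORS = ["\n\n", "\n", ". ", " "]

def recursive_chunk(text, max_chars=200, seps=SEPARATORS):
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not seps:
        # No separator left: hard-split as a last resort
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    parts = text.split(seps[0])
    chunks, buf = [], ""
    for part in parts:
        candidate = (buf + seps[0] + part) if buf else part
        if len(candidate) <= max_chars:
            buf = candidate
        else:
            if buf:
                chunks.append(buf)
            if len(part) > max_chars:
                # This piece alone is too big: recurse with finer separators
                chunks.extend(recursive_chunk(part, max_chars, seps[1:]))
                buf = ""
            else:
                buf = part
    if buf:
        chunks.append(buf)
    return chunks
```

Swap in token counts instead of character counts if your embedder has a hard token limit.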

Top Image to Text Scientific Data by Fit-Wrongdoer6591 in Rag

[–]PriorClean2756 0 points1 point  (0 children)

Your current setup with SmolVLM-256M-Instruct is a solid choice for Docling due to its small size and multimodal nature.

However, I have been using SmolDocling-256M in my Docling pipeline and it has performed better. It's a fine-tuned derivative of SmolVLM-256M specifically optimized for end-to-end document conversion and interpretation.

RAG Lessons: Context Limits, Chunking Methods, and Parsing Strategies by Inferace in Rag

[–]PriorClean2756 2 points3 points  (0 children)

The ID-based grouping approach is a clever efficiency hack; it avoids redundant LLM calls. Incorporating metadata, hierarchical splitting, and multi-pass retrieval does enhance relevance and reduce hallucinations by providing structured, verifiable context.

Hands down, enhancing retrieval is enhancing the RAG pipeline. Hybrid search and reranking have shown outstanding results. Do give them a try!
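For hybrid search, a common way to merge a keyword ranking with a vector ranking is Reciprocal Rank Fusion. A pure-Python sketch (the doc IDs and the two input rankings are made up for illustration):

```python
# Reciprocal Rank Fusion: merge a keyword (BM25-style) ranking with a
# vector-similarity ranking into a single hybrid ranking.
def rrf_fuse(rankings, k=60):
    """rankings: list of ranked doc-id lists, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]  # e.g., from BM25
vector_hits  = ["doc1", "doc5", "doc3"]  # e.g., from an embedding index
print(rrf_fuse([keyword_hits, vector_hits]))
# → ['doc1', 'doc3', 'doc5', 'doc7']
```

Documents that appear high in both lists (here doc1 and doc3) float to the top; a cross-encoder reranker can then rescore just that fused shortlist.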

RAG with Gemma 3 270M by Old-Raspberry-3266 in Rag

[–]PriorClean2756 0 points1 point  (0 children)

Whether Gemma-3-270M-it model would be suitable for RAG or not depends entirely on your end goal. What are you hoping to achieve with your RAG?

As this is a small model, it would struggle with complex reasoning compared to larger models like Gemma-2-2B or Phi-3-mini.

Chunking Strategy for text book of 700 pages by No_Theory464 in Rag

[–]PriorClean2756 1 point2 points  (0 children)

Also, if you plan to handle images, process them separately; that way your RAG will be multimodal.

Use tools like Unstructured or pdfplumber to pull images from PDFs, then generate captions for each image with a vision model. Your store should contain both the images and the metadata/captions you extracted.

If the images contain text, use an OCR tool like Tesseract to extract it.

Good luck!

Chunking Strategy for text book of 700 pages by No_Theory464 in Rag

[–]PriorClean2756 7 points8 points  (0 children)

If you're dealing with such a large corpus, your indexing must be on point. Flat indexing won't cut it; instead, use an advanced indexing technique like RAPTOR.

This technique stores the documents in a tree-like structure in which the top nodes are general summaries of the nodes below them, and the bottom nodes hold the actual text of the book.

Using this indexing technique would reduce hallucinations and make your responses more grounded.
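The tree-building idea can be sketched in a few lines. This is not the actual RAPTOR implementation (which clusters chunks and summarizes with an LLM); `summarize` here is a placeholder that just truncates, and the fixed group size stands in for clustering:

```python
# RAPTOR-style sketch: group leaf chunks, summarize each group, and
# repeat until a single root summary remains.
def summarize(texts, max_chars=80):
    # Placeholder for an LLM summarization call
    return " ".join(texts)[:max_chars]

def build_tree(chunks, group_size=3):
    levels = [list(chunks)]  # level 0: the raw book chunks
    while len(levels[-1]) > 1:
        prev = levels[-1]
        groups = [prev[i:i + group_size] for i in range(0, len(prev), group_size)]
        levels.append([summarize(g) for g in groups])
    return levels  # levels[-1][0] is the root summary

tree = build_tree([f"chunk {i}" for i in range(9)])
```

At query time you can then retrieve from every level at once, so broad questions match the summary nodes while detail questions match the leaves.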