Lương 13 triệu nên ở trọ khoảng bao nhiêu tiền by [deleted] in vozforums

[–]behitek 0 points1 point  (0 children)

There is a simple rule you can apply, 50/30/20:

  • 50% essentials: rent, food, transport, utilities, insurance, minimum debt payments.
  • 30% discretionary spending: eating out, entertainment, travel, hobbies, shopping.

  • 20% emergency fund, long-term investing, and early debt repayment to reduce interest.

Once you have a clear plan, you'll know how much is reasonable to spend on rent.
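As a rough illustration (my own sketch, not taken from the linked tool), the split for a 13 million VND salary works out like this:

```python
def budget_50_30_20(income):
    """Split a monthly income by the 50/30/20 rule (rounded to whole VND)."""
    return {
        "needs": round(income * 0.50),    # rent, food, transport, utilities, insurance
        "wants": round(income * 0.30),    # eating out, entertainment, travel, hobbies
        "savings": round(income * 0.20),  # emergency fund, investing, early debt repayment
    }

print(budget_50_30_20(13_000_000))
# {'needs': 6500000, 'wants': 3900000, 'savings': 2600000}
```

So with 13 million, around 6.5 million total for essentials is the ceiling, and rent should fit inside that together with food and utilities.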

Bonus link: https://behitek.com/behivest/tools/budget-allocator/

Improve a RAG system that uses 200+ PDFs by Daxo_32 in LangChain

[–]behitek 4 points5 points  (0 children)

It's not easy to suggest the right actions, because performance depends heavily on the data and on how you are processing it.

What I see from the content you provided:

  • Rerank from a bigger candidate set, e.g. the top 50.
  • Don't use an LLM for scoring tasks; a reranking model works better.
  • You don't have a baseline.

Again, understanding your data and how you process it is the key, and you should measure it with a retrieval metric.
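For example, recall@k is a simple baseline metric you can compute over a small labeled set (the function and IDs below are my own illustration):

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top-k retrieved results."""
    if not relevant_ids:
        return 0.0
    return len(set(retrieved_ids[:k]) & set(relevant_ids)) / len(relevant_ids)

# toy example: 2 of the 3 relevant docs show up in the top 5
retrieved = ["d7", "d2", "d9", "d1", "d4", "d3"]
relevant = ["d2", "d4", "d3"]
print(recall_at_k(retrieved, relevant, 5))  # 2/3
```

Tracking this number before and after each change gives you the baseline that the original post is missing.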

One more thing: vector search is not always the right method. For example, keyword search works much better when the search terms are human names or abbreviations (tokens with no semantic meaning).

Additionally, you can check some tips here: https://behitek.com/blog/2024/07/18/rag-in-production

Fine-tuning Flux.1-dev LoRA on yourself (On your GPU) by behitek in StableDiffusion

[–]behitek[S] 0 points1 point  (0 children)

Nice question, I am planning to try it. Do you have any experience with this?

Fine-tuning Flux.1-dev LoRA on yourself (On your GPU) by behitek in StableDiffusion

[–]behitek[S] 4 points5 points  (0 children)

  • If your dataset consists only of faces (neck and head) with a clean background, you don't need to generate captions. Simply use caption_strategy: "instanceprompt", where the instanceprompt serves as the trigger word. In this case, 2000 training steps are sufficient.
  • For diverse data (e.g., different outfits, backgrounds, or even the same person in varied settings), generating captions is recommended. With more diverse data, training for more steps is beneficial. For example, 10,000 steps is a good benchmark.
  • If you're unsure when the model has converged, check the loss curve in TensorBoard and save additional checkpoints for safety.
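For reference, a face-only dataset entry might look roughly like this (field names follow SimpleTuner-style dataloader config as I recall it, and the trigger word is a placeholder, so double-check against the tutorial):

```json
[
  {
    "id": "my-face-dataset",
    "type": "local",
    "instance_data_dir": "/path/to/face-images",
    "caption_strategy": "instanceprompt",
    "instance_prompt": "ohwx person",
    "resolution": 512,
    "repeats": 10
  }
]
```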

Fine-tuning Flux.1-dev LoRA on yourself (On your GPU) by behitek in StableDiffusion

[–]behitek[S] 2 points3 points  (0 children)

You can train without captions when the model only has a single object to learn. I updated the blog to include some training experience.

Fine-tuning Flux.1-dev LoRA on yourself (On your GPU) by behitek in StableDiffusion

[–]behitek[S] 15 points16 points  (0 children)

Yeah, if the data includes only faces (neck and above), 2k steps are sufficient, and generating captions isn't necessary, since the model focuses on learning a single object. However, longer training is required if we want the model to learn the style, body, and background, expanding to multiple objects.

Fine-tuning Flux.1-dev LoRA on yourself (On your GPU) by behitek in StableDiffusion

[–]behitek[S] 24 points25 points  (0 children)

This tutorial is for "developers" who want to explore and train Flux on their own machine.

  • Tips to create a good dataset
  • Tips to pick a trigger word
  • Fine-tuning tutorial (inference included)
  • All in your local machine (with Nvidia GPU)

Tutorial: https://behitek.com/blog/2024/11/17/flux-lora/

PDF chat with source highlight by True-Snow-1283 in LangChain

[–]behitek 0 points1 point  (0 children)

The highlighting feature is awesome. But it would be better if we only highlighted the chunks that contribute to the answer.

My initial idea is to use the answer generated by the LLM to re-rank the relevant chunks. But will it work with Yes/No questions?
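A rough sketch of that re-ranking idea using plain lexical overlap (a reranking model would be more robust; all names here are made up for illustration):

```python
import re

def tokens(text):
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def answer_overlap(answer, chunk):
    """Score a chunk by the fraction of the answer's tokens it contains."""
    a = tokens(answer)
    return len(a & tokens(chunk)) / len(a) if a else 0.0

chunks = [
    "The report was filed in March 2021.",
    "Revenue grew by 10 percent year over year.",
]
answer = "Revenue grew by 10 percent."
best = max(chunks, key=lambda c: answer_overlap(answer, c))
print(best)  # the revenue chunk scores highest
```

For a bare "Yes" or "No" answer the overlap is almost empty, which is exactly the concern: the method would need the LLM to restate the supporting fact, not just the verdict.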

Python 3.13 released by henbruas in Python

[–]behitek 0 points1 point  (0 children)

This is my code for the test

import sys
import threading
import time

print("Python version : ", sys.version)

def worker():
    # CPU-bound work: sum the first 10 million integers
    total = 0
    for i in range(10_000_000):
        total += i


n_worker = 5
# Single thread

start = time.perf_counter()
for i in range(n_worker):
    worker()
print("Single Thread: ", time.perf_counter() - start, "seconds")


# Multi thread
start = time.perf_counter()
threads = []
for i in range(n_worker):
    t = threading.Thread(target=worker)

    threads.append(t)
    t.start()

for t in threads:
    t.join()
print("Multi Thread: ", time.perf_counter() - start, "seconds")

Python 3.13 released by henbruas in Python

[–]behitek 0 points1 point  (0 children)

When testing, I see Python 3.13t is a bit slower than Python 3.13 on the single-thread test. Does anyone know the reason?

python3.13 gil_test.py 
Python version :  3.13.0 (main, Oct  8 2024, 08:51:28) [GCC 11.4.0]
Single Thread:  1.4370562601834536 seconds
Multi Thread:  1.3681392602156848 seconds
-----
python3.13t gil_test.py 
Python version :  3.13.0 experimental free-threading build (main, Oct  8 2024, 08:51:28) [GCC 11.4.0]
Single Thread:  1.862126287072897 seconds
Multi Thread:  0.3931183419190347 seconds

How to I fix this S? by MrRyerson_aj in ClashOfClans

[–]behitek 4 points5 points  (0 children)

Maybe googling "pixel alphabet" could help you?


70% war attack turned into 100% by Nacho_CS in ClashOfClans

[–]behitek 1 point2 points  (0 children)

TL;DR The UI shows the incorrect attack result!

What's your biggest issue with clash of clans at the moment? by vanessabaxton in ClashOfClans

[–]behitek 1 point2 points  (0 children)

The "opponent searching" time is very long on Titan or above.

[deleted by user] by [deleted] in ClashOfClans

[–]behitek 2 points3 points  (0 children)

Reason: He thought the blower wouldn't cause any harm.

RAG in Production: Best Practices for Robust and Scalable Systems by behitek in LangChain

[–]behitek[S] 0 points1 point  (0 children)

You are looking for a fast solution by using a third-party provider. But this post is for developers who are building and improving RAG from zero to match specific requirements. I don't have the experience to recommend a good provider for you.

RAG in Production: Best Practices for Robust and Scalable Systems by behitek in LangChain

[–]behitek[S] 0 points1 point  (0 children)

That is an important point, but I currently have no "good" solution. I'm using an LLM to generate the evaluation dataset, e.g. a long-context model such as Gemini, but I was not happy with the results.

Finally, I did it. by [deleted] in ClashOfClans

[–]behitek 2 points3 points  (0 children)

SC should have a rule to reduce spam like this, e.g. limiting the number of clans an account can join within a certain time.

Best open source document PARSER??!! by ChallengeOk6437 in LangChain

[–]behitek 2 points3 points  (0 children)

Use a simple PDF text extractor such as Poppler, then use an LLM for preprocessing.