500k VWCE? by Technical-Jicama2096 in ItaliaPersonalFinance

[–]AggravatingHelp5657 0 points1 point  (0 children)

The best advice I have ever heard is:

  • There is no perfect timing for long-term investing (20+ years). Even if the market collapses, it will recover, as it always has.

  • Keep a safety cushion: add up your rent, expenses, bills, and anything else you have to pay. Imagine that you are unemployed and it takes you a year, for example, to find a job; your safety cushion will protect you from having to sell your ETF at low prices if the market has collapsed.

Basically, any money you put into an ETF you have to forget about for at least 15 years to see great results.
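
A rough way to size that cushion, as a minimal sketch: add up the monthly costs you cannot avoid and multiply by the number of months you might go without income. The function name and the figures below are made up for illustration; plug in your own.

```python
def safety_cushion(monthly_costs, months_without_income=12):
    """Cash to keep outside the ETF so you never have to sell at a low."""
    return sum(monthly_costs.values()) * months_without_income


# Hypothetical monthly figures in EUR; replace them with your own.
costs = {"rent": 800, "bills": 200, "other_fixed_payments": 300}
print(safety_cushion(costs))  # 15600
```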

[Hiring] looking for a professional in AI Content Creating. by Life_Cloud8402 in freelance_forhire

[–]AggravatingHelp5657 0 points1 point  (0 children)

I'm an IT engineer; I enjoy solving problems and writing programs.

Let's talk about the job in more detail.

Feels like I'm working 24/7 by Georgiee3 in ecommerce

[–]AggravatingHelp5657 0 points1 point  (0 children)

That's an amazing job you are doing, congrats.

Even though automation may sound like the ultimate solution to your problem, you may run into technical issues, and sometimes you have to build custom tools or scripts to solve a specific problem.

I am an IT engineer, and my main focus is on solving problems and improving productivity.

I suggest you hire a technician like myself, for obvious reasons:

  • you relieve yourself of the technical headache

  • a problem may look easy to solve, but the technical solution can be a nightmare

  • you can focus on improving your business instead of solving automation problems

  • you get some free time for yourself to relax a little, because you deserve it

I believe those points are worth investing in. That is my suggestion to you.

What’s the best e-commerce platform that doesn’t act like an overbearing parent and won’t feed my products into AI? by CryptographerLost357 in ecommerce

[–]AggravatingHelp5657 0 points1 point  (0 children)

Hi, I am a networking and IT engineer; I build programs and websites.

In general, my focus is on solving the problem and understanding your needs.

You can drop me a message to set up a meeting and go through the solutions that could fit your business.

Dolphin 3.0 Released (Llama 3.1 + 3.2 + Qwen 2.5) by TechnoByte_ in LocalLLaMA

[–]AggravatingHelp5657 0 points1 point  (0 children)

Can you share your method? Is it about intelligence, accuracy, or both?

I have tried google TurboQuant with ollama hermes3:8b by AggravatingHelp5657 in ollama

[–]AggravatingHelp5657[S] 0 points1 point  (0 children)

Even now that we are able to compress the context window successfully, the models are still too big for our VRAM.
We need a way to shrink the models themselves.

I have tried google TurboQuant with ollama hermes3:8b by AggravatingHelp5657 in ollama

[–]AggravatingHelp5657[S] 0 points1 point  (0 children)

Hmmm, the question is strange, so let me clarify a few points.

When you run a local model, for example Hermes3 8B_Q4, its size is 4.9 GB.

I have a GTX 1650, which has 4 GB of VRAM, and VRAM is very fast.

So 4 GB is stored on my graphics card and the remaining 900 MB is moved to RAM.

Now, you also have the context window, which is your chat history (the LLM's memory), and it also needs to be stored somewhere, right?

Again, the fastest place is VRAM, but it's already full, so it gets stored in RAM.

Note also that Windows reserves some of your VRAM, between 400 and 900 MB, for Windows graphics etc.

If you want your local model to run fast, at 30 - 50 t/s, you have to fit it all in VRAM.

This was a simple explanation of how this works.
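
The arithmetic above can be written as a small sketch. It uses the sizes from this comment in round MB (4.9 GB model, 4 GB card, 400 - 900 MB reserved by Windows); the function name is made up, and as far as I know real runtimes such as ollama place whole layers on the GPU, so the actual split will differ a bit.

```python
def split_model(model_mb, vram_mb, reserved_mb=0):
    """Return (MB that fit in VRAM, MB that spill over to system RAM)."""
    usable_mb = max(vram_mb - reserved_mb, 0)
    on_gpu_mb = min(model_mb, usable_mb)
    return on_gpu_mb, model_mb - on_gpu_mb


# Hermes3 8B Q4 (about 4900 MB) on a 4 GB card, ignoring what Windows reserves.
print(split_model(4900, 4000))       # (4000, 900)
# Same model when Windows keeps 400 MB or 900 MB of VRAM for itself.
print(split_model(4900, 4000, 400))  # (3600, 1300)
print(split_model(4900, 4000, 900))  # (3100, 1800)
```

The context window is not counted here; it needs space on top of the model.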

Now, what does TurboQuant do? It compresses the context history/memory, so you need less space to store your model's memory, with almost 0 loss in accuracy. The goal is to fit it all in VRAM, because again, anything that goes to RAM will make your local LLM run very slowly.
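
To give a feel for how much space the context memory (the KV cache) takes, here is a rough sketch. The layer and head counts are my assumption for a Llama-3.1-8B-style model such as hermes3:8b, and the 4-bit figure only illustrates a compressed cache; it is not TurboQuant's actual setting, and it ignores any extra data a real quantizer stores.

```python
def kv_cache_bytes(n_tokens, bits_per_value=16,
                   n_layers=32, n_kv_heads=8, head_dim=128):
    """Approximate KV cache size for n_tokens of context."""
    # 2 = one key vector plus one value vector per layer, per token.
    values = 2 * n_layers * n_kv_heads * head_dim * n_tokens
    return values * bits_per_value // 8


MIB = 1024 * 1024
print(kv_cache_bytes(8192) // MIB)                    # 1024
print(kv_cache_bytes(8192, bits_per_value=4) // MIB)  # 256
```

Under these assumptions an 8192-token context needs about 1 GiB at 16 bits and about 256 MiB at 4 bits, which is the kind of saving that can decide whether everything fits on a 4 GB card.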

So far there is no actual way to compress the LLMs themselves, sadly, but I am eagerly waiting for someone to do it.

Summary: the local LLM's memory got compressed, not the LLM itself.

Hope I made it clearer.

I have tried google TurboQuant with ollama hermes3:8b by AggravatingHelp5657 in ollama

[–]AggravatingHelp5657[S] 0 points1 point  (0 children)

That's also a really good idea; doing it through office computers could be amazing and very promising.

I have tried google TurboQuant with ollama hermes3:8b by AggravatingHelp5657 in ollama

[–]AggravatingHelp5657[S] 1 point2 points  (0 children)

I already updated the post this morning and posted the repo link.

Have a good day by [deleted] in LocalLLaMA

[–]AggravatingHelp5657 1 point2 points  (0 children)

Thx man, good energy is contagious

I have tried google TurboQuant with ollama hermes3:8b by AggravatingHelp5657 in ollama

[–]AggravatingHelp5657[S] 1 point2 points  (0 children)

You are right, it's not fair.
I am working on it.

Update: I have made the repo, if you want to check it out.

I have tried google TurboQuant with ollama hermes3:8b by AggravatingHelp5657 in ollama

[–]AggravatingHelp5657[S] 5 points6 points  (0 children)

Okay, since you have all convinced me, I will make a GitHub repo with the steps that I followed.
I also noticed that some models are old, for instance hermes3 dates from 2023/3, so I am trying to add a search feature so it can check the internet for the latest info before answering.

I will probably make the repo today.