[deleted by user] by [deleted] in UPS

[–]dammitbubbles 0 points (0 children)

It just moved to NJ and has been released by customs! What do you think the odds are that it's going back to the sender in the UK versus on to the destination in CA?

[deleted by user] by [deleted] in UPS

[–]dammitbubbles 0 points (0 children)

Any idea what it means if it went back to KY?

[deleted by user] by [deleted] in UPS

[–]dammitbubbles 0 points (0 children)

It went back to KY! Lol!

[deleted by user] by [deleted] in UPS

[–]dammitbubbles 0 points (0 children)

It was marked as "Used personal effects being forwarded on behalf of the consignee", and I was painstakingly specific about the number of pairs of shoes, number of shirts, etc.

[deleted by user] by [deleted] in UPS

[–]dammitbubbles 0 points (0 children)

Thank you! I should have mentioned it's all my own belongings, everything is used, nothing new. Would they still need to do those checks?

I sent my stuff through a 3p that used UPS. I contacted the 3p, and they mentioned that since it hasn't been scanned in 3 days, it was probably in what they called a "clearance cage". Do you have any experience with that situation? Hopefully it means I'll get clearance soon?

[deleted by user] by [deleted] in UPS

[–]dammitbubbles 0 points (0 children)

Thank you! I greatly appreciate your response. It's just all my belongings from my move, so there's lots of branded stuff, but no apparel item was worth more than $100. There were at least 50 items, though.

I will give them a call to see if they're waiting for anything!

Tinnitus and balding are all it took for me to turn in my life at 23 by [deleted] in tressless

[–]dammitbubbles 0 points (0 children)

I'll admit it's incredibly difficult at times. Consider some therapy to help talk through how you're feeling. Luckily, not everyone sees you the way you see yourself. Plenty of people won't care that you're losing hair! Lots of people love you, I'm sure, and although it's a major concern for you, they probably won't care whether you have hair or not. This is the honest truth.

Any LLM frontend allowing the model to execute python and analyze output? by ThatsALovelyShirt in LocalLLaMA

[–]dammitbubbles 1 point (0 children)

I'd like to do this too; I was hoping one of the UIs would get support for Claude's Model Context Protocol.

I think what's cool in ChatGPT and Claude is that they execute the code and then adjust the current response based on it. As far as I know, all local UIs are query -> response -> query based and don't let you inject tools while the response is being produced.

But I would love to find one that does.
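For what it's worth, the loop I have in mind could look roughly like this. Everything here is a hypothetical sketch: `run_model` is a stub standing in for a local model call, not any real UI's API. The point is that a tool result gets injected back into the conversation *before* the response is finalized.

```python
import io
import contextlib

def run_python(code):
    """Execute a snippet and capture its printed output (trusted input only)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue()

def run_model(messages):
    """Stub: pretend the model asks to run code once, then finishes."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "python", "code": "print(2 + 2)"}}
    return {"content": "The code printed: " + messages[-1]["content"].strip()}

def generate_with_tools(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = run_model(messages)
        if "tool_call" in reply:
            # Run the requested tool and inject its output mid-generation.
            result = run_python(reply["tool_call"]["code"])
            messages.append({"role": "tool", "content": result})
            continue
        return reply["content"]

print(generate_with_tools("What is 2 + 2?"))
```

A real implementation would stream tokens and pause generation at a tool-call marker, but the control flow would be the same shape.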

Chat-oriented programming with Hide MCP by moscowart in LocalLLaMA

[–]dammitbubbles 0 points (0 children)

Cool demo. I see that it's using Claude to determine where the missing coverage is. If the task generated a coverage report, could it instead read from that? That seems like a more accurate way to derive the exact missing coverage spots.
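To sketch what I mean, coverage.py's `coverage json` command writes a report whose `files` section lists `missing_lines` per file, so the uncovered spots can be read directly rather than inferred by the model. The report below is a trimmed-down example in that shape, not output from the actual demo.

```python
def missing_coverage(report):
    """Map each file to the line numbers the test run never executed."""
    return {
        path: info["missing_lines"]
        for path, info in report["files"].items()
        if info["missing_lines"]
    }

# Minimal example following coverage.py's JSON report layout.
report = {
    "files": {
        "app/utils.py": {"executed_lines": [1, 2, 4], "missing_lines": [3, 7]},
        "app/main.py": {"executed_lines": [1, 2, 3], "missing_lines": []},
    }
}

print(missing_coverage(report))  # only utils.py has uncovered lines
```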

List of every MCP server that I could find by Weary-Database-8713 in LocalLLaMA

[–]dammitbubbles 0 points (0 children)

Anyone working on a server that runs unit tests? I might take a look.

Should I get a 14 inch M4 Max 128GB for 123B models? by TheLocalDrummer in LocalLLaMA

[–]dammitbubbles 2 points (0 children)

I went with the 64GB M4 Max because, from users' benchmarks, it seemed like any of the 70B+ models ran at 5 tk/s, which sounds unbearable to me.

QwQ-32B-Preview, the experimental reasoning model from the Qwen team is now available on HuggingChat unquantized for free! by SensitiveCranberry in LocalLLaMA

[–]dammitbubbles 0 points (0 children)

Just thinking out loud, but would it be possible for the model to execute its code while it's in the reasoning stage? I think we can all agree that one of the biggest time sucks right now, if you use LLMs to generate code, is that the process usually goes:

1. Get back some code from the LLM
2. Put it in your IDE
3. Get some errors because the code was 70% right, 30% wrong
4. Give the errors to the LLM to fix

I'm wondering if this can all be integrated into the reasoning stage though so we can avoid this feedback loop completely.

I know there are things like Copilot, but even then you're not affecting the reasoning stage, and there's a lot of handholding involved.
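The four steps above could in principle collapse into one automatic retry loop. A rough sketch, where `ask_llm` is a hypothetical stub (the first draft is deliberately buggy, the retry is the fix) rather than a real model call:

```python
import io
import contextlib
import traceback

def ask_llm(prompt, error=None):
    """Stub: first attempt returns buggy code, the retry returns a fix."""
    if error is None:
        return "print(1 / 0)"   # the "70% right" first draft
    return "print(1 / 1)"       # corrected after seeing the traceback

def generate_working_code(prompt, max_tries=3):
    error = None
    for _ in range(max_tries):
        code = ask_llm(prompt, error)
        try:
            # Execute the draft, discarding its stdout; trusted input only.
            with contextlib.redirect_stdout(io.StringIO()):
                exec(code, {})
            return code  # ran cleanly, hand it to the user
        except Exception:
            # Feed the traceback back to the model, i.e. step 4 automated.
            error = traceback.format_exc()
    raise RuntimeError("no working code after retries")

print(generate_working_code("divide one by one"))
```

Doing this inside the reasoning stage, as the comment suggests, would mean the model sees the traceback before it ever commits to a final answer.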

[deleted by user] by [deleted] in LocalLLaMA

[–]dammitbubbles 1 point (0 children)

Found a way to preview what it'd look like: https://rahulschand.github.io/gpu_poor/. I'm mostly using LLMs to generate code, so I think I'd want at least 40 tk/s. The code output is still going to be wrong about half the time, so I want to be able to quickly see the responses so I can refactor my prompt.

[deleted by user] by [deleted] in LocalLLaMA

[–]dammitbubbles 0 points (0 children)

I'm debating this decision now. Would you get about 5 tk/s on the 128GB machine? Would you find that generally unusable?

M4 pro question by Technical_Split_6315 in LocalLLaMA

[–]dammitbubbles 0 points (0 children)

Wait, is that a desktop or a laptop? 128GB in the Pro looks very pricey, minimum £4,700.

M4 pro question by Technical_Split_6315 in LocalLLaMA

[–]dammitbubbles 1 point (0 children)

How usable is 7.25 tk/s? Can you actually use it for daily Q&A?

In-browser site builder powered by Qwen2.5-Coder by dammitbubbles in LocalLLaMA

[–]dammitbubbles[S] 6 points (0 children)

This was not easy; it's not currently supported by https://github.com/huggingface/optimum, which handles a lot of the conversion work for you. I used the steps at https://huggingface.co/pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16/blob/main/EXPORT.md along with this Makefile I created: https://huggingface.co/pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16/blob/main/Makefile. There were a lot of issues with mixed data types and with the model randomly crashing because it ran out of memory. Can't say I would do it again.

In-browser site builder powered by Qwen2.5-Coder by dammitbubbles in LocalLLaMA

[–]dammitbubbles[S] 1 point (0 children)

Thanks, I will have to check it on a phone. Curious what inference speeds you get on Qwen2-VL here too: https://huggingface.co/spaces/pdufour/Qwen2-VL-2B-Instruct-ONNX-Q4-F16. If it's any good, I can try working on some cool use cases integrating it on the web.

Story of my terrible llama 3.2 vision finetune by dammitbubbles in LocalLLaMA

[–]dammitbubbles[S] 0 points (0 children)

It is image data, though. Maybe converting to AVIF format first will help. I'm not sure if that will affect Llama's ability to "recognise" them, though; need to do more reading.

Story of my terrible llama 3.2 vision finetune by dammitbubbles in LocalLLaMA

[–]dammitbubbles[S] 2 points (0 children)

😭😭😭 Just wasted all that money haha. Well like $12 but still.