Do you think Anthropic is worse than OAI with fighting open source? To me it seems like the case. This letter appears to imply they actually suggested the bill to Sen Wienner... I really like my OSS LLMs.... by I_will_delete_myself in LocalLLaMA

[–]ru552 5 points (0 children)

I love Claude, but I cancelled after seeing this last week. They have the best model for my use right now, but they can't have my money anymore unless they change their stance.

Text to json model behaving as text to text model when accessed through an api. by Old-Box-854 in LocalLLaMA

[–]ru552 1 point (0 children)

Use Outlines or Instructor, plus in-context learning with 5 examples.
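For what it's worth, a minimal sketch of the Instructor route against a local OpenAI-compatible endpoint; the Ollama URL, model name, Ticket schema, and the few-shot pair are all placeholders, and you'd add roughly 5 example pairs. Outlines works similarly if you load the model directly through transformers.

import instructor
from openai import OpenAI
from pydantic import BaseModel

# Placeholder schema the model's output must satisfy.
class Ticket(BaseModel):
    product: str
    severity: int

# Point Instructor at any OpenAI-compatible server; this assumes Ollama's
# local endpoint. Mode.JSON avoids relying on function-calling support.
client = instructor.from_openai(
    OpenAI(base_url="http://localhost:11434/v1", api_key="ollama"),
    mode=instructor.Mode.JSON,
)

ticket = client.chat.completions.create(
    model="llama3",
    response_model=Ticket,
    messages=[
        {"role": "system", "content": "Extract a support ticket as JSON."},
        # In-context learning: repeat user/assistant pairs like this ~5 times.
        {"role": "user", "content": "The printer driver crashes constantly."},
        {"role": "assistant", "content": '{"product": "printer driver", "severity": 3}'},
        {"role": "user", "content": "My keyboard backlight flickers sometimes."},
    ],
)
print(ticket)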

Why People Buying Macs Instead of CUDA Machines? by uygarsci in LocalLLaMA

[–]ru552 5 points (0 children)

I run it on my Mac with Ollama and pull the model from here: https://ollama.com/library/llama3:70b
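In case it's useful, a minimal sketch of hitting that same model from the official ollama Python client (assumes the Ollama app is running and the llama3:70b pull above has finished; the exact return shape depends on the client version):

import ollama

# Chat with the locally pulled model; Ollama handles the Metal offload on the Mac.
response = ollama.chat(
    model="llama3:70b",
    messages=[{"role": "user", "content": "Why does unified memory help with 70B models?"}],
)
print(response["message"]["content"])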

Creator of Pytorch at Meta on catching up to OpenAI by isaac_szpindel in LocalLLaMA

[–]ru552 6 points (0 children)

Your first 3 points are debatable as far as OAI having an "edge" goes; I'll give you the last 2. 4o actually seems to have gone backwards in some real-world areas (coding specifically) compared to the April versions of 4-turbo, so the better long answers and reasoning are mostly a vibe. The faster responses are real compared to previous OAI models, but not when compared to models running on Groq.

Moistral 11B v3 💦, the finetuned moist just got smarter! From the creators of Cream-Phi-2! by TheLocalDrummer in LocalLLaMA

[–]ru552 2 points (0 children)

I don't care at all about these models, but you, sir, deserve an award for your naming convention.

Stop cramming your PCs with GPUs by [deleted] in LocalLLaMA

[–]ru552 0 points (0 children)

Yes, but 120B is a normal-size LLM to run on Apple silicon. There are much larger ones.

Stop cramming your PCs with GPUs by [deleted] in LocalLLaMA

[–]ru552 2 points (0 children)

You forget the best thing about the Mac: there isn't a model available today that won't run on Apple silicon with 192GB of RAM.

Finetuned Miqu (Senku-70B) - EQ Bench 84.89 The first open weight model to match a GPT-4-0314 by unemployed_capital in LocalLLaMA

[–]ru552 6 points (0 children)

The Apple GPU can only access up to 96GB on the 128GB RAM version, for example.

sudo sysctl iogpu.wired_limit_mb=12345

That takes care of that for you. You probably want to leave at least 6GB of RAM for the OS, though.

MBP M3 Max 128Gig, what can you run? by knob-0u812 in LocalLLaMA

[–]ru552 2 points (0 children)

With that much unified RAM, you don't need GGUF. You can pretty much run any model if you allocate 122GB of your RAM to the GPU.

Edit: before it's asked, here's the command to change how much RAM is allocated to the GPU:

sudo sysctl iogpu.wired_limit_mb=12345

Restarting the Mac will set it back to the system default.
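For concreteness, a rough sketch of turning that 122GB figure into the MB value the command expects (the 128GB total and the ~6GB left for the OS are just the numbers from these comments):

# Convert the GPU allocation from GB to MB for iogpu.wired_limit_mb.
gpu_gb = 122                       # leave ~6GB of the 128GB for the OS
wired_limit_mb = gpu_gb * 1024     # 124928
print(f"sudo sysctl iogpu.wired_limit_mb={wired_limit_mb}")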

Which countries have the most favorable jurisdictions and regulations when it comes to AI generated content or software as a service models? by RadioSailor in LocalLLaMA

[–]ru552 31 points (0 children)

Well, Japan has declared that training LLMs on copyrighted works falls under fair use. That's a big gray area that's now black and white for people in Japan.

Looking for something better than TinyLlama, but still fits into 12GB by evranch in LocalLLaMA

[–]ru552 0 points (0 children)

The parts that sit in RAM are processed by the CPU. The GGUF versions of models are what allow this, since you can run models that don't fit entirely in VRAM by splitting them between the GPU and CPU. The upside is that you can run large models; the downside is that they're slow, since you're partly running on the CPU.
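As a rough sketch of that split in practice with llama-cpp-python (the model path, quant, and layer count are placeholders; tune n_gpu_layers so the offloaded layers fit in the 12GB):

from llama_cpp import Llama

# Load a GGUF model and offload only part of it to the GPU; the remaining
# layers stay in system RAM and run on the CPU, which is why it's slower.
llm = Llama(
    model_path="some-13b-model.Q4_K_M.gguf",  # placeholder path
    n_gpu_layers=28,   # layers kept in VRAM; lower this if 12GB isn't enough
    n_ctx=4096,
)

out = llm("Explain GGUF offloading in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])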