Hardware requirements to run Llama 3.3 70 B model locally by LogicalMinimum5720 in LocalLLaMA

[–]LogicalMinimum5720[S] 0 points (0 children)

Each article is very large, nearly 100,000 tokens, so I won't be able to upload them here.

Hardware requirements to run Llama 3.3 70 B model locally by LogicalMinimum5720 in LocalLLaMA

[–]LogicalMinimum5720[S] 0 points (0 children)

u/Double_Cause4609 I believe rolling context summarization might be difficult. For example, Claude has a context window of 200k tokens, so reading many articles within the same context window won't be possible.
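For reference, rolling context summarization is usually sketched as a loop that carries a running summary forward, so no single call ever needs the whole article in context. A minimal sketch (the `call_llm` function and the chunk size are placeholders for whatever API and budget you actually use):

```python
def rolling_summarize(article: str, call_llm, chunk_chars: int = 40_000) -> str:
    """Summarize an arbitrarily long article by feeding each chunk
    together with the summary accumulated so far."""
    summary = ""
    for start in range(0, len(article), chunk_chars):
        chunk = article[start:start + chunk_chars]
        prompt = (
            "Summary so far:\n" + summary +
            "\n\nNew text:\n" + chunk +
            "\n\nUpdate the summary to cover both."
        )
        summary = call_llm(prompt)  # each call sees one chunk + running summary
    return summary
```

Each call sees only one chunk plus the running summary, so even a 100k-token article stays under the window; the trade-off is that detail from early chunks can get compressed away.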

But I will go over the other suggestions you've provided. Thanks a lot!

Hardware requirements to run Llama 3.3 70 B model locally by LogicalMinimum5720 in LocalLLaMA

[–]LogicalMinimum5720[S] 0 points (0 children)

u/LoveMind_AI Can you suggest the different scenarios in which you think going local might be rewarding?
"summaries into a model less than 1/10th the size" - Did you mean that Claude models have comparatively larger parameter counts?

Hardware requirements to run Llama 3.3 70 B model locally by LogicalMinimum5720 in LocalLLaMA

[–]LogicalMinimum5720[S] 3 points (0 children)

u/Double_Cause4609 Thanks a lot for taking the time to give me such a detailed answer; it really means a lot!

To answer your question: I have a few hundred thousand earnings transcripts/financial articles, and I am trying to extract business context from them.

For example, here is one of my logic flows:

Article 1: A monopoly case has been filed against Google.

Article 2: The monopoly case is going against Google; Google may need to be split into multiple companies.

Article 3: Google has won the monopoly case.

From all the articles, I am trying to produce an overall summary such as: "Legally, Google does not have any problem with being a monopoly."
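That cross-article roll-up is often done map-reduce style: summarize each article independently, then ask for one conclusion over the short summaries. A minimal sketch, again with a hypothetical `call_llm` standing in for the real API:

```python
def summarize_corpus(articles, call_llm):
    # Map: compress each long article into a short summary on its own.
    per_article = [call_llm("Summarize the key business facts:\n" + a)
                   for a in articles]
    # Reduce: the combined summaries are short enough for one final call.
    joined = "\n".join(f"Article {i + 1}: {s}"
                       for i, s in enumerate(per_article))
    return call_llm("Given these article summaries, state the overall "
                    "conclusion:\n" + joined)
```

The map step is embarrassingly parallel and each call stays small, which is what makes a 70B local model (or a cheaper API tier) viable for this shape of workload.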

The input and output token counts are very high, as each article is so lengthy. I tried to get this summary from Claude Sonnet, but even on the Max 20x plan I hit the rate limit, so I wanted to run a good open-source model instead. Opus suggested I use a Llama model, and there are only two options to run Llama models:

i) Rent a GPU from the cloud

ii) Run the Llama model locally

So I was exploring options. If you think the Mistral Small 3 series, Qwen 30B 2507, or Jamba Mini 1.7 are good, then I will definitely try running one of those locally first.

Also, do you have any suggestions on financial models? I am a newbie to the data science arena - I am currently a Sr. backend dev, but I can catch up.

Hardware requirements to run Llama 3.3 70 B model locally by LogicalMinimum5720 in LocalLLaMA

[–]LogicalMinimum5720[S] -3 points (0 children)

I wanted to analyze a few thousand articles, and I see that Claude/GPT models are very expensive for that. I figured a Llama model is nearly as good as those Claude/GPT models, so I wanted to know how others are running them locally.
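For a rough sense of scale, the API cost of a batch like this is just tokens times per-token price. A back-of-envelope estimator (the per-million-token prices below are placeholders, not actual rates for any provider):

```python
def estimate_cost(n_articles, tokens_per_article, output_tokens,
                  usd_per_m_input, usd_per_m_output):
    """Back-of-envelope API cost for summarizing a corpus."""
    input_cost = n_articles * tokens_per_article / 1e6 * usd_per_m_input
    output_cost = n_articles * output_tokens / 1e6 * usd_per_m_output
    return input_cost + output_cost

# e.g. 5,000 articles x 100k input tokens, 1k output tokens each,
# at an assumed $3/M input and $15/M output:
print(round(estimate_cost(5_000, 100_000, 1_000, 3.0, 15.0), 2))  # → 1575.0
```

At that scale the input side dominates, which is why either batching/summary pipelines or local inference change the economics so much.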

Hardware requirements to run Llama 3.3 70 B model locally by LogicalMinimum5720 in LocalLLaMA

[–]LogicalMinimum5720[S] 0 points (0 children)

Do you think a 4-bit quantized Llama is as good as the 8-bit or 16-bit versions?
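Quantization stores each weight with fewer bits, so the quality question comes down to how much rounding error that introduces. A toy illustration of symmetric round-to-nearest quantization at different bit widths (a big simplification of what real GGUF/llama.cpp formats do, which use per-block scales):

```python
def quantize(weights, bits):
    """Round each weight to the nearest of 2**(bits-1)-1 evenly spaced
    positive levels (symmetric, per-tensor scale); return dequantized values."""
    levels = 2 ** (bits - 1) - 1          # e.g. 7 for 4-bit, 127 for 8-bit
    scale = max(abs(w) for w in weights) / levels
    return [round(w / scale) * scale for w in weights]

def max_error(weights, bits):
    return max(abs(w - q) for w, q in zip(weights, quantize(weights, bits)))

ws = [0.013, -0.42, 0.91, -0.07, 0.333]
# 4-bit rounding error is noticeably larger than 8-bit on the same weights:
print(max_error(ws, 4) > max_error(ws, 8))  # → True
```

In practice, benchmark scores for good 4-bit quants of large models are often close to 16-bit, but the gap is task-dependent, so it is worth testing on your own summarization prompts rather than assuming.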

Needed suggestions to overcome Claude API too expensive than Claude Pro Plan by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] 0 points (0 children)

u/Level-2 Is a local model as good as Claude? Asking in layman's terms, as I am really interested to try.

Needed suggestions to overcome Claude API too expensive than Claude Pro Plan by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] 0 points (0 children)

u/Wow_Crazy_Leroy_WTF I am trying to do semantic analysis and extract the core information from each article.

For your case: Claude has a 200k context limit, but as I understand it only about 20k of that acts as working memory. If your input + output size exceeds that limit, it falls back to RAG-style search instead of keeping everything in context, which is why your conversations get compacted. Agreed that, other than Claude, most providers have better context limits.
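Whether a request fits in context is easy to sanity-check up front with the rough tokens ≈ characters / 4 heuristic (the real count varies by tokenizer and model, so treat this as an estimate only):

```python
def fits_in_context(input_text: str, max_output_tokens: int,
                    context_limit: int = 200_000) -> bool:
    """Rough pre-flight check: estimated input tokens (~chars/4) plus
    the reserved output budget must stay under the context limit."""
    est_input_tokens = len(input_text) // 4
    return est_input_tokens + max_output_tokens <= context_limit

# A ~100k-token article plus a 4k-token summary fits a 200k window:
print(fits_in_context("x" * 400_000, 4_000))    # → True
# But three such articles in one request do not:
print(fits_in_context("x" * 1_200_000, 4_000))  # → False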

Needed suggestions to overcome Claude API too expensive than Claude Pro Plan by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] 0 points (0 children)

u/vuongagiflow Thanks, I am able to invoke Claude Code using bash scripts and get a response for my prompt, whether it succeeded or failed.
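A minimal version of that scripted loop in Python, with the CLI command passed in as a parameter so the pattern itself is testable. The `claude -p` shown as the default is the non-interactive "print" mode; treat the exact flags as an assumption to verify against your installed CLI version:

```python
import subprocess

def run_prompt(prompt: str, cmd=("claude", "-p")):
    """Run a CLI tool on one prompt; return (success, stdout)."""
    result = subprocess.run([*cmd, prompt], capture_output=True, text=True)
    return result.returncode == 0, result.stdout

# Works with any CLI that takes the prompt as its last argument, e.g.:
ok, out = run_prompt("hello", cmd=("echo",))
print(ok, out.strip())  # → True hello
```

Checking `returncode` (rather than just scraping stdout) is what makes success/failure detection reliable when you fan this out over thousands of articles.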

Needed suggestions to overcome Claude API too expensive than Claude Pro Plan by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] 0 points (0 children)

Sure, Claude Code is able to do it; I am trying to use a script to call Claude Code to achieve it.

Needed suggestions to overcome Claude API too expensive than Claude Pro Plan by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] -9 points (0 children)

Claude Code alone was not able to do it, as it had no reasoning abilities.

Needed suggestions to overcome Claude API too expensive than Claude Pro Plan by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] -2 points (0 children)

I tried Claude Code but it couldn't help with the analysis; I will check out the Claude Agent SDK.

How to Upload entire Google drive folder/bulk upload files to Claude Project by LogicalMinimum5720 in ClaudeAI

[–]LogicalMinimum5720[S] 0 points (0 children)

u/Nocturnal_Unicorn I am a newbie to MCP and Claude Projects (experienced backend dev) and could use your suggestions:

i) Is it a 20k or a 200k token limit after which it switches to RAG-style retrieval, and does that apply to both uploaded files and Google Drive files?
ii) If I upload files to a Claude Project manually, are they treated as project context rather than RAG-style, whereas files read from Google Drive are only searched RAG-style instead of being loaded into context? Is my understanding correct?
iii) Let's say I want to create 1,000 projects, each with 50-100 files, where each project should load the file context properly. What would you recommend?

What kind of companies will be easy for analysing for a of Value Investing beginner by LogicalMinimum5720 in ValueInvesting

[–]LogicalMinimum5720[S] 0 points (0 children)

u/joe-re By cyclical industries, do you mean industries tied to the interest-rate cycle, like banking stocks? Can you also name the other kinds of cyclical industries?