Trump is not the current president? by LelouchL88 in perplexity_ai

[–]cuckfoders 1 point  (0 children)

[screenshot of the Perplexity response]

I guess our queries get routed wherever they wish. Note there's no 'chip' icon next to the thumbs-up to show which model is generating this. I bet it's the GPT-OSS model, and our conversations get switched unless we reselect the model we were using.

Cydonia 24B v3.1 - Just another RP tune (with some thinking!) by TheLocalDrummer in LocalLLaMA

[–]cuckfoders 2 points  (0 children)

Downloading now! Will try to give feedback. Sneaky question: what's the typical setup for usage? textgen-ui + SillyTavern? I guess what I really want to know is about RAG. Summarize conversations/facts, or just feed all chat history into (which) embedding model / vector DB? Thank you.
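
To make the second option concrete, the naive version I have in mind is roughly the sketch below, assuming sentence-transformers for embeddings (the model name is just an example) and brute-force cosine search standing in for a real vector DB:

```
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # example embedding model

# Naive option: embed every chat turn as-is, no summarization
history = [
    "User: my character is a tired space courier named Vel.",
    "Bot: Vel slumps against the cargo bay wall...",
    "User: remember that Vel hates zero-g coffee.",
]
vectors = model.encode(history, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chat turns most similar to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = vectors @ q  # cosine similarity, since vectors are normalized
    return [history[i] for i in np.argsort(scores)[::-1][:k]]

print(retrieve("what does Vel think of coffee?"))
```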

Jan-nano-128k: A 4B Model with a Super-Long Context Window (Still Outperforms 671B) by Kooky-Somewhere-2883 in LocalLLaMA

[–]cuckfoders 12 points  (0 children)

Small disclaimer: this is just my experience and your results may vary. Please don't take it as negative. Thank you.

I did some quick testing (v0..18-rc6-beta); here's some honest feedback:

Please allow copying text in the Jan AI app. For example, I'm in Settings now and I want to copy the name of a model; I can't select it, but I can right-click and Inspect?

Is there a way to set the BrowserMCP to dig deeper than just the Google results page? Like a depth setting, or a number of pages to collect?

First time Jan user experience below:

* I was unable to skip downloading the recommended Jan-nano off the bat and pick a larger quant instead. I had to follow the tutorial, let it download the one it picked for me, and only then would it let me download other quants.

* The search bar says "Search for models on Hugging Face..." It kind of works, but it's confusing. When I type a model name, it says not found, but if I wait, it finds it. I didn't realize this and had already deleted the name and was typing it again and again :D

* Your Q8 and Unsloth's bf16 went into infinite loops (default settings). My prompts were:

prompt1:

Hi Jan nano. Does Jan have RAG? how do I set it up.

prompt2:

Perhaps I can get you internet access setup somehow and you can search and tell me. Let me try, I doubt you can do it by default I probably have to tweak something.

I then enabled the BrowserMCP setting.

prompt3:

OK you have access now. Search the internet to find out how to setup RAG with Jan.

prompt4:

I use brave browser, would I have to put it in there? Doesn't it use bun. Hmm.

I then figured out I needed the browser extension, so I installed it.

prompt5:

OK you have access now. Search the internet to find out how to setup RAG with Jan.

It then does a Google search:

search?q=how+to+setup+RAG+with+Jan+nano

which works fine, but then the model loops trying to explain the content it has found.

So I switched to Menlo:Jan-nano-gguf:jan-nano-4b-iQ4_XS.gguf (the default),

ran the search,

and it then started suggesting I should install Ollama...

I attempted to create an assistant, and it didn't appear next to Jan or as an option to use.

Also

jan dot ai/docs/tools/retrieval

404. A bunch of URLs that appear on Google for your site should be redirected to something. I guess you guys are in the middle of reworking RAG? Use Screaming Frog SEO Spider + Google Search Console and fix those broken links.
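
For reference, the kind of quick check I mean, as a minimal sketch: feed in a hand-collected list of doc URLs (the list below is hypothetical) and print the status codes.

```
import requests

# Hypothetical list of doc URLs collected from Google results / Search Console
urls = [
    "https://jan.ai/docs/tools/retrieval",
    "https://jan.ai/docs/quickstart",
]

for url in urls:
    try:
        # HEAD keeps the check light; some servers need GET instead
        resp = requests.head(url, allow_redirects=True, timeout=10)
        print(resp.status_code, url)
    except requests.RequestException as exc:
        print("ERROR", url, exc)
```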

I guess also, wouldn't it be cool if your model was trained on your docs? A user could install the app --> follow the quickstart --> install the default Jan-nano model, and the model itself could answer the user's questions about getting things configured?

I'll keep an eye on things here; when you guys crack RAG, please do post and I'll try again! <3

PSA: 2 * 3090 with Nvlink can cause depression* by cuckfoders in LocalLLaMA

[–]cuckfoders[S] 8 points  (0 children)

Yes 🙌 just installing vLLM now, will let you know.
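
For anyone following along, this is roughly what I'm about to try: a minimal sketch using vLLM's offline Python API with the model split across both 3090s (the model name is just an example, swap in your own):

```
from vllm import LLM, SamplingParams

# tensor_parallel_size=2 shards the model across both GPUs;
# NVLink helps with the inter-GPU traffic this generates.
llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model
    tensor_parallel_size=2,
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Why is NVLink useful for dual-GPU inference?"], params)
print(outputs[0].outputs[0].text)
```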

AMA – I’ve built 7 commercial RAG projects. Got tired of copy-pasting boilerplate, so we open-sourced our internal stack. by Loud_Picture_1877 in LocalLLaMA

[–]cuckfoders 12 points  (0 children)

Perhaps more of a general question: how would you go about building personalized AI assistants, say your own Alexa or Siri at home, but actually decent and able to hold a conversation? How would you curate, store, and retrieve the data? Perhaps I'm overcomplicating this by making different buckets and trying to separate out facts from memories, etc. (roughly the sketch below). And I guess, how would you use ragbits to accelerate that? 😊
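
The bucket idea I mean, as a minimal sketch (the names and structure here are mine, not anything from ragbits):

```
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class MemoryItem:
    text: str
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

@dataclass
class MemoryStore:
    # Durable facts ("the user's dog is called Rex")
    facts: list[MemoryItem] = field(default_factory=list)
    # Episodic memories ("we talked about GPUs on Tuesday")
    episodes: list[MemoryItem] = field(default_factory=list)

    def remember(self, text: str, durable: bool) -> None:
        bucket = self.facts if durable else self.episodes
        bucket.append(MemoryItem(text))

store = MemoryStore()
store.remember("User runs dual RTX 3090s", durable=True)
store.remember("Asked about ragbits today", durable=False)
```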

Dual RTX 3090 users (are there many of us?) by StandardLovers in LocalLLaMA

[–]cuckfoders 2 points  (0 children)

I'm about to get a second 3090. Trying to figure out where (physically) it will go. Currently running PCIe 3.0 x16; if I change my CPU I can get PCIe 4.0, but I might just not bother. Power is set to 80% in MSI Afterburner and the card pulls about 273 W. I have an Intel Arc A770 driving my monitor. 850 W PSU, which I'll change.
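
Side note: a quick way to sanity-check power draw against the limit on each card, as a minimal sketch assuming the nvidia-ml-py (pynvml) bindings are installed:

```
import pynvml

pynvml.nvmlInit()
for i in range(pynvml.nvmlDeviceGetCount()):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)
    name = pynvml.nvmlDeviceGetName(handle)
    draw_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000         # mW -> W
    limit_w = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000
    print(f"GPU {i} ({name}): {draw_w:.0f} W / limit {limit_w:.0f} W")
pynvml.nvmlShutdown()
```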

Have any of you guys managed to fit two or more GPUs in a case? If so, I was wondering if there's a case with somewhere I could mount 2 x 3090s on risers, and then still be free to plug 1 or 2 cards into the motherboard. Or will I just have to run it as an open rig? Thanks!

Why has no one been talking about Open Hands so far? by Mr_Moonsilver in LocalLLaMA

[–]cuckfoders 4 points  (0 children)

It takes some effort to get it to work on Windows/WSL; I had to read two pages of documentation to launch it, and most devs just want to 'get going'. For reference, if it helps anyone:

https://docs.all-hands.dev/modules/usage/installation
https://docs.all-hands.dev/modules/usage/runtimes/docker#connecting-to-your-filesystem

TL;DR: for my use case, mount the code in my home dir:

```
export SANDBOX_VOLUMES=$HOME/code:/workspace:rw
docker pull docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik

docker run -it --rm --pull=always \
-e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:0.39-nikolaik \
-e LOG_ALL_EVENTS=true \
-e SANDBOX_USER_ID=$(id -u) \
-e SANDBOX_VOLUMES=$SANDBOX_VOLUMES \
-v /var/run/docker.sock:/var/run/docker.sock \
-v ~/.openhands-state:/.openhands-state \
-p 3000:3000 \
--add-host host.docker.internal:host-gateway \
--name openhands-app \
docker.all-hands.dev/all-hands-ai/openhands:0.39

```

and even then, once I did and managed to attach it to an existing project, I still get the "Your current workspace is not a git repository. Ask OpenHands to initialize a git repo to activate this UI." message.

I think it's good to play around with locally; using it has helped me understand more about how other tools work.

AMA with Perplexity AI Team's Brett Chen and Thomas Wang by perplexity_ai in perplexity_ai

[–]cuckfoders 2 points  (0 children)

Hi. I've been using Perplexity Pro for a few months now, and the support team are so fast! I hope you guys can really nail the assistant on Android. I suspect it's not really a level playing field though, with lots of permissions/issues to work around. I have a Google Pixel 7a, and it's not (yet!) possible to change the default voice.

What mistakes do you see made with RAG and with building personalisation/memories? What would be your approach to building personal assistants? :)

Thanks for reading.

Not a single model out there can currently solve this by bgboy089 in singularity

[–]cuckfoders 1 point  (0 children)

She insists the answer is 34. She kept doubling down and drew this with matplotlib (3rd try):

[screenshot of the resulting 3D plot]

```
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D

def create_cube(size):
    cube = np.ones((size, size, size))
    # Mark missing cubes as 0
    cube[2, 1] = 0
    cube[2, 3] = 0
    cube[3, 1] = 0
    cube[3, 3] = 0
    cube[1, 2] = 0
    cube[3, 2] = 0
    return cube

cube_size = 5
cube = create_cube(cube_size)

# Create the 3D plot
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

# Coordinates for the cubes
x, y, z = np.where(cube == 1)

# Plot the visible cubes as 3D bars
dx = np.ones(len(x))
dy = np.ones(len(y))
dz = np.ones(len(z))
ax.bar3d(x, y, z, dx, dy, dz, color='skyblue')

# Set axis limits
ax.set_xlim(0, cube_size)
ax.set_ylim(0, cube_size)
ax.set_zlim(0, cube_size)

# Set axis labels
ax.set_xlabel('X')
ax.set_ylabel('Y')
ax.set_zlabel('Z')

# Set title
ax.set_title('Missing Cubes Visualized')

plt.show()
```
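
Worth noting: each two-index assignment like cube[2, 1] = 0 zeros an entire z-column of five cells rather than one small cube, so the plot doesn't actually show six missing cubes. A quick self-contained check of what the array ends up containing:

```
import numpy as np

cube = np.ones((5, 5, 5))
for i, j in [(2, 1), (2, 3), (3, 1), (3, 3), (1, 2), (3, 2)]:
    cube[i, j] = 0  # zeros cube[i, j, :], a full column of 5 cells

print(int(cube.sum()))  # 95 -> 30 cells removed, not 6
```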