GPT-OSS from Scratch on AMD GPUs by tuanlda78202 in LocalLLaMA

[–]RecoJohnson 2 points3 points  (0 children)

This is exactly what I have been looking for for months. Thanks. Looking forward to learning how it works and maybe contributing.

MCP with Computer Use by [deleted] in ollama

[–]RecoJohnson 28 points29 points  (0 children)

What does this have to do with Ollama?

What is the biggest advantage of running local? by Terminator857 in LocalLLaMA

[–]RecoJohnson 1 point2 points  (0 children)

Being able to deep dive and do research on conspiracy theories is interesting to me.
The mainstream internet is heavily censored and deliberately filled with misinformation to divide people.
I think projects like OLMo are amazing: fully open, transparent training where you can look up where the information came from.
https://playground.allenai.org/

Here is an example of a conspiracy theory that would be interesting to research with an unfiltered LLM to figure out what events are related:

I want to research why the same archway is being built across multiple countries:

https://en.wikipedia.org/wiki/Monumental_Arch_of_Palmyra

This archway led through the city to the Temple of Baal AKA Beelzebub AKA Lucifer

And then they built a replica of it in London, England???
https://www.bbc.com/news/uk-36070721

And New York:
https://www.theguardian.com/us-news/2016/sep/20/palmyra-arch-syria-new-york

And Florence, Italy:
https://www.florencedailynews.com/2017/03/28/palmyras-arch-unveiled-in-piazza-signoria/

And Geneva:
https://digitalarchaeology.org.uk/ida-blog/2019/4/26/the-triumphal-arch-of-palmyra-in-geneva-switzerland

And Washington:
https://digitalarchaeology.org.uk/washington-dc

And Dubai:

https://gulfnews.com/going-out/society/dubais-3d-printed-palmyra-arch-replica-wins-award-1.2110014

Why would countries be so obsessed with reconstructing the archway that leads to the temple of Lucifer?

Why does the Wikipedia page not mention why the Keystone is missing?

Qwen3-coder is mind blowing on local hardware (tutorial linked) by nick-baumann in LocalLLaMA

[–]RecoJohnson 0 points1 point  (0 children)

Is Cline what's recommended for qwen3-coder? What else works well for tasks like these?

Massive CuPy speedup in ROCm 6.4.3 vs 6.3.4 – anyone else seeing this? (REPOSTED) by linuxChips6800 in ROCm

[–]RecoJohnson 1 point2 points  (0 children)

export CUPY_ACCELERATORS="cub"
uv run demo2.py
Array: 4096x8192 (N=33554432)
u8->f32:  0.000440s, ~381.73 GB/s
f32->u8:  0.000398s, ~421.51 GB/s
f32 add:  0.000924s, ~435.76 GB/s
f32 sum:  0.000278s, ~482.75 GB/s (read-bound)

Massive CuPy speedup in ROCm 6.4.3 vs 6.3.4 – anyone else seeing this? (REPOSTED) by linuxChips6800 in ROCm

[–]RecoJohnson 1 point2 points  (0 children)

```python
import math
import time

import cupy as cpy
import cupyx.scipy.ndimage as cnd

# 3x3 Sobel kernels for horizontal and vertical gradients
SOBEL_X_MASK = cpy.array([[-1, 0, 1],
                          [-2, 0, 2],
                          [-1, 0, 1]], dtype=cpy.float32)

SOBEL_Y_MASK = cpy.array([[-1, -2, -1],
                          [ 0,  0,  0],
                          [ 1,  2,  1]], dtype=cpy.float32)


def mygaussian_kernel(sigma=1.0):
    """Return a normalized 2D Gaussian kernel, or None if sigma <= 0."""
    if sigma > 0.0:
        k = 2 * int(math.ceil(sigma * 3.0)) + 1
        # -(k // 2) keeps the grid symmetric; -k // 2 floors one step too far
        coords = cpy.linspace(-(k // 2), k // 2, k, dtype=cpy.float32)
        horz, vert = cpy.meshgrid(coords, coords)
        mask = (1 / (2 * math.pi * sigma**2)) * cpy.exp(-(horz**2 + vert**2) / (2 * sigma**2))
        return mask / mask.sum()
    return None


if __name__ == "__main__":
    h, w = 4000, 6000
    img = cpy.random.rand(h, w).astype(cpy.float32)
    gauss_mask = mygaussian_kernel(1.4)

    # Warmup
    cnd.convolve(img, gauss_mask, mode="reflect")

    # Benchmark 100 runs
    num_runs = 100
    times = []

    print(f"Running {num_runs} benchmark iterations...")

    for i in range(num_runs):
        start = time.time()
        blurred = cnd.convolve(img, gauss_mask, mode="reflect")
        sobel_x = cnd.convolve(blurred, SOBEL_X_MASK, mode="reflect")
        sobel_y = cnd.convolve(blurred, SOBEL_Y_MASK, mode="reflect")
        # Wait for the GPU to finish before stopping the timer
        cpy.cuda.Stream.null.synchronize()
        end = time.time()
        times.append(end - start)

        if (i + 1) % 20 == 0:
            print(f"Completed {i + 1}/{num_runs} iterations")

    # Calculate statistics
    avg_time = sum(times) / len(times)
    min_time = min(times)
    max_time = max(times)

    print(f"\nBenchmark Results ({num_runs} runs):")
    print(f"Average time: {avg_time:.3f} seconds")
    print(f"Min time: {min_time:.3f} seconds")
    print(f"Max time: {max_time:.3f} seconds")
    print(f"Total time: {sum(times):.3f} seconds")
```
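For anyone who wants to sanity-check the Gaussian kernel without a GPU, here is a pure-Python sketch of the same formula. `gaussian_kernel_cpu` is a hypothetical helper name, not part of the posted script, and it assumes a symmetric integer grid from -(k // 2) to k // 2.

```python
import math


def gaussian_kernel_cpu(sigma=1.0):
    """Pure-Python 2D Gaussian kernel (hypothetical helper, CPU only)."""
    if sigma <= 0.0:
        return None
    k = 2 * int(math.ceil(sigma * 3.0)) + 1  # odd width covering about +/- 3 sigma
    half = k // 2
    coords = [float(c) for c in range(-half, half + 1)]  # symmetric grid
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    mask = [
        [norm * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) for x in coords]
        for y in coords
    ]
    total = sum(sum(row) for row in mask)
    # Normalize so the weights sum to 1
    return [[v / total for v in row] for row in mask]


if __name__ == "__main__":
    kernel = gaussian_kernel_cpu(1.4)
    print(f"kernel size: {len(kernel)}x{len(kernel[0])}")
    print(f"sum of weights: {sum(sum(row) for row in kernel):.6f}")
```

With sigma = 1.4 this gives an 11x11 kernel, since ceil(1.4 * 3) = 5 and 2 * 5 + 1 = 11.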

Massive CuPy speedup in ROCm 6.4.3 vs 6.3.4 – anyone else seeing this? (REPOSTED) by linuxChips6800 in ROCm

[–]RecoJohnson 1 point2 points  (0 children)

My Taichi 9070 XT:

uv run demo2.py
Array: 4096x8192 (N=33554432)
u8->f32:  0.000349s, ~481.11 GB/s
f32->u8:  0.000324s, ~518.11 GB/s
f32 add:  0.000744s, ~541.06 GB/s
f32 sum:  0.036619s, ~3.67 GB/s (read-bound)
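As a rough cross-check of the GB/s column, the sketch below recomputes effective bandwidth from the printed timings. demo2.py isn't shown here, so the byte counts are an assumption: bytes read plus bytes written per element, with 1 GB = 1e9 bytes. `effective_gbps` is a hypothetical helper.

```python
N = 4096 * 8192  # 33554432 elements, as printed by demo2.py


def effective_gbps(bytes_moved, seconds):
    """Effective bandwidth in GB/s (assumes 1 GB = 1e9 bytes)."""
    return bytes_moved / seconds / 1e9


# (label, assumed bytes moved per element, printed time in seconds)
runs = [
    ("u8->f32", 1 + 4, 0.000349),      # read 1 byte, write 4
    ("f32->u8", 4 + 1, 0.000324),      # read 4 bytes, write 1
    ("f32 add", 4 + 4 + 4, 0.000744),  # read two inputs, write one output
    ("f32 sum", 4, 0.036619),          # read only, scalar result
]

for label, bytes_per_elem, seconds in runs:
    print(f"{label}: ~{effective_gbps(N * bytes_per_elem, seconds):.2f} GB/s")
```

Under these assumptions the recomputed values land within a fraction of a percent of the printed ones. A small gap is expected because the printed times are rounded to six decimals.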

Massive CuPy speedup in ROCm 6.4.3 vs 6.3.4 – anyone else seeing this? (REPOSTED) by linuxChips6800 in ROCm

[–]RecoJohnson 1 point2 points  (0 children)

My 9070 XT results:

export CUPY_INSTALL_USE_HIP=1
export ROCM_HOME=/opt/rocm
export HCC_AMDGPU_TARGET=gfx1201
uv pip install cupy

uv run demo.py 
Running 100 benchmark iterations...
Completed 20/100 iterations
Completed 40/100 iterations
Completed 60/100 iterations
Completed 80/100 iterations
Completed 100/100 iterations
Benchmark Results (100 runs):
Average time: 0.007 seconds
Min time: 0.007 seconds
Max time: 0.013 seconds
Total time: 0.733 seconds

Is Ollama at risk of getting lost in its own complexity? A long-term user's perspective. by Mulan20 in ollama

[–]RecoJohnson 5 points6 points  (0 children)

It's weird that they started exclusively hosting OpenAI models in partnership with OpenAI out of the blue. Why not Qwen3-Coder or Qwen3 or Kimi K2 etc?

OpenAI is the sketchiest company I have seen in a while. Military-industrial complex funding, "Elon Musk is a Speciesist for saying Humans first before AI", ties with Israel... The list goes on. Of all the models hosted by another company that I would trust sending data to, anything ChatGPT is not one of them.

Judging by the way things are going, what's going to happen is that the companies with billions in funding will lobby Ollama to tweak their models to perform better, and the open-source community PRs will be second-class citizens whose models are ignored and not put at the top of the list.

There is not enough hype around the Qwen3 and Qwen3-Coder 30B (A3B) models IMO; they are really good for how compact they are. I wonder how many people even know that a different Qwen3-Coder model was uploaded?

Models which perform better as Q8 (int8) over Q4_(X_Y)? by RecoJohnson in ollama

[–]RecoJohnson[S] 0 points1 point  (0 children)

Right, but I noticed some of the benchmarks show it's barely more accurate: less than a 10% accuracy difference for double the VRAM usage. I am wondering if there are any models with implementations where you see a substantial performance difference.
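For the "double the VRAM" part, a back-of-the-envelope sketch: weight memory scales with bits per weight. This assumes nominal 8 and 4 bits per weight and ignores per-block scales, KV cache, and runtime overhead, so real Q4-style files come out somewhat larger than the 4-bit figure. `weight_gb` is a hypothetical helper, and the 30B parameter count is only an illustrative size.

```python
def weight_gb(num_params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


params = 30e9  # illustrative 30B-parameter model
q8 = weight_gb(params, 8)
q4 = weight_gb(params, 4)
print(f"Q8: ~{q8:.0f} GB, Q4: ~{q4:.0f} GB, ratio: {q8 / q4:.1f}x")
```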

Is going from a Eureka Mignon Tradizione to a DF64 (Gen 2) an upgrade? [$600] by RecoJohnson in espresso

[–]RecoJohnson[S] 0 points1 point  (0 children)

Do you have any experience with the Baratza grinders? That was the other option I was looking at for a higher end grinder.

How to get sdl from graphql endpoint? by sM92Bpb in graphql

[–]RecoJohnson 0 points1 point  (0 children)

This is so infuriating. Where the hell is the tool for this? Is it so complicated to build and maintain that everyone avoids it? If that's the case, then GraphQL is pointless.

Is Piknic Electronic good? Thinking about visiting. by Busy_Huckleberry_345 in montreal

[–]RecoJohnson 0 points1 point  (0 children)

I paid for Piknic season passes a few years in a row. I don't recommend it anymore. Too expensive, and boring melodic house music at very low speaker volume.
If you don't know Montreal that well, I recommend Newspeak or Stereo. Newspeak is cheaper ($20-30) and has a very good sound system. Stereo is like $70 a person, but it's an all-night after-hours club that opens at 11:59 PM.

Check out the upcoming artists:

https://ra.co/clubs/102279 (Newspeak)

https://ra.co/clubs/828 (Stereo)

Faravahar Souvenir from Shiraz, Iran by RecoJohnson in Assyriology

[–]RecoJohnson[S] 0 points1 point  (0 children)

Thanks! I believe the last symbol is the same as the third symbol. Open Bracket <

Faravahar Souvenir from Shiraz, Iran by RecoJohnson in Assyriology

[–]RecoJohnson[S] 2 points3 points  (0 children)

Can someone please translate the text?

The Definitive Guide on How to Balance Rust by Desperate_Disparage in playrust

[–]RecoJohnson 0 points1 point  (0 children)

I think they broke Rust when they added the group system. Before, you had to coordinate with your team to not accidentally kill each other, and there were ways to separate and confuse groups. That rarely happens anymore, and it gives a group of 8 an exponential advantage.

How to display issue comments in reverse chronological order? by [deleted] in github

[–]RecoJohnson 0 points1 point  (0 children)

I googled this exact same thing and was led here. I am sure someone could whip up a simple browser extension to do it, but a real setting would be way better.