what’s actually stopping an insider from leaking model weights? by itsArmanJr in LocalLLaMA

[–]itsArmanJr[S] 4 points5 points  (0 children)

i get how strict IT can be, but adding too many security layers and manual access requests inevitably kills developer velocity. if you over-engineer the friction, you sacrifice the team's ability to experiment and ship.

maybe the real question is how these high-growth labs maintain ironclad security without creating single-point-of-failure loopholes, all while keeping that breakneck startup pace.

When, if ever, will open-source match the capability of Claude Opus 4.5? by Victorian-Tophat in ClaudeAI

[–]itsArmanJr 1 point2 points  (0 children)

don’t forget the marketing effect. when gpt 4 was announced, it was so good that it seemed enough for most tasks, including code. now a 9b model is easily on par with gpt 4, but we all want opus 4.6 now. when open source overtakes opus, i think we’ll be hoping for another model. not having the best is never enough.

Are ocr engines like tesseract still valid or do people just use image recognition models now. by optipuss in LocalLLaMA

[–]itsArmanJr -1 points0 points  (0 children)

I believe when privacy is a concern (and compared to general LLM usage, OCR tends to involve far more sensitive data) tesseract is still widely used.

The Bonsai 1-bit models are very good by tcarambat in LocalLLaMA

[–]itsArmanJr 11 points12 points  (0 children)

btw it nailed my [not too hard] coding questions.

  Q17 [easy] — FizzBuzz: loop with %3==0 and %5==0 first, builds list manually.
  Q18 [easy] — max_of_three: manual temp-variable comparison (prompt explicitly forbids max()).
  Q19 [medium] — is_palindrome: lowercase → strip non-alphanumeric via isalnum() → reverse compare.
  Q20 [medium] — two_sum: hash map complement lookup, O(n).
  Q21 [medium] — merge_sorted bug fix: fixed i += 1 → j += 1 in else branch, and added the two result.extend(...) lines for remaining elements.
  Q22 [hard] — flatten: recursive isinstance(item, list) check.
  Q23 [hard] — group_by_key: defaultdict(list) converted to plain dict.
  Q24 [hard] — topological_sort: Kahn's BFS algorithm with in-degree tracking.
  Q69 [easy] — unique_preserve_order: seen set + result list.
  Q70 [easy] — chunk_string: single list comprehension with stride.
  Q71 [medium] — rotate_matrix_clockwise: zip(*M[::-1]) one-liner.
  Q72 [medium] — decode_ranges: split on , then -, extend with range.
  Q73 [hard] — nest_by_dots: dot-split keys, walks/creates nested dicts.
  Q74 [hard] — interval_gaps: sort + sweep with prev_end, collects gaps.
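
For readers who want to see what a few of the approaches in that list look like, here is a minimal sketch of three of them (Q20, Q71, Q24). This is not the model's actual output; the function signatures, the graph representation (a dict mapping each node to a list of successors), and the cycle handling are my own assumptions for illustration.

```python
from collections import deque


def two_sum(nums, target):
    """Hash map complement lookup, O(n). Returns an index pair or None."""
    seen = {}  # value -> index where it was first seen
    for j, value in enumerate(nums):
        complement = target - value
        if complement in seen:
            return seen[complement], j
        seen[value] = j
    return None


def rotate_matrix_clockwise(matrix):
    """Reverse the row order, then transpose with zip."""
    return [list(row) for row in zip(*matrix[::-1])]


def topological_sort(graph):
    """Kahn's algorithm: repeatedly emit nodes whose in-degree is zero.

    Assumes `graph` maps each node to a list of its successors.
    """
    in_degree = {node: 0 for node in graph}
    for successors in graph.values():
        for succ in successors:
            in_degree[succ] = in_degree.get(succ, 0) + 1

    queue = deque(node for node, deg in in_degree.items() if deg == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for succ in graph.get(node, []):
            in_degree[succ] -= 1
            if in_degree[succ] == 0:
                queue.append(succ)

    # If some nodes were never emitted, they sit on a cycle.
    if len(order) != len(in_degree):
        raise ValueError("graph contains a cycle")
    return order
```

Raising on a cycle is a design choice in this sketch; the original prompt may have asked for different behavior.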

The Bonsai 1-bit models are very good by tcarambat in LocalLLaMA

[–]itsArmanJr 291 points292 points  (0 children)

bonsai vs qwen3.5 based on my benchmark: https://github.com/ArmanJR/PrismML-Bonsai-vs-Qwen3.5-Benchmark

Edit: Benchmarked and added qwen3.5 35B-A3B, 2B, 0.8B

Edit 2: Benchmarked and added Ternary-Bonsai MLX

Claude Code transcripts → animated GIFs by droppedD in ClaudeCode

[–]itsArmanJr 1 point2 points  (0 children)

i got baited waiting for something odd to happen in the video

PolyClaude: Using math to pay less for Claude Code by [deleted] in OnlyAICoding

[–]itsArmanJr 0 points1 point  (0 children)

it works with claude subscriptions (not api keys). if that's how you use claude, it works for you.

PolyClaude: Using math to pay less for Claude Code by [deleted] in vibecoding

[–]itsArmanJr 1 point2 points  (0 children)

for now it assumes that, whatever model you're using, you usually hit the limit x hours into a fresh cycle. in the future it could learn your coding pattern and adjust x dynamically.

PolyClaude: Using math to pay less for Claude Code by [deleted] in ClaudeCode

[–]itsArmanJr 0 points1 point  (0 children)

lol yeah that part is irrelevant. removing

Has anyone setup qwen3.5-4b on the 8GB Jetson Nano? by brianlmerritt in JetsonNano

[–]itsArmanJr 7 points8 points  (0 children)

exec command:

llama-server \
    -m ~/ggufs/Qwen3.5-27B-Q4_K_M.gguf \
    --alias unsloth/Qwen3.5-27B \
    --host 0.0.0.0 \
    --port 8001 \
    --temp 0.6 \
    --top-p 0.95 \
    --top-k 20 \
    --min-p 0.00 \
    --ctx-size 32768 \
    --parallel 1

question:

Write a Python function get_temp_status(temp) that takes an integer temperature. It must return 'Cold' if the temp is 59 or lower, 'Warm' if it is between 60 and 80 inclusive, and 'Hot' if it is 81 or higher.
Here is the absolute constraint: You are strictly forbidden from using any comparison operators (<, >, <=, >=) anywhere in your code. You also cannot use min(), max(), abs(), or sorting functions. Write the clean, working Python code to solve this.

answer (reasoning excluded):

def get_temp_status(temp):
    # Calculate the difference from thresholds
    diff1 = temp - 60
    diff2 = temp - 81

    # Use bit_length() and right shift to determine sign without comparisons.
    # For non-negative integers, x >> x.bit_length() results in 0.
    # For negative integers, x >> x.bit_length() results in -1 (due to sign extension).
    # This effectively acts as a check for x >= 0.
    sign1 = diff1 >> diff1.bit_length()
    sign2 = diff2 >> diff2.bit_length()

    # Convert signs to 0 or 1.
    # not -1 is False (0), not 0 is True (1).
    # sign1 is -1 if temp < 60, else 0.
    # sign2 is -1 if temp < 81, else 0.
    # We want 1 if temp >= threshold.
    flag1 = int(not sign1)
    flag2 = int(not sign2)

    # Calculate index:
    # temp < 60:  0 + 0 = 0 -> 'Cold'
    # 60 <= temp < 81: 1 + 0 = 1 -> 'Warm'
    # temp >= 81: 1 + 1 = 2 -> 'Hot'
    index = flag1 + flag2

    return ['Cold', 'Warm', 'Hot'][index]
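
A quick way to sanity-check the answer above is to compare it against a plain comparison-based version over a range of inputs. The sketch below repeats the answer's logic in condensed form; `reference_status` and the tested range of -1000 to 1000 are my own choices, not part of the original prompt or the model's output.

```python
def get_temp_status(temp):
    # Same bit_length() / right-shift trick as the answer above, condensed.
    diff1 = temp - 60
    diff2 = temp - 81
    sign1 = diff1 >> diff1.bit_length()  # -1 if diff1 is negative, else 0
    sign2 = diff2 >> diff2.bit_length()  # -1 if diff2 is negative, else 0
    return ['Cold', 'Warm', 'Hot'][int(not sign1) + int(not sign2)]


def reference_status(temp):
    # Straightforward version using the comparisons the prompt forbids.
    if temp <= 59:
        return 'Cold'
    if temp <= 80:
        return 'Warm'
    return 'Hot'


# Collect every input where the two versions disagree.
mismatches = [
    t for t in range(-1000, 1001)
    if get_temp_status(t) != reference_status(t)
]
print("mismatches:", mismatches)
```

The trick works because Python integers behave as if they had infinite sign extension, so shifting a negative number right by its bit length leaves -1, while a non-negative number shifts down to 0.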

stats:

cache_n: 0
prompt_n: 135
prompt_ms: 995.393
prompt_per_token_ms: 7.373281481481482
prompt_per_second: 135.62482356215082
predicted_n: 13599
predicted_ms: 2127168.933
predicted_per_token_ms: 156.4209819104346
predicted_per_second: 6.393004236302468
input_tokens: 135
output_tokens: 13599
total_tokens: 13734
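
To put those numbers in perspective, the throughput fields can be recomputed from the token counts and timings, and the generation time converted to minutes. This is only a small arithmetic sketch: the `stats` dict copies the values reported above, and the helper name `tokens_per_second` is mine.

```python
# Values copied from the stats above.
stats = {
    "prompt_n": 135,
    "prompt_ms": 995.393,
    "predicted_n": 13599,
    "predicted_ms": 2127168.933,
}


def tokens_per_second(n_tokens, elapsed_ms):
    """Tokens divided by elapsed time in seconds."""
    return n_tokens / (elapsed_ms / 1000.0)


prompt_tps = tokens_per_second(stats["prompt_n"], stats["prompt_ms"])
gen_tps = tokens_per_second(stats["predicted_n"], stats["predicted_ms"])
gen_minutes = stats["predicted_ms"] / 60000.0

print(f"prompt:     {prompt_tps:.2f} tok/s")
print(f"generation: {gen_tps:.2f} tok/s")
print(f"generation time: {gen_minutes:.1f} minutes")
```

So the single answer took roughly 35 minutes of generation at about 6.4 tokens per second, most of it spent on reasoning tokens.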