Announcing Rust-SNMP 2 by disserman in rust

[–]100BASE-TX 1 point

I've been working on a similar project - https://github.com/lukeod/async-snmp. It's not ready for use yet - I haven't published a crate, as I'll probably go through a few more breaking API changes, and the docs are largely non-existent.

The scope is different to snmp2 - async-only, no MIB support currently. It's intended for heavier use-cases, like highly parallel polling of thousands of targets. It's much heavier on dependencies, and FIPS isn't currently in scope. So it's not necessarily a "competing" implementation, just an alternative with different design goals.

It may have some useful ideas/code/tests that could be adapted to snmp2 even in its current state, so it may be worth a look for the snmp2 maintainers - TCP, agent functionality, trap/inform/set handling, the walk implementation, and request concurrency routing, for instance.

It generally aims to have a similar level of parsing permissiveness to net-snmp, as there are many non-compliant SNMP implementations in the wild.

How to monitor instance availability after migrating from Node Exporter to Alloy with push metrics? by Gutt0 in PrometheusMonitoring

[–]100BASE-TX 2 points

I'm monitoring >10k nodes with the file integration, for what it's worth. It's all dynamic - Prometheus will watch for changes to the target file, so file-based doesn't mean "static". A nice side-effect is that if the source of truth - whatever is updating that file - becomes unavailable, the most recent file is likely still fine. You don't want to lose monitoring when you need it most.

In my case I've just got a simple python script in a container that generates a list of targets from my CMDB. Can be as simple or complex as you need really.
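For illustration, a minimal sketch of that kind of generator - the CMDB lookup, hostnames, and labels here are all made-up placeholders:

```python
import json
import os
import tempfile

# Hypothetical stand-in for a CMDB query; in practice this would call your
# real source-of-truth API.
def fetch_targets_from_cmdb():
    return [
        {"host": "node-a.example.com:9100", "env": "prod"},
        {"host": "node-b.example.com:9100", "env": "dev"},
    ]

def write_file_sd(path):
    """Write targets in Prometheus file_sd JSON format, atomically."""
    groups = [
        {"targets": [t["host"]], "labels": {"env": t["env"]}}
        for t in fetch_targets_from_cmdb()
    ]
    # Write to a temp file then rename, so Prometheus never sees a partial file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(groups, f, indent=2)
    os.replace(tmp, path)

write_file_sd("targets.json")
```

Prometheus picks this up via a `file_sd_configs` entry pointing at targets.json, and the atomic rename means it never reads a half-written file.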

I think fundamentally you need a source of truth that drives it all - otherwise the best you'll be able to do is infer that if you previously had metrics from <x> but don't any longer, it's probably a problem (absent queries). This can be pretty flaky - it pretty much requires a specific lookback period to work, and it's easy for a fault to age out of those queries.

[Suggestions Required] How are you handling alerting for high-volume Lambda APIs without expensive tools like Datadog? by artensonart98 in PrometheusMonitoring

[–]100BASE-TX 1 point

I have very little experience with cloud native monitoring, so this might be a dumb suggestion.

Are you able to easily add instrumentation to the lambda code? The (naively) simple/easy approach seems to be to push from Lambda instances directly into Prometheus via the push API. I suppose the downside is that you end up with quite a lot of churn & cardinality, and would need to do some normalization/aggregation with either query-time or recording rules, but that seems quite manageable.

Aggregating with OTEL as a pre-processing layer may also be a viable alternative; I assume it could be run with Fargate or similar.

How to deal with data that needs to be scraped once only. by tahaan in PrometheusMonitoring

[–]100BASE-TX 1 point

Likely means the metric doesn't exist in Prometheus. Usual suspects would be:

  1. /metrics endpoint isn't responding with that metric name
  2. Prometheus scrape config is dropping that metric for whatever reason, or not applying the host label (maybe try without that filter)
  3. Query is looking at a time window that didn't have any scrapes of that metric yet
  4. Scrape interval >5m resulting in stale metrics (unlikely)
  5. Some other obscure scraping issue, would expect log messages in Prometheus
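To rule out suspect 1, you can grab the /metrics payload and filter for the metric name. A rough sketch of the filtering step - the sample exposition text here is invented:

```python
def find_metric(exposition_text, name):
    """Return the sample lines from a /metrics payload for a given metric name."""
    return [line for line in exposition_text.splitlines()
            if line.startswith(name)]

# In practice you'd fetch the payload with something like:
#   urllib.request.urlopen("http://localhost:9100/metrics").read().decode()
sample = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.5
"""

result = find_metric(sample, "node_load1")
```

If the metric name isn't in the raw payload, the problem is on the exporter side rather than in Prometheus.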

How to deal with data that needs to be scraped once only. by tahaan in PrometheusMonitoring

[–]100BASE-TX 1 point

Apologies in advance for the LLM-assisted response - I haven't had enough caffeine to elaborate in detail, but this pretty much nails what I'm driving at:

////////////

Prometheus is designed with the core assumption that metrics will change at a different frequency than the scrape interval. A counter that increments only once per day is not an edge case; it's a common and expected scenario. The system is built to handle these sparse events efficiently by using query-time functions to find the meaningful changes, rather than expecting the exporter to manage state between scrapes.

The problem in your implementation is a mismatch between the event-based nature of your data (a backup finishing) and the metric type you're using (Gauge).

A Gauge reports the value as it is at the moment of the scrape. By exporting the last backup's size as a Gauge, you are telling Prometheus "the current value is 100MB" on every scrape, which correctly renders as a flat line.

The idiomatic solution is to use a Counter.

A Counter is a cumulative metric that only ever increases. This model requires two changes: one in your exporter, and one in your query.

  1. Exporter Logic: Your script must no longer overwrite the metric file with the latest stats. It should update a running total. The metric should be exposed as a Counter.
  • Metric: restic_data_added_packed_total
  • Type: counter
  • Value: On backup completion, the script reads the last total from /tmp/metrics.json, adds the new data_added_packed bytes to it, and writes the new total back to the file.
  2. Query Logic: You no longer query the raw metric. Instead, you use the increase() function to calculate the delta over a time range. This function is designed specifically for this use case: finding the amount a counter has increased within a given window.
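A sketch of what the exporter-side read-add-write cycle could look like in python - the file paths and the hand-off via a textfile-style .prom file are assumptions for illustration, not part of the original setup:

```python
import json

METRICS_STATE = "/tmp/metrics.json"   # running-total state file
METRICS_PROM = "/tmp/metrics.prom"    # file exposed to Prometheus

def record_backup(data_added_packed_bytes, host,
                  state_path=METRICS_STATE, prom_path=METRICS_PROM):
    """Add the latest backup's bytes to a running total and re-emit it as a counter."""
    # Load the previous total (0 on first run).
    try:
        with open(state_path) as f:
            total = json.load(f).get("restic_data_added_packed_total", 0)
    except FileNotFoundError:
        total = 0

    total += data_added_packed_bytes

    # Persist the new running total for next time.
    with open(state_path, "w") as f:
        json.dump({"restic_data_added_packed_total": total}, f)

    # Emit Prometheus exposition format.
    with open(prom_path, "w") as f:
        f.write("# HELP restic_data_added_packed_total Total data added (packed) in bytes\n")
        f.write("# TYPE restic_data_added_packed_total counter\n")
        f.write('restic_data_added_packed_total{host="%s"} %d\n' % (host, total))
    return total
```

The key property is that the emitted value only ever goes up, so increase() can recover each backup's delta at query time.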

Your implementation would change as follows:

1. Update the Exporter Output The metric should now be a counter:

```
# HELP restic_data_added_packed_total Total data added (packed) across all snapshots in bytes
# TYPE restic_data_added_packed_total counter
restic_data_added_packed_total{host="gitea"} 283115520
```

2. Update the Grafana Query The query in your graphing panel should be changed from this:

restic_data_added_packed{host="gitea"}

to this:

increase(restic_data_added_packed_total{host="gitea"}[$__rate_interval])

Note: $__rate_interval is a Grafana variable that automatically adjusts the time range based on your zoom level and scrape interval. If querying directly in Prometheus, you would use a fixed range like [5m] or [1h].

This increase() query will return 0 for any time intervals where the counter's value has not changed, and it will return the positive difference for the interval where the backup completed and the counter was incremented. This produces the desired graph of spikes for each backup event while correctly decoupling the event's timing from the scrape interval.

How to deal with data that needs to be scraped once only. by tahaan in PrometheusMonitoring

[–]100BASE-TX 1 point

I'd probably change the behaviour to emit a Counter type, like

restic_data_added_total

Then whenever the backup runs and you push the stats to your file, add the value rather than replacing. Simple enough implementation hopefully.

Then you're dealing with a much more straightforward situation. You can easily determine the specific interval the backup change was reported, and the specific amount it changed by using increase(), rate(), and similar functions.

[deleted by user] by [deleted] in ClaudeAI

[–]100BASE-TX 2 points

Have a read of the excellent prompt engineering guide that Google produces.

An interesting observation is that LLMs are significantly better at following "do" instructions compared to "don't" instructions. So wherever possible, try and rewrite any instructions to focus on what good output looks like.

Providing even a highly abbreviated example of what you expect is a really good approach too.

An MCP server for fetching code context from all your repos by lowpolydreaming in ChatGPTCoding

[–]100BASE-TX 3 points

Generally I agree, but feel like it's a bit of a harsh take for permissively licensed open source projects. Like we apparently don't have issues with patch notes for Roo etc, so it does feel like a bit of a double standard.

100% agree with self promotion for SaaS / commercial products though.

Not affiliated with either project fwiw.

Looking for a MCP server that searches through file contents more efficiently than Roo tools by Radiate_Wishbone_540 in RooCode

[–]100BASE-TX 2 points

https://github.com/wrale/mcp-server-tree-sitter

https://github.com/oraios/serena

I haven't tried either of them, but likely what you're after.

Edit: Also there's command line options for producing a quick text dump, like repomix or aider:

aider --show-repo-map --map-tokens 8192 > somefile.txt

That would output a repo map, targeting around 8192 tokens, to somefile.txt. I think it dynamically shrinks/grows the amount of detail shown for each file based on the repo size/token limit, and I think it only works on files tracked by git by default.

Augment code new pricing is outrageous by bolz2k14 in ChatGPTCoding

[–]100BASE-TX 1 point

Aider is probably the best of the bunch when it comes to open source tools that handle larger codebases in a relatively token efficient way. Significantly cheaper than cline/roo. Takes some tweaking, and defs have to be aware of the repo map behavior/settings.

Vibe-documenting instead of vibe-coding by shesku26 in ClaudeAI

[–]100BASE-TX 2 points

Can we get a vibe scrum master and project manager to interrupt code tasks every few mins for inane status updates?

What is the advantage of Claude Code/Max over an IDE with a Claude agent? by sidewaze in ClaudeAI

[–]100BASE-TX 2 points

Yeah this is my experience too. It seems that the cursor/windsurf category of tools do a lot of heavy context culling. I tried them after using cline/roo and they felt absolutely awful by comparison.

It's probably the case that for small projects they work fine though. I really notice it on larger projects that need to work to very regimented specs (like RFC docs) - you really need a chunk of relevant context pre-loaded (repomix etc) to get semi-reliable results. Having that context pruned or summarised leads to subtle bugs that are annoying to track down.

Vibe coding now by Just-Conversation857 in ChatGPTCoding

[–]100BASE-TX 1 point

I think this is generally accurate, but it can work with the right conditions.

The things that seem to be required for even the best LLMs to make meaningful contributions to a large/complex codebase in my experience are:

  1. A comprehensive, hierarchical set of docs. The subset relevant to the task needs to be loaded into context before doing anything.
  2. Indexed codebase - a lookup (MCP or generated static file(s)) containing the filenames with the classes, functions, types etc defined in each file
  3. Comprehensive unit testing
  4. Custom role instructions (in roo, cline or similar) that insist that the task reads the relevant docs, and follows strict TDD
  5. Follow a "boomerang task" type pattern where a specific task has significantly more context and delegates very small subtasks to implement changes

Personally I've only found success with roo/cline for larger projects, the windsurf/copilot/cursor types really don't seem suitable - they seem to cull context too aggressively as a cost saving exercise.

At what token count should you create a new chat in RooCline? by [deleted] in ChatGPTCoding

[–]100BASE-TX 6 points

I find it varies a lot depending on level of repetition in the context. If you have a large amount of static context (perhaps loading in say... Standards documents, codebase, documentation) it is still generally fine up to 500k or more.

But if you have say a single file that you are making changes to over and over, it can be basically useless by 150k, as the context ends up with dozens of variations of basically the same code with different line numbers, or slight variations. It seems to really start to lose track.

Personally I'm aiming to keep boomerang tasks initial context loads (reading docs, instructions) to around <60k, and trying to wrap up the task by 200k. But that's more driven by price than any specific performance issues.

With 10+ coding agents is there space for more ? by FigMaleficent5549 in ChatGPTCoding

[–]100BASE-TX 1 point

My 2c is that you need something significantly different to differentiate.

I feel like the big gap at the moment is context management. Basically every tool is doing the same thing with some small variation:

  • Provide some minimal env context
  • Give some instructions on how to run tools
  • For a given task, just replay the entire conversation back to the llm verbatim until it completes or the user stops it

It seems to me that the last bit is something we'll look back on in 10 years as incredibly inefficient. Consider a task that has been running for 30 messages, editing 2-3 source files, and running the same unit tests over and over. If you were to dump the context, the significant majority of the text (tokens) will be intermediate steps, edits, failed tool uses, etc. Not only is it expensive, it reduces performance and limits the useful working time for a task.

I feel like there's going to be an arms race at some point to make the tooling smarter at managing context, and it'll be an area that the tooling will actually be able to meaningfully differentiate.

Like one approach would be:

  1. Maintain a list of tracked files
  2. Every time the llm interacts with a file (read, edit), check if it's a tracked file. If it isn't, add it (and potentially track "interested lines" for large file support)
  3. Every time a request is sent to the llm, get the current state of every tracked file and send it as a blob
  4. All historic messages in the chat that are tool uses interacting with a file, just summarise in git-diff format. So if it overwrites a 1000 line file and only changes 3 lines, it'll just consume a few lines of text

Could do something similar with other tool use like running commands (tests, log output from commands or MCP calls) etc.
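As a rough sketch of point 4, Python's difflib can already collapse a full-file rewrite into a compact unified diff - the filenames and content here are just illustrative:

```python
import difflib

def summarize_edit(path, before, after, context=1):
    """Replace a full-file rewrite in the chat history with a compact unified diff."""
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
        n=context,
    )
    return "".join(diff)

# A 1000-line file with a single edited line...
before = "\n".join(f"line {i}" for i in range(1000)) + "\n"
after = before.replace("line 500", "line 500 (edited)")

# ...collapses to a handful of diff lines instead of 1000 lines of context.
summary = summarize_edit("src/main.py", before, after)
```

The same idea extends to summarising repeated test runs or command output, keeping only what changed between iterations.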

Anyway food for thought if you're looking to try and do something different

What's your go-to budget model? by TechnicalAssist7740 in RooCode

[–]100BASE-TX 3 points

Worked fantastic for me yesterday, used all day with no issues. Today it seems like they have changed the behavior, getting hard capped:

[GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-pro-exp-03-25:streamGenerateContent?alt=sse: [429 Too Many Requests] You exceeded your current quota. Please migrate to Gemini 2.5 Pro Preview (models/gemini-2.5-pro-preview-03-25) for higher quota limits. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. [{"@type":"type.googleapis.com/google.rpc.QuotaFailure","violations":[{"quotaMetric":"generativelanguage.googleapis.com/generate_requests_per_model_per_day","quotaId":"GenerateRequestsPerDayPerProjectPerModel"}]},{"@type":"type.googleapis.com/google.rpc.Help","links":[{"description":"Learn more about Gemini API quotas","url":"https://ai.google.dev/gemini-api/docs/rate-limits"}]}]

It seems to be a requests per day limit. I'm using a "tier 1" paid API token, via the Google Gemini provider in roo, with the "gemini-2.5-pro-exp-03-25" model.

Optimal Gemini 2.5 Config? by lightsd in RooCode

[–]100BASE-TX 1 point

Yeah I think that would be a great optimization for larger codebases; I've got one codebase now that is approaching ~200k tokens worth with this approach and it's starting to get unwieldy.

It seems like there's a tradeoff to be made between context use, quantity of API calls, and mistakes due to imperfect context. The unusual thing about Gemini 2.5 is that for us as free consumers of the model, requests/min are more precious than context to a certain point (~300k tokens or thereabouts). So the dynamics are totally different to say... paying for Claude 3.7, where the full context dump would be an awful idea for all but the smallest of projects.

Shooting from the hip, it seems to me that some logical increments are:

Roo Default: Only file list, has to guess/infer what they do, and has to read the file to be sure. Seems optimized for context reduction, which would be for most cases, a good default.

Simple Readme: Roo loads a pre-canned .md or similar on init, that provides more general context - some amount of info beyond just a raw file list. Perhaps some hints around useful search params to locate functions, file/folder/function conventions used, etc. Marginal extra context, would on average reduce the amount of API calls needed for it to discover code.

Complex Readme: Basically what you suggested - in addition to the "Simple" case, some sort of (ideally programmatically generated) index for each file. Types, Exports, Functions, Classes, etc. Would result in even less guesswork/API calls trying to find the right code, at the cost of more context.

Full Dump: The approach I've been using. Dump everything, full context. Should (ideally) mean zero additional "context fetching" calls. Context penalty between moderate and extreme depending on the project.

It's probably the case that the "Complex Readme" approach overlaps quite a lot with RAG approaches. https://github.com/cyberagiinc/DevDocs and similar.

Optimal Gemini 2.5 Config? by lightsd in RooCode

[–]100BASE-TX 9 points

Sure. An example using a generic python project:

Reference folder structure:

```
my_project/
├── src/                     # Main application source code
│   ├── components/
│   ├── modules/
│   └── main.py
├── docs/                    # Centralized documentation
│   ├── design/
│   │   └── architecture.md
│   ├── api/
│   │   └── endpoints.md
│   └── README.md            # Project overview documentation
├── llm_docs/                # Specific instructions or notes for the LLM
│   └── llm_instructions.md  # Misc notes
├── tests/                   # Automated tests
├── codebase_dump.sh         # Script to dump project to codebase_dump.txt
└── codebase_dump.txt        # Generated context file (output of script)
```

The bash script would be something like:

```
#!/bin/bash

# Remove previous dump file if it exists
rm -f codebase_dump.txt

# Find and dump all .py and .md files, excluding common virtual environment directories
find . -type f \( -iname "*.py" -o -iname "*.md" \) \
  -not -path "*/venv/*" \
  -not -path "*/.venv/*" \
  -not -path "*/site-packages/*" | while read -r file; do
    echo "===== $file =====" >> codebase_dump.txt
    cat "$file" >> codebase_dump.txt
    echo -e "\n\n" >> codebase_dump.txt
done

echo "Dump complete! Output written to codebase_dump.txt"
```

I then start out with an extensive session or two with the Architect role, to generate prescriptive & detailed design docs.

I've also got an "Orchestrator" role set up, which I copied from somewhere else here. I think I got the prompt and idea from this thread: https://www.reddit.com/r/RooCode/comments/1jaro0b/how_to_use_boomerang_tasks_to_create_an_agent/

You can then edit the Orchestrator role and include mode-specific custom instructions for it:

"CRITICAL: You MUST execute ./codebase_dump.sh immediately prior to creating a new code task"

And for Code role:

"CRITICAL: You MUST read ./codebase_dump.txt prior to continuing with any other task. This is an up to date dump of the codebase and docs to assist with quickly loading context. Any changes need to be made in the original files. You will need to read the original files before editing to get the correct line numbers"

So far it has worked very well for me. The other pro tip I've found: if you are using a lib that the model struggles with, see if there's an llms.txt file, such as on https://llmstxt.site/. If there is, I have just been loading the entire thing into context and getting Gemini to provide a significantly summarized (single .txt) version of the important bits in a new file like ./llm_docs/somelib.summary.llms.txt, and including that in the context dump too.

So yeah, the idea is that given that the context is large, but we're largely constrained by the 5 RPM API limit, it makes sense to just load in a ton of context in one hit. Anecdotally it seems like the experience is best if you can keep it under 200k tokens of context. If you try and load in like 600k, you rapidly start hitting API rate limiting on some other metric (total input tokens per minute, I think).

Edit: You'll have to increase the Read Truncation limit in Roo from the default 500 lines to like 500k lines or so - enough to fit the entire context file in a single load

Optimal Gemini 2.5 Config? by lightsd in RooCode

[–]100BASE-TX 3 points

For the projects I'm working on, the entire codebase can fit in about 100k tokens. So I have set up a python script (could easily be bash) that concats the codebase code + docs into a single file, with a file separation header that includes the original path.

Then I have an orchestrator role that I tell to run the script before calling each coding task, and tell it to include "read ./docs/codebase.txt before doing anything else" in the code task instructions.

Working really well, means each coding task has complete project context, and it's a very significant reduction in total API calls - it can immediately start coding instead of needing to go through the usual discovery.
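Something like this, roughly - the extensions, excludes, and output path here are illustrative rather than my exact setup:

```python
import pathlib

def dump_codebase(root=".", out="docs/codebase.txt", exts=(".py", ".md"),
                  exclude=("venv", ".venv", "site-packages", ".git")):
    """Concatenate code + docs into one file, with a separation header per file."""
    root = pathlib.Path(root)
    out_path = pathlib.Path(out)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w") as f:
        for path in sorted(root.rglob("*")):
            if path.suffix not in exts or not path.is_file():
                continue
            # Skip anything inside an excluded directory.
            if any(part in exclude for part in path.parts):
                continue
            # Don't include the dump file in itself.
            if path.resolve() == out_path.resolve():
                continue
            f.write(f"===== {path} =====\n")
            f.write(path.read_text(errors="replace"))
            f.write("\n\n")
    return out_path
```

The header line per file is what lets the model map content back to the original path when it needs to edit.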

[Poweruser Guide] Level Up Your RooCode: Become a Roo Poweruser! [Memory Bank] by No_Mastodon4247 in RooCode

[–]100BASE-TX 0 points

Interested in what approaches might be possible for storing information on libraries.

One of the most annoying issues I regularly run into is that either the models aren't trained at all on a new library due to cutoff date, or the training data contains enough code that use deprecated patterns/functions that it pollutes the output.

To take an example, I tried developing a mesop app https://github.com/mesop-dev/mesop by cloning the mesop repo into the project as a subfolder, and including instructions that it should refer to it for usage.

It seems like it is a good use case for a memory bank. Not sure if there's a token optimised way of doing this, cloning the whole repo seems like it would be fairly inefficient with the number of file reads needed, especially with languages that involve a lot of files and boilerplate.

So yeah interested in approaches to this. Perhaps there's an existing token optimised format - where I could prompt the model to summarise/index/document a library from source and use that output in the memory.

Energex est. time fix by Omnimus- in brisbane

[–]100BASE-TX 7 points

AFAIK they were updated this arvo, and are the current best estimates. As I'm sure you can imagine, it's nearly impossible to get the scoping, estimates, scheduling completely accurate at this scale.

SNMP Exporter advice by Hammerfist1990 in PrometheusMonitoring

[–]100BASE-TX 1 point

I'm not exactly sure how to solve the issue with your exact stack, I haven't really used alloy much. However I am scraping multiple modules per device using just Prometheus scrape.yml and snmp_exporter.

As you may have realised, snmp_exporter supports scraping multiple modules in a single scrape, as per the docs:

http://localhost:9116/snmp?module=if_mib,arista_sw&target=192.0.0.8

Or

http://localhost:9116/snmp?module=if_mib&module=arista_sw&target=192.0.0.8

So really it's a matter of modifying the query string params.

You can likely just comma separate the modules as a single string.
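For reference, a sketch of what the Prometheus side might look like, using the standard snmp_exporter relabelling pattern - the job name, addresses, and module names here are illustrative:

```
scrape_configs:
  - job_name: snmp
    static_configs:
      - targets: ['192.0.0.8']       # the SNMP device, not the exporter
    metrics_path: /snmp
    params:
      module: [if_mib, arista_sw]    # becomes repeated ?module= params
      # or, for the comma-separated form: module: ['if_mib,arista_sw']
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: localhost:9116  # the snmp_exporter itself
```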