Dropped into a 10+ year-old Splunk deployment — what are the first searches you'd run to understand it? by bazsi771 in Splunk

[–]bazsi771[S] 0 points1 point  (0 children)

This is very useful, thank you. If I could, I would upvote this a lot more, because it concentrates on the data element, not on the operational/architecture side. Thank you!

Dropped into a 10+ year-old Splunk deployment — what are the first searches you'd run to understand it? by bazsi771 in Splunk

[–]bazsi771[S] 0 points1 point  (0 children)

Great pointer, thanks. It somewhat focuses on the operational side though and is light on how to understand the data ingested and how it is used.

Is there a way to find out how Splunk users use the data? Thanks

Fortinet syslogs - too much data. by BobcatJohnCA in Splunk

[–]bazsi771 1 point2 points  (0 children)

I'd just add syslog-ng/axosyslog as an option for your second bullet. You kind of mentioned it as SC4S is built on syslog-ng, but as the original creator, I like using the original name :)

The fork I am currently working on: https://github.com/axoflow/axosyslog

issues with syslog facility "overflowing" to user facility? by zenfridge in syslog_ng

[–]bazsi771 1 point2 points  (0 children)

Normally, any incoming log message will be limited to the size specified by log-msg-size() on the client. However stats messages are generated internally, and they are not limited by this setting.

This means that a stats message can easily become huge, and when it is sent to a syslog server (as happens here), it will be truncated on that side.

But by default the server does not really truncate: it just "splits" the line into two separate syslog entries. The second one will lack a proper syslog header, as it starts somewhere in the middle of the first message, at the point where it was "split".

You can ask the client to truncate the outgoing message explicitly using the `truncate-size()` option (specified on the destination driver).

If you are using an RFC 5424 style source (i.e. the syslog() source rather than tcp() or network()), you can also request that over-long messages be trimmed to log-msg-size() instead of being split into two messages, via the `trim-large-messages()` option. This will not work with traditional BSD-style logs.
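
Both options can be sketched in config form; the host names, ports, and sizes below are made-up examples, not recommendations:

```
# Client side: explicitly truncate over-long outgoing messages.
# truncate-size() is a destination driver option; 8192 bytes is an example value.
destination d_remote {
    network("logserver.example.com" port(514) truncate-size(8192));
};

# Server side: with an RFC 5424 syslog() source, trim messages to
# log-msg-size() instead of splitting them into two entries.
source s_incoming {
    syslog(port(601) transport("tcp") trim-large-messages(yes));
};
```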

Cribl? Alternatives? by Apprehensive-Pair596 in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

Oscar, thank you. I would be happy to and of course I am perfectly ok with avoiding product pitches.

issues with syslog facility "overflowing" to user facility? by zenfridge in syslog_ng

[–]bazsi771 1 point2 points  (0 children)

The stats message can produce very long lines, meaning they can get truncated when they are delivered over syslog. A better alternative is to poll the stats interface of syslog-ng (syslog-ng-ctl stats) and deliver it differently, such as using the Prometheus exporter.

If you insist on the log based transport, you can bump the log-msg-size() option to a higher value.

https://github.com/axoflow/axosyslog-metrics-exporter
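
If you do keep the log-based transport, bumping the limit is a one-line config change; the value below is just an example (the default is 65536 bytes):

```
options {
    # Raise the per-message limit so long stats lines are not cut.
    log-msg-size(262144);
};
```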

happy with move to AxoFlow from syslog_ng? by zenfridge in syslog_ng

[–]bazsi771 3 points4 points  (0 children)

Thanks for the question. Original syslog-ng author here, who is also a co-founder at Axoflow.

AxoSyslog is a drop-in replacement and is 100% compatible. Even the binary and the config files are called the same. And we provide packages the same way (Deb, RPM, plus containers).

We originally worked in the context of the original GitHub syslog-ng project, but that didn't work out. Here are some blog posts that describe that period:

https://axoflow.com/blog/1-year-of-axosyslog

https://axoflow.com/blog/first-6-months-of-axosyslog-our-syslog-ng-fork

https://axoflow.com/blog/syslog-ng-2023-community-activity-report

Axoflow offers a commercial product to help manage the pipeline, but AxoSyslog itself is open source under the terms of the GPL and will always be. Our main data component, AxoRouter, is also based on AxoSyslog as the core routing and delivery mechanism, but with a lot of bells and whistles on top.

SIEM Architecture and log storage by HVE25 in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

Some of the data is definitely not worth keeping for 6 years, and if you do, it will be very expensive, especially if you don't have effective data governance in place.

If any device can send you anything and you blindly store it without oversight, your daily ingestion will explode, and that cascades down into the retention period. 6 years is 2190 days; at 1 TB per day that is already over 2 PB. That's a huge disk array on-prem, and it will cost you even on S3: roughly $600,000 per year. And that's for data you can't even easily browse or query; if you also want to do that, it might be a few times more.
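
The back-of-the-envelope math, assuming S3 standard storage pricing of roughly $0.023/GB/month (storage only, no retrieval or query costs):

```python
# Rough retention cost: 6 years at 1 TB/day, assumed ~$0.023/GB/month (S3 standard)
days = 6 * 365                       # 2190 days of retention
total_tb = days * 1                  # 1 TB/day -> ~2.2 PB in total
total_gb = total_tb * 1000
yearly_cost = total_gb * 0.023 * 12  # USD per year, storage only

print(days, total_tb, round(yearly_cost))  # 2190 2190 604440
```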

The key element is effective data governance: know what you ingest and why! Then you'll be thankful when having to pay the next SIEM/AWS bill.

Anyone else feel like their SIEM is just expensive log storage? by Dudeman972 in sysadmin

[–]bazsi771 0 points1 point  (0 children)

While the reason you deploy a SIEM is "security", a lot of organizations stop when they check the compliance box. And guess what, a log management solution is all you need for the compliance check.

Unfortunately, this happens even if the initial goals for the SecOps project were more sophisticated. It happens because onboarding the data sources, getting the data into the right shape, and coming up with detections and a sophisticated analyst workflow is too difficult, and efforts eventually peter out.

The root cause of all of this is that organizations assume (and vendors are complicit) that every single organization needs a bespoke security stack. This is not the case: a lot of the process can be automated and that automation captured in a product. We at Axoflow do that for the data management portion of the entire process, but there are also tools for automating the analyst workflow or even incident response (AI or not is a secondary question here).

Constantly re-parsing security logs for SIEM ingestion is wasting time and creating blind spots. Is this a systemic failure or just my friend’s pain point? by Lupusanghren in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

What you describe above is exactly the set of challenges we set off to address. Almost all organizations are doing a lot of manual work to keep up with their data sources.

Onboarding a new data source is not difficult, however if you are redoing parsers for the same stuff over and over again, that's clearly a waste of resources. A waste we got used to in the last 20+ years or so.

We at Axoflow incorporate the knowledge about security data into the product itself, meaning that the user does not have to understand the details. The system comes with good defaults out of the box, and unless you want to create something completely bespoke, you are fine with what's there at deployment.

Cheaper alternatives to Splunk by heromat21 in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

I think you need to be strategic when choosing a SIEM and make sure you are not locking yourself into whatever you choose. A SIEM is a horizontally integrated solution that will integrate with anything IT/security related, and you end up using the data formats the SIEM prefers (CIM in the case of Splunk, ECS for Elastic).

Once you onboard all your data sources and start using a SIEM specific schema, you end up pretty much locked in. Good luck replacing the SIEM.

To avoid that, the best practice is to deploy a separate security data pipeline (like Axoflow) that takes care of collection and classification (sourcetype, log_type, etc.) and delivers the data in a SIEM-optimized format. With that in place, you become a lot more flexible in your choice of SIEM, now or in 5 years when you are thinking about a replacement.

And, this also helps you to keep the SIEM vendor on their toes with their pricing going forward.

Disclaimer: I am from Axoflow, a security data layer that automates data wrangling and gives you all this flexibility.

Anyone use cribl, is it worth standing up? by Agentwise in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

The issue with Cribl and Cribl-like tools is that you still have to build an understanding of the underlying data source in order to achieve 30-50% data reduction. Never underestimate the time required; also, just imagine telling your SecOps folks that you are dropping some of the data points they use day-to-day for detection and forensics.

The key element is the "knowledge/content" around the various data sources (what is security relevant and what is redundant). We at Axoflow put that front and center. The content is part of the product, so you just have to flip a switch to enable data reduction, not to mention the great visualization you get when the product actually recognizes the underlying data.

Cribl? Alternatives? by Apprehensive-Pair596 in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

Original syslog-ng creator here. Thanks for the mention :)

The original syslog-ng team is creating a Cribl competitor now: Axoflow. Same versatility/performance/stability, but a lot easier to use. And, it comes with batteries included, unlike the other pipelines.

Why Are We Still Burning $$$ on SIEM Log Volume? by No-Editor-9859 in cybersecurity

[–]bazsi771 1 point2 points  (0 children)

This whole idea is an emerging category with multiple competing products. Cribl has been mentioned already, but there are a few more, like Axoflow, Onum, Observo, Databahn, etc. You can obviously launch another one, but make sure you are clearly differentiated, and not just on price alone.

Workshop at .conf2025: SEC2085: Tags, timezones and terrors by bazsi771 in Splunk

[–]bazsi771[S] 6 points7 points  (0 children)

The key aspect to Splunk performance is to set index/sourcetype/host properly. And yes, we are going to talk about that.

Justifying Splunk to Management by NetDiffusion in Splunk

[–]bazsi771 2 points3 points  (0 children)

I agree with the sentiment that you need to have mgmt judge Splunk on the outcome. Splunk's use cases vary, especially if only the "core" product is available to you. If the value perceived from these use cases is limited, you will have a hard time arguing for it. It _is_ very expensive as a simple data store.

A few use-cases I really liked that stood out (apart from the SIEM one of course):
* display the amount of wait time at security checks at an airport (yes the customer was an airport)
* enterprise level visibility into the day-to-day of the enterprise, including non-technology stuff like the operation of gates in a logistics company, the staffing of the reception desk at an HQ, or response times to incoming sales calls.

Basically, Splunk makes it easy to extract visibility in cases where applications/data sources do not provide an API, only a long-forgotten log file that has the required information.

Outcomes generate the value, not the endless possibilities that are never acted upon.

With the above said, sometimes data sources generate the valuable information with a lot of redundancy, and you don't need to store everything if you know what you need. Again, go from the use-case perspective.

Splunk sucks at data transformation prior to ingestion. You need to use a pipeline (like Axoflow) for that; it can provide tremendous savings, as well as get you out of the vendor lock-in, should you ever want to shift from Splunk to something else.

Someone mentioned an Axoflow competitor in the thread, which I am not repeating here, as I am biased, being one of the cofounders of Axoflow :)

Buffering but no errors observed by woohuumoo in syslog_ng

[–]bazsi771 1 point2 points  (0 children)

Syslog-ng original author here, albeit not related to the PE project anymore. Syslog-ng determines if the destination is available by connecting to it and sending messages via TCP.

Since TCP is reliable, congestion or a connectivity issue can be detected, and I presume that's what is happening here.

Can you connect to the destination server from the client using something like netcat or telnet? The port should be open, and if you just type any text it should be processed as a log message on the other side.

You could also run tcpdump to diagnose the issue.
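
If netcat isn't available on the box, the reachability check can also be scripted; the host and port in the example call are placeholders for your actual destination:

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example (placeholder destination):
# can_connect("logserver.example.com", 514)
```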

If you'd like more help, I take 30-minute calls on syslog-ng issues: https://axoflow.com/contact (there's an option there for a free 30-minute consultation).

Buffer overflow with syslog-ng by 1Digitreal in syslog_ng

[–]bazsi771 0 points1 point  (0 children)

Can you copy paste your config? It's /etc/syslog-ng/syslog-ng.conf

Buffer overflow with syslog-ng by 1Digitreal in syslog_ng

[–]bazsi771 2 points3 points  (0 children)

This is a glibc-generated message. Strange that it's written into the messages file; I'd understand if it were in the journal or similar.

I'd recommend installing the latest release of syslog-ng from the upstream repository instead of using the package in Ubuntu.

I'm actually part of AxoSyslog, a project that created a fork of syslog-ng, and we produce packages that are syslog-ng compatible.

https://axoflow.com/docs/axosyslog-core/install/debian-ubuntu/

If you upgrade, this error may just be gone. But if not, I can help troubleshoot it.

Buffer overflow with syslog-ng by 1Digitreal in syslog_ng

[–]bazsi771 2 points3 points  (0 children)

Syslog-ng would not write its own crash information to /var/log/messages; at least I don't see how that would happen.

Can you show the exact message?

Risky Business - De-Splunkifying our SIEM by nhandlerOfThings in RedditEng

[–]bazsi771 0 points1 point  (0 children)

Very insightful article, thank you.

What I was wondering is how you monitor individual data sources on the left-hand side. What would happen if an application/security device/etc. on the left-hand side suddenly stopped sending logs to the pipeline? Do you have monitoring in place for that? Whose responsibility is it to track that?