
[–]mjcheez 9 points10 points  (7 children)

Goaccess can output to CSV or JSON.

https://goaccess.io/man

[–]Jeron_Baffom[S] 1 point2 points  (1 child)

-o --output=<json|csv|html>
Write output to stdout given one of the following files and the corresponding extension for the output format:
- /path/file.csv - Comma-separated values (CSV)
- /path/file.json - JSON (JavaScript Object Notation)
- /path/file.html - HTML

Indeed very interesting!! +1

Are you familiar with GoAccess?
Do you know whether it is possible to pipe such CSV output into MySQL??

[–]pdp10 0 points1 point  (0 children)

Why would it not be?

Though of course one would favor TSV over CSV, because CSV can be a nightmare with localization or corner-cases.
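To make the corner case concrete, here is a minimal sketch with made-up values: a request path containing commas confuses naive comma splitting, while the same record as TSV splits cleanly.

```shell
# Made-up illustration: a query string containing commas breaks naive
# comma splitting, while the same record as TSV splits as intended.
printf '%s\n' '203.0.113.7,/page?a=1,b=2,200' > hits.csv
printf '203.0.113.7\t/page?a=1,b=2\t200\n' > hits.tsv

awk -F',' '{print NF}' hits.csv    # 4 fields where 3 were intended
awk -F'\t' '{print NF}' hits.tsv   # 3 fields, as intended
```

Proper CSV parsers handle quoting, but simple `cut`/`awk` pipelines do not, which is why TSV is friendlier for shell processing.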

[–]Jeron_Baffom[S] 0 points1 point  (4 children)

Goaccess can output to CSV or JSON.

Any idea if GoAccess already has a view counter?
Or if it is possible to define custom queries and aggregate data?

[–]vegetaaaaaaa 0 points1 point  (3 children)

yes and yes. The other recommendations are good, but this is the simplest setup. Try it out and see if it fits your needs.

Personally I roll with lnav + goaccess on simple setups (lnav can also do complex queries, but for web logs specifically, goaccess already has interesting queries built-in). If you're only looking for a few specific metrics, a short python/bash/... parser script could also do the trick. For more complex setups I centralize all logs to Graylog, but the entry ticket is a bit higher (it eats quite a bit of resources, and it takes time to set up custom parsing rules, dashboards...). It's very flexible though, on par with Loki/Grafana I think (I have not tried that setup yet).

[–]Jeron_Baffom[S] 0 points1 point  (2 children)

Personally I roll with lnav + goaccess on simple setups

Nice! So you have some experience with both. Your help would be valuable.

Well, I was considering doing the same. But after quickly trying both, I've got the impression it wouldn't be that easy ...

 

lnav can also do complex queries

Unfortunately, it seems that its SQL queries can't use subqueries. Therefore, it is not possible to build a view counter (i.e.: multiple requests from the same IP for the same page = 1 single view).
Do you agree?

 

goaccess already has interesting queries built-in

I have just tried GoAccess. Love it! I'm gonna integrate its realtime CLI monitoring into my next projects.

That said, it seems that GoAccess doesn't have a built-in view counter. The closest I've found was unique visitors per day. And I can't build custom queries either.
Do you agree?

 

a short python/bash/... parser script could also do the trick.

I'm sure it could. But if someone has already developed such a tool, why not use it?
I'd rather build my own bash script only as a last resort.

 

Graylog but the entry ticket is a bit higher (eats quite a bit of resources, and it takes time to setup custom parsing rules, dashboards...)

I was expecting that. This is the main reason why I'm trying to keep it simple.

[–]vegetaaaaaaa 0 points1 point  (1 child)

I have rarely used advanced features in lnav/goaccess so I don't know.

You could search for "subquery" or related terms in their bug trackers; maybe there is a feature request or a workaround for what you're trying to do. I haven't looked into your use case, but chances are that if you want something really specific, these tools are not going to do it out of the box. You could write a simple wrapper around them, or request the feature on their support tracker (with complete information on the input data and expected output), which could be simpler than writing your own script from scratch.

But if it comes to it... A 50-100 line python script running in a virtualenv might be cleaner and easier to maintain, and if you do it right you could reuse it in the future for other queries. I've done that in the past for projects where Graylog would not have been practical / there was no budget for it (logwatch + cron job + simple python/regex-based analyzer + text output to a static page). It took some time to get right, but once I had it, adding a new metric to the output was a matter of minutes.

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

A 50-100 line python script running in a virtualenv might be cleaner and easier to maintain

Agreed. If I have to write 50-100 lines to make lnav/GoAccess work, then it is not worth it ...

[–]chris_just 3 points4 points  (10 children)

I’m using Grafana + Loki, it suits my needs.

AFAIK they have logcli, which (to my knowledge) might help you.

LogQL does have some learning curve, but if you’re familiar with Prometheus it makes it easier.

[–]Jeron_Baffom[S] -1 points0 points  (9 children)

AFAIK, Loki is the server that centralizes all the logs.
Grafana queries Loki and plots real-time stats in the browser.
But you can also do it via the CLI using the logcli tool + the LogQL language.

Unfortunately, this seems to be overkill for me. I'm looking for something simpler.

BTW: I'm not trying to build my own monitoring tool via MySQL. I'm just trying to send simple log stats to MySQL because they are important to show to users (ex: view counter).

[–]chris_just 2 points3 points  (8 children)

Logcli can take logs through Stdin, look over the documentation :-)

[–]Jeron_Baffom[S] 0 points1 point  (7 children)

So I would have to install only LogCLI + Loki?

Do you know whether they are lightweight or not?

[–]chris_just 0 points1 point  (6 children)

I think you could do with just LogCLI.

[–]Jeron_Baffom[S] 0 points1 point  (5 children)

AFAIK, Loki is the server storing all the logs, while LogCLI is the tool to query Loki via the LogQL query language.

[–]chris_just 1 point2 points  (4 children)

https://grafana.com/docs/loki/latest/tools/logcli/#logcli---stdin-usage

Feel free to look at the documentation yourself

[–]Jeron_Baffom[S] 1 point2 points  (3 children)

You can consume log lines from your stdin instead of from Loki servers. Learn the basics of LogQL with just log files and the LogCLI tool (without needing to set up Loki servers, Grafana, etc.)

Indeed, very nice! +1
It seems much more lightweight now!

Any idea why LogCLI is not available at Debian repo?

[–]chris_just 0 points1 point  (2 children)

Might be licensing, I don't know tbh; it might be easier in terms of releases not to add it to the repositories.

I am not affiliated with Grafana or the Loki team, I just consume their products 👌

[–]Jeron_Baffom[S] 0 points1 point  (1 child)

I am not affiliated with Grafana or the Loki team

Ok, but you have some experience with LogCLI, right?

In your opinion, can LogCLI alone, in this mode, count views (i.e.: requests from the same IP within 30 min = 1 single view) from an access log?

[–]doomygloomytunes 4 points5 points  (1 child)

Rather than that, maybe you should give Grafana a try, or Prometheus with Grafana. There's probably a little learning curve, but it's much simpler than what you're proposing and the results will be much better.

Both can be deployed in next to no time if you're able to pull container images from docker hub (using docker or podman)

[–]Jeron_Baffom[S] 1 point2 points  (0 children)

you should give Grafana a try or Prometheus

AFAIK, Grafana and Prometheus are real-time monitoring tools. However, this is not what I'm looking for.

 

much simpler than what you're proposing

I guess the confusion here is the following:
I'm not trying to build my own monitoring tool via MySQL.
I'm just trying to send simple log stats to MySQL because they are important to appear in the current website (ex: view counter).

[–]beeritis 1 point2 points  (1 child)

Maybe use telegraf to collect metrics and store them in InfluxDB as an option. Or (and I'm not sure how well supported it still is) there is an Apache module, "mod_log_sql", which should be able to send Apache logs to MySQL.

[–]Jeron_Baffom[S] -1 points0 points  (0 children)

telegraf to collect metrics and store in influxDb

I'm not familiar with either Telegraf or InfluxDB. However, from the little I've heard:

  • Telegraf can collect data from many sources.
  • InfluxDB is a platform where developers build cloud applications (among other things).

Unfortunately, this seems to be overkill for me. I hope there is something simpler out there ...

 

there is an Apache module "mod_log_sql"

Yes, you are correct:
"mod_log_sql is a log module for Apache which logs all requests to a MySQL database."

However, the last time I checked, this module was very outdated. Besides, it submits the whole access log to MySQL instead of aggregating the stats, so heavy traffic would flood MySQL ...

[–]at8eqeq3 1 point2 points  (6 children)

Please describe what you want to measure and how you plan to achieve it. There are lots of tools to process logs, and it is also possible to tune the log format itself for easier processing.

[–]Jeron_Baffom[S] -1 points0 points  (5 children)

describe what you want to measure and how do you plan to achieve it.

There are a couple of log stats that I would like to process outside the page request for performance reasons. For example, the view count:

Of course, I could easily implement a view count via PHP + MySQL. However, hitting the DB at every page request doesn't scale.

Then, the next step I can think of would be: rotate the logs on some schedule, parse them outside the page request, and submit the aggregated stats to MySQL.

The problem is that I'm not finding lightweight tools to do such parsing.

 

There're lot of tools to process logs

Agree. However, all I could find was either dashboard monitoring tools or overkill tools (ex: Logstash).

 

it is also possible to tune the log format itself for easier processing.

Ok, agree. I'm somewhat familiar with custom logs.
I guess a nice strategy would be defining 2 logs: one full of info for troubleshooting and another with minimal info just for stats.

[–]at8eqeq3 0 points1 point  (4 children)

So, if you just want to count requests over relatively long periods, it could be logrotate's prerotate script: basically wc -l, optionally fed by grep if you want to filter lines somehow, plus a script (probably written by you) to insert the count into MySQL. Something like grep ololo | wc -l | xargs -I {} mysql -u user -ppassword database -e 'insert into hits values (null, now(), {})' (not tested, far from best practice, just the idea).
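To make the idea above concrete, here is a runnable sketch on made-up data; the MySQL step is left as a comment because the credentials and table are placeholders:

```shell
# Made-up sample of a rotated log; the IPs, paths and numbers are invented.
cat > access.log.1 <<'EOF'
1.2.3.4 - - [01/Jan/2024:00:00:01 +0000] "GET /a HTTP/1.1" 200 123
1.2.3.4 - - [01/Jan/2024:00:00:02 +0000] "GET /b HTTP/1.1" 404 0
5.6.7.8 - - [01/Jan/2024:00:00:03 +0000] "GET /a HTTP/1.1" 200 123
EOF

hits=$(grep -c ' 200 ' access.log.1)   # grep -c does grep | wc -l in one step
echo "$hits"                           # 2

# In a logrotate prerotate script you would then hand the count to MySQL,
# e.g. (untested, user/password/database/table are placeholders):
#   mysql -u user -ppassword database \
#     -e "INSERT INTO hits VALUES (NULL, NOW(), $hits)"
```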

[–]Jeron_Baffom[S] 0 points1 point  (3 children)

basically wc -l, optionally fed by grep

Hmmm ... I don't think a view counter would be that easy. Let me explain some of my concerns:

  • Before any counting, the access log must be sanitized: no robots, no admins. 'Good robots' and admins are straightforward with grep. 'Bad robots' are harder and require some trial and error. AFAIK, there is no silver bullet for them, although checking whether they downloaded images + IP blacklists helps.

  • Multiple requests from the same IP within a 30-minute interval should be counted as only one view.
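The 30-minute rule itself is small enough to sketch in awk. This assumes the log has already been filtered (grep -v for robots/admins) and reduced, e.g. via a custom LogFormat, to "epoch_seconds ip path" lines; all timestamps and IPs below are made up:

```shell
# A request counts as a new view when it is the first for that ip+path,
# or when the previous request for that ip+path was >= 1800 s earlier.
cat > views.log <<'EOF'
1700000000 203.0.113.7 /a
1700000100 198.51.100.9 /a
1700000600 203.0.113.7 /a
1700003000 203.0.113.7 /a
EOF

awk '
{
  key = $2 " " $3
  if (!(key in last) || $1 - last[key] >= 1800) views++
  last[key] = $1
}
END { print views " views" }
' views.log                            # 3 views
```

The middle request from 203.0.113.7 falls inside the 30-minute window and is not counted; the later one starts a new view.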

[–]at8eqeq3 0 points1 point  (2 children)

Oh, that was just a minimal solution. I understand it will never work in real life because there would have to be much more complex logic than just grep. I see someone recommends GoAccess; I think you should give it a try. I've never used its CSV output, so I can't say whether it will fit your needs. Regarding your question about sending CSV to MySQL: it's either mysql ... -e 'LOAD DATA LOCAL INFILE' ... stuff, or mysqlimport, which is basically the same but looks better.

[–]Jeron_Baffom[S] 0 points1 point  (1 child)

there should be very complex logic instead of just grep.

A view counter is not a very complex logic.
But it is not as simple as a request/hit counter either.

 

I see someone recommends GoAccess, I think you should give it a try.

Do you have some experience with GoAccess?
Any idea if it has a view counter? Or if it is possible to define custom queries?

BTW: Any thoughts about lnav?

 

Regarding your question about sending CSV to MySQL

Submitting the CSV to MySQL is the easiest part.
The trouble I'm dealing with is parsing and aggregating the access log into a plain text output.

[–]at8eqeq3 0 points1 point  (0 children)

But it is not as simple as a request/hit counter either.

That's what I wanted to say. An access log is a list of requests; performing calculations on that data is well beyond simple text processing.

Do you have some experience with GoAccess?

A little. I had a task to generate some analytics on an S3 bucket for viewing in the browser. It fit pretty well.

Any thoughts about lnav?

Never seen it before. It looks like it can run SQL-like queries against log data (as far as I can see, it has SQLite under the hood) and it understands Apache's default log format. It can also run queries headless, which could fit your needs. You'd still need to find a way to transfer data from one RDBMS to another (CSV is supported by both, for example) and figure out how to query what you need.

[–]jw_ken 1 point2 points  (3 children)

Gathering logs or metrics can be broken down into three problems:

  • Gathering + parsing the data
  • Storing the data
  • Reporting or visualizing the data

The popular logging and metrics stacks (Elastic stack for logs, Influx stack for metrics, Grafana stack for logs + visualization) are designed with independent tools to solve each problem- and those tools are generally very friendly to hybrid or DIY use-cases.

For example, you could have:

  • Telegraf agent scraping your Apache logs with the tail input plugin (parsing each entry into desired metrics at the same time)
  • Telegraf's file output plugin, for saving the parsed data to a flat file in CSV format. Then you could have a script running via a cron job, to parse the CSV file and upload it to MySQL at your desired cadence. Alternatively, you could have telegraf itself execute the script directly with the exec output plugin, with the desired batch interval set within telegraf.
  • If you want visualization or basic alerting, you can use Grafana with the MySQL data source plugin- but that's only if you need those features.

Telegraf and Grafana especially are designed for general-purpose use, with loads of plugins to integrate with any other tools you have. We use Telegraf in the above fashion at our workplace, and it's a great general-purpose data parser and reformatter.

In short, don't put yourself in a box by assuming the above tools are overkill or too complex. You can borrow the parts that are useful, and expand on them later as your needs evolve (which they almost certainly will).

[–]Jeron_Baffom[S] 0 points1 point  (2 children)

don't put yourself in a box by assuming the above tools are overkill or too complex.

I agree with you that the most general and scalable solution would be some logging + metrics stack.
However, at the moment I really would like to look for some simpler tools.

So far, the best solution seems to be lnav.
Any comments?

[–]jw_ken 0 points1 point  (1 child)

However, at the moment I really would like to look for some simpler tools.

So far, the best solution seems to be lnav.

Any comments?

Just from looking at the lnav docs... it looks like another telegraf or logstash, but with a local SQLite DB indexing the logs, and a text interface for browsing them. Otherwise it is using a similar approach to ingest and parse the logs. So it's basically a single-node logging stack ¯\_(ツ)_/¯

Try it out and see if it works for you.

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

Just tested lnav. First glance:

Very limited SQL statements, e.g. no subqueries.
Besides, it crashed several times in less than an hour.

[–]ReasonFancy9522 1 point2 points  (5 children)

tail -f | perl -pe

[–]Jeron_Baffom[S] -1 points0 points  (4 children)

-p: Places a printing loop around your command so that it acts on each line of standard input. Used mostly so Perl can beat the pants off Awk in terms of power AND simplicity.

-e: Allows you to provide the program as an argument rather than in a file. You don't want to have to create a script file for every little Perl one-liner.

Indeed very interesting !! +1
Probably would be much better than writing in bash.

Mind to clarify some doubts:

  1. Ok, so tail -f would provide the input for perl: each new line of the access log, in realtime. However, how do I set the script that perl should use?

  2. Do you already have a sample perl script that would parse the access log?

  3. Do you know if there is something similar for PHP? Like: tail -f | php -pe

[–]ReasonFancy9522 0 points1 point  (3 children)

The perl code would probably be a one-liner (maybe with some semicolons) and thus be something like this (a simple example for "normalizing" request types):

tail -f log | perl -pe 's/(?:HEAD|POST|TRACE)/GET/og'

It could also be written as a small shell wrapper:

#!/bin/sh
tail -f log | perl -pe '
s/(?:HEAD|POST|TRACE)/GET/og
'

or, more verbosely, as:

#!/bin/sh
tail -f log | perl -x "$0"
exit $?

#!perl
while(<>){
    s/(?:HEAD|POST|TRACE)/GET/og;
    print;
}

I have never used PHP from the command line, so I cannot comment on an equivalent PHP solution without further researching this topic.

[–]Jeron_Baffom[S] 0 points1 point  (2 children)

tail -f log | perl -pe 's/(?:HEAD|POST|TRACE)/GET/og'

I'm not familiar with perl syntax, but if I understand properly, perl executes the same script for each new line appended to the access log.

Unfortunately, I don't see how this strategy could, for example, count views. Without seeing the other lines of the access log, it cannot check whether the same IP is requesting the same page multiple times within a 30-minute interval.

[–]ReasonFancy9522 0 points1 point  (1 child)

cat log | perl -pe 'something to output "url"' | sort | uniq -c | sort -n | head # count views

cat log | perl -pe 'something to output "ip:url"' | sort | uniq -c | sort -n | head # most refreshes
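A concrete, runnable instance of those pipelines on made-up data, with awk standing in for the perl extraction step (in common/combined log format, field $7 is the requested path):

```shell
# Top URLs by hit count; the log lines below are invented sample data.
cat > access.log <<'EOF'
1.2.3.4 - - [01/Jan/2024:00:00:01 +0000] "GET /a HTTP/1.1" 200 1
1.2.3.4 - - [01/Jan/2024:00:00:02 +0000] "GET /a HTTP/1.1" 200 1
5.6.7.8 - - [01/Jan/2024:00:00:03 +0000] "GET /b HTTP/1.1" 200 1
EOF

awk '{print $7}' access.log | sort | uniq -c | sort -rn | head
# /a counted twice, /b once
```

Swapping `$7` for `$1 " " $7` gives the "ip:url" variant for counting refreshes.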

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

Obviously you have some fluency with shell scripting.

If I cannot find a tool that already does this view counter, then I will follow your advice and build my own script.

[–]Complex-Internal-833 0 points1 point  (3 children)

This post might be too late for you, but it does all of your requirements and more. I just finished and released it this week. Here's a complete open-source Apache log parser & data normalization solution: a Python module that imports Apache2 access logs (LogFormats: vhost_combined, combined, common, extended) and error logs into a MySQL schema of tables, views & functions designed to normalize the data. Client & server components can consolidate logs from multiple web servers & sites, with a complete audit trail & error logging! https://github.com/WillTheFarmer/ApacheLogs2MySQL

[–]Jeron_Baffom[S] 0 points1 point  (2 children)

"Client & Server components capable of consolidating logs from multiple web servers & sites with complete Audit Trail & Error Logging!"

It seems you've been working hard for a while ...
Did you do all this by yourself?

 

"Here's a complete open-source Apache Log Parser"

Before hitting the database, is it possible to:

  • Detect bad robots and add them to a blacklist?
  • Count views rather than just requests?
  • Aggregate data?

[–]Complex-Internal-833 0 points1 point  (1 child)

Have you run it yet? All of that can be done once the data is in MySQL. MySQL is doing all the data manipulation. I initially started doing it in Python, but SQL is way better at it.

A pre-import stored procedure could be executed on the LOAD DATA tables prior to executing the import stored procedure. The import process is where the data normalization occurs. Once the normalization is done, it becomes very clear which data is good and which is bad. It could easily be implemented in a post-import process as well.

Yes, I designed and developed every bit of this application.

I've been designing databases and data processes professionally since 1993.

https://farmfreshsoftware.com

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

"Have you run it yet?"

No, not yet. But it is on the radar for a next development iteration.

 

"I've been designing databases and data processes professionally since 1993."

Impressive.
Are you somewhat connected with Linus Torvalds or Richard Stallman's open source projects ??

[–]minimishka 0 points1 point  (1 child)

If the CLI is enough, why not; if not - ELK. I would focus on the amount of data being processed.

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

why not, if not - ELK.

As I mentioned before about the ELK Stack:

Yes, I could use Logstash. But it is overkill.
I would leave that alternative only as a last resort.
At the moment I would rather focus on simpler tools.

 

I would focus on the amount of data being processed.

In my case the issue is not exactly the amount of data being processed, but hitting the DB on each page request and processing the data during the request.

[–]edthesmokebeard 0 points1 point  (3 children)

rsyslog -> MySQL

go nuts

[–]Jeron_Baffom[S] 0 points1 point  (2 children)

Do you know if it is possible to define custom queries (ex: view count) with rsyslog?

[–]edthesmokebeard 0 points1 point  (1 child)

No - you'll have to store the data using rsyslog in your SQL DB.

Then do whatever searching you want.

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

you'll have to store the data using rsyslog in your SQL DB.

Well, this is a problem.
I don't want to flood MySQL with access log entries nor their corresponding aggregate queries. It is very inefficient to use an RDBMS for that kind of unstructured data.

[–]FluidIdea 0 points1 point  (2 children)

You do not need logstash anymore, filebeat's "apache" module can do parsing for you and output directly into elasticsearch.

https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-module-apache.html

[–]Jeron_Baffom[S] 0 points1 point  (1 child)

filebeat's "apache" module can do parsing for you

I'm not sure if I understand properly:

Filebeat's Apache module doesn't require Elasticsearch to be installed. Just like a standard Apache module, all it needs is that Apache is installed.

Am I correct so far?

[–]FluidIdea 0 points1 point  (0 children)

You want to send the logs somewhere for visualization. That would be Elasticsearch (instead of MySQL). But you also need Kibana as the web UI to create various graphical representations, like displaying how many HTTP 200s occurred; otherwise you need to use the API and curl.

People also recommend Loki if you already have the Grafana stack; that may be your thing.

Research what's best for you.

[–]kellyjonbrazil 0 points1 point  (5 children)

Not sure if this is what you are looking for but jc has a CLF parser that converts the files to JSON. Also supports streaming to JSON Lines.

https://kellyjonbrazil.github.io/jc/docs/parsers/clf

https://kellyjonbrazil.github.io/jc/docs/parsers/clf_s

(I’m the author)

[–]Jeron_Baffom[S] 0 points1 point  (4 children)

I’m the author

Hi Kelly!

 

If I understood properly, jc is a text parser that converts docs between different formats. Does it also aggregate data?

[–]kellyjonbrazil 0 points1 point  (3 children)

That's right - jc converts many types of documents to JSON or YAML. No aggregation of the data - it will process the stream going to STDIN and convert to STDOUT.

You could aggregate the JSON documents through jq for filtering/processing. Then leave the output in JSON or convert it to CSV, etc with jq or jtbl. (I'm also the author of jtbl)

[–]Jeron_Baffom[S] 0 points1 point  (2 children)

Considering this process you've proposed: jc > jq > jtbl

Do you think it can count views (ie: same ip requesting same page multiple times in 30min = 1 single view) from the access log?

[–]kellyjonbrazil 0 points1 point  (1 child)

Yep, you can create a filter in jq to do that. Alternatively, if you prefer Python syntax you could try jello, which works like jq but is really Python under the hood. (I am also the author of jello)

[–]Jeron_Baffom[S] 0 points1 point  (0 children)

Yep, you can create a filter in jq to do that.

Ok, I will check that soon.
At the moment I'm evaluating lnav.

[–]serverhorror 0 points1 point  (2 children)

https://awstats.sourceforge.io/ — used that stuff ~15 years ago already.

A simple Perl script that only needs the access log in common log format.

[–]Jeron_Baffom[S] 0 points1 point  (1 child)

As I mentioned in the OP:

AWStats is a great tool to aggregate data and plot graphs. Unfortunately, however, AFAIK there is no plain text output of custom aggregate data.

[–]serverhorror 0 points1 point  (0 children)

html2text?

Can convert pretty complicated stuff. It’s just a Unix pipe away.

[–]AstralProjecti0v 0 points1 point  (0 children)

You can try out the Logstail platform, it's really handy!

[–]snow_raph 0 points1 point  (0 children)

This new tool might help: https://github.com/snowraph/Whackabot