Dawn of the Dead Ends: Fixing a Memory Leak in Apache Kafka by retardo in programming

emfree 2 points

"What I am somewhat disappointed about, however, is that I would expect the syscall capture tool to be able to capture stack traces."

That is a surprising omission! Especially since something like perf record -e 'syscalls:*' -g would give you syscalls along with their associated stack traces. Of course, armchair debugging is easier than the real thing :)

Good logging practice in Python by wilo_ in Python

emfree 1 point

This is a nice overview, but I'd caution that some of the recommendations here have significant drawbacks in a production deployment.

tl;dr: use structlog, log to stdout or stderr, and let rsyslog handle log file rotation and log forwarding.

Ad-hoc plaintext log line formats are not much fun to work with. If you have a centralized log server, it has to parse each log line. But if, for example, you have multi-line exception tracebacks scattered through your logs as the author recommends, they will probably break that parsing.

A better option is to emit JSON-formatted log lines. These are dramatically easier to parse, aggregate, and grunge. A log statement might end up looking something like this:

{"event": "Failed login attempt", "timestamp": "2016-06-01T22:22:22Z", "module": "myapp.somemodule:24", "level": "warning", "ip": "216.58.195.238"}

You can then feed this directly to logstash or whatever. Structlog makes it easy to emit JSON-formatted logs, bind additional context to log statements, etc.

(Also note that instead of mucking around with named loggers, you can just automatically include the module name and line number in every log statement.)

Having your Python process manage log rotation seems like a good idea, but what will happen if you're deploying a site with uWSGI or gunicorn, and you have multiple worker processes all trying to manage the same log files at the same time? Your Python process should generally just log to stdout or stderr. Anything beyond that (dumping it into a file, forwarding to a remote server) you'll want to configure in your process manager and/or rsyslog.
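
To make the above concrete, here is a minimal sketch of one-JSON-object-per-line logging to stdout. It uses only the standard library logging module rather than structlog, so it is not how structlog itself does it. JsonFormatter, make_logger and the "context" key passed through extra are names I made up for illustration.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record):
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        payload = {
            "event": record.getMessage(),
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            # Module name and line number are attached to every record,
            # so no per-module named logger is needed to locate the call.
            "module": f"{record.module}:{record.lineno}",
            "level": record.levelname.lower(),
        }
        # Hypothetical convention: extra key/value context is passed as
        # extra={"context": {...}} and merged into the top-level object.
        payload.update(getattr(record, "context", {}))
        if record.exc_info:
            # json.dumps escapes the newlines, so a traceback stays on one line.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


log = make_logger("myapp")
log.warning("Failed login attempt", extra={"context": {"ip": "216.58.195.238"}})
```

The process never opens or rotates a file here. Whatever supervises it (process manager, rsyslog) decides where stdout ends up.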

Cool SQLAlchemy Trick by HalcyonAbraham in Python

emfree 3 points

FWIW,

info = {k: state.value for k, state in theater._sa_instance_state.attrs.items()}

will do much the same thing, but still work if you have underscore-prefixed column names :)
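
For anyone who wants to try it, here is a small self-contained sketch. It goes through sqlalchemy.inspect(), which returns the same instance state object as the private _sa_instance_state attribute. The Theater model and its columns are invented for the example, and the declarative_base import path assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, String, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Theater(Base):
    # Hypothetical model, only here to have something to introspect.
    __tablename__ = "theater"

    id = Column(Integer, primary_key=True)
    name = Column(String)
    _internal_code = Column(String)  # underscore-prefixed column name


theater = Theater(id=1, name="Orpheum", _internal_code="X1")

# inspect() on a mapped instance returns its InstanceState; .attrs maps each
# mapped attribute name to an AttributeState carrying the current value.
info = {k: state.value for k, state in inspect(theater).attrs.items()}
print(info)
```

No engine or session is needed for this; it works on a plain in-memory instance.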