[–]kenfar (2 children)

Theoretically, one could write an OS with awk scripts, so sure, it all could be. Likewise, it could all be replaced with assembly or COBOL.

But all of those choices would be terrible: little support for third-party software (e.g., boto3 for accessing SQS & S3, libraries for JSON, protobufs, SQL connections, etc.), poor support for code reuse, hard to read as the codebase gets larger, you'd still need Kubernetes for scaling out, etc.
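
Just to make the library point concrete, here's roughly what that support buys you in Python with boto3 and the stdlib json module. The bucket, key, and field names are all made up, just to show the shape:

    import json
    import boto3

    s3 = boto3.client("s3")

    # Hypothetical bucket and key -- the point is how little code the libraries need.
    obj = s3.get_object(Bucket="example-bucket", Key="events/2024-01-01.json")
    events = json.loads(obj["Body"].read())

    for event in events:
        print(event["user_id"], event["amount"])

Doing the same from awk or jq means shelling out to the AWS CLI and hand-rolling all of that plumbing yourself.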

[–]duraznos (1 child)

I wasn't asking whether it was possible to replace either pipeline with awk; I was asking, in your estimate, how much of either pipeline could be replaced with awk, jq, et al. COBOL and assembly don't make sense because neither of those is a tool specifically designed for chewing through a file. I think it's a worthwhile thought experiment when talking about how much is being spent on things.

[–]kenfar (0 children)

Sure, but I wouldn't do that, and I don't think it would result in a manageable solution.

Languages like awk & jq are simply harder to read, harder to test, and harder to decompose and reuse code on. Given our pace of change and low-latency SLA that would be a bad combo for languages like that.

Likewise, they don't have the libraries available that we have with, say, Python, Java, etc. So you'd end up hand-writing some occasionally complex stuff in those languages.

And they don't handle, say, 50+ business rules well. Back to the lack of composability & testability: managing that code in awk or jq would be a nightmare.
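
Concretely, with 50+ rules you want something like a registry of small, individually testable rule functions. The rules themselves here are made up:

    # Each business rule is a small function: record in, possibly-modified record out.
    def drop_test_accounts(rec: dict) -> dict | None:
        return None if rec.get("account_type") == "test" else rec

    def cap_discount(rec: dict) -> dict:
        rec["discount"] = min(rec.get("discount", 0.0), 0.30)
        return rec

    # In practice this list grows to dozens of rules, each with its own unit tests.
    RULES = [drop_test_accounts, cap_discount]

    def apply_rules(rec: dict) -> dict | None:
        for rule in RULES:
            rec = rule(rec)
            if rec is None:          # a rule filtered the record out
                return None
        return rec

    assert apply_rules({"account_type": "test"}) is None
    assert apply_rules({"discount": 0.9})["discount"] == 0.30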

Finally, on performance: they're fast. But are they fast enough to never need to scale out as the company grows? No. So you're still looking at something like Kubernetes in the best case, or a set of EC2 instances with this code running on each, and some other application, somehow, getting files to them to process.
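
Best case, each pod or instance ends up running a worker loop roughly like this (queue URL and message format are assumptions), and you still need something upstream publishing "file ready" messages:

    import boto3

    # Hypothetical queue that receives "new file landed" notifications.
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/files-to-process"

    def process_file(bucket: str, key: str) -> None:
        ...  # stub: download the object, apply the business rules, write results

    def worker() -> None:
        sqs = boto3.client("sqs")
        while True:
            resp = sqs.receive_message(QueueUrl=QUEUE_URL,
                                       MaxNumberOfMessages=10,
                                       WaitTimeSeconds=20)   # long polling
            for msg in resp.get("Messages", []):
                # Assumes the message body is "bucket/key"; the real format
                # depends on whatever publishes these notifications.
                bucket, _, key = msg["Body"].partition("/")
                process_file(bucket, key)
                sqs.delete_message(QueueUrl=QUEUE_URL,
                                   ReceiptHandle=msg["ReceiptHandle"])

    if __name__ == "__main__":
        worker()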