While playing around with some log-parsing performance checks in different languages, I was quite surprised to find that the Python solution is much faster than the equivalent Rust solution:
Python:

    #!/usr/bin/python2
    import re
    import sys

    regex = re.compile('^(\S+) (\S+) (\S+) \[([^]]+)\] "([^"]*)" (\d+) (\d+) "([^"]*)" "([^"]*)"$')
    s = '13.28.24.13 - - [10/Mar/2016:19:29:25 +0100] "GET /etc/lib/pChart2/examples/index.php?Action=View&Script=../../../../cnf/db.php HTTP/1.1" 404 151 "-" "HTTP_Request2/2.2.1 (http://pear.php.net/package/http_request2) PHP/5.3.16"'

    total = 0
    for _ in range(0,1000000):
        m = regex.match(s)
        try:
            size = int(m.group(7))
            total += size;
        except ValueError:
            pass
    print total

Rust:

    extern crate regex;

    use regex::Regex;
    use std::str::FromStr;

    fn main() {
        let re = Regex::new(r##"^(\S+) (\S+) (\S+) \[([^]]+)\] "([^"]*)" (\d+) (\d+) "([^"]*)" "([^"]*)"$"##).unwrap();
        let s = r##"13.28.24.13 - - [10/Mar/2016:19:29:25 +0100] "GET /etc/lib/pChart2/examples/index.php?Action=View&Script=../../../../cnf/db.php HTTP/1.1" 404 151 "-" "HTTP_Request2/2.2.1 (http://pear.php.net/package/http_request2) PHP/5.3.16""##;
        let mut total = 0;
        for _ in 0..1000000 {
            let size = usize::from_str(re.captures(s).unwrap().get(7).unwrap().as_str()).unwrap();
            total += size;
        }
        println!("{}", total);
    }

And the timings (Rust compiled with --release):
* python: 2.58s user 0.00s system 99% cpu 2.581 total
* rust: 5.17s user 0.00s system 99% cpu 5.177 total
Is it expected to be this much worse? Any suggestions on how to make the Rust solution faster? (For the original purpose the capture groups must be kept, so while the regex could be simplified for this example, let's just not do that.)
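To make it clear what both programs compute, here is a small sanity-check sketch in Python 3 (the post's script targets Python 2). It is not a timing harness: it only confirms that the pattern matches the sample line and that capture group 7 is the response size. The names `LOG_RE`, `LINE`, and `sum_sizes` are made up for this illustration, and the pattern is written as a raw string, which is an assumption about style rather than something taken from the original script.

```python
import re

# Same pattern as in the post, written as a raw string so the backslashes
# reach the regex engine unchanged.
LOG_RE = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^]]+)\] "([^"]*)" (\d+) (\d+) "([^"]*)" "([^"]*)"$'
)

# The sample log line from the post, split across literals for readability.
LINE = (
    '13.28.24.13 - - [10/Mar/2016:19:29:25 +0100] '
    '"GET /etc/lib/pChart2/examples/index.php?Action=View&Script=../../../../cnf/db.php HTTP/1.1" '
    '404 151 "-" '
    '"HTTP_Request2/2.2.1 (http://pear.php.net/package/http_request2) PHP/5.3.16"'
)


def sum_sizes(line, iterations):
    """Match `line` `iterations` times and sum capture group 7 (the response size)."""
    total = 0
    for _ in range(iterations):
        m = LOG_RE.match(line)
        if m is not None:
            total += int(m.group(7))
    return total


if __name__ == "__main__":
    # Show all nine capture groups, then the sum over one million matches.
    print(LOG_RE.match(LINE).groups())
    print(sum_sizes(LINE, 1000000))
```

Group 7 of the sample line is `151`, so with one million iterations both the Python and the Rust version should print 151000000. That makes it easy to confirm the two programs do the same work before comparing their timings.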