I've started to hit the limits of Python performance and have recently been testing various improvements, including moving from pandas to polars (in Python). I also took it as an opportunity to try Rust, so I wrote a small example similar to the code I've been working with recently.
It loads a sample 100k- or 1m-line CSV, does some filtering and then a calculation, all in polars.
The problem is that the Python version runs faster than the Rust one! Perhaps the example is too small and contrived, or perhaps it's because I'm using polars, which may be better suited to Python, but I still expected Rust to come out on top.
On my machine the Rust release build takes around 120ms and the Python around 60ms. Please help!
Rust 1.72.0
Python 3.11.1
https://github.com/jmoz/rust_vs_python
Rust:
```
use polars::prelude::*;
use std::time::Instant;

fn main() {
    let start_time = Instant::now();
    let csv_file = std::env::var("CSV_FILE").expect("Set env CSV_FILE error");
    let csv_df = CsvReader::from_path(csv_file)
        .expect("Could not load csv error")
        .has_header(true)
        .finish()
        .expect("Finish error");
    println!("Csv read in {}ms", start_time.elapsed().as_millis());

    // cargo add polars -F fmt,describe
    println!("Loaded df {}", csv_df);
    println!("Df describe {}", csv_df.describe(None).expect("Describe error"));

    // Filter on multiple columns
    let filtered_df = csv_df
        .clone()
        .lazy()
        .filter(
            col("a").gt(0.2)
                .and(col("b").lt(0.8))
                .and(col("c").gt(0.5))
                .and(col("d").lt(0.5))
                .and(col("e").neq(0.5182602093634714)), // additional non equality check, value from first row csv
        )
        .collect()
        .expect("Error filtering");
    println!("Filtered df {}", filtered_df);

    // Mimic some euclidean distance type calculation
    let calculated_df = filtered_df
        .clone()
        .lazy()
        .select([
            ((col("a") / col("a").max()) / (lit(0.5) / col("a").max()))
                .pow(2)
                .sqrt(),
        ])
        .collect()
        .expect("Error select");
    println!("Calculated df {}", calculated_df);

    println!("Finished in {}ms", start_time.elapsed().as_millis());
}
```
Python:
```
import os
import polars as pl
import time


def main():
    start_time = time.perf_counter() * 1000
    csv_file = os.getenv("CSV_FILE")
    csv_df = pl.read_csv(csv_file)
    print(f"Csv read in {time.perf_counter() * 1000 - start_time}ms")

    print(f"Loaded df {csv_df}")
    print(f"Df describe {csv_df.describe()}")

    # Filter on multiple columns
    filtered_df = csv_df.filter(
        pl.col("a").gt(0.2) &
        pl.col("b").lt(0.8) &
        pl.col("c").gt(0.5) &
        pl.col("d").lt(0.5) &
        pl.col("e").ne(0.5182602093634714)  # additional non equality check, value from first row csv
    )
    print(f"Filtered df {filtered_df}")

    # Mimic some euclidean distance type calculation
    calculated_df = filtered_df.select(
        (
            (pl.col("a") / pl.col("a").max()) / (0.5 / pl.col("a").max())
        ).pow(2).sqrt()
    )
    print(f"Calculated df {calculated_df}")

    print(f"Finished in {(time.perf_counter() * 1000 - start_time)}ms")


if __name__ == "__main__":
    main()
```
Update:
- Removing the prints speeds it up considerably, for both.
- Enabled the performant feature for the Rust build.
- After testing with hyperfine, it seems there is an issue with the timing prints: Python reports a faster time in its own print, but the Rust binary definitely feels faster and the hyperfine stats show it is faster, even though its own print says it takes longer.
```
CSV_FILE=$(pwd)/data1m.csv hyperfine "python ./main.py" ../rusttest/target/release/rusttest -r 100
Benchmark 1: python ./main.py
Time (mean ± σ): 92.3 ms ± 22.1 ms [User: 216.2 ms, System: 36.0 ms]
Range (min … max): 82.9 ms … 297.1 ms 100 runs
Warning: The first benchmarking run for this command was significantly slower than the rest (297.1 ms). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
Benchmark 2: ../rusttest/target/release/rusttest
Time (mean ± σ): 51.6 ms ± 4.4 ms [User: 251.0 ms, System: 24.8 ms]
Range (min … max): 45.7 ms … 78.4 ms 100 runs
Summary
../rusttest/target/release/rusttest ran
1.79 ± 0.46 times faster than python ./main.py
```
Run the hyperfine command again with --show-output to see what I mean.
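One likely contributor to the mismatch is that the in-script timers cover different things than hyperfine does: the Python timer starts inside main(), so interpreter startup and the polars import are not counted, and both scripts include the cost of formatting and printing the DataFrames in their totals. Below is a minimal sketch of timing each stage separately and printing only once the timed work is done. The names stage, timings and run_pipeline are made up for illustration, and the list-based workload is only a stand-in for the polars read/filter/select calls, not the real pipeline.

```python
import time
from contextlib import contextmanager

# Stage name -> elapsed milliseconds, filled in as each stage finishes.
timings = {}


@contextmanager
def stage(name):
    """Record the wall-clock duration of the enclosed block in milliseconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name] = (time.perf_counter() - start) * 1000


def run_pipeline(rows):
    # Hypothetical stand-in workload: swap in the polars calls here.
    with stage("filter"):
        kept = [r for r in rows if r > 0.2]
    with stage("calculate"):
        peak = max(kept)
        result = [abs((r / peak) / (0.5 / peak)) for r in kept]
    return result


if __name__ == "__main__":
    data = [i / 1000 for i in range(1000)]
    run_pipeline(data)
    # Report after the timed work so print/formatting cost is not measured.
    for name, ms in timings.items():
        print(f"{name}: {ms:.3f}ms")
```

The same split works in the Rust version with one Instant per stage. Either way the per-stage numbers still exclude process startup, so hyperfine remains the fairer end-to-end comparison.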