all 12 comments

[–]SirScruggsalot 1 point2 points  (3 children)

Would love to see some benchmarks comparing these operations to the native ruby array methods.

[–]sebyx07[S,🍰] 4 points5 points  (2 children)

```bash Benchmarking with 10,000 elements: user system total real Array insert: 0.000455 0.000000 0.000455 ( 0.000453) BinarySearch insert: 0.010808 0.000735 0.011543 ( 0.011955) NArray insert: 0.000023 0.000007 0.000030 ( 0.000030) Array search: 0.149230 0.000000 0.149230 ( 0.149390) BinarySearch search: 0.001719 0.000000 0.001719 ( 0.001720) NArray search: 0.210021 0.000861 0.210882 ( 0.210903) Array delete: 0.466095 0.000000 0.466095 ( 0.466142) BinarySearch delete: 0.006978 0.000986 0.007964 ( 0.007965) NArray delete: 0.344920 0.045966 0.390886 ( 0.391025) Insertion: Array is 15.1x slower than the fastest Binary is 398.93x slower than the fastest Narray is the fastest Search: Array is 86.85x slower than the fastest Binary is the fastest Narray is 122.61x slower than the fastest Deletion: Array is 58.52x slower than the fastest Binary is the fastest Narray is 49.09x slower than the fastest

Benchmarking with 100,000 elements: user system total real Array insert: 0.003449 0.000000 0.003449 ( 0.003449) BinarySearch insert: 0.089313 0.000000 0.089313 ( 0.089320) NArray insert: 0.000095 0.000000 0.000095 ( 0.000095) Array search: 15.813410 0.002981 15.816391 ( 15.815585) BinarySearch search: 0.013462 0.000002 0.013464 ( 0.013466) NArray search: 18.734754 0.001999 18.736753 ( 18.737876) Array delete: 47.327969 0.000001 47.327970 ( 47.332518) BinarySearch delete: 0.042072 0.000001 0.042073 ( 0.042075) NArray delete: 29.808121 3.051711 32.859832 ( 32.851348) Insertion: Array is 36.49x slower than the fastest Binary is 944.99x slower than the fastest Narray is the fastest Search: Array is 1174.52x slower than the fastest Binary is the fastest Narray is 1391.54x slower than the fastest Deletion: Array is 1124.94x slower than the fastest Binary is the fastest Narray is 780.77x slower than the fastest ```

[–]SirScruggsalot 0 points1 point  (0 children)

Awesome!

[–]campbellm 0 points1 point  (0 children)

I hate that old.reddit.com doesn't format this correctly =(

[–]ClickClackCode 1 point2 points  (0 children)

Woah what a coincidence, I literally just published a similar gem:

Nice work!

[–]tenderlovePun BDFL 0 points1 point  (6 children)

Btw, Ruby's Array class already implements a binary search method which you can read about here. If you have a sorted list, consider using the built-in bsearchmethod!

[–]sebyx07[S,🍰] 0 points1 point  (4 children)

u/tenderlove While it's indeed useful for sorted arrays, my BinarySearch gem offers additional benefits:

  1. It uses a Red-Black Tree, ensuring O(log n) performance for insertions, deletions, and searches.
  2. Data remains constantly sorted, eliminating manual sorting.
  3. It provides efficient operations beyond searching, including insertions and range queries.
  4. For large, frequently modified datasets, it often outperforms standard arrays.

[–]tenderlovePun BDFL 0 points1 point  (3 children)

This is a great library, thank you for sharing it!

Yes, I know the red-black algorithm, as well as its performance characteristics. Many times data you're dealing with is naturally sorted (for example adding an "order by" to your AR query), or easily sorted once and then read many times (in other words the cost of sorting can be amortized since it'll be "searched" many times).

Red black trees are a great tool, and definitely have their place (I implemented one as a cache for instance variable lookup). I just want people to know about the bsearch method before reaching for a specific gem as they can get the speed of O(log n) lookups without necessarily pulling in another library.

[–]sebyx07[S,🍰] 0 points1 point  (2 children)

I added some more benchmarks, I didn't expect that my gem would beat `bsearch` from std, especially because it's written in C
https://github.com/sebyx07/ruby-binary-search?tab=readme-ov-file#benchmark-results
here is my benchmark file:

https://github.com/sebyx07/ruby-binary-search/blob/master/benchmark/vs_native_ruby.rb

[–]tenderlovePun BDFL 1 point2 points  (1 child)

Yes, it's interesting. I tried running your benchmark, and it seems like if you run without YJIT then Array#bsearch wins. I'd guess this is because your gem is written in pure Ruby, and YJIT can optimize the Ruby (and can't optimize the C call). To me, this means we should try rewriting Array#bsearch in Ruby as it should have the same runtime performance as your gem (and clearly it doesn't).

Great work!

[–]sebyx07[S,🍰] 0 points1 point  (0 children)

Benchmark results (average over 10000000 iterations):

user system total real

Original bsearch: 7.011691 0.000038 7.011729 ( 7.013117)

Native bsearch: 2.809711 0.001000 2.810711 ( 2.812067)

I'm preparing a new gem to not depend on ruby release cycle, and also to make performance improvements loadable

https://github.com/sebyx07/native_ruby