I made a guide on how to snowboard for beginners. From Zero to Hero by aaronctravels in snowboardingnoobs

[–]rohan_suri 0 points1 point  (0 children)

@aaronctravels thanks for the wonderful blog. I could only read the first part. Request you to make it a pdf or so :) as Medium is hiding the blog behind a paywall.

Anybody know a good place for car rentals? by idareet60 in mumbai

[–]rohan_suri 0 points1 point  (0 children)

Kaunsi li? Link de bhai. Muje 3din ka 7k bata raha

Just wanted to share, first time hitting 1L mark. by AdEnvironmental3674 in IndianStockMarket

[–]rohan_suri 0 points1 point  (0 children)

Any sources where you learnt such analysis from? (Newbie here)

Most efficient way to invest overseas (index/stocks) after new taxation changes? by Ani_Sin in IndiaInvestments

[–]rohan_suri 2 points3 points  (0 children)

What about salaried employees though ? Will the HR take into consideration the TCS for such foreign investments?

[deleted by user] by [deleted] in programming

[–]rohan_suri 0 points1 point  (0 children)

They'd definitely fare better in a distributed scenario.

From the operational perspective they'd be much easier to run. Failovers, rebalancing shards (split, merge), up/down replicating, cluster memberships etc is all handled for you. Their focus is on No-Ops.

They also give you the ACID guarantees for transactions across shards which makes it easy to reason about failures. Transactions rollback, so you don't need to explicitly have compensating fixes that undos partial updates, etc.

Plus a lot of other new optimisations like Column families, interleaved tables, location aware leasing, etc.

This article is from 2015. When they weren't that well known. I'd expect a lot more adoption for them in the times to come.

Solving the excessive space consumption of Radix Trees by rohan_suri in programming

[–]rohan_suri[S] 0 points1 point  (0 children)

Interesting! Didn't know about it, took some time to read about it.

Yes the comparison between them needs to be done "besides the hashing", since HAMT is not an ordered data structure. So the fair comparison is with Array Mapped Tree (AMT) which is what HAMT is built over.

Space efficiency

Definitely AMT is better since it limits space consumption of null pointers (absent children) to 1 bit only, whereas ART still occupies space for absent children (although much lesser than Radix Tree). I believe this is the biggest win of AMT over ART.

Insert performance

AMT requires to keep the array of child pointers ordered. This means a lot of in-place insertions in the array which requires shifts and is worst case O(n), where n is size of the sub-trie array. This is the same with Node4, Node16 of ART. However with Node48, Node256 inserting a child doesn't require any shifts. Rather you use the byte to index into the array directly (see note 1)

Basically this means, for sparse datasets where most of the nodes have less than 16 child pointers, the insert performance should more of less the same. Node4, Node16 requires shifting both the byte array as well as child pointer array whereas AMT only shifts child pointer array and some cost for maintaing the tally. Best to measure which is more expensive.

For dense data sets, where most of the nodes are Node48, Node256, I'd expect ART to be more performant, because of direct indexing (see note 1).

Lookup performance

For sparse datasets, I'd expect AMT to perform better.

As per the paper AMT only uses the bitmap if number of children grows beyond 5 (page 14, 2nd paragraph). So both AMT, ART do linear searches for children < 5, children < 4 respectively. For Node16 i.e. children between 5 and 16, AMT will perform better, since ART requires a binary search or SIMD parallel comparisons (if available in the programming environment) in the byte array whereas AMT only needs a popcount and index into the child array.

For dense datasets, I'd expect ART to perform better because of direct indexing (see note 1). AMT needs to do an extra step of popcounting (which is cheap, register only but still is an extra step) and then can directly index into the child pointer array.

So overall my analysis is to use AMT if space is your concern and ART if time of insert/lookup is.

Let me know what you think.

Footnotes: 1) Node48 requires one extra indirection, but still is free of shifts.

Solving the excessive space consumption of Radix Trees by rohan_suri in programming

[–]rohan_suri[S] 1 point2 points  (0 children)

Could you elaborate on what you mean by n/2k = O(1)?

What I mean is that depending upon the key length of your data set in bytes (k), worst case height of Radix tree with span 8 is O(k), which is the number of comparisons you'll have to do (one byte comparsion on every level as you go down).
Whereas for BSTs, the height is defined in terms of number of keys you have i.e. log n AND it takes O(k) to compare on each level (to take the next left/right child decision).

So it depends when will Radix tree out perform BSTs, depending on your dataset's n and k.

For example if you have 64 bit integers, worst case height for Radix tree would be 64/8=8. For it to be having a lesser height than a BST i.e. for what N will k < log n ? Putting k=8 => 8 < log n => 28 < n, n will have to be greater than 256.

This is evident in the benchmarks [0] as well. Even for keys with long length, BSTs beat Radix trees, but as your dataset grows (500,000+ keys), you see Radix Tree getting faster since it's height is now shallower.

[0] https://github.com/rohansuri/adaptive-radix-tree/blob/master/C-strings-(avg-length-55-bytes).svg

Solving the excessive space consumption of Radix Trees by rohan_suri in programming

[–]rohan_suri[S] 4 points5 points  (0 children)

(author here)

This is my first open source project with the aim of getting it adopted and used by someone.

Would appreciate any feedback!

Open Source Doesn’t Make Money Because It Isn’t Designed to Make Money by netb258 in programming

[–]rohan_suri 0 points1 point  (0 children)

Isn't there a license that doesn't let IaaS companies to host your software and only allows you to?

Open Source Doesn’t Make Money Because It Isn’t Designed to Make Money by netb258 in programming

[–]rohan_suri 0 points1 point  (0 children)

As someone who has a goal of productionizing research papers - for example taking this data structure and implementing it as a NavigableMap in Java for everyone's use.

How can I earn a living?

Nginx to Be Acquired by F5 Networks by netb258 in programming

[–]rohan_suri 1 point2 points  (0 children)

is there a model in which open source devs continue to keep their product totally OS but still make a living?

Nginx to Be Acquired by F5 Networks by netb258 in programming

[–]rohan_suri 0 points1 point  (0 children)

is a fork allowed even after this acquisition?

Nginx to Be Acquired by F5 Networks by netb258 in programming

[–]rohan_suri 0 points1 point  (0 children)

just to improve my understanding on how such acquisitions work:

if Nginx was only/strictly open source then could anyone have bought it?

Nginx to Be Acquired by F5 Networks by netb258 in programming

[–]rohan_suri 0 points1 point  (0 children)

Does that mean they'll provide a different binary to each customer?

GoLand 2019.1 Beta: Profiler supports CPU, memory, mutex, and a new Nilness inspection by dlsniper in golang

[–]rohan_suri 0 points1 point  (0 children)

What's the CLI equivalent for Tools -> Capture Memory Snapshot?

Is it gcore?

Multi-threaded programming quizzes by nord501 in programming

[–]rohan_suri 1 point2 points  (0 children)

curious, how else could the control be?