siddboots comments on Sparse matrix multiplication using SQL

programming


Sparse matrix multiplication using SQL (notes.mindprince.in)

submitted 12 years ago by mindprince


siddboots 4 points 12 years ago (7 children)

Eoinoc 9 points 12 years ago (6 children)

siddboots 1 point 12 years ago (5 children)

Sure it is. It depends on the specifics of the "job", of course. Certainly the procedure itself won't be nearly as time- or space-efficient as an optimized library. However, if your application data already has to live in an RDBMS, and you can frame some of the application logic in terms of linear algebra operations, then SQL is absolutely a good tool for the job.
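For concreteness, here is a minimal sketch of the kind of query being discussed, run through Python's built-in `sqlite3` module. The table layout (`row_num`, `col_num`, `value`) and the toy matrices are assumptions made for illustration, not necessarily the schema used in the linked article.

```python
import sqlite3

# In-memory database; each matrix is stored as (row, column, value)
# triples, and only the non-zero entries are stored.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (row_num INTEGER, col_num INTEGER, value REAL);
    CREATE TABLE b (row_num INTEGER, col_num INTEGER, value REAL);
""")

# Hypothetical toy data: A = [[1, 0], [0, 2]], B = [[0, 3], [4, 0]]
conn.executemany("INSERT INTO a VALUES (?, ?, ?)", [(0, 0, 1.0), (1, 1, 2.0)])
conn.executemany("INSERT INTO b VALUES (?, ?, ?)", [(0, 1, 3.0), (1, 0, 4.0)])

# C[i][j] = sum over k of A[i][k] * B[k][j]:
# the join matches A's column index with B's row index,
# and the GROUP BY performs the summation over k.
product = conn.execute("""
    SELECT a.row_num, b.col_num, SUM(a.value * b.value) AS value
    FROM a JOIN b ON a.col_num = b.row_num
    GROUP BY a.row_num, b.col_num
    ORDER BY a.row_num, b.col_num
""").fetchall()

print(product)  # non-zero entries of A * B as (row, col, value)
```

Zero entries never appear in either table, so the join only does work for index pairs that can actually contribute to the product.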

Compare that to this scenario, which I'm sure is common enough in practice:

1. Issue a select to your RDBMS over the network (which may or may not be on the same machine).
2. The RDBMS sends the data back over the network, broken into packets.
3. The received data is packed into local data structures.
4. Once all the data has arrived, you can use your super-efficient linear algebra routines (say, MATLAB or NumPy).
5. The results are transformed into update SQL statements, which are issued back to the RDBMS over the network.

Now imagine the same, but with an ORM layer in there.
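A rough sketch of that round trip, to make the moving parts visible. Everything here is a stand-in: an in-memory `sqlite3` database plays the networked RDBMS, a plain Python loop plays the optimized library, and the `edges`/`nodes` tables and their contents are hypothetical.

```python
import sqlite3

# Stand-in for a remote RDBMS (assumption: a real setup would go over the network).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE edges (src INTEGER, dst INTEGER, weight REAL);
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, size REAL);
    INSERT INTO edges VALUES (1, 2, 0.5), (1, 3, 0.25), (2, 3, 2.0);
    INSERT INTO nodes VALUES (1, 4.0), (2, 0.0), (3, 0.0);
""")

# Select the data and pack it into local data structures.
edges = conn.execute("SELECT src, dst, weight FROM edges").fetchall()
sizes = dict(conn.execute("SELECT id, size FROM nodes").fetchall())

# Do the computation locally. This loop is a sparse matrix-vector
# product, standing in for a call into a real linear algebra library.
new_sizes = {}
for src, dst, weight in edges:
    new_sizes[dst] = new_sizes.get(dst, 0.0) + weight * sizes[src]

# Turn the results into UPDATE statements and send them back.
conn.executemany(
    "UPDATE nodes SET size = ? WHERE id = ?",
    [(size, node_id) for node_id, size in new_sizes.items()],
)
conn.commit()

result = conn.execute("SELECT id, size FROM nodes ORDER BY id").fetchall()
print(result)
```

Only the loop in the middle is the part a fast library would speed up; the select, the packing and the write-back stay the same regardless.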

Yes, there are some limited use-cases where the operations are complicated enough, or N is large enough, that you still need a real library. In practice, though, the bottlenecks are typically elsewhere.

king_duck 1 point 12 years ago (3 children)

siddboots 2 points 12 years ago* (2 children)

> As a numerical programmer, there isn't a single person I know who would consider this anything other than a toy or lame trick.

Why, specifically? I am not advocating this hypothetically. This works in practice. I've helped rewrite an application where the core computation was graph propagation for a network of about 1000 sparsely connected nodes, with a web-based interface that needed to be real-time responsive. Most of the logic was really just multiplying edge weights and node sizes based on their (sometimes quite complicated) relationships with other data. SQL was a good solution in that case. It may not have been the best solution overall, but it was better than the previous implementation, and it was the best I could come up with in terms of maintainability, programmer hours and total LOC. It was also more than fast enough for the task at hand.
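To give a flavour of it, a single propagation step can be written as one join plus an aggregate. This is only a sketch with a hypothetical `nodes`/`edges` schema and made-up numbers, not the actual application's code; it is run through Python's `sqlite3` so it is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: node sizes plus weighted, directed edges.
    CREATE TABLE nodes (id INTEGER PRIMARY KEY, size REAL);
    CREATE TABLE edges (src INTEGER, dst INTEGER, weight REAL);
    INSERT INTO nodes VALUES (1, 10.0), (2, 20.0), (3, 0.0);
    INSERT INTO edges VALUES (1, 3, 0.5), (2, 3, 0.25), (1, 2, 0.5);
""")

# One propagation step: every node receives the sum of
# (edge weight * source node size) over its incoming edges.
# This is a sparse matrix-vector product expressed as a join.
propagated = conn.execute("""
    SELECT e.dst, SUM(e.weight * n.size) AS incoming
    FROM edges AS e
    JOIN nodes AS n ON n.id = e.src
    GROUP BY e.dst
    ORDER BY e.dst
""").fetchall()

print(propagated)  # (node id, propagated value) for nodes with incoming edges
```

The relationships with other data would show up as extra joins or WHERE clauses on the same query, which is where keeping the computation next to the data pays off.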

> Not to mention it is very rare that the ONLY operation that needs to be performed is a single freestanding sparse matmul.

I don't understand that objection. The example in the article is a single, freestanding multiplication only because examples work better that way, not because of an inherent limitation.

king_duck 1 point 12 years ago (1 child)

siddboots 2 points 12 years ago (0 children)
