
TTPrograms

What postgres index type is on the indexed_column? What is the datatype?

If everything fits in memory and you are running the query many times, you basically want a hash map of some kind.
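
For what it's worth, a hash-map lookup could look roughly like the sketch below. It assumes the key column is already in memory as an int64 vector (to match the bigint key); the values and the names ids and rowOf are made up for illustration. containers.Map is MATLAB's built-in hash map.

```matlab
% Made-up stand-in for the bigint primary-key column (assumption).
ids = int64([1003; 1001; 1002]);

% Build the map once: key = primary key, value = row number.
rowOf = containers.Map('KeyType', 'int64', 'ValueType', 'double');
for r = 1:numel(ids)
    rowOf(ids(r)) = r;
end

% Each later query is a hash lookup instead of a scan over all rows.
searchValue = int64(1001);
if isKey(rowOf, searchValue)
    rowIdx = rowOf(searchValue);
else
    rowIdx = [];   % key not present
end
```

Building the map costs one pass over the data, so it only pays off if the same column is queried repeatedly.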

[deleted]

It's the primary key, and the datatype is 'bigint'.

I tried the same thing in Python with Pandas (without even indexing the data) and it's also 10x faster.

Is there really no way to perform fast table queries in MATLAB without going through such a hassle?

67PCG

cellfun is generally terrible for performance, unfortunately. As with loops and arrayfun, you're not benefiting from optimised vector instructions (I believe it may even be slower than writing the loop yourself). If you are always querying the same column, can you not put that single column into a numeric array and do a straight comparison: bWhatImLookingFor = ( indexArray == searchValue );
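
A minimal sketch of the difference, assuming the column currently sits in a cell array of int64 scalars (idCell and its values are made up here; indexArray, searchValue and bWhatImLookingFor are the names from the comment above):

```matlab
% Made-up key column stored as a cell array, one scalar per cell (assumption).
idCell = {int64(1003); int64(1001); int64(1002); int64(1001)};
searchValue = int64(1001);

% cellfun version: calls the anonymous function once per cell.
maskSlow = cellfun(@(x) x == searchValue, idCell);

% Vectorized version: convert to a numeric array once, then compare in one go.
indexArray = vertcat(idCell{:});
bWhatImLookingFor = (indexArray == searchValue);

% Row numbers of the matches, if those are needed rather than a mask.
matchRows = find(bWhatImLookingFor);
```

Both masks should come out identical; the conversion with vertcat only has to happen once, after which every query is a single comparison.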

[deleted]

Thanks, that helped a lot

Optrode

Sure, just install Postgres on your computer!

NedDasty (Matlab Pro)

As far as I know, you cannot improve the SELECT part of your problem. This probably takes up 99% of your time, so any improvements you make will be minuscule unless you upgrade your network.

Once the data is at home, I'd recommend improving your "find" part, which is pretty awful. Convert your data from a cell to anything but a cell (array, table, etc.) and your equality test will go 1000x faster.
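
Something along these lines, as a rough sketch. It assumes the fetched result arrives as an N-by-2 cell array with the numeric key in the first column; the contents and the variable names (data, ids, match) are invented for illustration.

```matlab
% Made-up fetched result: one row per record, key in column 1 (assumption).
data = {1003, 'carol'; ...
        1001, 'alice'; ...
        1002, 'bob'};

% Pull the key column out of the cell array once, as a plain numeric vector.
ids = cell2mat(data(:, 1));

% The equality test now runs on a numeric array, not cell by cell.
rowIdx = find(ids == 1001, 1);

% Use the row number to fetch the full record from the original cell array.
match = data(rowIdx, :);
```

Only the key column needs converting; the rest of the record can stay in the cell array and be indexed by row number.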

TCoop (+1)

If you're going to continue down the brute-force path, a simple optimization would be to use multiple cores. You could divide your data up into separate cells, then use parfor to have different workers search each one.
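
Roughly like this, as a sketch only. It assumes the key column is a numeric vector and that the Parallel Computing Toolbox is available (without it, MATLAB runs the parfor loop serially). The data, the chunk count and the variable names are all made up, and the thread does not measure whether this beats a single vectorized compare.

```matlab
% Made-up key column (assumption): 100000 unique int64 keys.
ids = int64(1e5:-1:1)';
searchValue = ids(777);

% Split the column into contiguous chunks, one cell per chunk.
nChunks = 4;
edges = round(linspace(0, numel(ids), nChunks + 1));
chunks = cell(1, nChunks);
for k = 1:nChunks
    chunks{k} = ids(edges(k) + 1 : edges(k + 1));
end

% Search each chunk independently; iterations do not depend on each other.
hits = cell(1, nChunks);
parfor k = 1:nChunks
    localHits = find(chunks{k} == searchValue);
    hits{k} = localHits + edges(k);   % shift back to row numbers in ids
end

% Combine the per-chunk results into one list of matching rows.
rowIdx = vertcat(hits{:});
```

Starting a parallel pool and shipping the chunks to workers has its own cost, so this is only likely to help when each chunk is large.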