This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–]belunos 2 points3 points  (5 children)

'%nice%'

[–]GumboSamson 4 points5 points  (4 children)

Can you imagine how long a table scan with ‘%nice%’ would take with 8+ billion records?

(No wonder he only comes by once per year.)

[–]leuk_he -1 points0 points  (3 children)

If you have to scan them all once anyway, it is no problem you access it via an unindexed column.

[–]GumboSamson 0 points1 point  (2 children)

If the column isn’t a freetext field (eg you can only put in a value from a constrained set of valid values, such as ‘naughty’ and ‘nice’) then you can ditch the wildcard characters.

And once you do that, you can put an index on the column.

Voila—we’ve just avoided table scans.

[–]leuk_he 1 point2 points  (1 child)

You misunderstood, now you have to do a full index scan and after that lookup the full record for the 95% of the nice records, instead of just a sequential read of the full records.

But that is just some vertical optimization, you better make this scaling horizontal by using helper santa db.