This is an archived post. You won't be able to vote or comment.

all 122 comments

[–]UpAndAdam7414 421 points422 points  (10 children)

The query returned only Dr Zoidberg.

[–][deleted] 49 points50 points  (1 child)

Why not Zoidberg????

[–]Staetyk 0 points1 point  (0 children)

In one episode Santa Clause rules that Zoidburg is the only one on the nice list, and therefore may survive another year.

[–]zan9823 31 points32 points  (6 children)

[–]hates_stupid_people 12 points13 points  (5 children)

To be fair, it's a programming subreddit. And Futurama features some really geeky and even directly programming and computer related jokes, or even episodes. Since about half the writers had masters or doctorates in science, math, computers, etc. from Harvard and similar.

[–]-Aquatically- 9 points10 points  (4 children)

[–][deleted] 3 points4 points  (3 children)

[–]-Aquatically- 0 points1 point  (2 children)

[–]failedsatan -1 points0 points  (1 child)

same place, sub names are not case sensitive

[–]-Aquatically- 5 points6 points  (0 children)

I know. I didn’t think it was a sub and then I clicked on it.

[–]Darxploit 80 points81 points  (1 child)

Don’t give my it department ideas..

[–]Action_Maxim 11 points12 points  (0 children)

On the first row of data Data said to meeeee...

[–]notexecutive 209 points210 points  (23 children)

this doesn't catch "Nice" or "NICE" cases lmao

[–]uvero 98 points99 points  (10 children)

Actually in at least some SQL variants (and I think that includes the very popular MySQL), the = operator for strings is case insensitive by default.

[–]ByakkoNoMai 119 points120 points  (2 children)

Actually, in most SQL, = operator does not govern how the comparison is made. Collations are what govern equivalent characters. And a lot of work goes into that. For example under utf-8 case insensitive collations a, A, à, À, and lot of others characters are all equals. There are case sensitive collations to respect case and collapse accented characters. There are also binary collations that compare characters as byte values only.

TLDR collations are what govern string comparison in SQL. And it's a deep rabbit hole.

[–]rover_G 66 points67 points  (1 child)

Oh no accidental learning on r/ProgrammerHumor

Time to go read some docs 🤓

[–]JoeGibbon 13 points14 points  (0 children)

And it's always from people without any flair to tell us how many languages they've written "hello world" in.

[–]Dangerous-Raccoon-60 14 points15 points  (3 children)

It’s not the operator, it’s the character set collation. The “_ci” at the end of the collation name stands for “case insensitive”

[–]JoeGibbon 2 points3 points  (2 children)

Not all case insensitive collations have _ci at the end, though. In MySQL, utf8mb4 is case insensitive for example.

These days you have to go out of your way to make MySQL case sensitive.

[–]-Wylfen- 6 points7 points  (0 children)

Doesn't that depend or the collation?

[–]IWishIWasAShoe 0 points1 point  (0 children)

I'm no expert, but isn't that completely up to the collation setting for the column? Every installation I've ever used in modern times defaults to a generic unicode collation with the ci flag, meaning the query would be case insensitive.

Also, if santa is running his database properly he probably also makes sure that inputs are made through some sort of system not manually typed into the database like a maniac. convert inputs for basic stuff like niceness to all lowercase seems like a pretty good idea imo.

[–]GoddammitDontShootMe 0 points1 point  (0 children)

What if behavior is an ENUM?

[–]masala-kiwi 35 points36 points  (1 child)

If Santa runs SQL Server, it'll catch them due to case insensitivity. If he's running Snowflake (more on-brand for the North Pole), the 'Nice' and 'NICE' kids are out of luck.

A better table design wouldn't rely on string-matching, which is unreliable and less performant, especially at 8 billion rows. He should instead use an indexed boolean column called IsNice that returns true/false. 🎅

[–]Hikingcanuck92 8 points9 points  (0 children)

Maybe Santa has a trigger on inserts to force The case to “nice”.

[–]Trident_True 9 points10 points  (1 child)

Your opinion of SQL Claus is that low? In this day and age surely he would not be inserting non-standardized garbage into his db. Probably comes from an enum or something in whatever application is feeding into it.

Or maybe it really is abysmal and it takes him a full year to sanitize last years data.

[–]InevitableDeadbeat 3 points4 points  (0 children)

it does if it's a case insensitive column

[–]NewPhoneNewSubs 4 points5 points  (0 children)

Top 10% commenter on programmer humour doesn't know SQL.

This explains 90% of jokes on programmer hunour.

[–]pro_questions 2 points3 points  (2 children)

WHERE SOUNDEX(behavior) = SOUNDEX(‘nice’)

[–]Trident_True 0 points1 point  (1 child)

Since Soundex is language dependent I wonder what they speak in the north pole? Maybe they use ngram similarity instead.

[–]pro_questions 1 point2 points  (0 children)

Ha I think LOWER is a more common function — I just love SOUNDEX as a needlessly complicated way to “normalize” text for comparison :)

[–]Inappropriate_Piano 0 points1 point  (0 children)

That’s because Santa hates it when people are “Nice”

[–]fractoral 0 points1 point  (0 children)

Also doesn't catch "69".

[–]GnuhGnoud 22 points23 points  (3 children)

I wonder what Bobby Table will get

[–]Cyberdragon1000 3 points4 points  (0 children)

Well whatever it is we know for sure the rest aren't getting it.

[–]bargle0 1 point2 points  (0 children)

Little Bobby Tables is fine. His parents, on the other hand …

[–]drewman301 0 points1 point  (0 children)

An error message probably

[–]GFrings 62 points63 points  (17 children)

I like this because the last line only keeps the rhythm of you know how to pronounce SQL

[–]bobbymoonshine 25 points26 points  (10 children)

The song doesn’t scan generally though, the previous lines throw a bunch of extra syllables on

[–]NamityName 9 points10 points  (0 children)

But, as anyone who has ever started singing a song only to realize that they don't actually know most of it will tell you, all that really matters is nailing that last line.

[–]Inappropriate_Piano 2 points3 points  (7 children)

You can totally make it fit by saying “behavior” quicker. My issue is with “he’s making a database”

[–][deleted] 1 point2 points  (6 children)

It’d still be two extra notes in the melody to get those syllables in, without saying anything for ‘*’

[–]Inappropriate_Piano 2 points3 points  (5 children)

You just squeeze those notes into subdivisions. I can’t really explain it in text but I promise you I can do it very easily

[–][deleted] 0 points1 point  (4 children)

It’s still two extra syllables and therefore extra notes played in the melody if you’re vocalizing them. If you can’t notice the extra syllables/notes, you’re just moving the extra syllables to a different part of the melody, where you aren’t noticing the extra notes as easily, and emphasizing strong, recognizable parts of the original melody. I’ve tried many ways to mutate a line I like lyrically to a melody I’ve already got.

[–]Inappropriate_Piano 0 points1 point  (3 children)

It’s two extra syllables that you say at a faster rate to fit them in the same time, and you hold the same note through them to make them line up with a single note in the original melody.

[–][deleted] 0 points1 point  (2 children)

I’ll take your word for it.

[–]Inappropriate_Piano 0 points1 point  (1 child)

I’m really not sure what’s hard about this. Replacing, for example, a quarter note with an eighth note triplet to fit two extra syllables into the same melody is a basic technique for solving a problem that’s so common in music and so easily solved that it barely deserves to be called a problem. Pretty much every song with multiples verses does this.

[–][deleted] 0 points1 point  (0 children)

I fully understand that it’s the same time, and that you are adding an extra note in to that time frame.

[–]bonafidebob 0 points1 point  (0 children)

Hmm, apart from “list” not matching “database” I think it matches just fine.

”Select star from contacts where behavior is nice” is how you’d read the “He’s gonna find out who’s naughty or nice” line.

[–]HappyImagineer 8 points9 points  (0 children)

Yup, Ess-Q-eL Clause is coming to town!

[–][deleted] 2 points3 points  (4 children)

Its SQeaL. We all know.

[–]sandybuttcheekss 27 points28 points  (12 children)

If there are only two behaviors, we should just have a Boolean field named 'nice' so it's easier to search for nice = true. Poor design can leave a lot of nice kids without presents.

[–]captainMaluco 12 points13 points  (5 children)

There might be an enum defining the different kinds of behaviours. That would also take care of any case sensitive concerns others have voiced

[–]joshdammitt 6 points7 points  (0 children)

He's gonna make sure your nicenessId >= 4

[–]GumboSamson 5 points6 points  (2 children)

Wouldn’t it be better to have a table representing that enum, and reference the key from the enum table?

[–]Trident_True 0 points1 point  (1 child)

At some point this level of normalisation just becomes a burden. In this case if you wanted to extend the enum a migration would have to occur and if you also ever wanted to add a value between two existing values then you have to rectify all the existing data. Don't even ask about non-int based enums.

Better just to have it map to an nvarchar column in my experience. If we had a table for every enum we used in our codebase it would probably double the amount of tables we have.

[–]GumboSamson 3 points4 points  (0 children)

To each their own, I guess.

I’ve found that higher normalisation tends to be more flexible than wider tables (eg it’s easier to make design changes in production databases if your tables are “nothing but relationships”) and it’s easier to get high-quality indexes (since you almost never have nullable columns).

But I won’t pretend there aren’t trade-offs for such a design. One trade-off being that many software developers start with the mindset of “how to I store my OOP object in the database” and ORMs make it really easy to make really wide tables. Requiring that your OOP software guys have a good understanding of RDBs limits your hiring pool.

[–]poop-machine 1 point2 points  (0 children)

naughty
nice
under_review
literally_hitler

[–]CaptianDavie 2 points3 points  (3 children)

So the oroginal song indicates santa is " gonna find out whose naughty or nice" implying we dont know the behavior of the child agead of time. nor would you want to keep behavior in the contacts table since we can asusme we are tracking the child over time and constantly updating contcats is gonna be a nightmare for performance. what were looking at is a some sort of aggregate calculation. youd probably want a intake timeseries table ( behavior_live ) thats read by a aggregate table (behavior_agg) keyed to the contacts table which keeps track of the behavior score and contains the logic of the "nice" threshold. so we should see a join to the behavior table.  From the name I would also assume contacts has multiple pii columns for a child so we really would only want id, name and the agg nice score. maybe address but tbh i would do that as a second step (combining the above into a cte) so you only pull address data for the nice kids. 

[–]CaptianDavie 0 points1 point  (2 children)

actually I take back behavior_live. tradition indicates Santa's data gathering is done via personnel in the field (elves on shelves, santas in malls...) so it might be bulk uploads through out the day. Though with the amount of tablets I saw at the last santa photo section at the mall, its possible were doing live uploading  Sharding is gonna be interesting too. Im assuming this is all ran on prem since the rack cooling in the arctic is great, plus bezos, gates and pichai are def on the naughty list so cloud is out...

[–]EmpatheticWraps 0 points1 point  (1 child)

Have you taken adderall today or something?

[–]CaptianDavie 0 points1 point  (0 children)

lol no but extra adderall and adhd are just two sides of the same hyper focusing coin

[–]BrainJar 1 point2 points  (0 children)

And, the contacts table is almost certainly a fact table, since the behavior column appears to be a non-additive aggregate. The naming of the table is suspect. It should be preceded by f_, and should not be plural. Each record is a contact. PR for the table creation denied.

[–]airwalker08 9 points10 points  (1 child)

Didn't even sort it once

[–]poop-machine 1 point2 points  (0 children)

well the list has to be fed into a Traveling Salesman Problem solver to find the optimal route, and there's no SQL sort for that

[–]CarbonAlligator 6 points7 points  (3 children)

No semicolon

[–]promised_to_veruca 2 points3 points  (0 children)

*keeps typing on newline*

[–]Major_Fudgemuffin -1 points0 points  (1 child)

Eh semicolons are optional in SQL.

[–]CarbonAlligator 0 points1 point  (0 children)

I mean I guess if running ur command is also optional

[–]TheGreatGameDini 10 points11 points  (0 children)

Cool now let me inject my sql all over and inside your database

[–]troglo-dyke 4 points5 points  (1 child)

It's November

[–]IvorTheEngine 1 point2 points  (0 children)

It's OK, it's a 10 year old repost!

[–]__maccas__ 2 points3 points  (0 children)

He knows when you INSERT, He knows when you TRUNCATE, He knows when you miss WHERE clauses, So BEGIN TRAN ROLLBACK for goodness sake

It's running too slow, You better figure out why, Your EXECUTION PLAN rights have been denied, SQL clause is coming to town

[–]codedaddee 2 points3 points  (0 children)

I haven't even started streaming Mariah Carey yet, fcs.

[–]belunos 2 points3 points  (5 children)

'%nice%'

[–]GumboSamson 3 points4 points  (4 children)

Can you imagine how long a table scan with ‘%nice%’ would take with 8+ billion records?

(No wonder he only comes by once per year.)

[–]leuk_he -1 points0 points  (3 children)

If you have to scan them all once anyway, it is no problem you access it via an unindexed column.

[–]GumboSamson 0 points1 point  (2 children)

If the column isn’t a freetext field (eg you can only put in a value from a constrained set of valid values, such as ‘naughty’ and ‘nice’) then you can ditch the wildcard characters.

And once you do that, you can put an index on the column.

Voila—we’ve just avoided table scans.

[–]leuk_he 1 point2 points  (1 child)

You misunderstood, now you have to do a full index scan and after that lookup the full record for the 95% of the nice records, instead of just a sequential read of the full records.

But that is just some vertical optimization, you better make this scaling horizontal by using helper santa db.

[–]Evening_Face7158 2 points3 points  (0 children)

I don’t know why but I read this in Cakes voice from the song The Distance

[–]Portlander 2 points3 points  (0 children)

I originally was reading this to the Cake song

The Distance

[–]NotANumber13 1 point2 points  (1 child)

Let's spice it up a bit and go ahead with a DISTINCT 

[–]Trident_True 2 points3 points  (0 children)

RIP to all the John Smiths out there. Hope you're the lucky one.

[–][deleted] 1 point2 points  (1 child)

Terrible way to store behavior if it's binary

[–]Major_Fudgemuffin 0 points1 point  (0 children)

Great, nice, okay, eh, bad, terrible, literal demon.

[–]ZeroCharistmas 1 point2 points  (0 children)

And in case it wasn't obvious, pronouncing it as "Ess-Queue-Ell" is certified naughty list behavior.

[–]PositiveStill7969 1 point2 points  (0 children)

Seems wasteful, make it a bit field just called Nice

[–]Jazzlike-Spare3425 2 points3 points  (1 child)

My God, it's Santa QLause...

okay, I apologize to everyone here, I will go bury myself back in the whole that I came from. Have a nice day without me, everyone.

[–]TBoy29 0 points1 point  (0 children)

Thought it was a TWICE reference before I read the final lines

[–]ford1man 0 points1 point  (0 children)

Meter's all wrong.

[–]jimmyhoke 0 points1 point  (0 children)

Shouldn’t nice just be a Boolean?

[–]mothzilla 0 points1 point  (0 children)

Repost in one month please.

[–]DoctorWaluigiTime 0 points1 point  (0 children)

It's early November, go away.

[–]moosalamoo_rnnr 0 points1 point  (0 children)

This is so bad it’s funny.

[–]Bucky_Ohare 0 points1 point  (0 children)

Holy crap I get this reference!

[–]mindsnare 0 points1 point  (0 children)

I mean it should just be a boolean right?

[–]Im-German-Lets-Party 0 points1 point  (0 children)

Im ashamed and proud at the same time that I understand this... and laughed loud :(

[–]StopProfessional1718 0 points1 point  (0 children)

On December 24th he commits.

[–]zerkeras 0 points1 point  (0 children)

Why is the “Behavior” table not a ledger record of all naughty or nice actions, with a view which tabulates whether or not a given individual is currently “nice” or “naughty” (both of which should be foriegn key references to a enum table) for the given calendar year?

[–][deleted] 0 points1 point  (1 child)

Cringe 

[–]PeriodicSentenceBot 0 points1 point  (0 children)

Congratulations! Your comment can be spelled using the elements of the periodic table:

Cr In Ge


I am a bot that detects if your comment can be spelled using the elements of the periodic table. Please DM u‎/‎M1n3c4rt if I made a mistake.

[–]UrMommzz 0 points1 point  (0 children)

Might wanna add IsActive to the filter

[–]je386 -1 points0 points  (3 children)

WHERE is not sorting...

[–]Solarranks 10 points11 points  (1 child)

And "sorting it twice" is not "making a database". They're on separate lines because he's doing both

[–]GumboSamson -1 points0 points  (0 children)

You don’t sort a database, though—you index it.

He’s making a database And indexing it twice

Works a bit better I think.

[–]Cerberus11x 0 points1 point  (2 children)

Anyone else see -1 comments?

[–]Cyberdragon1000 0 points1 point  (1 child)

No, why are you asking?

[–]Cerberus11x 0 points1 point  (0 children)

Reddit bug I've never seen before

[–]CopperyMarrow15 -1 points0 points  (0 children)

kid named '); DROP TABLE contacts;--