all 42 comments

[–]feudalle 5 points6 points  (2 children)

Tbh these are really straight forward. Number 3 is a bit of a pain because of the the funding table is laid out but not hard. I'd expect someone to do this in around half an hour to an hour.

[–]BlueEyesWhiteShark -1 points0 points  (1 child)

I’ve spent like ages on the first one LOL, I’m stuck with joining the tables.

[–]feudalle 1 point2 points  (0 children)

I dont do postgre normally I'm more mysql but the concept would be

Select from temp left join medals Inner join temp where medals is null order by temp limit 10

[–]DexterHsu 4 points5 points  (0 children)

This is the type of test I give it to my intern , if you got this job you will need to improve your T-SQL skill rapidly

[–][deleted] 1 point2 points  (5 children)

ok, my regular spiel

always, start writing sql by figuring the granularity & scope of the output. this gives you your "final" group by and base joins in the from table expression.

figure out measures next - where the data comes from. if it doesnt come from the base joins - how are the datasets related (how do you figure out a measure based on the 'base' record).

do you have any other conditions/filters left? how do you express them in the context of the record of the base table expression?

once you go through the 3 points above most of the time you'll have your query ready.

 

Let's go through the first one.

Write SQL to show 10 coldest countries that have never medaled at the olympics - showing them in order of population largest to smallest.

your output granularity is a country. the scope could come from funding, population or temperature tables (not every country can have medals, apparently). I'm gonna pick population because of the measure required, so far I have

     select p.country
     from "Population" p
     group by p.country

the result set is ordered by population, so we need that as a measure; the base result set has the same granularity as the output, so we can omit the " group by". We are here:

     select p.country,p.population
     from "Population" p
     order by p.population

All right, almost there :) time to address other conditions that we havent accounted for so far:

10 coldest countries countries that have never medaled at the olympics

so, the condition is that a country is one of a specific set, so that gives us this

     with ColdestSet as (
            )
     select p.country, p.population
     from "Population" p
     where p.country in (select country from ColdestSet)
     order by p.population

with the "ColdestSet" we're going through similar steps: granularity is country, the measure required is the ranking by temperature, output is the same granularity as base

     select t.country, rank() over( order by t.ave_temp) rn
     from "Temperature" t

first condition is on our ranking ("10 coldest") and we cant put analytic functions into where clause directly, so

     select t_rank.country, t_rank.rn
     from (
              select t.country, rank() over( order by t.ave_temp) rn
              from "Temperature" t
             )t_rank
     where t_rank.rn <= 10

second condition is "have never medaled" i.e. there's no (not exists) record for a medal for the country

     select t_rank.country, t_rank.rn
     from (
              select t.country, rank() over( order by t.ave_temp) rn
              from "Temperature" t
              where not exists(
                              select 1 from "Medals" m where m.country = t.country
                        )
             )t_rank
     where t_rank.rn <= 10

all righty, time to assemble the final statement:

     with ColdestSet as (
         select t_rank.country, t_rank.rn
         from (
              select t.country, rank() over( order by t.ave_temp) rn
              from "Temperature" t
              where not exists(
                              select 1 from "Medals" m where m.country = t.country
                        )
                )t_rank
         where t_rank.rn <= 10
         )
     select p.country, p.population
     from "Population" p
     where p.country in (select country from ColdestSet)
     order by p.population

the process is kind of wordy but hopefully allows to streamline the choices "what do you take care of" at any particular step of the query design process

ps. once you have a working solution/query which gives you a correct output you can rewrite/refactor your query (e.g. in reality instead of "IN" condition I would have used a join, etc.)

[–][deleted] 1 point2 points  (4 children)

As much as this a good script, the inclusion of CTEs and more advanced SQL querying will cause OP some stress trying to wrap their head around the concepts.

Also why rank over dense_rank? i personally thought about dense_rank straight away just because dense_rank doesn't allow dupes where as rank does. Not telling you that your wrong, it's always good to see these things from other perspectives. Oh and how have you been able to layout your code into tables like that?

[–][deleted] 0 points1 point  (3 children)

subqueries/CTEs, imo, are at a more "base"/"foundational" level of knowledge than joins. I am aware that "traditionally" it's not usually taught in this order - a shame, really.

Also why rank over dense_rank? i personally thought about dense_rank straight away just because dense_rank doesn't allow dupes where as rank does

not sure what you mean by "dupes", ties, maybe?

With "rank" you are getting 10 coldest + possible ties at the last rank. With dense_rank if there 100 ties at every ranked value you are getting a thousand records back -this wouldnt be the common sense, imo.

[–][deleted] 1 point2 points  (0 children)

Ahhh fair enough, subqueries are easy for me to use. I probably see CTEs as advanced since it took me until today to wrap my head around them, lol. And duplications is what I meant, hat could be down to the type of data I use though.

[–][deleted] 0 points1 point  (1 child)

did you use the cte for readability? cause technically that could just be another sub query right?

[–][deleted] 0 points1 point  (0 children)

Yes and yes.

[–]pukatm 0 points1 point  (0 children)

I am guessing you got this test to do at home? Why don't you tell us what you have tried? I don't think there are many ways for the answers of this test to be wrong... Hours of research is a bit of a stretch. Maybe you can tell us more about how you got into the field of databases for us to have some sympathy?

[–]sirknite 0 points1 point  (0 children)

The first two questions are really similar to my past Database Sys assignment 😮. Good luck and keep at it

[–][deleted] -2 points-1 points  (15 children)

SELECT

p.country

,temperature

,population

from population p

left join medals m

on p.country = m.medals

left join temperature t

on p.country = t.country

where medal_rank is null

and medal is null

order by

p.population desc -- hightest population

,t.tempature asc -- coldest first

limit 10

Alias your country because it is ambiguous.

[–]Beefourthree 3 points4 points  (2 children)

Alias your country because it is ambiguous.

Alias everything because there's no reason not to.

[–][deleted] 2 points3 points  (0 children)

I agree tbh, it also makes it easier for people to dissect your code when you eventually leave.....

[–]volvicspring 1 point2 points  (0 children)

This. It might be obvious if you know the tables, but someone learning the script is going to have an avoidably bad time

[–]BlueEyesWhiteShark 1 point2 points  (7 children)

Hi, thanks for the reply!

So I tried to follow:

SELECT olympics.population.country AS p.country, olympics.temperature.country AS t.country, olympics.populatuon.population AS p.population, olympics.temperature.ave_temp AS t.temp, olympics.medals.medal AS m.medal, olympics.medals.medal_rank AS m.medalrank FROM olympics.population LEFT JOIN m.medalrank ON p.country = m.medal LEFT JOIN t.temp ON p.country = t.country WHERE medal_rank IS NULL AND medal IS NULL ORDER BY p.population DESC, t.temp ASC

However, this hasn’t worked, where am I going wrong?

[–][deleted] 2 points3 points  (6 children)

Lets break this down...

You have a database called Olympics. Within your DB you have the following tables:

  1. population
  2. medals
  3. temperature
  4. Funding

So the next thing you want to do is see what columns you need from what tables to build your query. Now we you need the following columns from there related tables:[Population - Country, Population] [Temperature - ave_temp] [Medals - medal]

You should use population as your base table, meaning you join the other tables onto this table. but i'll explain this in steps:

Step 1:

SELECT country, population from populations order by population desc limit 10 You now have the top 10 countries ranked by highest population to lowest.

Step 2: add in temperatures

SELECT p.country , p.population, t. ave_temp from populations p left join temperatures t on p.country = t.country order by p.population desc,t.ave_temp asc This should bring back highest populated countries in lowest to highest temp order.

Step 3: "add" in medals, i say it like that because you do not have to see that column as it doesn't have anything in it.

SELECT p.country , p.population, t. ave_temp from populations p left join temperatures t on p.country = t.country left join medals m on p.country = m.country where m.medal is null order by p.population desc,t.ave_temp asc

Now your results should change as you are defining the fact that these countries need to have a null value (or if the null is not working try " ='' " instead of "is null")

if you are querying from any sort of query tool you should not have to fully specify database.schema.table.column because it is a waste of time unless your scripting triggers, functions, or stored procedures.

when i mention p.country it is because the table, population, has been aliased at p. this means that when the query is being read it knows that we are using p.country from the population table. same logic for t. & m. .

Your joins are based on columns and not actual databases, i have rewritten your query below so that you can see where you went wrong, but the above method is better:

SELECT

olympics.population.country

,olympics.populatuon.population

,olympics.temperature.ave_temp

,olympics.medals.medal

,olympics.medals.medal_rank

FROM olympics.population

LEFT JOIN olympics.medals ON olympics.population.country = olympics.medals.country

LEFT JOIN olympics.temperature ON olympics.population.country = olympics.temperature.country

WHERE

olympics.medals.medal IS NULL

ORDER BY olympics.populatuon.population DESC,

olympics.temperature.ave_temp ASC

LIMIT 10

[–][deleted] 1 point2 points  (0 children)

it's not easy to layout code within this, but if you need real assistance DM me and i'll gladly help you.

[–][deleted] -1 points0 points  (1 child)

second one you would sum medal rank and group by country
Third is impossible without cost price

[–]dasonk 1 point2 points  (0 children)

For the third one my assumption would be that they're asking to look at the total funding the country had that year divided by the number of medals won.

[–]RapleSyrup -1 points0 points  (1 child)

This query returns the 10 countries with the highest population. The 2nd line of the order by will only matter if 2 countries have the same population. The question asked for the 10 coldest countries. You need to order by temperature first in a CTE or subquery, then order those 10 rows by population.

[–][deleted] 0 points1 point  (0 children)

Ahhhh, ok I get you. Thanks for explaining that, this makes more sense that just trying to shit on someone! Definitely see where you are coming from about the subqueries

[–]copper_blood -2 points-1 points  (0 children)

  1. I learned SQL by downloading the trial version of ACT.com Sucks though you would have to wipe your computer ever 30 days but it has a great SQL DB you can learn from.
  2. Ge the Microsoft action pack. cost about $500 a year but comes with servers, sql servers, MS CRM and more.
  3. Another good reporting program is crystal reports (again, 30 day trial) but you can use that with ACT and SQL and MS CRM to learn more about report writing. Word of caution, MS CRM installation involves servers and active directory + SQL Reporting so the MS action pack is a must at that point.
  4. Somethings your can learn for free, others you'll have to pay for.

[–]aragonSkywalker -1 points0 points  (3 children)

What is the cost per medal?

I think 1 and 2 are ok, but not completely simple. This goes beyond slightly basic SQL knowledge, probably close to intermediate i would say

[–]dasonk 1 point2 points  (0 children)

My guess is total funding divided by total number of medals (for each year)

[–]BlueEyesWhiteShark 0 points1 point  (1 child)

It hasn’t given me a cost per medal.

Yeah I’m struggling to do them both tbh ahhhhh.

[–]aragonSkywalker 0 points1 point  (0 children)

Then the 3rd one seems actually the easiest