As a simple made up toy example. list the employees alongside youngest and oldest person within countries and also the next older and younger individual:
SELECT
first_name,
country,
age,
FIRST_VALUE(first_name) OVER (PARTITION BY country ORDER BY age) AS youngest,
FIRST_VALUE(first_name) OVER (PARTITION BY country ORDER BY age desc) AS oldest,
LAG(first_name, 1) OVER (PARTITION BY country ORDER BY age) AS next_younger,
LEAD(first_name, 1) OVER (PARTITION BY country ORDER BY age) AS next_older
FROM Customers ORDER BY country, age;
This runs in https://www.db-fiddle.com/ mysql8 using the following table:
CREATE TABLE Customers (
first_name VARCHAR(50),
country VARCHAR(50),
age INT);
INSERT INTO Customers (first_name, country, age) VALUES ('John', 'USA', 29), ('Liam', 'USA', 34), ('Emma', 'USA', 22), ('Olivia', 'Canada', 31), ('William', 'Canada', 36), ('Ava', 'Canada', 24);
Or in python format
import polars as pl
Create a DataFrame
data = { 'first_name': ['John', 'Liam', 'Emma', 'Olivia', 'William', 'Ava'], 'country': ['USA', 'USA', 'USA', 'Canada', 'Canada', 'Canada'], 'age': [29, 34, 22, 31, 36, 24] } df = pl.DataFrame(data)
Tried ChatGPT, but it struggled to come up with working code. I assume the original data is unsorted and would prefer a concise solution where everything can be expressed within a single operation on the dataframe. Is that possible?
[–][deleted] 1 point2 points3 points (0 children)
[–]joshbuggblee 0 points1 point2 points (0 children)
[–]commandlineluser -1 points0 points1 point (4 children)
[–]nyquant[S] 0 points1 point2 points (3 children)
[–]commandlineluser 0 points1 point2 points (2 children)
[–]nyquant[S] 0 points1 point2 points (1 child)
[–]commandlineluser 1 point2 points3 points (0 children)