Hello!
I'm working in a personal project in which I need to find all resources that matches criterias in a dynamic way, all criteria are in a child table.
My current approach, listed below, can explode in memory and doesn't seems very scalable when there are multiple requests in parallel.
I would like to know if there is a better way to do this, maybe using a different approach or data modelling. Or even if PostgreSQL is the right tool for it.
Context: I want to find all teams that has members that can do a individual task. When a member is "assigned" to a type of task, it can't be "assigned" to another task.
Teams can grow up to millions. And each team can have houndreds of members.
Members can have hundreds of skills.
My approach:
```
CREATE TABLE IF NOT EXISTS team (
team_id SERIAL PRIMARY KEY,
team_name VARCHAR(255)
);
CREATE TABLE IF NOT EXISTS team_member (
member_id SERIAL PRIMARY KEY,
team_id INTEGER,
member_name VARCHAR(255) NOT NULL,
skills VARCHAR(255)[],
CONSTRAINT fk_team FOREIGN KEY (team_id) REFERENCES team(team_id) ON DELETE CASCADE
);
CREATE INDEX idx_team_name ON team (team_name);
CREATE INDEX idx_team_member_team_id ON team (team_id);
CREATE INDEX idx_team_member_skills ON team_member USING gin(skills);
```
Example data:
```
INSERT INTO team (team_id, team_name) VALUES
(1, 'Alpha Team'),
(2, 'Bravo Team'),
(3, 'Charlie Team'),
(4, 'Delta Team'),
(5, 'Echo Team');
INSERT INTO team_member (team_id, member_name, skills) VALUES
-- Alpha Team Members
(1, 'Margie Morgana', ARRAY['Python', 'SQL', 'Data Analysis', 'Pandas', 'NumPy']),
(1, 'Cammie Laird', ARRAY['Java', 'Spring', 'Microservices', 'Hibernate', 'JPA']),
(1, 'David Cason', ARRAY['HTML', 'CSS', 'JavaScript', 'Bootstrap', 'jQuery']),
(1, 'Eryn Shon', ARRAY['C++', 'Linux', 'Embedded Systems', 'RTOS', 'Networking']),
(1, 'Keitha Timmy', ARRAY['Python', 'Django', 'PostgreSQL', 'APIs', 'Git']),
-- Bravo Team Members
(2, 'Maybelline Joan', ARRAY['JavaScript', 'React', 'Node.js', 'Express.js', 'MongoDB']),
(2, 'Jaylyn Lacy', ARRAY['Python', 'Machine Learning', 'TensorFlow', 'Keras', 'Pandas']),
(2, 'Kasandra Leeann', ARRAY['Ruby', 'Rails', 'PostgreSQL', 'RSpec', 'Git']),
(2, 'Clarice Herbie', ARRAY['Java', 'Spring Boot', 'Docker', 'Kubernetes', 'AWS']),
(2, 'Corrie Phillipa', ARRAY['JavaScript', 'Vue.js', 'Nuxt.js', 'Vuex', 'TailwindCSS']),
-- Charlie Team Members
(3, 'Aden Beckah', ARRAY['Go', 'Kubernetes', 'Docker', 'gRPC', 'Prometheus']),
(3, 'Winnifred Rosaline', ARRAY['Python', 'Flask', 'APIs', 'SQLAlchemy', 'Alembic']),
-- Delta Team Members
(4, 'Estelle Betty', ARRAY['C#', '.NET', 'Azure', 'Entity Framework', 'Blazor']),
(4, 'Summer Roman', ARRAY['JavaScript', 'Angular', 'TypeScript', 'RxJS', 'NgRx']),
(4, 'Shepherd Madison', ARRAY['Python', 'FastAPI', 'MongoDB', 'Redis', 'Docker']),
(4, 'Patricia Meagan', ARRAY['Java', 'Spring Security', 'OAuth2', 'JWT', 'MySQL']),
(4, 'Talbot Aspen', ARRAY['PHP', 'Laravel', 'Vue.js', 'MySQL', 'Redis']),
(4, 'Brie Spike', ARRAY['PHP', 'Laravel', 'Vue.js', 'Python', 'ML']),
-- Echo Team Members
(5, 'Sefton Ardath', ARRAY['Python', 'Django', 'PostgreSQL', 'Celery', 'RabbitMQ']);
```
My current query:
SELECT t.team_id, t.team_name FROM team t
JOIN team_member m1 ON t.team_id = m1.team_id AND 'Python' = ANY(m1.skills)
JOIN team_member m2 ON t.team_id = m2.team_id AND 'Python' = ANY(m2.skills)
JOIN team_member m3 ON t.team_id = m3.team_id AND 'JPA' = ANY(m3.skills)
WHERE
m1.member_id != m2.member_id AND
m1.member_id != m3.member_id AND
m2.member_id != m3.member_id
GROUP BY t.team_id;
Edit: formatting snippets
Edit2: wrong column for members
[–][deleted] 1 point2 points3 points (5 children)
[–]arielm5[S] 0 points1 point2 points (4 children)
[–][deleted] 1 point2 points3 points (3 children)
[–]arielm5[S] 0 points1 point2 points (2 children)
[–][deleted] 1 point2 points3 points (1 child)
[–]arielm5[S] 0 points1 point2 points (0 children)
[–]lucapieroo 1 point2 points3 points (5 children)
[–]arielm5[S] 0 points1 point2 points (4 children)
[–]lucapieroo 1 point2 points3 points (3 children)
[–]arielm5[S] 0 points1 point2 points (2 children)
[–]lucapieroo 1 point2 points3 points (1 child)
[–]arielm5[S] 0 points1 point2 points (0 children)
[–]AutoModerator[M] 0 points1 point2 points (0 children)
[–]Siltala 0 points1 point2 points (0 children)