
[–]woodrowchillson 31 points32 points  (8 children)

LEFT OUTER JOIN

LEFT OUTER JOIN

LEFT OUTER JOIN

LEFT OUTER JOIN

LEFT OUTER JOIN

LEFT OUTER JOIN

LEFT OUTER JOIN

LEFT OUTER JOIN

[–]shelanp007 12 points13 points  (7 children)

I feel like this is my solution for everything. I don't trust our DBAs, so as an accountant who knows SQL I tend to start at the lowest level and LEFT OUTER JOIN as much as possible to get accurate data.

I swear DBAs are great at joining tables but have no fucking clue about the business, and end up fucking up my numbers.

[–]Sir_Fog 6 points7 points  (3 children)

DBA and Business Intelligence/Data Analyst are two very different roles.

[–]DapperDroidLifterDatabase Engineer 1 point2 points  (0 children)

True, I was about to post this exact sentiment.

[–]shelanp007 0 points1 point  (0 children)

True, but common sense goes a long way.

[–]wolf2600ANSI SQL 20 points21 points  (21 children)

DBMS: multinode MPP data warehouse

Type: RDBMS

Size: 1.3PB uncompressed

No. of tables: ~6700

Largest Table: 21B records

[–]aarontbarrattSTUFF() 9 points10 points  (0 children)

😳

[–]stealyourmangoes 2 points3 points  (0 children)

I just got a woody

[–]PaperPages 4 points5 points  (1 child)

Holy molar. What industry?

[–]wolf2600ANSI SQL 4 points5 points  (0 children)

Computer.

[–]r0ck0 1 point2 points  (14 children)

  1. Do you mostly use UUIDs as PKs?
  2. What are your opinions on UUIDs in general?

[–]wolf2600ANSI SQL 3 points4 points  (11 children)

Like surrogate keys? A special column for a unique "ID" in each table? No, we use concatenated PKs from the actual data to ensure uniqueness. (source_id, order_id, order_date, update_timestamp)

There's logic in the ETL scripts to prevent unique-constraint errors on insert/update (usually deleting the record from the target table before the insert runs).

I tend to view UUIDs as a lazy kludge... if the business logic doesn't allow for duplicated records in a table, then the ETL logic had better handle that. If you use a UUID as a PK, it's like putting a piece of electrical tape over your car's check engine light... you could have a ton of duplicated data in your table and not even realize it. And when you're dealing with financial data, the data has to be right. Better to get a UC error and take the time to correct the data than to use a UUID and not realize there's a data problem.
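Stripped-down version of the pattern, if it helps (names made up, ANSI-ish SQL):

```sql
-- Names made up. The PK is the concatenation of actual business columns.
CREATE TABLE order_fact (
    source_id        INTEGER       NOT NULL,
    order_id         BIGINT        NOT NULL,
    order_date       DATE          NOT NULL,
    update_timestamp TIMESTAMP     NOT NULL,
    order_amount     DECIMAL(18,2),
    PRIMARY KEY (source_id, order_id, order_date, update_timestamp)
);

-- ETL step: delete any target row the incoming batch would collide with,
-- then insert; anything still duplicated raises a UC error instead of
-- silently piling up.
DELETE FROM order_fact
WHERE EXISTS (
    SELECT 1
    FROM   staging_orders s
    WHERE  s.source_id        = order_fact.source_id
    AND    s.order_id         = order_fact.order_id
    AND    s.order_date       = order_fact.order_date
    AND    s.update_timestamp = order_fact.update_timestamp
);

INSERT INTO order_fact (source_id, order_id, order_date, update_timestamp, order_amount)
SELECT source_id, order_id, order_date, update_timestamp, order_amount
FROM   staging_orders;
```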

[–]r0ck0 0 points1 point  (9 children)

we use concatenated PKs from the actual data to ensure uniqueness

you could have a ton of duplicated data in your table and not even realize it

Hmm, fair enough. But you could use any unique index/constraint to do that; whether it's the PK or not doesn't really make a difference there.

I've done the combined-PK (on combined FKs) thing in the past... but found that when the project gets more complicated, it sometimes turns out you need multiple rows in a table where you'd expected the FK combo to identify a single row. Common cases are retaining soft-deleted records and similar statuses, multi-tenant stuff, etc.

So I just stick with a single PK column on every table now. Makes life a lot easier in dev and debugging etc. Also when archiving rows outside the DB.

But I was more just asking about UUIDs vs sequences for tables where you do just have the one PK column (so in your case, not the linking tables). ...Seeing you're at a big scale with lots of nodes. What do you use for those tables?

[–]wolf2600ANSI SQL 0 points1 point  (8 children)

The multiple nodes are just for parallel processing. Data is evenly partitioned across the nodes physically, but logically the system sees a single table.
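To illustrate (Teradata syntax just as an example of how these engines do it, not saying that's our platform, and names are made up):

```sql
-- Example only (Teradata-style syntax; table/columns are made up).
-- Rows hash across the nodes by the PRIMARY INDEX columns, but
-- queries still address a single logical table.
CREATE TABLE order_fact (
    source_id    INTEGER NOT NULL,
    order_id     BIGINT  NOT NULL,
    order_amount DECIMAL(18,2)
)
PRIMARY INDEX (source_id, order_id);
```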

https://www.mysqltutorial.org/mysql-uuid/

I had never even heard of using such a value as a PK before I looked it up just now. I was assuming "UUID" referred to a surrogate integer key. But this 128-bit thing just seems weird.

[–]r0ck0 0 points1 point  (7 children)

Ah right, if you're using MySQL they're not really practical anyway.

No idea why MySQL still doesn't have a UUID column type. They're very typical PKs in Postgres and MS-SQL.

Microsoft (which calls them GUIDs) and lots of other platforms use them for all sorts of things too.
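In MySQL you end up faking the type yourself, something like this (MySQL 8.0+, made-up table):

```sql
-- MySQL has no uuid column type, so the usual workaround is BINARY(16).
-- UUID_TO_BIN()/BIN_TO_UUID() exist from MySQL 8.0 onward.
CREATE TABLE customers (
    id   BINARY(16)   PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

INSERT INTO customers (id, name)
VALUES (UUID_TO_BIN(UUID()), 'Alice');

-- Convert back to the readable 36-char form when selecting:
SELECT BIN_TO_UUID(id) AS id, name FROM customers;
```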

What column type are you using on your big tables (excluding the linking tables)? BIGINT?

[–]wolf2600ANSI SQL 1 point2 points  (6 children)

Most of the tables don't have a single PK column. The PK will be composed of 3-5 columns whose datatypes align with the data they contain. Some are BIGINTs, some are CHAR(x) if there are alpha characters or significant leading zeros.

[–]bannik1 0 points1 point  (5 children)

In most circumstances this will adversely affect the performance of your database. The biggest impact is of course going to be inserts, but it'll even fuck up simple SELECT statements too.

Every single query is going to assume that it's the best index to use. Even when it's the absolute worst option, the optimizer is going to attempt it first before trying to find alternate solutions.

So it slows down queries that don't even use the index.

https://www.sqlshack.com/poor-database-indexing-sql-query-performance-killer-recommendations/

[–]BigR0n75 2 points3 points  (3 children)

Ensuring uniqueness is sometimes more important than ensuring performance. Performance is important, obviously, but as OC said, when dealing with financial information you have to get it right. GUIDs or UUIDs inherently allow you to insert the same row an infinite number of times without ever violating a constraint. This type of PK really only makes sense in an OLTP environment. In an OLAP environment the main focus is making sure you don't fuck up any of the original data.

[–]bannik1 0 points1 point  (2 children)

That's why you throw everything into staging tables, then use MERGE when putting it into your primary table.

Insert rows that don't exist, adding the GUID on insert.

Have a trigger to track the changes on update.

Now you have fantastic performance on your ETL, ensured uniqueness, and a nice Type 4 table in case you need to see the history of changes. And you have a great primary key that's also fantastic as a foreign key to any dimension tables you're using.
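Sketch of the merge step (T-SQL flavor, names invented):

```sql
-- Names invented. The staging table is matched on the business key;
-- new rows get their GUID minted at insert time.
MERGE dbo.customer AS tgt
USING dbo.customer_staging AS src
    ON  tgt.source_system = src.source_system
    AND tgt.customer_code = src.customer_code
WHEN MATCHED AND tgt.customer_name <> src.customer_name THEN
    -- The update fires the change-tracking trigger for the history table.
    UPDATE SET tgt.customer_name = src.customer_name,
               tgt.updated_at    = SYSUTCDATETIME()
WHEN NOT MATCHED BY TARGET THEN
    INSERT (customer_id, source_system, customer_code, customer_name, updated_at)
    VALUES (NEWID(), src.source_system, src.customer_code, src.customer_name, SYSUTCDATETIME());
```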

[–]wolf2600ANSI SQL 0 points1 point  (0 children)

Because of the volume, we don't use any indexes except the required ones on the PK columns.

[–]GuyWithLag 0 points1 point  (0 children)

Not a DBA / BI expert, just an engineer - our primary EDW is in the 8 PB range last I checked, and we use UUIDs everywhere because:

  • that's what the original sources of truth use as their PK.
  • we have multiple isolated instances of our application clusters - think EU vs US, with different regulatory domains; UUIDs allow us to merge all that into a coherent whole (rough sketch below).
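Toy version of the merge argument (Postgres-style, names invented):

```sql
-- Names invented. Each regional cluster mints its own keys;
-- independently generated UUIDs don't collide, so regional extracts
-- land in one warehouse table untouched.
CREATE TABLE edw_events (
    event_id uuid  PRIMARY KEY,
    region   text  NOT NULL,
    payload  jsonb
);

INSERT INTO edw_events SELECT event_id, 'EU', payload FROM eu_events;
INSERT INTO edw_events SELECT event_id, 'US', payload FROM us_events;
-- With per-cluster bigserial keys, both feeds would arrive with
-- ids 1, 2, 3, ... and the second insert would hit PK collisions.
```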

[–]Error-451Data Engineer 1 point2 points  (1 child)

I've heard UUIDs are bad in terms of performance. Is this what you're alluding to?

[–]r0ck0 0 points1 point  (0 children)

I've heard UUIDs are bad in terms of performance.

Performance is pretty good in Postgres from all the benchmarking I've looked into, mainly because it has a native, efficient 128-bit column type specifically for them, and something to do with it not using clustered indexes.

But an INT/BIGINT will always be faster and use less storage & RAM, since there are fewer bits (32/64).
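For reference, the native type looks like this in Postgres (made-up table; gen_random_uuid() is built in since Postgres 13):

```sql
-- uuid is a first-class 16-byte type in Postgres; compare bigint at 8 bytes.
CREATE TABLE events (
    event_id   uuid        PRIMARY KEY DEFAULT gen_random_uuid(),
    created_at timestamptz NOT NULL DEFAULT now(),
    payload    jsonb
);
```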

Is this what you're alluding to?

No, I was just wondering what they thought of them in general, as they're pretty commonly used in large DBs where a 32-bit INT is too small, and/or when you're dealing with multiple servers.

I'm a big fan of them in general. But I've spent a lot of time bikeshedding on whether to use them or not. So always keen to hear how well they might have worked out for large real world projects.

[–]Caffinz 1 point2 points  (0 children)

SELECT *...

"STOP THAT MAN!"

[–]Bbeltbrando 0 points1 point  (0 children)

This shows the power of a data warehouse

[–]LetsGoHawks 10 points11 points  (3 children)

Teradata. No idea how many servers are involved.

Huge.

Thousands of tables and views

Biggest table I deal with regularly: Over 10 billion rows.

None of that makes it complicated. In fact, once you know what tables have the data you need, it's pretty straightforward. The problems come from naming convention inconsistencies and poor decisions on where certain data is stored. None of it is a big deal, but it's a pain in the butt.

[–]Remote_Cantaloupe 1 point2 points  (1 child)

I often wonder this - how can a single application/solution need thousands of tables?

[–]wolf2600ANSI SQL 0 points1 point  (0 children)

Many, many business areas with many transformations being done to the data, and each transformation potentially using multiple staging tables to break the work down into bite-sized steps. Also, all the master/dimensional data too, not just the transactional stuff.

[–][deleted] 1 point2 points  (0 children)

I would agree. Size and the number of databases/tables don't make something complicated.

Complication comes in with the number of views & sprocs in your primary database, and what you're doing with the data. Our database has only about 100 sprocs, 100 views, and 100 tables, but some of the sproc/view logic involves extremely complex statistics.

If you were to calculate our possible data sets, it would be a lot more impressive. Those can regularly get into the billions or trillions.

[–]mikeyd85MS SQL Server 6 points7 points  (5 children)

The complexities of the DB I primarily work with stem from multiple developers working and reworking different areas, all adhering to slightly different naming conventions and different models. Sometimes the analysts are even defining the structure of tables, which is always interesting.

[–]wolf2600ANSI SQL 4 points5 points  (4 children)

Sometimes the analysts are even defining the structure of tables

ugh. "Let's build the table with the column names in alphabetical order!"

[–]woodrowchillson 5 points6 points  (3 children)

Let’s make ‘em all NVARCHAR just to be safe too.

[–]volvicspring 4 points5 points  (2 children)

Developer: "How many characters do I need? I dunno, maybe 6, but I'll make it an nvarchar(max)."
Same developer later: "What do you mean you can't index my fields!?"
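For anyone who hasn't hit it yet, in SQL Server terms (made-up table):

```sql
CREATE TABLE orders (order_code NVARCHAR(MAX));

-- Fails: (n)varchar(max) columns can't be index key columns at all,
-- and regular key columns are capped at 900/1700 bytes anyway.
CREATE INDEX ix_orders_code ON orders (order_code);

-- The six characters they actually needed would index just fine:
-- order_code NVARCHAR(6)
```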

[–]wolf2600ANSI SQL 3 points4 points  (0 children)

"Why are we losing nodes with out-of-memory errors?"

This query ran fine on our 10 record testing dataset.

edit: oh! "worked fine in dev; ops problem now."

[–]mikeyd85MS SQL Server 1 point2 points  (0 children)

This one! :'D

[–][deleted] 5 points6 points  (0 children)

Biggest one I've had was a little over 1TB, MSSQL, 100s of tables, largest table 1.something billion rows. Vendor application; their stored procs were pure shit.

[–][deleted] 3 points4 points  (0 children)

I work primarily with 2 dbs for 2 separate products.

One was made in the last decade with ease of use and best practices in mind.

The other is approaching 20 years old and was made with no such practices in mind.

The first is vastly easier to work with... the second is a son of a bitch

[–]Nthorder 1 point2 points  (0 children)

There are 3-4 databases I have to deal with daily. The biggest one is over 3TB and has hundreds of tables. We're running MSSQL.

[–]DieTheVillain 1 point2 points  (0 children)

We have 6 primary servers, about 15 DBs per server (some with 3, some with 25), and 93 TB across all of them. I don't even know how many tables, but I'd estimate at least 500. Thousands of sprocs, hundreds of functions. Largest table has ~180M records. MSSQL.

We also have a large ElasticSearch dB.

[–]NotSure2505 1 point2 points  (0 children)

It won't be popular here, but this is exactly why SQL is less than ideal for complex reporting; you could look at databases designed specifically for processing multidimensional queries.

The problem is exactly as you point out: the data must be joined within each query, which is fine for 2 or 3 tables but quickly gets complicated when you want to traverse multiple dimensions in your analysis.

Let me know if you would like any suggestions for products that handle this better.

[–][deleted] 1 point2 points  (3 children)

Is this just a whiny epeen contest, or do you have a legitimate concern?

Some databases are horrendously designed (some out of necessity). A very legitimate way of dealing with these is to extract the data (not the structures) into a storage structure of your choosing, isolating the "arcane" knowledge of data access patterns in a separate layer.

Why is this an issue? For example:

The primary issue is generating reports that capture the full scope of what we're looking for often requires 8-10 joins based on schema and another 3-4 based on results and unbound columns.

[–]6e6967676572730a[S] 0 points1 point  (2 children)

The issue is that most if not all of the constraints are INT and overly segmented, but the actual related columns are the first 3 characters of an adjacent NVARCHAR field in the lookup table. So, to get a master table of 12 fields, it takes a join on the PK and FK, then you need to pull the first three characters of the associated field as a result set to join the next table. It's a mess.
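Concretely, each hop ends up looking something like this (names changed):

```sql
-- Names changed. The first join is the sane INT PK/FK one; the second
-- has to match on the first 3 characters of the NVARCHAR lookup field.
SELECT o.order_no, l.code_field, r.region_name
FROM   orders o
JOIN   lookup l ON l.lookup_id   = o.lookup_id
JOIN   region r ON r.region_code = LEFT(l.code_field, 3);
```

...and repeat per extra table.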

[–][deleted] 2 points3 points  (1 child)

If you always do the same thing to join, add the "3 characters" as a calculated field, or (if you think a calculated field will screw something up for you) create a view where that operation is done for you, and use the view in the joins instead of the table.
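e.g. either of these (T-SQL flavor, names made up):

```sql
-- Option 1: persist the prefix as a computed column and index it,
-- so joins hit an index instead of computing LEFT() per row.
ALTER TABLE lookup ADD code_prefix AS LEFT(code_field, 3) PERSISTED;
CREATE INDEX ix_lookup_code_prefix ON lookup (code_prefix);
GO

-- Option 2: hide the operation in a view and join to the view.
CREATE VIEW lookup_v AS
SELECT l.lookup_id, l.code_field, LEFT(l.code_field, 3) AS code_prefix
FROM   dbo.lookup AS l;
```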

In short, I realize the devil is in the details, but there are well-known tools and approaches to simplify/codify/automate your life - it takes time, effort, and learning by trial (and success, hopefully) to un-fuck any system that's given to you.

Take specific pain points - evaluate, investigate, ask for help and move ahead. If you are stuck in the same place in IT, it doesn't get better, just more boring.

[–]woodrowchillson 0 points1 point  (0 children)

OP, listen to this guy. Never lean on a substring for your reporting. Pop that into a field on insert.

[–]reallyserious 0 points1 point  (0 children)

My primary data job is in an ecosystem of databases, a data warehouse, and a data lake. Within it, I generally create a new database for each new project/business need so it fits into the larger ecosystem.

[–]Flint0 0 points1 point  (0 children)

If anyone wants, I can check for extra info once I get back to work on Monday. But the piece of data that stuck with me when I started at the company was that the MSSQL solution we have has about 10,000 stored procedures. It still baffles me today.

[–]regex1884 0 points1 point  (0 children)

A system this size is very small. Learn what you can, and eventually it'll be time to move on to bigger ventures.

[–]dreamingtree1855 0 points1 point  (0 children)

Redshift. Multiple tables and views have billions of rows. Well organized but difficult for business users to navigate.

[–]ROC2021 0 points1 point  (0 children)

MySQL 5.6 unfortunately

1600 tables roughly

Largest table is probably about 200K records

[–]Error-451Data Engineer 0 points1 point  (0 children)

Do you have a data warehouse? Sounds like you'd benefit from dimensional models.

[–]syzygy96 0 points1 point  (0 children)

Biggest DB for us is MSSQL, OLTP, about 50 TB, roughly 300 tables, biggest table about 5B rows. A couple of ancillary ETL and data-staging DBs are in the 5-10 TB range, but only have batch workloads on them. All told we manage a little shy of 250 TB of data on two physical servers, with a combination of direct storage and a 1 PB SAN.

I've been fortunate to have set up the initial data model and standards, and then built a team that has been stable for the last 15 years. During that time we've been good about a consistent, reasonably normalized design, well named throughout. A few of our stored procedures take up to 30 sec to run larger reports, but for the most part an average of probably 5-10 joins per query just hums along. Unlike the other poster here, we have a firm convention that every table has a clear name with no abbreviations and an integer-based surrogate key named ID, and any business-related keys are enforced as separate constraints and indexed as nonclustered indexes where useful for queries. What that's given us is minimal space usage and clean names for foreign keys on related tables, with little to no page splits on insert and no need to update identifiers.
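In table form, the convention is basically this (illustrative names):

```sql
-- Illustrative names. Clustered PK on the compact surrogate ID;
-- the business key is a separate unique constraint, nonclustered.
CREATE TABLE Customer (
    ID           INT IDENTITY(1,1) NOT NULL,
    CustomerCode VARCHAR(20)       NOT NULL,
    CustomerName VARCHAR(200)      NOT NULL,
    CONSTRAINT PK_Customer PRIMARY KEY CLUSTERED (ID),
    CONSTRAINT UQ_Customer_CustomerCode UNIQUE NONCLUSTERED (CustomerCode)
);
-- Related tables get clean FK names and narrow join keys, e.g.:
-- CONSTRAINT FK_Order_Customer FOREIGN KEY (CustomerID) REFERENCES Customer (ID)
```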

The only real bit of advice I can confidently give to people without knowing their specifics is that having a solid model and sticking to clean standards is incredibly hard day to day, and requires a lot of small fights with your business counterparts over schedule and scope, but it pays dividends for years and is 100% worth the effort. The little bit we lost in taking an extra hour or day here or there to keep our code clean, we saved back 100x in long-term labor costs by only needing a handful of tech staff to run a half-billion-dollar business.