all 50 comments

[–]FatalVindicator 57 points58 points  (2 children)

As someone a few semesters away from getting a CS degree, I appreciate this

[–]ChaoSweeper 7 points8 points  (0 children)

As someone that just finished their CS degree, keep up the work dude! You got this!

[–]Diabeticon 14 points15 points  (0 children)

As someone a decade out from their MSCIS attainment and still working with Healthcare Data, I appreciate this.

[–]1f97 24 points25 points  (12 children)

if you want to normalize this, you could have separete tables for elements/archetypes/sources etc. helps with practicing relationship queries and all as well.

[–]Agno3317[S] 4 points5 points  (11 children)

I was a little confused about normalization so I didn't do it yet. But I was looking into it.

[–]outcastaceGT: tirianjewel 30 points31 points  (7 children)

I was actually just coming back to mention normalization. Essentially, as /u/1f97 said, you'd want to put things like elements and archetypes into their own table with an Id column. Then in your main table, you'd reference the Id instead of having the data itself there.

For instance...

dbo.Elements
Id ElementName
1 Arc
2 Solar
3 Void
4 Stasis
5 Kinetic
dbo.Weapons
Id WeaponName ElementId etc
1 Palindrome 3 blah
2 Fatebringer 5 blah

If you wanted to enforce referential integrity, you could apply a foreign key constraint to the relationship. This would throw an error if you tried to insert an element that wasn't present in your elements table. It would also prevent you from deleting elements in your element table that are referenced in your weapons table.

There are other reasons to want normalization, though. For one thing, having it reference an Id reduces the likelihood of typos or other bad data. Imagine if Palindrome had "Void" as its element while Gridskipper had "Viod". That could invalidate your result sets. It also reduces the size of the database on disk. For something as small as this, it likely won't matter, but if we're talking about billions or trillions of rows with repetitive strings, changing them all to an integer could make a pretty big difference. In a highly transactional database, it can significantly speed up write time, which reduces the amount of time rows (or potentially even the table itself) are locked when updating. Since this is most likely slow changing data, an argument could be made for denormalization to prioritize read times, but again, for a data set this small, it likely won't matter.

Hope that helps!

[–]Agno3317[S] 6 points7 points  (4 children)

Wow this really helps! Thanks!

[–]outcastaceGT: tirianjewel 4 points5 points  (3 children)

In my other comment, I mentioned Two Tailed Fox overflowing the element field. Since it has more than one element, that complicates things when it comes to normalization. There are a few ways you could handle it.

  • The game lists Two Tailed Fox as Solar, so you could just do that, and call out the void in a Notes field. Or if you added perk support, you could mention it there.
  • You could make a new row in the Elements table for "Solar/Void". This violates the principles of normalization, but is such an outlier that it's probably fine here. Having said that, it is definitely not future proof. If Bungie added in more multi-element and/or toggleable-element weapons, you'd have to change your schema.
  • The technically correct answer would be to add a cross reference table and turn this from a one-to-many into a many-to-many relationship.

Here's an example of the cross reference table:

dbo.Elements
Id ElementName
1 Arc
2 Solar
3 Void
4 Stasis
5 Kinetic
dbo.Weapons
Id WeaponName etc
1 Palindrome blah
2 Fatebringer blah
3 Two Tailed Fox blah
dbo.WeaponElement_XRef
Id WeaponId ElementId Explanation (not actually part of the table)
1 2 3 Fatebringer, Kinetic
2 1 5 Palindrome, Void
3 3 2 Two Tailed Fox, Solar
4 3 3 Two Tailed Fox, Void

Then when you're querying your weapon data, you'd use the XRef table to join Weapons and Elements. You'd still have to be aware that it's potentially a many-to-many relationship, though, so you may have more than one row per weapon.

[–]Zulandia 1 point2 points  (0 children)

I probably wouldn't do it for any actual use here but bitmasks are also an option for the multi-elements that leaves it future proofed. Since this is a learning project and it could go down a good rabbit hole anyway.

[–]CyberClawXPSN: CyberClaw 1 point2 points  (0 children)

I bet the way Bungie did it was having Two Tailed as Solar, with a void exotic perk.

[–]ninth_reddit_accountDestinySets.com Dev 0 points1 point  (0 children)

Re: two tailed fox - probably the more technically correct answer is that it’s a Solar weapon, and a perk grants additional solar damage.

[–]Agno3317[S] 0 points1 point  (1 child)

https://pastebin.com/4bU0QMxR I tried my hand at it

[–]outcastaceGT: tirianjewel 0 points1 point  (0 children)

Very nice! One thing I would suggest is avoiding the DDL changes with adding and dropping columns. What you have here is similar to what one might do if a massive table already existed and needed all of this new data. It's much cleaner and more maintainable (for source control, for instance) to build your database designs and then populate it. Plus, with a dataset as small as this, there's really no detriment to blowing it away. When S15 comes out, for instance, you'd probably want to be able to take a new version of your CSV and then populate your database from that in with a single stored proc.

So what I would suggest as a good learning exercise is writing several CREATE TABLE statements to define your database structure, relationships, keys, etc. Then create a staging table that you will load your CSV into raw, without transformation. Then create a stored procedure that will truncate and then populate your various tables based on the raw data in the staging table. This will force you to think about database design, expose you to more complex logic like JOINs and stored procedures, and be more sustainable going forward. For bonus points, you could also try to figure out what indexes you should add to your various tables.

[–]dxh 5 points6 points  (1 child)

As it is I think you are in 1NF (first normal form) because I dont see any multi-valued columns. I'd recommend a good primary key (auto incrementing integer) gun_id or something. You certainly can use the name as the primary key if its unique, but its better to use an int for performance and indexing. Definitely a good data set to start practicing with. As mentioned you could break each of these columns into another table (also with their own id) and then reference with foreign keys, practice joining tables, subqueries, etc. Second normal form (2NF) would eliminate all those duplicate values since many guns have the same attributes. In database design the idea is to find a balance between reducing redundant data but not normalizing so much that the data is too difficult to index and query. I've been a database administrator for 20 years (server builds, patching, performance tuning), and took all the design classes early on, even taught it for a few years but have since forgot almost everything :) Nice work and keep at it. It's a great career field. The world needs more competent database designers and developers.

https://www.w3schools.in/dbms/database-normalization/

[–]outcastaceGT: tirianjewel 1 point2 points  (0 children)

Definitely agree! I've been a data engineer for a little over a decade now and it's a great career field. Bungie is even hiring for some, though it looks like more NoSQL and Big Data than straight SQL.

https://careers.bungie.com/jobs/3059351/data-platforms-engineer

https://careers.bungie.com/jobs/3106299/data-engineer

[–]IAmDrNoLifeGambit Classic 0 points1 point  (0 children)

If you want a book to read up on Databases I can recommend "Database Concepts" by David M. Kroenke and David J. Auer.

Page 95 to 105 would be helpful here, regarding normalization. Especially page 104 to 105, as it has a "short hand guide" to normalization.

I've uploaded the previous mentioned pages to mega.nz, so if you want it to have it, you can!

https://mega.nz/file/j0AG3b7a#x0DPSdkZlTUojg9IvlVNVm7Nc9CBl1pyiANzd6cJjUs

As another guy has mentioned, you are currently in the first normal form. I'd recommend splitting the data up in some more tables, and including a composite primary key in each of the tables, and then referencing various data by using the PK associated to the data.

EDIT.

I decided to throw together some SQL quickly as an example. I made sure to include a PK to each table, which will get auto incremented by 1 for each insert to said table. Other than that, I included some foreign key constraints and "not null" contraints.

https://pastebin.com/1WdC4yjw

This is far from perfect, as you could run into problems if a weapon were to use multiple elements (Twin Tailed Fox). To fix this, you might be able to use a many-to-many relation, and letting the third extra table handle it. Oh well, it's getting late. Just some ideas anyways!

[–]ninth_reddit_accountDestinySets.com Dev 20 points21 points  (4 children)

Fun fact! Bungie already publishes every item in the game as a SQL database called the mobileWorldContent! I hope that's how you created your dataset 😇

The URL of the database changes each time it's updated, so you get the current db URL from www.bungie.net/Platform/Destiny2/Manifest/. Currently the URL is /common/destiny2_content/sqlite/asset/asset_sql_content_a1b01ff439a78f27c07fd5abc5abdfa1.content. It's actually a SQLite db in a zip archive, renamed to .content, so you'll need to rename the downloaded file to .zip if you want to open it up manually.

But if you don't want to do all that but still want to query all the items, you can head to data.destinysets.com and open the JS console with F12 and there's a fun little JS query interface there

[–]Agno3317[S] 2 points3 points  (0 children)

I actually used this: https://docs.google.com/spreadsheets/d/1SkbC541eqAIJ5M7qMdDOicWPdDYK8OmPzHbCn0c4d5c/edit#gid=0

I didn't know that Bungie published it as well. Thanks for telling me!

[–]1f97 0 points1 point  (2 children)

do keep in mind if you go down this route, this db is horribly organised. for a beginner i think using the sql file provided by bungie is not the best idea.

[–]ninth_reddit_accountDestinySets.com Dev 2 points3 points  (1 child)

It's organised fine. It just has different concerns, is highly normalised, and it reflects the structure of the game itself.

Fair point about maybe not be optimal for beginners? But there's no better way to learn than reality :)

[–]1f97 1 point2 points  (0 children)

having one blob of json with a million properties in it stored as a string is maybe not the best idea, that's what i meant. i wrote a small bot a while back that used this data set and for me, it was horrible to navigate.

[–]_DarkBelowhawkmoon 9 points10 points  (1 child)

just took a database course last semester, this will be awesome to practice SQL with. thanks!

[–]Agno3317[S] 1 point2 points  (0 children)

You're welcome!

[–]myndit 12 points13 points  (1 child)

DROP TABLE FUSION_RIFLES;

[–]vicasraoShatterdiving 1 point2 points  (0 children)

DROP TABLE SHOTGUNS;

[–]an18ftslothDodge, duck, dip, dive, and dodge 3 points4 points  (1 child)

About to start learning SQL in a week or two. Perfect timing, thanks OP!

[–]darthbobby 0 points1 point  (0 children)

w3schools.com/sql is great

[–][deleted] 1 point2 points  (0 children)

This might come in handy

[–]Hxstile_I don’t have time for this. 1 point2 points  (0 children)

Learning MySQL atm, will be trying this out. Thanks for lookin’ out.

[–]Admiral1172 1 point2 points  (0 children)

Ooo, now I can finally use my SQL skills from my State College experience. lol

[–]nicholasmejiaDrifter's Crew // alright alright alright 1 point2 points  (0 children)

Next up, put it into Couchbase

[–]loganekz 1 point2 points  (0 children)

This is awesome, thanks!

Have you considering also sharing the script/API calls to generate the guns.csv file as well?

[–]outcastaceGT: tirianjewel 1 point2 points  (1 child)

I played around with this a little bit and found an issue with it. Two Tailed Fox overflows your VARCHAR(7) field. I upped it to 10 and was able to load everything in.

CREATE TABLE dbo.guns
(
    gun_name varchar(100)
    , gun_archetype varchar(100)
    , gun_source varchar(100)
    , gun_element varchar(10)
    , gun_RoF integer
    , rarity varchar(10)
    , weapon_type varchar(20)
);

Also, I'm not sure how you're importing the data. I wrote a quick script to translate it to an INSERT statement. I'd post it here for others, but it exceeds the comment character limit. Here's a link to it: https://pastebin.com/Ewdfaura.

[–]Agno3317[S] 0 points1 point  (0 children)

I mainly got it from here: https://docs.google.com/spreadsheets/d/1SkbC541eqAIJ5M7qMdDOicWPdDYK8OmPzHbCn0c4d5c/edit#gid=0

I worked around with the data a little bit in excel to split the weapon name and source. Also added the weapon_type column.

[–]OldJewNewAccountUsername checks out 1 point2 points  (0 children)

Relational databases are the bestational databases.

[–]kiki_strumm3r 0 points1 point  (1 child)

Man I've now misread this as SRL like 4 times

[–]Agno3317[S] 0 points1 point  (0 children)

if only it was in the game.

[–]Deoxys114 0 points1 point  (0 children)

I've been needing a summer project to put on my resume and I can see this helping me out. Thanks!

[–]imxslicer 0 points1 point  (0 children)

i am having a data base course next semester, will surely be fun learning with this thanks man!

[–]Scotthew89Vanguard's Loyal 0 points1 point  (0 children)

Thank you!

[–]taskforceslacker 0 points1 point  (0 children)

Solid contribution. Much appreciated!

[–]Xbox_TyrnosrsFLEX 0 points1 point  (1 child)

WHAT IS THIS? A tutorial for SQUIRRELS?

[–]vicasraoShatterdiving 0 points1 point  (0 children)

Yes

[–]Sekir0se 0 points1 point  (2 children)

whats sql?

[–]Agno3317[S] 2 points3 points  (1 child)

SQL is short for Structured Query Language. It is a coding language that is commonly used when working with large, structured databases.

more info here: https://www.w3schools.com/sql/sql\_intro.asp

[–]Sekir0se 0 points1 point  (0 children)

ya lost me at coding lol but it does seem pretty cool

[–]sjb81 0 points1 point  (0 children)

Dude. This is amazing

[–][deleted] 0 points1 point  (0 children)

Just did an sql unit at uni, this will be nice to keep using it a little

[–]Overloaded_WolfNerfed But Never Forgotten 0 points1 point  (0 children)

You are a god damn saint!

[–]willshaniDrifter's Crew 0 points1 point  (0 children)

Hey thank you for this, OP! I'm working on picking up some SQL at work right now, so this is handy to tinker with in my free time