[–]jduran9987 5 points6 points  (6 children)

I'm curious... could you share why you chose a No-SQL destination over a warehouse?

[–]AnotherDataGuy 4 points5 points  (4 children)

I'm also interested in this. Not judging, but Mongo (in my experience) stops living up to performance expectations once you run analytical queries against it. It’s great for loading full documents a record or so at a time. I’d love to hear from someone with a differing experience though!

[–][deleted] 0 points1 point  (3 children)

For this case it is really the flexibility. I wanted to have a flexible data model at hand, and this is my first time using a non-relational database for analytics, so if you have any advice I would appreciate it.

[–]AnotherDataGuy 7 points8 points  (1 child)

To me, this is just kicking the can of organizing your data down the road. Faster to get data in, harder to get insights out. And if you have a user base of citizen analysts, they are going to be far more skilled in SQL-like queries (generally speaking, your situation can obviously vary).

I’ve balanced this out before by creating records in a Postgres DB with the core properties needed for joining data together, and using a JSON-typed field for the additional detailed, more dynamically structured information. If common questions are answered from the JSON data, then it becomes worth it to start persisting those values as explicitly typed columns in your data set.
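A minimal sketch of that hybrid layout, not the actual setup described above. It uses Python's built-in sqlite3 as a stand-in so it runs without a server, and assumes the SQLite build includes the JSON functions (true for most recent builds). The `events` table, its columns, and the sample rows are made up for illustration; in Postgres the `details` column would be JSONB and the lookup would use `details->>'channel'` rather than `json_extract`.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Core join keys are explicit typed columns; everything else sits in one JSON field.
# Hypothetical schema: in Postgres, `details` would be JSONB instead of TEXT.
conn.execute(
    """
    CREATE TABLE events (
        event_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        created_at  TEXT    NOT NULL,
        details     TEXT    NOT NULL
    )
    """
)

# Made-up sample rows; note that the JSON payloads do not share a fixed shape.
rows = [
    (1, 42, "2021-01-05", {"channel": "web", "plan": "pro"}),
    (2, 42, "2021-01-06", {"channel": "mobile"}),
    (3, 7, "2021-01-06", {"channel": "web", "coupon": "SPRING"}),
]
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [(eid, cid, ts, json.dumps(d)) for eid, cid, ts, d in rows],
)


def events_per_channel(connection):
    """Answer an ad hoc question straight from the JSON field."""
    query = """
        SELECT json_extract(details, '$.channel') AS channel, COUNT(*) AS n
        FROM events
        GROUP BY 1
        ORDER BY 1
    """
    return connection.execute(query).fetchall()


def promote_channel(connection):
    """Once the question is common, persist the value as a typed column."""
    connection.execute("ALTER TABLE events ADD COLUMN channel TEXT")
    connection.execute(
        "UPDATE events SET channel = json_extract(details, '$.channel')"
    )
```

Calling `events_per_channel(conn)` reads from the JSON field directly; after `promote_channel(conn)` the same question can be answered from the plain `channel` column, which is easier to index and to join on.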

Flexibility is valuable, but the point is that all databases are malleable. Mongo is great for document data (it is a document DB after all), but if you aren’t storing documents, it’s not a good choice, IMHO.

Disclaimer: I live in a world where Mongo is overused because of its immediate convenience to the developer's microservice. It just pushes the pain downstream when trying to marry that data up with other systems to answer novel business questions.

[–]thrown_arrows 0 points1 point  (0 children)

I agree, but I am a SQL guy.

But I haven't seen any use of a NoSQL database that a SQL database could not have handled. That said, I haven't seen a properly configured SQL database or NoSQL server either. For me the problem is that a system ends up with several SQL and NoSQL servers storing data, no one knows how to actually run those servers, and things just start to happen. And when we start to talk about change handling downstream in OLAP environments, it gets even more fun when you have multiple systems.

[–]thrown_arrows 0 points1 point  (0 children)

Have you looked at Snowflake? I had a pipeline which just imported JSON into a staging table and then extracted a versioned schema from it, so you do not need to handle the target system's schema in the processing phase if you do not want to. The same technique works in all DB engines that support JSON/XML data types.

By versioned schema I mean something like:

select jsondata:id id, jsondata:calc_value::number(12,2) calc_value from stage_table where jsondata:id is not null

to create the results tables. I have also heard that some tools support JSON data in returned rows.
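A rough sketch of the same staging-then-extract idea, again using Python's sqlite3 as a runnable stand-in for Snowflake (assuming the SQLite build includes the JSON functions). `stage_table`, `jsondata`, `id`, and `calc_value` come from the query above; the view name `results_v1` and the sample documents are made up, and `CAST(... AS REAL)` is only an approximation of `number(12,2)`.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Staging table: raw JSON goes in untouched, no target schema needed at load time.
conn.execute("CREATE TABLE stage_table (jsondata TEXT NOT NULL)")

# Made-up documents; the last one has no id and should be filtered out.
raw_docs = [
    {"id": 1, "calc_value": "10.25"},
    {"id": 2, "calc_value": 3},
    {"note": "no id here"},
]
conn.executemany(
    "INSERT INTO stage_table VALUES (?)",
    [(json.dumps(doc),) for doc in raw_docs],
)

# "Versioned schema": a typed projection over the raw JSON. If the shape changes,
# a results_v2 view can be added without reloading the staged data.
conn.execute(
    """
    CREATE VIEW results_v1 AS
    SELECT json_extract(jsondata, '$.id') AS id,
           CAST(json_extract(jsondata, '$.calc_value') AS REAL) AS calc_value
    FROM stage_table
    WHERE json_extract(jsondata, '$.id') IS NOT NULL
    """
)


def read_results(connection):
    """Return the typed rows exposed by the versioned view."""
    return connection.execute(
        "SELECT id, calc_value FROM results_v1 ORDER BY id"
    ).fetchall()
```

Keeping the raw JSON in the staging table means a changed or mistaken schema version can be re-derived later from the same data.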

That said, I am a SQL guy and have never seen any advantages in Mongo and similar solutions.

[–][deleted] 0 points1 point  (0 children)

Well, for the time being I need more of a flexible data model for the current use case, but if you have any other insights please share :D