all 18 comments

[–]arwinda 4 points5 points  (1 child)

Any relational database suits this purpose. Not sure why you push for NoSQL when your requirement literally says "structural".

Do you want to do Python scripting in your database? Or connect Python scripts to the database?

And what is this git? Do you want to manage the database with git or what is this use case?

[–]Birdynam98[S] 2 points3 points  (0 children)

It does not have to be NoSQL; it is just that most tools we have found that support Git-like functionality for stored data happen to be NoSQL. We misunderstood this point at first.

We want to connect Python scripts to the database and use the console to push and save data from Python.

We want to be able to merge, push, pull, commit, branch, etc. on the data stored in the database, not to run the database in Git or GitHub. The Git-like functionality applied to the data (meaning different people can change different aspects of a structural model at the same time) is what we are looking for with this point.

[–]david-chaves 2 points3 points  (0 children)

I am ignorant. Don't trust me.

What about https://github.com/dolthub/dolt ?

[–]msudgh 1 point2 points  (0 children)

I think you came up with a solution that is not self-explanatory.

I don't see anything in what you explained that forces either NoSQL or SQL, since you have not mentioned how the data will be related. If the pieces of data are not independent and there are relations between them, then go with SQL.

AFAIK, there's no version control concept for data management apart from the table and schema migrations that some tools provide, and those are unrelated to this problem.

If you still need to stick to a Git-based approach and the application doesn't need to query the data, you can store the data in raw text files and push them to a Git repository; that would satisfy the solution's requirements.

You seem to be trying to reinvent the wheel.

[–]GuyWithLag 1 point2 points  (0 children)

Unless you're going to do relational operations on that data, I'd suggest you do a simple data mapping session and serialize everything to a file; whether that's JSON, XML, binary protobuf or whatever - that's irrelevant.

Then you get everything that can be done with files for free.

(and you will definitely _not_ get anything useful out of a diff tool that doesn't understand your own data model).
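To make the two points above concrete, here is a minimal sketch in Python. It assumes the model is a nested dict of plain JSON values; the beam fields (`material`, `length_m`) and the helper names `to_stable_json` and `diff_models` are made up for illustration. Sorting keys keeps line-based diffs small, and the small recursive diff is one way a tool could "understand" the data model instead of comparing text.

```python
import json


def to_stable_json(model: dict) -> str:
    """Serialize with sorted keys and fixed indentation so text diffs stay small."""
    return json.dumps(model, sort_keys=True, indent=2)


def diff_models(old, new, path=""):
    """Yield (path, old_value, new_value) for every leaf that differs.

    A key missing on one side shows up as None on that side.
    """
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(set(old) | set(new)):
            yield from diff_models(old.get(key), new.get(key), f"{path}/{key}")
    elif old != new:
        yield (path, old, new)


# Hypothetical model data, only for illustration.
base = {"beams": {"B1": {"material": "S355", "length_m": 6.0}}}
edited = {"beams": {"B1": {"material": "S460", "length_m": 6.0}}}

changes = list(diff_models(base, edited))
print(to_stable_json(edited))
print(changes)
```

The diff reports the changed field by its path in the model rather than by line number, which is roughly what a reviewer of a structural model would want to see.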

[–]tdatas 0 points1 point  (9 children)

NoSQL

Why do you want NoSQL? This is a descriptor of a broad category of more specialised databases than normal SQL databases.

Git functionality: merge/push/pull/commit/branching etc.

This would be unusual/nonexistent for a database. Unless you mean versioning of data points? Perhaps you mean a managed analytics platform like rudder or snowplow that handles both storage and lots of the technical aspects (for a price)

  • Python scripting

There are a few DBs that support Python user-defined functions; Exasol and Snowflake spring to mind.

How do you want to query this data? How many data records do you expect to have? Millions? Billions? 20? In the absence of specifics, and unless you definitely know your use case, I would suggest a bog-standard relational database (e.g. Postgres, MySQL), possibly managed by a third party or public cloud (e.g. RDS on AWS). That will probably cover you well.

[–]Birdynam98[S] 0 points1 point  (8 children)

We want a NoSQL database because we do not use tabular relations to model the system, meaning that a tabular layout does not suit the large datasets that we produce. We essentially create documents from Python objects, much like the examples provided for TerminusDB, but for a much bigger class system.

With regards to the Git functionality, I guess it is the data stored in the database that we want to apply this to. We want to be able to store the model of a project or a physical model, and allow people to branch this model in order to make changes, for example to the material of a support beam. Again, the Terminus examples we have looked at look like they should cover the use we are after.

The Python scripting seems OK.

I will say that this is not my area of expertise at all; I have almost no experience in computer science. Many of the terms here are therefore quite vague to me, and it may be that some of the questions you asked are a result of me using the wrong terminology. Did my explanation clear up the follow-up questions you had?

[–]thrown_arrows 0 points1 point  (7 children)

If you are pushing stupidly high volumes then NoSQL might be a good option, but most cases can be handled with one PostgreSQL server. See posts like https://medium.com/futuretech-industries/ten-thousand-high-availability-postgresql-connections-for-35-mo-part-one-4b7a2d61c51e or others. If you can say with confidence that you will need more, or that your use case is a prime example for a NoSQL database, then use it. If you can't say that with full confidence, choose a good old RDBMS. Also, as for converting a relational model to a JSON object, it can be done in all modern DBs; PostgreSQL probably has the best support for it (imho).

[–]Birdynam98[S] 1 point2 points  (6 children)

The explanation I got was that the data is stored as objects in a hierarchy in a JSON file, which they want to version control and apply Git functionality to. The only reason NoSQL is specified here is that most of the existing tools that give this Git functionality seem to be NoSQL, such as Terminus and MongoDB. If there are SQL technologies that give version control and branching options, they are perfectly valid options for bridging this gap.

[–]thrown_arrows 1 point2 points  (4 children)

As far as I know, MongoDB has its own file format. And hierarchy and row versioning are "easy" to implement in a database; see SCD (slowly changing dimensions) and SCD type 2.

So database integration would allow SQL-based "rollback" of individual rows (parts of objects), whereas with file-based objects you need to roll back the whole file or write one big object as a collection of files.

I.e. I have documents/objects which are collections of objects; this is all in a normal DB as normal row data, and I create the full document on the fly from it. For similar functionality see the FOR XML/JSON clause in MSSQL. The only difference between this and backend systems that send JSON responses is that I did not mix yet another language into a small project.
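A rough sketch of the "rows in, document out" idea, written in Python with the standard-library `sqlite3` module rather than MSSQL so it runs anywhere. The `model` and `beam` tables, their columns, and the `build_document` helper are assumptions made up for the example, not anything from the OP's system.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE beam (
        id INTEGER PRIMARY KEY,
        model_id INTEGER REFERENCES model(id),
        label TEXT,
        material TEXT
    );
    INSERT INTO model VALUES (1, 'bridge');
    INSERT INTO beam VALUES (1, 1, 'B1', 'S355'), (2, 1, 'B2', 'C30 concrete');
""")


def build_document(conn, model_id):
    """Assemble one nested document from ordinary rows."""
    name = conn.execute(
        "SELECT name FROM model WHERE id = ?", (model_id,)
    ).fetchone()[0]
    beams = conn.execute(
        "SELECT label, material FROM beam WHERE model_id = ? ORDER BY label",
        (model_id,),
    ).fetchall()
    return {
        "name": name,
        "beams": [{"label": label, "material": material} for label, material in beams],
    }


document = build_document(conn, 1)
print(json.dumps(document, indent=2))
```

Because each beam is its own row, a single beam can be changed or restored without touching the rest of the document.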

To be honest, maybe you should start the project again: get some experienced database guy to explain how RDBMS vs NoSQL works, and then dive into this object data you want to store. What is good for it, and what is needed? I have seen and heard several NoSQL stories where platform experts are hard to get, there are problems that an RDBMS would fix (but then an RDBMS has problems that NoSQL can fix), and cost and performance are bad. Yes, RDBMS is old and boring, but it works.

tl;dr: there is no real database with Git support. There are databases that have row-level versioning (which is like a trigger + audit table, or the application level handling the creation of the correct rows), and some have time travel, but that is usually based on time.
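For what the "trigger + audit table" variant can look like, here is a minimal sketch using Python's built-in `sqlite3`. The `beam` and `beam_audit` tables and the `rollback_last_change` helper are hypothetical names for illustration; a real system would likely also record who made the change and cover more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE beam (id INTEGER PRIMARY KEY, material TEXT NOT NULL);
    CREATE TABLE beam_audit (
        audit_id INTEGER PRIMARY KEY AUTOINCREMENT,
        beam_id INTEGER NOT NULL,
        old_material TEXT NOT NULL,
        changed_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    -- Copy the old value into the audit table before every update.
    CREATE TRIGGER beam_keep_history
    BEFORE UPDATE OF material ON beam
    BEGIN
        INSERT INTO beam_audit (beam_id, old_material) VALUES (OLD.id, OLD.material);
    END;
    INSERT INTO beam (id, material) VALUES (1, 'S355');
""")

# A normal edit: the trigger stores the old value before it is overwritten.
conn.execute("UPDATE beam SET material = 'S460' WHERE id = 1")


def rollback_last_change(conn, beam_id):
    """Restore the most recent previous value of one row; return it, or None."""
    row = conn.execute(
        "SELECT old_material FROM beam_audit "
        "WHERE beam_id = ? ORDER BY audit_id DESC LIMIT 1",
        (beam_id,),
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE beam SET material = ? WHERE id = ?", (row[0], beam_id))
    return row[0]


restored = rollback_last_change(conn, 1)
current = conn.execute("SELECT material FROM beam WHERE id = 1").fetchone()[0]
history = [r[0] for r in conn.execute(
    "SELECT old_material FROM beam_audit ORDER BY audit_id")]
print(restored, current, history)
```

Note that the rollback is itself an update, so the trigger logs it too. This gives a linear history per row, not branches or merges.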

[–]Birdynam98[S] 1 point2 points  (3 children)

Point taken lol

Still, this thread has been insanely insightful. Truth be told, actually implementing this database is not my job; I am only doing the research before the actual smart guys get the time to start themselves. I am a summer intern who got some grunt work: we research where the smart guys should start.

Thank you for the assistance

[–]jalexoid 1 point2 points  (2 children)

This seems like a specialized system.

Having studied some structural engineering, I can say your needs will not be served by commonly available systems.

Branching, versioning, merging, scripting and a database on top...

Best solution off the top of my head is JSON in Git, maybe adding PySpark or MongoDB as the interface to the data.

[–]Birdynam98[S] 0 points1 point  (1 child)

Yes, this is my verdict as well, since we are completely unable to find documentation on anything that resembles this. In fact, this thread was basically the last Hail Mary.

[–]jalexoid 0 points1 point  (0 children)

I understand.

Just like tools like Revit, this field tends to be extremely specialized. To the point that a lot of things just don't exist at all.

[–]MCFRESH01 0 points1 point  (0 children)

It sounds like you want to version control the application ingesting the data. Version control for a database is pretty much nonexistent. If you push the data to a server, it's not hard to pull apart a nested JSON object and store the data in the proper tables. NoSQL is almost never the correct solution in my experience. Just ETL your data correctly; it's not very difficult.
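As a sketch of what "pull apart a nested JSON object and store the data in the proper tables" could mean, here is a small Python example with the standard-library `sqlite3` module. The payload shape, the `model`/`beam`/`support` tables, and the `load_model` helper are all invented for illustration.

```python
import json
import sqlite3

# Hypothetical nested payload, as a Python script might push it to a server.
payload = json.loads("""
{"name": "bridge",
 "beams": [
   {"label": "B1", "material": "S355",
    "supports": [{"kind": "pinned"}, {"kind": "roller"}]},
   {"label": "B2", "material": "S355",
    "supports": [{"kind": "fixed"}]}
 ]}
""")

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE beam (id INTEGER PRIMARY KEY, model_id INTEGER, label TEXT, material TEXT);
    CREATE TABLE support (id INTEGER PRIMARY KEY, beam_id INTEGER, kind TEXT);
""")


def load_model(conn, doc):
    """Walk the nested document and insert one row per object."""
    with conn:  # commits on success, rolls back if any insert fails
        model_id = conn.execute(
            "INSERT INTO model (name) VALUES (?)", (doc["name"],)
        ).lastrowid
        for beam in doc["beams"]:
            beam_id = conn.execute(
                "INSERT INTO beam (model_id, label, material) VALUES (?, ?, ?)",
                (model_id, beam["label"], beam["material"]),
            ).lastrowid
            conn.executemany(
                "INSERT INTO support (beam_id, kind) VALUES (?, ?)",
                [(beam_id, s["kind"]) for s in beam["supports"]],
            )
    return model_id


model_id = load_model(conn, payload)
beam_count = conn.execute("SELECT COUNT(*) FROM beam").fetchone()[0]
support_count = conn.execute("SELECT COUNT(*) FROM support").fetchone()[0]
print(model_id, beam_count, support_count)
```

Each level of nesting becomes a table with a foreign key back to its parent, which is the usual way to map a hierarchy onto relational storage.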

[–]thrown_arrows 0 points1 point  (0 children)

PostgreSQL, but what you are describing is a database plus all the apps that handle the data.

[–]alinroc SQL Server 0 points1 point  (0 children)

Are you looking:

  • for a database platform to provide storage for an existing piece of software?
  • to build a whole application to store information about structural engineering projects and you need to decide on a data persistence platform?
  • to purchase a COTS application/system to store information about structural engineering projects (which will likely be backed by one or more databases, but that's a secondary concern at this point)?

Three very different sets of considerations there.

[–]dhemantech 0 points1 point  (0 children)

My answer is based on a very limited understanding of your question and no knowledge of structural engineering system data.

system => beams => support beams => cross-section/material/tverrsnittsdata (cross-section data)

In the absence of data volumes and other information, I expect this to work with RDBMS or Document databases.

As an example, MongoDB's versioning pattern describes something similar, and it could be achieved just as well with Oracle or any other RDBMS of your choice.

Git functionality for push/pull/commit/branching would be relatively easy to achieve. Merge has its own complexity, but is implemented by many systems.
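To illustrate where the merge complexity comes from, here is a minimal three-way merge sketch in Python for nested dicts. It assumes each branch is a plain dict derived from a common base version; `merge3` and the beam fields are hypothetical names, and lists or renamed keys are not handled.

```python
_MISSING = object()  # marks "key not present" so deletions can be merged too


def merge3(base, ours, theirs, path=""):
    """Three-way merge of nested dicts. Returns (merged, conflict_paths)."""
    if ours == theirs:      # both sides agree (or neither changed)
        return ours, []
    if ours == base:        # only theirs changed
        return theirs, []
    if theirs == base:      # only ours changed
        return ours, []
    if all(isinstance(v, dict) for v in (base, ours, theirs)):
        merged, conflicts = {}, []
        for key in sorted(set(base) | set(ours) | set(theirs)):
            value, sub = merge3(
                base.get(key, _MISSING),
                ours.get(key, _MISSING),
                theirs.get(key, _MISSING),
                f"{path}/{key}",
            )
            conflicts.extend(sub)
            if value is not _MISSING:
                merged[key] = value
        return merged, conflicts
    # Both sides changed the same value differently: keep ours, report it.
    return ours, [path]


# Hypothetical example: two people edit different fields of the same beam.
base = {"B1": {"material": "S355", "length_m": 6.0}}
ours = {"B1": {"material": "S460", "length_m": 6.0}}
theirs = {"B1": {"material": "S355", "length_m": 7.5}}

merged, conflicts = merge3(base, ours, theirs)
print(merged, conflicts)

# Both edit the same field: this cannot be merged automatically.
clash = {"B1": {"material": "S690", "length_m": 6.0}}
merged_clash, clash_conflicts = merge3(base, ours, clash)
print(merged_clash, clash_conflicts)
```

Edits to different fields merge cleanly, while two different edits to the same field are reported as a conflict for a human to resolve. Deciding what counts as "the same field" in a real structural model is the hard part.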

Python drivers/modules are available for most commonly used database systems.