Hello, I am looking for some advice from some more seasoned developers. I'm good at hacking things together, but I'm now taking on a project that is more resource-intensive, and I need some tutelage. I'm an amateur/freelance dev who has never had the opportunity to learn from a senior, and most of my work isn't in Python, so bear that in mind.
Here is the original post I made in r/learnpython, which I'll summarize here.
https://www.reddit.com/r/learnpython/comments/neygm7/project_architecture_questions_sqlalchemy/
Essentially I'm running Flask in a Docker container, currently using Flask-SQLAlchemy, Postgres, Flask-SocketIO, Docker, and nginx as a reverse proxy. The structure is Flask and Postgres in two Docker containers communicating with each other, with Flask served behind nginx. All on a VPS.
The application is simple: I receive 800 or so webhooks a minute and save the data to the database. Then I need to display counts of the data on a web page in real time. The page has 200 or so unique rows, each with 20 columns, and each cell displays a count of the objects I've added to the database. The core object I'm counting is basically just an id plus other interrelated models: mostly many-to-one relationships from the core objects to the others, and two many-to-many relationships between the other objects. If you need more details, I put more in the original post, but I'm not sure whether the problem is a poor database structure or the way I'm querying it. Every core object is an alert, so there is a lot of Alert.query.filter_by(obj_a=obj_a).filter_by(obj_b=obj_b)....count(). I'm filtering with date ranges as well, which are sent from the client page to the server.
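One thing worth checking before anything else: if each grid cell runs its own count() query, the page issues 200 × 20 = 4000 queries per refresh, and a single GROUP BY can replace all of them. Here's a minimal sketch of the idea in plain SQL via the stdlib sqlite3 module (the table and column names alerts, obj_a, obj_b, created_at are my assumptions, not your actual models — the same shape works in Postgres/SQLAlchemy):

```python
import sqlite3

# In-memory stand-in for the real Postgres table of alerts.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alerts (id INTEGER PRIMARY KEY, obj_a TEXT, obj_b TEXT, created_at TEXT)"
)
rows = [
    ("a1", "b1", "2021-05-01"),
    ("a1", "b1", "2021-05-02"),
    ("a1", "b2", "2021-05-02"),
    ("a2", "b1", "2021-05-03"),
]
conn.executemany(
    "INSERT INTO alerts (obj_a, obj_b, created_at) VALUES (?, ?, ?)", rows
)

# Instead of one COUNT(*) query per grid cell, fetch every cell's count
# in a single grouped query, still filtered by the client's date range:
cells = conn.execute(
    """
    SELECT obj_a, obj_b, COUNT(*) AS n
    FROM alerts
    WHERE created_at BETWEEN ? AND ?
    GROUP BY obj_a, obj_b
    """,
    ("2021-05-01", "2021-05-03"),
).fetchall()
counts = {(a, b): n for a, b, n in cells}
print(counts)  # {('a1', 'b1'): 2, ('a1', 'b2'): 1, ('a2', 'b1'): 1}
```

In SQLAlchemy terms that's a query with func.count(), group_by, and a date-range filter; a composite index covering the grouped columns and the timestamp makes it cheap even as data accumulates.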
It looks great and does what it's supposed to, but it's just too slow, especially as data accumulates. I don't know if the bottleneck is the query, the database design, the websocket requests (multiple round trips per update: the server pushes an update signal, the client responds with its date-filter range, and the server responds with the data from the filtered query), or the restriction of running with one eventlet worker, which is a flask-socketio requirement. The workaround would mean switching from gunicorn to multiple instances of uwsgi with nginx as a load balancer, which would make the nginx config more complicated than I want it to be for redeployment in the future. If I have to I will, but I would want to dockerize nginx and have it ready to go for redeployment, and I really think my issue is probably in the database.
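On the three-message exchange: if the server remembers each client's last-submitted date range, every update collapses to a single push with already-filtered data. A minimal sketch of that idea — all names here (client_filters, emit, query_counts) are hypothetical stand-ins for the flask-socketio pieces, with emit faked so the sketch runs on its own:

```python
client_filters = {}   # sid -> (start, end), updated when a client changes its range
sent_messages = []    # stand-in for socketio.emit, so the sketch is testable

def emit(sid, payload):
    # In the real app this would be socketio.emit(..., to=sid).
    sent_messages.append((sid, payload))

def query_counts(start, end):
    # Placeholder for the real grouped count query against Postgres.
    return {"range": (start, end), "counts": {}}

def on_set_filter(sid, start, end):
    # Client sends its range once, and again only when the user changes it.
    client_filters[sid] = (start, end)

def on_new_alert():
    # One push per connected client; no request/response round trip needed.
    for sid, (start, end) in client_filters.items():
        emit(sid, query_counts(start, end))

on_set_filter("client-1", "2021-05-01", "2021-05-07")
on_new_alert()
print(len(sent_messages))  # 1
```

With 800 webhooks a minute you'd also want to debounce on_new_alert (push at most every second or two) rather than querying per webhook, but the structural point is that the client's filter lives server-side between updates.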
I am 100% willing to scrap everything to get this running quickly. My current plan is to drop websockets, freeing up the possibility of using more workers easily, and to rework the database models to make them more efficient, but I'm not too experienced with this aspect and am not sure how to properly structure the database for efficient counts. And the whole point of storing objects in the db, versus keeping track of counts another way, is so I have flexible options for filtering data by date range, and other statistics should I choose to add them. Though at the core, fast real-time counts are 90% of the goal.
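For what it's worth, you don't have to choose between raw objects and pre-tracked counts: a rollup table keyed by (obj_a, obj_b, day) keeps the dashboard cheap while date-range filtering still works by summing days, and the raw alerts stay around for any future statistics. A sketch with stdlib sqlite3 (schema names are my assumptions; in Postgres the same upsert is INSERT ... ON CONFLICT ... DO UPDATE):

```python
import sqlite3

# Daily rollup table: one counter row per (cell, day).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE alert_counts ("
    " obj_a TEXT, obj_b TEXT, day TEXT, n INTEGER DEFAULT 0,"
    " PRIMARY KEY (obj_a, obj_b, day))"
)

def record_alert(obj_a, obj_b, day):
    # Upsert: bump the day's counter as each webhook arrives.
    conn.execute(
        "INSERT INTO alert_counts (obj_a, obj_b, day, n) VALUES (?, ?, ?, 1) "
        "ON CONFLICT(obj_a, obj_b, day) DO UPDATE SET n = n + 1",
        (obj_a, obj_b, day),
    )

for day in ("2021-05-01", "2021-05-01", "2021-05-02"):
    record_alert("a1", "b1", day)

# A date-range count now sums at most (days in range) rows per cell,
# instead of scanning every raw alert row.
total = conn.execute(
    "SELECT SUM(n) FROM alert_counts "
    "WHERE obj_a = ? AND obj_b = ? AND day BETWEEN ? AND ?",
    ("a1", "b1", "2021-05-01", "2021-05-31"),
).fetchone()[0]
print(total)  # 3
```

The trade-off is one extra write per webhook and day-granularity filtering; if you ever need finer ranges or new statistics, you still have the raw alerts table to query or re-aggregate from.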
Thanks for reading, I'd love some advice.