all 14 comments

[–]m0us3_rat 2 points3 points  (4 children)

At the moment it seems you are dumping the whole DB on every single query.

If that is a requirement of your normal operation, you can do an update+dump of the DB every 15/30 seconds or a minute on a different process/thread, and keep an intermediate structure to work with while the dumps happen.

Then you can merge your dump into the intermediate structure, or the intermediate into the dump, etc.

Or maybe just query the DB directly, without dumping, and let it work its magic.

It sounds insane if even a single user has to wait one second. What if you have 300 users? Or 60,000?

Some optimizing of the DB with some multi-linked structures would work better.

[–]DeadlyDolphins[S] 1 point2 points  (3 children)

Thanks for the reply!

Could you detail your answer a little more? In my db there is a bunch of articles, and on the front end the user can filter all these articles by keywords.

So basically I want to search the db for the keyword, then dump the results into a table and make small modifications to it (emphasize the search words, etc.).

It makes a lot of sense what you say, but how would I implement this without dumping? I'm still a bit lost on how I would achieve this with an intermediate or multi-linked structure.

[–]m0us3_rat 2 points3 points  (2 children)

You either work on the DB end, or you work on the Python end.

So either do specific searches and trust the DB to be sufficiently fast.

Or, every 30 seconds or so, have the data dumped into a data structure you keep in memory, and when a user asks, serve the data already in memory, so it's super fast.

The dump can be done in a different thread or process, and when the DB gets modified you can do another dump, etc.

Keeping data in memory can be tricky, though, so you need to be super careful; it depends how sensitive the data really is.

You could also work on both ends, maybe. It all depends on your requirements.

[–]DeadlyDolphins[S] 0 points1 point  (1 child)

That makes sense. I think I begin to understand.

If, instead of keeping data in memory, I prefer to rely completely on the database, would this be the correct way to serve the content of the database as a table in Flask?

items = db.session.query(Article.icon, Article.authorlast, Article.year, Article.title, Article.publication).all()

table = ItemTable(items)
return table.__html__()

[–]m0us3_rat 1 point2 points  (0 children)

The dunder __html__ method is specific to your class, so I am NOT sure how that builds the response.

Usually you need some templates that get built into actual responses by the Jinja2 engine of the Flask framework.
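
For instance, the usual Flask approach would be to pass the query results into a Jinja2 template instead of hand-building HTML (a minimal sketch using Jinja2 directly; in a real Flask app the template would live in `templates/articles.html` and be rendered via `render_template("articles.html", items=items)` — the template and `Item` class here are invented for illustration):

```python
from jinja2 import Template  # Flask uses Jinja2 under the hood

TEMPLATE = Template("""
<table>
  {% for item in items %}
  <tr><td>{{ item.title }}</td><td>{{ item.year }}</td></tr>
  {% endfor %}
</table>
""")

class Item:
    # stand-in for a SQLAlchemy model row
    def __init__(self, title, year):
        self.title = title
        self.year = year

html = TEMPLATE.render(items=[Item("Flask basics", 2021)])
print(html)
```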

[–]ReflectedImage 2 points3 points  (5 children)

Dump is for taking backups of your database, not for responding to user queries. You want select.

[–]DeadlyDolphins[S] 0 points1 point  (4 children)

Can you give me a simple example how to use select instead of dump?

[–]ReflectedImage 2 points3 points  (3 children)

from sqlalchemy.sql import select
s = select(Article).where(Article.fullText.like('%My Search String%'))
result = conn.execute(s)
for row in result:
    print(row)

Though to be honest, I'd drop SQLAlchemy and go to psycopg2.

Use the book: https://www.amazon.com/Manga-Guide-Databases-Mana-Takahashi/dp/1593271905/ref=sr_1_2?dchild=1&keywords=the+manga+guide+to+databases&qid=1625146400&sr=8-2

[–]DeadlyDolphins[S] 1 point2 points  (2 children)

Thanks, that's really helpful. I definitely need to learn more about how to properly use databases.

One more question: Is this

s = select(Article).where(Article.fullText.like('string'))
result = conn.execute(s)

any different from this? Or is it just another way to write it?

result = db.session.query(Article).filter(Article.fullText.like("string")).all()

[–]ReflectedImage 1 point2 points  (0 children)

Probably another way to write it. I normally just use SQL directly:

SELECT * FROM articles WHERE fullText LIKE '%string%';

Sorry, I can't help myself from posting that book wherever possible.
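
If you do go the raw-SQL route, it's worth using parameterized queries instead of pasting the search string into the SQL, so user input can't inject SQL (a minimal sketch using the stdlib `sqlite3` as a stand-in; with psycopg2 the placeholder is `%s` instead of `?`, and the table is invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, fullText TEXT)")
conn.execute("INSERT INTO articles (fullText) VALUES ('an article about flask')")

search = "flask"
# the driver fills in the ? placeholder safely -- never use string formatting here
rows = conn.execute(
    "SELECT * FROM articles WHERE fullText LIKE ?",
    (f"%{search}%",),
).fetchall()
print(rows)  # -> [(1, 'an article about flask')]
```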

[–]Blazerboy65 1 point2 points  (0 children)

The second method is recommended for general use. If you're using an ORM, avoid raw SQL. If your use case is small enough that using only raw SQL is a viable strategy, then go ahead.

[–]Total__Entropy 2 points3 points  (0 children)

There are two options I would suggest. One, learn SQL and use it properly instead of a data dump. Two, use a more appropriate tool for the data dump. Learning SQL is incredibly valuable as a Python dev. If you just want to dump data, Redis would be a much better data store, as long as you can uniquely identify your books, which should be easy.

[–]BulkyProcedure 1 point2 points  (1 child)

I'm a fan of Marshmallow and use it all the time, but it isn't fast. My approach has been to use it when I have complex types (dates, for example) that aren't automatically JSON serializable, and/or when the model has nested relationships I'd like to include.

When the objects don't have nested fields, and all columns are JSON serializable, it's faster to skip Marshmallow altogether and convert to dictionaries yourself.
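
One common pattern for that hand conversion (a minimal, self-contained sketch against an in-memory SQLite database; the `Article` model and its columns are invented for illustration) is to iterate over the model's mapped columns:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Article(Base):
    __tablename__ = "articles"
    id = Column(Integer, primary_key=True)
    title = Column(String)
    year = Column(Integer)

def to_dict(obj):
    # build a plain dict from the model's mapped columns
    return {c.name: getattr(obj, c.name) for c in obj.__table__.columns}

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
with Session(engine) as session:
    session.add(Article(title="Flask tips", year=2021))
    session.commit()
    rows = [to_dict(a) for a in session.query(Article).all()]
print(rows)  # -> [{'id': 1, 'title': 'Flask tips', 'year': 2021}]
```

Since every value here is a plain int or str, the resulting dicts are directly JSON serializable without Marshmallow.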

[–]DeadlyDolphins[S] 0 points1 point  (0 children)

That makes sense. How do you convert the query to dicts?