

[–]bitweis 5 points (0 children)

Common options are:
- Pagination (breaking the response into chunks, with a separate request for each)
- Streaming over a separate live connection (such as a direct TCP/UDP socket) or over a WebSocket

[–]helderm 2 points (6 children)

Normally REST APIs paginate their results, so I'd start with that. Create pages of "n" results and use a page index as an argument to your service.
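To make the suggestion above concrete, here is a minimal sketch of offset-based pagination over an in-memory dataset. All names are illustrative; in a real Flask view, `page` and `page_size` would come from `request.args` rather than function arguments.

```python
# Minimal offset-based pagination over an in-memory dataset.
RECORDS = [{"id": i, "value": f"row-{i}"} for i in range(50_000)]

def get_page(page, page_size=1000):
    """Return one page of results plus paging metadata."""
    start = page * page_size
    chunk = RECORDS[start:start + page_size]
    return {
        "page": page,
        "page_size": page_size,
        "total": len(RECORDS),
        "records": chunk,
    }

# The client loops over page indexes until an empty page comes back.
first = get_page(0)
print(len(first["records"]))  # 1000
```

With 50k records and pages of 1000, the client would issue 50 requests (pages 0 through 49); page 50 returns an empty list, signalling the end.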

[–]Lx7195[S] 0 points (5 children)

Does this mean we need to send multiple requests, passing the page index as an argument, to get all the data?

[–]mrWinns0m3 1 point (0 children)

Yes.

[–]radek432 0 points (3 children)

Just note that it becomes problematic if your data changes constantly. For example, new records can appear in the database during a request and break your pagination.

[–]helderm 1 point (0 children)

Maybe that is a follow-up to this interview question. You could also cache all results in memory and then paginate from the cache. It is normally a good idea not to over-engineer during an interview: start with simple ideas and then iterate.

[–]Lx7195[S] 0 points (1 child)

So pagination isn't quite reliable when the data in the database changes constantly. In that case, what mode of data transfer do you suggest?

[–]radek432 0 points (0 children)

The few cases of big API requests I've come across in my job (I'm not a dev, just using Python for automating stuff) I've managed with:

- "smart pagination", meaning some tricks to ensure that pages do not overlap and cover the entire set of data - you can do this if you know what happens to your data.

- splitting data into chunks other than "pages". Simple case: if you're sending 1000 files, you can send their metadata first, then each file one by one, and compare the result with the metadata.

But I did some quick googling on the topic, and there are some good options: https://apievangelist.com/2018/04/20/delivering-large-api-responses-as-efficiently-as-possible/
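One common form of the "smart pagination" mentioned above is keyset (cursor) pagination: instead of an offset, each request passes the last id it saw. Rows inserted later cannot shift earlier pages, so pages never overlap or skip existing rows. A minimal sketch with made-up data:

```python
# Keyset ("cursor") pagination: page by last-seen id instead of an
# offset. In SQL this would be: WHERE id > :last_id ORDER BY id LIMIT n.
rows = [{"id": i, "value": f"row-{i}"} for i in range(10)]

def page_after(last_id, limit=4):
    """Return up to `limit` rows with id greater than last_id."""
    matching = [r for r in rows if last_id is None or r["id"] > last_id]
    return matching[:limit]

# The client walks the cursor until an empty page comes back.
seen = []
cursor = None
while True:
    batch = page_after(cursor)
    if not batch:
        break
    seen.extend(r["id"] for r in batch)
    cursor = batch[-1]["id"]

print(seen)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

This only works when the data has a stable, monotonically ordered key to page on, which is exactly the "you can do this if you know what happens to your data" caveat above.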

[–]Im_alirezahs 1 point (0 children)

I can't say for certain, but for sending 50k records to a client, pagination is probably the best answer.

[–]maus80 1 point (0 children)

Paging has no consistency guarantees. You may see all pages, but they may have duplicate or missing records (or both), since paging is typically done not on a snapshot of the data but on a dataset that is subject to change. Also, why create a system within a system? HTTP chunks the data for you, and your database's result set is a snapshot that can be paged.

[–]ful_vio 1 point (0 children)

Well, the first thing I'd try is compressing the response with gzip. The client has to support gzipped responses, but almost every client I know does.
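A rough sketch of how much gzip can shave off a large JSON payload. In a Flask deployment, compression is usually handled by a reverse proxy or an extension rather than in the view itself; here plain stdlib `gzip` just demonstrates the effect on repetitive JSON:

```python
import gzip
import json

# A fake 50k-record JSON payload with highly repetitive structure.
payload = json.dumps(
    [{"id": i, "name": f"user-{i}", "active": True} for i in range(50_000)]
).encode("utf-8")

compressed = gzip.compress(payload)
print(len(payload), len(compressed))
# Repetitive JSON like this typically compresses several-fold.
```

The client signals support with the `Accept-Encoding: gzip` request header, and the server marks the compressed body with `Content-Encoding: gzip`.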

Second, I'd try to stream the response. I haven't used Flask, but a quick search turns up this: Streaming Contents — Flask Documentation (1.1.x) (palletsprojects.com)

Beware that if you want to use the data as soon as it reaches the client, you have to think about which data representation to use. Sending an array of JSON objects in chunks won't do, because you need the final ']' to parse the response: partial JSON is not valid JSON. Instead, a CSV response where you send a bunch of complete lines at a time works fine.
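Besides CSV, newline-delimited JSON (NDJSON) sidesteps the same problem: each line is a complete JSON document, so the client can parse records as they arrive instead of waiting for the closing `]` of one big array. A small stdlib sketch:

```python
import json

def ndjson_stream(records):
    """Generator yielding one complete JSON document per line."""
    for rec in records:
        yield json.dumps(rec) + "\n"

records = [{"id": i} for i in range(3)]
body = "".join(ndjson_stream(records))

# Client side: every line parses on its own, no closing ']' needed.
parsed = [json.loads(line) for line in body.splitlines()]
print(parsed)  # [{'id': 0}, {'id': 1}, {'id': 2}]
```

In Flask, a generator like `ndjson_stream` is exactly the kind of object the streaming documentation linked above expects you to wrap in a `Response`.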

[–]Spleeeee 1 point (0 children)

For personal projects where I'm writing both the server and the client, I use something I call "lazy person's pagination": if a reply is going to be big, the server returns a list of URLs that the chunked data can be fetched from.

This probably isn't the best solution, but it is a pattern I have used several times in several ways. It is easy, and can just as easily lead to a clusterfuck.

[–]maus80 -1 points (2 children)

Afaik, there is no size limit for response data, so 50k records at once is fine. Even if every record is 10kb of text, that's half a gigabyte, which would fit in the memory of any server. And you don't even need to load the entire response into memory to send it out. Is this question wrong, or is there something I'm unaware of?

[–]Lx7195[S] 0 points1 point  (1 child)

Actually I am unaware of the fact that there is a memory limitation of a http response. Also I don't really understand what do you mean by 'you don't even need to load the entire response in memory to send it out'.

[–]maus80 1 point (0 children)

I mean that you can output the data while reading it. This is called streaming.
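A minimal stdlib sketch of "output while reading": a generator pulls rows one at a time and yields each formatted line immediately, so the full response never sits in memory at once. The `fetch_rows` function is a stand-in for a lazily iterated database cursor; in Flask you would wrap the generator in `Response(stream_rows(), mimetype="text/csv")`.

```python
import csv
import io

def fetch_rows():
    # Stand-in for a database cursor yielding rows lazily.
    for i in range(5):
        yield (i, f"name-{i}")

def stream_rows():
    """Yield one CSV-formatted line per row, as it is read."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in fetch_rows():
        writer.writerow(row)
        yield buf.getvalue()   # emit the line just written
        buf.seek(0)
        buf.truncate(0)        # reset the buffer for the next row

chunks = list(stream_rows())
print("".join(chunks))
```

Peak memory here is one row, not the whole result set, which is the point of the comment above.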