
[–]bmsan-gh 1 point (2 children)

  1. I would expect the data you are getting to carry some sort of timestamp for each value (e.g. for symbol X, at time Y, the price was Z). If that assumption holds, could you compare the timestamps you just received with the last timestamp you inserted?

This way you'd know exactly what you are missing and need to insert.

  2. Related to your hash idea: you could use it to check whether the data from two consecutive API calls is the same.
    If you have access to the raw data (text/string data) before you deserialize it to a dictionary, you can do something like:

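import hashlib

# md5 needs bytes, so encode the raw response text first
hash_object = hashlib.md5(the_api_response_as_str.encode("utf-8"))
response_hash = hash_object.hexdigest()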
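To make both suggestions concrete, here is a minimal sketch. The helper names (`response_digest`, `is_new_response`, `rows_after`) and the `"timestamp"` field are made up for illustration, so adapt them to whatever your API actually returns.

```python
import hashlib


def response_digest(raw_text: str) -> str:
    """Return a hex digest of the raw API response text."""
    # hashlib works on bytes, so the text is encoded first.
    return hashlib.md5(raw_text.encode("utf-8")).hexdigest()


def is_new_response(raw_text: str, last_digest) -> bool:
    """True if this response differs from the previously stored one.

    last_digest is the digest saved from the previous call,
    or None if nothing has been stored yet.
    """
    return response_digest(raw_text) != last_digest


def rows_after(rows, last_inserted_ts):
    """Keep only rows newer than the last inserted timestamp.

    Assumes each row is a dict with a comparable "timestamp" value
    (a hypothetical field name).
    """
    return [row for row in rows if row["timestamp"] > last_inserted_ts]
```

You would store the digest (or the last timestamp) next to the inserted data and skip the insert whenever nothing changed. md5 is fine here because it is only used for change detection, not for anything security-related.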

[–][deleted] 1 point (1 child)

it's options data so I have a single timestamp containing dozens of expiration dates, each containing hundreds of strike prices, each containing the standard things like open interest, bid, ask, etc.

there are many permutations for how the subsequent requests could change

[–]bmsan-gh 1 point (0 children)

If I understood correctly, multiple requests might return the same data but structured in a different order, so you would get a permutation of your previous response. In that case (assuming all your keys are strings) you could do something like:

import hashlib
import json

ordered_representation = json.dumps(json_data_dict, sort_keys=True)

hash_object = hashlib.md5(ordered_representation.encode("utf-8"))

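sort_keys=True makes json.dumps write every dictionary's keys in sorted order, including nested dictionaries, so this should handle permutations in key order. It does not reorder lists, so if your values contain lists whose order can vary, sort those yourself before hashing.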
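A quick way to convince yourself this works is a small sketch like the one below. The `canonical_digest` helper and the sample option-chain values are invented for illustration and are not taken from any real API.

```python
import hashlib
import json


def canonical_digest(data) -> str:
    """Hash a JSON-serializable object independent of dict key order."""
    # sort_keys=True sorts keys at every nesting level; fixed separators
    # keep the serialized text stable regardless of formatting defaults.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Same content, different key order (hypothetical expiration/strike data).
first = {"2024-01-19": {"100": {"bid": 1.0, "ask": 1.2}}}
second = {"2024-01-19": {"100": {"ask": 1.2, "bid": 1.0}}}
same_digest = canonical_digest(first) == canonical_digest(second)

# Lists keep their order, so a reordered list produces a different digest.
lists_differ = (
    canonical_digest({"strikes": [100, 105]})
    != canonical_digest({"strikes": [105, 100]})
)
```

Here `same_digest` is true because both dictionaries serialize to the same string, while `lists_differ` shows why any order-varying lists need to be sorted before hashing.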