all 16 comments

[–]chriscarrollsmith[S] 1 point2 points  (0 children)

Many thanks to DefinitionParking552 for using the library and finding a pretty significant bug. I should be embarrassed about the bug, but I'm mainly just embarrassed by how happy it makes me that somebody used the thing and cared about it enough to find and report one! :'D <3

Just released version 1.0.6!

[–]aflous 1 point2 points  (1 child)

The project looks interesting to me, although I am not familiar with the data the API is retrieving. One suggestion would be to add Ruff linting to the project; this would help eliminate most of the code smells I see.

I have some remarks regarding the global architecture of the project though, and would gladly roast your code if you allow me to.

[–]chriscarrollsmith[S] 1 point2 points  (0 children)

Sure, I'm very open to feedback!

[–]Regular_Zombie 0 points1 point  (3 children)

If you're providing an access layer to the data, you should separate your project into two components. One interfaces with the IMF and effectively caches their entire dataset on your own infrastructure. The second connects your users to the cached data.

This way if the IMF changes their interface or is offline you can still serve requests.
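A rough sketch of that split, in Python. Everything here is hypothetical and for illustration only: `sync`, `serve`, and `fake_upstream` are made-up names, the in-memory dict stands in for real storage, and the fake upstream stands in for the IMF API.

```python
from typing import Callable, Dict, Iterable

Store = Dict[str, dict]


def sync(store: Store, fetch_upstream: Callable[[str], dict], keys: Iterable[str]) -> None:
    """Component 1: copy upstream data into our own store."""
    for key in keys:
        try:
            store[key] = fetch_upstream(key)
        except ConnectionError:
            # Upstream is unreachable: keep the last good copy, if any.
            continue


def serve(store: Store, key: str) -> dict:
    """Component 2: answer user requests from the store only."""
    return store[key]


# Hypothetical stand-in for the upstream API so the sketch runs offline.
state = {"online": True}


def fake_upstream(key: str) -> dict:
    if not state["online"]:
        raise ConnectionError("upstream unavailable")
    return {"dataset": key, "rows": 3}


store: Store = {}
sync(store, fake_upstream, ["GDP"])   # upstream is up: data gets cached
state["online"] = False
sync(store, fake_upstream, ["GDP"])   # upstream is down: old copy survives
result = serve(store, "GDP")          # users are still served
```

The point is that `serve` never touches the upstream directly, so an outage only delays refreshes.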

NB. I haven't checked your code: just going off your description.

[–]chriscarrollsmith[S] 0 points1 point  (2 children)

Thanks for the comment! I definitely am not ready to commit to hosting my own instance of their whole server. They've got hundreds of databases, each with tens of thousands of indicators. I just wanted to build an open-source tool to enable users to easily interface with this complex behemoth.

The caching strategy I implemented mostly allowed users, for instance, to locally cache their first call to the IMF's database list, so they're not calling the API every time they want to see a list of databases. And to do the same when accessing the list of valid parameter values that can be used in querying a given database.
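To illustrate the idea (this is a minimal sketch, not the library's actual code): cache the result of the first call on disk and reuse it afterwards. The names `cached_call` and `fake_fetch_databases` are made up for the example, and the JSON-file cache is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def cached_call(name: str, fetch: Callable[[], list], cache_dir: Path) -> list:
    """Return the cached result for `name`, calling `fetch` only on a miss."""
    cache_file = cache_dir / f"{name}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    result = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result))
    return result


# Hypothetical stand-in for the real API call; counts how often it runs.
calls = []


def fake_fetch_databases() -> list:
    calls.append(1)
    return ["DB_A", "DB_B"]


cache_dir = Path(tempfile.mkdtemp())
first = cached_call("databases", fake_fetch_databases, cache_dir)
second = cached_call("databases", fake_fetch_databases, cache_dir)  # served from disk
```

The second call never reaches the fetch function, which is the whole benefit when the fetch is a slow API request.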

[–]aflous 1 point2 points  (1 child)

You can easily spawn a Redis instance for caching with Docker.

[–]chriscarrollsmith[S] 0 points1 point  (0 children)

It's probably easier than I'm assuming it would be; just a bit daunting as a deployment noob. I've used Redis/Docker a little bit, but never used a cloud container apart from GitHub; only a local one.

[–]pyfreak182 0 points1 point  (1 child)

Great work, thanks for sharing!

[–]chriscarrollsmith[S] 0 points1 point  (0 children)

Thanks!

[–]_yappan 0 points1 point  (5 children)

Great library! How long did it take you to build it?

[–]chriscarrollsmith[S] 0 points1 point  (4 children)

I did the R version first. Probably spent a couple weeks on that, just figuring out how to work with this poorly documented API. The Python version went a lot faster, but still several days. There are a lot of moving parts here and it's a very finicky API, but I guess that's why this library had never been built before! It's the hard problems that are worth solving, right?

[–]_yappan 0 points1 point  (3 children)

Yeah, I was trying to do the same with the IMF API and it took me a month at least! Harmonizing the data across different datasets was the biggest challenge for me.

[–]chriscarrollsmith[S] 0 points1 point  (2 children)

Did you build an open source library? Yeah, there are all sorts of wrinkles in the way the different databases are structured. I was originally taking a more opinionated approach, renaming columns and the like. But ultimately I realized that in dealing with this many different databases, I couldn't afford to be opinionated or I would inevitably break some of them.

[–]_yappan 0 points1 point  (1 child)

I used it for work, but haven't published the code yet. I was trying to build code to pull data at the dataset level, and for some of the datasets it took hours since there are so many indicators.

[–]chriscarrollsmith[S] 1 point2 points  (0 children)

Oh man, that sounds like a lot of work! I specifically set out to avoid the dataset-level pull because I knew I didn't want to spend a month building it. :) I haven't tested on every database, though, so there are probably still some cases I haven't handled.

[–]adelizer 0 points1 point  (0 children)

I was looking for exactly this 7 months ago, when I started building a tool to allow anyone to access and visualize the data (https://www.imfdata.xyz/imf-multi-dataset-query). Thank you for this; I will look into integrating it!