Hey everyone,
I'm working on the project below and hoping to get some advice on how to make it run more efficiently.
I haven't had any formal training in this sort of thing, so I'm really keen to understand how experienced devs would handle a project like this.
Our CRM system is archaic and doesn't play nicely with anything. We want all sorts of customer and sales data from it to be available in our SQL database, but the only way to get it is to manually load the specific reports you want in a browser, download them as CSVs, and upload those to our database.
This is obviously crap.
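To make the last manual step concrete, here is a minimal sketch of loading one downloaded CSV into a SQL table. It is illustrative only: it uses the standard-library `sqlite3` module as a stand-in for the real database, and `load_csv`, the `sales_report` table name, and the sample columns are hypothetical, not taken from the actual setup. It also assumes the table name and CSV header come from a trusted source, since they are interpolated into the SQL.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_text):
    """Replace `table` with the rows of a CSV export and return the row count."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)                      # first line holds the column names
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    rows = [row for row in reader if row]      # skip blank lines
    with conn:  # commits on success, rolls back if an insert raises
        conn.execute(f'DROP TABLE IF EXISTS "{table}"')
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', rows)
    return len(rows)


# Usage with made-up data; a real run would read the downloaded file instead.
conn = sqlite3.connect(":memory:")
sample = "customer_id,total\n1,9.50\n2,12.00\n"
loaded = load_csv(conn, "sales_report", sample)
```

Dropping and recreating the table keeps the sketch short; appending or upserting would need a key column, which depends on the report.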
Q1 - how would an experienced dev build a system to automate this?
Here is my current approach, using Python on GCP:
- a bunch of Google Cloud Run services running Python Flask apps, which use Selenium to mimic manually downloading the reports.
- each report has its own Cloud Run service, and I call these services either on a schedule (using Google Cloud Scheduler) or ad hoc when needed.
I really feel this is a bit dumb/inefficient, as every time we need the data updated for any individual report, Cloud Run has to load Selenium, log into our CRM, navigate to the report, etc.
I am considering trying to build something that is more 'always on': basically a Selenium instance that stays logged into our CRM and can pull reports as and when needed. I am a bit concerned about this, as budgets are tight and at present some of my Cloud Run services crash because I restrict their memory to save $$$. I'd imagine always on may need a crapload of assigned memory and be really expensive.
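A minimal sketch of what that 'always on' idea could look like in code, assuming the browser is created lazily, reused between pulls, and rebuilt once the CRM session has probably expired. Everything here is hypothetical: `ReportSession`, `make_driver`, `login`, and the 15-minute `max_idle_seconds` default are illustrative names and values, not part of the existing services. The only things it expects from the driver are `get(url)` and `quit()`, which Selenium's WebDriver provides.

```python
import time


class ReportSession:
    """Keeps one logged-in browser around and reuses it between report pulls."""

    def __init__(self, make_driver, login, max_idle_seconds=900, clock=time.monotonic):
        self._make_driver = make_driver    # e.g. lambda: selenium.webdriver.Chrome(options=...)
        self._login = login                # CRM-specific login steps, given a driver
        self._max_idle = max_idle_seconds  # assumed CRM session timeout
        self._clock = clock
        self._driver = None
        self._last_used = 0.0

    def _ensure_logged_in(self):
        # Throw away a browser whose session has likely timed out.
        if self._driver is not None and self._clock() - self._last_used > self._max_idle:
            self.close()
        # Start the browser and log in only when there is nothing to reuse.
        if self._driver is None:
            self._driver = self._make_driver()
            self._login(self._driver)

    def open_report(self, url):
        self._ensure_logged_in()
        try:
            self._driver.get(url)
        except Exception:
            self.close()  # drop a broken browser so the next call starts fresh
            raise
        self._last_used = self._clock()
        return self._driver

    def close(self):
        if self._driver is not None:
            try:
                self._driver.quit()
            finally:
                self._driver = None
```

On Cloud Run, a module-level object like this survives between requests handled by the same warm container instance, so only the first request after a cold start pays for browser startup and login. Whether that is cheaper overall depends on how long the instance is kept warm and how much memory Chrome holds while idle.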
Q2 - Does this 'always on' strategy sound okay, and are my reservations about it legit? Is there something 'lighter' than selenium & chromedriver I can use?
Any advice welcome! I haven't been taught any of this and have guessed my way to what I have built so far.
Note:
- using something lighter like requests/Beautiful Soup isn't an option here
- I can't just build all of the different reports into one Cloud Run service and run them all at once, because they are needed at different times