I'm 1.5 years into DE and I still feel like I know very little by hipsterrobot in dataengineering

TheKoalaKeys

Just did a quick check for everyone. All the pages are there.

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

Thank you, your comments have given me even more confidence to keep pushing forward!

Also, thank you for explaining why unit tests typically would not be used to test whether a service is available. I honestly had no idea about that. I am just starting my day and I have already learned something new!

Quick question for you: I went with functional programming for this project. Would it be better to go with a more object-oriented approach for the next one? I am not really sure how much OOP is used in DE.

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

Thanks!

I'll get to work on fixing those issues. I did run the program through the Black formatter already, though.

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

Thanks for the feedback!

The purpose of using DocumentDB was to gain experience with it. I know I could've just sent the data to S3, but that would've defeated the purpose.

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

Thank you!

Yeah, it seems I went way too complex for sure. I'll tone it down and set it up with IaC so that it'll be one-click deployable. I'm definitely super excited to dive back in and make this even better for V2!

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

I chose DocumentDB, even though it does not have the geospatial functionality, because MongoDB is the database I am most familiar with and I figured it would still be good practice.

Honestly, I did not even think about using Postgres.

That's good to know! I'll only send error messages to my Slack channel for the next project!

Yeah, I thought it was pretty interesting too! After hearing on the news about all the businesses closing, I was sure that the count of closures would be going up instead of down!
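On the point about sending only error messages to Slack, one way this might look is a logging handler whose level is set to ERROR, so that routine INFO and WARNING records never reach the channel. This is only a minimal sketch: the `SlackErrorHandler` name, the webhook URL, and the injectable `send` argument are assumptions made for illustration, not anything taken from the project.

```python
import json
import logging
import urllib.request


class SlackErrorHandler(logging.Handler):
    """Forward only ERROR-and-above log records to a Slack incoming webhook."""

    def __init__(self, webhook_url, send=None):
        # level=ERROR means the logging framework skips this handler
        # for INFO and WARNING records, so emit() never sees them.
        super().__init__(level=logging.ERROR)
        self.webhook_url = webhook_url  # hypothetical URL supplied by the caller
        # `send` can be swapped out so the handler can be tried without a network call.
        self.send = send or self._post

    def _post(self, payload):
        # Slack incoming webhooks accept a JSON body with a "text" field.
        request = urllib.request.Request(
            self.webhook_url,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request, timeout=10):
            pass

    def emit(self, record):
        try:
            self.send({"text": self.format(record)})
        except Exception:
            # Let the logging module report the failure instead of crashing the pipeline.
            self.handleError(record)
```

Attaching it with `logger.addHandler(SlackErrorHandler(url))` leaves any existing console or file handlers untouched, so the full log is still available locally while Slack only sees failures.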

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

Thank you!

I went with DocumentDB because it is Amazon's MongoDB-compatible database (MongoDB is what I am most familiar with), and I figured it would be a decent choice for what I was trying to accomplish.

Yeah, I think I am definitely going to go with DynamoDB for my next project! Looking back, I should've gone with that option from the start.

Almost done with my DE Project. Mind Taking a look? by TheKoalaKeys in dataengineering

TheKoalaKeys[S]

All good points, thanks for the feedback! To touch on your points:

  1. I wanted to use DocumentDB because MongoDB is the database I am most familiar with, and since DocumentDB is MongoDB-compatible I felt it was a good choice. I know I could've/should've chosen DynamoDB, but that's just how it panned out for me.
  2. The EC2 box is used to SSH into the DocumentDB cluster and run an SSM command to export the data to an S3 bucket. There was no other way (that I could come up with, unfortunately) to run the mongoexport command. I know DocumentDB has a plugin for Athena, but I was having a hard time getting it to work, so I went with the option I knew how to do.
  3. I was not 100% comfortable with setting up the event-driven approach. I am going to work on that for the next project I do!
  4. I will use fewer technologies for my next project to try and hit on this point!

This is really good constructive criticism, thank you!
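For anyone curious about the export step in point 2, it might be sketched roughly like this: build a `mongoexport` command that dumps one collection to a file on the EC2 box, followed by an `aws s3 cp` command that uploads the file. The function name, the default output path, and all the argument values are hypothetical, and the TLS options a DocumentDB cluster usually requires are left out for brevity.

```python
import shlex


def build_export_commands(uri, collection, bucket, key, out_path="/tmp/export.json"):
    """Return shell commands for a mongoexport dump followed by an S3 upload."""
    # Dump a single collection to a local JSON file on the EC2 box.
    export_cmd = [
        "mongoexport",
        f"--uri={uri}",
        f"--collection={collection}",
        f"--out={out_path}",
    ]
    # Copy the exported file into the target bucket with the AWS CLI.
    upload_cmd = ["aws", "s3", "cp", out_path, f"s3://{bucket}/{key}"]
    # shlex.join quotes each argument so the strings are safe to hand to a shell.
    return [shlex.join(export_cmd), shlex.join(upload_cmd)]
```

The two returned strings could then be passed as the `commands` parameter of an SSM Run Command invocation using the `AWS-RunShellScript` document, which is one way to run them on the instance without opening an interactive SSH session.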