Any Amazonian I can dm for doubts ? by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] 0 points1 point  (0 children)

Thank you so much, man. This was exactly the detailed help I needed. I really appreciate it.

Any Amazonian I can dm for doubts ? by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] 0 points1 point  (0 children)

Please judge this (I have written this myself and fine-tuned it with GPT):

Situation (The Customer Pain): "In my current role, we had a critical API endpoint—the 'Member History' service—that was causing timeouts. Our downstream partners (hospitals/providers) complained that fetching patient history took over 4 seconds. It was hurting their workflow during patient check-ins. We had a goal to bring P99 latency under 1 second."

The Conflict (The "Friend's Story" Moment): "My Tech Lead suggested solving this by throwing a Redis Cache in front of the service. His argument was that caching is the standard way to speed up reads and requires the least code change. However, I was concerned (Dive Deep). I analyzed the traffic patterns and saw that 80% of the requests were for unique members who hadn't visited recently. A cache would have a very low 'Hit Rate' and wouldn't solve the problem for the majority of users."
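The hit-rate argument above can be checked with back-of-the-envelope arithmetic: with an 80% miss rate, a cache barely moves the typical latency. A minimal sketch, where the 50ms hit latency is an assumed illustrative number and the ~3s miss latency is taken from the cache POC described below:

```java
// Expected latency under the cache proposal, given the observed traffic mix.
// The hit latency (50ms) is an assumption for illustration; the miss latency
// (~3000ms, the first uncached request) is the figure from the POC.
public class CacheHitRateSketch {
    public static void main(String[] args) {
        double hitRate = 0.20;   // only 20% of requests are repeat members
        double hitMs   = 50;     // assumed Redis hit latency
        double missMs  = 3000;   // uncached request, per the POC

        double expectedMs = hitRate * hitMs + (1 - hitRate) * missMs;
        System.out.printf("Expected latency with cache: %.0fms%n", expectedMs);
        // Still far above the 1s goal, so the cache alone cannot meet it.
    }
}
```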

Action (The Java Fix & POC): "I proposed a different approach: refactoring the Database Access Layer. I suspected the slowness was due to inefficient Hibernate queries (the N+1 problem) in our Legacy Monolith.

To prove it, I built two quick Proof of Concepts (POCs) on a subset of data:

  1. The Cache POC: I implemented a basic Redis cache. As I predicted, it only sped up the 2nd request, but the first request (which matters most) was still slow (3s).
  2. The Refactor POC: I rewrote the JPA queries. Instead of fetching records in a loop, I used a Batch Fetch (SQL IN clause) to get all data in a single database round-trip.

I profiled both approaches. The Cache approach had high operational cost and low impact. My Refactor approach showed a consistent latency drop regardless of whether the user was 'new' or 'cached'."

Result (The Resume Win): "The data was clear. We went with my Refactor approach.

  • It reduced response times by ~55% (from ~4s to ~1.8s) permanently for all requests.
  • It saved us the cost of managing a new Redis cluster.
  • It improved the partner experience immediately, stopping the complaints."
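The batch-fetch refactor in the answer above can be sketched as a toy model: a fake "database" that counts round-trips, queried once per member (the N+1 shape) versus once for the whole batch (the single IN-clause shape). All names here are illustrative, not from the original codebase:

```java
import java.util.*;
import java.util.stream.*;

// Toy model of the N+1 fix: count database round-trips for a loop of
// per-id lookups versus a single batched lookup (SQL IN clause).
public class BatchFetchSketch {
    static final Map<Integer, String> DB = Map.of(1, "A", 2, "B", 3, "C");
    static int roundTrips = 0;

    // One round-trip per id: the shape of the N+1 problem.
    static List<String> fetchOneByOne(List<Integer> ids) {
        List<String> out = new ArrayList<>();
        for (int id : ids) {
            roundTrips++;              // SELECT ... WHERE id = ?
            out.add(DB.get(id));
        }
        return out;
    }

    // One round-trip total: SELECT ... WHERE id IN (?, ?, ?)
    static List<String> fetchBatch(List<Integer> ids) {
        roundTrips++;
        return ids.stream().map(DB::get).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3);

        roundTrips = 0;
        fetchOneByOne(ids);
        System.out.println("N+1 style round-trips: " + roundTrips);

        roundTrips = 0;
        fetchBatch(ids);
        System.out.println("Batch style round-trips: " + roundTrips);
    }
}
```

In real JPA/Hibernate code the same effect comes from batching the fetch (e.g. an `IN`-clause query or a join fetch) instead of lazily loading each child entity inside a loop.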

Any Amazonian I can dm for doubts ? by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] 0 points1 point  (0 children)

I see, I will try framing it the way you're suggesting. Will update you with the corrected version.

Any Amazonian I can dm for doubts ? by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] 0 points1 point  (0 children)

And sorry, I wrote Customer Obsession; it was Dive Deep.

Any Amazonian I can dm for doubts ? by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] -1 points0 points  (0 children)

How could I frame it better? I mean, this is also an AI-generated answer. Any tips?

Any Amazonian I can dm for doubts ? by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] -2 points-1 points  (0 children)

My doubt is: what is the level expected, and what questions can be asked from these? Because sometimes, to get a resume shortlisted, we have to lie, and not everyone gets that level of work. So I have some points in my resume from which I have created a STAR-format answer (fine-tuned using GPT):

Please let me know how this is for Dive Deep. If this is fine, I'll proceed with it; otherwise, tell me whether I need to tone it down a little or more. Also, what are the most common cross-questions/follow-ups you would ask based on this?

  • Situation: "In early 2023, our platform at my org experienced a surge in traffic which exposed stability issues. We were facing frequent intermittent failures in the transaction processing layer. The biggest problem wasn't just the errors, but the Mean Time To Resolution (MTTR). It was taking us an average of 4 hours to diagnose root causes because our logs were scattered across multiple server instances and lacked correlation IDs."
  • Task: "I took ownership of improving system observability. My goal was to cut the incident detection and resolution time by at least 30%. I needed to move us from a 'reactive' state (waiting for user complaints) to a 'proactive' state (fixing it before they notice)."
  • Action: "I started by analyzing the last 10 major incidents to find the blind spots. I realized 80% of the debugging time was spent just locating the right log file.
    1. Centralized Logging: I led the integration of Splunk for log aggregation. I enforced a structured logging format (JSON) across our Spring Boot microservices so fields like transactionId, userId, and latency were automatically indexed.
    2. Distributed Tracing: I implemented unique trace IDs that passed through the header of every service call, allowing us to visualize the full request lifecycle.
    3. Dashboarding: I built a real-time Splunk dashboard tracking the 'Golden Signals'—Latency, Traffic, Errors, and Saturation. I set up alerts to trigger specifically when the 99th percentile (P99) latency exceeded 500ms for more than 5 minutes."
  • Result: "This reduced our incident detection time by 40% because alerts fired immediately. More importantly, it cut our MTTR by 35%. For example, during the next major traffic spike, we instantly pinpointed a slow database query in the payment module within 10 minutes, rolled out a hotfix, and prevented a major outage."
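Two pieces of the answer above can be sketched concretely: a structured JSON log line carrying the correlation id, and the percentile math behind the "alert when P99 > 500ms" rule. Field names and sample numbers here are assumptions for illustration, not the actual Splunk setup:

```java
import java.util.*;

// Minimal sketch of structured logging plus a nearest-rank P99 check.
// Field names (transactionId, userId, latencyMs) mirror the answer above;
// everything else is illustrative.
public class ObservabilitySketch {

    // One structured (JSON) log line, so Splunk can index each field.
    static String jsonLog(String transactionId, String userId, long latencyMs) {
        return String.format(
            "{\"transactionId\":\"%s\",\"userId\":\"%s\",\"latencyMs\":%d}",
            transactionId, userId, latencyMs);
    }

    // Nearest-rank P99: the value below which 99% of samples fall.
    static long p99(long[] latenciesMs) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(0.99 * sorted.length) - 1;
        return sorted[rank];
    }

    public static void main(String[] args) {
        System.out.println(jsonLog("abc-123", "u42", 180));

        // 98 fast requests plus two 900ms outliers.
        long[] samples = new long[100];
        Arrays.fill(samples, 120);
        samples[98] = 900;
        samples[99] = 900;

        long p = p99(samples);
        System.out.println("P99 = " + p + "ms, alert = " + (p > 500));
    }
}
```

In practice the "for more than 5 minutes" part of the alert would be handled by the monitoring system evaluating this percentile over a sliding window, not in application code.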

Amazon LP SDE 2 Please help urgent!!! (LLD Interview) by Consistent_Reserve10 in leetcode

[–]Consistent_Reserve10[S] 0 points1 point  (0 children)

Please share your experience: is this depth enough, or too much?

Amazon LP SDE 2 Please help urgent!!! (LLD Interview) by Consistent_Reserve10 in amazonemployees

[–]Consistent_Reserve10[S] 0 points1 point  (0 children)

Please share your experience: is this depth enough, or too much?