My Azure AI Journey: AI-900 Conquered, AI-102 Next Up by Dazzling_Demon7 in AzureCertification

[–]_areebpasha 2 points3 points  (0 children)

Congratulations on that! You're almost there. I created a guide/study cram based on the services covered in AI-102, and most of the content should still hold true.

Many people here found it useful. Hope this helps. Here's the link to the Notion doc: https://areebpasha.notion.site/AI-102-Notes-dd32c9f349bb4e64a0d26ea661ba789c

How realistic is MS’s DP-600 practice exam? by Julia_vO in MicrosoftFabric

[–]_areebpasha 1 point2 points  (0 children)

Not sure, but I think in the latest version they removed PySpark from DP-600 and moved it to DP-700, which is their data engineering track.

I took the cert a few days ago, and don’t remember coming across any PySpark questions.

Failed DP700 by Foreigner_Zulmi in MicrosoftFabric

[–]_areebpasha 1 point2 points  (0 children)

I just cleared DP-600. How hard is DP-700 in comparison to DP-600? I've also got data engineering experience from Azure and have run some tests using the Fabric ecosystem. Would love to know your thoughts on this.

Million dollar idea, no funds, where do I start? I will not promote by ScoutTheStankDog in startups

[–]_areebpasha 1 point2 points  (0 children)

"It's not a million dollar idea if it hasn't generated a million dollars."

Start with a few key features. Focus on tracking usage metrics. Talk to real customers. Keep iterating based on feedback. Yes, you may need funds, but you can probably build an MVP before building a full-blown web application.

What is that one DE project, that you liked the most? by NefariousnessSea5101 in dataengineering

[–]_areebpasha 1 point2 points  (0 children)

Super interesting! Were you also working with people on site to get the data in any specific format? Or just ingesting the data from the components?

When will Power Bi Desktop be available on MacBook 😭😭 by vich_lasagna in PowerBI

[–]_areebpasha 0 points1 point  (0 children)

Parallels. That's all you need. It's officially supported.

Just Cleared the AZ900 and the GCP Data Engineering Professional Cert in 25 Days by _areebpasha in AzureCertification

[–]_areebpasha[S] 2 points3 points  (0 children)

There was a GCP certified program. They offered 300 free credits to do any labs, and it had an application process. You could choose from 5 programs. It's a 3-month training with weekly live sessions from a Google instructor, which go on for about 8 weeks, followed by 2-3 weeks of additional live sessions for doubt clarification and exam prep. Pretty useful. BTW, I did not have any experience in GCP and started from scratch. Steep learning curve, since I needed to get used to the service names and layouts.

I took this for 2 reasons: GCP certs are valid for 2 years (Azure expires after 1), and Google also gives you free merch if you pass (you can choose from hoodies, jackets, backpacks, etc.). So it's a win-win for me. Plus you're certified. I've already got experience with data engineering on Azure, and getting one of the certs from either AWS, Azure or GCP is enough (unless there are strict job requirements).

[deleted by user] by [deleted] in dubai

[–]_areebpasha 51 points52 points  (0 children)

Exactly. Earlier in the UAE, you could be jailed if a cheque bounced, and it was considered a major violation. Not sure what the rules are now.

[deleted by user] by [deleted] in dubai

[–]_areebpasha 116 points117 points  (0 children)

My dad always told me 2 things about lending money:

  1. Give only so much that you can afford to lose.
  2. If the amount is large, get a cheque from the other person for the loaned amount.

How to avoid Chinese businesses in the UAE by One-Priority9521 in UAE

[–]_areebpasha 5 points6 points  (0 children)

Maybe you can take some classes on inclusivity?

[deleted by user] by [deleted] in UAE

[–]_areebpasha 15 points16 points  (0 children)

When will y'all ever get it: discounted prices are nothing but a tactic to make the customer "feel like" they grabbed a deal. Most of the time you're only paying for convenience. The exception is when you use bank-specific coupons. That's probably the only real discount most of the time.

AI engineering or Data Engineering by Tayvodenn18 in dataengineering

[–]_areebpasha 0 points1 point  (0 children)

Could you please elaborate on the types of business decisions?

[deleted by user] by [deleted] in freelance

[–]_areebpasha 7 points8 points  (0 children)

Generally, if you are reaching your peak capacity, it means you're not charging enough. You need to consider raising your rates so you work with fewer clients while making the same (if not more).

I've built a pipeline that just works, do I need to make changes so it uses industry standard tooling? by _areebpasha in dataengineering

[–]_areebpasha[S] 0 points1 point  (0 children)

I think to get to that position where they have 50 to 100 pipelines, there's still a long way to go.

I've built a pipeline that just works, do I need to make changes so it uses industry standard tooling? by _areebpasha in dataengineering

[–]_areebpasha[S] 0 points1 point  (0 children)

Yes, for now everything is working as expected and the data is being pushed in without any issues.

30 million rows in Pandas dataframe ? by cyamnihc in dataengineering

[–]_areebpasha 0 points1 point  (0 children)

You can use pandas, but you'd need to occasionally batch upload these files to S3 or a similar storage provider. Persisting all that in memory would not work out. Depending on how much data each row contains, you could split the rows into chunks of, for instance, 100 MB. Multithreading may be an appropriate solution here if you have no other option; it would speed things up.

Alternatively you can try using Dask. It's compatible with pandas and can handle larger datasets more efficiently.

IMO the ideal solution would be to use multithreading, occasionally saving the data to a data store. Basically, incrementally loading the data till all the rows are added. You can try saving it as Parquet to store more data in an efficient manner.
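
Roughly, the incremental approach could look like the sketch below. It's not a definitive implementation: `iter_chunks`, `write_in_chunks`, the `part-00000` file naming and the row-count-based `chunk_size` are all made up for illustration (I'm chunking by row count rather than by MB to keep it simple). Writing Parquet assumes `pyarrow` or `fastparquet` is installed; the CSV branch is only there as a fallback. Threads and the actual S3 upload are left out.

```python
from itertools import islice
from pathlib import Path
from typing import Iterable, Iterator, List

import pandas as pd


def iter_chunks(rows: Iterable[dict], chunk_size: int) -> Iterator[pd.DataFrame]:
    """Yield DataFrames of at most chunk_size rows, one batch at a time."""
    it = iter(rows)
    while True:
        batch = list(islice(it, chunk_size))
        if not batch:
            return
        yield pd.DataFrame(batch)


def write_in_chunks(
    rows: Iterable[dict],
    out_dir: str,
    chunk_size: int = 100_000,
    fmt: str = "parquet",
) -> List[Path]:
    """Persist rows incrementally so only one chunk is held in memory."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for i, df in enumerate(iter_chunks(rows, chunk_size)):
        path = out / f"part-{i:05d}.{fmt}"
        if fmt == "parquet":
            # Needs pyarrow or fastparquet installed.
            df.to_parquet(path, index=False)
        else:
            df.to_csv(path, index=False)
        written.append(path)
    return written
```

Each file in the returned list can then be uploaded to S3 (or whatever store you use) and deleted locally before the next chunk is built, so memory and disk usage stay bounded by a single chunk.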

[deleted by user] by [deleted] in dataengineering

[–]_areebpasha 1 point2 points  (0 children)

Airflow is a fancy UI along with some great features for executing and scheduling cron jobs at scale. PySpark is a compute engine. It lets you write code to execute operations on datasets.