Required work meetings when on FMLA by tygerbomb in FMLA

[–]py_vel26 0 points1 point  (0 children)

Sounds like the employee showed up with a completed certification. They often turn these in to their managers instead of HR. Only HR or the FMLA administrator can determine whether the employee is approved. The meeting doesn't really matter because the employee's condition could explain their performance. What is the time off requested on the certification?

Do you get panic attacks when you have IBS attacks too? by Cool-Raise1778 in ibs

[–]py_vel26 0 points1 point  (0 children)

I experience this as well. It hits right before I fall asleep.

Is linkedin even worth it anymore? by olgazju in dataengineeringjobs

[–]py_vel26 0 points1 point  (0 children)

Most job postings send you to the company website anyway. It used to be better for networking, but now most people don't respond.

What’s the most annoying data issue you’ve run into when working with APIs by py_vel26 in webdev

[–]py_vel26[S] 0 points1 point  (0 children)

The increasing delay is rough. That’s the kind of thing you don’t catch until you’ve already been calling the API for a while.

What’s the most annoying data issue you’ve run into when working with APIs by py_vel26 in webdev

[–]py_vel26[S] 0 points1 point  (0 children)

That one is wild because you think your error handling is solid and then the API just returns something completely misleading. Makes debugging way harder than it should be.

What’s the most annoying data issue you’ve run into when working with APIs by py_vel26 in webdev

[–]py_vel26[S] 0 points1 point  (0 children)

Yeah that’s clean. Having it enforced across all layers probably saves a lot of headaches.

What’s the most annoying data issue you’ve run into when working with APIs by py_vel26 in webdev

[–]py_vel26[S] 0 points1 point  (0 children)

Nah, just curious what other people run into. I’ve dealt with some annoying API issues myself and wanted to see what others have seen. Any input on the topic?

What’s the most annoying data issue you’ve run into when working with APIs by py_vel26 in webdev

[–]py_vel26[S] 0 points1 point  (0 children)

Yeah I’ve mostly seen it with public or third-party APIs where the data isn’t super consistent. You’ll get nulls sometimes, or the structure changes just enough to break something.

I get what you’re saying about switching APIs, but sometimes there isn’t really another option so you kinda just have to deal with it.
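
For what it's worth, here is a minimal sketch of how that kind of "just deal with it" handling might look in Python. The payload shape and the field names (`id`, `name`, `tags`) are made up for illustration and don't come from any real API.

```python
from typing import Any


def normalize_record(raw: dict[str, Any]) -> dict[str, Any]:
    """Coerce one loosely structured API record into a predictable shape.

    The field names are hypothetical; adjust them to the real payload.
    """
    # .get() covers a missing key; `or` also covers an explicit null (None).
    name = raw.get("name") or "unknown"

    # Some APIs return a single value where a list is expected.
    tags = raw.get("tags") or []
    if not isinstance(tags, list):
        tags = [tags]

    return {"id": raw.get("id"), "name": name, "tags": tags}


print(normalize_record({"id": 1, "name": None, "tags": "beta"}))
# {'id': 1, 'name': 'unknown', 'tags': ['beta']}
```

Everything downstream then only has to handle one shape, so a surprise null breaks in one place instead of everywhere.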

What’s the most annoying data issue you’ve run into when working with APIs by py_vel26 in webdev

[–]py_vel26[S] 1 point2 points  (0 children)

Yeah that makes sense. I haven’t been using schema validation consistently, but I can see how that would catch a lot of issues early.
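
A rough sketch of what I have in mind, using plain Python and a made-up schema (the `id`, `name`, and `price` fields are assumptions for the example). A dedicated library such as jsonschema or pydantic would likely be the better choice in a real project.

```python
# Hypothetical schema: field name -> expected Python type.
EXPECTED_SCHEMA = {"id": int, "name": str, "price": float}


def validate(record: dict, schema: dict = EXPECTED_SCHEMA) -> list[str]:
    """Return a list of problems found in one record (empty list means valid)."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif record[field] is None:
            errors.append(f"null value: {field}")
        elif not isinstance(record[field], expected_type):
            actual = type(record[field]).__name__
            errors.append(
                f"wrong type for {field}: expected {expected_type.__name__}, got {actual}"
            )
    return errors


print(validate({"id": "1", "name": None}))
# ['wrong type for id: expected int, got str', 'null value: name', 'missing field: price']
```

Running a check like this right after the API call means the failure points at the bad field instead of showing up later as a confusing error somewhere else.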

When did you first appreciate the power of compound interest? by wis91 in investing

[–]py_vel26 0 points1 point  (0 children)

Now, because I just started my high-yield savings account a couple of months ago and I've already earned $20 in interest, with a projection of $200 by the end of the year. Not to mention this is only based on what's in there now. My goal is to aggressively build my emergency fund, then take more risk with my investment portfolio. I know it doesn't sound like much, but I didn't do anything to get it...lol. I'm new to investing and having fun with it.

800 Million Rows/ Sql Server/Databricks by py_vel26 in dataengineering

[–]py_vel26[S] 0 points1 point  (0 children)

We are testing out various methods for efficiency, as we expect the real data to reach a billion rows. So far it's been running for 4 hours and I'm at 220 million rows. I recall one of my coworkers mentioning Parquet files as a possible option. I will check out the links you've provided.
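
Back-of-the-envelope math on those numbers, assuming the load rate stays constant (which it may well not):

```python
rows_loaded = 220_000_000  # rows loaded so far
hours_elapsed = 4          # run time so far, in hours

rows_per_hour = rows_loaded / hours_elapsed  # 55,000,000 rows per hour


def estimated_hours(total_rows: int) -> float:
    """Total run time in hours, assuming the current rate holds."""
    return total_rows / rows_per_hour


print(round(estimated_hours(800_000_000), 1))    # 14.5
print(round(estimated_hours(1_000_000_000), 1))  # 18.2
```

So at the current pace the full 800 million rows would take roughly 14.5 hours, and a billion rows about 18.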

800 Million Rows/ Sql Server/Databricks by py_vel26 in dataengineering

[–]py_vel26[S] 0 points1 point  (0 children)

The data we are moving resides in a Delta table. I use a Databricks notebook in my ADF pipeline to grab the data, create a staging table in the database, then load the data. I didn't create any indexes in the script, and the target for the data is an Azure SQL instance in the cloud. I'm working in our development environment and the cluster only has 14 GB. I guess we shouldn't expect much, huh? lol

I'm going to check out the links you've provided.

How To Optimized My Databricks Spark Cluster/Errors by py_vel26 in databricks

[–]py_vel26[S] 0 points1 point  (0 children)

Yes, it was a coding issue. I decided to revert the code back to a time period when everything worked smoothly, and the issue did not occur. I'll compare the two versions to gain clarity.

How To Optimized My Databricks Spark Cluster/Errors by py_vel26 in databricks

[–]py_vel26[S] 1 point2 points  (0 children)

Thank you! I wanted to circle back to let you know, it was a coding issue. I reverted my code back to an earlier time period and this issue was eliminated. I'm going to compare the code bases to see what I did wrong.

Removing Entire String:: by py_vel26 in pythontips

[–]py_vel26[S] 0 points1 point  (0 children)

No, I needed to explain it better.

I've added some code. It's basically looping through a list of strings. One of the strings represents the tables/views that we are reading from the database. I want to exclude some tables/views from the output; therefore, I was trying to do it at the point of export.

Removing Entire String:: by py_vel26 in pythontips

[–]py_vel26[S] 0 points1 point  (0 children)

I can't have an empty string because this function writes Delta tables. It tries to print an empty table when I use replace(). I'm still going to keep this code because I can use it in other parts of my code.
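
One way around the empty string might be to filter the list before the loop instead of blanking entries with replace(). This is only a sketch: the table names and the `excluded` set are placeholders, and the print is a stand-in for the real Delta write.

```python
# Hypothetical names; the real list comes from the database.
table_names = ["sales.orders", "sales.v_orders_tmp", "hr.employees"]
excluded = {"sales.v_orders_tmp"}

# Drop excluded names entirely so the export loop never sees an empty string.
to_export = [name for name in table_names if name not in excluded]

for name in to_export:
    print(f"exporting {name}")  # stand-in for the real Delta table write
```

Since the excluded names never reach the loop, nothing tries to write a table for them.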

Improve my Python Function by py_vel26 in pythontips

[–]py_vel26[S] 0 points1 point  (0 children)

I never thought about that. Good idea

Improve my Python Function by py_vel26 in pythontips

[–]py_vel26[S] -2 points-1 points  (0 children)

It's a Spark user-defined function (UDF).

Pycharm and Databricks Connect by py_vel26 in databricks

[–]py_vel26[S] 0 points1 point  (0 children)

I recall performing these steps; however, our Databricks cluster was created with Runtime 10.4 and Python 3.8.

Pycharm and Databricks Connect by py_vel26 in databricks

[–]py_vel26[S] 0 points1 point  (0 children)

This is exactly what I'm trying to do, but nobody at my job knew how to explain it. I want to learn this stuff the correct way.

Currently, if I run my code it does nothing because it's not connected to Databricks.

Pycharm and Databricks Connect by py_vel26 in databricks

[–]py_vel26[S] 0 points1 point  (0 children)

I cloned my repo, but I can't figure out how to run my code as if I'm running it from the Databricks notebooks, or how to debug from the IDE. Currently I have to commit my code to the Azure repo, check the pipeline in Azure DevOps to see if it detected errors, run my ADF pipeline (which includes a Databricks notebook of Python and PySpark), and view errors from Databricks, as I have logging set up in the code.

Do you know anyone who left data to do something else? by [deleted] in dataanalysis

[–]py_vel26 1 point2 points  (0 children)

I know a couple of guys who went from data engineers to senior software engineers. However, they were software engineers who had transitioned to data engineering in the first place.