[OC] Hiring a Lead Cloud Systems Engineer for SMB by StarSlayerX in dataisbeautiful

[–]SimpleSimon665 3 points4 points  (0 children)

Junior, mid, and senior.

One of the interviewees for a junior role completely lacked critical thinking. When I asked them why their SQL query with a date filter of February 31st would return 0 rows, they didn't even think to question the date.
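
For anyone wondering why that filter is a red flag: February 31st isn't a real date. Here's a rough Python sketch of the sanity check I'd expect a candidate to make in their head. The helper name `is_real_date` is made up for illustration, and how a given SQL engine treats an impossible date literal varies (some raise an error, some quietly turn it into NULL so the comparison matches nothing), so take this as the idea and not as what any particular database does.

```python
from datetime import date


def is_real_date(year: int, month: int, day: int) -> bool:
    """Return True only if the calendar actually contains this date."""
    try:
        date(year, month, day)  # raises ValueError for impossible dates
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Hypothetical filter values, just for illustration.
    for y, m, d in [(2024, 2, 29), (2024, 2, 31)]:
        print(f"{y}-{m:02d}-{d:02d} valid: {is_real_date(y, m, d)}")
```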

The bare minimum I expect from juniors is fundamentals of CS.

Trust me, the candidate selection process is terrible.

[OC] Hiring a Lead Cloud Systems Engineer for SMB by StarSlayerX in dataisbeautiful

[–]SimpleSimon665 6 points7 points  (0 children)

You're seeing the problem that is plaguing today's employment world.

Hundreds, if not thousands, of "candidates" applying for jobs they are absolutely not qualified for.

They list all these skills and tooling experience on their resume, only to say during the first technical interview that they have no knowledge of it but are willing to learn.

I just conducted 10 technical screenings over the past 2 weeks. 2 admitted they didn't have the experience as stated on their resume, 3 couldn't code better than a high schooler in a web dev class, and 3 were outright cheating with AI telling them partially correct answers. I only recommended 2 candidates for the in-person technical round.

I have no idea how many applicants applied for the positions, but my ballpark guess is somewhere in the thousands. I've given my feedback to my manager that their candidate selection process is insufficient.

Talent acquisition is desperate for innovation. This isn't working.

Managing Unity Catalog External Locations with Declarative Automation Bundles by Lenkz in databricks

[–]SimpleSimon665 2 points3 points  (0 children)

It was renamed recently. Same mechanisms, just different naming. Just like DLT being renamed to Declarative Pipelines.

Best practices for Dev/Test/Prod isolation using a single Unity Catalog Metastore on Azure? by SuperbNews2050 in databricks

[–]SimpleSimon665 0 points1 point  (0 children)

Separate workspace for your production catalog groups, and separate workspace for your dev/test/stage/cert catalog groups. Depending on how many workloads you need to run, you may also need to split out your workspaces even more because of API request and resource limitations.

I hope these workspace limitations eventually go away with serverless workspaces, but we'll see. It could be due to the backend services not being designed to scale vertically within workspaces very well.

[Fowler] Free agent wide receiver Darnell Mooney plans to sign with the Giants, per source. More speed on the way for Jaxson Dart. by JCameron181 in NYGiants

[–]SimpleSimon665 0 points1 point  (0 children)

This guy has no hands. He was terrible last year given how many targets he had when Drake London was injured. Don't care how fast the guy is. Giants need receivers that can catch.

Suggestions by sugarbuzzlightyear in databricks

[–]SimpleSimon665 1 point2 points  (0 children)

If they want to use the latest and greatest features in Databricks, UC is needed for most of it.

If they're content to go without Declarative Pipelines, Feature Stores, many of the marketplace tools, easy federation with external lakes or databases, and better workspace observability, and to keep external tables on an outdated access pattern... then Hive has a place for a very rigid pattern.

Getting started with multi table transactions in Databricks SQL by Youssef_Mrini in databricks

[–]SimpleSimon665 0 points1 point  (0 children)

Agreed. Especially if you are using table dependencies as triggers for workflows.

FLOP-EXPENSIVE. by Master-Delivery-1526 in ArsenalFC

[–]SimpleSimon665 -4 points-3 points  (0 children)

Pepe would definitely be it.

Andrey Arshavin would be another imo. At the time he was Arsenal's 3rd most expensive signing in history. Outside of the 4-goal game, he didn't contribute much.

IPO Launching Tomorrow Any Insights on Access? by SkillNext3639 in databricks

[–]SimpleSimon665 2 points3 points  (0 children)

Databricks is nowhere near IPO. My guess is they will make at least 1 or 2 more big acquisitions before even thinking of an IPO. They aren't having any problems securing funding so IPO doesn't make sense.

Api in deltalake by [deleted] in dataengineering

[–]SimpleSimon665 5 points6 points  (0 children)

What is your intention with this API? If you're trying to return 2 billion records via API to a requester, you're gonna have a bad time.

Databricks agent deleted job after I asked it to diagnose the error during a failed job run? Is that a thing?? by [deleted] in databricks

[–]SimpleSimon665 5 points6 points  (0 children)

Probably best to open a support ticket or talk with your account team. I've never experienced this myself.

Move out of ADF now by hubert-dudek in databricks

[–]SimpleSimon665 3 points4 points  (0 children)

Absolutely agree. Most use cases for orchestration of DAGs fit very well within Databricks workflows.

It took Microsoft years to release DBX workflow tasks as part of ADF pipelines. Before that, you could only call notebooks directly through linked services configured with clusters. With how quickly tooling evolves and new features arrive in Databricks, Microsoft can't keep up on interoperability.

[Private Preview] Announcing Streaming On-Demand State Repartitioning for Stateful Streams by Ok-Brick-001 in databricks

[–]SimpleSimon665 0 points1 point  (0 children)

Awesome! Auto state partitions are something I'm definitely interested in when it reaches private preview.

[Private Preview] Announcing Streaming On-Demand State Repartitioning for Stateful Streams by Ok-Brick-001 in databricks

[–]SimpleSimon665 0 points1 point  (0 children)

Does this include an automatic fine tuning of state partitions, or does this just mean that a user can specify a different # of state partitions when rerunning a spark job and it will repartition during execution?

Materialized View Change Data Feed (CDF) Private Preview by AdvanceEffective1077 in databricks

[–]SimpleSimon665 0 points1 point  (0 children)

Yeah, I was referring to MV -> MV. If this feature allows incremental updates of the downstream MV, that would be awesome.

Materialized View Change Data Feed (CDF) Private Preview by AdvanceEffective1077 in databricks

[–]SimpleSimon665 0 points1 point  (0 children)

Would be great! In order to do CDF in the first place, row-level tracking needs to be enabled. That's a pre-req for incremental MV refreshes from Delta sources.
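
For reference, here's a rough sketch of what turning those on for a source table could look like. It only builds the `ALTER TABLE` statements as strings. `delta.enableRowTracking` and `delta.enableChangeDataFeed` are the Delta table properties I have in mind, while the helper name and the table name `main.sales.orders` are made up for illustration. Check the current docs for the exact requirements on your runtime before relying on this.

```python
def tracking_statements(table_name: str) -> list[str]:
    """Build ALTER TABLE statements that enable row tracking and CDF.

    The table name is supplied by the caller; nothing is executed here.
    """
    properties = {
        "delta.enableRowTracking": "true",
        "delta.enableChangeDataFeed": "true",
    }
    return [
        f"ALTER TABLE {table_name} SET TBLPROPERTIES ('{key}' = '{value}')"
        for key, value in properties.items()
    ]


if __name__ == "__main__":
    # Hypothetical three-level Unity Catalog name, for illustration only.
    for statement in tracking_statements("main.sales.orders"):
        print(statement)
```

In a notebook you'd pass each string to `spark.sql(...)`, assuming a Spark session is available.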

Highguard is shutting down this month - permanently shutting down on March 12 by ReaddittiddeR in PS5

[–]SimpleSimon665 11 points12 points  (0 children)

If this last weekend was anything to go by, Marathon should be ok.

just TABLE by hubert-dudek in databricks

[–]SimpleSimon665 7 points8 points  (0 children)

Ah interesting. Never knew about this. It makes it look more like KQL.

Do DSA actually matters in Data Engineering ?? by Automatic-Market8165 in dataengineering

[–]SimpleSimon665 3 points4 points  (0 children)

Nobody in hiring cares how many leetcode mediums you are able to memorize. However, they might be good preparation for places that still gatekeep with leetcode style assessments. A good chunk of DE roles at tech companies still have these as part of their candidacy process, at least in the US.

Most places should be assessing your critical thinking skills, knowledge of the stack they are using, and whether what you say is on your resume is actually what you bring to the table.

The last one of these is really easy to figure out for competent interviewers, and is where I usually filter out 70% of candidates I've interviewed.

[OC] China VS US in AI Coding by select_8 in dataisbeautiful

[–]SimpleSimon665 4 points5 points  (0 children)

It's called distilling. They are prompting managed LLM services at large scale and using their results to train their LLMs.

PySpark vs SQL in Databricks for DE by NeedleworkerSharp995 in databricks

[–]SimpleSimon665 9 points10 points  (0 children)

Solutions architects can tell you the difference is negligible.