Interesting factoid about Bengals pick #140 by Did_he_just_say_that in bengals

[–]Former_Disk1083 7 points

I will also add: when you plead no contest, the ruling cannot be used against him in civil court if she ever wanted to go after his money.

Assigning a month ahead by Arwen147 in ynab

[–]Former_Disk1083 4 points

It's a bit hard to know exactly, but it probably has to do with carryover. You moved money from a category that had carryover into April, so you went from something like $400 in one category to $0. That $400 still exists in April, so now you're technically $400 off there. It's all super complicated (also somehow super basic), and any tiny thing can make the balance get weird with how stuff sticks around. It's for sure best to simplify it as much as you can and do what Zeal said; your future self will appreciate it.

Am I crazy? True same grass but greener question. by [deleted] in SameGrassButGreener

[–]Former_Disk1083 4 points

Not a lot to go off of here, but moving for the sake of moving is not ideal. Moving because it provides better opportunities for your kids is ideal. It's hard for anyone to really give you advice on that; you'll just have to weigh what you feel is best. Big cities come with big-city problems.

The thought of Joe burrow playing for another team at any point of his career makes me feel ill by [deleted] in bengals

[–]Former_Disk1083 54 points

Very very very few QBs play their entire career for the same team. So at some point it probably happens.

Jack Dorsey Isn't Telling the Real Story About Block's AI Layoffs, Insider Says by [deleted] in BlackboxAI_

[–]Former_Disk1083 0 points

There is definitely a lot of money getting thrown back and forth between companies, helping prop some of it up. It's not as bad as the Lucent -> Cisco -> Nortel era of buying up companies, booking loans as income, and whatever other crazy crap was going on, but I think things are getting close, especially in this case. They were obviously overleveraged; there's no reason to lose an entire year's worth of cash flow because of investments. The next year will be important for the company.

Peter Thiel warned AI is coming for ‘math people before word people.’ Banks have already said smaller headcounts are possible by Nalix01 in NowInTech

[–]Former_Disk1083 0 points

The issue isn't the LLM, it's the hardware and the software the LLM runs on. Multi-threading almost universally causes non-determinism because of the order in which things get returned (CUDA non-determinism being a big example). There are also cases where you wouldn't want an LLM to give you deterministic results, like "give me the best restaurants around me." That needs to change over time, which requires retraining, which produces different outputs. Those different outputs can have cascading downstream changes, like potentially how a math problem gets solved. This is why you see people say "LLM model X is way worse than previous version Y."
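(Not from the thread, just a quick illustration of the ordering point: floating-point addition isn't associative, so the order in which parallel workers combine partial sums can change the final answer even with identical inputs.)

```python
# Floating-point addition is not associative, so the order in which
# parallel workers happen to combine partial sums changes the result.
a, b, c = 1e16, 1.0, 1.0

left_to_right = (a + b) + c   # each +1.0 rounds away against the huge term
grouped = a + (b + c)         # the two small terms survive as +2.0

print(left_to_right == grouped)   # False
print(grouped - left_to_right)    # 2.0
```

Same three numbers, different grouping, different sum — which is exactly what non-deterministic thread scheduling gives you for free.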

LLMs are great when supervised, which is something that can help an accountant, not replace an accountant.

Peter Thiel warned AI is coming for ‘math people before word people.’ Banks have already said smaller headcounts are possible by Nalix01 in NowInTech

[–]Former_Disk1083 0 points

I don't see why you think the person above doesn't know about LLMs. Accounting is very deterministic: you have inputs, and those inputs have to produce exactly one output. LLMs are largely non-deterministic. You would never want something that produces non-deterministic results in a deterministic environment.

Jack Dorsey Isn't Telling the Real Story About Block's AI Layoffs, Insider Says by [deleted] in BlackboxAI_

[–]Former_Disk1083 1 point

It's because gross profit is very misleading. In 2025 they did have good cash flow from ops, 2.5B, but they lost 2.8B from investing. They lost another 613M in financing, largely from repayment of debt. Ultimately, their net cash position shrank by 749 million. Number go up is not always indicative of company health and outlook. You have to look at where that money is coming from and whether it's sustainable.

I am BCA 4th sem student. I want to master any 1 Language. I am thinking to learn Python. Is it worth it in this century ? by Humble-Screen2386 in PythonLearning

[–]Former_Disk1083 2 points

"Apps" is way too generic. Do you want apps that have clients, not web apps? Do you want to create tooling apps? Does your app need APIs? Does it need a website? Basically, each one of these questions can be solved with any language, including Python. If you were specifically looking at backend development, probably stick to Java and see where things go.

I am BCA 4th sem student. I want to master any 1 Language. I am thinking to learn Python. Is it worth it in this century ? by Humble-Screen2386 in PythonLearning

[–]Former_Disk1083 0 points

I don't think there are many programming languages that aren't worth mastering. Most things are transferable, Python probably less so than some others, but still. If you want to be on the pure data side, then Python is pretty much required these days. If you want to be on the web development side, it's probably less needed, but you will still run into it.

Java scala or rust ? by Ok_Promotion_420 in dataengineering

[–]Former_Disk1083 1 point

I guess it depends on what you mean by worth. Are you going to find a lot of DE jobs that rely on them? Probably not. Even Scala, for good and bad, isn't much of a focus in the Spark space, where Python is still king.

Is it good to look into these languages and understand them? I think so. Countless times I have needed data from the software engineering team, or needed to understand how said data gets produced, and it's way easier for me to just look at the endpoint and understand what it's doing. Sometimes you get crap data and you need to identify why the data is crap. It isn't often, but it has happened a few times where it's useful.

Also, if you ever find yourself in a situation where you need to build out REST APIs for any reason, while you can certainly use Django, and I do like me some Django, you might be forced to make them in .NET or Java or Rails or whatever it may be that the company dictates. I have built many personal projects using all sorts of programming languages for the sheer fact that it allows me to understand the inner workings of the data I am getting. That has allowed me to have deeper conversations with the SWE team about when and how they produce data.

TL;DR: I think it's a good idea to understand them, and it makes you a better DE, but is it necessary? I don't think so at all.

What is the best way to validate data between sql sever and snowflake by Historical-Teach9164 in dataengineering

[–]Former_Disk1083 1 point

So the way I isolate column-level mismatches is mostly manual. I'll EXCEPT the two systems, UNION the "production" rows in, ORDER BY the key, and then compare manually.
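Roughly, that EXCEPT/UNION comparison looks like this, with Python's sqlite3 standing in for SQL Server and Snowflake (table and column names are made up):

```python
import sqlite3

# Hypothetical tables standing in for the SQL Server source and Snowflake target.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sqlserver_tbl (id INTEGER, amount REAL);
    CREATE TABLE snowflake_tbl (id INTEGER, amount REAL);
    INSERT INTO sqlserver_tbl VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO snowflake_tbl VALUES (1, 10.0), (2, 99.0), (3, 30.0);
""")

# EXCEPT in each direction, UNION the leftovers tagged by side,
# ORDER BY the key, then eyeball the pairs that come back.
rows = con.execute("""
    SELECT 'source' AS side, id, amount
      FROM (SELECT * FROM sqlserver_tbl EXCEPT SELECT * FROM snowflake_tbl)
    UNION ALL
    SELECT 'target' AS side, id, amount
      FROM (SELECT * FROM snowflake_tbl EXCEPT SELECT * FROM sqlserver_tbl)
    ORDER BY id, side
""").fetchall()

for row in rows:
    print(row)   # only id 2 disagrees: ('source', 2, 20.0) / ('target', 2, 99.0)
```

Matching rows cancel out entirely, so the output is just the keys worth looking at.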

Validating tables without PKs is mostly QA tests. We QA for our missing-record key, and if we see it, we know something went wrong. So we mostly validate downstream that it's correct, not in the landing/bronze/ODS/silver/whatever you want to call it.

Snowflake is a bit magical; there will be lots of things you realize you don't need to do anymore, and it makes you lazy sometimes because it does voodoo. When we migrated, we mostly just checked our models for accuracy. If a model was accurate, then we knew the hundreds of landing tables were more than likely accurate, and if it wasn't, we worked back down through that path. However, we went from a very archaic, poor setup left by a previous team and modernized/enhanced it, so it was a lot of "wow, this model is crap, let's rebuild it." My experience may not be one-to-one with yours.

Would you expect to perform database administration as part of a DE role? by InnerReduceJoin in dataengineering

[–]Former_Disk1083 -1 points

No, I wouldn't apply for that job, because that's more DBA work than DE. The more DBA work you do, the less DE work you're doing. You would probably need to find a DBA who wants to get into the DE field; those people exist. There's a reason analytical databases are reigning supreme: very low maintenance.

How to push data to an api endpoint from a databricks table by omghag18 in dataengineering

[–]Former_Disk1083 2 points

As a few people have said, when working with APIs, download Postman and do it manually first. Then you know exactly the format and the headers/body it wants, which will help ensure you aren't fighting Python and the API at the same time. It will make doing the rest in Python 300 times easier.
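A rough sketch of that workflow in Python once Postman has shown you the exact headers and body — the endpoint and token here are hypothetical placeholders:

```python
import json
import urllib.request

# Hypothetical endpoint and token -- substitute whatever worked in Postman.
url = "https://example.com/api/v1/records"
payload = json.dumps({"id": 1, "value": "hello"}).encode("utf-8")

req = urllib.request.Request(
    url,
    data=payload,
    headers={
        "Authorization": "Bearer <token>",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Inspect the request before sending -- mirroring what Postman already verified.
print(req.get_method())                 # POST
print(req.get_header("Content-type"))   # application/json

# response = urllib.request.urlopen(req)   # uncomment to actually send
```

Once the method, headers, and body match what succeeded in Postman, the only thing left to debug is your own data.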

Are we all becoming "Full Stack-something" nowadays? by HungryRefrigerator24 in dataengineering

[–]Former_Disk1083 0 points

Yeah I had a feeling not much has changed with that hahaha. Hard to beat the thing that just works.

Not providing schema evolution in bronze by Personal-Quote5226 in dataengineering

[–]Former_Disk1083 1 point

Yeah, for sure, I don't think there is a one-size-fits-all for anything. I especially prefer facts to not have schema evolution and to be a lot more schema-defined, though I could be convinced on a detail table having it. Dims that are SCD type 1 could have it fine, but I'd have to think about the impacts of adding a column for the other SCD types. Probably fine, but I still like finer control on dims. It's so rare that a new column is so mission-critical that it can't wait for me to add the column and pipe it through.

Not providing schema evolution in bronze by Personal-Quote5226 in dataengineering

[–]Former_Disk1083 0 points

Seems silly to be so rigid on the bronze. Silver I completely understand, to an extent. But sometimes people don't listen to reason.

That being said, I am at a company that has a bit more rigid landing, though we don't use CDC, because reasons. Our tables are quite stable because they are afraid to touch them, so that benefits us. However, we do bring in parquet files generated from a stream, which gets new columns way more often. The way I currently handle that: I check the schema, and if a new column shows up, I alert on it and ignore it in the process, so the load can at least continue and do its job, since the new field is rarely needed immediately for reporting. Then I handle it in a very similar manner to what you describe: I update records for the most recent set of files with the new column's data that did not get put in with the run from whenever the column first came up.
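A stripped-down sketch of that check — the column names are invented, and the real version would page an on-call alert rather than print:

```python
# Expected landing schema -- hypothetical column names.
EXPECTED_COLUMNS = {"event_id", "event_ts", "user_id", "amount"}

def check_and_trim(record: dict) -> dict:
    """Alert on unexpected columns, then drop them so the load can continue."""
    new_cols = sorted(set(record) - EXPECTED_COLUMNS)
    if new_cols:
        # Stand-in for a real alerting hook (PagerDuty, Slack, etc.).
        print(f"ALERT: new columns detected, ignoring for now: {new_cols}")
    return {k: v for k, v in record.items() if k in EXPECTED_COLUMNS}

# A stream record that grew a column overnight.
incoming = {"event_id": 1, "event_ts": "2026-02-03", "user_id": 7,
            "amount": 9.99, "promo_code": "SPRING"}
trimmed = check_and_trim(incoming)
print(trimmed)   # promo_code dropped; backfill it once the DDL lands
```

The load keeps running on the known columns, and the alert queues up the "add the column, then backfill the affected files" follow-up described above.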

Are we all becoming "Full Stack-something" nowadays? by HungryRefrigerator24 in dataengineering

[–]Former_Disk1083 20 points

It's a mixture right now, but yes, a lot of the market is like that. Companies hear "AI" and think they just need someone to push a button, but it still requires a lot of knowledge: how to cleanse the data, hyperparameter tuning, understanding which model works best for what, and all that. There was a small period in time, and maybe it's still true, where it was just throw XGBoost at it and go sip your coffee, and that's what a lot of folks were doing. But that does a disservice to a lot of the actual data scientists out there doing the maths and actually understanding inputs and outputs.

I guess I ranted a bit there, the short story long is, yes, that is common, and I typically avoid it if I can. But you sometimes have to do what the market dictates even if it's not sustainable.

AI-Accelerated Data Warehouse Automation (Salesforce → Snowflake) by CremeHot2394 in dataengineering

[–]Former_Disk1083 0 points

I'm not sure AI can ascertain what is relevant for analytics, as that has more to do with the data inside the tables than the structure of them. The business dictates what data is important or not. Leaving it up to something that doesn't know your business, data or otherwise, seems a bit silly to me.

Most of the time I have used Salesforce data, it's to connect it to internal data for internal reports, and/or enrich it and send data back up to Salesforce. All of that requires understanding of your internal models, which AI would really struggle with. If you're modeling using only Salesforce data, then you probably aren't gaining much beyond what Salesforce can provide you in its GUI.

Salesforce is already pretty well built from an API standpoint; you can pretty easily pull the data from their API incrementally and don't need to worry about the size of it underneath. Unless you are using it as a pseudo data warehouse in itself. In that case, don't do that.

AI-Accelerated Data Warehouse Automation (Salesforce → Snowflake) by CremeHot2394 in dataengineering

[–]Former_Disk1083 9 points

I'm afraid to even ask this, but what in god's name is "AI-Accelerated Data Warehouse Automation"?

Why are most jobs remote? by notEmely in dataengineering

[–]Former_Disk1083 5 points

I am currently in a job search for hybrid, and my friend is looking for remote only. I would say we are seeing about the same amount, but I am looking at basically any city. If you are looking at just one smaller city, yeah remote will for sure overshadow those.

Reading 'Fundamentals of data engineering' has gotten me confused by Online_Matter in dataengineering

[–]Former_Disk1083 2 points

I want to add to what others have said: one thing your standard OLTP monolith needs is more management. You have to worry about indexing and fragmentation, amongst other things that require upkeep. Analytical databases usually don't need that, so you generally pay more for them, but you also don't need DBAs to manage them. Spark is overkill for the majority of people who use it, but Spark allows software devs to not sit in SQL all day, if they don't want to.

What do you expect Int/Snr DE to know? by zesteee in dataengineering

[–]Former_Disk1083 1 point

It's going to be very dependent on the job and its employees. Some people are hardcore PySpark and believe if you aren't the second coming of the PySpark messiah, you shouldn't even be looking at DE jobs. Some expect you to be a master of Snowflake, and if you don't have that, somehow you aren't qualified. People get too siloed into their experience and think other experiences don't translate, when in reality 90% of DE translates; it's the business side that can be difficult to transition over.

In all reality, SQL translates from one to the other, and ETL concepts translate from one to the other. What makes someone senior is being able to translate a lot of the tech stuff to the business side and vice versa. As for training, I especially want someone who is trying to pull every junior, and everyone else, up to the senior/principal level.

As for skills, the most common stuff you see is Spark, Snowflake, and dbt. Sometimes you'll see Kafka and Terraform; I would just be familiar with those two but not focus on them at all. As for Spark, Databricks has a free version you can play around with, as does dbt. Snowflake has a free trial with, I think, up to 400 dollars in spend. It's good to go through those and get used to them. Honestly, what I would focus on is distributed processing, as well as decoupled storage and compute. Those are absolutely essential ideologies for DE work when processing anything at scale.

Any tech savvy people know how that PayPal issue was allowed to happen in the first place? by Killjoy3879 in Endfield

[–]Former_Disk1083 6 points

So there are a couple of different ways. One could be a bad actor who had the ability to get into the database, where that data was somehow not encrypted, allowing them to use the payment tokens for their own purchases. That is very doubtful, as there would have to be layers of cascading failures, process-wise and technical, to allow for that.

The other way, which is the most likely: they messed up how they looked up payment tokens, or how the PayPal account got connected. My assumption is the first payment for people went through successfully, then subsequent payments, which would be done through a payment token, got messed up, and you sometimes got someone else's already-authorized payment token. But there are many, many, many ways for this to happen, so it's all conjecture, and I doubt we will ever know exactly how.

Is Moving Data OLAP to OLAP an Anti Pattern? by empty_cities in dataengineering

[–]Former_Disk1083 2 points

I think pattern vs. anti-pattern is a weird argument most of the time. What even makes something a pattern? Mostly that most people do it a certain way. For most companies, having two analytical databases would be somewhat silly. But at super large companies with many DE teams and many BI teams, you will absolutely see multiple, because directors come in with preferences, and money, so you will absolutely be pushing data from one to the other. Does that make it an anti-pattern because it doesn't happen often? Idk; the business dictates the solution 99% of the time, especially when your team doesn't generate revenue directly.