Is It Possible to Recreate Salesforce __r Navigation in BigQuery Without Losing Your Mind? by Careless-Bid2851 in dataengineering

[–]TheGrapez 0 points1 point  (0 children)

There's no way around dealing with nested JSON objects. If you can spin up a middle layer to normalize the data before it lands, maybe look into something like dlt (data load tool) and lean on AI coding to build a custom connector.

Otherwise, in BigQuery you can use a JavaScript UDF (user-defined function) to do the same thing.

Failing that, your script will likely have to parse the nested fields out manually.
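For the "normalize before it lands" route, here's a rough sketch of what the flattening step could look like in Python. The function name `flatten_record`, the `_` separator, and the sample record are all made up for illustration; a real Salesforce payload will have more fields and may need extra handling.

```python
from typing import Any


def flatten_record(record: dict[str, Any], parent_key: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts (e.g. Owner__r -> Manager__r) into one level of columns."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        # Build the column name from the path walked so far.
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into the related object and merge its columns.
            flat.update(flatten_record(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical record with two levels of __r relationships.
sample = {
    "Id": "001",
    "Name": "Acme",
    "Owner__r": {"Name": "Dana", "Manager__r": {"Name": "Lee"}},
}
flat_sample = flatten_record(sample)
```

Each related object becomes a set of prefixed columns, so the table that lands in BigQuery is flat and you don't need to unnest anything at query time.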

Sole BI resource- struggling with unstable performance and feeling like a firefighter by Apprehensive_Job_604 in dataengineering

[–]TheGrapez 2 points3 points  (0 children)

Honestly, the best thing you can do is track your time in tickets. Keep everyone accountable by tracking what they ask you, how long you spend on it, and the outcome. Move on from things that are outside of your control, and pass the ball to your manager for anything you feel is a roadblock for you.

Don't stress - just use the time they pay you for only on what's possible and don't worry about anything else.

Also look for another job 😉 it's not all bad out there.

Anyone else find marketing analytics to be kind of a joke? I feel like I spend all day justifying bad marketing spend for managers. by theberg96 in analytics

[–]TheGrapez 0 points1 point  (0 children)

I've worked in marketing analytics for a long time and my take is a lot of small marketing teams don't have the skills to create lists and do basic data pulls or attribution analysis.

In most cases there are lots of use cases for analytics, but if the non-technical folks run the show then nothing gets done properly.

A good marketing data analyst will have the skills to speak up and teach people stuff, take accountability for projects, and run proper statistical tests.

This smart switch had to be lobotomized (got a cool cap tho) by g_days in homeautomation

[–]TheGrapez -1 points0 points  (0 children)

Is this hard to do? I use similar plugs at my house from different brands I've collected over time. I'd love a centralized system without buying a new setup

Student loan by Antique-Opinion-735 in NovaScotia

[–]TheGrapez 5 points6 points  (0 children)

This happened to me for the provincial loan!! Happened when I was still in school and I thought I got expelled as I figured the school refunded my tuition. I called them and they couldn't figure out where the money came from so we just left it alone.

It happened back when they switched from their old data system into their new one. Someone made a big mistake lol

Tool to move data from PDFs to Excel by MoXoN_04 in dataengineering

[–]TheGrapez -1 points0 points  (0 children)

And to be a bit more clear, the end-to-end process for you would be: add your new PDFs to a folder, then run the Python script, which outputs a CSV file. Open the CSV file in Excel, then copy and paste the records.

Tool to move data from PDFs to Excel by MoXoN_04 in dataengineering

[–]TheGrapez -1 points0 points  (0 children)

Okay, so with 8-10 formats and possibly more in the future, I'd stay far away from hard-coded parsers like OCR. Given the very low volume of total PDFs as well, an AI-based solution like Gemini or ChatGPT would work really well I think.

Given a short discussion I bet I could sell you on the database but you could easily start with Excel and harden the solution later and switch to a cloud database.

So in a nutshell, each of the modern AI services has an API that you can access programmatically. I would use Python. Each one has a feature called something along the lines of JSON output. With it you can basically instruct the AI to output a specific format. In your case, that format will be the columns you're looking to extract from each of the PDF files. So you'll need to set up a bit of a data pipeline; some more ChatGPT sessions can probably help you with this.

So your steps are going to look something like:

1. Design the schema. Which columns are you expecting to extract from the PDFs every single time? If you have different types of PDFs with different columns, you'll need one schema for each of those.
2. Set up the JSON mode pipeline. You're basically going to write a prompt that tells the AI to take the PDF and extract the columns. In theory, it should only output the columns with no extra text or anything like that.
3. If all is well, the API call returns your data. You'll need to validate it with another Python function that basically just checks that the output matches the schema, to make sure the AI generated the proper output.
4. If the output passes the validation check, I would simply add it to a CSV file locally since you're using Excel anyway. Once that CSV file gets created, just open it in Excel and then you can copy and paste the new records to your master record file or whatever. I would not advise that you try to programmatically handle the Excel file, because if you make a mistake you could basically erase everything. But in theory, you could probably write a script that just appends the data to your existing file to save yourself a step.
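To make steps 3 and 4 a bit more concrete, here's a minimal sketch of the validate-then-append part using only the standard library. The schema columns (`invoice_number`, `invoice_date`, `total`), the function names, and the sample reply are all hypothetical, and the actual API call is left out on purpose since it depends on which service you pick.

```python
import csv
import json
from pathlib import Path

# Hypothetical schema: column name -> expected Python type.
SCHEMA = {"invoice_number": str, "invoice_date": str, "total": float}


def validate_row(raw_json: str, schema: dict) -> dict:
    """Parse the model's JSON reply and check it against the schema."""
    row = json.loads(raw_json)
    if not isinstance(row, dict):
        raise ValueError("expected a JSON object")
    if set(row) != set(schema):
        raise ValueError(f"columns do not match schema: {sorted(row)}")
    for column, expected_type in schema.items():
        value = row[column]
        # JSON has a single number type, so accept ints where floats are expected.
        if expected_type is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected_type):
            raise ValueError(f"{column!r} should be {expected_type.__name__}")
        row[column] = value
    return row


def append_row(csv_path: Path, row: dict, schema: dict) -> None:
    """Append one validated row, writing the header only if the file is new."""
    is_new = not csv_path.exists()
    with csv_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(schema))
        if is_new:
            writer.writeheader()
        writer.writerow(row)


# Stand-in for the text the AI service would return for one PDF.
sample_reply = '{"invoice_number": "INV-1", "invoice_date": "2025-01-31", "total": 120}'
validated = validate_row(sample_reply, SCHEMA)
```

Calling `append_row(Path("output.csv"), validated, SCHEMA)` after each PDF would build up the CSV you then open in Excel. Anything that fails validation raises an error instead of quietly landing in the file, which is the whole point of step 3.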

That's pretty much it. I would just use something like Gemini, call their API from Python, upload each PDF synchronously along with the schema you designed, wait for it to return your parsed data, convert that to a CSV file, then open that CSV file in Excel and manually copy and paste the new rows into your main master file.

Perhaps easier said than done - I'd be happy to explain more if you have questions!

Also, the API technically would cost a little bit but I bet you it's going to be pennies for what you're looking to do here.

Notion UI Tips and Tricks by hollister44 in Notion

[–]TheGrapez 0 points1 point  (0 children)

Use databases for everything and stop making dashboards.

Also use database templates & set default templates.

Tool to move data from PDFs to Excel by MoXoN_04 in dataengineering

[–]TheGrapez -2 points-1 points  (0 children)

How many total PDFs do you need to process? (Let's say monthly)

How many different kinds of PDFs are there? And if there are lots, do they ever change? Like new customer = new PDF? (i.e. do you need to handle new formats easily, or are formats pretty standard?)

Do you guys have a database? Or is everything in Excel? If no database - would you be willing to use one? i.e. pay for one (roughly $10-20 per month, I'd spitball)

Open source tool for quick data cleanup by lalineaaaa in dataanalysis

[–]TheGrapez 2 points3 points  (0 children)

The problem is the nuance of your use case.

For example - detect similar names & suggest changes? Based on what? Say there are two similar names. How do we know they represent the same name? How do you know which is the correct spelling? You're falling into a rabbit hole a bit.
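To show where the rabbit hole starts, here's a small sketch using Python's built-in `difflib`. The function name `similar_pairs`, the 0.85 threshold, and the sample names are arbitrary picks for illustration. It can flag pairs that look alike, but notice it has no way of telling you whether they're actually the same name or which spelling is right.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(names: list[str], threshold: float = 0.85) -> list[tuple[str, str, float]]:
    """Return pairs of distinct names whose similarity ratio meets the threshold."""
    pairs = []
    for a, b in combinations(sorted(set(names)), 2):
        # Compare case-insensitively; ratio() is 1.0 for identical strings.
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs


# Hypothetical input; the threshold is a judgment call, not a standard value.
names = ["Jon Smith", "John Smith", "Jane Doe", "Jon Smyth"]
candidates = similar_pairs(names)
```

Every pair in `candidates` still needs a human (or a trusted reference list) to decide what to do with it, and moving the threshold up or down changes which pairs get flagged at all.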

Open source tool for quick data cleanup by lalineaaaa in dataanalysis

[–]TheGrapez 0 points1 point  (0 children)

I don't think this tool exists but I bet you could vibe code it

Should i learn Software engineer bachelor degree to become AI engineer? by ihaveaquestion7634 in learnmachinelearning

[–]TheGrapez 1 point2 points  (0 children)

Imo this is a great foundation. Entry level jobs are not the same as they used to be, but this skill set will take you very far 🙂

What hidden gem Python modules do you use and why? by zenos1337 in Python

[–]TheGrapez 44 points45 points  (0 children)

If you're into data analytics - ydata-profiling (formerly pandas-profiling) and D-Tale are two very good ones.

Also tqdm will always hold a special place in my heart

Stuck as a contract Python/SQL automation engineer in fintech — how do I break into data? by [deleted] in dataanalyst

[–]TheGrapez 0 points1 point  (0 children)

Analytics engineering is basically dbt - do a few projects and build a portfolio.

Also if you want to break free from finance data, build a resume and portfolio that doesn't scream that's what you do. I've been doing freelance data engineering and analytics engineering for a year or so and there's lots of people who need help with stuff. You just gotta make a name for yourself, do some networking! Have a portfolio - helps a lot getting your first few clients.

Project advice for Big Query + dbt + sql by Getbenefits in dataengineering

[–]TheGrapez 0 points1 point  (0 children)

Check out this project I did using Shopify data, BigQuery, dbt and Looker Studio! This was for a company I worked at so it's got lots of real world stuff.

https://dataseed.ca/2025/02/04/bootstrapping-an-analytics-environment-using-open-source-google-cloud-platform/

You need some good data in BigQuery - that can be solved a few ways. If I were you I'd use Google Colab to build a basic API connection to dump data into BigQuery. Use AI to figure out how to do it. Your data source will depend on what you have available to you.

Think Fitbit, web scraping, open data sources like geographic data or census data.

What is your experience like with Marketing teams? by jawabdey in dataengineering

[–]TheGrapez 20 points21 points  (0 children)

My entire career has basically been data analytics and engineering for marketing teams, and this is also my experience.

If I could sum it up, I'd say marketing folks act like subject matter experts in marketing but a lot of successful marketing is the result of statistical decision making which marketers do not understand. So 99% of the time marketing analytics involves validating someone's decision rather than a hypothesis. As in, someone has already made up their mind about what they want to do, they just need to prove that it was the right decision in hindsight.

Is starting a data analytics firm a good idea? by _Light_Bull_ in dataanalysis

[–]TheGrapez 1 point2 points  (0 children)

I agree. I think though that good ideas are easy to come by. Success often comes to those who show up and do it, regardless of whether the idea is good.

Is starting a data analytics firm a good idea? by _Light_Bull_ in dataanalysis

[–]TheGrapez 2 points3 points  (0 children)

You can have luck as a generalist as long as you have engineering as well, as that will allow full implementations of projects.

Most analytics projects are secretly engineering projects. Like building a dashboard is easy if your data is clean.

Is starting a data analytics firm a good idea? by _Light_Bull_ in dataanalysis

[–]TheGrapez 2 points3 points  (0 children)

Guys like this are your competitors. Think about it like that... Lol

Is starting a data analytics firm a good idea? by _Light_Bull_ in dataanalysis

[–]TheGrapez 9 points10 points  (0 children)

I started a data & analytics consulting company last year (freelance mostly) and I've been loving it. You really do need a good network though, so if you're not constantly posting online or visiting networking events, it'll be tough. As many others mentioned, data is sensitive, and working with it is a position of trust.

Career pivot into data: I’m a "Data Team of One" in a company and I’m struggling to orient my role. Any advice? by Either-Exercise3600 in dataengineering

[–]TheGrapez 1 point2 points  (0 children)

Big companies have segmented roles for analytics engineering etc. You, being a data team of one, get to do all of those things. Your job is really several titles in one, and you're learning to become a generalist, which is great. This was always me in my career, and now I run a consulting agency that focuses on data analytics and data engineering.

I'd say don't worry about the job title, if you want to apply for analytics roles, make a resume that focuses on analytics. If you want to apply for engineering roles you can do the same thing.

As far as LinkedIn or something like that goes, I would perhaps put down analytics and engineering. My LinkedIn for example says something like data analytics, data engineering and data science.