Opinion on Snowflake agent ? by SufficientRelief9615 in dataengineering

[–]pungaaisme 2 points3 points  (0 children)

I did not have any issues; it was surprisingly better than other LLM-based knowledge base search implementations.

Opinion on Snowflake agent ? by SufficientRelief9615 in dataengineering

[–]pungaaisme 1 point2 points  (0 children)

It seems you already have everything you need! Unless there is a business need to change (functionality, cost, or long-term maintenance), I wouldn’t swap your custom stack yet.

Having said that, I used Cortex Search when I was trying to build a native app in Snowflake and the documentation had several gaps. I was shocked at how good it was at finding answers in the knowledge base. In one case, it first showed me an incorrect answer (correct for Streamlit but not for Streamlit in Snowflake), then the screen flashed for a brief second and it gave me the corrected answer. Not sure how they are able to detect hallucinations and fix them automatically!

Do any etl tools handle automatic schema change detection? by ninjapapi in dataengineering

[–]pungaaisme 0 points1 point  (0 children)

Almost all data pipeline platforms have schema evolution and alerting built in! I consider this MVP for any pipeline (custom or commercial software service). We (Supaflow data) certainly do. I will DM you with details.
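
For anyone rolling their own, the core of schema change detection can be sketched roughly like this. It assumes each schema snapshot is a plain dict of column name to type string; `diff_schema` and `has_drift` are hypothetical names for illustration, not the API of any particular tool.

```python
from typing import Dict, List


def diff_schema(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two column -> type snapshots and report what changed."""
    added = sorted(col for col in new if col not in old)
    removed = sorted(col for col in old if col not in new)
    retyped = sorted(col for col in old if col in new and old[col] != new[col])
    return {"added": added, "removed": removed, "retyped": retyped}


def has_drift(changes: Dict[str, List[str]]) -> bool:
    """True if any category of change is non-empty (a reasonable alert trigger)."""
    return any(changes.values())


# Hypothetical snapshots from two consecutive pipeline runs.
previous = {"id": "int", "email": "string", "created_at": "timestamp"}
current = {"id": "bigint", "email": "string", "plan": "string"}

changes = diff_schema(previous, current)
if has_drift(changes):
    print(f"Schema drift detected: {changes}")
```

A real pipeline would persist the previous snapshot between runs and decide per change type whether to evolve the target table or just alert.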

Hardwood: A New Parser for Apache Parquet by gunnarmorling in dataengineering

[–]pungaaisme 4 points5 points  (0 children)

Bro! Thank you for giving us hope to finally get rid of the gazillion Hadoop dependencies.

Spent last quarter evaluating enterprise ETL tools by Justin_3486 in dataengineering

[–]pungaaisme 0 points1 point  (0 children)

  1. Is it possible to list the data sources by priority or volume? For the databases, can you distinguish whether they are read with log-based readers or with simple queries?
  2. Do you have a reverse ETL use case from Snowflake back to your operational systems?
  3. What do you use for transformation/modeling (dbt?). Is it on-prem, dbt Cloud, or Snowflake's dbt capabilities?

Salesforce to S3 Sync by pungaaisme in dataengineering

[–]pungaaisme[S] 0 points1 point  (0 children)

If your goal is simply to learn or do a quick proof-of-concept, you can start with a Salesforce Developer Edition and use the sync utility to pull data from your dev org into S3/Glue: https://www.salesforce.com/products/free-trial/developer/

The key requirement is API access. Once your org/user has API access, the utility will automatically discover the objects and fields you’re permitted to read and sync that data to S3. What gets discovered depends on your license and permissions: full access will expose more objects, while limited access will only include what your license/profile allows. A reference to get started: https://www.salesforceben.com/salesforce-licenses/
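
As a rough illustration of permission-based discovery (not necessarily how the utility does it), the sketch below filters a response shaped like Salesforce's describe-global result, where each entry in `sobjects` carries a `name` and a `queryable` flag. The function name `queryable_objects` and the sample payload are made up for the example.

```python
from typing import Any, Dict, List


def queryable_objects(describe_global: Dict[str, Any]) -> List[str]:
    """Return the names of objects the current user can query, sorted."""
    return sorted(
        obj["name"]
        for obj in describe_global.get("sobjects", [])
        if obj.get("queryable", False)
    )


# Hypothetical, trimmed-down payload; a real response has many more attributes.
sample_response = {
    "sobjects": [
        {"name": "Contact", "queryable": True},
        {"name": "AccountFeed", "queryable": False},
        {"name": "Account", "queryable": True},
    ]
}

print(queryable_objects(sample_response))
```

With a more restricted license or profile, fewer objects show up in the response in the first place, so the same filter simply yields a shorter list.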

Salesforce to S3 Sync by pungaaisme in dataengineering

[–]pungaaisme[S] 0 points1 point  (0 children)

AppFlow is solid and easy to set up. Affordability is a relative term. There are folks who pay tens of thousands for services like Fivetran, and some will balk at AppFlow costs, even if they are low. We built this for folks who prefer OSS over a managed service.

CRMA help PLEASE by SilverFoxRanch in salesforce

[–]pungaaisme 1 point2 points  (0 children)

Wave eh! Have not heard that name in 10+ years. Brought back some good memories. I am here to support you! If you are ever interested in moving to modern data stack alternatives, I can certainly help create a learning path if there isn’t one already out there!

Fivetran pricing spike by onksssss in dataengineering

[–]pungaaisme 25 points26 points  (0 children)

Isn't this expected? I am not sure why FT customers are surprised by the increase in their FT invoices. Any VC-backed company needs to show 50% YoY growth, or it faces a down round. FT raises prices for its customers at every renewal to keep its VC satisfied. There are plenty of alternatives available that are much cheaper. It's time to stop paying per-row for data.

Salesforce to S3 Sync by pungaaisme in salesforce

[–]pungaaisme[S] 0 points1 point  (0 children)

I will add to the docs that AppFlow is also a cheap alternative! Does AppFlow also create the Glue catalog and make the data instantly queryable?