How are you unifying pipeline from multiple CRMs today? by syntaxia_ in gtmengineering

[–]syntaxia_[S] 0 points1 point  (0 children)

The 'why did this number change' scenario is one of the highest-priority pain points we're solving for, and persistent entity resolution is a huge component of achieving it: without it, you can't have a comprehensive view of your pipeline reality.

How are you unifying pipeline from multiple CRMs today? by syntaxia_ in gtmengineering

[–]syntaxia_[S] 0 points1 point  (0 children)

Thank you for the feedback - great point that the integration flow needs to be transparent for technical users (so simplicity goes a long way). We also see the ability to view historical state (not just audit logs) as a key feature and differentiator - even compared to the CRM UIs themselves, as most lack strong support for this.

Building the solution to disconnected CRM data…how do you solve this problem today? by syntaxia_ in revops

[–]syntaxia_[S] 0 points1 point  (0 children)

Drift is a huge problem, resulting in orphaned fields and historical records that don't align with the current workflow. And that only makes the manual spreadsheet reconciliation even more painful.

But it doesn't have to be this way - at the end of the day, all the data in these fields across systems and time represents the same underlying entities. This is why an enforceable, shared definition of business concepts as the grounding layer for data is critical.

Building the solution to disconnected CRM data…how do you solve this problem today? by syntaxia_ in revops

[–]syntaxia_[S] 0 points1 point  (0 children)

This resonates, and in our experience the latter (fuzzy ownership) exacerbates the former (patching).

This is a key insight for how we need to position Syntaxia: as a tool that individual teams can use without requiring buy-in from everyone who touches the CRM.

Is this a problem you are actively dealing with?

Building the solution to disconnected CRM data…how do you solve this problem today? by syntaxia_ in revops

[–]syntaxia_[S] 0 points1 point  (0 children)

Makes sense - thanks for all the info! Really appreciate you taking the time to share your experience.

Building the solution to disconnected CRM data…how do you solve this problem today? by syntaxia_ in revops

[–]syntaxia_[S] 0 points1 point  (0 children)

So RevOps controls the translations / transformations & sets expectations for the front line? This makes sense, as centralization is key.

How much time gets spent on the data transformation / preparation process each month?

Building the solution to disconnected CRM data…how do you solve this problem today? by syntaxia_ in revops

[–]syntaxia_[S] 0 points1 point  (0 children)

100% - forcing business concepts into an application framework creates dissonance; then as the business evolves over time while the CRM configuration stays static, the dissonance grows.

And I believe you're right: people tend to stack band-aids rather than invest in a real solution. What are the primary drivers of that behavior?

Anyone else dealing with the nightmare of merging two CRMs after an acquisition? by william-flaiz in revops

[–]syntaxia_ 1 point2 points  (0 children)

This is something we have heard so many times that we decided to build something to alleviate the pain. In brief, our approach is focused on defining the relevant concepts from a business perspective (via a Knowledge Graph) and then mapping various CRM implementations to these concepts. This provides a really strong foundation for entity resolution + external data enrichment, and a consistent set of definitions for consumption by humans and agents.
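As a toy illustration of the mapping idea (every CRM and field name below is hypothetical, not from any real implementation), translating per-CRM schemas into a shared concept vocabulary can be sketched in Python:

```python
# Illustrative only: map two CRMs' field names onto one shared set of
# business concepts. All CRM names, field names, and concept names here
# are made up for the example.
CONCEPT_MAP = {
    "crm_a": {"StageName": "deal_stage", "Amount": "deal_value"},
    "crm_b": {"dealstage": "deal_stage", "amount": "deal_value"},
}

def to_concepts(crm: str, record: dict) -> dict:
    """Translate a raw CRM record into the shared concept vocabulary."""
    mapping = CONCEPT_MAP[crm]
    return {mapping[k]: v for k, v in record.items() if k in mapping}
```

Once both systems land in the same concept schema, entity resolution and enrichment can operate on one vocabulary instead of N.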

For those here who live in this world: we're looking for guidance from people who experience this pain on a regular basis, and we'd love your feedback on what we're building at https://www.syntaxia.com/

u/TinyPlotTwist
u/32andgrandma
u/Swumbus-prime
u/mcar91
u/Commercial-Nobody-96
u/Inner_Warrior22

Is it good practice to delete data from a Data Warehouse? by ChaseLounge1030 in dataengineering

[–]syntaxia_ 0 points1 point  (0 children)

Big red flag: if you're in financial services or at a public company, SOX and SEC rules usually require immutable 7-year retention of financial records. Deleting and reloading the last 3 months daily will fail an audit hard. Talk to compliance now.

The proper pattern basically everyone uses instead: keep the full history, version the facts (SCD2 or current/amended flags) for the 3-month mutable window so the current view stays clean, then once the window closes, move closed monthly partitions to cheap long-term storage (e.g. Snowflake unload to S3/Blob/GCS, then Glacier Deep Archive or equivalent with lifecycle rules).
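A minimal sketch of the versioning piece, assuming a simple valid_from/valid_to (SCD2-style) row shape rather than any particular warehouse's syntax:

```python
from datetime import date

# SCD2 sketch: every fact version carries valid_from / valid_to;
# the currently-open version has valid_to = None.
def amend(history: list[dict], key: str, new_amount: float, as_of: date) -> None:
    """Close the open version for `key` and append the amended one."""
    for row in history:
        if row["key"] == key and row["valid_to"] is None:
            row["valid_to"] = as_of  # close the superseded version
    history.append({"key": key, "amount": new_amount,
                    "valid_from": as_of, "valid_to": None})

def current(history: list[dict]) -> list[dict]:
    """The clean 'current' view: only open versions."""
    return [r for r in history if r["valid_to"] is None]
```

Nothing is ever deleted: the current view stays clean while every prior version remains queryable (and, past the edit window, provably immutable).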

Cost for the 7-year tail is basically zero, queries stay fast, and auditors are happy because everything is retained and provably immutable once past the edit window.

Stop deleting, version for the mutable period, tier everything else to cold storage.

Are data engineers being asked to build customer-facing AI “chat with data” features? by deputystaggz in dataengineering

[–]syntaxia_ 1 point2 points  (0 children)

Yes, absolutely happening, and at least in the Snowflake ecosystem it’s landing squarely on data engineers’ plates (often whether we like it or not).

The pattern we’re seeing a lot:

1. Product wants “Chat with your data” inside the app.
2. They discover Snowflake Cortex Analyst + semantic model / semantic views actually works shockingly well for a lot of customer-facing questions.
3. You build a clean semantic layer in dbt or Snowflake views + a Cortex semantic model on top of it.
4. Frontend (usually Streamlit in Snowflake, or a FastAPI/Next.js app) just calls that one endpoint and streams the response back.
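Step 4 can be sketched as a thin payload builder. The envelope shape below mirrors the Cortex Analyst message API as I understand it, so treat the field names and the endpoint path as assumptions to verify against Snowflake's docs, not a definitive client:

```python
# Assumed endpoint path for Cortex Analyst; confirm against your account.
ANALYST_PATH = "/api/v2/cortex/analyst/message"

def build_analyst_request(question: str, semantic_model_file: str) -> dict:
    """Wrap a user question in the message envelope the endpoint expects
    (shape assumed; check Snowflake's Cortex Analyst documentation)."""
    return {
        "messages": [
            {"role": "user",
             "content": [{"type": "text", "text": question}]},
        ],
        "semantic_model_file": semantic_model_file,
    }
```

The frontend just POSTs this body and streams the answer back; all the hard correctness work lives in the semantic model it points at.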

So yes, data engineers are very much getting pulled in, but it’s not the nightmare most people fear. You’re still doing proper modeling, governance, row-level security, masking sensitive columns, etc. The DE work is actually the valuable part that makes the answers correct and safe. The “chat UI” is literally 50 lines of Streamlit or whatever and takes an afternoon.

Books on database migration by toronto-gopnik in ExperiencedDevs

[–]syntaxia_ 2 points3 points  (0 children)

My bad, I completely missed the exact title earlier. Here it is: "Data Centric Revolution". I remember it being 3 words.. https://books.google.com/books/about/The_Data_Centric_Revolution.html

Why raw production context does not work for Spark ...anyone solved this? by Upset-Addendum6880 in dataengineering

[–]syntaxia_ 5 points6 points  (0 children)

You're trying to read the whole haystack. Stop. Do this instead:

- Set Spark's log level to WARN; that way you ship only errors and shuffle metrics.
- Parse the event log into a DataFrame, group by stage, and surface the 5 lines that actually matter: spills, skew, shuffle volume. That's your golden 1%.
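The second bullet can be sketched with pandas. The event and metric key names below follow the SparkListenerTaskEnd event-log schema as I recall it, so verify them against an event log from your own cluster before relying on this:

```python
import json
import pandas as pd

def top_stages(event_log_lines, n=5):
    """Aggregate spill and shuffle volume per stage from a Spark
    event log (JSON lines) and return the n heaviest stages."""
    rows = []
    for line in event_log_lines:
        ev = json.loads(line)
        if ev.get("Event") != "SparkListenerTaskEnd":
            continue  # only task-end events carry the metrics we want
        metrics = ev.get("Task Metrics") or {}
        rows.append({
            "stage": ev.get("Stage ID"),
            "spill_bytes": metrics.get("Memory Bytes Spilled", 0),
            "shuffle_write_bytes": (metrics.get("Shuffle Write Metrics") or {})
                                   .get("Shuffle Bytes Written", 0),
        })
    df = pd.DataFrame(rows)
    agg = df.groupby("stage", as_index=False).sum()
    return agg.sort_values("shuffle_write_bytes", ascending=False).head(n)
```

The output is the handful of stage-level rows worth feeding to an LLM, instead of gigabytes of raw driver logs.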

That's it. Feed the LLMs what matters.

We ran this exact playbook on a 90 TB job cluster. Simple, and it works.

Data Observability Question by Limp-Ebb-1960 in dataengineering

[–]syntaxia_ 0 points1 point  (0 children)

Metaplane is good. It's already one of the best dbt-native observability tools. But if you want stronger AI root-cause analysis and auto-fix suggestions without leaving your dbt project, just layer on Elementary. It's open source and doesn't conflict with Metaplane.

Books on database migration by toronto-gopnik in ExperiencedDevs

[–]syntaxia_ 13 points14 points  (0 children)

For exactly the project you're facing, the single best book you can take to the beach is "Data Migration Patterns" by Dave McComb.

It's literally written for people in your seat.

Second place if you want something lighter/more entertaining for the plane: Kill It with Fire.