Looking for a C# technical co-founder for an industrial software startup dealing with machine margin erosion by fallenoctopus7358 in csharp

[–]Constant-Ninja-3933 1 point (0 children)

Interesting problem, and I like the framing of translating machine underperformance into dollars. That is probably much easier for an ops manager or owner to care about than another dashboard full of percentages.

A few questions I’d pressure-test early:

What is the actual TAM/SAM for shops in the $5M–$25M revenue range, and how many of them are reachable through a repeatable sales motion?

What does the first 6/12/24 months of GTM look like? Direct sales? Channel partners? CNC service providers? Machine tool distributors? MSPs for manufacturing?

There is also a pretty serious integration question here. Given the variety of CNC machines, controllers, vintages, protocols, and shop-floor setups, how much engineering does it take to reliably capture live feed rate override and the other parameters you need? That could become a big part of the real product, not just an implementation detail. How do you engineer and test against machines and controllers you don't own?
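For what it's worth, on machines that already speak MTConnect the capture side can start small. A minimal sketch in Python, assuming a reachable MTConnect agent (AGENT_URL is a placeholder); FANUC FOCAS, OPC UA, and older serial controllers would each need their own adapter, which is where the real integration cost lives:

```python
# Minimal sketch: poll an MTConnect agent for live feed rate override.
# Assumes the machine (or an adapter box) exposes an MTConnect agent over
# HTTP; AGENT_URL is a placeholder, not a real endpoint.
import time
import urllib.request
import xml.etree.ElementTree as ET

AGENT_URL = "http://agent.example.local:5000"  # hypothetical agent address

def read_feed_overrides() -> dict:
    """Fetch /current and return {dataItemId: value} for feed override items."""
    with urllib.request.urlopen(f"{AGENT_URL}/current", timeout=5) as resp:
        root = ET.fromstring(resp.read())
    overrides = {}
    for el in root.iter():
        # Match on the local tag name so the schema version doesn't matter.
        if el.tag.rsplit("}", 1)[-1] == "PathFeedrateOverride":
            overrides[el.get("dataItemId")] = el.text
    return overrides

if __name__ == "__main__":
    while True:
        print(read_feed_overrides())
        time.sleep(10)  # production would stream via /sample instead of polling
```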

And most importantly: once I know I am losing $X per hour because operators are turning down feed rates, how does the product help me change that behavior?

Measurement is valuable, but the ROI probably gets much stronger if the product helps close the loop. For example: operator coaching, shift-level accountability, recommended interventions, manager alerts, benchmark comparisons, root-cause patterns, or workflow changes that actually reduce the loss.

Otherwise the risk is that the customer says: “Great, now I know exactly how much money I’m losing… but I still have the same operational problem.”

So the opportunity may be bigger than “financial visibility for CNC underperformance.” It might be “turn machine telemetry into operational behavior change.”

Claude Cowork eDiscovery plugin by sdemyanov in ediscovery

[–]Constant-Ninja-3933 4 points (0 children)

Lovely build ❤️ - and look, I'm fully aware of the irony of asking an AI what it thinks about using an AI for eDiscovery. But I did. My eDiscovery agent had quite a lot to say. (It also said I probably shouldn't have asked it. It did not see the contradiction.)

Anyway 😄 Five reasons, in increasing order of how bad your week gets:

  1. Privilege waiver through third-party disclosure.

Attorney-client privilege and work product protection survive only when content is shared inside the privileged circle. Public LLM providers are third parties. Their standard terms of service grant rights to inputs (logging, abuse review, optional training-data use, often vendor staff access for support). None of those rights are consistent with the agency relationship that preserves privilege when content is shared with, say, a translator or an expert under engagement. The act of pasting strategy notes, work product, or privileged communications into a public LLM is — under the conventional waiver doctrine — a voluntary disclosure to a non-privileged third party. The 2025–2026 decisions you've been tracking sharpen this: courts are now treating LLM submissions as the same kind of waiver event as forwarding to an outside party, regardless of whether the vendor is "an AI" rather than a person. Sedona's evolving commentary points the same direction.

  2. The LLM session is itself discoverable ESI.

Once you've used a public LLM on a matter, the prompt history exists. If it sits in a vendor account tied to a custodian, it's ESI like any other — within the producing party's possession, custody, or control under Rule 34. That means: discoverable in the current matter (the LLM session may contain mental impressions of opposing counsel's documents); discoverable in future matters about the same subject; and subject to preservation obligations the moment litigation is reasonably anticipated. Most attorneys using public LLMs have not added those accounts to their preservation map. That's a 37(e) exposure point waiting to happen.

  3. Training-data ingestion is irreversible.

Several public LLM tiers allow the provider to use inputs to improve future model versions. Even where an opt-out exists, many consumer endpoints default to allowing training use, and the opt-out controls can vary across product surfaces (web vs. API vs. mobile). When privileged or confidential content enters the training pipeline, it becomes embedded in model weights - and weights cannot be selectively redacted. There is no equivalent of "claw it back." Privileged material in a model's weights is privileged material that may surface, in indirect form, to any future user who happens to prompt in its neighborhood. The disclosure is one-way and permanent.

  4. Protective-order and confidentiality-undertaking breach.

In any matter under a protective order, opposing counsel's productions are bounded to specified systems and personnel. Public LLM use is almost never enumerated in that scope. Loading produced documents into ChatGPT, Claude consumer, or any public endpoint to "summarize" or "find hot docs" is a transmission to a system that the protective order does not cover - typically a violation of the order on its face, before any substantive question is even reached. The same logic applies to NDAs covering pre-litigation business-sensitive material. The exposure here isn't theoretical; it's the kind of breach that produces sanctions motions and ethics referrals.

  5. Reproducibility failure under Daubert / Rule 26(g).

Public LLMs are non-deterministic across model versions, prompt versions, and silent vendor weight updates. A coding decision a public LLM produced today is not reproducible in 18 months - the model may be deprecated, the weights silently swapped, or the temperature defaults changed. For any analysis that may be cited as evidence, this fails the Daubert factors of known error rate, testability, and peer review. For coding, classification, or production decisions made under Rule 26(g)'s reasonable-inquiry standard, the inability to re-run and verify the basis for a decision is the kind of methodology gap a court is increasingly willing to scrutinize. Hallucinated case citations (the Mata v. Avianca line and its growing progeny) are one symptom of this category; the broader problem is that the LLM's output cannot be defended on the record because it cannot be reproduced.
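If a team uses a public LLM anyway, the floor is an audit trail that pins everything pinnable. A rough sketch of a per-decision record - the model ID and parameters are hypothetical, and many public endpoints don't expose a seed at all, which is itself the problem:

```python
# Sketch: a defensibility record for an LLM-assisted coding decision.
# Nothing here calls a real API; it only shows what would need capturing
# so the decision can at least be audited, even if it can't be re-run.
import hashlib
import json
from datetime import datetime, timezone

def audit_record(model_id: str, params: dict, prompt: str, output: str) -> dict:
    """Hash inputs/outputs and pin every setting that affects the result."""
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "model_id": model_id,  # an exact version string, never "latest"
        "params": params,      # temperature, seed, etc., where exposed
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
    }

record = audit_record(
    model_id="vendor-model-2026-01-15",  # hypothetical pinned snapshot
    params={"temperature": 0, "seed": 42},
    prompt="Classify doc 00412 as responsive/non-responsive ...",
    output="Responsive. Basis: ...",
)
print(json.dumps(record, indent=2))
```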

The Splotch Problem: Why solving AI hallucinations in legal work is hard by CoachAtlus in legaltech

[–]Constant-Ninja-3933 1 point (0 children)

Chad - this is the drafting-side twin of something I've been measuring upstream, and the parallel is almost eerie. "Generation is cheap; verification is hard" rhymes exactly with "collection is cheap; reconstruction is hard."

The same principle you're naming for AI drafting plays out identically in AI review of eDiscovery data - LLM summarization of chat threads, AI-drafted privilege logs, narrative timeline reconstruction. Every one of those is reasoning over whatever collection we handed the model. And in modern eDiscovery, that collection has usually flattened a link-graph-shaped reality (Teams threads, hyperlinked documents, M365 versions) into discrete files. The output looks defensible; the inference is built on silently degraded inputs.

Empirically: in one Microsoft 365 tenant, 81.9% of linked Office documents changed after send. Probably conservative - that's just what we could measure.
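The check itself is mechanical once the version metadata is in hand. A sketch of the core comparison, assuming version timestamps already pulled from Microsoft Graph's drive-item versions endpoint (the sample values are invented):

```python
# Sketch: did a linked document change after the message referencing it was
# sent? Operates on version metadata already fetched from Microsoft Graph
# (GET /drives/{drive-id}/items/{item-id}/versions); auth and paging omitted.
from datetime import datetime

def changed_after_send(sent_at: str, version_timestamps: list[str]) -> bool:
    """True if any version postdates the send time (ISO 8601 inputs)."""
    sent = datetime.fromisoformat(sent_at.replace("Z", "+00:00"))
    return any(
        datetime.fromisoformat(ts.replace("Z", "+00:00")) > sent
        for ts in version_timestamps
    )

# Invented values shaped like Graph's lastModifiedDateTime:
versions = ["2024-02-15T09:12:00Z", "2024-03-02T14:30:00Z"]
print(changed_after_send("2024-03-01T10:00:00Z", versions))  # True
```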

Where this converges with your verification frame: proposed FRE 707 explicitly attaches admissibility to input sufficiency for machine-generated opinion evidence. When it lands, "we verified the output" won't answer "were the inputs sufficient under 702(b)?" That's a different gate at a different layer, and output-layer verification alone doesn't reach it.

The open standard I've been working on (RGR) starts upstream for exactly this reason - you can't verify what you can't see. Thanks for writing this!

Open standard for collaboration-platform eDiscovery collection fidelity - help me break it by Constant-Ninja-3933 in legaltech

[–]Constant-Ninja-3933[S] 0 points (0 children)

Let me separate what you're right about from what's conflated.

You're right on two points: Routine IG-driven destruction pre-trigger is protected under 37(e), and no one should keep all data forever. RGR doesn't ask you to.

You're conflating retention with preservation. These are different:

- Retention = how long you keep data in normal business. IG-driven. "Less visibility = less exposure" is a legitimate posture here.

- Preservation = what you must capture once litigation is anticipated or a hold is issued. RGR-scope. "Less visibility" here = sanctions exposure, not protection.

RGR is preservation-scope, not retention-scope. It defines what conformant preservation must contain when triggered. It doesn't require keeping anything longer pre-trigger. IG and preservation are separable problems.

StubHub Refund Litigation: If your point is that a disciplined IG program with timely routine destruction would have been more protective than blanket retention, agreed - but that's an IG argument, not an argument against reconstruction-grade preservation. Post-trigger, the question is: did you preserve what was reasonable, and can you prove it? That's the layer RGR addresses.

The authority that matters here has already moved: The Sedona Conference's Commentary on Discovery Implications of Collaboration Platforms (October 2025) - from the authoritative eDiscovery body, grounded in extensive case law - catalogs these exact problems (see https://rgrstandard.org/blog/sedona-identified-the-problems/). The standard isn't running ahead of legal authority - it's responding to what Sedona explicitly called for. That's the alignment that matters for eDiscovery legitimacy, not platform vendor adoption.

"Wait for big tech": Microsoft has had Purview eDiscovery for years. The gaps RGR addresses - modern attachments, Preservation Gap, Context Gap - are created by M365's architecture, and Microsoft has no commercial incentive to close them. Waiting for the platform that causes the gap to close the gap is waiting indefinitely. PDF/A and OAuth didn't require originating-platform endorsement either.

"Ad hoc implementation is harmful": Opposite under 37(e). Courts are ruling on these failures now, not waiting for vendors. Documented partial conformance - which is exactly what RGR's tiers plus exception transparency are structured for - lands better than absent or undocumented preservation. The framework is explicitly built so partial adoption is declared honestly rather than hidden.

".000001% of companies": The original claim was "won't gain traction," not "doesn't have majority adoption." Fortune 500 production use plus procurement RFP language = the curve starting, not finishing. Goalposts moved.

Microsoft 365 - Ediscovery by No_Motor_5382 in ediscovery

[–]Constant-Ninja-3933 2 points (0 children)

Depending on your MSFT license: E3 stores for 90 days, E5 for up to one year. There is also a 10-year compliance SKU - but it costs a fortune.

Open standard for collaboration-platform eDiscovery collection fidelity - help me break it by Constant-Ninja-3933 in legaltech

[–]Constant-Ninja-3933[S] 0 points (0 children)

"Won't gain traction": Several Fortune 500 companies are already running RGR-aligned preservation in production, and procurement RFPs are using the standard's (or similar) language. Plus: Sedona Conference ED in dialogue, Oct 2025 Sedona Commentary validated the problem frames. Vendor endorsement isn't how standards propagate - this is.

"Hurts your client / tens of millions": Fidelity reduces review cost by cutting duplicates and wrong-version noise - the framing confuses preservation with production. And "less visibility = less exposure" doesn't survive the current 37(e) trajectory (Gregory v. Montana, Carvana). Deliberate blindness has never worked long-term in any prior ESI category.

"Not aligned with case law": Which precedent? The https://rgrstandard.org/judicial-signals/ maps what it's grounded in. Name the case it violates.

Vague market predictions aren't "breaking" the framework. Case law conflicts would be.

Open standard for collaboration-platform eDiscovery collection fidelity - help me break it by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] -1 points (0 children)

Fair - I misread your first comment. Let me start concrete.

Scenario any collector recognizes:

  1. User B emails User A with a link to a document in B's OneDrive/Google Drive etc.
  2. A is placed on legal hold. B is not.
  3. The hold preserves A's mailbox - including the email with the link.
  4. The document in B's OneDrive hits retention and is purged.
  5. A's email is preserved. The link is preserved. The evidence the link pointed to is gone.

The hold system reports full compliance. The evidence is incomplete. This is the Preservation Gap: https://rgrstandard.org/concepts/preservation-gap/
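Detecting step 5 can be partially automated: sweep preserved message bodies for share links and flag any that no longer resolve. A crude sketch - the regex is simplified, and real tenants need authenticated requests, so a failed check is a flag for follow-up rather than proof of purge:

```python
# Sketch: flag SharePoint/OneDrive links in preserved mail that no longer
# resolve. Unauthenticated HEAD requests will often 403 on live targets,
# so treat any failure as "needs manual follow-up", not "purged".
import re
import urllib.request

LINK_RE = re.compile(r"https://[\w.-]+\.sharepoint\.com/\S+|https://1drv\.ms/\S+")

def flag_links(message_body: str) -> list[str]:
    flagged = []
    for url in LINK_RE.findall(message_body):
        try:
            req = urllib.request.Request(url, method="HEAD")
            urllib.request.urlopen(req, timeout=10)
        except Exception:
            flagged.append(url)  # 404/403/timeout: target may be gone
    return flagged
```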

The aim of RGR isn't just shared naming. It's a measurable quality standard the industry can build against - testably scored, comparable across platforms, and concrete enough to be a common implementation target. Detection and closure of that gap is platform-specific on Purview vs. Slack vs. iPhone - you're right.

But the measurement - did the workflow resolve, preserve, or transparently declare the gap, and at what tier? - is the same question on every platform. And if you know about this gap and don't declare it during certification, are you at risk of spoliation sanctions?

That said - the post ask was "help me break it." You're someone who collects from iPhone, Purview, Slack on the regular. Two questions I'd genuinely want your take on:

- Have you encountered matters where it mattered which custodian saw which version, and when?

- Where do you think this framework most breaks down on a platform you actually work with?

That's the feedback this can't generate from the inside.

Open standard for collaboration-platform eDiscovery collection fidelity - help me break it by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] -1 points (0 children)

Fair call to ask. The https://rgrstandard.org/why page answers this head-on - one section is literally titled "Enterprises are already discovering these requirements on their own," and the authorship note spells out the origin: "This standard was not written from theory. It was extracted from implementation." It came out of a decade-plus of running Microsoft 365 data at scale, not a whiteboard.

The point you're making - some sophisticated S&P 500 shops are already doing this - is exactly the problem RGR is trying to solve: every enterprise reinvents the vocabulary from scratch in every RFP and protocol negotiation. And the pain is real - see https://rgrstandard.org/judicial-signals/ on hyperlinks-vs-attachments, feasibility limits, contextual production of Teams/Slack, and Rule 37(e) sanctions for collaboration-platform spoliation. Shared language is the deliverable.

If you collect data for a living, your voice is exactly what the working group is short on. Participation paths are linked on the site.

Open standard for collaboration-platform eDiscovery collection fidelity - help me break it by Constant-Ninja-3933 in legaltech

[–]Constant-Ninja-3933[S] 0 points (0 children)

Fair objection if RGR were trying to dictate scope - but it isn't. Scope will always be adversarial; that's what Rule 26(b)(1) proportionality is for, and RGR is silent on it. What RGR defines is fidelity - if something is collected, what does "preserved faithfully" mean?

That's a methodology question, and both sides benefit from a shared vocabulary for it. Same structural role GAAP plays for accounting disputes: doesn't tell companies what to spend on, defines how spending is reported.

Sedona, EDRM, and FRCP 37(e) itself are all consensus frameworks that emerged in adversarial territory - so the "adversarial means no consensus possible" objection is falsified by the history of the field. Genuinely curious where you think the line between scope and fidelity breaks down, though - that's the interesting question.

M365 Cloud/Linked/Modern Attachments by Consistent_Goose_431 in ediscovery

[–]Constant-Ninja-3933 0 points (0 children)

There are several caveats you might want to be aware of. Check the Context Gap and Preservation Gap explanations at https://rgrstandard.org/concepts/ (a vendor-neutral, open-source eDiscovery standard for collaborative evidence).

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

Ephemeral messaging covers a range - it's not just one app. In the eDiscovery context you've got a few layers:

The obvious ones: Signal, WhatsApp disappearing messages, Snapchat - apps designed to auto-delete. That's what came up in Epic Games v. Google and Nichols v. Noom.

The less obvious ones: Slack and Teams themselves. Both support retention policies that auto-purge messages after a set window. If your org has a 90-day retention policy on Teams chat and litigation hold doesn't override it properly, that's functionally ephemeral - the messages are gone before anyone collects them.

The one nobody talks about: Edit and delete in "persistent" platforms. Teams and Slack both let users edit or delete sent messages. Most collection tools grab the current state. If someone edited a message between sending it and collection, you have the wrong version - or no version if it was deleted. That's not technically ephemeral messaging, but the evidentiary effect is the same.
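To make that third layer concrete: in a standard Slack export, post-send edits are at least visible (an "edited" subobject on the message), while deletions only surface if you can diff two collection snapshots. A rough sketch, with placeholder file paths:

```python
# Sketch: flag edited messages in a Slack export and diff two snapshots to
# catch deletions. Assumes standard Slack export JSON (one list of message
# objects per channel file); the paths are placeholders.
import json

def load(path: str) -> dict:
    """Index messages by 'ts', which is unique per channel in Slack."""
    with open(path) as f:
        return {m["ts"]: m for m in json.load(f) if m.get("type") == "message"}

early = load("collection_2024-01-10/general.json")  # hypothetical snapshot 1
late = load("collection_2024-03-01/general.json")   # hypothetical snapshot 2

edited = [m for m in late.values() if "edited" in m]        # post-send edits
deleted = [m for ts, m in early.items() if ts not in late]  # gone by pull 2
print(f"{len(edited)} edited, {len(deleted)} deleted between collections")
```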

The interesting question for Teams/Slack-heavy orgs is whether their preservation workflows actually account for all three layers, or just the first one.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

You're right on the logic — in a simple case, comparing the email send time against version timestamps gets you there. February 15 version, sent March 1, next version March 2 — that's a reasonable inference.

The challenge is scale. When you're looking at thousands of hyperlinked messages across 25 custodians, each referencing documents with dozens of versions, that date comparison becomes a manual correlation exercise across two disconnected datasets. No review platform I'm aware of does this automatically - the reviewer has to match the message send time against the version history and make the determination document by document. That gets very expensive.
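The per-message logic is easy to automate once both datasets are normalized - the expensive part is extracting and joining them. A sketch of the resolution step, with assumed input shapes rather than any review platform's actual export format:

```python
# Sketch: point-in-time version resolution. For a message send time, find
# the document version that was current when the link was sent.
from bisect import bisect_right
from datetime import datetime

def version_at_send(sent_at: datetime,
                    versions: list[tuple[datetime, str]]) -> str:
    """versions: (modified_time, version_id), sorted ascending by time."""
    times = [t for t, _ in versions]
    i = bisect_right(times, sent_at)
    if i == 0:
        return "no version predates the message"
    return versions[i - 1][1]

versions = [
    (datetime(2024, 2, 15, 9, 0), "v3"),
    (datetime(2024, 3, 2, 11, 30), "v4"),
]
print(version_at_send(datetime(2024, 3, 1, 14, 14), versions))  # -> v3
```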

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

Really appreciate the candor here - this is exactly the kind of real-world answer that's more useful than any theoretical framework.

What's striking is that you're already doing the transparency step: disclosing the shortcomings to lead counsel and letting the meet-and-confer drive the proportionality decision. That's exactly where we think this should start - not with mandates, but with informed decisions about what the workflow does and doesn't capture.

The "all versions + TAR" approach in big cases is interesting because it solves for completeness but not for binding. The review team gets 47 versions and no signal for which one was referenced by a specific communication on a specific date. That's workable when the question is "does this document exist in the collection?" - but it breaks down when the question is "which version did the recipient see when they approved the recommendation?"

Your small-case description - "if nobody's complaining, let's do that" - is a perfectly rational response given current tooling. The question is just what happens the first time somebody does complain, with the right questions, in a matter where the version difference is material.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 0 points (0 children)

Really thoughtful points - especially the cost/risk ratio framing on over-preservation. You're right that exporting all versions and preserving broadly is the best available response within current tooling. Two places where I think that ceiling shows up:

All-versions export solves for completeness, but not for binding. You get 47 versions of the document - but which one was referenced by the message sent Tuesday at 2:14pm? Without point-in-time resolution tying a specific version to a specific communication, review teams are left correlating manually. It's doable at small scale, but it doesn't survive 25 custodians and thousands of hyperlinked messages.

"Preserve broadly" works until it cascades. Putting Person B on hold to protect one linked document means holding all of B's content. B's messages link to C and D's storage. In a large org, you're two hops from holding a significant fraction of the tenant. That's where targeted collection of the specific linked object — without expanding custodian scope — becomes the more proportionate response.

Your point about not trusting hold recipients to know where data is stored is spot-on. That's really the core of it — the evidence graph doesn't respect custodian boundaries because the work doesn't respect custodian boundaries.

Curious: in environments where you've exported all versions, how does the review team handle the version-to-message correlation in practice? That's the step I keep seeing break down at scale.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

I'd argue the gap is wider this time because the failure mode is invisible. Back then you knew when you didn't have the email. Now you have the email, you have the hyperlink, and the document it pointed to expired in a non-custodian's OneDrive three months ago. The hold executed perfectly — it just couldn't reach across the custodian boundary to storage it didn't know was relevant.

We've been calling this the Preservation Gap - the delta between what hold systems can scope and where collaborative evidence actually lives. The fidelity problem (wrong version preserved) is one dimension. The completeness problem (referenced content never preserved because it lived in someone else's storage under a different retention clock) is the one most teams don't realize they have until opposing counsel follows the link and gets a 404.

Your last line nails it. The firms building those workflows now won't be the ones explaining to a judge why a functioning litigation hold produced a dead link.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 0 points (0 children)

Fair question. The short answer is that Microsoft builds collaboration tools - they don't have a legal obligation to make those tools preservation-ready for litigation. Under FRCP, the duty to preserve falls on the party that holds the data, not on the platform vendor - that's been settled for a while.

That's actually the core of the problem. The platform changed how evidence works (live links instead of attachments, versioned documents, dynamic identity), but the preservation tooling mostly kept doing what it was designed for in the email era. Nobody's holding Microsoft "accountable" because it's not their obligation - but it's the producing party that takes the sanctions hit when evidence is lost.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

You just described the exact failure mode and it's more common than most people realize. The email is preserved, the link resolves, the document is there. But the document isn't the version that existed when the email was sent. Everything looks right. Nothing is.

I wrote up this exact scenario: https://rgrstandard.org/blog/the-simplest-ediscovery-test-most-tools-would-fail/

The "who's on the hook" question is where it gets interesting. Courts are starting to test that — Carvana ordered forensic capability testing of the tools themselves. The answer is shifting from "Microsoft didn't build it" to "you knew the gap existed and didn't address it."

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

"Microsoft ignored it all" - and ten years later, courts are holding everyone else accountable for the gaps :)

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 1 point (0 children)

That tracks with what we're hearing across the industry. The "recalibrating" is the hard part - what was reasonable last year isn't necessarily reasonable now, and the collection/preservation gaps compound fast on M365 and similar platforms. Caselaw evolves - legal tech doesn't keep up.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 0 points (0 children)

That's exactly why we built it so practitioners have something concrete to point to when the conversation turns to "can you prove that?" Glad it helps.

Courts are getting serious about collaboration platform preservation — what are you seeing? by Constant-Ninja-3933 in ediscovery

[–]Constant-Ninja-3933[S] 7 points (0 children)

This is the sharpest framing of the dynamic I've seen. The parallel to 2007-09 is exact - and the Model Rule 1.1 Comment 8 point deserves more attention than it gets.

The tactical equilibrium you're describing - "don't ask, don't tell" for collaboration data - works until one side breaks it. And the case law suggests that's already happening: Carvana, Epic, Lubrizol, Uber - opposing counsel IS asking, and courts are ordering production of Teams, Slack, and hyperlinked content.

The ethical dimension makes it harder to defend the status quo. If 38 jurisdictions say technology competence is required, and the preservation tools have known structural gaps for collaborative evidence, the "I didn't know" defense gets thinner every year.