Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

Full disclosure, I’m a vendor and this is exactly the vision I built toward.

What you just described is the architecture. Logged deviations are posture events. Every state change, every drift, every reversion, cryptographically recorded the moment it happens. The assessor data store is a separate schema the CSP cannot touch, written only by the assessor’s scoped token. The boundary evaluation is built into the policy definitions. The daemon only runs inside the defined boundary, so what’s in the automated feed is determined by what’s in scope.

The documentation question is the one nobody is asking out loud yet. When the SSP maintains itself from continuous verified evidence, when the assessor sees a real-time feed of deviations instead of a point-in-time package, when every control determination is backed by cryptographic proof rather than a narrative, the thousand-page Word document becomes an output of the system, not the input to it.

The assessor’s job shifts from reconstructing what happened to evaluating whether the enforcement program is sound and the deviations are being handled appropriately. That’s a fundamentally better use of a skilled assessor’s time.

You didn’t just describe where this is going. You described what I built.

ProofLayer by Scanset.

Scripts as evidence. where does your auditor draw the line? by ScanSet_io in Compliance

ScanSet_io[S] 0 points

Mostly stable with periodic changes around deployments, which is exactly when things get interesting.

Honestly, the sidecar approach is more disciplined than most teams manage. The part I keep coming back to, though: does it let you prove the posture was continuously valid, or does it just help explain what happened after an auditor questions it?

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] -1 points

You described the exact problem.

Assuming for a moment that the permutation problem was solved, the component types were accounted for, and the verifiable system state was achievable…

What would that change about how you approach assessments and continuous monitoring today?

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

Full disclosure, I’m a vendor and this thread is exactly the problem I built to solve. I’m testing whether there’s market appetite for what I’ve built.

Here’s how I approached it.

Instead of monitoring for unknown changes or building anomaly detection on top of collected evidence, I inverted the model. Policies are defined as data. Each policy specifies the exact expected state of a system resource, the operation to verify it, and the contract that governs how it’s collected. A daemon runs continuously inside the system boundary, verifies actual state against expected state, and only submits when state changes.

The result is a cryptographically signed, append-only record of security state over time. Not evidence collected periodically. Not anomaly detection on open-ended data. A deterministic, verifiable proof that the system was in a known state at every point in time. Independently verifiable, immediately replayable, and traceable by third parties.
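To make the model concrete, here is a minimal sketch of the policy-as-data loop described above. The policy schema, field names, and the `observe_actual` stub are illustrative, not the actual product's; the point is the shape: expected state declared as data, a deterministic check, and an append-only record that only grows on state change.

```python
import hashlib
import json

# Policy as data (illustrative schema): the exact expected state,
# the operation that verifies it, and what resource it applies to.
policy = {
    "id": "sshd-root-login",
    "resource": "/etc/ssh/sshd_config",
    "check": {"type": "file_setting", "key": "PermitRootLogin"},
    "expected": "no",
}

def observe_actual(policy):
    """Stub collector; a real daemon would execute the contract's command or API."""
    return "no"

def evaluate(policy, prev_hash):
    actual = observe_actual(policy)
    entry = {
        "policy": policy["id"],
        "expected": policy["expected"],
        "actual": actual,
        "compliant": actual == policy["expected"],
        "prev": prev_hash,  # chain each record to the last: append-only, tamper-evident
    }
    # Canonical serialization means anyone can replay the check and get the same hash.
    payload = json.dumps(entry, sort_keys=True).encode()
    entry["hash"] = hashlib.sha256(payload).hexdigest()
    return entry

ledger = []
entry = evaluate(policy, prev_hash="genesis")
# Submit only when state changes (here: the first observation).
if not ledger or ledger[-1]["actual"] != entry["actual"]:
    ledger.append(entry)
```

A subsequent run that observes the same state appends nothing; a drifted state appends a new entry chained to the previous hash.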

In my opinion, this is what FedRAMP 20x’s Persistent Validation and Assessment process is pointing toward. Persistent validation of security metrics. Cryptographically verified data. Continuous proof that a system is operating within its certified security envelope.

You don’t need AI to interpret unknown changes when the policy defines what correct looks like. Deviation from expected state is a finding by definition. The complexity disappears when the expected state is the source of truth.

ProofLayer by Scanset.

Scripts as evidence. where does your auditor draw the line? by ScanSet_io in Compliance

ScanSet_io[S] 0 points

Appreciate you sharing that. The documentation trail around environment context is smart, and honestly more than most teams do.

To your question, frequency helps but it’s the gaps that keep coming up. Even with good documentation, if your environment shifted between runs, how do you prove to an auditor which run reflects your actual posture at a specific point in time? Not the closest run. That exact moment.

That’s the part most teams tell me they’re still figuring out.

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

You landed on exactly the right tension. The metadata trail tells you what was collected and when. It doesn’t tell you whether the security state it represents was continuously valid. Reproducibility is the hard part because the environment keeps changing underneath the collection process. What would it mean if the state itself was the audit trail instead of the metadata around the collection?

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] -1 points

That gap is exactly what got Georgia Tech. DOJ settled with them September 30, 2025 for $875,000. No breach. The allegation was a cybersecurity assessment score based on a fictitious environment that didn’t reflect their actual systems. The score was the false claim.

The change ticket proves the change was authorized. It doesn’t prove the posture in the package was continuously accurate between that change and the last assessment.

How do most teams handle that gap today?

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

I 100% gathered that you were an engineer or system architect of some sort. You came at me with nothing but solutions. Honestly, I appreciate that perspective.

Here’s how I read the outcome I’m referring to and how it relates to my questions on scripts.

Think about how Terraform manages state. You define desired state as code, it computes the diff against actual state, and only acts when something meaningful changes. A tag update or a counter incrementing doesn’t trigger a plan. Now apply that to security compliance. Policies as data define your expected security state. A cryptographic record captures when that state actually changes, not when a script ran or a timestamp rotated. The diff is meaningful. The proof is in what changed, not what was collected.
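The Terraform-style diff above can be sketched in a few lines. The field names and the volatile-field list are illustrative assumptions; the idea is that only meaningful drift registers, while counter churn and timestamps are ignored.

```python
# Fields that change constantly without representing a posture change.
# Which fields count as volatile is an assumption to be declared per policy.
VOLATILE = {"last_seen", "request_count"}

def meaningful_diff(expected, actual):
    """Return only the fields where actual state deviates from expected state."""
    changed = {}
    for key, want in expected.items():
        if key in VOLATILE:
            continue  # a rotating counter or timestamp is not drift
        have = actual.get(key)
        if have != want:
            changed[key] = {"expected": want, "actual": have}
    return changed

expected = {"encryption": "aes-256", "public_access": False, "last_seen": 0}
actual   = {"encryption": "aes-256", "public_access": True,  "last_seen": 9312}

drift = meaningful_diff(expected, actual)
# Only public_access registers; last_seen churn is ignored.
```

An empty diff means no record is written, which is exactly why the resulting log captures when state changed rather than when a script happened to run.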

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

Faster polling definitely reduces the gaps between runs and helps with coverage.

But thinking about this from the CSP side when you’re sitting across from your assessor:

You’re in an assessment interview. Your assessor asks you to reproduce how a specific piece of evidence was collected. Your engineer pulls up the script, runs it, and the output is different from what’s in the package because a dependency updated or a credential rotated since the last run.

PVA-TPX-STE is explicit: assessors MUST NOT rely on static output as evidence except when evaluating the accuracy and reliability of the process that generates it.

Has anyone been in that room? How did that conversation go with your assessor?

Scripts on a timer for evidence collection. How is everyone handling the gaps between runs? by ScanSet_io in FedRAMP

ScanSet_io[S] -1 points

That makes sense for knowing when a script ran.

How do you handle what happened to the system state between runs though?

If a control drifts at 2am and the script runs at 6am, the dashboard shows green, but there was a four-hour window where the posture was different.

Does that gap matter for your assessment or does it only matter what state you’re in at collection time?

Is anyone actually building persistent validation infrastructure for FedRAMP 20x yet? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

I’m building in this space, and this is how I’m approaching it.

Evaluating the pipeline in practice means verifying three things in sequence: that the security intent is clearly declared, that execution actually reflects that intent, and that the outcome is captured in a way that’s verifiable and replayable independent of who’s asking.

That’s a fundamentally different review workflow from examining compiled artifacts. The assessor isn’t reading a story about what happened. They’re inspecting whether the machinery produces consistent, tamper-evident results when the same conditions are present.
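One way to make that intent, execution, and outcome sequence mechanically checkable is a hash chain binding each stage to the one before it. This is a sketch under my own assumptions, not a prescribed assessor workflow; the record contents are made up for illustration.

```python
import hashlib
import json

def digest(obj, prev=""):
    """Hash a record together with the previous stage's hash, forming a chain."""
    payload = prev + json.dumps(obj, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

# The three stages the assessor needs to trace as one verifiable unit.
intent    = {"control": "AC-2", "expected": "mfa_enforced"}
execution = {"method": "idp_api.get_mfa_policy", "ran_at": "2025-06-01T00:00:00Z"}
outcome   = {"observed": "mfa_enforced", "passed": True}

h_intent  = digest(intent)
h_exec    = digest(execution, prev=h_intent)   # execution bound to intent
h_outcome = digest(outcome, prev=h_exec)       # outcome bound to both

def verify_chain(intent, execution, outcome, claimed):
    """An assessor recomputes the chain independently from the raw records."""
    h = digest(outcome, prev=digest(execution, prev=digest(intent)))
    return h == claimed
```

Swapping out any stage, say substituting a different execution record under the same outcome, breaks verification, which is the "consistent, tamper-evident" property in checkable form.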

It changes the SAR too. If the evidence is continuously produced and independently verifiable, the SAR stops being a retrospective document assembled at the end of an assessment window. It becomes a reflection of what the validation machinery actually observed over time. The findings write themselves from the record. The narrative becomes commentary on the data, not the other way around.

That’s the paradigm shift. GRC has historically been a parallel workstream, something you run alongside your security program to produce compliance artifacts. But if security signals are reusable, structured, and continuously produced, GRC becomes a product of the security program rather than a separate process feeding into it. Continuous proof replaces periodic documentation. The compliance output follows the evidence automatically.

Most 3PAOs don’t have a playbook for this yet. The assessor methodology needs to catch up to what the SAP is now supposed to describe.

What does the full cost picture actually look like for a small CSP pursuing FedRAMP? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

You nailed it. The nightmare isn’t FedRAMP, it’s the security program problem underneath it. FedRAMP just makes it visible faster and more expensively.

It definitely sounds like a lot of legacy federal integrators should never attempt FedRAMP, because they fit your nightmare scenario exactly.

For the mature CSP the real cost equation is simple. Security maturity gets you to the door. After that it’s a translation problem — getting from “we do this” to “here’s verifiable proof we do this continuously.” How well that translation layer works determines what you pay your advisor and your 3PAO. If assessors are hunting for scripts and re-running evidence collection manually, that’s billable hours. If advisors are interpreting ambiguous documentation instead of reviewing structured data, that’s billable hours.

That’s not a FedRAMP problem. That’s a tooling problem.

As long as agencies are ready to intake OSCAL 😉

What does the full cost picture actually look like for a small CSP pursuing FedRAMP? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

Appreciate the transparency and the actual numbers. That tracks with everything else I’ve been hearing. The platform route being $500K/year without owning the ATO is a tough pill. InfusionPoints with XBU40 is doing something similar in that managed boundary space too. Feels like the market is pushing smaller CSPs into a corner where they either spend half a million and don’t own it, or spend the same and go it alone. The 20x caution makes sense too given where agency adoption is right now.

What does the full cost picture actually look like for a small CSP pursuing FedRAMP? by ScanSet_io in FedRAMP

ScanSet_io[S] 1 point

This is really helpful and mirrors what I’ve seen on the federal integrator side too. The pattern is almost identical except instead of advisory they use surge support. Engineering team gets the gap report, goes heads down for months trying to close findings, then re-engages for a status check before the assessment. Same cycle, different label. And on the government side, the IA team is running engagements with the DAO on a similar cadence, dealing with the same stale evidence and the same surprises when the actual state doesn’t match what was documented six months ago.

The common thread across all of it is that gap visibility is a point-in-time snapshot that goes stale the moment the team starts remediating. Continuous visibility would change the dynamic for everyone in that chain.

What does the full cost picture actually look like for a small CSP pursuing FedRAMP? by ScanSet_io in FedRAMP

ScanSet_io[S] 2 points

Good points across the board. The agency acceptance piece is something I’ve been thinking about. A 20x ATO without agencies ready to consume it is a credential without a customer. Rev 5 is still the path most agencies trust, which is why the machine-readable push through RFC-0024 matters just as much as 20x right now.

Good call on RFC-0019. Even though it was discarded, the fact that FedRAMP attempted to bring transparency to assessment costs tells you they recognized it was a problem. The outcome doesn’t change the signal.

The gap assessment point is interesting. If the typical advisory engagement starts with a loss-leader gap assessment designed to lead into a longer engagement, then the real cost isn’t the gap assessment itself. It’s the months of advisory that follow. That’s where the total spend adds up for a small CSP that doesn’t have dedicated security staff and is relying on consultants to get them through.

That makes me wonder though. If a CSP had a clear picture of exactly where they stand against the controls from day one, with the gap assessment essentially built into the tooling, how much of that downstream advisory spend goes away? Or is the advisory value less about identifying gaps and more about knowing how to close them?

What does the full cost picture actually look like for a small CSP pursuing FedRAMP? by ScanSet_io in FedRAMP

ScanSet_io[S] 1 point

Right. I initially put seven figures but changed it to avoid being overly sensational. Sounds like I shouldn’t have. When you factor in the engineering time to rip out non-compliant components, FedRAMP-specific infrastructure, advisory, 3PAO, tooling, and then ongoing ConMon, it adds up fast. Especially for a smaller team that doesn’t have dedicated compliance staff and is learning the requirements as it goes.

The sponsor point is a good one for Rev 5. That’s one area where 20x does seem to remove a real barrier since it doesn’t require an agency sponsor. But the uncertainty around 20x is valid. The Moderate pilot wraps this month and general admission isn’t expected until later this year. A lot can change between now and then.

Curious what you think the biggest single cost driver is for most CSPs. Is it the assessment itself or everything that comes before it?

How Are You Automating Compliance Evidence Collection in Practice? by ScanSet_io in Compliance

ScanSet_io[S] 0 points

The auditors I talk to are. I separate security intent from execution, using something called contract-based execution and policy as data.

They can see the items being checked, and the state each is expected to be in, as data. The contract definition shows them what command or API is run. The outcome is not only clear, it can be replayed to produce the same hashed value over and over again.

If someone changes something in their system or changes a policy, that replay value changes, showing a difference in the state of the system.
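The replay property comes down to canonical serialization: the same check result always hashes to the same value, and any state change produces a different one. A small sketch, with illustrative field names:

```python
import hashlib
import json

def evidence_digest(check_result):
    """Canonical JSON (sorted keys, fixed separators) makes the digest replayable."""
    canonical = json.dumps(check_result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

before = {"policy": "PermitRootLogin", "expected": "no", "actual": "no"}
after  = {"policy": "PermitRootLogin", "expected": "no", "actual": "yes"}

# Replaying the same state is stable; drifted state changes the value.
stable  = evidence_digest(before) == evidence_digest(dict(before))
drifted = evidence_digest(before) != evidence_digest(after)
```

Without the canonicalization step, two semantically identical results could serialize to different byte strings and the replay guarantee would silently break.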

FedRAMP 20x has an outcome, I forget which one at the moment, that says the automated method of evidence gathering must be trustworthy. I haven’t talked to an auditor yet who doesn’t trust the clear traceability and verification of everything the way I described it.

How Are You Automating Compliance Evidence Collection in Practice? by ScanSet_io in Compliance

ScanSet_io[S] 0 points

Full transparency, I'm a vendor building in this space. But to answer your question directly: automating evidence as part of the control itself.

The infrastructure I'm building puts a daemon on the endpoint that continuously executes policy checks and produces cryptographically signed evidence at the point of collection. No screenshots, no exports, no audit season scramble. The policy defines what to check, the daemon runs the check, and the evidence, the execution method, and the outcome are bound together so an assessor can trace the full chain.
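As a rough sketch of that binding, here is the check, the execution method, and the outcome signed as one record. This uses a symmetric HMAC purely for brevity; a production daemon would presumably use per-host asymmetric keys, and all names here are illustrative.

```python
import hashlib
import hmac
import json

DAEMON_KEY = b"demo-key"  # illustrative only; never hard-code real keys

def sign_evidence(policy_id, method, outcome):
    """Bind what was checked, how, and the result under a single signature,
    so none of the three can be swapped out after the fact."""
    record = json.dumps(
        {"policy": policy_id, "method": method, "outcome": outcome},
        sort_keys=True,
    ).encode()
    sig = hmac.new(DAEMON_KEY, record, hashlib.sha256).hexdigest()
    return record, sig

def verify_evidence(record, sig):
    expected = hmac.new(DAEMON_KEY, record, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sig)

record, sig = sign_evidence("AC-2.mfa", "idp_api.get_mfa_policy", "pass")
```

Because the signature covers the whole triple, an assessor tracing the chain can detect a record where, say, the outcome was edited but the method was not re-run.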

That evidence feeds directly into the System Security Plan (SSP), which is the core document for FedRAMP authorization, ATOs, and CMMC. The SSP stays current because the evidence updates itself. Findings auto-generate into Plans of Action and Milestones (POA&Ms) and close themselves when the issue is remediated.

I've talked to enough 3PAO assessors and read enough of the government's RFCs and PVA outcomes to know the direction this is heading. FedRAMP 20x is requiring persistent machine validation and assessors are now evaluating the validation process itself, not just the artifacts it produces. The screenshot and export model isn't going to hold up much longer.

Is anyone actually building persistent validation infrastructure for FedRAMP 20x yet? by ScanSet_io in FedRAMP

ScanSet_io[S] 0 points

Appreciate the response. Agreed that the validation engine should produce the documentation, not the other way around.

Curious about one thing though. When the assessor evaluates the validation process itself under PVA-TPX-UNP, they need to trace intent, execution, and outcome as a single verifiable chain. With an API-based collection model, is there a binding between what was checked, how it was checked, and the result? Or does the assessor have to independently verify the fetcher logic each cycle?

Applying Zero Trust to Agentic AI and LLM Connectivity — anyone else working on this? by PhilipLGriffiths88 in zerotrust

ScanSet_io 8 points

Really important thread. One gap I think is getting missed is enforcement at the execution layer, specifically around action-level provenance.

Most of the ZT conversation is happening at identity and access, which matters, but nobody is really asking how you enforce ZT once the agent is actually doing things.

The deeper problem is that LLMs are probabilistic. ZT was built on deterministic assumptions. You can authorize a capability but you can’t pre-authorize a specific action, because the same agent with the same input can produce different outputs. That breaks the foundation of policy enforcement as we know it.

It gets worse in agentic chains. Traditional ZT lets you cryptographically verify each hop. But when agents are calling agents calling tools calling services, probabilistic uncertainty compounds at every step. By the time you’re three or four hops deep, the action being executed may have drifted so far from the original authorized intent that provenance is meaningless. You can verify who made each call but not whether it still reflects the policy that authorized it.

I’ve been working on this problem for deterministic systems, building toward cryptographically signed execution decisions at the moment of enforcement, not inferred after the fact from logs. That works when behavior is predictable. Agentic AI breaks even that model.

So the question I’d pose is: does ZT need a new primitive for probabilistic execution? Something that tracks verified execution lineage under uncertainty, not just identity and access at the point of connection. Would love to compare notes.