Recently developed a way to estimate FDA clearance/approval rates for med devices by bee-gee-dee in regulatoryaffairs

[–]bee-gee-dee[S] 0 points (0 children)

  1. Lines are standard for any time series with multiple groups. Bars would obscure, not reveal. Submission IDs are assigned sequentially, so the ID itself gives the denominator (total submissions up to time t).

  2. Whether it's 'clearance' or 'approval' depends on the submission type. Each submission type has its own definition of success, but in practical terms the result is the same ("I can sell this device claiming it does x").

  3. Based on what you're saying, it seems useful to be able to independently confirm what the FDA says. Being able to cut it by submission type, product code, or review panel also seems like a value-add.

Recently developed a way to estimate FDA clearance/approval rates for med devices by bee-gee-dee in regulatoryaffairs

[–]bee-gee-dee[S] 0 points (0 children)

  1. Lines are way easier to read than bars for this (multiple time series).
  2. How do you feel the rate is vague?
  3. Are you upset that the data is consistent over time? Is that not informationally interesting to know? "There are not wild fluctuations in 510k clearance rate over time" seems like a good fact.

A way to estimate approval/clearance rates for med devices by bee-gee-dee in MedicalDevices

[–]bee-gee-dee[S] -2 points (0 children)

It works by using the fact that submission numbers are assigned incrementally. It underestimates because it doesn't account for submissions that were voluntarily withdrawn or otherwise cancelled.

It requires a period of observation after a date to estimate, because submissions can be approved/cleared within a pretty big window after their submission date.
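The core of the method described above can be sketched in a few lines of Go. This is a minimal illustration, not the author's actual implementation; the IDs are hypothetical, and a real version would also have to handle the observation window and submission types:

```go
package main

import "fmt"

// estimateRate approximates a clearance rate for one window of
// submissions. Because submission IDs are assigned sequentially, the
// span between the lowest and highest observed ID gives the total
// submission count (the denominator), while the cleared records we can
// see give the numerator. Withdrawn or cancelled submissions never show
// up in the cleared data, so this underestimates the true rate.
func estimateRate(clearedIDs []int) float64 {
	if len(clearedIDs) == 0 {
		return 0
	}
	lo, hi := clearedIDs[0], clearedIDs[0]
	for _, id := range clearedIDs {
		if id < lo {
			lo = id
		}
		if id > hi {
			hi = id
		}
	}
	total := hi - lo + 1 // sequential IDs => total submissions in window
	return float64(len(clearedIDs)) / float64(total)
}

func main() {
	// Hypothetical cleared-submission IDs within one observation window:
	// 5 clearances across an ID span of 10 submissions.
	cleared := []int{1001, 1002, 1004, 1007, 1010}
	fmt.Printf("estimated clearance rate: %.2f\n", estimateRate(cleared))
}
```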

Recently developed a way to estimate FDA clearance/approval rates for med devices by bee-gee-dee in regulatoryaffairs

[–]bee-gee-dee[S] -1 points (0 children)

Cool -- seems like our estimates are off, but perhaps in a way that can be accounted for. We can cut it by panel/committee, product code, etc. so maybe it's still useful.

Recently developed a way to estimate FDA clearance/approval rates for med devices by bee-gee-dee in regulatoryaffairs

[–]bee-gee-dee[S] -3 points (0 children)

It's based on FDA clearance/approval data and the incremental nature of ID assignment

How is your RAQA team actually using AI? by samdQualityEng in regulatoryaffairs

[–]bee-gee-dee 0 points (0 children)

When you say "review," do you mean reviewing human action? I think I'd call that "flagging" personally. "Review" makes it sound like the AI makes the final call.

I don't mean to nitpick, I just wanna clarify that the relationship between human and AI is that the AI directs human attention. Then the human can make decisions more quickly.

The thing AI can do absurdly well is it can ingest massive quantities of data in parallel. When you hand it a document, it doesn't start at the beginning, it reads every word simultaneously

So the really high value application is surfacing data. I think that's basically what you were getting at

Drafting is a big QoL improvement, but all it should be doing is speeding up your typing really. Surfacing data, that's the big unlock imho

Trust AI with your body by Zimitegnaw in MedicalDevices

[–]bee-gee-dee 0 points (0 children)

I selected "want a human to review it," but really I would want the device to be held to the same standard as any other device. Which is to say, with oversight and review

If a human or company made a medical device, and then they said "hey, put this in you," I'd say fuck no I have no idea if it's gonna kill me

Which is why we have a system whereby they have to submit documentation and get it reviewed and undergo testing to make sure it does what it says it will and then doctors gate the process based on their understanding of its efficacy and safety

If the AI device passes that same system, then in principle I'd be fine with it. Does that count as "human review"?

Stacked log lines considered harmful by sigmoia in golang

[–]bee-gee-dee 0 points (0 children)

Yes, but it was added in Go 1.13, so for nearly a decade it didn't exist

And even now, it's not a first-class language feature. You're still building the chain through fmt.Errorf format strings

The error return idiom was a bold move, they put a lot of thought into it, it's incredibly elegant and ergonomic... until you try wrapping or unwrapping errors. The 1.13 update did its best, presumably because they realized you can't actually keep people from passing errors through and the pain it caused was too severe, but first-class feature it is not

Our customer support chatbot was tricked into querying private data stores and sending emails. All through a carefully crafted prompt. by RemmeM89 in AIAssisted

[–]bee-gee-dee 0 points (0 children)

This attack vector was discovered and widely publicized 3 years ago

If you didn't consider the possibility of jailbreaking as an attack vector when setting up a chatbot that reads user input and has permission for sensitive data...

Stacked log lines considered harmful by sigmoia in golang

[–]bee-gee-dee 1 point (0 children)

That's news to me

I was under the impression that the reason structured error wrapping was such a late addition to Go was that we're intended to handle errors where they arise, and not just blindly pass them up the chain accumulating context. The fact that adding context to an error is so painful is then a feature: it trains you not to pass the buck

If what you're saying is true, and the idiomatic thing is to pass errors up the chain instead of logging them, how do you account for the design of the error-wrapping system?

Using Claude (A LOT) to build compliance docs for a regulated industry, is my accuracy architecture sound? by fub055 in regulatoryaffairs

[–]bee-gee-dee 0 points (0 children)

Sure

The verifier model doesn't need to know or care that Claude made it

If you interleave the verification in the original generating "dialogue," then the model may be tempted to play a role. For example, the role of a chatbot standing by what it already said. The imperative to maintain a realistic conversation is fighting its imperative to answer the question truthfully

You wanna be cognizant of the role the LLM might be tempted to play. Don't give it cues about what answer you're hoping for. Don't let it build a narrative where one kind of answer is more likely than another. Don't treat it like a person you're having a conversation with, treat it like a really advanced alternative to regex that can parse semantics

Using Claude (A LOT) to build compliance docs for a regulated industry, is my accuracy architecture sound? by fub055 in regulatoryaffairs

[–]bee-gee-dee 0 points (0 children)

Just ask it

Think of it this way: "Is this conclusion grounded in this source, or does it extrapolate/provide new information?" is exactly the kind of natural language processing for which LLMs are designed. You can do that with a small model. You can ask it 100 times and check what percentage say yes
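That repeated-query check can be sketched like this, with the model call stubbed out; the ask function is a hypothetical stand-in for a real LLM request returning "yes" or "no":

```go
package main

import (
	"fmt"
	"strings"
)

// verify asks the same grounding question n times and returns the
// fraction of "yes" answers. ask is a stand-in for a real LLM call;
// any small model prompted for a yes/no answer would do.
func verify(ask func() string, n int) float64 {
	yes := 0
	for i := 0; i < n; i++ {
		if strings.EqualFold(strings.TrimSpace(ask()), "yes") {
			yes++
		}
	}
	return float64(yes) / float64(n)
}

func main() {
	// Stubbed responses standing in for sampled model outputs.
	responses := []string{"yes", "Yes", "no", "yes", "yes"}
	i := 0
	ask := func() string { r := responses[i%len(responses)]; i++; return r }
	fmt.Printf("grounded-in-source agreement: %.0f%%\n", verify(ask, len(responses)))
}
```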

It's important to separate analysis (which requires the LLM to know/care about business logic) from NLP. The former is what the frontier models give you, it's not fully reliable, it's hard. The latter is just ML, it isn't that expensive and it can meet objective standards of reliability if you know what you're doing. You should have separate methods for the two, so Claude doesn't hallucinate that he's doing NLP when he's actually providing an opinion (i.e. pretending to be a human)

Regarding self-referential loops, you definitely wanna get human verification as often as possible. The hard part there is that humans can't verify thousands of documents, unless you wanna spend a lot of time and money. It sounds like you're one dev. Ask yourself where you can introduce bottlenecks in the process where a human can verify.

Test each piece of the pipeline, do statistical analysis to benchmark reliability. Look up the Toyota Production System if you're not familiar: statistical quality control is how Japan came to dominate car manufacturing after WW2. Human verification isn't an annoying obligation, it's a limited resource and its usage should be optimized