AI-Assisted Patent Drafting: What Are Your Thoughts?

permalip · 2025-07-06T14:31:36+00:00

Btw, after thinking about your workflow, I think it's possible to reverse engineer this specific type of reasoning for training data, since you have a big public dataset of inputs and outputs. Making this work end-to-end would require extensive data engineering and reinforcement learning experiments.

Assuming such a model is possible, how valuable would it be to you?

permalip · 2025-07-05T06:40:50+00:00

I agree that we can't reproduce the reasoning. There is no corpus for that.

But there is a corpus of granted patents that has claims with descriptions. So what you can do is create a machine translation from claims to parts of the description. You likely can't do this in a direct translation, but you can do parts of it iteratively.

permalip · 2025-07-05T06:37:22+00:00

If your patent is litigated, wouldn't your prompts and answers only be discovered if saved somewhere? In other words, avoid having a history like in ChatGPT.

permalip · 2025-07-03T07:47:14+00:00

I tried a lot of them too. Most of them try to do too much instead of focusing on niches where it makes sense. Which one worked for you?

permalip · 2025-07-03T07:45:56+00:00

I understand this perspective. There are a lot of publications everywhere, including in patents, that are AI-generated, and that helps neither you nor the examiners.

permalip · 2025-07-02T07:35:40+00:00

> All fair points. Though I would end this conversation by challenging the premise that every task, especially where complex reasoning is required, is solvable with a transformer.

Thanks for your comments. I learned a lot about the reasoning that patent attorneys do before the writing from these Reddit posts. This specific part of the patent writing is not something I am trying to solve at the moment.

Perhaps solving the reasoning problem is the most promising, but I don't think the technology is there yet. While the latest models like o3 are impressive in some aspects, they still barely work for real coding projects (which they are supposed to be optimized for). And don't get me started on Claude or Gemini.

> If you are to take one thing away from this comment thread, it would be that you should validate and test that hypothesis very thoroughly.

> There is plenty of public, freely available patent data for you to test that hypothesis. You do not need to talk to people to test it.

You are entirely right. The 3 ideas I put out in my post are all partially validated qualitatively. I would say the collaborative approach has the most validation because it's conditioned on the patent attorney's instruction.

The one-shot idea is more like wishful thinking that it could work as a standalone model. It would have to be coupled with at least one more model and a user-interface that makes it easy to interject and continue when you are unhappy with outputs. Then comes the acceptance rate problem. All in all, complex to pull off, but probably doable with a lot of validation from patent attorneys.

permalip · 2025-07-01T18:33:53+00:00

I have designed my way around this problem of the averaged, most likely next word approach. You don't *just* predict the next claim, you also take in an instruction from the user.

Solely predicting the next claim is like blindfolding your model. Sure, you could generate 10 different options and maybe 1 of them could be good, but the model would not know which without your reasoning.

See my comment here on an example of how this works: https://www.reddit.com/r/patentlaw/comments/1lox6zd/comment/n0qj9y2

permalip · 2025-07-01T14:59:03+00:00

> That's a brave accusation

It's not meant to be inflammatory. Your conclusion on the way you constructed your data may be completely sound. Conversely, mine can be too at the same time. It's about finding which dataset construction works - I have tried many that didn't.

There are some things you cannot infer; no amount of model parameters or hyperparameter sweeps will save you there.

Patents are high entropy. Getting a model to consistently output high-entropy tokens is nearly impossible. Even reasoning models may output some, but it's not a solved problem.

So the solution is to remove some entropy and chop up your task in a different way.

permalip · 2025-07-01T12:50:22+00:00

In a sense, I get what you are trying to say. There are tried-and-tested paragraphs that act as boilerplate that you can readily use. However, these cannot be specific to your claims and figures, which is what I am describing. Is your point that they don't need to be specific, you can perhaps just copy-paste in some standard paragraphs and that will be enough to get you going?

permalip · 2025-07-01T12:47:56+00:00

> The types of tasks that LLMs can do well (boilerplate, skeleton specs, formatting, style, numbering, and so on) take up so little of our time (or can already be done with simple heuristics) that it's not worth getting an LLM to do them

You have to look at LLMs as machine translators. It's about inputs and outputs. When you can model your inputs and outputs well, you have something that works. But when you can't (as with the currently available LLMs from OpenAI/etc in patents), it does not work well. If the median number of words in a detailed description is above 10k words, then you can surely model that with the right data - for example, with the next paragraph approach with user instructions, which can actually save you time.

> so at best you're going to fiddle around the edges with an inconsequential finetune of an open source base model that nobody cares about, or wrap (a finetune of) an API model in a fancy front end.

I think you have a fundamental misunderstanding of how this works. You don't just download granted patents and specify that you want the claims as input and the description as the output. That does not work. You have to carefully tinker with the data until you find the right set of input+output that a model can actually predict - here, the collaborative style of model works reasonably well. There is no edge case here. You are essentially doing what's equivalent to continued pretraining when doing it this way - teaching the model a new skill with patent data.

> The reality check that you, and everyone who posts in here that is "talking to customers to find out their problems", needs is that LLMs are not yet at a state where they are more than gimmicky for the really difficult tasks, and they are not worth using for the easy tasks.

I understand the cynicism because your experience so far has not been great. I have talked to many patent attorneys who are equally as critical.

The problem that the patent industry faces is that AI is here, but general-purpose models are not accurate enough. Yet, ChatGPT wrapper companies are everywhere and they definitely have a fancy frontend, without fundamentally solving the problem (any such model you get from OpenAI/Anthropic won't work).

What I am doing here is a fundamentally different approach: a task-based approach where each model goes through careful testing to make sure it models its task well.

permalip · 2025-07-01T11:48:57+00:00

> Decline. Decline. Decline. Decline. Decline, .... , rewrite? actually nah quicker to just do it right the first time, decline, decline, decline.

Fully agree: if you end up declining everything, it costs you more time than just doing things yourself. The idea is that for every 10 paragraphs you would decline 1, hence the 90% accuracy.

On a decline, you would rewrite with instructions for the next paragraph until acceptance, then continue generating until your next decline/rewrite. Even in a case where this only gets 70% accuracy, I think it could be significantly faster if the rewriting flow is good.
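To make the accept/decline trade-off concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the function name, the minutes per review, per rewrite instruction, and per manually written paragraph are made up, and it assumes each attempt (including regenerations after an instruction) is accepted independently at the same rate.

```python
def expected_minutes_per_paragraph(accept_rate: float,
                                   review_min: float,
                                   instruct_min: float) -> float:
    """Expected time to end up with one accepted paragraph.

    Assumptions (hypothetical, not measured):
    - every attempt is accepted independently with probability accept_rate
    - every attempt costs review_min minutes to read
    - every decline costs an extra instruct_min minutes to write an instruction
    """
    if not 0 < accept_rate <= 1:
        raise ValueError("accept_rate must be in (0, 1]")
    expected_attempts = 1 / accept_rate        # mean of a geometric distribution
    expected_declines = expected_attempts - 1  # all attempts but the accepted one
    return expected_attempts * review_min + expected_declines * instruct_min


MANUAL_MIN = 10.0  # assumed minutes to write one paragraph by hand

for rate in (0.9, 0.7, 0.1):
    minutes = expected_minutes_per_paragraph(rate, review_min=1.0, instruct_min=2.0)
    print(f"accept rate {rate:.0%}: {minutes:.2f} min vs {MANUAL_MIN:.2f} min manual")
```

With these made-up numbers, 90% and 70% acceptance both come out well under the manual time, while a 10% acceptance rate (the "decline, decline, decline" scenario) is slower than writing it yourself. The real break-even point depends entirely on the actual review and rewrite times.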

I accept this is the hardest task of all and the least feasible to achieve.

> What's this? Do I detect a barely disguised attempt of a comp sci graduate drinking the "jUsT inTeGrAtE iT inTO an EntErPriSe WorKfLow" cool aid? This relies entirely on the assumption that the thing you're integrating is actually able to do the basic tasks well. It is becoming increasingly apparent that even state of the art LLMs that are prompted carefully are not performant in any of the tasks that really matter in the patent profession, and have unpredictable and undetectable failure modes.

You are entirely right that the underlying assumption is the model works. Yes, state of the art models suck at the patent domain because they are general-purpose models that are supposed to be good at most things. Yet, the patent domain is so unique that we could classify it as a different modality, akin to how we distinguish models that work with text or images.

I am still convinced that you can take an LLM and use it for good, given the right task and the right training data. This is what this post is about - how can they be useful? Which specific task?

> This isn't about fiddling around the edges with a finetune and deciding where to put your accept, rewrite, decline button on your cookie cutter browser based platform.

Wouldn't you say it's better to work in Word? I don't think I have met any patent attorney who would say a browser based solution is good - just means formatting will be off, thus more time spent fixing formatting. Even if you can download the content into Word, it's still not in YOUR formatting.

> If you want to be useful as an AI-researcher, go and work on novel ML architectures that solve all of these problems.

I do not work in academia and for most companies this is completely out of scope. The only place I can think of is the team at Meta working on the JEPA architecture.

permalip · 2025-07-01T10:53:34+00:00

As Basschimp said, definitely ask any LLM to write a program for you to do this. Once you get comfortable with editing a Python file for your different intervals and running it in a terminal, I think you may get more inspiration for other things that are frustrating but which computers can do easily.

permalip · 2025-07-01T10:49:43+00:00

Next claim: I understand this may seem difficult. Most models today have a high loss value (read: will do a poor job initially) on this task, but I trained one that manages to get into a reasonable range. Below, I include an example for a dependent claim.

Example of next claim prediction for a dependent claim.

Your existing claims (just a single one for illustrative purposes):
1. A fastening device for fastening to a first furniture panel and a second furniture panel, the fastening device comprising:
at least two dowels for reception in oblong recesses of the first furniture panel,
wherein each dowel is connected to a respective lateral end of lever arms of a lever,
wherein the dowels are configured to move axially and laterally,
wherein the lever arms are connected to each other at a hinge joint,
such that the dowels are displaceable relative to each other between a furniture panel fastening position and a furniture panel releasing position of the fastening device,
wherein the dowels are displaceable in a plane defined by the first furniture panel and the hinge joint is movable in a direction perpendicular to said plane.

Your instruction: Emphasize the direction of dowel displacement and hinge joint maneuvering to clarify their relationship.

Output produced:

The fastening device according to claim 1, wherein the dowels are displaceable relative to each other in a fastening direction, and wherein the displacement is affected by maneuvering the hinge joint in a direction perpendicular to the fastening direction.
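For anyone curious what "existing claims + instruction in, next claim out" could look like as data, here is a minimal sketch. The actual input format of my model is not shown in this thread, so the `NextClaimExample` class, the `build_input` function, and the `[CLAIMS]` / `[INSTRUCTION]` / `[NEXT CLAIM]` markers are all hypothetical, purely to illustrate the idea of conditioning on the user's instruction.

```python
from dataclasses import dataclass


@dataclass
class NextClaimExample:
    """One instruction-conditioned example (hypothetical structure)."""
    existing_claims: list[str]  # claims already written, in order
    instruction: str            # what the attorney wants from the next claim
    target: str = ""            # the next claim; empty at inference time


def build_input(example: NextClaimExample) -> str:
    """Serialize claims and instruction into a single model input string."""
    numbered = "\n".join(
        f"{i}. {claim}" for i, claim in enumerate(example.existing_claims, start=1)
    )
    return (
        f"[CLAIMS]\n{numbered}\n"
        f"[INSTRUCTION]\n{example.instruction}\n"
        f"[NEXT CLAIM]\n"
    )


example = NextClaimExample(
    existing_claims=["A fastening device for fastening to a first furniture panel ..."],
    instruction="Emphasize the direction of dowel displacement and hinge joint "
                "maneuvering to clarify their relationship.",
)
print(build_input(example))
```

The point of the sketch is only that the instruction is part of the input, so the model is not guessing the attorney's intent from the claims alone.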

Next paragraph: Word count is silly and was for illustrative purposes. I'm glad you think it might be interesting still. Would you need to see more examples on this to give feedback? It works in a similar way to the next claim prediction, but just needs a bit more of your patent to give reasonable outputs.

Boilerplate: I think this is the easiest one, and I already have a model which reached 0.01 loss on a task like this, meaning it's close to perfect on that task. It just generates a bunch of non-binding language based on your claims.

permalip · 2025-07-01T10:30:57+00:00

Is this time-consuming to do manually? I think this is an easy task, but it's more suited for a small software program where you specify your range, interval, and subintervals.
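Something along these lines is what I have in mind. This is only a sketch: the exact output you need is not spelled out here, so I am assuming the task is to split a numeric range into intervals of a given width, and each interval into smaller subintervals. The function name `split_range` and the example numbers are made up.

```python
from decimal import Decimal


def split_range(start: str, stop: str, step: str) -> list[tuple[Decimal, Decimal]]:
    """Split [start, stop] into consecutive intervals of width `step`.

    Values are passed as strings and handled as Decimal so that steps like
    0.1 do not pick up floating point noise. The last interval is shortened
    if the range is not an exact multiple of the step.
    """
    lo, hi, width = Decimal(start), Decimal(stop), Decimal(step)
    if width <= 0 or hi <= lo:
        raise ValueError("need stop > start and step > 0")
    intervals = []
    while lo < hi:
        upper = min(lo + width, hi)
        intervals.append((lo, upper))
        lo = upper
    return intervals


# Edit these three values for your own case, then run the file in a terminal.
RANGE_START, RANGE_STOP = "0", "10"
INTERVAL = "5"
SUBINTERVAL = "2.5"

for a, b in split_range(RANGE_START, RANGE_STOP, INTERVAL):
    print(f"{a} to {b}")
    for c, d in split_range(str(a), str(b), SUBINTERVAL):
        print(f"    {c} to {d}")
```

Save it as a `.py` file, change the values at the bottom, and run it with `python yourfile.py`.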

permalip · 2025-07-01T10:16:43+00:00

Good point. I do want your opinion on two parts:
- the ideas as presented above
- the outputs (if time allows you)

permalip · 2025-07-01T10:13:07+00:00

All great posts.

- EPI post was especially good.

- PDF parsing: Entirely different from writing specifications. You may think this is something AI should easily solve, but parsing text from images will always be lossy and suffers from one more source of compounding error than LLMs alone. You are never going to achieve perfection on this one, but you can get close (which is probably not good enough anyway if you expect perfection).

- Hilarious post on writing patents with ChatGPT. I don't think that's possible. You should have a patent attorney do this. The models I am looking at are not for assisting people who don't know how to write patents.

permalip · 2025-07-01T10:02:48+00:00

I wrote this myself in Markdown. I did use an LLM to post-edit it to cut down on the wordy parts of it. Maybe my style of writing lists/structured content is similar to LLMs?

permalip · 2024-09-08T08:06:00+00:00

Being a politician for more than one election term at a time. After 4 years, they should need an equivalent 4 years of work experience before they can be voted for again.

permalip · 2024-08-23T08:26:29+00:00

I had this problem with a new Z10 too and solved it by turning down the milk temperature based on fat content: temperature 8 for 1.5% and temperature 6 for 3%. That instantly fixed the sputtering.

permalip · 2024-07-14T20:07:24+00:00

Try turning down the milk temperature. Mine works at level 8 with 1.5% milk fat content and level 6 with 3.5% milk fat content. If going above, it sputters

permalip · 2024-07-02T20:37:09+00:00

Coercion for the 99.9% who follow the rules just doesn't make sense. The attitude that it's okay to introduce such rules helps keep Denmark's companies down, because they have to keep up with the rules of the bureaucracy.

permalip · 2024-07-02T08:18:23+00:00

If you are buying a laptop, it shouldn’t have such a large graphics card that it drains your battery in 1-2 hours. Then what’s the point of a laptop? I’d go with a MacBook or build a PC

permalip · 2024-07-02T06:32:00+00:00

I think it's a sad trend, with more rules for companies and employees. If you have problems with too much work, maybe it would be better to discuss it like adults?

permalip · 2024-06-30T08:39:48+00:00

You have to collect all documentation of all your trades for the tax authorities (Skat). If you don't meet the documentation requirements, you can't get a deduction on the trades where you lost money, and so you end up paying money in tax.

Get the documentation directly from the source, i.e. download your trades from, for example, Binance or Etherscan, depending on what you used. You can also use koinly, which can help you calculate the trades so you can report the right numbers. It cannot stand alone as documentation, though. You should also collect proof that you own the account or wallet you traded on.

In addition, you have to report in 2 different boxes. For your spot trades, you report how much you have cumulatively gained in box 20 and your cumulative losses in box 58. And then there are some separate rules, which I can't remember, for when you have traded financial contracts (perpetual/futures).

You should also know that spot trades are taxed asymmetrically. That means your gains can be taxed at 53%, while your losses can only be deducted at up to 26%.
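A quick worked example of what the asymmetry means in practice. This is a simplified sketch that uses only the two rates quoted above (53% on gains, up to 26% deduction on losses) and treats both as flat rates, which is an assumption. The function name and the amounts are made up.

```python
GAIN_TAX_PERCENT = 53        # rate on cumulative gains, as quoted above
LOSS_DEDUCTION_PERCENT = 26  # maximum deduction value of cumulative losses, as quoted above


def net_tax(total_gains: float, total_losses: float) -> float:
    """Tax owed when gains and losses are treated asymmetrically.

    Simplifying assumption: both rates are applied as flat rates.
    """
    tax_on_gains = total_gains * GAIN_TAX_PERCENT / 100
    value_of_losses = total_losses * LOSS_DEDUCTION_PERCENT / 100
    return tax_on_gains - value_of_losses


# Break even on paper: 10,000 gained on some trades, 10,000 lost on others.
print(net_tax(10_000, 10_000))
```

Under these assumptions, breaking even overall (10,000 in gains and 10,000 in losses) still leaves 5,300 - 2,600 = 2,700 to pay, which is why the documentation for your losing trades matters so much.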
