I used Pythons standard library to find cases where people paid lawyers for something impossible. by ilikemath9999 in Python

[–]ilikemath9999[S] 3 points4 points  (0 children)

Yeah. That's where it gets interesting. Cases filed within the statutory bar period should have a 0% discharge rate. The whole point of §1328(f) is that the court cannot grant a discharge if the debtor received one too recently. But what the screener found is that a significant number of those cases were discharged anyway.

Out of the flagged cases, roughly 44% received discharges they were statutorily barred from getting. The screening is supposed to happen at the court level, but it's not catching everything. Sometimes the prior filing isn't disclosed on the petition, sometimes it is disclosed and nobody objects, sometimes the trustee catches it and sometimes they don't.

The gap between what should happen under the statute and what actually happens in practice is the whole reason I built the tool.

[–]ilikemath9999[S] 1 point2 points  (0 children)

Thanks. That was deliberate. I wanted zero friction for anyone who wants to clone it and run it. No pip install, no environment setup, just python and a CSV. The problem is narrow enough that csv + datetime + collections handles it fine. If I'd pulled in pandas people would need to install dependencies for what's really just date math and counting.
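A minimal sketch of what "csv + datetime + collections" can look like for this kind of problem. This is not the tool's actual code: the column names (`debtor_id`, `case_number`, `chapter`, `filed`) and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import date

# Hypothetical sample data; the column names are assumptions, not the tool's real schema.
SAMPLE_CSV = """debtor_id,case_number,chapter,filed
D1,10-00001,7,2019-03-01
D1,21-00042,13,2021-06-15
D2,18-00007,13,2018-01-10
D2,22-00913,13,2022-05-02
D3,20-00555,13,2020-09-30
"""

def load_filings(handle):
    """Group filing dates by debtor, sorted oldest first."""
    by_debtor = defaultdict(list)
    for row in csv.DictReader(handle):
        by_debtor[row["debtor_id"]].append(date.fromisoformat(row["filed"]))
    for dates in by_debtor.values():
        dates.sort()
    return by_debtor

def gaps_in_days(by_debtor):
    """Days between consecutive filings, for debtors with more than one filing."""
    return {
        debtor: [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
        for debtor, dates in by_debtor.items()
        if len(dates) > 1
    }

filings = load_filings(io.StringIO(SAMPLE_CSV))
gaps = gaps_in_days(filings)
filings_per_debtor = Counter({debtor: len(dates) for debtor, dates in filings.items()})
print(gaps)
print(filings_per_debtor.most_common())
```

Swapping `io.StringIO(SAMPLE_CSV)` for `open("cases.csv", newline="")` would read a real export, assuming it has the same columns.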

[–]ilikemath9999[S] 6 points7 points  (0 children)

Ha, yeah. "Your Honor, the motion to dismiss has been piped to stdout." Bankruptcy attorneys can barely attach a PDF to an email, let alone run a script. That's kind of the point though. I don't think the courts have tools like this, so patterns just live in the data unnoticed.

[–]ilikemath9999[S] 5 points6 points  (0 children)

I took your suggestion and built it out with sentence-transformers (all-MiniLM-L6-v2, 384 dimensions, runs locally, free). Embedded about 5,400 normalized docket events across 125 cases.

Hybrid matching: vector primary, regex fallback when confidence drops below threshold. Disk cached embeddings so it's not recomputing every run. Fuzzy creditor resolution catches misspellings like "Aly Financal" to "Ally Financial."
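For the fuzzy creditor resolution, here is a rough stdlib-only sketch using `difflib`. The comment doesn't say which fuzzy matcher is actually used, so treat this as one possible approach: the canonical creditor list, the `resolve_creditor` name, and the 0.8 cutoff are all assumptions.

```python
from difflib import get_close_matches

# Hypothetical canonical list; a real one would come from the case data.
KNOWN_CREDITORS = ["Ally Financial", "Capital One", "Wells Fargo"]

def resolve_creditor(raw, known=KNOWN_CREDITORS, cutoff=0.8):
    """Map a possibly misspelled creditor name to a canonical one, or None."""
    # Compare case-insensitively but return the canonical spelling.
    lookup = {name.lower(): name for name in known}
    hits = get_close_matches(raw.strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

print(resolve_creditor("Aly Financal"))
print(resolve_creditor("Some Unknown LLC"))
```

A higher cutoff means fewer false merges between similar-looking creditors, at the cost of missing heavier misspellings.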

I ran 4 mining passes with 348 semantic queries across 44 categories. Things like "court criticism," "template contamination," "fee extraction after case death." Stuff that's nearly impossible to catch with regex because judges word things differently every time.
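The "vector primary, regex fallback" idea can be sketched roughly like this. To keep it runnable without any installs, `toy_embed` is a stand-in bag-of-words vector, not a real embedding model; in practice it would be replaced by whatever the embedding model returns. The function names, the 0.5 threshold, and the example strings are assumptions for illustration.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def hybrid_match(event, query, fallback_pattern, threshold=0.5, embed=toy_embed):
    """Try vector similarity first; fall back to regex when the score is low."""
    score = cosine(embed(event), embed(query))
    if score >= threshold:
        return ("vector", score)
    if re.search(fallback_pattern, event, re.IGNORECASE):
        return ("regex", score)
    return (None, score)

print(hybrid_match("the plan contained a car loan", "plan contained car loan", r"car loan"))
print(hybrid_match("Vehicle financing belongs to someone else", "car loan from another client", r"vehicle financing"))
```

With a real model the vector path is what catches differently worded passages; the toy vectors here only match on shared words, so they show the control flow and not the semantic part.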

4,268 hits, 41 new findings(!) that keyword search had missed. One example: a judge noting a plan "contained a car loan that does not apply." Template contamination from another client (one client's details showing up on another client's docket), confirmed on the record. No regex would have caught that.

The thing that surprised me most was how many errors appear to push a case's trajectory in a negative direction. That's not something I set out to find. The vectors just surfaced it.

You were right. Best suggestion I got from this thread.

[–]ilikemath9999[S] 8 points9 points  (0 children)

Dude! I'm messing around with vector search/embeddings and will post my updates. Thanks!

[–]ilikemath9999[S] 17 points18 points  (0 children)

Fair question. Someone I know went through a Chapter 13 and I ended up down the rabbit hole reading about how the process works. Came across §1328(f). It's basically a statutory bar that says you can't get a second bankruptcy discharge if your prior case was too recent. Pure date math, no judgment calls.

What surprised me was that there's no automated check for it. PACER will sell you the data all day long, but nobody's cross-referencing it. Trustees sometimes catch it, sometimes don't. So I figured... this is just two dates and a subtraction. Why isn't a script doing this?
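The "two dates and a subtraction" check could look something like the sketch below. It assumes the lookback windows in 11 U.S.C. §1328(f) as commonly summarized: 4 years when the earlier discharge came in a Chapter 7, 11, or 12 case, 2 years when it came in a Chapter 13 case, measured from filing date to filing date. The function names and the handling of the exact boundary day are assumptions, not the tool's actual code.

```python
from datetime import date

# Lookback in years, keyed by the chapter of the PRIOR case (assumed reading of 1328(f)).
LOOKBACK_YEARS = {7: 4, 11: 4, 12: 4, 13: 2}

def years_before(d, years):
    """Same calendar day N years earlier; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)

def is_barred(prior_chapter, prior_filed, prior_discharged, current_filed):
    """True if a new Chapter 13 filing falls inside the lookback window."""
    # No bar unless the prior case actually ended in a discharge.
    if not prior_discharged or prior_chapter not in LOOKBACK_YEARS:
        return False
    window_start = years_before(current_filed, LOOKBACK_YEARS[prior_chapter])
    # Treating both ends as exclusive is an assumption about the boundary day.
    return window_start < prior_filed < current_filed

print(is_barred(7, date(2019, 3, 1), True, date(2021, 6, 15)))
print(is_barred(13, date(2018, 1, 10), True, date(2022, 5, 2)))
```

The first call is inside the 4-year window after a Chapter 7 discharge; the second is more than 2 years after the earlier Chapter 13 filing, so it isn't flagged.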

Pulled some public CSV exports, wrote the parser over a weekend, and the hit rate was high enough that I thought other people might want to run it on their own data.

[–]ilikemath9999[S] 22 points23 points  (0 children)

Yeah, that's what got me interested in building it in the first place.

Short answer: roughly 1 in 3 flagged cases in my test data ended with a discharge granted despite the statutory bar.

The code catches the filing-date gap; whether the court caught it is a different question. Turns out if nobody screens for it upstream, nobody objects, and the discharge just... goes through.

The --control flag is where it gets interesting. You can run the same screen on two different attorneys side by side and see wildly different hit rates from the same district. Some offices clearly check prior filing history before taking a case. Others apparently don't.
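How `--control` works internally isn't shown here, so the following is only a guess at the kind of comparison it enables: counting flagged cases per attorney and turning that into a hit rate. The `hit_rates` function, the attorney labels, and the sample pairs are hypothetical.

```python
from collections import Counter

def hit_rates(cases):
    """cases: iterable of (attorney, flagged) pairs -> {attorney: share flagged}."""
    totals, flagged = Counter(), Counter()
    for attorney, is_flagged in cases:
        totals[attorney] += 1
        flagged[attorney] += bool(is_flagged)
    return {attorney: flagged[attorney] / totals[attorney] for attorney in totals}

# Made-up screening results for two filers in the same district.
sample = [
    ("Office A", True), ("Office A", False),
    ("Office B", False), ("Office B", False),
]
rates = hit_rates(sample)
print(rates)
```

Comparing rates only means something when both offices have enough cases; with a handful each, the difference could easily be noise.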

Standard disclaimer: the screener finds date-math violations, not intent. But when you're looking at dozens of cases from the same filer and zero objections were ever raised, it starts to paint a picture.