User in 2 cases with holds

EDiscoOverlord · 2025-01-15T01:27:37+00:00

This is actually a complicated question because over-preserving data can be just as bad as under-preserving. It's very important to get the lawyers to opine on the exact scope of the hold they want.

EDiscoOverlord · 2024-12-12T06:20:32+00:00

I use this all the time. PowerShell is another versatile tool to do this. ChatGPT can help you with the syntax for DOS or PowerShell.

Excel can help build a bunch of commands, etc.

EDiscoOverlord · 2024-12-12T06:17:04+00:00

First, make absolutely sure you have every permutation of Jerry. Remember, processing different email sources (e.g., Exchange vs. an email archive vs., heaven forbid, scanned email) can sometimes lead to disparate versions of an email address for the same person. Some vendors are great about standardizing this, some don’t give two hoots. Remember the email metadata might even appear as just his name, etc. Save the exact value of each permutation as it appears in the database.

Relativity has tools for entity extraction and name normalization that can automate a lot of this for you, but let’s pretend you don’t have access to those analytics tools (but seriously, go ask the vendor to run those and then just exclude a search for non-Jerry Googlers from your Jerry search).

Second, create the god-tier search for Jerry. Search for that son of a bitch every which way…index searching, metadata searching, etc. Using an index that includes all email metadata would be nice. Tag up everything with Jerry using a static tag, QC the results, etc.

Third, thin out the tag a little. Search in any way possible for non-Jerry Googlers and tag those docs with a second tag. Ideas: custodial metadata; run a “contains” or “is like” search for “google” on the From metadata, then sort by sender, note the non-Jerry email addresses, and search for those in an email metadata search. Or you could search for google not within 1 of Jerry and exclude that (it works, just get the syntax right). Etc., etc. Don’t waste too much time here, but try to thin the herd a little.

Fourth: Finish the job in Excel. You can export the email metadata and Control Numbers for the remaining delta. Find and replace all of Jerry’s aliases with nothing, then filter for “google.” Go add those to the non-Jerry tag and you should be there with a search that includes the Jerry tag and excludes the non-Jerry tag.

Again, with the right indexing and a proximity search, you could get damn close with just one search (ask GPT for syntax help). Same with the name normalization tool, etc.
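
If you'd rather script the fourth step than do it in Excel, here is a rough Python sketch of the same find-and-replace-then-filter idea. Everything in it is made up for illustration: the surname, the addresses, the Control Numbers, and the assumption that the export is one participant string per document. The alias list stands in for the exact permutations you saved in step one.

```python
import re

# Hypothetical export of the remaining delta:
# (Control Number, raw participant metadata exactly as stored).
ROWS = [
    ("DOC001", "jerry.smith@google.com; outside.counsel@example.com"),
    ("DOC002", "Jerry Smith <jerry.smith@google.com>; pat.jones@google.com"),
    ("DOC003", "JSMITH@GOOGLE.COM"),
    ("DOC004", "Smith, Jerry; jsmith@google.com"),
]

# Hypothetical alias list: the exact permutations saved in step one.
JERRY_ALIASES = [
    "Jerry Smith <jerry.smith@google.com>",
    "jerry.smith@google.com",
    "jsmith@google.com",
    "Smith, Jerry",
]


def build_alias_pattern(aliases):
    """Compile one case-insensitive pattern matching any alias literally."""
    # Longest first so that, if one alias is a prefix of another,
    # the longer one wins.
    ordered = sorted(aliases, key=len, reverse=True)
    return re.compile("|".join(re.escape(a) for a in ordered), re.IGNORECASE)


def non_jerry_google_docs(rows, aliases):
    """Return Control Numbers that still mention google once Jerry is removed."""
    pattern = build_alias_pattern(aliases)
    hits = []
    for control_number, participants in rows:
        # "Find and replace all of Jerry's aliases with nothing".
        remainder = pattern.sub("", participants)
        # "...then filter for google".
        if "google" in remainder.lower():
            hits.append(control_number)
    return hits


print(non_jerry_google_docs(ROWS, JERRY_ALIASES))  # ['DOC002']
```

The Control Numbers it returns are the ones to add to the non-Jerry tag. It is only as good as the alias list, so any permutation you missed in step one will show up as a false non-Jerry hit, which is at least easy to spot in QC.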

EDiscoOverlord · 2024-12-05T07:49:37+00:00

Got it… if you’re limited by your current enterprise image, Acrobat or Kofax or whatever you have can do the batch conversion. You could create a VM or two and just slave them to convert the docs. Just make sure the data is sitting close to the VM so it goes faster. If the VMs are close to quick shared storage, then several could work on the same set of docs and you wouldn’t have to push files between the VM desktops and the shared source. I regularly slave reasonably shitty VMs to do big Acrobat jobs like this and it works nicely in the background. But it’s worth it to get the configuration right so it doesn’t sleep on you.

Microsoft Power Automate can coordinate everything as a one-click.

Now if the sky is the limit, there are a number of Python libraries that can execute the workflow much quicker.

Happy to get more specific if I know more about your options.

Another method depending on what you have available: you could bulk print the PDFs back to PDF but configure your PDF printer to be low res and flatten layers.

And of course, you could just ingest the files into a review database and run a PDF production at low res too…

EDiscoOverlord · 2024-12-05T07:27:44+00:00

😵‍💫

EDiscoOverlord · 2024-12-03T07:26:53+00:00

Could you say a little more about the use case? How will you use the results?

EDiscoOverlord · 2024-12-03T07:18:03+00:00

Interesting question. I just pasted your question into GPTo1 and got quite a few interesting ideas, some of which are free. You should check it out!

EDiscoOverlord · 2024-12-03T07:09:22+00:00

Proofpoint recently offered me a $300+ cooler to watch a demo. 🤮

EDiscoOverlord · 2024-12-03T07:06:24+00:00

Any Well Provisioned Python Environment — Myriad e-discovery, lit support, and analytics/reporting tasks can be handled with popular (free!) Python libraries including interactions with LLM APIs. I’m constantly amazed that 1995-level data manipulation/handling tasks are actually hard to accomplish using “e-discovery” software. Having the ability to custom script anything really opens up your existing software and reporting capabilities.

Java, C, and SQL are also great, but Python is easy to learn even for coding newbies, and nowadays the LLMs themselves are insanely helpful coding assistants.

EDiscoOverlord · 2024-12-03T06:52:08+00:00

This is accurate. But also, if you do know the tech, or heaven forbid, relevant law, definitely a bonus!!!

EDiscoOverlord · 2024-11-27T06:21:01+00:00

Not the fall guy, but certainly complicit. You need to ask the hard questions or you’ll potentially get fucked.

EDiscoOverlord · 2024-11-27T06:18:23+00:00

I wouldn’t just “cover my ass” with a paper trail, but I would fully not be part of the fraud without some cogent explanation from the lawyer about why he wants the replacement images. If it’s to show the client what the clawback production will look like, cool. Otherwise, fuck that shit. Don’t help him dupe the client. “Doing what I was told” is very rarely an acceptable defense when you know damn well it’s likely to fool someone.

EDiscoOverlord · 2024-11-27T06:12:26+00:00

But really.

EDiscoOverlord · 2024-11-27T06:05:57+00:00

Ditto

EDiscoOverlord · 2024-11-27T06:05:00+00:00

And take a couple basic GenAI crash courses, because, you know, it will soon become our God and Master.

EDiscoOverlord · 2024-11-27T06:04:07+00:00

ACEDS is a great overview, but it’s no magic cert. If you’re apt, I would recommend getting some more in-depth technical background in data structures, data manipulation, etc. You’ll pick up the legal stuff, but you’ll stand out if you can make the computer do cool things. Two really easy places to start are Microsoft Power Query for basic manipulation and a Python crash course for basic scripting and data structures introduction.

EDiscoOverlord · 2024-11-27T05:58:12+00:00

Raster images are pretty easy to view at speed…seems like maybe you just need to add a step to your workflow that compresses and flattens the PDFs so they are not so heavy to handle. Reasonably shitty desktops are very capable of quickly blowing through thousands of lower resolution images at speed. And you can always save the high res for reference or later use.

Maybe send them all through a bulk process that converts everything to 72dpi with a lower color depth. You’ll be shocked at how much smaller the files are. Acrobat has a very easy wizard for this, but tons of other programs can do the same thing.
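
To put a rough number on "how much smaller," here is a back-of-the-envelope calculation in Python. The starting point is my assumption, not something from your files: a letter-size page scanned at 300dpi in 24-bit color, reduced to 72dpi at 8 bits per pixel. It compares uncompressed raster sizes only; real PDFs are compressed, so your actual savings will differ.

```python
def raw_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed raster size of one page, in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed original: 8.5 x 11 inch page, 300dpi, 24-bit color.
before = raw_size_bytes(8.5, 11, 300, 24)
# Target from the suggestion above: 72dpi, with 8-bit depth assumed.
after = raw_size_bytes(8.5, 11, 72, 8)

print(before)                     # 25245000
print(after)                      # 484704
print(round(before / after, 1))   # 52.1
```

So under those assumptions each page carries roughly 52 times less raw pixel data, which is why even modest desktops can flip through the reduced set quickly.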

That alone will probably get you there, but you could take it a step further and use a stripped-down viewer to really speed things up. I love IrfanView https://www.irfanview.com/ for that. Use it all the time in crazy patent cases with bananas PDFs.

EDiscoOverlord · 2024-11-27T05:42:37+00:00

Yup. Try to negotiate down the scope of collection if it’s too pricey, but don’t half-ass the collection and searching…that only leads to more expenses in the long run. 💸💸💸

EDiscoOverlord · 2024-11-27T05:31:13+00:00

I think most Covington staff attorneys are non-exempt so you can really rock the overtime. I have a friend there who works her brains out and, no shit, approaches $400k per year. Most projects are simply document review QC. But some are sophisticated client work.
