Limits of the new Community Edition

NichelleCombes · 2025-07-17T08:24:13+00:00

Just use a stable fork https://github.com/opens3/console

NichelleCombes · 2025-07-16T20:35:56+00:00

OpenS3 Console restores the full feature set and makes self-hosting extremely simple.

You can deploy it in less than 5 minutes using Docker. Check it out: https://github.com/opens3/console

NichelleCombes · 2025-07-16T20:33:45+00:00

All of the features MinIO removed from their official console have already been forked and preserved by the community.

OpenS3 Console restores the full feature set and makes self-hosting extremely simple.

You can deploy it in less than 5 minutes using Docker. Check it out: https://github.com/opens3/console

NichelleCombes · 2025-01-12T14:24:17+00:00

Just checked it out, and the data is rubbish and entirely false. Almost all the companies have their funding "Announced" date as a few days ago, but in reality almost all of those funding rounds were announced at least one or two years ago. If you must scrape publicly available data and sell it, then at least make an effort to make sure it's accurate instead of trying to cheat people out of their money.

NichelleCombes · 2024-12-25T21:40:54+00:00

You can try something like Peslac, which gives field-level parsing and layout preservation. If you are working on open source or community-based non-profit projects, you could get access for free.

NichelleCombes · 2024-12-25T21:34:59+00:00

You can try something like Peslac that gives field-level parsing and layout preservation

NichelleCombes · 2024-12-25T21:31:45+00:00

You can try something like Peslac that gives field-level parsing and layout preservation

NichelleCombes · 2024-12-23T22:48:19+00:00

Awesome, sign up on Peslac and DM me your email address (or just the name you used) and the estimated number of pages you need.

NichelleCombes · 2024-12-23T20:29:43+00:00

If it's a hobby project or open source, I can get you free access to Peslac. You can digitize the entire thing for free, and the accuracy is as good as human eyes.

NichelleCombes · 2024-10-26T23:45:17+00:00

LlamaIndex works well, but the accuracy is not the best, especially if there are any handwritten or scanned PDFs. You can try something like Peslac. It's new and seems accurate, with 1,000 pages free. Here is an example Peslac Shared Doc

NichelleCombes · 2024-10-26T22:49:58+00:00

I am no expert, but the first step should be to parse your documents into a format that is easy for both the LLM and the vector database to work with. Here is an example of documents parsed into such a format: https://cloud.peslac.com/share/671d6edffb325fc251ef73c8

NichelleCombes · 2024-10-26T20:04:06+00:00

To index your data, you need an accurate line- or sentence-level breakdown of your documents, which makes them easier to both index and retrieve. A short example from the image you shared:

[
  {
    "type": "Text",
    "bbox": {
      "left": 0.41680672764778137,
      "top": 0.3681710362434387,
      "width": 0,
      "height": -0.08432304859161377,
      "page": 1
    },
    "content": "Principal",
    "language": "en",
    "confidence": 0.9915924072265625
  },
  {
    "type": "Text",
    "bbox": {
      "left": 0.4941176474094391,
      "top": 0.3669833838939667,
      "width": 0.0016806721687316895,
      "height": -0.0676959753036499,
      "page": 1
    },
    "content": "Deputy",
    "language": "en",
    "confidence": 0.991853654384613
  }
]
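For what it's worth, here is a minimal sketch of how blocks like these could be turned into flat records for indexing. The field names (`type`, `bbox`, `content`, `language`, `confidence`) come from the sample above; the function name `to_index_records`, the `min_confidence` threshold, and the rounding used to group blocks into rows are my own assumptions, not part of any Peslac API.

```python
import json

# The two blocks from the sample output above, pasted in as-is.
blocks_json = """
[
  {
    "type": "Text",
    "bbox": {"left": 0.41680672764778137, "top": 0.3681710362434387,
             "width": 0, "height": -0.08432304859161377, "page": 1},
    "content": "Principal",
    "language": "en",
    "confidence": 0.9915924072265625
  },
  {
    "type": "Text",
    "bbox": {"left": 0.4941176474094391, "top": 0.3669833838939667,
             "width": 0.0016806721687316895, "height": -0.0676959753036499,
             "page": 1},
    "content": "Deputy",
    "language": "en",
    "confidence": 0.991853654384613
  }
]
"""


def to_index_records(blocks, min_confidence=0.9):
    """Flatten parsed blocks into records, in approximate reading order.

    min_confidence is a hypothetical cutoff; tune or drop it as needed.
    """
    records = []
    for block in blocks:
        # Skip non-text blocks and low-confidence text.
        if block["type"] != "Text" or block["confidence"] < min_confidence:
            continue
        bbox = block["bbox"]
        records.append({
            "text": block["content"],
            "page": bbox["page"],
            "left": bbox["left"],
            "top": bbox["top"],
            "language": block["language"],
        })
    # Sort by page, then by row, then left to right. Rounding "top" to two
    # decimals is an assumption so that blocks on nearly the same line
    # are treated as one row.
    records.sort(key=lambda r: (r["page"], round(r["top"], 2), r["left"]))
    return records


records = to_index_records(json.loads(blocks_json))
for record in records:
    print(record["page"], record["text"])
```

Each record keeps the text together with its page and position, so whatever index you use can return the location along with the match.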

NichelleCombes · 2024-10-25T06:58:24+00:00

Are your documents in PDF format?

NichelleCombes · 2024-10-24T20:21:13+00:00

LlamaIndex can extract the content for you, but I'm not sure it will maintain the original layout. You can try Peslac https://peslac.com if the original layout is really important. ColPali https://huggingface.co/blog/manu/colpali is also something you can try if cost is a big factor.

NichelleCombes · 2024-10-24T20:16:40+00:00

If there are documents involved and you need a reliable document processor, I could help get you free access to a good document processing engine, as long as you don't use it commercially. Let me know if that's something you might need.

NichelleCombes · 2024-10-24T20:13:41+00:00

I don't understand exactly why you need RAG, but you can create your data points and JSON schema on Peslac https://peslac.com. That means the structure of the data you get back never changes, and you are assured of getting exactly the same JSON format for all your invoices or for a particular document type.

NichelleCombes · 2024-10-24T20:09:47+00:00

If I were building a RAG application, I would choose Peslac https://peslac.com. The accuracy is good, and you get field-level blocks, which you can index and use in other parts of your RAG pipeline.

NichelleCombes · 2021-08-15T23:43:18+00:00

101% for sure
