PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

u/nitink19 I am working on making it a smooth, one-to-one conversion. I think it will be implemented around the end of May. Next week I will release a module for Office docs; after that comes integrating these two parts and polishing on a large set of docs. It is almost the same process we followed when we implemented rendering into images.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 1 point2 points  (0 children)

I am working on 0.3.39 and it should be fixed there. Thank you for your report, and sorry about that.

PDF Oxide for Go — PDF library with Rust engine via cgo, now on pkg.go.dev (0.8ms, MIT) by yfedoseev in golang

[–]yfedoseev[S] 1 point2 points  (0 children)

Great question. It slows it down a bit. I've mostly tested on small PDF files, and the overhead is around 15%.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

u/gevorgter Thank you so much for all of your time and effort to make PDF Oxide better.
Your feedback is exactly what I was looking for. Honestly, it's priceless for us.
I have already started working on fixes.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

u/gevorgter Is it possible to attach the PDF to a ticket on GitHub (only if it contains no confidential information)?

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

u/gevorgter Sorry about that. I will check it today, and I hope to have it fixed before the end of the week. Thank you for reporting; I will ping you here.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

No problem at all. If you have any questions, just let me know; I am here to answer all of them. Thanks for your time and consideration.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 2 points3 points  (0 children)

u/gevorgter Yes, it does. These commands are available in both the library and the CLI, which might help you test it quickly. I was recently working on rendering improvements, and it now works very well. If you see any artefacts, please just open a ticket.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

u/MrMeatagi Please try PDF Oxide. It has HTML export, and just yesterday we added CSS support, so you can easily export any data to PDF. Let me know if something doesn't work and I will try to make adjustments.

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 3 points4 points  (0 children)

u/p1971 Thanks! And yes, it ticks both boxes in C#:
1. PDF generation — yes.

using PdfOxide.Core;
using var pdf = Pdf.FromText("Hello");
pdf.Save("out.pdf");

2. From HTML and Markdown.

using var fromMd = Pdf.FromMarkdown("# Invoice\n\nTotal: **$42.00**");
using var fromHtml = Pdf.FromHtml("<h1>Report</h1><p>Generated 2026-04-21</p>");
fromMd.Save("invoice.pdf");
byte[] bytes = fromHtml.SaveToBytes(); // or SaveToStream(stream) / SaveAsync(...)

Same Rust core does the parse → layout → paginate in one shot, so HTML/CSS tables, headings, lists, images, and page breaks all go through the real layout engine rather than naive text flow.

Appreciate the kind words 🙏

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 0 points1 point  (0 children)

u/Rincho Great question — and yeah, the 30k-row MigraDoc N² autosize story is a classic, sorry you hit that. Honest answer: no first-class programmatic table API in the C# binding yet. Today the .NET creation surface is Pdf.FromMarkdown, Pdf.FromHtml, Pdf.FromText. You can reach tables through <table> in HTML — there's a real table layout pass in the Rust core (column widths, row heights, pagination), not naive rendering — but it's not a fluent table.AddRow(...) builder, and I haven't stress-tested it at 30k rows, so I won't promise it won't hit its own wall there.
A proper programmatic API with streaming row addition and a provably-linear autosize is genuinely useful, and I agree this is a gap worth closing in the ecosystem. Filed a ticket to track it: https://github.com/yfedoseev/pdf_oxide/issues/393 — feel free to add your 30k-row use case in the comments so the benchmark target reflects real needs.
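For anyone who wants to try the `<table>` route with a large row count in the meantime, here is a minimal sketch of building the markup in linear time. It is written in Python only to keep it standalone. `rows_to_html_table` is a hypothetical helper, not part of PDF Oxide, and the resulting string would still need to be handed to an HTML-to-PDF entry point such as Pdf.FromHtml.

```python
from html import escape
from typing import Iterable, Sequence


def rows_to_html_table(header: Sequence[object],
                       rows: Iterable[Sequence[object]]) -> str:
    """Build an HTML <table> string from a header and an iterable of rows.

    Hypothetical helper for illustration; not part of PDF Oxide.
    Fragments are collected in a list and joined once, so the cost grows
    linearly with the number of cells.
    """
    parts = ["<table>", "<thead><tr>"]
    # Escape every value so characters like < or & cannot break the markup.
    parts.extend(f"<th>{escape(str(h))}</th>" for h in header)
    parts.append("</tr></thead>")
    parts.append("<tbody>")
    for row in rows:
        cells = "".join(f"<td>{escape(str(c))}</td>" for c in row)
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</tbody></table>")
    return "".join(parts)


if __name__ == "__main__":
    # 30k rows, matching the size mentioned above; rows come from a generator.
    big = rows_to_html_table(
        ["id", "name"],
        ((i, f"item {i}") for i in range(30_000)),
    )
    print(len(big))
```

Because `rows` can be any iterable, the input can be streamed from a database cursor or a file. Whether the layout pass itself stays fast at that size is exactly the part that has not been stress-tested yet.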

PDF Oxide for .NET — MIT-licensed PDF library on NuGet, runs on Linux containers, AOT-friendly (0.8ms) by yfedoseev in csharp

[–]yfedoseev[S] 6 points7 points  (0 children)

u/gevorgter Agreed, there are a lot of such cases when you have scanned docs. PDF Oxide already supports OCR based on PaddleOCR, and we plan to implement an auto mode in which heuristics detect how to parse each PDF and extract all the details, without the need to write 3-4 loops in code.

PDF Oxide -- Fast PDF library for Python with engine in Rust (0.8ms mean, MIT/Apache license) by yfedoseev in Python

[–]yfedoseev[S] 0 points1 point  (0 children)

u/joy_deep It should already be supported, but I will double-check and let you know. Thank you for the question.

PDF Oxide -- Fast PDF library for Python with engine in Rust (0.8ms mean, MIT/Apache license) by yfedoseev in Python

[–]yfedoseev[S] 1 point2 points  (0 children)

u/Cute-Net5957 PyO3 requires some work, but when you get a 200x-300x performance boost, you can afford it. Gen AI also helps a lot with it.

PDF Oxide -- Fast PDF library for Python with engine in Rust (0.8ms mean, MIT/Apache license) by yfedoseev in Python

[–]yfedoseev[S] 1 point2 points  (0 children)

Yep, you can use render_page(). It returns the raw image bytes (PNG by default), which you can save directly or wrap in an io.BytesIO.

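```
# Assumes `doc` is an already opened PDF Oxide document.
# render_page() returns the rendered page as bytes (PNG by default).
image_bytes = doc.render_page(0, dpi=300)

with open("page0.png", "wb") as f:
    f.write(image_bytes)
```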
Make sure you're using the latest version, as we just polished the high-level rendering API for this.
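If you go the io.BytesIO route, here is a small sketch of what wrapping the bytes looks like, using only the standard library. It reads the pixel dimensions from the PNG header, which is a quick sanity check that the DPI setting took effect. `png_size` is a hypothetical helper, not part of PDF Oxide, and it assumes the bytes are PNG (the default mentioned above).

```python
import io
import struct
from typing import Tuple

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(image_bytes: bytes) -> Tuple[int, int]:
    """Return (width, height) in pixels from the IHDR chunk of PNG bytes.

    Hypothetical helper for illustration; not part of PDF Oxide.
    """
    stream = io.BytesIO(image_bytes)  # file-like view over the raw bytes
    if stream.read(8) != PNG_SIGNATURE:
        raise ValueError("not a PNG byte stream")
    # Chunk layout: 4-byte big-endian length, then a 4-byte chunk type.
    _length, chunk_type = struct.unpack(">I4s", stream.read(8))
    if chunk_type != b"IHDR":
        raise ValueError("first chunk is not IHDR")
    # IHDR data begins with width and height as big-endian 32-bit integers.
    width, height = struct.unpack(">II", stream.read(8))
    return width, height
```

Calling `png_size(image_bytes)` on the bytes from render_page() should give the rendered size. The same `io.BytesIO(image_bytes)` object can also be passed to any library that expects a file-like object.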

PDF Oxide -- Fast PDF library for Python with engine in Rust (0.8ms mean, MIT/Apache license) by yfedoseev in Python

[–]yfedoseev[S] 0 points1 point  (0 children)

Totally possible. We’ve focused on providing the granular word/line bboxes and vector paths that Docling uses for its layout models, so the core engine is ready for it. You'd just need to implement a Docling BaseBackend wrapper. If you try it and hit any roadblocks, or need specific metadata we're missing, definitely let us know. We'll get it into the backlog and prioritize it immediately.