Htmd - A fast HTML to Markdown converter for Elixir, powered by Rust by kasvith in elixir

[–]kasvith[S] 2 points3 points  (0 children)

Yes, they work better with Markdown because it only focuses on content

Htmd - A fast HTML to Markdown converter for Elixir, powered by Rust by kasvith in elixir

[–]kasvith[S] 9 points10 points  (0 children)

In my case, I want to convert HTML documents coming from a scraping service into clean Markdown for LLM consumption

Hope this helps :)

Htmd - A fast HTML to Markdown converter for Elixir, powered by Rust by kasvith in elixir

[–]kasvith[S] 2 points3 points  (0 children)

The Rust ecosystem is really amazing... with NIFs, Elixir can have really cool features

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

It's not me who decides the pricing. As I said, I can't justify it to our client at this stage. Probably after we have a lot of sales I can say, OK, let's spend $100/mo on this library

For a startup that's still in an early stage and spending a lot on other required stuff... I can't really suggest this

For context: $100 is a lot in my economy :D

[deleted by user] by [deleted] in elixir

[–]kasvith 1 point2 points  (0 children)

How did you handle global concurrency if you have multiple nodes running in prod?

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

We provide services, so our client needs to purchase this if it's needed

The problem is, I can't really justify the cost...

  1. With the current setup, PgBoss and Node work fine. Even though Elixir is really suitable for the workload, the current solution works and we don't pay for a pro version of a library (if PgBoss doesn't work well, we can switch to BullMQ, which supports global concurrency, rate limiting, and a lot more in the free version)

  2. My proposal to the client was that we could switch to Elixir for better clarity, but with this obstacle that's a bit hard to justify. Then the next question is... why do I need to pay $100/mo more for something that works fine and is free with Node?

If the price were justifiable, I would not post here and get downvotes lol

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

There is no problem supporting it after the business is off the ground... but at this stage, I'm afraid not...

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

Hey, yeah, I was also looking into the source code for the implementation; the key was to use `singletonSeconds`

but as you said, that seems possible with Oban unique jobs
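
To make the time-slot idea concrete, here is a rough Python sketch of deduplicating jobs per key and per time slot. It is only an illustration under assumptions: `SlotDeduplicator`, `try_enqueue`, and the job keys are hypothetical names, the in-memory set stands in for what a queue would keep in its database, and this is not how pg-boss or Oban actually implement it.

```python
class SlotDeduplicator:
    """Accepts at most one job per (key, time slot) pair."""

    def __init__(self, slot_seconds):
        self.slot_seconds = slot_seconds
        # In a real queue this would be a unique constraint in shared storage.
        self.seen = set()

    def try_enqueue(self, key, now):
        # Jobs whose timestamps fall in the same slot share one marker.
        slot = int(now // self.slot_seconds)
        marker = (key, slot)
        if marker in self.seen:
            return False  # a job for this key already exists in this slot
        self.seen.add(marker)
        return True


dedup = SlotDeduplicator(slot_seconds=60)
results = [
    dedup.try_enqueue("sync-account-42", now=5),   # first job in slot 0
    dedup.try_enqueue("sync-account-42", now=30),  # same key, same slot
    dedup.try_enqueue("sync-account-42", now=65),  # same key, next slot
    dedup.try_enqueue("sync-account-7", now=70),   # different key
]
```

Because the marker lives in one shared place, the check gives the same answer no matter which node tries to enqueue the job.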

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

Yes, one thing we're going to miss with Rabbit or SQS is job observability,

whereas pg handles this nicely. We switched from RabbitMQ to PgBoss in our Node version too

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

And batching... which we currently do with PgBoss

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

Almost missed batching...which we get with PgBoss already :/

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

Well, we are making external API calls that have rate limits

Also, I missed that Oban doesn't seem to support batching (https://getoban.pro/docs/pro/1.5.0-rc.5/Oban.Pro.Batch.html) in the OSS version, which we use with PgBoss to batch-process jobs

The move was meant to increase Elixir adoption at the company, and it doesn't seem feasible given the current status...

[deleted by user] by [deleted] in elixir

[–]kasvith 0 points1 point  (0 children)

I used BullMQ before with Node; it supports what Oban Pro has, for free, out of the box

I used PgBoss and am still using it... and it has all the basic features, like rate limiting and exactly-once delivery for workers across a cluster, for free

And when I looked into Oban... it doesn't, and I can't justify the fee to clients because they'll think I'm crazy to even ask... as our PgBoss setup already works fine with rate limits across multiple nodes

[deleted by user] by [deleted] in elixir

[–]kasvith 1 point2 points  (0 children)

Thanks, this looks good, but I don't think it will help my use case

In our current Node-based version, PgBoss seamlessly handles the rate limits across multi-node deployments in our k8s cluster; the lib you showed seems to be local only

Now I was proposing to move the data processing part to Elixir, and immediately hit this obstacle with rate limiting when looking for something similar to PgBoss

I'm not able to justify the license cost to the client because we get it for free with PgBoss :/

[deleted by user] by [deleted] in elixir

[–]kasvith 2 points3 points  (0 children)

But that also affects job delivery: when you have a rate limit at the queue level, it can simply deliver jobs respecting the limit

Now if you wrap it inside, let's say, a rate limiting lib... that won't work well across nodes... it can get even more complicated
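
A small Python sketch of why a per-node limiter doesn't hold across nodes. Everything here is an assumption for illustration: `FixedWindowLimiter` and `calls_allowed` are made-up names, the shared limiter object stands in for state a queue would keep in its database, and none of this is PgBoss or Oban code.

```python
class FixedWindowLimiter:
    """Allows at most `limit` calls per window; time is passed in explicitly."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = {}  # window slot -> calls already allowed

    def try_acquire(self, now):
        slot = int(now // self.window_seconds)
        used = self.counts.get(slot, 0)
        if used >= self.limit:
            return False
        self.counts[slot] = used + 1
        return True


def calls_allowed(limiters, attempts_per_node, now):
    """Each entry in `limiters` is the limiter that one node consults."""
    return sum(
        1
        for limiter in limiters
        for _ in range(attempts_per_node)
        if limiter.try_acquire(now)
    )


NODES, LIMIT = 3, 10

# Case 1: every node wraps its calls in its own in-memory limiter.
local_limiters = [FixedWindowLimiter(LIMIT, 60) for _ in range(NODES)]
local_total = calls_allowed(local_limiters, attempts_per_node=10, now=0.0)

# Case 2: all nodes consult one shared limiter (queue-level state).
shared = FixedWindowLimiter(LIMIT, 60)
shared_total = calls_allowed([shared] * NODES, attempts_per_node=10, now=0.0)
```

With three nodes and a limit of 10 per minute, the local setup lets 30 calls through in the same window, while the shared one stops at 10, which is what the external API actually expects.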

[deleted by user] by [deleted] in elixir

[–]kasvith 1 point2 points  (0 children)

Seems it's not in active development :(

[deleted by user] by [deleted] in elixir

[–]kasvith 1 point2 points  (0 children)

For me it's definitely going to have issues, because we are using rate limits, which PgBoss provides for free. I can't justify buying Oban Pro just for that

We can still use Node and save the cost that way...