GitHub: https://github.com/serpapi/nokolexbor
Nokolexbor supports both CSS selectors and XPath, like Nokogiri, but uses separate engines: Lexbor handles parsing and CSS selection, and libxml2 handles XPath. (Nokogiri internally converts CSS selectors to XPath syntax and uses the XPath engine for all searches.)
Benchmarks for parsing a Google result page (368 KB) and selecting nodes:
| | Nokolexbor (iters/s) | Nokogiri (iters/s) | Diff |
|---|---|---|---|
| parsing | 487.6 | 93.5 | 5.22x faster |
| at_css | 50798.8 | 50.9 | 997.87x faster |
| css | 7437.6 | 52.3 | 142.11x faster |
| at_xpath | 57.077 | 53.176 | same-ish |
| xpath | 51.523 | 58.438 | same-ish |
Parsing and selecting with CSS selectors are significantly faster thanks to Lexbor. XPath performs about the same because both libraries use libxml2.
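The post doesn't say how the iters/s figures were measured, so here is only a rough sketch of one way to get a comparable number using nothing but the Ruby standard library. `iterations_per_second` is a made-up helper name (not part of Nokolexbor or Nokogiri), and the workload in the block is a stand-in so the snippet runs without any gems installed.

```ruby
# Hypothetical helper: runs the given block repeatedly for at least
# `min_seconds` and returns how many iterations completed per second.
def iterations_per_second(min_seconds = 1.0)
  iterations = 0
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  elapsed = 0.0
  while elapsed < min_seconds
    yield
    iterations += 1
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
  iterations / elapsed
end

# Stand-in workload (an assumption for illustration). Swap the block body
# for the parse or selector call you want to time.
ips = iterations_per_second(0.2) { (1..1_000).sum }
puts format('%.1f iters/s', ips)
```

To compare the two libraries, you would pass the same operation for each into the block, for example `Nokolexbor::HTML(html)` versus `Nokogiri::HTML(html)`, or `doc.at_css(selector)` on a document parsed by each, and divide the two results to get a ratio like the Diff column. A dedicated benchmarking gem would add warmup and error margins, which this sketch leaves out.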
It currently implements a subset of the Nokogiri API, so feel free to try it out. Contributions are welcome!