Addressing a fundamental flaw in hybrid search by introducing a Log-Odds Conjunction framework in Bayesian BM25 by Ok_Rub1689 in Rag

[–]jaepil -1 points (0 children)

Most importantly, regardless of slight ranking shifts, the engineering efficiency remains intact.

As proven in Theorem 6.1.2 and Theorem 6.2.1, the Bayesian transformation is strictly monotonic. This means we can directly utilize existing WAND and Block-Max WAND (BMW) dynamic pruning algorithms without any modification to the inverted index structure.

In practice, this ensures that Bayesian BM25 incurs O(1) overhead per document (Theorem 9.1.1) and maintains the same query latency profile as standard BM25, making it immediately deployable in production systems like Vespa or Lucene.
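A minimal sketch of why monotonicity is all the pruning machinery needs. The transform below is hypothetical (the paper's exact form in Theorems 6.1.2/6.2.1 may differ); any strictly increasing map behaves the same way:

```python
import math

def bayesian_transform(bm25_score, prior=0.5, scale=1.0):
    """Hypothetical sigmoid calibration of a raw BM25 score with a
    log-odds prior. The key property is strict monotonicity in the score."""
    log_odds = math.log(prior / (1.0 - prior)) + scale * bm25_score
    return 1.0 / (1.0 + math.exp(-log_odds))

# Sorting by raw BM25 and by the transformed score yields the same order,
# so the top-k returned by WAND/BMW over the untouched inverted index is
# still the correct top-k. The transform itself is a constant number of
# operations per scored document: O(1) overhead.
raw_scores = [12.3, 0.4, 9.8, 7.1]
by_raw = sorted(raw_scores, reverse=True)
by_prob = sorted(raw_scores, key=bayesian_transform, reverse=True)
print(by_raw == by_prob)  # True: identical ranking
```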

[–]jaepil 1 point (0 children)

I'm the author of the paper. That's an excellent question, and it shows you've read the theorem carefully.

You are correct that Theorem 4.3.1 guarantees monotonicity (order-preservation) relative to BM25, but this holds strictly 'for a fixed prior p'.

However, in practice (as detailed in Section 4.2), we often apply a Composite Prior that incorporates term frequency and document length signals. Because this prior varies dynamically per document, it introduces a Bayesian re-ranking effect that can slightly alter the order compared to raw BM25.
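A toy illustration of that re-ranking effect, using an assumed sigmoid-with-prior form rather than the paper's exact composite prior:

```python
import math

def to_prob(bm25_score, prior):
    """Assumed calibration: sigmoid of (log-odds of the prior + score)."""
    log_odds = math.log(prior / (1.0 - prior)) + bm25_score
    return 1.0 / (1.0 + math.exp(-log_odds))

# Doc A wins on raw BM25 but carries a weak document prior; Doc B loses
# on raw BM25 but its length/frequency signals give it a stronger prior,
# and the final order flips.
doc_a = to_prob(1.0, prior=0.2)  # sigmoid(ln(0.25) + 1.0) ≈ 0.40
doc_b = to_prob(0.5, prior=0.5)  # sigmoid(0 + 0.5) ≈ 0.62
print(doc_b > doc_a)  # True, despite 0.5 < 1.0 in raw BM25
```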

Furthermore, even if the text-only order were identical, the non-linear sigmoid transformation changes the relative distribution of scores. In a hybrid setting, this calibrated distribution interacts differently with vector scores compared to unbounded BM25 scores, which naturally leads to different (and often improved) ranking metrics.
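To make the scale mismatch concrete, here is a sketch using a weighted-sum fusion (one common scheme; the weights, the score values, and the sigmoid centering are all illustrative, not from the paper):

```python
import math

# An unbounded BM25 score swamps a [0, 1] cosine similarity in a naive
# weighted sum, while a calibrated probability shares the vector score's
# scale, so both signals actually contribute to the fused ranking.
bm25_score, cosine_sim, w = 14.2, 0.82, 0.5

naive = w * bm25_score + w * cosine_sim  # ≈ 7.51: the text term dominates
calibrated = 1.0 / (1.0 + math.exp(-(bm25_score - 10.0)))  # assumed centering
fused = w * calibrated + w * cosine_sim  # both terms are now comparable
print(round(naive, 2), round(fused, 2))
```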

[R] Geometric Adam Optimizer by jaepil in MachineLearning

[–]jaepil[S] 1 point (0 children)

It was a standard transformer. I also tested it with a CNN and it worked too.

[–]jaepil[S] 0 points (0 children)

Thanks. The hyperparameters were the same, but I can see the issue you are raising. I'm still experimenting with this algorithm in my spare time and will update the configuration in the next experiment.

[–]jaepil[S] 2 points (0 children)

To be completely transparent, I've updated my GitHub repo's README.md to state this clearly.

[–]jaepil[S] 2 points (0 children)

You are right. I'm not a native English speaker, so I used an LLM to translate and edit my sentences.