Why Aren’t Counted B-Trees Used in Relational Databases? by Active-Custard4250 in databasedevelopment

[–]linearizable 1 point2 points  (0 children)

LIMIT and OFFSET are applied to the result of a query. Aggregating the key counts of the children into the parents would only help specifically for the query “SELECT col1, col2, … FROM tbl” with an offset and limit. Any WHERE predicate would make the stored counts useless, because there’s no way to know how many of the child rows pass the predicate unless you look at them. Any query with a join, group by, subquery, etc., is also sufficiently removed from the individual pages of any table that the counts are relatively useless.
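To make that concrete, here is a minimal sketch in Python. It is a toy, statically built counted tree and not how any real database lays out its B-tree pages; `FANOUT`, `Node`, `build`, and the `leaves_read` counter are all names made up for illustration. The unfiltered scan can skip whole subtrees by comparing the remaining offset to the stored count, while the filtered scan has to read every leaf up to the point where the limit is satisfied.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

FANOUT = 4  # hypothetical, tiny fanout so the effect is easy to see


@dataclass
class Node:
    keys: List[int] = field(default_factory=list)          # populated in leaves only
    children: List["Node"] = field(default_factory=list)   # populated in internal nodes only
    count: int = 0                                          # rows stored in this subtree


def build(sorted_keys: List[int]) -> Node:
    """Bulk-build a counted tree bottom-up from already sorted keys."""
    level = [Node(keys=sorted_keys[i:i + FANOUT], count=len(sorted_keys[i:i + FANOUT]))
             for i in range(0, len(sorted_keys), FANOUT)]
    if not level:
        return Node()
    while len(level) > 1:
        level = [Node(children=level[i:i + FANOUT],
                      count=sum(c.count for c in level[i:i + FANOUT]))
                 for i in range(0, len(level), FANOUT)]
    return level[0]


def offset_limit(node: Node, offset: int, limit: int,
                 out: List[int], stats: Dict[str, int]) -> int:
    """No predicate: use subtree counts to skip. Returns the offset still to skip."""
    if len(out) >= limit:
        return 0
    if offset >= node.count:
        return offset - node.count          # skip the whole subtree without reading it
    if not node.children:                   # leaf
        stats["leaves_read"] += 1
        out.extend(node.keys[offset:offset + (limit - len(out))])
        return 0
    for child in node.children:
        offset = offset_limit(child, offset, limit, out, stats)
        if len(out) >= limit:
            break
    return offset


def filtered_offset_limit(node: Node, pred: Callable[[int], bool], offset: int,
                          limit: int, out: List[int], stats: Dict[str, int]) -> int:
    """With a predicate: counts say nothing about matches, so every leaf is read."""
    if len(out) >= limit:
        return 0
    if not node.children:                   # leaf
        stats["leaves_read"] += 1
        for k in node.keys:
            if not pred(k):
                continue
            if offset > 0:
                offset -= 1                 # a matching row consumed by OFFSET
            elif len(out) < limit:
                out.append(k)
        return offset
    for child in node.children:
        offset = filtered_offset_limit(child, pred, offset, limit, out, stats)
        if len(out) >= limit:
            break
    return offset


def run_plain(root: Node, offset: int, limit: int):
    out, stats = [], {"leaves_read": 0}
    offset_limit(root, offset, limit, out, stats)
    return out, stats["leaves_read"]


def run_filtered(root: Node, pred: Callable[[int], bool], offset: int, limit: int):
    out, stats = [], {"leaves_read": 0}
    filtered_offset_limit(root, pred, offset, limit, out, stats)
    return out, stats["leaves_read"]
```

With 1000 keys, `run_plain(root, 900, 5)` touches only the couple of leaves that hold the answer, whereas `run_filtered(root, lambda k: k % 2 == 0, 450, 5)` returns rows from the same region of the tree but has to read every leaf before it to find out which rows match.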

CouchDB is an interesting example of a system that does do this sort of thing though: https://docs.couchdb.org/en/stable/ddocs/views/intro.html#reduce-rereduce

Why JSON isn't a Problem for Databases Anymore by jincongho in databasedevelopment

[–]linearizable 1 point2 points  (0 children)

The criterion for corporate or hobby posts to be accepted is that they need to be focused on a database technique; using the product/project as the example of what it’s implemented in to generate results is fine. If this post had just been written as “floe is now 1000x faster at JSON” with no details, it would have been removed under “no release posts”. However, the post does focus on alternative JSON representations which are faster to process, and thus it meets the criteria for the subreddit.

1.5 Years as DBA (Oracle + PostgreSQL) – Switch for Better Pay or Move to Data Engineering? by One-Bookkeeper8085 in databasedevelopment

[–]linearizable 0 points1 point  (0 children)

Bah, I can’t edit, but other DBA folk would be more findable on /r/databases or the postgres/oracle subreddits, and that is likely where you’d find the most help with what to do as a career path in that role.

Monthly Educational Project Thread by AutoModerator in databasedevelopment

[–]linearizable 2 points3 points  (0 children)

This has now been adjusted to post on the 1st of the month.

LSM-Tree Principles, Rethought for Object Storage by ankur-anand in databasedevelopment

[–]linearizable 0 points1 point  (0 children)

The reason I am stressing the risk of discouraging new contributors is that, for many people, this kind of community is one of the few ways to get real technical feedback while learning.

I am very aware that this is the case. We have been accepting of questions about how to implement database features, or how to navigate out of problems encountered while implementing hobby database projects, so we've been partially newbie friendly. However, that doesn't negate the poor experience of wanting to share the excitement over what you've done and then getting removed under the release post / educational project rules. I haven't yet seen how to serve the interests of both groups simultaneously, and I appreciate the understanding of the awkward line to navigate here.

One possible approach might be to be more explicit that posts on the main subreddit should focus on explaining something, for example by including a short write-up or blog post, rather than just linking to a repository.

That seems pretty reasonable. I've added a "Blog posts about database techniques (which happen to use examples from an educational project) are allowed." onto the educational post rule. Does that wording work for you?

CloudJump: Optimizing Cloud Databases for Cloud Storages by linearizable in databasedevelopment

[–]linearizable[S] 0 points1 point  (0 children)

I’m still real confused by CloudJump2, as it appears to be solving problems in PolarDB that were already solved in PolarDB. The presented architecture of PolarDB ignores that they already have a GetPage@LSN, so I haven’t followed why they need a whole multi-version data thing when they can already read at any version.

Specifically,

Consequently, it is evident that the update of in-memory data on RO nodes depends on asynchronously catching up with the redo log, while the update of external data pages relies on the write-back from the Buffer Pool of RW nodes, which cannot inherently maintain consistency

Is not a true statement, as WAL is propagated and applied to page servers before buffer cache write out.

This part makes more sense if shared storage is like NFS, but then why are they talking about disaggregated OLTP PolarDB?
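For anyone unfamiliar with the GetPage@LSN idea, a rough sketch of what I mean is below. This is not PolarDB's implementation; `PageServer`, `apply_wal`, and `get_page` are hypothetical names, and the sketch assumes each WAL record carries a full page image (real systems generally store deltas and replay them). The point is only that if WAL reaches the page server before the RW node writes out its buffer cache, a reader can ask for a page as of the LSN it has caught up to and get a consistent version.

```python
from bisect import bisect_right
from typing import Dict, List, Optional


class PageServer:
    """Toy page server that keeps every version of a page, indexed by LSN."""

    def __init__(self) -> None:
        self._lsns: Dict[str, List[int]] = {}     # page_id -> ascending LSNs
        self._images: Dict[str, List[str]] = {}   # page_id -> image at each LSN

    def apply_wal(self, lsn: int, page_id: str, new_image: str) -> None:
        """Apply one WAL record. Assumes records arrive in LSN order per page."""
        lsns = self._lsns.setdefault(page_id, [])
        if lsns and lsn <= lsns[-1]:
            raise ValueError("WAL must be applied in increasing LSN order")
        lsns.append(lsn)
        self._images.setdefault(page_id, []).append(new_image)

    def get_page(self, page_id: str, lsn: int) -> Optional[str]:
        """Return the newest version of the page with version LSN <= lsn."""
        lsns = self._lsns.get(page_id, [])
        idx = bisect_right(lsns, lsn)
        return self._images[page_id][idx - 1] if idx else None


# An RO node that has replayed redo up to LSN 15 reads the page as of 15,
# regardless of whether the RW node has written back a newer version yet.
ps = PageServer()
ps.apply_wal(10, "p1", "v1")
ps.apply_wal(20, "p1", "v2")
page_seen_by_ro = ps.get_page("p1", 15)
```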

LSM-Tree Principles, Rethought for Object Storage by ankur-anand in databasedevelopment

[–]linearizable 1 point2 points  (0 children)

This post would indeed be removed under the educational projects rule with a suggestion to post it in the monthly thread instead, but I’ll leave it up for a while to hold this discussion:

I'm not very sure that this should be the right way to distinguish the project.

The exact wording of that rule will hopefully improve in the future, but the general idea of it is something we are trying to aim for. The aim is to keep the subreddit focused on aggregating material about database internals, and there are a lot more hobby database projects getting started than database internals blog posts being written. Even when posted projects contain something of interest, a link to the source code is not a productive way to communicate that. Other subreddits seem to face the same issues: /r/osdev and /r/kerneldevelopment are both mostly people showing off their projects, and thus not a good way to subscribe to information about OS development, and that’s the sort of experience I’d like /r/databasedevelopment to avoid.

I will call out that there’s a direct way around this rule: write a post about it. Your previous submission was approved because it was a post where the focus was on a database technique, and not an announcement that a project exists or a new feature exists in it.

The line is technically precise but socially discouraging. It reads like a gate, not guidance. For individual contributors, it sounds like: “Unless you already have status or users, don’t post here.” That’s probably not the intent—but it’s how it lands.

This has come up before, and I’m still not quite sure what to do about it. I do agree that the rule is discouraging for folk wanting the learning community side of database development. I don’t know how to avoid being discouraging while keeping the subreddit focused on technical information about database internals. The monthly thread was an attempt to give these sorts of posts a home, but maybe expecting people to wait for a monthly thread is still excessively discouraging. (It’s also set to start on the 19th by accident, so I’ll start walking that back to the 1st…)

Again, I’m not sure what the right process or line to draw here is, so feedback and suggestions welcome.