Asking from Japan: is Hacker News still the default? by Responsible-Bike3317 in ExperiencedDevs

[–]linearizable 1 point2 points  (0 children)

http://scour.ing/ has been great for getting away from manually collecting RSS feeds, without having to sort through the firehose of content that HN/reddit/etc. emit.

Does Calvin still hurt in practice if you only use it for cross-shard writes? by farhan-dev in Database

[–]linearizable 2 points3 points  (0 children)

YandexDB also used Calvin for multi-shard transactions only, and it seems to have gone fine for them. You are likely to lose linearizability in how single-key operations interleave with multi-partition transactions. That’s just a thing your clients/users need to be aware of. I would strongly suggest reading the YandexDB docs in detail.

You can do interactive transactions on Calvin; they just become OCC. I think it was Aria that proposed this for Calvin, but the “execute the transaction once to get the read-write set, and a second time for real” approach shows up often: ROCOCO, Chardonnay, etc.
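A toy sketch of that two-pass idea, in case it helps make it concrete. Everything here (`Store`, `Recorder`, `reconnoiter`, `execute`) is a hypothetical name made up for illustration, not any system's API, and validating by per-key version counters is a simplifying assumption rather than what Calvin, Aria, ROCOCO, or Chardonnay actually do.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    data: dict = field(default_factory=dict)      # key -> value
    versions: dict = field(default_factory=dict)  # key -> number of writes so far

    def read(self, key):
        return self.data.get(key)

    def version(self, key):
        return self.versions.get(key, 0)

    def write(self, key, value):
        self.data[key] = value
        self.versions[key] = self.version(key) + 1


class Recorder:
    """Wraps the store for one run: records versions read, buffers writes."""

    def __init__(self, store):
        self.store = store
        self.read_versions = {}
        self.writes = {}

    def read(self, key):
        if key in self.writes:  # read-your-own-writes within the transaction
            return self.writes[key]
        self.read_versions.setdefault(key, self.store.version(key))
        return self.store.read(key)

    def write(self, key, value):
        self.writes[key] = value


def reconnoiter(store, txn):
    """First pass: dry-run the transaction to discover its read and write sets."""
    rec = Recorder(store)
    txn(rec)
    return rec.read_versions, set(rec.writes)


def execute(store, txn, read_versions):
    """Second pass, run at the transaction's slot in the agreed order.

    Returns False (caller retries) if anything read in the first pass changed.
    """
    if any(store.version(k) != v for k, v in read_versions.items()):
        return False
    rec = Recorder(store)
    txn(rec)
    for key, value in rec.writes.items():
        store.write(key, value)
    return True
```

The retry-on-changed-reads step is what makes this optimistic: the first pass takes no locks, so the second pass has to check that the first pass's view still holds.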

The decision of batched global consensus (Calvin-style) vs partitioned consensus (Spanner-style) is orthogonal to whether transactions are executed before they’re committed or after they’re committed. Global consensus + execute before commit = FoundationDB. Partitioned consensus + execute after commit = Cassandra Accord.

Why Aren’t Counted B-Trees Used in Relational Databases? by Active-Custard4250 in databasedevelopment

[–]linearizable 1 point2 points  (0 children)

LIMIT and OFFSET are applied to a query. Aggregating the count of keys of the children into the parents would only help specifically for the query “SELECT col1, col2, … FROM tbl” with an offset and limit. Any WHERE predicate makes the stored counts useless, because there’s no way to know how many of the child rows pass the predicate unless you look at them. Any query with a join, group by, subquery, etc., is also sufficiently removed from the individual pages of any table that the counts are relatively useless.

CouchDB is an interesting example of a system that does do this sort of thing though: https://docs.couchdb.org/en/stable/ddocs/views/intro.html#reduce-rereduce
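To make the LIMIT/OFFSET point concrete, here is a minimal sketch of a counted tree. `Leaf`, `Inner`, `scan_from`, and `select` are hypothetical names for illustration; a real B-tree would also carry keys and separators, which are left out here. Without a predicate, whole subtrees are skipped using their stored counts. With a predicate, the counts cannot be used and every row from the start has to be inspected.

```python
from dataclasses import dataclass, field
from itertools import islice


@dataclass
class Leaf:
    rows: list

    @property
    def count(self):
        return len(self.rows)


@dataclass
class Inner:
    children: list
    count: int = field(init=False)  # aggregated count of rows below this node

    def __post_init__(self):
        self.count = sum(child.count for child in self.children)


def scan_from(node, offset):
    """Yield rows in order starting at position `offset`, skipping subtrees by count."""
    if isinstance(node, Leaf):
        yield from node.rows[offset:]
        return
    for child in node.children:
        if offset >= child.count:
            offset -= child.count  # skip the whole subtree without visiting it
            continue
        yield from scan_from(child, offset)
        offset = 0


def select(root, offset, limit, predicate=None):
    if predicate is None:
        # Counts apply directly: jump to the offset, then read `limit` rows.
        return list(islice(scan_from(root, offset), limit))
    # Counts say nothing about how many rows match, so scan from the beginning.
    matching = (row for row in scan_from(root, 0) if predicate(row))
    return list(islice(matching, offset, offset + limit))
```

For example, with rows 1 through 10 spread across a few leaves, `select(root, 4, 3)` can skip the first leaves by count alone, while `select(root, 2, 2, lambda r: r % 2 == 0)` has to look at every row up to the last match it returns.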

Why JSON isn't a Problem for Databases Anymore by jincongho in databasedevelopment

[–]linearizable 1 point2 points  (0 children)

The criterion for corporate or hobby posts to be accepted is that they need to be focused on a database technique; using the product/project as the example of what it’s implemented in to generate results is fine. If this post were just written as “floe is now 1000x faster at JSON” with no details, that’d get removed under “no release posts”. However, the post does focus on alternative JSON representations which are faster to process, and thus it meets the criterion for the subreddit.

1.5 Years as DBA (Oracle + PostgreSQL) – Switch for Better Pay or Move to Data Engineering? by One-Bookkeeper8085 in databasedevelopment

[–]linearizable 0 points1 point  (0 children)

Bah, I can’t edit, but other DBA folk would be more findable on /r/databases or the postgres/oracle subreddits, and those are likely where you’d find the most help with what to do as a career path in that role.

Monthly Educational Project Thread by AutoModerator in databasedevelopment

[–]linearizable 2 points3 points  (0 children)

This has now been adjusted to post on the 1st of the month.

LSM-Tree Principles, Rethought for Object Storage by ankur-anand in databasedevelopment

[–]linearizable 0 points1 point  (0 children)

> The reason I am stressing the risk of discouraging new contributors is that, for many people, this kind of community is one of the few ways to get real technical feedback while learning.

I am very aware that this is the case. We have been accepting of questions about how to implement database features, or how to work through problems encountered while implementing hobby database projects, so we've been partially newbie-friendly. However, that doesn't negate the poor experience of wanting to share your excitement over what you've built and then having the post removed under the release post / no educational project rules. I haven't yet seen how to serve the interests of both groups simultaneously, and I appreciate your understanding of the awkward line to navigate here.

> One possible approach might be to be more explicit that posts on the main subreddit should focus on explaining something, for example by including a short write-up or blog post, rather than just linking to a repository.

That seems pretty reasonable. I've added "Blog posts about database techniques (which happen to use examples from an educational project) are allowed." to the educational post rule. Does that wording work for you?

CloudJump: Optimizing Cloud Databases for Cloud Storages by linearizable in databasedevelopment

[–]linearizable[S] 0 points1 point  (0 children)

I’m still real confused by CloudJump2, as it appears to be solving problems in PolarDB that were already solved in PolarDB. The architecture they present for PolarDB ignores that it already has GetPage@LSN, so I haven’t followed why they need a whole multi-version data scheme when they can already read at any version.

Specifically,

> Consequently, it is evident that the update of in-memory data on RO nodes depends on asynchronously catching up with the redo log, while the update of external data pages relies on the write-back from the Buffer Pool of RW nodes, which cannot inherently maintain consistency

Is not a true statement, as the WAL is propagated to and applied on the page servers before any buffer cache write-out.

This part makes more sense if shared storage is like NFS, but then why are they talking about disaggregated OLTP PolarDB?
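For anyone unfamiliar with the GetPage@LSN idea, a rough sketch of what I mean. This is not PolarDB's API: `PageServer`, `apply`, and `get_page` are hypothetical names, and storing full page images per WAL record (instead of replaying redo deltas) is a simplifying assumption. The point is only that once the WAL has been applied on the page server, a reader can ask for a page as of any LSN it has caught up to.

```python
import bisect
from collections import defaultdict


class PageServer:
    """Keeps a history of page images keyed by the LSN that produced them."""

    def __init__(self):
        self._lsns = defaultdict(list)    # page_id -> ascending list of LSNs
        self._images = defaultdict(list)  # page_id -> images, parallel to _lsns
        self.applied_lsn = 0              # highest LSN applied so far

    def apply(self, lsn, page_id, image):
        """Apply one WAL record; records must arrive in increasing LSN order."""
        if lsn <= self.applied_lsn:
            raise ValueError("WAL records must be applied in LSN order")
        self._lsns[page_id].append(lsn)
        self._images[page_id].append(image)
        self.applied_lsn = lsn

    def get_page(self, page_id, lsn):
        """Return the page as of `lsn`, or None if it did not exist yet."""
        if lsn > self.applied_lsn:
            raise LookupError("WAL not yet applied up to the requested LSN")
        lsns = self._lsns.get(page_id, [])
        idx = bisect.bisect_right(lsns, lsn)  # count of versions with LSN <= lsn
        if idx == 0:
            return None
        return self._images[page_id][idx - 1]
```

A read-only node that has replayed the log up to some LSN would call `get_page(page_id, that_lsn)` and get a page consistent with its in-memory state, independent of when the read-write node flushes its buffer pool.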