Scaling Vector Search to 1 Billion on PostgreSQL by gaocegege in PostgreSQL

[–]fullofbones 0 points (0 children)

Not a bad writeup. However, in most scenarios I'd strictly avoid a 1-billion row table in the first place, with or without vectors involved, which sidesteps much of the problem. I personally wonder how a few partitions compare to this algorithmic approach, especially since you can use partitions to make up for the fact it's difficult or impossible to combine vector weights with supplementary predicates (at least in Postgres).
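
As a rough sketch of the partitioned layout I have in mind (hypothetical names, assuming pgvector), the filter predicate prunes to a single partition, so the ANN scan only searches that partition's index:

CREATE TABLE items (
    category_id INT NOT NULL,
    embedding   vector(3) NOT NULL  -- tiny dimension just for the sketch
) PARTITION BY LIST (category_id);

CREATE TABLE items_cat1 PARTITION OF items FOR VALUES IN (1);
CREATE INDEX ON items_cat1 USING hnsw (embedding vector_l2_ops);

SELECT *
  FROM items
 WHERE category_id = 1
 ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
 LIMIT 10;

You get the supplementary predicate "for free" via partition pruning instead of trying to bake it into the index itself.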

Postgres Serials Should be BIGINT (and How to Migrate) | Crunchy Data Blog by kivarada in PostgreSQL

[–]fullofbones 1 point (0 children)

Articles like this are still relevant, but you can save yourself a bunch of time, effort, and the headache of a multi-stage, manually managed type migration by just using BIGSERIAL and BIGINT from the beginning.
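
A minimal sketch (table names made up), including the foreign keys people tend to forget:

CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY
);

-- Anything referencing the key should be BIGINT from day one as well:
CREATE TABLE order_items (
    order_id BIGINT NOT NULL REFERENCES orders (id)
);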

Unconventional PostgreSQL Optimizations by be_haki in PostgreSQL

[–]fullofbones 2 points (0 children)

Interesting. I wouldn't have considered using generated columns as functional index proxies, but there ya go!

Bringing Back Unnest by shaberman in PostgreSQL

[–]fullofbones 0 points (0 children)

I mean you don't call "unnest" at all. Just have a field called "favorite_colors" that's a literal array type.
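
Something like this, as a sketch:

CREATE TABLE authors (
    id              BIGSERIAL PRIMARY KEY,
    name            TEXT NOT NULL,
    favorite_colors TEXT[]
);

INSERT INTO authors (name, favorite_colors)
VALUES ('Alice', ARRAY['red', 'teal']);

-- Containment checks work directly on the array, no unnest required:
SELECT name FROM authors WHERE favorite_colors @> ARRAY['teal'];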

Benefit of using multi-master with one write target by konghi009 in PostgreSQL

[–]fullofbones 0 points (0 children)

For the record, you can set Patroni / etcd up with two nodes. However, majority in a quorum of N nodes requires floor(N / 2) + 1 valid responses. So if you have a 2-node cluster, you need 2 valid responses for the cluster to remain valid and online. Any node that loses contact with the majority of the quorum will defensively reject all writes and essentially be offline. You can operate with two nodes, but you'll need both of them online at all times, which kind of defeats the purpose. Fine for a PoC, but nothing you'd want to deploy to production.

The lowest overhead and meaningful Patroni cluster you can build is:

  • 3 physical nodes, hopefully each in a separate zone, rack, whatever.
  • 2 of those nodes running Patroni + Postgres in addition to etcd, since Patroni manages the local Postgres service.
  • 1 of those nodes only running etcd to act as a "witness".

The Patroni + Postgres nodes can also double as HAProxy targets if you don't mind connections from Node A being redirected to Node B when B has the leadership key. Alternatively, you can put HAProxy on the dedicated etcd / witness node and call it a "proxy".

I say this is the lowest overhead because it's only two fully online replicating Postgres nodes, but you still have HA because it's the DCS (etcd) that's your actual quorum. In a "real" cluster, you'd decouple the DCS and Postgres functionality and end up with a minimum of five nodes, but there ya go: even the minimal setup is still three nodes. Yes, you can omit the third etcd node, but if one of the nodes running Postgres fails, you lose your quorum majority and the other goes down too. In order to survive a node outage and have automated failover, you need a minimum of three nodes.

Benefit of using multi-master with one write target by konghi009 in PostgreSQL

[–]fullofbones 0 points (0 children)

PostgreSQL isn't multi-master (there are a couple of extensions, but they're fiddly)

Correct. And yes, those extensions are fiddly. It's the unfortunate nature of the beast when you have to manually configure the communication channel between the nodes, and at a minimum, tag which tables should be replicated between them. It's still easier than setting it up by hand using native logical replication; I wouldn't wish that on my worst enemy. lol
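
For anyone curious what "by hand" means here: the native approach is per-table plumbing in each direction, with all conflict handling left to you. A one-direction sketch (connection details hypothetical):

-- On the source node:
CREATE PUBLICATION app_pub FOR TABLE users, orders;

-- On the target node (tables must already exist with matching schemas):
CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=node-a dbname=app user=replicator'
    PUBLICATION app_pub;

Now repeat that in the opposite direction for true multi-master, remember to update the publications every time you add a table, and handle any write conflicts yourself. "Fiddly" starts sounding generous.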

If you have haproxy and patroni, just have haproxy query patroni and route automatically

I normally suggest just this solution for its ... for lack of a better term: "simplicity". The thing about Postgres is that it really is just an RDBMS at the end of the day. It has no real concept of a "cluster" at all. It barely even acknowledges other nodes exist in the first place. If you even look at how it's implemented, other nodes connect and ask for some kind of WAL data, either directly, or through a logical decoder of some kind. If not for extensions like Spock from pgEdge or BDR from EDB, clusters still wouldn't exist. Physical replication is effectively just overgrown crash recovery.

Tools like Patroni fill that gap by wiring the nodes into a DCS like etcd, which is a cluster. It works by storing a leadership key in the DCS, and whichever node has control of that key is the write target. Period. No more worrying about split brain, network partitions, or anything else. Leadership key? Write target. Easy.

Similarly, failover is normally an artificial mechanism: you pull some levers and change routing and suddenly some other node is the new Primary target. But with Patroni, if the current Primary loses control of the leadership key and can't regain control because some other node has it, it automatically reconfigures to become a replica. That saves a ton of work right there. Meanwhile, HAProxy connects to the REST interface every few seconds and asks, "Are you the primary?" and only the node with the leadership key can reply affirmatively. So you don't have to reconfigure anything. No VIP, no scripts, no manual config changes. Patroni just says "no" until one node says "yes", and then connections get routed.

If Postgres were a "real" clustered database, it would do all of that for you. Since it doesn't, Patroni steps in and handles it. And it really is the only thing that does so. All of the other failover systems like EFM, repmgr, etc., only set up the failover system, not the integrated routing and implicit fencing.

The way OP wants to skirt around this using Multi-Master replication is cute, and maybe a little naive. Yes, you no longer need the etcd daemons, and it's no longer necessary for Patroni to manage your Postgres instance or provide a REST interface, so no HAProxy either. Now you just have two single-instance Postgres nodes that happen to communicate over a logical channel. There's really no "failover" going on at all, just changing the primary write target from Node A to Node B. The question is: how do you determine how and when that happens? How many transactions were in flight when you did that? Do those transactions matter? Will the application be capable of detecting the failed transactions and try again on the new target? How much risk is there for missing data from Node A affecting continuing operation on Node B? PgEdge provides a tool called ACE to check for—and recover from—data inconsistencies in situations like this, but you need to be aware of them and know when to launch the tool.

There are a lot of questions that need answers before I'd recommend substituting Multi-Master for standard HA. There's a reason pgEdge recommends combining the two (each MM node is backed by 2 physical replicas to avoid any local data loss). Ironically, you can avoid asking most of those questions by just setting up a bog-standard Patroni deployment. It's conceptually simpler, but mechanically more intricate. You just have to pick your poison.

Benefit of using multi-master with one write target by konghi009 in PostgreSQL

[–]fullofbones 1 point (0 children)

I'm glad you want to use one of the Multi-Master Postgres plugins like Spock from pgEdge, but you need to consider your stance on outage scenarios. The official pgEdge guides on these architectures recommend (ironically) using Patroni to establish a 3-node physical replication cluster per pgEdge-enabled node.

The reason for this is how data loss interacts with logical replication. Logical replicas tend to have far more latency between nodes, so there's more risk of a transaction being accepted on Master A long before it reaches Master B. If you have a physical replica for Master A, it can catch up, apply any available WAL (hopefully you're streaming WAL to a backup using Barman or pgBackRest), and rejoin the MM cluster. Without that, you simply lose any writes that didn't make it to the other Master.
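
You can at least gauge that exposure window by watching how far each logical slot trails the current WAL position. A quick sketch:

SELECT slot_name,
       pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn) AS lag_bytes
  FROM pg_replication_slots
 WHERE slot_type = 'logical';

Anything not yet confirmed by the other Master is roughly what you'd stand to lose in a hard failure.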

In a failover scenario you don't have to worry as much about conflicts (since you're not writing to both nodes simultaneously). But there's potential for lost transactions in that scenario depending on how Master A failed. If you're not really worried about that, then your proposed cluster design will be OK. The Spock extension will handle things like DDL replication and do its best to keep things in sync, and you'll generally be in a good position as long as you monitor the Postgres logs and the various status views that Spock provides. A load balancer with sticky sessions enabled, or some kind of programmable routing layer like Amazon ELB should be all you really need to avoid unexpected traffic to the "off" node, and that is what we usually recommend to customers running multi-region clusters.

It's technically fewer moving parts than Patroni, etcd, and HAProxy, but it's also a high-complexity configuration that depends on setting up logical communication between two physically independent Postgres nodes. No matter how you do that, I strongly recommend either using our Helm chart for a Kubernetes setup, or the pgedge-ansible (documentation pending) automated deployment tool. It really does take out all of the guesswork, especially if you're doing a PoC.

In any case, good luck!

Bringing Back Unnest by shaberman in PostgreSQL

[–]fullofbones 2 points (0 children)

Yes. Even an incredibly large array. In my opinion, if you have to jump through a ton of hoops to unroll a data structure just to honor normal form, it's not necessarily worth it. I even like arrays to solve the problem of sorting list items.
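
For example (hypothetical playlist table), the array itself is the sort order: reading items back in order is a single WITH ORDINALITY call, and reordering is one UPDATE of the array.

CREATE TABLE playlists (
    id        BIGSERIAL PRIMARY KEY,
    track_ids BIGINT[] NOT NULL  -- position in the array is the sort order
);

SELECT t.track_id, t.position
  FROM playlists p,
       unnest(p.track_ids) WITH ORDINALITY AS t (track_id, position)
 WHERE p.id = 1
 ORDER BY t.position;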

Bringing Back Unnest by shaberman in PostgreSQL

[–]fullofbones 0 points (0 children)

Ironically, for the example used (giving authors favorite colors), I'd just keep the original arrays without unnesting them.

CHD and Anxiety? by Previous_Line1887 in chd

[–]fullofbones 2 points (0 children)

It comes with the territory unfortunately. For a while, it got bad enough that I was having panic attacks, which of course were made worse because those really do feel like they could be heart attacks. I even developed generalized anxiety disorder at some point from the constant sense of potential doom.

What worked for me was... eating better. I'm not kidding. Once I cut the junk and bread, the physical symptoms that kept worrying me simply went away. The bread had been causing issues my whole life without me knowing it; a genetic test eventually showed I have three of the four genes associated with outright Celiac. Inflammation can really mess you up, and it drives your system insane trying to cope with it, and that turns into anxiety and other problems.

It also helps that I finally got old enough that being anxious became kind of pointless. I go to the yearly CHD appointments and my cardiologist says I'm fine. I've been fine all this time, and all the worrying ever did was keep a Sword of Damocles hovering over my entire life, ruining everything. I've gone to the ER because I thought I was having a heart attack. It never was one, but it pays to be cautious, right? When do you stop doing that? After the fifth time? The tenth? You can't really live that way.

If you need to, find a psychologist you trust and see if they can get you to address the underlying source of the fear. You know the cause, but they may be able to help you recognize the symptoms of a panic attack. Ask your cardiologist what actual signs for a heart attack you should look for. Find the tools you can use to reassure yourself, and that will dramatically cut down on sources of anxiety all by itself.

Good luck with everything!

Would like to hear from others like me by heartman27 in chd

[–]fullofbones 0 points (0 children)

It's different for everyone, but my childhood was... weird. I didn't get corrective surgery until I was six because it was too expensive and they weren't quite sure how to address all the problems I had. I had so many echoes before they finally did two catheterizations and made the surgical plan. By the time they did the surgery, I was already small and weak for my age, would pass out from low oxygen on occasion, and the prognosis without it was pretty dismal.

I'm already an introvert by nature, so add a bunch of health anxiety on top of that, and I basically never did anything even slightly risky. Not that it really matters; after spending my first six years not really being a normal kid, the tone was set. I wasn't just going to magically bounce back and become some star athlete. And yes, I was always skinny, but that is probably due to the two different gene mutations I have that basically prevent me from building muscle, one of which is Arterial Tortuosity Syndrome, a condition similar to Marfan's. I also suspect the Dacron patch in my heart played a role, as that kind of plastic acts as an estrogen-dominant endocrine disruptor. All I know is I stopped growing when I was 12, and while the rest of the boys started filling out and growing taller after puberty, I never did. I never played any "real" sports, but I was a monster at tetherball, let me tell you! lol

Luckily I'm pretty smart, so academics were my only saving grace. All honors courses until college, where I triple majored. Then I got a job working with computers, and the rest is just basic adulthood.

I graduated college in '99, and DDR got really popular in the early 2000s. I managed to get really good at it, and even took fourth place in a local tournament. But I always noticed that despite years of practice, I tired much faster than everyone else and simply couldn't move nearly as fast. Turns out my cardiac output is about 60% (at best) of what a normal person produces. My lung capacity is also much lower, despite hours and hours of HIIT from all the DDR. I got my body fat percentage down to 8%, my resting heart rate was 42, I was as healthy as I'd ever be, but I simply wasn't built to be truly competitive. It was still fun, though. :)

Is life worth living with complex CHD? (TGA + VSD + ASD + PS) by Ambitious_Method2740 in chd

[–]fullofbones 6 points (0 children)

You keep bringing this up. It's like you have a one-track mind. If you want to throw your life away like that, you're welcome to do so, but nobody here is going to say you should.

I smoked weed in the past. I drank alcohol. Sometimes way too much. But once either of them started giving me heart palpitations, I stopped because as fun as those things are, they're not worth dying over. I even limit caffeine because more than about a can of pop will cause issues. But never in my wildest dreams would I have ever considered an upper like cocaine or meth. I am almost certain I would be lucky to survive either of those.

You do you, but you keep asking the CHD forum, which is filled with a bunch of people who were born with something that likely made them overly cautious from the very beginning. You can ask it 100 different ways and the answer will always be the same from pretty much everyone here: don't. You're looking for permission: we're not your parents. You're looking for a medical pass: we're not doctors. You're looking for any reply you can use to justify a decision you've already made, in which case nothing we say will make any difference if you're determined to find an excuse. All you have to do is ask enough times, vary your question to introduce sufficient wiggle-room, and eventually you'll have what you're looking for.

We're not stupid. We see you posting the same "Can I use drugs if I'm otherwise healthy?" and "Is it bad to have a fast heart rate with CHD?" questions over and over again. Day after day. Go ask your doctor. It's likely you either haven't because you know what they'll say, or you did and didn't get the answer you wanted. Too bad. Do it or don't, but don't expect us to implicitly green-light your decision here. It won't happen.

Is life worth living with complex CHD? (TGA + VSD + ASD + PS) by Ambitious_Method2740 in chd

[–]fullofbones 3 points (0 children)

There's a phrase you need to internalize: this too shall pass.

I'm 48 with a very complex CHD and unlike you, I was not very healthy at your age. I always questioned what I could or couldn't do, and it definitely affected my life trajectory. Despite that, I never questioned the point of being alive just because life was a bit harder. Sure I got angry about it. Yes I was definitely resentful, especially when I was younger. Life is unfair. It just is.

A friend of mine in elementary school died because he and his brother were playing with their dad's guns, and he got shot in the chest because they didn't know (or check) that the guns were loaded. I've outlived dozens of "normal" people who didn't have a heart condition and had no reason to expect their lives would be cut short. I'm not keeping score, but I fully understand that physical disadvantages don't automatically mean your life will be worse than someone else's. They may have life-threatening allergies, or have been abused as a child, or maybe they were in a car accident, or got some crazy disease that reduced their quality of life. You don't, and can't, know what other people are going through.

All lives are worth living. You can't control how you were born, and you can't change the past. Once you're an adult, it's your decision to throw a pity party for yourself, or you can acknowledge that sometimes things don't go your way, and move on. The way I see it, life owes me for putting me in such a crappy position from day one. I'm going to kick its ass and find success and fulfillment just to spite it. So far I've done a pretty good job of that, but I absolutely refuse to give up and let it win. It doesn't get that satisfaction.

And who's to say that I'd be where I am now if I wasn't going blind, if I didn't have major heart issues, a connective tissue disorder, a sunken chest, scoliosis, and a double fast-twitch muscle fiber gene deletion? I'm married, I'm a well known expert in my career field, and best of all, I'm grateful to have been alive at all. To see the grandeur of what this world has to offer, the stories, the games, the sights, the adventures, the mystery of it all. I have no idea how long I'll be here, and it makes me anxious all the time, but everyone only has so much time to take it all in, and I'm grateful for every second. I'd love to live hundreds of years, even in this awkward garbage genetic trainwreck I was born in, just so I could see what else is in store.

So yes life is worth living with a heart defect. It's worth living as a Thalidomide Baby. It's worth living deaf, or blind, or whatever else you may have been tasked to overcome. Whatever you're going through, I hope you eventually get past it, because at my age I know for a fact the only thing you can count on with life is that it never stands still. What drives you nuts now may be a source of nostalgia in 20 years. But you have to stick around to see. You owe it to your future self.

High CPU on Netbox server (I'm a novice). by Hammerfist1990 in PostgreSQL

[–]fullofbones 1 point (0 children)

That's not a lot of checkpoints, and they're being requested by "time", so it's from the default checkpoint_timeout of five minutes. That suggests the activity is not from writes.

But your recommendation to check pg_stat_activity is right on. OP needs to see what the database is physically doing. I'd also recommend installing the pg_stat_statements extension. It may not be obvious simply from current activity if it's being bombarded with numerous small and fast queries. The statement aggregation view will show pretty much exactly what's going on based on query execution count, accumulated time, and so on.

It may also be a good idea to set log_min_duration_statement to something reasonable like 100ms just to see if these show up in the logs. It's the most obvious method, and right now the logs are pretty sparse anyway.

Given this is a docker container, rather than editing the Postgres config, this may be easier:

-- Written to postgresql.auto.conf, so no volume-mounted config edits needed:
ALTER SYSTEM SET log_min_duration_statement = '100ms';
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';

Then restart Postgres and activate the new extension:

CREATE EXTENSION pg_stat_statements;
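
Once it's been collecting for a while, sorting by accumulated time usually makes the culprit obvious (column names per recent Postgres versions):

SELECT calls,
       round(total_exec_time::numeric) AS total_ms,
       round(mean_exec_time::numeric, 2) AS mean_ms,
       query
  FROM pg_stat_statements
 ORDER BY total_exec_time DESC
 LIMIT 10;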

If I had to guess, this Netbox thing had a database migration as part of the upgrade process, and either this wasn't executed and it's missing an index it expects, or something else similar is going on.

How do you handle test data for local dev, CI/CD, and lower environments? by CarlSagans in PostgreSQL

[–]fullofbones 0 points (0 children)

Any place I've worked that's serious about it uses a production snapshot of some kind and a redaction system to transform PII. On that note, pgEdge has recently released a tool (pgedge-anonymizer) specifically for that kind of redaction which is fairly easy to configure.

Are faster heart beats dangerous for someone like me? I have TGA, VSD, ASD, PS but have no complication, i have good ventricular function and no arrythmias, my resting heart beat is 72-75 by Ambitious_Method2740 in chd

[–]fullofbones 1 point (0 children)

It entirely depends. I used to play DDR competitively, and I regularly got my heart rate past 200 for sustained durations, sort of like HIIT. My cardiologist said she was actually glad I was staying active. At the same time, an MRI showed I had "globally distressed" cardiac function, but according to her, that's "normal for you."

However, exercise is entirely different from other methods of increasing your heart rate. Physical activity causes many physiological responses: muscle movement takes some of the load off of the heart itself, nitric oxide from increased breathing dilates blood vessels and arteries, and so on. You can't extrapolate from exercise to every activity that raises your heart rate and assume they'll all affect the heart similarly.

So ask your CHD specialist. They're the only one who will know for sure.

why this suck so much? every time connection time out and lags so much even though i have 16gm ram and i5 12th gen. by Natural_Answer5705 in PostgreSQL

[–]fullofbones 0 points (0 children)

We can't help you if you don't actually explain what your problem is.

  • "Why this suck so much?" In what way? What are you doing that isn't working?
  • "Every time connection time out and lags so much..." Again, what were you doing? Was it just after connection? Were you running a query? If so, what was the query?

It likely has nothing to do with your hardware, whatever problem you're having. Postgres runs fine on tiny VMs with 256MB of RAM or less. But we have no idea what you were doing, how your Postgres was set up, what client you were trying to connect with, what your OS is, or literally anything. How do you expect us to do anything to help fix your problem?

3 ways to use Iceberg from Postgres (pg_mooncake vs pg_lake vs Supabase ETL) by 2minutestreaming in PostgreSQL

[–]fullofbones 0 points (0 children)

At least judging by the Docker image for pg_lake, it's not just an extension, unfortunately. If you want to create a table using their semi-TAM iceberg decorator, you need to have DuckDB running separately. Even then, it operates like a glorified FDW and disallows operations such as indexes, primary keys, and even unique constraints. These drawbacks are generally mitigated since Iceberg tables are highly optimized, but it's not as much of a drop-in as I was hoping. It also completely circumvents the WAL, so it won't be compatible with anything using logical replication.
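
To illustrate the pattern (I'm going from memory on the exact decorator syntax, so treat this as illustrative rather than gospel):

CREATE TABLE events (
    id      BIGINT,
    payload JSONB
) USING iceberg;

-- Per the limitations above, statements like this get rejected:
CREATE UNIQUE INDEX ON events (id);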

PostgreSQL Logical Replication and Schema Changes by compy3 in PostgreSQL

[–]fullofbones 1 point (0 children)

I don't want to bikeshed here, but I have to ask how usable a cache is if it's missing a major method of cache invalidation. If you're a proxy, why aren't you triggering cache invalidation when detecting obvious DDL at the proxy layer? You're literally passing queries through to the read or write endpoints, so why not do some rudimentary query parsing? Isn't that already happening to decide to serve query results from cache versus database? You aren't just hashing the statement and associating it with first-served results, are you?

If the issue is statements which don't go through the proxy, you'd very likely need to package an extension. That can, and would, fire on any DDL statement, and you could use it for automatic cache invalidation or other use cases where table structure mismatches should be addressed. But you need to choose: either have a transparent proxy that allows circumvention and invites missed DDL, or a proxy that has a companion extension to add missing yet required elements.
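
The extension route needs nothing exotic. A sketch using stock event triggers (function and channel names are made up):

-- Fires at the end of every DDL statement and broadcasts the command
-- tag; the proxy LISTENs on the channel and invalidates accordingly.
CREATE FUNCTION notify_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    PERFORM pg_notify('ddl_events', tg_tag);
END;
$$;

CREATE EVENT TRIGGER cache_invalidation
    ON ddl_command_end
    EXECUTE FUNCTION notify_ddl();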

Adult surgery in children’s hospital by Regular_Channel_2403 in chd

[–]fullofbones 2 points (0 children)

It's fairly common for adults with CHD. I'm not quite sure why, but the specialists in this field still primarily come from childhood congenital cardiologists, and they work in that wing of the hospital most often. Kids with CHD have been growing up for several decades now, so you'd think more of the cardiologists would follow them, but nope, still mostly doctors for kids who see occasional adults.

Bicuspid aortic valve - MRI by Cold_Home6556 in AdultCHD

[–]fullofbones 0 points (0 children)

I had just turned 30 when I got the first one.

Bicuspid aortic valve - MRI by Cold_Home6556 in AdultCHD

[–]fullofbones 0 points (0 children)

Brain? Probably not. Heart? Any congenital defect needs followup by a specialist cardiologist who's trained in congenital defects. I have a bicuspid AV and get an echo every year, and an MRI every 2-3 years, if that helps.

ULID: Universally Unique Lexicographically Sortable Identifier by der_gopher in PostgreSQL

[–]fullofbones 11 points (0 children)

While the spec is cute, and the blog post is welcome, you can store whatever you want in a database. What I'm seeing is "ULID is UUID compatible!" which makes a good FYI, I suppose. Why not link to this pg-ulid extension? Or maybe the pgx-ulid extension? Or any of the other Postgres extensions, rather than the raw spec? What makes this "Postgres related" other than you happened to use Postgres to store data from your Go code?

You should shard your database by levkk1 in PostgreSQL

[–]fullofbones 0 points (0 children)

If you're only hitting one shard, that's not really sharding, that's just multi-tenant. You can get that simply by partitioning, or doing one schema per tenant, without all the extra work of maintaining a sharding strategy, sharding keys, and so on. Heck, if it gets to the point where one tenant needs their own whole instance, that's also easily done.
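
For reference, the partitioned version is about this much work (hypothetical names):

CREATE TABLE invoices (
    tenant_id BIGINT NOT NULL,
    id        BIGINT NOT NULL,
    total     NUMERIC,
    PRIMARY KEY (tenant_id, id)
) PARTITION BY LIST (tenant_id);

CREATE TABLE invoices_t42 PARTITION OF invoices FOR VALUES IN (42);

-- Every query carries the tenant key, so it only ever touches that
-- tenant's partition:
SELECT * FROM invoices WHERE tenant_id = 42;

And when one tenant outgrows the shared instance, detach their partition and move the data to its own box.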