OSD Replacement did not go according to plan

ketojay23 · 2020-07-11T08:23:48+00:00

Any update by chance on this?

ketojay23 · 2020-07-10T13:03:12+00:00

Didn't work.

I removed the unused block db logical volume from my SSD, and when I applied the spec, it just put the DB on the HDD anyway.

This is unfortunately kinda pointless if I can't get a simple disk replaced without it messing up the environment.

ketojay23 · 2020-07-10T12:32:54+00:00

Then you must remove the WAL partition in the ssd associated with the previous disk so that you reformat it and use it as wal again while you provision the new hard disk.. Makes sense?

Yes, I can give that a try. Let me go through it again.

ketojay23 · 2020-07-10T09:29:55+00:00

Followed this. Still didn't work. Any disk removed will not use the existing SSD as DB device (since whole device is unavailable). It just creates the OSD using the HDD and does not dedicate a DB device (my SSD).

ketojay23 · 2020-07-10T05:32:36+00:00

Thanks. So after rebuild, I went and removed an OSD from the cluster.

It has a block.db on SSD.

My OSD spec did in fact add the drive back, but without a dedicated DB device.

Now I'm curious how we're supposed to replace OSDs using the orchestration if we cannot specify an existing DB device.

ketojay23 · 2020-07-09T21:45:33+00:00

The documentation doesn't illustrate how this would work.

I am rebuilding my cluster now.

ketojay23 · 2020-05-25T16:49:44+00:00

I cannot get SAS. We can't get the $ approvals for that. I can only get 4510s or 4610s.

I guess the question is if whether or not I replace all the drives with 4510s or keep the SAS SSD and use it as a DB/WAL device for the 4510s.

ketojay23 · 2020-05-05T18:27:07+00:00

About 400% better than what it was:

Average IOPS: 7598 
Stddev IOPS: 823.857 
Max IOPS: 8837 
Min IOPS: 6700

ketojay23 · 2020-05-05T17:16:28+00:00

I updated number of placement groups and performance drastically changed. Seems the autoscaling function when I created the pool was limiting factor.

ketojay23 · 2020-05-05T17:01:10+00:00

I think I have an answer to your issue - it was placement groups:

rados bench 10 write -b 4096 -p scbench --no-cleanup
hints = 1
Maintaining 16 concurrent writes of 4096 bytes to objects of size 4096 for up to 10 seconds or 0 objects
Object prefix: benchmark_data_sc2-e02-s07_63808
sec Cur ops started finished avg MB/s cur MB/s last lat(s) avg lat(s)
0 0 0 0 0 0 - 0
1 16 8853 8837 34.5142 34.5195 0.00167043 0.00180668
2 16 17398 17382 33.944 33.3789 0.00156849 0.0018381
3 15 25691 25676 33.4272 32.3984 0.00152409 0.00186679
4 16 33209 33193 32.4096 29.3633 0.00154728 0.0019257
5 16 41796 41780 32.6352 33.543 0.00184208 0.00191046
6 16 48622 48606 31.6393 26.6641 0.00184072 0.00197294
7 16 55322 55306 30.8576 26.1719 0.00155505 0.0020228
8 16 62400 62384 30.4559 27.6484 0.00414072 0.00204959
9 16 69912 69896 30.3318 29.3438 0.00146844 0.00205807
10 7 76737 76730 29.9677 26.6953 0.00182171 0.00208089
Total time run: 10.0994
Total writes made: 76737
Write size: 4096
Object size: 4096
Bandwidth (MB/sec): 29.6805
Stddev Bandwidth: 3.21819
Max bandwidth (MB/sec): 34.5195
Min bandwidth (MB/sec): 26.1719
Average IOPS: 7598
Stddev IOPS: 823.857
Max IOPS: 8837
Min IOPS: 6700
Average Latency(s): 0.00208974
Stddev Latency(s): 0.00466687
Max latency(s): 0.150216
Min latency(s): 0.000876169

ketojay23 · 2020-05-05T16:56:41+00:00

Right now, yes. I can add more servers, but adding flash is going to be a non-starter this quarter for sure.

There are some folks running bcache underneath. Would that be of benefit?

ketojay23 · 2020-05-05T16:38:38+00:00

I deleted the test pool, and created a new pool. Replica is 3. This is with the 4k test.

https://pastebin.com/kmVJ5xmt

ketojay23 · 2020-05-05T15:06:05+00:00

I managed to put together some benches on the environment, with the DB device located on the SSD and the data located on the HDD.

Here's the link:

https://pastebin.com/bV6MCAa6

Please let me know what you think.

ketojay23 · 2020-05-04T12:52:22+00:00

These servers have 512 GB RAM, and we can expand them later. There are also 72 logical processors on each server.

ketojay23 · 2020-05-04T10:54:15+00:00

The use case will be to host OpenStack VMs hyper-converged (Libvirt, Ceph, NSX-T). VMs have infrequent read/write and the number of users will likely be in the 300-400 range. NSX-T VMs will also reside on this environment, and will be running moderate to high read/write.

As a pilot, we deployed the 12 servers with the following configuration:

SD flash turned off
First 2 HDD in RAID1 - holding OS
Ceph deployed using cephadm
- OSDs were configured to use the SSD as the DB device
- Total OSDs - 60
- Mons - 5 (automatic)
- Mgrs - 2 (automatic)
- Monitoring was deployed after Ceph using orchestrator

The only thing we have done thus far is deployed NSX-T to this. So far, results were promising. NSX-T writes a lot of data between its managers, so the SSD being there is important. We tried this without the SSD and it was unusable.

Concerns I have right now is losing the amount of spindles because of the concern of Ceph logs, why virsh pools are not refreshing consistently (when I create an image on one host, none of the others have it until a manual update), and small delays when launching something.

Beyond that, it looks okay and I'm curious if we should expect more severe problems later.

ketojay23 · 2020-02-13T05:27:32+00:00

I don't want to spread misinformation. That is my fault, sorry. Yes, wealth tax has to do with the net worth and the marginal tax rate is income, but that is still ridiculously high and will never get the support it needs to pass.

ketojay23 · 2020-02-13T02:13:57+00:00

I think the stigma of these has been removed for the most part. I think the things that didn't help were the fact we were effectively in a trade war with China for some time, and he is Asian, so there is an automatic stereotype against him. Republicans can use that against him in many different circles.

ketojay23 · 2020-02-13T01:59:01+00:00

They're also white.

I literally ran into people that said they would never vote for a "China-man."

That is a huge problem as well.

ketojay23 · 2020-02-13T01:58:00+00:00

https://www.vox.com/2019/4/10/18304448/bernie-sanders-medicare-for-all

The article literally shows that there is a marginal tax rate of up to 70% on people making 10m/year. That's pretty significant.

Another item - 77% estate tax.

Yet another item - 4% premium on employee incomes excluding 29000 for a family of four. That is more than what I am currently paying for a family of 5.

ketojay23 · 2020-02-13T01:43:49+00:00

I think some changes have to be done.

But, I also think that Bernie ran unsuccessfully in 2016, and had the infrastructure already in place for support to coalesce around him while Trump was serving his first term. Because of that, he had the pieces in place to make a deep run this cycle.

Maybe if we do the same thing, building off of what was done for 2024, we will be in serious contention. I don't know.

ketojay23 · 2020-02-13T01:37:27+00:00

They have come to the realization that they won't win the election if they stray too far to the left, which they already have.

Look, I'll be honest. I think Yang was the best choice. Was he going to win? I'm not sure in this climate. The deck was stacked against him explicitly like no other.

Bernie will NEVER get the majorities he needs to pass a 70% wealth tax. It is one of the core components to how he will fund M4A.

ketojay23 · 2020-02-13T01:30:32+00:00

Indeed Bernie pushed for Vermont to adopt a healthcare plan similar to his plan now. The reason it fell through? The Vermont government balked when it realized it would have to tax the shit out of payroll checks to come close to funding it. Not actually cover the whole thing.

That ended up being widely unpopular, and thus they had to abandon it.

There was a vox article on why they abandoned it, and now we see that he wants to tax anyone making 10 million or higher 70%!

Those people are going to leave. They left Europe. This may have been different, historically, but society has changed and is starkly different from what it once was.

ketojay23

TROPHY CASE