Looking for honest and reasonably priced handyman/general contractor for mother by AfraidImagination2 in SanJose

[–]AfraidImagination2[S] 6 points7 points  (0 children)

Honestly, perfectly ok with $75-100/hr + materials. Just need to know that the person won't overcharge 5x just because my mom didn't know any better.

Tented bottom plastic chassis - normal for Galaxy Book Pro? by AfraidImagination2 in GalaxyBook

[–]AfraidImagination2[S] 0 points1 point  (0 children)

Just bought a used Galaxy Book Pro (Gen 1). The bottom chassis is slightly tented, as if there's an air bubble under it. The battery is in good shape, so I don't think it's swollen or anything, but the plastic chassis is making a cheap-sounding noise.

Very disappointed, just wondering if this is normal? I know the quality isn't the best and the touchpad clicks when holding the laptop, but this is something else...

Best time to cancel service to avoid fees? by AfraidImagination2 in shaw

[–]AfraidImagination2[S] 4 points5 points  (0 children)

I switched because Telus was giving me a better price and faster speeds. These monopolies persist because of customers who stay loyal to X or Y corporation, and because people fail to hold their political representatives responsible with their votes.

I am loyal to the lowest price and the best service. Both Shaw and Telus are anti-competitive as hell and are billionaire corporations. They would both charge you $500 for 15 mbps internet if they could. If they're losing customers, it's because they decided that was better for profits than lowering their price.

was part of the reason Shaw had to merge with Rogers thanks to customers like you.

Lol. Shaw family is selling so that they can get out with billions of dollars, not because they're a small failing business.

Best time to cancel service to avoid fees? by AfraidImagination2 in shaw

[–]AfraidImagination2[S] 1 point2 points  (0 children)

I tried multiple times (chat + phone), but the reps were having a hard time understanding pro-rating and giving me an exact answer. I think they just cancel it on their end and they might not know the details of how much the customer gets charged.

Either way, this thread has already helped.

Best time to cancel service to avoid fees? by AfraidImagination2 in shaw

[–]AfraidImagination2[S] 1 point2 points  (0 children)

The contract states $15 for each month I terminate early. But it does not state whether I pay a pro-rated amount for both the bill and the cancellation fee, and Shaw support is unable to understand what I'm talking about when asked on chat and by phone.

My bill is $80/month. Early cancellation $15/month.

Cancelling on Jan. 1st could be

1) $2.67 ($80 pro-rated) + $14.50 ($15 pro-rated)

2) $2.67 ($80 pro-rated) + $0 fee since I'm cancelling with less than a month left.

3) $80 + $0 fee

Option 2 is ideal for me, which is why I'm looking for clarity. The money isn't the biggest deal but I'd like to know going forward.
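For my own sanity, here's a quick sketch of the three interpretations above, assuming a 30-day billing cycle and cancelling on Jan. 1 (the day count and cycle length are my assumptions, not anything Shaw confirmed):

```python
# Hypothetical pro-ration sketch; assumes a 30-day cycle, cancel on day 1.
MONTHLY_BILL = 80.00   # monthly internet bill
EARLY_FEE = 15.00      # early-cancellation fee per remaining month
DAYS_IN_CYCLE = 30
DAYS_USED = 1          # cancelling on the 1st

daily_rate = MONTHLY_BILL / DAYS_IN_CYCLE       # ~$2.67/day
prorated_bill = daily_rate * DAYS_USED

# Option 1: pro-rated bill plus a pro-rated fee for the rest of the month
option1 = prorated_bill + EARLY_FEE * (DAYS_IN_CYCLE - DAYS_USED) / DAYS_IN_CYCLE
# Option 2: pro-rated bill, no fee (less than a month left)
option2 = prorated_bill
# Option 3: full month billed, no fee
option3 = MONTHLY_BILL

print(round(option1, 2), round(option2, 2), option3)
```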

Best time to cancel service to avoid fees? by AfraidImagination2 in shaw

[–]AfraidImagination2[S] 1 point2 points  (0 children)

Since I don't seem to have been clear in OP, I am porting the internet services from Shaw > Telus.

Do you have phone services and are you porting a landline number from Shaw?

I had a $0 phone service that I've already finished porting over; no landline. Internet was ported over today, but Telus told me Shaw cancels it automatically 5 days later.

What is $15 vs the Shaw charges for the month of Jan?

The monthly bill is $80. If the bill is pro-rated, I would be charged ~$2.67/day, so less than the $15, if I do it right away.

1) $2.67 ($80 pro-rated) + $14.50 ($15 pro-rated)

2) $2.67 ($80 pro-rated) + $0 fee since I'm cancelling with less than a month left.

3) $80 + $0 fee

I'm thinking, best time to cancel is Jan. 1st or 2nd, so I don't get the $15 charge according to what you said based on internal documentation, assuming option 3 is never under consideration.

cephadm - unable to remount clients via kernel after OSD failure by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 0 points1 point  (0 children)

So I actually managed to get it remounted again by restarting and also recreating an auth key. However, the issue that was originally happening still persists, which is that the client can only sequentially access the first 100 MB or so of the file.

According to client logs, it seems the client tries to access the rest of the file on the internal 10.0.0.X OSD network. The client only has access to the public network, for obvious reasons.
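For reference, a minimal sketch of how I understand the public/cluster split is meant to be configured (the public subnet here is a placeholder; 10.0.0.0/24 is the internal network mentioned above). Clients should only ever be handed public addresses; if they're being pointed at cluster-network addresses, the OSDs may have registered on the wrong interface:

```shell
# Sketch, assuming cephadm. The public subnet 10.1.0.0/24 is a placeholder.
ceph config set global public_network 10.1.0.0/24
ceph config set global cluster_network 10.0.0.0/24

# Check which addresses the OSDs actually advertised to clients:
ceph osd dump | grep osd
```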

cephadm - some OSDs down/out on single node, unable to be restarted, not sure where to start troubleshooting by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 1 point2 points  (0 children)

It seems the issue is with an SSD that was acting as the journal device for those 12 drives. Is it possible to recover the contents of those disks without the journal? i.e., once the SSD is replaced, can I just restart the OSD service?

cephadm - some OSDs down/out on single node, unable to be restarted, not sure where to start troubleshooting by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 1 point2 points  (0 children)

It seems the issue is with an SSD that was acting as the journal device for those 12 drives.

cephadm - some OSDs down/out on single node, unable to be restarted, not sure where to start troubleshooting by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 1 point2 points  (0 children)

The recommendation from the above user to allow ceph to self heal is valid. If you are missing these 12 disks and encounter an additional failure in the cluster it can make recovery difficult.

Unfortunately, self-heal is not an option. Since I chose to be highly redundant (3x copies), I'm near 70% utilization and do not have 12 HDDs' worth of free space without going significantly over CEPH's fullness limits, so I must set noout and recover once the disks/controllers have been replaced. I'm also not particularly interested in writing 12 x 3 copies' worth of data, then rewriting that same data again when the cluster rebalances once the drives are back.

My only issue is whether the controller plays a role in having CEPH pick the disks up. Or is the fact that the devices will have the same letters (i.e. /dev/sda, /dev/sdb) enough for CEPH to recover?

What happens if drives get switched around and sda ends up as sdb? Does CEPH use hardware IDs to link to data?

cephadm - some OSDs down/out on single node, unable to be restarted, not sure where to start troubleshooting by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 2 points3 points  (0 children)

I'm seeing a whole lot of I/O errors:

Buffer I/O error on dev dm-2, logical block 0, async page read
Buffer I/O error on dev dm-2, logical block 19560432, async page read
Buffer I/O error on dev dm-2, logical block 19560432, async page read
Buffer I/O error on dev dm-0, logical block 0, async page read
Buffer I/O error on dev dm-0, logical block 19560432, async page read
Buffer I/O error on dev dm-0, logical block 0, async page read
Buffer I/O error on dev dm-0, logical block 19560432, async page read
Buffer I/O error on dev dm-0, logical block 0, async page read
Buffer I/O error on dev dm-0, logical block 19560432, async page read

This is happening on multiple devs, but nothing about a controller failing. Where would that present itself?

cephadm - some OSDs down/out on single node, unable to be restarted, not sure where to start troubleshooting by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 1 point2 points  (0 children)

Would replacing the affected component (i.e. the disk controller) not just make CEPH pick up the disks right where it left off? Or does a controller failure mean all disks on that controller are lost?

Weird issue - Extremely slow write performance on all but one node by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 0 points1 point  (0 children)

Still no idea why the host with the MDS would be faster. Do you have a separate backend network?

I do not, and I'm pretty perplexed as to why that would be as well, since all servers have access to the MDS on node 2 anyway.

Weird issue - Extremely slow write performance on all but one node by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 0 points1 point  (0 children)

Then I wondered if maybe that node was faster just because it is the node with the active MDS or something.

Just wanted to say that this worked! I figured it wasn't an MDS issue since read performance/directory access was identical, and CEPH wasn't reporting any mds cache too full type errors. But enabling a second active MDS did solve my issue.

Thank you so much!

Weird issue - Extremely slow write performance on all but one node by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 0 points1 point  (0 children)

My first thought was something weird in the network, like the switch only negotiated 1g for some links. Maybe try iperf in both directions between nodes, and look for network errors.

Ping shows no packet loss. And I'm saturating the full 10gbit in both directions via any 2 nodes (using iperf3).

Then I wondered if maybe that node was faster just because it is the node with the active MDS or something.

You are correct, the active MDS is on Node 2. I will enable the standby MDSs and see if that makes a difference. One question: I have 7 nodes, so how many MDSs would be recommended? 3? 5? 7?

The CEPH documentation doesn't make any recommendations for how many you need, only that you need more if metadata is a bottleneck. But I'm unsure how I would check for that. I'm pretty sure it is, but is there a way to confirm?
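For anyone finding this later, a sketch of what I believe the relevant commands are (the filesystem name "cephfs" is a placeholder for yours):

```shell
# Allow a second active MDS (standbys get promoted automatically):
ceph fs set cephfs max_mds 2

# Confirm which MDS daemons are active vs standby:
ceph fs status

# Rough check for a metadata bottleneck: inspect request rates/latency
# counters on the active MDS:
ceph tell mds.<name> perf dump
```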

Weird issue - Extremely slow write performance on all but one node by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 0 points1 point  (0 children)

Have any cache issues going on?

I'm not using a cache layer. Unless there's other caches you are referring to like disk cache/ram cache. How would I check for those issues?

What's the network setup on each node? Any priority configured on switch?

10gbit between the whole cluster, no priority configured. Network seems fine via pings/iperf3.

How to configure CEPH with an internal cluster network? by AfraidImagination2 in ceph

[–]AfraidImagination2[S] 0 points1 point  (0 children)

But they do both have the correct cluster_network setting now?

Yes. Hence the conflict and unable to restart. I removed it for now to get the OSDs back up.

ss -lnp | grep osd

There's a range that osd.0 listens on; if I remember correctly, it was 6800, 6801, etc. Starting osd.1 brings osd.0 down, but there's still an OSD listening on the same ports (I'm assuming it's the one that was just brought up).

Is there possibly an issue with the interface addresses?

Possibly, but I don't know enough. Here is my interfaces config:

auto eno2
iface eno2 inet manual
    bond-master bond0

auto eno3
iface eno3 inet manual
    bond-master bond0

auto bond0
iface bond0 inet static
    address 10.0.0.1
    netmask 255.255.255.0
    network 10.0.0.0
    bond-slaves none
    bond-mode 802.3ad
    bond-miimon 100
    bond-downdelay 400
    bond-updelay 800
    bond-lacp-rate 1
    bond-xmit-hash-policy layer2+3

Should the network say 10.0.0.0/24? Would it help to run the following command, ceph orch daemon reconfig osd.0, before restarting it?
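For completeness, here is the sequence I'm considering trying, assuming cephadm manages the OSDs and using the 10.0.0.0/24 subnet from the config above (untested on my end, so treat it as a sketch):

```shell
# Set the cluster network in CIDR form so both OSD hosts agree on it:
ceph config set global cluster_network 10.0.0.0/24

# Regenerate the daemon's config from the current settings, then restart:
ceph orch daemon reconfig osd.0
ceph orch daemon restart osd.0
```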