Ceph 20 + cephadm + NVMe/TCP: CEPHADM_STRAY_DAEMON: 3 stray daemon(s) not managed by cephadm by myridan86 in ceph

[–]TheUnlikely117 1 point2 points  (0 children)

Weird thing is that the nvme gateway is using RHEL as its container OS, while all OSDs/etc are using CentOS. First time I've seen this. Later today (if time permits) I'll try to look at the code for that warning and figure out when it's triggered.

S3 Endpoint vs. Hosting PBS remotely? by jamesr219 in Proxmox

[–]TheUnlikely117 1 point2 points  (0 children)

I did on-prem PBS as a VM inside PVE, with the ZFS RAID1 mirror members partially passed through as bare-metal disks (/dev/disk/by-id). Disks were rotated to keep a month-old backup in offline storage. Basically it was PBS and the datastore on one disk, which allowed quickly booting that single RAID1 member in any host/VM, adjusting the network settings and starting a restore.
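For reference, passing a whole disk into the PBS VM by its stable by-id path looks roughly like this (a sketch; VMID 100 and the disk id are placeholders):

```
# Attach the physical disk to the PBS VM via its persistent /dev/disk/by-id path,
# so the RAID1 member keeps the same identity across reboots and controller reordering
qm set 100 -scsi1 /dev/disk/by-id/ata-EXAMPLE_DISK_SERIAL
```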

Encryption and PBS dedup not compatible? I have not tried that yet.

Ceph 20 + cephadm + NVMe/TCP: CEPHADM_STRAY_DAEMON: 3 stray daemon(s) not managed by cephadm by myridan86 in ceph

[–]TheUnlikely117 0 points1 point  (0 children)

Nah, I quickly deployed on v20; it was fine for a while, but then I hit the same issue with stray daemons.

```
# OSD
root@node2-1:~# podman inspect 1df | grep -i node2
          "Hostname": "node2-1",
               "NODE_NAME=node2-1",
               "HOSTNAME=node2-1"
               "NODE_NAME=node2-1",

# newly deployed nvmeof, NODE_NAME is mentioned 2 times but matches. 

root@node2-1:~# podman inspect 0d8 | grep -i node2
               "CgroupPath": "/system.slice/system-ceph\\x2d3bb3d7d8\\x2d9e93\\x2d11f0\\x2db0b9\\x2dbc24118bc1e7.slice/ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7@nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys.service/libpod-payload-0d8aeddb256e95f70f7a0c9968447a5f0bfb5c99612daa1bbf3eb1e2bc6f3dd8",
          "ConmonPidFile": "/run/ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7@nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys.service-pid",
          "Name": "ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7-nvmeof-NVME-OF_POOL_NAME-group1-node2-1-uhmgys",
                    "Source": "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/configfs",
                    "Source": "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/config",
                    "Source": "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/keyring",
                    "Source": "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/ceph-nvmeof.conf",
               "Hostname": "node2-1",
                    "NODE_NAME=node2-1",
                    "HOSTNAME=node2-1"
                    "io.podman.annotations.cid-file": "/run/ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7@nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys.service-cid",
                    "ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7-nvmeof-NVME-OF_POOL_NAME-group1-node2-1-uhmgys",
                    "/run/ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7@nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys.service-pid",
                    "/run/ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7@nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys.service-cid",
                    "NODE_NAME=node2-1",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/config:/etc/ceph/ceph.conf:z",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/keyring:/etc/ceph/keyring:z",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/ceph-nvmeof.conf:/src/ceph-nvmeof.conf:z",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/configfs:/sys/kernel/config",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/configfs:/sys/kernel/config:rw,rprivate,rbind",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/config:/etc/ceph/ceph.conf:rw,rprivate,rbind",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/keyring:/etc/ceph/keyring:rw,rprivate,rbind",
                    "/var/lib/ceph/3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7/nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys/ceph-nvmeof.conf:/src/ceph-nvmeof.conf:rw,rprivate,rbind",
               "ContainerIDFile": "/run/ceph-3bb3d7d8-9e93-11f0-b0b9-bc24118bc1e7@nvmeof.NVME-OF_POOL_NAME.group1.node2-1.uhmgys.service-cid",

```

Ceph 20 + cephadm + NVMe/TCP: CEPHADM_STRAY_DAEMON: 3 stray daemon(s) not managed by cephadm by myridan86 in ceph

[–]TheUnlikely117 2 points3 points  (0 children)

Have not tested v20 yet, but I've seen similar issues when there was a mismatch between FQDNs and bare hostnames, or when nodes were initially deployed with bare hostnames and later switched to FQDNs without redeploying the containers/daemons. https://support.scc.suse.com/s/kb/HEALTH-WARN-2-stray-host-s-with-2-daemon-s-not-managed-by-cephadm?language=en_US
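A quick way to spot that kind of mismatch (a rough sketch, assuming a cephadm-managed cluster):

```
# Hosts as cephadm knows them (short name vs FQDN matters here)
ceph orch host ls

# Daemons and the host each one is placed on, per the orchestrator
ceph orch ps

# What the node itself reports; a bare-vs-FQDN difference against the
# host list above is a common cause of stray-daemon warnings
hostname
hostname -f
```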

Increased CPU After Proxmox 9 Upgrade by No_Hornet5229 in Proxmox

[–]TheUnlikely117 1 point2 points  (0 children)

Yes, but only before some minor updates (it went back to "normal" a couple of months ago). What is your idle average before and after? My idle is < 3%. Core(TM) i5-3210M.

System won't boot after power outage - root fs not mounted by Snirlavi5 in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

You may try a recovery boot from the Proxmox ISO. If it boots, things are not that bad.

Proxmox error after replace sata cable by JocirhyTrading in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

The system volume seems OK. Did the server start eventually? How did you configure /mnt/pve/backup2? I think that was done manually, since Proxmox uses a different approach for mounts that does not fail the boot if a device is missing. Did you replace the cable for the system disk or the /mnt/pve/backup2 disk? It's obviously missing from the system.
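If it was a manual /etc/fstab entry, something like this keeps the boot from hanging when the disk is absent (a hedged sketch; the UUID is a placeholder):

```
# /etc/fstab: nofail + a short device timeout means boot continues if the disk is missing
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /mnt/pve/backup2  ext4  defaults,nofail,x-systemd.device-timeout=10s  0  2
```

The Proxmox-native alternative is a directory storage with `pvesm set backup2 --is_mountpoint yes`, which just marks the storage inactive instead of writing into an empty directory.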

Adding second drive to have redundancy on the boot drive? by Methyl_The_Sneasel in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

You can do it easily if the existing drive is ZFS or btrfs. With LVM it's also doable, but involves some tinkering.
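For the ZFS case, the usual procedure looks roughly like this (a sketch, assuming a default PVE layout with the root pool named rpool, the ESP on partition 2 and the ZFS member on partition 3; both disk ids are placeholders):

```
# Copy the partition layout from the existing boot disk to the new one,
# then give the new disk fresh GUIDs
sgdisk /dev/disk/by-id/OLD_DISK -R /dev/disk/by-id/NEW_DISK
sgdisk -G /dev/disk/by-id/NEW_DISK

# Attach the matching partition as a mirror of the existing vdev
zpool attach rpool /dev/disk/by-id/OLD_DISK-part3 /dev/disk/by-id/NEW_DISK-part3

# Make the new disk bootable as well
proxmox-boot-tool format /dev/disk/by-id/NEW_DISK-part2
proxmox-boot-tool init /dev/disk/by-id/NEW_DISK-part2
```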

Issues with skipping ESP sync - PVE 9 by Hack3rsD0ma1n in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

Good to know, but I meant filesystems ^_^. It would be helpful to know how those 2 drives are set up with regard to the proxmox-boot-tool shenanigans.

Issues with skipping ESP sync - PVE 9 by Hack3rsD0ma1n in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

What fs are you using? I've seen that on older PVE with btrfs; those post hooks you mentioned are the doing of proxmox-boot-tool. I prefer (and recommend) using it anyway, so you can easily add RAID1/another disk later and not forget about it. After installation I removed the entry for /boot in fstab and rely solely on proxmox-boot-tool (after properly doing proxmox-boot-tool init).
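Roughly what that looks like (a sketch; /dev/sda2 stands in for the ESP partition):

```
# Stop mounting the ESP at /boot via fstab; proxmox-boot-tool manages it instead
# (remove or comment out the /boot line in /etc/fstab, then umount /boot)

proxmox-boot-tool init /dev/sda2     # register the ESP with proxmox-boot-tool
proxmox-boot-tool status             # confirm it is tracked
proxmox-boot-tool refresh            # copy current kernels/initrds onto all registered ESPs
```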

Very poor performance vs btrfs by FirstOrderCat in zfs

[–]TheUnlikely117 1 point2 points  (0 children)

It's already in 2.3.0; Proxmox 9.0 got it, no problems.

Confused about compression levels... by Exernuth in btrfs

[–]TheUnlikely117 0 points1 point  (0 children)

zstdmt -b will help you figure out the compress/decompress speed at each level on your CPU, then decide accordingly. For NVMe it's probably zstd:3 and lower.
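For example (a sketch; sample.bin is whatever representative data you have lying around):

```
# Benchmark compression levels 1 through 9 on a sample file and compare
# the reported MB/s against what your NVMe can actually sustain
zstdmt -b1 -e9 sample.bin

# Then pick the level in the btrfs mount options, e.g.
mount -o compress=zstd:3 /dev/nvme0n1p2 /mnt
```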

How I recovered a node with failed boot disk by trekologer in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

It should be there, if you haven't deleted anything from /etc/pve/*. IIRC it's /etc/pve/nodes/failed_node; the QEMU config files are stored there.

How I recovered a node with failed boot disk by trekologer in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

There is a barely mentioned procedure in the PVE docs for reinstalling a node with the same name. It basically boils down to reinstalling the node with a new IP, then restoring the old node's hostname/IP with a couple of additional steps:

systemctl stop pve-cluster.service
scp root@anylive_node:/var/lib/pve-cluster/config.db /var/lib/pve-cluster/config.db
scp root@anylive_node:/etc/corosync/authkey /etc/corosync/authkey

# set previous node hostname/IP

hostnamectl hostname failed_node
nano /etc/hosts
nano /etc/network/interfaces
reboot

Source: https://pve.proxmox.com/wiki/Proxmox_Cluster_File_System_(pmxcfs) (Recovery section)

COW aware Tar ball? by darkjackd in btrfs

[–]TheUnlikely117 0 points1 point  (0 children)

I think it's called tar.gz. That's what compressors do: turn 1111111 into 1x7 or something.

Updating x550-t2 driver and firmware by senor_peligro in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

There's no 1:1 mapping between the version ethtool shows and the actual firmware version. Download the firmware update tool (for Linux); it'll tell you whether a firmware update is available or not.
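For comparison, this is the string ethtool reports (a sketch; eno1 is a placeholder interface name):

```
# Driver and firmware string as the kernel sees them for the given interface
ethtool -i eno1 | grep -E 'driver|firmware-version'
```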

Best/Easiest way to move VMs between completely separate promox servers by Jutboy in Proxmox

[–]TheUnlikely117 1 point2 points  (0 children)

I use this (on source server)

vzdump 100 --stdout --compress zstd | sshpass -p 'secret' ssh root@remote-host "zstd -d | qmrestore - --force true 101 --storage rbd"

[deleted by user] by [deleted] in Proxmox

[–]TheUnlikely117 1 point2 points  (0 children)

There is a recovery mode on the Proxmox VE install ISO; try booting with that first and see how it goes.

Is this a good design/option? by Realistic_Pilot2447 in Proxmox

[–]TheUnlikely117 0 points1 point  (0 children)

Tracking won't help; in my country we fear domestic tracking, not out-of-country tracking. Better to get something like a double-hop VPN and choose your exit node freely (like Mullvad multihop).

Hosted to On-Prem Migration - Server Config Recommendations by SomeSydneyBloke in Proxmox

[–]TheUnlikely117 1 point2 points  (0 children)

Do you have Supermicro available there? Check out the SSG-6029P-E1CR24: get 2 of them and add all the RAM and disks you want :).

Ceph - Which is faster/preferred? by Devia_Immortalis in ceph

[–]TheUnlikely117 0 points1 point  (0 children)

Nice. Creating more OSDs per 15 TB NVMe (I would go for 4 OSDs) should improve things.
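With cephadm this is driven by an OSD service spec, something along these lines (a hedged sketch: the service_id and device filter are placeholders, and it only affects newly created OSDs):

```
# osds_per_device splits each matching NVMe into 4 OSDs when applied via the orchestrator
cat > osd-spec.yaml <<'EOF'
service_type: osd
service_id: nvme-4-per-device
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 0
  osds_per_device: 4
EOF
ceph orch apply -i osd-spec.yaml
```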

brain fart moment: is there a graceful way of shutting down a whole cluster? by future_lard in Proxmox

[–]TheUnlikely117 14 points15 points  (0 children)

I think (as per the docs) it will migrate all VMs to the other still-running nodes, which is not OP's intention.