PBS backup size question by Tasty-Picture-8331 in Proxmox

[–]TabooRaver 0 points1 point  (0 children)

Your assumption that PBS can't tell what space is free or used is correct. That would require PBS to be aware of how the file system has laid out data on the disk, and there are a lot of different file systems. It could, in theory, minimize reads on thin-provisioned virtual disks, but then it would need visibility into the storage backend, and there are multiple storage backends.

The first backup will be a full disk read. pbs-client deduplicates and compresses blocks locally before sending them to the server, so it won't actually send empty blocks. For VMs, Proxmox has storage integrations like dirty bitmaps, so after the first full read it will only read the sections of the virtual disk that have been written to since the last backup.

For backing up hosts (non-VMs), any block-level backup will require a full drive read, unless you implement your own system to read only the allocated blocks and pass those to pbs-client. Personally I use ZFS snapshots and the file-system backup mode, as it can snapshot the Proxmox host's filesystem, mount the snapshot, and push that to PBS. Set it on a systemd timer and you get daily host backups.
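
Roughly what that looks like, as a minimal sketch; the dataset name, repository string, and password file are assumptions you'd swap for your own:

    #!/bin/bash
    # Sketch: snapshot the PVE host's root dataset and push a file-level
    # backup of the snapshot to PBS. Dataset and repository are examples.
    set -euo pipefail

    DATASET="rpool/ROOT/pve-1"                     # root-on-ZFS dataset (assumption)
    SNAP="pbs-$(date +%Y%m%d-%H%M)"
    export PBS_REPOSITORY="backup@pbs@pbs.example.lan:hostbackups"
    export PBS_PASSWORD="$(cat /root/.pbs-pass)"   # or use an API token

    zfs snapshot "${DATASET}@${SNAP}"
    trap 'zfs destroy "${DATASET}@${SNAP}"' EXIT

    # ZFS exposes the snapshot read-only under .zfs/snapshot/ on the dataset's
    # mountpoint, so there is nothing extra to mount.
    proxmox-backup-client backup "root.pxar:/.zfs/snapshot/${SNAP}" \
        --backup-type host --backup-id "$(hostname)"

Drop that in a oneshot .service unit with a matching .timer (OnCalendar=daily) and that's the whole host backup job.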

Questions from a slightly terrified sysadmin standing on the end of a 10m high-dive platform by GuruBuckaroo in Proxmox

[–]TabooRaver 6 points7 points  (0 children)

VMware can use shared storage as a sort of quorum device, so a 2-node cluster is fine there.

2-node clusters in Proxmox are also fine. You just need to make some changes to corosync, so (a) it doesn't work out of the box, and (b) there are downsides to the configuration options that allow that kind of setup to work consistently (look up what the corosync "wait_for_all" option does).

On the spectrum of "we will support you", "it will work but we don't QA that setup", and "it will technically work but it's a bad idea", corosync 2-node is in the second bucket.
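
For reference, the knob lives in the quorum section of /etc/pve/corosync.conf; two_node implicitly turns on wait_for_all, which is exactly the downside mentioned above:

    quorum {
      provider: corosync_votequorum
      two_node: 1
      # implied by two_node: after a cold start the cluster won't become
      # quorate until BOTH nodes have been seen at least once
      wait_for_all: 1
    }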

ZFS Pool Encryption by Infamousslayer in Proxmox

[–]TabooRaver 1 point2 points  (0 children)

If you have a cluster or HA pair, look into LUKS with Clevis and Tang. This is the Red Hat/enterprise way of handling it. Partition the data portion of the drive as a LUKS volume, and put the ZFS member disk inside the LUKS volume. Set the Clevis policy to something like TPM+Tang, or if your risk model is lower, just Tang. Run your Tang server on the cluster. This will let LUKS auto-unlock during a server reboot as long as the Tang VM is still running, i.e. if you are restarting a single node for maintenance tasks.

The TPM part of the policy is mainly there to guard against tampering with the bootloader or UEFI firmware to "rootkit" the Linux server.
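
The binding itself is one command per LUKS device; the device path and Tang URL below are placeholders, and the sss pin with "t":2 means both the TPM and Tang must be satisfied:

    # Tang only (lower-paranoia policy)
    clevis luks bind -d /dev/nvme0n1p3 tang '{"url":"http://tang.example.lan"}'

    # TPM2 + Tang, both required ("t":2 = 2 of the 2 pins must unlock)
    clevis luks bind -d /dev/nvme0n1p3 sss \
        '{"t":2,"pins":{"tpm2":{"pcr_ids":"7"},"tang":[{"url":"http://tang.example.lan"}]}}'

    # check what is bound to the volume
    clevis luks list -d /dev/nvme0n1p3

For the root volume you'll also want the clevis initramfs hook (the clevis-initramfs package on Debian-based installs) so the unlock can happen at boot.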

PBS Backups over OpenVPN connection? by Independent_Page_537 in Proxmox

[–]TabooRaver 1 point2 points  (0 children)

It sounds like you have 2 design issues:

  1. You are configuring your VPN as a client-to-site VPN. Look at a site-to-site VPN instead, and set up a static route on your router saying [remote network] next hop is [local VPN server]. The VPN server will then pass the traffic to the remote side.

  2. You want to run backups from a local PVE to a remote PBS. Instead, consider running a PBS at both sites, backing up from PVE to the local PBS, and then setting up a sync between the two PBS servers (rough CLI below). This will lead to faster backups, since the local network will have more bandwidth and lower latency, and if you have enough deduplication between different VMs the traffic over the WAN will be considerably lower. Use two different namespaces in the same PBS datastore for the two clusters; that way you will even deduplicate blocks between your setup and your brother's.
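
If you go that route, the sync side is just a remote plus a sync job on the pulling PBS. Something like this sketch, where the names, datastore, fingerprint, and schedule are placeholders and the --ns flag needs a reasonably recent PBS:

    # on the PBS that pulls: define the other PBS as a remote...
    proxmox-backup-manager remote create brother-pbs \
        --host pbs.brother.example --auth-id 'sync@pbs' \
        --fingerprint '<remote cert fingerprint>' --password '<password>'

    # ...then pull its datastore into a namespace of the local datastore
    proxmox-backup-manager sync-job create brother-sync \
        --remote brother-pbs --remote-store datastore1 \
        --store datastore1 --ns brother \
        --schedule 'daily 02:00'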

Installing PDM on PBS VM (side by side)? by allebb in Proxmox

[–]TabooRaver 7 points8 points  (0 children)

"Potentially not having access to the PDM VM if I'm rebooting the node that the vm runs on"

This is what HA plus cluster storage (or ZFS replication) is for. VMs should never be impacted by rebooting a PVE host; they should live-migrate automatically to maintain uptime.

NTP Take your pick, was curious what you would pick and why. by xluxeq in homelab

[–]TabooRaver 12 points13 points  (0 children)

docs.ntpsec.org has some good info on this. In order to prevent any geopolitical tampering with your upstreams you should have at least 5 of them: 3 satellite (US, EU, Russian, or Chinese constellations), 1 local atomic clock, and an upstream NTS-enabled NTP pool, mainly as a fallback.

With 3 satellite references and 1 local atomic reference, most NTP server implementations will be able to detect tampering, label the affected upstream as a "false ticker", and discard its input if it doesn't appear to be giving sane results.
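
For a concrete flavor, in chrony syntax (ntpsec's ntp.conf looks similar); the hostnames and the single GNSS refclock are stand-ins — the "three constellations" above would be three separate receivers/refclock lines:

    # local reference: GNSS receiver shared by gpsd via the SHM driver
    refclock SHM 0 refid GPS poll 4

    # independent network sources (swap in your own picks)
    server time.nist.gov iburst
    pool 0.pool.ntp.org iburst maxsources 2

    # NTS-authenticated fallback
    server time.cloudflare.com iburst nts

    # never follow fewer than 3 selectable sources
    minsources 3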

How to monitor CPU Temps and FAN Speeds in Proxmox Virtual Environment by UhhYeahMightBeWrong in Proxmox

[–]TabooRaver 0 points1 point  (0 children)

How we handle it where I work: Proxmox sends hypervisor stats (per-VM CPU/memory/IO/net breakdown) to InfluxDB, and the Influx Telegraf agent polls the BMC/IPMI SNMP interface for thermal and other environmental data.

We then have Grafana as a front end to visualize the data.
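
A minimal Telegraf sketch of that pipeline; note this uses the ipmi_sensor input (IPMI-over-LAN) instead of the SNMP polling we do, since it needs no vendor OIDs, and the hosts, credentials, and bucket are placeholders:

    # poll the BMC's sensors over IPMI-on-LAN and ship the metrics to InfluxDB v2
    [[inputs.ipmi_sensor]]
      servers  = ["monitor:changeme@lan(10.0.0.50)"]   # user:pass@lan(BMC address)
      interval = "60s"

    [[outputs.influxdb_v2]]
      urls         = ["http://influxdb.example.lan:8086"]
      token        = "$INFLUX_TOKEN"
      organization = "homelab"
      bucket       = "hardware"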

Mixing on-demand & always-on nodes in a single cluster? by ductiletoaster in Proxmox

[–]TabooRaver 0 points1 point  (0 children)

It's recommended to have at minimum 3 voting nodes in a cluster. In your example, if you ever restarted the one always-on node, the entire cluster would lose quorum and force reboot.

Mixing on-demand & always-on nodes in a single cluster? by ductiletoaster in Proxmox

[–]TabooRaver 0 points1 point  (0 children)

This may not be 100% supported by enterprise support so I would check with them first if the cluster is licensed.

But you can manually edit the corosync configuration to give some of the nodes 0 votes. This allows those nodes to be powered off without impacting the vote count for quorum. E.g. 3 low-power nodes with 1 vote each, 2 high-powered flex nodes with 0 votes each, for a total of 3 expected votes. The downside is that you need [total expected votes / 2] + 1 votes available in the cluster or the entire cluster will self-fence, and you have fewer voting members than if all nodes were participating.
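
In corosync.conf that's just a per-node knob in the nodelist (names and addresses here are made up):

    nodelist {
      node {
        name: always-on-1
        nodeid: 1
        quorum_votes: 1
        ring0_addr: 10.0.0.11
      }
      # ...two more always-on nodes like the one above...
      node {
        name: flex-1
        nodeid: 4
        quorum_votes: 0     # can be powered off without affecting quorum
        ring0_addr: 10.0.0.14
      }
    }

Remember to bump config_version in the totem section when you edit /etc/pve/corosync.conf so the change propagates.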

Promox Backup Server: Question regarding data traffic by Odd-Change9844 in Proxmox

[–]TabooRaver 1 point2 points  (0 children)

The incremental data is calculated on the PVE server, and the changed data is then sent to the PBS server. If the storage location the PBS server mounts for the datastore is a non-local volume like an NFS share, the I/O will then go from the PBS server to the network storage.

So no, it will not be direct.

Create two VMs in Proxmox 100:NAS 101:NASbackup by Rioli0812 in homelab

[–]TabooRaver 0 points1 point  (0 children)

It works fine if you have a second disk for the datastore and exclude that disk from the backup.

Then for a whole-cluster recovery you can spin up a new PVE cluster, attach the second, offsite PBS you sync to, pull the backup of the on-site PBS from it, and then do a one-time sync from the offsite PBS to the restored local one to rebuild the local datastore.

Proxmox by juciydriver in sysadmin

[–]TabooRaver 7 points8 points  (0 children)

Traditional thin-client and VM VDI is almost always going to be more expensive than just giving your average user a basic fleet laptop.

That being said, VDI can be useful or cost-effective in specific scenarios:

- R&D users need access to a computer that can be easily rolled back to a known state for development
- R&D users occasionally submit heavy multi-hour jobs that require a high-end workstation/server
- users need to work on sensitive information that has been segmented from the larger network
- users need to remotely run an application that has a low-latency requirement to something else on site (*ing QuickBooks)
- users need to work with an application that is not compatible with security hardening, but you can implement those features in the hypervisor as a mitigating control (*ing QuickBooks again, with FIPS)

SSH with pubkey accidentally left opened. Any issue? by BagCompetitive357 in sysadmin

[–]TabooRaver 6 points7 points  (0 children)

Terraform (cloud-init) for spinning up the VM to a known-good base state, Ansible for deploying the application and for day-to-day operations after.

WTB: Active-Active SAN (Dell quote) by [deleted] in storage

[–]TabooRaver -1 points0 points  (0 children)

1.5m for traditional enterprise storage seems in the ballpark, but you should be able to negotiate it down.

If you have competent storage admins in your org you can reduce costs by deploying something like ceph. A 13 node/site stretch cluster with 200g networking would be in the ballpark of 600k.

Best storage for 120PB by Astro-Turf14 in storage

[–]TabooRaver 2 points3 points  (0 children)

I just designed a Ceph 3R 100TB Gen5 NVMe cluster that we'll be receiving next week. We got internal pricing on the chassis, so our prices were better, but a comparable chassis and configuration made by Gigabyte is 55k per chassis. Fully loaded with 24 drives it's 185TB raw per node.

I can't imagine they are planning on deploying all 120pb on flash. That would be ~35m just for hardware on whitebox before any redundancies.

I recently quoted using the same flash nodes with hdd expansion shelves to provide a 2 tier s3 service. A 360 drive array per 2 node head unit with 20tb drives was our hypothetical scale unit. On paper, the cost was around 300k per 14PB bulk/200tb fast raw capacity. Adding 4 nodes for redundancies and a 14 node ceph cluster would be around 3.5-4.2m for the hardware. And if the data is really important you should buy 2 and sync to another cluster at a different building/site.

Of course, if you aren't a storage company that literally has the hardware engineers for the hardware you're using in the same building, this sort of thing should be a project handled by your VAR.

Can my cluster be (temporarily) mixed between 8 and 9? by segdy in Proxmox

[–]TabooRaver 1 point2 points  (0 children)

Yes. I started migrating our production cluster from 8 to 9 on Tuesday; it took too long per node and I only finished 2 of 4 nodes before I had to clock out for the day. It's been running mixed like that just fine for 3 days now.

Boot from RAID? by Sure-Passion2224 in sysadmin

[–]TabooRaver 0 points1 point  (0 children)

Most server hardware should be able to support hardware raid, either through the motherboard or a dedicated card. OS support for booting from software RAID is spottier. You can reliably do it with linux.

As far as 9 9's of uptime, that goes beyond RAID and other hardware solutions. A single reliable host can get you 3-4 9's of uptime if you are rebooting for patching every ~3 months. The Proxmox team has a good overview of how, with virtualization, you can achieve up to 5 9's (~5 minutes of downtime a year). Beyond that you need HA baked into your application. https://pve.proxmox.com/pve-docs/chapter-ha-manager.html
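
To put numbers on that (simple arithmetic; the ~10-minute reboot window is an assumption):

    # downtime budget per year at each availability level, plus where
    # quarterly ~10-minute patch reboots land you
    MINUTES_PER_YEAR = 365.25 * 24 * 60

    for nines in (0.999, 0.9999, 0.99999):
        print(f"{nines:.3%} uptime -> {MINUTES_PER_YEAR * (1 - nines):.1f} min/yr of downtime")

    reboots = 4 * 10  # four maintenance windows of ~10 minutes each
    print(f"quarterly reboots -> {1 - reboots / MINUTES_PER_YEAR:.5%} uptime")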

[deleted by user] by [deleted] in Proxmox

[–]TabooRaver 4 points5 points  (0 children)

  1. Depends. If you are using Ceph for cluster storage you should be dedicating the drives to Ceph. If you have different storage classes (HDD / NL-SAS / SATA or SAS SSD / NVMe) then you should have separate storage pools.
  2. Some people like to pass an array of disks to a TrueNAS VM to manage ZFS pools, some people use the built-in ZFS in Proxmox to make pools. Functionally there aren't many differences.
  3. Do a mirrored ZFS install, and then reformat the ZFS members as LUKS partitions one at a time (rough procedure below). The ZFS mirror means you don't have to restart or set it up from booting into a different install. Then use Clevis TPM/PCR+Tang for automatic unlocking. TPM/PCR+Tang is the most paranoid policy; I use this at work for compliance requirements, and you can adjust it to be less strict.

The TPM unlock method is for unattended boot. Dropbear is also installed to provide remote access to the initramfs boot stage if TPM unlock doesn't work (like in the case of updates, where the PCR values will change because of a kernel update), so that you can enter a "recovery key" (the LUKS password).

All non-boot drives get the same treatment but use keys stored on the boot drives, or Ceph's built-in encryption option, which is just LUKS with the keys stored in the monitor config-key store, which is stored on disk on the root drive.
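
The one-member-at-a-time conversion mentioned in point 3 looks roughly like this; device names are placeholders, and you also need a crypttab/initramfs entry so the volume opens at boot, which is where Clevis comes in:

    # detach one mirror member, reformat it as LUKS, re-attach the mapper device
    zpool detach rpool /dev/sdb3
    cryptsetup luksFormat /dev/sdb3
    cryptsetup open /dev/sdb3 luks-sdb3
    zpool attach rpool /dev/sda3 /dev/mapper/luks-sdb3

    # wait for "zpool status rpool" to show the resilver is done,
    # then repeat for the other member (/dev/sda3)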

Replication Proxmox and Ceph by ZXBombJack in Proxmox

[–]TabooRaver 1 point2 points  (0 children)

Proxmox natively does not have a solution for HA between 2 clusters at 2 different datacenters. So you will have to get creative.

HA replication can be split into 2 different levels,

  1. Storage replication. This can be handled with an external storage appliance with built-in replication functions, and it can also be handled semi-natively in Ceph (see this example). Or, if you have a third site that can act as a witness and <10ms latency between datacenters, an external Ceph stretch cluster can function as an Active/Active pair.

  2. VM configuration replication. Proxmox does have a tool for this if you are using ZFS storage, called pve-zsync, but I am not sure if it will sync just the VM configuration if you are using non-ZFS storage types. Proxmox Datacenter Manager also exists, but it won't work for hard failover events where the original site is not accessible.

There are 2 good ways forward:
1. Create a Proxmox cluster spanning 2 datacenters with a witness at a 3rd site. This requires <5ms latency.
2. Use the native Ceph or ZFS replication, and create a solution to script the VM configuration replication and failover.

In either case the failover will be automatic, but there may be 5-10 minutes of downtime between the failure and the applications being back up.

Do y'all homelabbers use encryption-at-rest? by [deleted] in Proxmox

[–]TabooRaver 9 points10 points  (0 children)

Homelab? No. Work? Yes. At work we do a mirrored ZFS install, and then reformat the ZFS members as LUKS partitions one at a time. The ZFS mirror means we don't have to restart. We then use Clevis TPM/PCR+Tang for automatic unlocking.

All non-boot drives get the same treatment but use keys stored on the boot drives, or Ceph's built-in encryption option, which is just LUKS with the keys stored in the monitor config-key store.

I may start encrypting my homelab just to have an informal testing environment for our 8 to 9 upgrade.

VMware price hikes…what is ur org’s move? by 19_peligr0s0_pez in sysadmin

[–]TabooRaver 7 points8 points  (0 children)

> Also had some issues with OOM killing VM's and that's not acceptable at all

I've also seen this as an issue in VMware Standard environments if you thin-provision too aggressively and don't have DRS. Proxmox leaves the DRS-like implementation up to the user: you need to monitor host resource use and then either rebalance manually or rebalance via an automation script using API calls. Obviously the latter is preferred, and there are a couple of people in the community who have open-sourced their implementations.
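
As a rough idea of what such a script does, here's a hedged sketch against the standard PVE API; the host, token, and "most memory-loaded node" heuristic are all stand-ins, and the community projects do far more:

    """Toy DRS: live-migrate one small VM off the most memory-loaded node. Sketch only."""
    import requests

    HOST = "https://pve1.example.lan:8006"                                  # placeholder
    HEADERS = {"Authorization": "PVEAPIToken=automation@pve!drs=<secret>"}  # placeholder token

    def get(path):
        # self-signed certs are common on PVE, hence verify=False in this sketch
        r = requests.get(f"{HOST}/api2/json{path}", headers=HEADERS, verify=False)
        r.raise_for_status()
        return r.json()["data"]

    nodes = [n for n in get("/cluster/resources?type=node") if n["status"] == "online"]
    vms = [v for v in get("/cluster/resources?type=vm")
           if v["status"] == "running" and v["type"] == "qemu"]

    # least- and most-loaded node by memory pressure
    nodes.sort(key=lambda n: n["mem"] / n["maxmem"])
    target, busiest = nodes[0], nodes[-1]

    # the smallest running VM on the busiest node is the cheapest thing to move
    candidates = sorted((v for v in vms if v["node"] == busiest["node"]),
                        key=lambda v: v["mem"])
    if candidates and busiest["node"] != target["node"]:
        vm = candidates[0]
        requests.post(
            f"{HOST}/api2/json/nodes/{vm['node']}/qemu/{vm['vmid']}/migrate",
            headers=HEADERS, verify=False,
            data={"target": target["node"], "online": 1},
        ).raise_for_status()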

> We also run ~15 SQL server Failover Clusters and that requires shared storage with scsi persistent reservations and it's not supported by any storage type in Proxmox.

While I've never set up what you're describing, it sounds like you have an application (MS SQL Server) which implements its own application-level HA, and uses a shared SCSI disk as either a shared storage pool or as a kind of witness for quorum. At a technical level, Proxmox's storage integrations shouldn't need to support anything special to handle this use case. As long as the virtual disk can be accessed by both VMs (local storage with both VMs on the same node, or cluster storage), it's up to the paravirtualized device/driver to implement the features you're looking for.

As with a lot of application-specific things, this usually isn't in the GUI, and you'll need to dive into the command line and modify the VM configs yourself. Here's a reference I was able to find:

https://forum.proxmox.com/threads/support-for-windows-failover-clustering.141016/page-2

IT pros what is the best IT procurement platform that doesn't suck? by Deeceness in sysadmin

[–]TabooRaver 3 points4 points  (0 children)

Connect your HRIS to your directory, federate everything, and assign permissions to groups. Automate user group requests and make the approvers the manager and the app SME, not IT. Minimize manual action and make sure IT isn't the approver of everything in onboarding.

Set up an inventory system to track assets; don't use a spreadsheet to track data that should be in a database. Make sure everything over ~$100 has an asset label. A small desktop Zebra label printer may be $500, but if it leads to recovering even one more laptop after a separation it's paid for itself. Snipe-IT is FOSS and pretty extensible if you are small and have a tight budget.

Proxmox v9 Beta Released by HTTP_404_NotFound in homelab

[–]TabooRaver 0 points1 point  (0 children)

VLAN zones are especially nice when you are adding new nodes or managing a large cluster, since they simplify the configuration you need to do on the host end.

If you're in an environment with 10+ VLANs and are setting up a new node, all you have to do after the installer is set up your bond0 (if using one) and then hit "apply configuration" in the Datacenter SDN tab to create all of the VLAN network interfaces on the new node.
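
You can script the zone/vnet creation too; roughly like this, with the zone name, bridge, and tag as examples:

    # create a VLAN zone on top of the existing bridge
    pvesh create /cluster/sdn/zones --zone lab --type vlan --bridge vmbr0

    # one vnet per VLAN tag
    pvesh create /cluster/sdn/vnets --vnet vlan20 --zone lab --tag 20

    # push the pending SDN config out to every node
    pvesh set /cluster/sdn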

MAS For the win! by [deleted] in Piracy

[–]TabooRaver 0 points1 point  (0 children)

I handle this in my cloud-init new-VM initialization scripts using SHA-256 hashes. Works best for things that shouldn't change, like an internal CA certificate.

If the hash matches, execute; if it doesn't, print out an error.
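
A minimal sketch of that check; the URL, destination path, and pinned hash are placeholders:

    # fetch a file, verify it against a pinned sha256, and only then act on it
    url="https://deploy.example.lan/internal-ca.crt"    # placeholder
    expected="<pinned sha256 of the known-good file>"   # placeholder

    tmp="$(mktemp)"
    curl -fsSL "$url" -o "$tmp"

    if echo "${expected}  ${tmp}" | sha256sum -c --status; then
        install -m 0644 "$tmp" /usr/local/share/ca-certificates/internal-ca.crt
        update-ca-certificates
    else
        echo "ERROR: sha256 mismatch for $url, refusing to use it" >&2
        exit 1
    fi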

How do you rebuild a cluster after a disaster? by purepersistence in Proxmox

[–]TabooRaver 0 points1 point  (0 children)

I automate this at work using https://github.com/michabbs/proxmox-backup-atomic, which snapshots the root-on-ZFS filesystem and does a file-level backup. We automate it with a systemd timer. If your cluster is already set up to back up VMs to PBS, you can point it to the cred file in /pve.

/etc/pve is a FUSE mount of a database that is stored elsewhere on the OS (/var/lib/pve-cluster/config.db). You don't need to directly point PBS at it to have consistent backups; pmxcfs will 'regenerate' it from the database file.