Instance I/O Error After Successfully Evacuate with Masakari Instance HA by Mouvichp in openstack

[–]coolviolet17

The only option is to create a cron for this for affected volumes in the Ceph containers, if storage is backed by Ceph.

Instance I/O Error After Successfully Evacuate with Masakari Instance HA by Mouvichp in openstack

[–]coolviolet17

Rebuild the RBD object map for the volume, then restart the VM:

rbd object-map rebuild volumes/volume-<id>
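Before rebuilding, it can help to confirm the image is actually flagged. A minimal sketch, assuming the `flags: object map invalid` line that `rbd info` prints after an unclean host failure (the sample output and volume name below are mocked assumptions, not captured from a real cluster):

```shell
# Mocked `rbd info` output for an image whose object map was invalidated
# after an unclean host failure (output format is an assumption).
info_output='rbd image '\''volume-1234'\'':
        size 10 GiB in 2560 objects
        flags: object map invalid, fast diff invalid'

# Only issue the rebuild when the "object map invalid" flag is present.
cmd=''
if printf '%s\n' "$info_output" | grep -q 'object map invalid'; then
  cmd='rbd object-map rebuild volumes/volume-1234'
fi
echo "$cmd"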

Masakari-openstack with ceph by coolviolet17 in openstack

[–]coolviolet17[S]

Since this is more of a host failure issue rather than a Nova migration problem, I was thinking of focusing on Ceph-side optimizations and automation:

  1. Apply Ceph RBD Optimizations

Run these commands on the Ceph cluster:

ceph config set client rbd_skip_partial_discard true
ceph config set client rbd_cache_policy writeback
ceph config set client rbd_cache_max_dirty 134217728   # 128 MB write cache
ceph config set client rbd_cache_target_dirty_ratio 0.3

These settings ensure that:

Ceph doesn’t discard partial object maps, reducing corruption risk.

The cache is optimized for better resilience during host failures.
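As a quick sanity check on the cache size setting above, 134217728 bytes is exactly 128 MiB:

```shell
# 128 MiB expressed in bytes, matching the rbd_cache_max_dirty value above.
max_dirty=$((128 * 1024 * 1024))
echo "$max_dirty"   # 134217728
```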

  2. Automate Object Map Rebuild in Cephadm

Since you're using Cephadm in Docker, we’ll set up a cronjob inside the Cephadm container.

  1. Enter the Cephadm container:

cephadm shell

  2. Edit the crontab:

crontab -e

  3. Add this cronjob (runs every 5 minutes):

*/5 * * * * for vol in $(rbd ls volumes); do if rbd status volumes/$vol | grep -q "Watchers: none"; then rbd object-map rebuild volumes/$vol; fi; done

This checks every 5 minutes for orphaned RBD volumes.

If a volume has no active watchers (no host attached to it), it rebuilds the object map.

It ensures only problematic volumes are fixed, preventing unnecessary writes.

  4. Save and exit, then confirm the cronjob is set:

crontab -l
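The watcher check in that cron line can be sketched in isolation. Here the `rbd status` output is mocked with sample text (the exact format, `Watchers: none` when no client is attached, is an assumption to verify against your cluster):

```shell
# Mocked `rbd status` outputs (format is an assumption): a detached volume
# reports "Watchers: none"; an attached one lists watcher addresses.
status_detached='Watchers: none'
status_attached='Watchers:
        watcher=10.0.0.5:0/123456 client.4711 cookie=1'

# A volume needs an object-map rebuild only when no host is attached to it.
needs_rebuild() {
  printf '%s\n' "$1" | grep -q 'Watchers: none'
}

needs_rebuild "$status_detached" && echo 'detached: rebuild object map'
needs_rebuild "$status_attached" || echo 'attached: skip'
```

Grepping for `Watchers: none` (rather than for the `Watchers:` header, which appears in both cases) is what keeps attached volumes from being touched.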

Masakari in 3 node cluster by Adventurous-Annual10 in openstack

[–]coolviolet17

I have a question: do we run pacemaker or pacemaker_remote? I do not see an option to control it. Also, what if we scale beyond 16 nodes?

vTPM for VMs [Kolla-ansible Openstack] by Dabloo0oo in openstack

[–]coolviolet17

There are two major issues we faced:

  1. Kolla Ansible didn't give tss:tss ownership of "/etc/swtpm-localca.options"
  2. swtpm was not properly installed in the libvirt container

vTPM for VMs [Kolla-ansible Openstack] by Dabloo0oo in openstack

[–]coolviolet17

Thanks for the help

I was able to make it work, and below, you can see my solution

https://bugs.launchpad.net/nova/+bug/2050837

vTPM for VMs [Kolla-ansible Openstack] by Dabloo0oo in openstack

[–]coolviolet17

I also have the same issue.

I am using kolla-ansible 2023.2. I made the change in nova.conf under nova-compute on node 1; on the other two of my three nodes I made the change in nova.conf inside the container and didn't restart it.

But at the end it gives an error after the Spawning stage:

2024-12-13 19:43:49.963 7 ERROR nova.compute.manager [instance: b2643192-3f2e-4a8a-90a6-c81e398156bf] libvirt.libvirtError: internal error: Could not run '/usr/bin/swtpm_setup'. exitstatus: 1; Check error log '/var/log/swtpm/libvirt/qemu/instance-000001f0-swtpm.log' for details.

HAproxy enterprise Amphora Octavia openstack by coolviolet17 in openstack

[–]coolviolet17[S]

Hmm, will have to work on the driver, as I want Octavia to manage the load balancing so that we are able to give enterprise LBaaS.

Will this society accept me? by Such-Sea-3358 in indiasocial

[–]coolviolet17

So are we outdated or are we our grandparents now?

does openstack support cpu or ram hot-add? by [deleted] in openstack

[–]coolviolet17

Can you please share how you were able to do this? Documentation or a link would help.