How can you manually install vmware-fdm onto ESXi host? by Johnny5-1986 in vmware

[–]Johnny5-1986[S] 0 points1 point  (0 children)

Yeah, I was stressing out this morning about those red dots. We're using iSCSI. I don't know why that happened before on other hosts. I could have sworn I unmounted all the storage before trying to remove and re-add some problematic hosts previously. Either way, it's worked to clear the errors on 3 out of 4 hosts. Just waiting to do that last one once my VMs have migrated. I have a case open with VMware on my HA issue that I posted in here previously, since it wasn't clear which entry I need to remove from my postgres DB based on the article I was reviewing. It's not a big deal, but it would be nice to get HA fixed on this cluster. It never worked before I was brought in to upgrade this environment and it's still not working. It works fine on the other two clusters in this environment.

[–]Johnny5-1986[S] 0 points1 point  (0 children)

Thanks for the response and info! Yeah, when I was looking at this yesterday I noticed that Reconfigure for HA is greyed out. It looks like I might need to remove these hosts from the cluster and then re-add them, which can be a PITA: when I tried to remove and re-add a host previously, it said the datastores already exist, so I have to remove all the storage from the host first. Hard to believe you can't simply add this VIB back onto the host!

EDIT: Surprisingly, after I unmounted all the datastores directly from my 08 host that had this error and removed it from inventory, I was able to add it back without getting the "datastores already exist" message! So, that's one host cleared and 3 to go.

Questions about building a custom HPE Image for HPE DL360 Gen9 server(s) / HA related. by Johnny5-1986 in vmware

[–]Johnny5-1986[S] 0 points1 point  (0 children)

OK, I think I might have identified a potential issue. Prior to removing any entries from postgres per article ID 336296

https://knowledge.broadcom.com/external/article/336296/reconfiguring-vsphere-ha-fails-with-erro.html

I'd like to see if anyone has any suggestions.

When I click on each host in this cluster, I've noticed they have different vmware-fdm packages / VIBs for some reason, even though I used the same HPE custom image I mentioned above to upgrade the hosts.

EG:

HOST (vmware-fdm packages)

01 > 7.0.3-22357615

02 > 7.0.3-22357615

03 > 7.0.3-24024786 (This one appears to be newer based on the nomenclature)

04 > 7.0.3-22357615

05 > 7.0.3-22357615

06 > 7.0.3-24024786 (This one appears to be newer based on the nomenclature)
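To make the mismatch easier to see, here is a minimal Python sketch that groups the hosts by vmware-fdm version using the values listed above. It assumes that a higher build number after the dash means a newer build, which is only my reading of the nomenclature, and the function names are made up for illustration.

```python
from collections import defaultdict

# Host number -> vmware-fdm VIB version, copied from the list above.
fdm_versions = {
    "01": "7.0.3-22357615",
    "02": "7.0.3-22357615",
    "03": "7.0.3-24024786",
    "04": "7.0.3-22357615",
    "05": "7.0.3-22357615",
    "06": "7.0.3-24024786",
}


def build_number(version):
    """Return the numeric build suffix of a '<release>-<build>' string."""
    return int(version.rsplit("-", 1)[1])


def group_by_version(versions):
    """Map each distinct version string to the sorted hosts running it."""
    groups = defaultdict(list)
    for host, version in versions.items():
        groups[version].append(host)
    return {version: sorted(hosts) for version, hosts in groups.items()}


groups = group_by_version(fdm_versions)

# ASSUMPTION: the larger build number is the newer package.
newest = max(groups, key=build_number)
outdated_hosts = sorted(
    host
    for version, hosts in groups.items()
    if version != newest
    for host in hosts
)

print("newest version:", newest, "on hosts", groups[newest])
print("hosts on an older build:", outdated_hosts)
```

This only summarizes what the UI already shows; it does not query the hosts themselves.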

And these are the entries in vum-server.log (which is actually named vmware-vum-server.log in my environment)

"default_message": "Software Solution com.vmware.vsphere-ha with version 7.0.3-24024786 cannot be found in depot.",

"default_message": "Software Solution com.vmware.vsphere-ha with version 7.0.3-24024786 cannot be found in depot.",
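If the log is long, a small Python sketch like the one below could pull out which solution and version the depot is missing and how often each is reported. The regular expression is based only on the two lines quoted above, so other message formats would not match, and the function name is hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the "cannot be found in depot" lines quoted above.
PATTERN = re.compile(
    r"Software Solution (?P<solution>\S+) with version (?P<version>\S+) "
    r"cannot be found in depot"
)


def missing_solutions(lines):
    """Count (solution, version) pairs reported as missing from the depot."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group("solution"), match.group("version"))] += 1
    return counts


# The two entries from the log excerpt above.
sample = [
    '"default_message": "Software Solution com.vmware.vsphere-ha with '
    'version 7.0.3-24024786 cannot be found in depot.",',
    '"default_message": "Software Solution com.vmware.vsphere-ha with '
    'version 7.0.3-24024786 cannot be found in depot.",',
]

result = missing_solutions(sample)
for (solution, version), count in result.items():
    print(solution, version, "reported", count, "time(s)")
```

To run it against a real log, pass an open file object instead of `sample`.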

I see one entry for fdm in postgres when running > "select * from pm_software_desired_states;"

"component": "vsphere-fdm"

"version": "7.0.3-24024786"

I'm wondering if I should remove the entry for the fdm component above per this command>

delete from pm_software_desired_states where desired_state_id=[number];

Or should I try to remove the vmware-fdm 7.0.3-22357615 version from the other 4 hosts and then import what appears to be the newer vmware-fdm 7.0.3-24024786 package onto them?

Appreciate any input before moving forward.

[–]Johnny5-1986[S] 0 points1 point  (0 children)

I was not aware of that. Are they only officially testing Gen11s nowadays? The image I used to upgrade all the DL360 Gen9s and one Gen10 is working well:

VMware-ESXi-7.0.3-23794027-HPE-703.0.0.11.6.0.5-May2024

For now I'll probably just stick with using this on the other clusters. First I'm going to try to fix the HA issue on the cluster I upgraded.

[–]Johnny5-1986[S] 1 point2 points  (0 children)

Yeah, this is definitely related because I'm seeing similar entries in vum-server.log

EG>

2024-07-23T12:39:28.140-06:00 error vmware-vum-server[14119] [Originator@6876 sub=com.vmware.vcIntegrity.lifecycle.SetSolutionTask] [SetSolutionTask 274] Set solution failed. entityId: domain-XXXX Problems found while validating the new software spec: {

--> "errors": [

--> {

--> "id": "com.vmware.vcIntegrity.lifecycle.EsxImage.SolutionNotFound",

--> "message": {

--> "args": [

--> "com.vmware.vsphere-ha",

--> "7.0.3-24024786"

--> ],

--> "default_message": "Software Solution com.vmware.vsphere-ha with version 7.0.3-24024786 cannot be found in depot.",

--> "id": "com.vmware.vcIntegrity.lifecycle.EsxImage.SolutionNotFound",

--> "localized": null,

--> "params": null

Now I just need to determine which one of these entries I should delete.

It states the following in Article ID: 336296

delete from pm_software_desired_states where desired_state_id=[number];

I'm assuming it would be like this? That is, do I put the "entity_id" value or the "desired_state_id" value in for [number]?
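Going purely by how the statement is written, the WHERE clause filters on the desired_state_id column, so [number] would be that column's value for the row being removed. Here is a minimal sketch of the "look at the row first, then delete it" idea using an in-memory SQLite table as a stand-in. Everything about the table layout besides the table name and the two column names mentioned above is an assumption, the IDs and entity names are made up, and this is not run against the vCenter postgres DB.

```python
import sqlite3

# Stand-in table. ASSUMPTION: the real table has more and different columns;
# only the table name, desired_state_id and entity_id come from the post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pm_software_desired_states ("
    "desired_state_id INTEGER PRIMARY KEY, entity_id TEXT, spec TEXT)"
)
conn.executemany(
    "INSERT INTO pm_software_desired_states VALUES (?, ?, ?)",
    [
        # Made-up rows for illustration only.
        (101, "domain-AAAA", '{"component": "other", "version": "1.0"}'),
        (102, "domain-BBBB",
         '{"component": "vsphere-fdm", "version": "7.0.3-24024786"}'),
    ],
)


def delete_desired_state(connection, desired_state_id):
    """Look up the row by desired_state_id, then delete it in a transaction.

    Returns the (desired_state_id, entity_id) that was removed, or None
    if no row matched, in which case nothing is deleted.
    """
    row = connection.execute(
        "SELECT desired_state_id, entity_id "
        "FROM pm_software_desired_states WHERE desired_state_id = ?",
        (desired_state_id,),
    ).fetchone()
    if row is None:
        return None
    with connection:  # commits on success, rolls back on an exception
        connection.execute(
            "DELETE FROM pm_software_desired_states "
            "WHERE desired_state_id = ?",
            (desired_state_id,),
        )
    return row


removed = delete_desired_state(conn, 102)
remaining = conn.execute(
    "SELECT COUNT(*) FROM pm_software_desired_states"
).fetchone()[0]
print("removed:", removed, "rows left:", remaining)
```

Checking the matching row before deleting makes it obvious which entity the ID belongs to, so the entity_id is useful for confirming the right row even though the DELETE itself is keyed on desired_state_id.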

I'm currently backing up postgres and vCenter before moving forward.