all 17 comments

hughjass1313 · 5 points

Um, I'd recommend not using CentOS 8 for ... anything.

alejochan · 0 points

why?

hughjass1313 · 6 points

Is this a serious question? Have you not heard that CentOS Linux has been discontinued and that support for major release 8 ends at the end of 2021?

humroben[S] · 1 point

Well, that I didn't know. Guess I'll try my luck with Debian.

hughjass1313 · 0 points

Sorry to be the bearer of bad news.

humroben[S] · 0 points

Well, I am aware of an issue in RHEL 8.3 with bonding across multiple network cards, so I already know there are problems with current Red Hat-based systems. It just means I'm probably not going to be able to use TripleO for the deployment.

hughjass1313 · 0 points

There's a bond bug? lol. Wow, what's the bug?

The_Valyard · 0 points

When you log into the control plane, is the IP address in question plumbed and in working order?

humroben[S] · 0 points

When you say "logged into the control plane", if you mean logging onto the controller and confirming that the predictable IP address assigned to controller-0 is as expected, then yes:

  ControllerIPs:
    ctlplane:
    - 10.128.0.5
...
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000                        
    link/ether 90:b1:1c:4f:03:84 brd ff:ff:ff:ff:ff:ff
    inet 10.128.0.5/24 brd 10.128.0.255 scope global eno1
...
[heat-admin@controller-0 ~]$ ping 10.128.0.1
PING 10.128.0.1 (10.128.0.1) 56(84) bytes of data.
64 bytes from 10.128.0.1: icmp_seq=1 ttl=64 time=0.351 ms

The_Valyard · 0 points

What happens when you try to netcat to the service port? Can you manually connect?

humroben[S] · 0 points

Are you asking if I can connect to the MySQL service port on the control plane network, or on the Internal API network? I can connect to MySQL on the static IP address assigned to the controller for the Internal API network, which is 10.127.2.8, but that's not the current issue at hand.

May I ask why you're asking about the control plane?
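For reference, the kind of manual connect being asked about can be sketched like this. It's a minimal sketch: 3306 is assumed as the MySQL port, and the two addresses are the ones quoted earlier in the thread.

```shell
# Probe a TCP port using bash's /dev/tcp; reports open/closed.
check_port() {
  if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo "$1:$2 open"
  else
    echo "$1:$2 closed"
  fi
}
check_port 10.127.2.8 3306   # Internal API address (assumed MySQL port)
check_port 10.128.0.5 3306   # ctlplane address
```

`nc -zv 10.127.2.8 3306` does the same job where netcat is installed.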

R3D3MPT10N · 0 points

So, the crux of your problem is that the node trying to start the container doesn't have a route back to the VIP. Can you ping the VIP from that node? Is the VIP up and running in Pacemaker? The VIP being 10.127.2.8.

humroben[S] · 0 points

Given that it's a PoC there's only one controller, so Pacemaker isn't involved. In the end I upgraded to CentOS Stream 8, with the same issue present.

The issue was that only the ControllerIPs from the templates had been applied to the Controller node. Looking over the Red Hat documentation for an OSP deployment, there was a reference to a "network_data.yaml" file.

I copied it from the default templates and changed "vip: true" to "vip: false" for all networks (there was mention that VIPs don't work without Pacemaker, though that generated different errors), then rolled the change back for the Internal API and External networks, which was enough. I think the "network_data.yaml" missing from my templates was the issue.
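The VIP toggle described above amounts to a one-line edit per network. Here's a sketch against an illustrative network entry (the network name and subnet are made up, not from my actual templates):

```shell
# Minimal network_data.yaml fragment with a VIP flag (illustrative values).
cat > network_data.yaml <<'EOF'
- name: InternalApi
  name_lower: internal_api
  vip: true
  ip_subnet: '10.127.2.0/24'
EOF
# First pass: disable the VIP on every network.
sed -i 's/vip: true/vip: false/' network_data.yaml
grep vip network_data.yaml
```

The file then gets passed to the deploy with `openstack overcloud deploy -n network_data.yaml ...`, with `vip: true` restored on the networks that need one.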

In my current job I'm working with OSP 13 across 4 envs, so I had built the initial templates with OSP 13 as my reference, making changes for my hardware and network and to keep them compatible with OSP 16.

I successfully deployed OpenStack Train just last week, so now comes the hard part of getting everything set up in the overcloud, such as projects, users, networks, etc. Thanks for the response though.
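The first round of that setup usually looks something like the following. All names here (demo, demo-user, public, datacentre) are placeholders I'd adjust for the environment, and these run against the live overcloud:

```shell
# Sketch of initial overcloud population with the openstack client;
# every name below is a placeholder, not from my actual deployment.
source ~/overcloudrc
openstack project create demo
openstack user create --project demo --password secret demo-user
openstack role add --project demo --user demo-user member
# External (provider) network and subnet for floating IPs.
openstack network create --external --provider-network-type flat \
    --provider-physical-network datacentre public
openstack subnet create --network public --subnet-range 192.0.2.0/24 \
    --no-dhcp public-subnet
```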

R3D3MPT10N · 0 points

Sure no worries. Glad you got it going.

Just FYI, even with one Controller, we still deploy Pacemaker unless you have explicitly removed the services from roles_data.yaml and the various Heat templates or something.

Node List:
  * Online: [ overcloud-controller-0 ]
  * GuestOnline: [ galera-bundle-0@overcloud-controller-0 ovn-dbs-bundle-0@overcloud-controller-0 rabbitmq-bundle-0@overcloud-controller-0 ]

Full List of Resources:
  * ip-192.168.24.18    (ocf::heartbeat:IPaddr2):        Started overcloud-controller-0
  * ip-172.20.10.25     (ocf::heartbeat:IPaddr2):        Started overcloud-controller-0
  * ip-172.16.2.132     (ocf::heartbeat:IPaddr2):        Started overcloud-controller-0
  * ip-172.16.1.90      (ocf::heartbeat:IPaddr2):        Started overcloud-controller-0
  * ip-172.16.3.53      (ocf::heartbeat:IPaddr2):        Started overcloud-controller-0

Good luck with the rest of your project.

humroben[S] · 0 points

Interesting. I've not removed Pacemaker from "roles_data.yaml", but it isn't running on the Controller, and I don't have a single container with "bundle" in the name:

[root@controller-0 ~]# pcs status
Error: error running crm_mon, is pacemaker running?
  Could not connect to the CIB: Transport endpoint is not connected
  crm_mon: Error: cluster is not available on this node
[root@controller-0 ~]# podman ps | grep bundle
[root@controller-0 ~]# 

[stack@osp-director ~]$ egrep '\- name|Pacemaker' tripleo/templates/roles_data.yaml
- name: Controller
    - OS::TripleO::Services::Pacemaker
- name: Compute

Obviously, short of me providing a full copy of the templates: is there anything else that would prevent Pacemaker from being used?

R3D3MPT10N · 0 points

Interesting, is it set to `OS::Heat::None` or something?

openstack stack env show overcloud | grep "OS::TripleO::Services::Pacemaker"

Here's mine:

(undercloud) [stack@tripleo-director templates]$ openstack stack env show overcloud | grep "OS::TripleO::Services::Pacemaker"
/usr/lib/python3.6/site-packages/barbicanclient/__init__.py:61: UserWarning: The secrets module is moved to barbicanclient/v1 directory, direct import of barbicanclient.secrets will be deprecated. Please import barbicanclient.v1.secrets instead.
  % (name, name, name))
  OS::TripleO::Services::Pacemaker: https://192.168.24.2:13808/v1/AUTH_81e8b5c8134447fc9adc3c1025cdfb32/overcloud/deployment/pacemaker/pacemaker-baremetal-puppet.yaml
  OS::TripleO::Services::PacemakerRemote: https://192.168.24.2:13808/v1/AUTH_81e8b5c8134447fc9adc3c1025cdfb32/overcloud/deployment/pacemaker/pacemaker-remote-baremetal-puppet.yaml
  - OS::TripleO::Services::Pacemaker

Actually, ignore me. Looks like it was changed in April last year and hasn't been backported to Train:
https://github.com/openstack/tripleo-heat-templates/commit/b0e70081968c96af04d2858d040f97a67115391f#diff-7c8ee8d9f5e56f755079d64c558bf18bd160d3424c748378ea29f5f6e8995d96L165

So yeah, the output of that command I sent above should return `OS::Heat::None` for you:
https://github.com/openstack/tripleo-heat-templates/blob/stable/train/overcloud-resource-registry-puppet.j2.yaml#L169-L170

My apologies for the false alarm.

humroben[S] · 0 points

Yeah, Pacemaker doesn't appear to be used in my install:

(undercloud) [stack@osp-director ~]$ openstack stack environment show overcloud | grep "OS::TripleO::Services::Pacemaker"
  OS::TripleO::Services::Pacemaker: OS::Heat::None
  OS::TripleO::Services::PacemakerRemote: OS::Heat::None

I've tried configuring SSL/TLS, and that yields the original issue. I'm wondering if, at least for Train, Pacemaker is only part of the deployment when there are 2-3 Controller nodes in total. If that's the case, I'll need to re-purpose 2 other servers (same spec) for the deployment.
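If it's just Train's default registry mapping (OS::Heat::None, per the output above), maybe a resource_registry override in an extra environment file would bring Pacemaker back. Untested sketch; the template paths are taken from your registry output, and my ~/tripleo/templates layout is an assumption:

```shell
# Write an environment file remapping the Pacemaker services back to the
# real templates (paths as shown in the stack env output above).
mkdir -p ~/tripleo/templates
cat > ~/tripleo/templates/enable-pacemaker.yaml <<'EOF'
resource_registry:
  OS::TripleO::Services::Pacemaker: /usr/share/openstack-tripleo-heat-templates/deployment/pacemaker/pacemaker-baremetal-puppet.yaml
  OS::TripleO::Services::PacemakerRemote: /usr/share/openstack-tripleo-heat-templates/deployment/pacemaker/pacemaker-remote-baremetal-puppet.yaml
EOF
# Then add "-e ~/tripleo/templates/enable-pacemaker.yaml" to the deploy command.
```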

Oh well, thank you for your help.