Introduction
If you have been following the blog posts on this site, we have already implemented NSX-T with Openshift 4.6 using NCP’s support for Openshift operators (see https://www.vrealize.it/2021/03/24/nsx-t-ncp-integration-with-openshift-4-6-the-easy-way/) and the UPI installation method.
In the meantime, NCP 3.2 was released, which supports Openshift 4.7 and 4.8 and can also be installed through the IPI installation process. While UPI installation is still supported, IPI is even easier because you don’t have to provision the node VMs through Terraform on your own. It also removes the need for external API load balancing, since it comes with an API VIP that is automatically made available on one of the master nodes.
So let’s jump right in…
High-Level Installation Walkthrough
Let’s first review what the high-level tasks are to get it working:
- Prepare a small jumphost VM for all the installation tasks and install the required installer files
- Prepare the required DNS host entries
- Configure NSX-T networking constructs to host the cluster
- Prepare the Openshift install config and modify it for NCP. This will create the cluster manifests and ignition files.
- Deploy an Openshift cluster as installer-provisioned infrastructure with bootstrap, control-plane and compute hosts
Detailed Installation Walkthrough
1. Jumphost Preparation and Pre-Requisites
For my lab, I have downloaded a CentOS 7.8 minimal ISO and created a VM based on it. If you like, you can grab the ISO here: http://isoredirect.centos.org/centos/7/isos/x86_64/, but any other linux-based VM should work as well.
As we are going to use a couple of scripts, it makes sense to have at least Python installed. Compared to the previous posts, we don’t need Terraform any more, as with IPI the provisioning process is integrated into the openshift installer.
sudo yum install python-pip
sudo yum install unzip
sudo yum install wget
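We will also use git and podman later in this walkthrough, so if they are not already present on your jumphost, you might want to install them now as well (assuming CentOS 7 with the extras repository enabled):
sudo yum install git
sudo yum install podman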
To keep things tidy, let’s create a directory structure for the Openshift deployment. You don’t have to, but since you might want to run several deployments, it makes sense to have at least one directory per deployment:
[localadmin@oc-jumphost ~]$ tree openshift/ -L 1
openshift/
├── config-files
├── deployments
├── downloads
├── installer-files
└── scripts
Download the following items to the downloads folder, extract them into the installer-files directory, and move the clients and installer to your binary folder. (At the time of this writing, the current version of Openshift 4.8 is 4.8.19, so that is what I have used for the installer and clients.)
cd openshift/downloads
wget -c https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.8.19/openshift-install-linux.tar.gz
wget -c https://mirror.openshift.com/pub/openshift-v4/clients/ocp/4.8.19/openshift-client-linux.tar.gz
cd ../installer-files
tar -xf ../downloads/openshift-client-linux.tar.gz
tar -xf ../downloads/openshift-install-linux.tar.gz
sudo cp {oc,kubectl,openshift-install} /usr/bin/
Now, you should have the openshift-install, oc and kubectl commands available.
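A quick way to verify the binaries (the reported versions should match the release you downloaded, 4.8.19 in my case):
openshift-install version
oc version --client
kubectl version --client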
Next step is to create ssh keys, as we will need them to ssh to the RHCOS container hosts:
ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa
Next, we also need the NSX-T NCP containers files and corresponding config files.
As for the NSX-T NCP container, you need a myVMware account and can download it from here: https://customerconnect.vmware.com/downloads/details?downloadGroup=NSX-T-PKS-3201&productId=982. Please note that even though the NCP version is already 3.2.0.1, NSX-T 3.2 is not released yet. Therefore, NCP 3.2.0.1 supports NSX-T versions 3.1.2 and 3.1.3.
Put the NCP container zip into the downloads folder as well and extract it into the installer-files directory. Of the extracted content, we only need the NCP ubi container image; the other items can be removed:
cd ~/openshift/installer-files/
unzip ../downloads/nsx-container-3.2.0.1.18891234.zip
rm -r nsx-container-3.2.0.1.18891234/PAS/
rm -r nsx-container-3.2.0.1.18891234/OpenvSwitch/
rm nsx-container-3.2.0.1.18891234/Kubernetes/nsx-ncp-ubuntu-3.2.0.1.18891234.tar
rm nsx-container-3.2.0.1.18891234/Kubernetes/nsx-ncp-photon-3.2.0.1.18891234.tar
We will also need the configuration files for the NCP network operator. They are included in the NCP zip file, but you might as well grab the most current version using git from this location:
cd ~/openshift/installer-files/
git clone https://github.com/vmware/nsx-container-plugin-operator.git
During the Openshift installation process, the NCP operator container image will be downloaded automatically, as it is publicly available on Docker Hub. The NCP container image is required as well, but it is not publicly available. Therefore, you will have to provide the NCP container image on a private image registry, or temporarily push it to a private Docker Hub repository.
In my case, I already have a private image registry running, based on Harbor (see https://goharbor.io/), so I placed the NCP image there:
cd ~/openshift/installer-files/nsx-container-3.2.0.1.18891234/Kubernetes/
podman image load -i nsx-ncp-ubi-3.2.0.1.18891234.tar
podman tag registry.local/3.2.0.1.18891234/nsx-ncp-ubi harbor.corp.local/library/nsx-ncp
podman push harbor.corp.local/library/nsx-ncp
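Note that if your registry requires authentication, you need to log in with podman before pushing (harbor.corp.local is just the registry name from my lab, substitute your own):
podman login harbor.corp.local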
Last, we need to get a pull secret from Red Hat, which allows the container hosts to download the required container images during the deployment. The pull secret requires a Red Hat account (a free developer account works fine if you don’t have a corporate subscription).
Go to https://console.redhat.com/openshift/install/vsphere/installer-provisioned and download your pull secret:
As a preparation, I also strongly recommend creating a TLS certificate for the openshift apps. If you don’t do this up-front, you can’t provide the certificate during the installation, which means that the openshift routes for the openshift apps (like Console, Prometheus etc.) will not be placed on the NCP load balancer, because NCP doesn’t create a self-signed certificate automatically.
To create this certificate, you can use openssl. The certificate SAN needs to point to the wildcard cluster domain. As you can see below, my apps domain URL is *.apps.openshift4.corp.local. Here are the commands required to generate this certificate:
export COMMONNAME=*.openshift4.corp.local
openssl req -newkey rsa:2048 -x509 -nodes -keyout openshift.key -new -out openshift.crt -subj /CN=$COMMONNAME -reqexts SAN -extensions SAN -config <(cat ./openshift-cert.cnf <(printf "[SAN]\nsubjectAltName=DNS:$COMMONNAME")) -sha256 -days 365
openssl x509 -in openshift.crt -text -noout
The command above will generate a self-signed certificate, save it to the file openshift.crt and the key to openshift.key, based on the input variables from the file openshift-cert.cnf. The cnf file can be prepared beforehand and contains whatever you would like to put into the cert. Mine looks like this:
[ req ]
default_bits = 4096
distinguished_name = req_distinguished_name
req_extensions = req_ext
prompt = no
[ req_distinguished_name ]
countryName = DE
stateOrProvinceName = BW
localityName = Stuttgart
organizationName = NSX
commonName = *.openshift4.corp.local
[ req_ext ]
subjectAltName = @alt_names
[alt_names]
SIDENOTE: Take a look at the resulting certificate. Newer versions of OpenSSL automatically generate self-signed certificates with the option basicConstraints=CA:TRUE. That means it generates a CA certificate, which is not what we want, because NSX-T will reject it as a server certificate. If your OpenSSL has that option set, you have to override it in the cnf file.
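You can quickly check this on the generated certificate; for a plain server certificate the Basic Constraints extension should not show CA:TRUE:
openssl x509 -in openshift.crt -text -noout | grep -A1 "Basic Constraints"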
2. DNS Preparation
Let’s first take a look at what we are planning to deploy. The default set consists of 3 control-plane nodes and 3 compute nodes. As we are going to use the installer-provisioned way of deploying the cluster in vSphere, we just need to take care of the DNS entries for API, API-INT and the apps-domain.
We are also going to use the NSX-T infrastructure for all possible elements, like networks and the DHCP server, except for DNS, which most likely already exists in your environment. Our final topology will look like this (during the installation, one additional VM, the bootstrap node, is created temporarily):
Openshift expects each deployment to have a separate cluster ID, which needs to correlate with the respective DNS zone. In my example, my base DNS domain is corp.local and my Openshift cluster name will be openshift4.
Therefore, I have to create DNS entries in a DNS zone called openshift4.corp.local.
We need to create records for openshift API, API-INT and the apps-domain. There’s no need to create any DNS records for nodes, etcd-hosts etc. any more. Here’s the complete list of DNS records that are needed:
The following 2 entries point to the API VIP, which will be defined in the config file later and needs to be taken from the OCP Management network range:
api.openshift4.corp.local 172.16.170.100
api-int.openshift4.corp.local 172.16.170.100
A wildcard DNS entry needs to be in place for the OpenShift 4 ingress router, which is also a load balanced endpoint. This will come from the NCP ingress range.
*.apps.openshift4.corp.local 172.16.172.1
As for the DNS entry for *.apps.openshift4.corp.local, the IP address refers to the first IP address from the Ingress IP Pool that we will configure in step 3. NCP will take over the Ingress-LB for the openshift apps and will take the first one from the pool for the newly created cluster. If you are not sure yet, you can omit the DNS entry until the Ingress-LB is created on NSX-T during the installation.
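Before moving on, it is worth verifying the records from the jumphost (dig is part of the bind-utils package on CentOS; the console hostname below is just an example of a name that should be covered by the wildcard record):
dig +short api.openshift4.corp.local
dig +short api-int.openshift4.corp.local
dig +short console-openshift-console.apps.openshift4.corp.local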
3. Configure NSX-T networking constructs to host the cluster
Let’s refer to the topology:
In NSX-T, we will create a base topology to which the cluster hosts will be attached. For that, we create a separate T1 router to which all OCP segments will be attached, as well as a segment for the cluster hosts. Last, a DHCP server will be created so that the cluster hosts get dynamic IP addresses during bootup.
As an optional exercise, I have also created an Ingress IP pool and an Egress NAT pool for NCP to consume. This can be done dynamically by NCP as well, but I prefer the pre-provisioned way to be on the safe side.
Assuming you have already configured a T0 router and prepared the vSphere cluster for NSX-T, let me quickly walk you through the creation of the components above:
Configure T1 for OCP Hosts
– Log in to NSX-T Manager
– Click on the Networking tab
– Connectivity > Tier-1 Gateways
– Add Tier-1 Gateway
Configure Segment for OCP Hosts
– Click on the Networking tab
– Connectivity > Segments
– Add Segment
Configure DHCP Server
– Click on the Networking tab
– IP Management > DHCP
– Add DHCP Profile
Attach the DHCP Server to the OCP-Management segment
– Click on the Networking tab
– Connectivity > Segments
– Click Edit on the OCP-Management segment
– Click Edit DHCP Config
Configure Ingress IP Pool and Egress NAT Pool
– Click on the Networking tab
– IP Management > IP Address Pools
– Add two IP Address Pools (ocp-ingress-pool and ocp-egress-pool)
Just make sure that your T1 route advertisement settings are correct (advertise connected segments, NAT IPs and LB VIPs) and verify the route redistribution settings on your T0. If you use BGP routing, make sure the corresponding routes are advertised upstream as well.
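If you prefer automating this, the same route advertisement settings can also be set through the NSX-T Policy API. The sketch below is just an illustration: it assumes the NSX Manager at 192.168.110.200 (the one used later in the NCP configmap), the Tier-1 T1-OCP, and a hypothetical Tier-0 with the ID T0; adjust names and paths to your environment.
# create/update the Tier-1 and advertise connected segments, NAT IPs and LB VIPs
curl -k -u admin -X PATCH https://192.168.110.200/policy/api/v1/infra/tier-1s/T1-OCP \
  -H 'Content-Type: application/json' \
  -d '{
        "display_name": "T1-OCP",
        "tier0_path": "/infra/tier-0s/T0",
        "route_advertisement_types": ["TIER1_CONNECTED", "TIER1_NAT", "TIER1_LB_VIP"]
      }'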
Configure Loadbalancing for API and Machine Config Server
With the IPI installation, we don’t need to prepare any L4 load balancing any more.
4. Prepare the Openshift install config and modify it for NCP
In this step, we are going to configure the openshift installation files on your linux jumphost that we prepared in step 1.
Referring to the directory structure, change to the directory openshift/config-files and create an install-config.yaml file.
[localadmin@oc-jumphost ~]$ tree openshift/ -L 1
openshift/
├── config-files
├── deployments
├── downloads
├── installer-files
└── scripts
[localadmin@oc-jumphost ~]$ cd ~/openshift/config-files/
Here’s what my install-config.yaml looks like:
apiVersion: v1
baseDomain: corp.local
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  replicas: 3
  platform:
    vsphere:
      cpus: 4
      coresPerSocket: 2
      memoryMB: 8196
      osDisk:
        diskSizeGB: 40
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  replicas: 3
  platform:
    vsphere:
      cpus: 8
      coresPerSocket: 4
      memoryMB: 16384
      osDisk:
        diskSizeGB: 40
metadata:
  name: openshift4
networking:
  networkType: ncp
  clusterNetwork:
  - cidr: 10.4.0.0/16
    hostPrefix: 23
  machineCIDR: 172.16.170.0/24
  serviceNetwork:
  - 172.30.0.0/16
platform:
  vsphere:
    vcenter: vcsa-01a.corp.local
    username: administrator@corp.local
    password: ENTER YOUR PASSWORD HERE
    datacenter: DC-SiteA
    defaultDatastore: ds-site-a-nfs03
    network: ocp-mgmt
    cluster: Compute-Cluster
    apiVIP: 172.16.170.100
    ingressVIP: 172.16.170.101
fips: false
pullSecret: 'ENTER YOUR PULL-SECRET HERE'
sshKey: 'ENTER YOUR SSH KEY HERE'
proxy:
additionalTrustBundle: |
  -----BEGIN CERTIFICATE-----
  'ENTER YOUR REGISTRY CA CERT HERE'
  -----END CERTIFICATE-----
A couple of comments regarding these settings:
clusterNetwork – this is the pod network that NCP will deploy for internal pod communication.
machineCIDR – this needs to match the OCP segment IP range that we configured in NSX-T (in this case 172.16.170.0/24).
password – enter your vSphere password here.
apiVIP – this is going to be the IP address of the Kubernetes API. It needs to be in the machineCIDR range.
ingressVIP – with NCP this is not really used, as ingress will be provided through NSX-T, but Openshift requires it to be set and it needs to be in the machineCIDR range. So it is pretty much a dummy IP address, as our ingress will be on 172.16.172.1.
pullSecret – enter the Red Hat pull secret that you obtained in step 1. Make sure you enclose it in single quotes.
sshKey – enter the contents of your ~/.ssh/id_rsa.pub file from step 1. Make sure you enclose it in single quotes.
proxy – only needed if you deploy the NCP container image from a private registry. As of Openshift 4.4, the only way to provide additional trusted CA certificates is through the proxy configuration, even if the proxy setting itself is empty. You can remove the proxy setting if you host the NCP container image on the public Docker Hub.
additionalTrustBundle – only needed if you deploy the NCP container image from a private registry. Enter the CA certificate that can verify the private registry server certificate (in my case, the CA cert that signed the server certificate for harbor.corp.local). Without it, the NCP image pull will fail because the openshift hosts can’t validate the private registry certificate. You can remove this setting if you host the NCP container image on the public Docker Hub.
The next step is to prepare the NCP operator config files accordingly. They are located in the deploy/openshift4 folder of the nsx-container-plugin-operator git repository.
[localadmin@oc-jumphost config-files]$ cd ~/openshift/installer-files/nsx-container-plugin-operator/deploy/openshift4
[localadmin@oc-jumphost openshift4]$ ls
configmap.yaml
namespace.yaml
operator.nsx.vmware.com_ncpinstalls_crd.yaml
operator.yaml
role.yaml
lb-secret.yaml
nsx-secret.yaml
operator.nsx.vmware.com_v1_ncpinstall_cr.yaml
role_binding.yaml
service_account.yaml
With the operator support, we only need to modify 3 files:
Modify configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nsx-ncp-operator-config
  namespace: nsx-system-operator
data:
  ncp.ini: |
    [DEFAULT]
    [coe]
    adaptor = openshift4
    cluster = openshift-ipi
    loglevel = WARNING
    nsxlib_loglevel = WARNING
    [ha]
    [k8s]
    apiserver_host_ip = api-int.openshift4.corp.local
    apiserver_host_port = 6443
    client_token_file = /var/run/secrets/kubernetes.io/serviceaccount/token
    ca_file = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    loglevel = WARNING
    enable_multus = False
    [nsx_kube_proxy]
    [nsx_node_agent]
    ovs_uplink_port = ens192
    [nsx_v3]
    policy_nsxapi = True
    nsx_api_managers = 192.168.110.200
    nsx_api_user = admin
    nsx_api_password = ENTER_YOUR_NSX_PW_HERE
    insecure = True
    subnet_prefix = 24
    log_firewall_traffic = DENY
    use_native_loadbalancer = True
    lb_default_cert_path = /etc/nsx-ujo/lb-cert/tls.crt
    lb_priv_key_path = /etc/nsx-ujo/lb-cert/tls.key
    pool_algorithm = WEIGHTED_ROUND_ROBIN
    external_ip_pools = ocp-egress-pool
    top_tier_router = T1-OCP
    single_tier_topology = True
    external_ip_pools_lb = ocp-ingress-pool
    overlay_tz = 180f6238-4899-4945-af4d-0ca72557bcc6
    edge_cluster = 2c266085-6059-4b42-86f9-cba96ab21871
    [vc]
All the other settings are commented out, so NCP takes the default values for everything else. If you are interested in all the settings, the original file in the directory is quite large and has each config item explained.
A couple of comments regarding these settings:
nsx_api_password – put the NSX admin user password here.
overlay_tz – put the UUID of the Overlay Transport Zone here.
service_size – (commented out in my file) for a PoC, a small LB is fine; for a production deployment, you would rather use a medium or large LB.
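If you are not sure where to find the overlay_tz and edge_cluster UUIDs, besides copying them from the NSX Manager UI, you can also query the NSX Manager API (manager IP and admin user as in the configmap above; you will be prompted for the password):
# list transport zones (look for the id of your overlay transport zone)
curl -k -u admin https://192.168.110.200/api/v1/transport-zones
# list edge clusters (look for the id of your edge cluster)
curl -k -u admin https://192.168.110.200/api/v1/edge-clusters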
Modify operator.yaml. The only thing you need to change here is the location where you have placed the NCP image (the NCP_IMAGE environment variable).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nsx-ncp-operator
  namespace: nsx-system-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: nsx-ncp-operator
  template:
    metadata:
      labels:
        name: nsx-ncp-operator
    spec:
      hostNetwork: true
      serviceAccountName: nsx-ncp-operator
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
      - effect: NoSchedule
        key: node.kubernetes.io/not-ready
      containers:
      - name: nsx-ncp-operator
        image: vmware/nsx-container-plugin-operator:latest
        command: ["/bin/bash", "-c", "nsx-ncp-operator --zap-time-encoding=iso8601"]
        imagePullPolicy: Always
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: OPERATOR_NAME
          value: "nsx-ncp-operator"
        - name: NCP_IMAGE
          value: "harbor.corp.local/library/nsx-ncp:latest"
        - name: WATCH_NAMESPACE
          value: "nsx-system-operator"
Modify lb-secret.yaml. In this file, you place the certificate you created in step 1 for the openshift apps. This enables NCP to install the certificate as the Ingress LB certificate and build the corresponding route configurations. Please be aware that the certificate and key entries are expected to be base64 encoded, so convert them first:
base64 -w0 openshift.crt
base64 -w0 openshift.key
Take the output and paste it into lb-secret.yaml:
apiVersion: v1
data:
  tls.crt: <<COPY THE BASE64 CRT FILE IN HERE>>
  tls.key: <<COPY THE BASE64 KEY FILE IN HERE>>
kind: Secret
metadata: {name: lb-secret, namespace: nsx-system-operator}
type: kubernetes.io/tls
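If you prefer not to hand-edit the base64 strings, a dry-run of oc create secret can render the same YAML for you; this is just a convenience sketch using the openshift.crt and openshift.key files from step 1, and you paste the resulting tls.crt/tls.key values into lb-secret.yaml:
oc create secret tls lb-secret --cert=openshift.crt --key=openshift.key \
  -n nsx-system-operator --dry-run=client -o yaml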
Now, we are ready to create the openshift installer manifests and ignition files. For each deployment, the openshift installer will create files in a specific folder structure. So let’s create a new directory for this deployment and copy the install-config.yaml into that folder.
cd ~/openshift/deployments/
mkdir ncp-oc4-vsphere
cp ../config-files/install-config.yaml ncp-oc4-vsphere/
With the next step, we create the openshift manifests:
openshift-install create manifests --dir=ncp-oc4-vsphere
Depending on whether you would like to have regular pods scheduled on the control-plane nodes, the openshift docs suggest setting mastersSchedulable to false. Edit ncp-oc4-vsphere/manifests/cluster-scheduler-02-config.yml manually, or use sed:
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/g' ncp-oc4-vsphere/manifests/cluster-scheduler-02-config.yml
Next, we need to copy the NCP operator config files into the manifests folder before starting the installation:
cp ../installer-files/nsx-container-plugin-operator/deploy/openshift4/*.yaml ncp-oc4-vsphere/manifests/
Important Notes:
(1) The Openshift installer embeds certificates in the generated ignition files that are only valid for 24 hours. If you don’t get your cluster up and running within 24 hours, you need to generate new manifests and ignition configs.
(2) If you have to start over from a previous deployment, you can simply delete the contents of the ncp-oc4-vsphere folder, but note that there are two hidden files, .openshift_install.log and .openshift_install_state.json, where Openshift keeps installation state. Unless you also delete these two files, the certificates will not be renewed.
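For reference, a full reset of the deployment directory could look like this (a sketch only; it throws away everything the installer generated for this deployment):
cd ~/openshift/deployments/
rm -rf ncp-oc4-vsphere/*
rm -f ncp-oc4-vsphere/.openshift_install.log ncp-oc4-vsphere/.openshift_install_state.json
cp ../config-files/install-config.yaml ncp-oc4-vsphere/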
5. Deploy an Openshift cluster as installer-provisioned infrastructure with bootstrap, control-plane and compute hosts
We are now ready to deploy the bootstrap, control-plane and compute nodes to our vSphere environment. This is all done through the create cluster process:
openshift-install create cluster --dir=ncp-oc4-vsphere
We are pretty close now. First, the installer will download the RHCOS OVA file and upload it to vSphere. It will then create the bootstrap and control-plane nodes, and the bootstrap node will start deploying the openshift cluster onto the control-plane nodes. At some point, bootstrapping will be done and the installer will remove the bootstrap node. In case the deployment takes longer than the create cluster process allows, the installer script will end, but the cluster build-up will continue. We can monitor that process with the following command:
cd ~/openshift/deployments/
openshift-install wait-for bootstrap-complete --dir=ncp-oc4-vsphere --log-level debug
Let’s wait now until the openshift installer signals that the bootstrap process is complete:
DEBUG Bootstrap status: complete
INFO It is now safe to remove the bootstrap resources
You can now remove the bootstrap node through the openshift installer:
openshift-install destroy bootstrap --dir=ncp-oc4-vsphere --log-level debug
Let’s finalize the deployment:
cd ~/openshift/deployments/
openshift-install --dir=ncp-oc4-vsphere/ wait-for install-complete --log-level=DEBUG
There are a couple of commands that you can use during the installation phase to see details on the progress:
export KUBECONFIG=~/openshift/deployments/ncp-oc4-vsphere/auth/kubeconfig
oc get nodes
oc project nsx-system
oc get pods (this should show you all NCP pods)
watch -n5 oc get clusteroperators
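A few more generic checks that I find useful while waiting (assuming the same KUBECONFIG export as above):
oc get clusterversion
oc -n nsx-system-operator get pods
oc -n nsx-system get pods -o wide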
As NCP fires up, it implements all the required networks and loadbalancers in NSX-T for this installation. In segments, you should find a segment for each Openshift project. If all the operators are running, there should be 49 segments (including the OCP-Management segment).
In Load Balancers, there are now two Ingress load balancer virtual servers deployed as well. NCP has auto-allocated an IP address from the Ingress LB pool for them.
DONE!!
Well, almost. You still need to tell Openshift about the image registry and where to find storage in your vSphere cluster. Please refer to https://docs.openshift.com/container-platform/4.8/installing/installing_vsphere/installing-vsphere-installer-provisioned-network-customizations.html. I did the following:
Tell OC that image registry is managed
oc project openshift-image-registry
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"managementState": "Managed"}}'
Use non-persistent emptyDir storage for the image registry (PoC only)
oc patch configs.imageregistry.operator.openshift.io cluster --type merge --patch '{"spec":{"storage":{"emptyDir":{}}}}'
Further Links
In this blog, I focused on the NSX-T integration part and therefore did not elaborate any further on Openshift specifics or config variables. If you’d like to drill down further, or use HA-Proxy to handle the API LB, here are a couple of links:
- NSX-T – NCP Integration with Openshift 4.8 – The Super-Easy Way - 6. December 2021
- NSX-T – NCP Integration with Openshift 4.6 – The Easy Way - 24. March 2021
- NSX-T – NCP Integration with Openshift 4.4 – The Easy Way - 29. September 2020
Hi
nsx-ncp-operator pod is running then failing immediately
seeing this in the log any help
Error creating: pods “nsx-ncp-operator-7bc8fd64fb-” is forbidden: error looking up service account nsx-system-operator/nsx-ncp-operator: serviceaccount “nsx-ncp-operator” not found
Integrating the ncp plugin with openshift 4.9 but nsx-ncp-operator pod is going on loop from running to error and crashloopback. This what I am seeing Error creating: pods “nsx-ncp-operator-7bc8fd64fb-” is forbidden: error looking up service account nsx-system-operator/nsx-ncp-operator: serviceaccount “nsx-ncp-operator” not found.. But the service account is there in the namespace
Hi. Actually, Openshift 4.9 is not yet supported with NCP. Take a look at the current release notes, https://docs.vmware.com/en/VMware-NSX-T-Data-Center/3.2/rn/NSX-Container-Plugin-3201-Release-Notes.html, which say that 4.6, 4.7 and 4.8 are supported and tested. Given the changes that Openshift brings with every release, I expect errors will come up on 4.9.
Thanks for your reply.. I followed your article step by step but its look like its stuck
It has deployed the nsx-ncp-operator and its in running state.. No errors and the nodes are in not ready state.. Also an IP pool is created in NSX-T (lb_segment_pool_openshift) Automatically created from lb segment subnet config with an 169 IP range
nsx-system-operator nsx-ncp-operator-7886b75cf8-9ksxr 1/1 Running 0 28m
openshift-apiserver-operator openshift-apiserver-operator-5bd476769f-gkmlh 0/1 Pending 0 65m
openshift-authentication-operator authentication-operator-6c74565878-gqnnc 0/1 Pending 0 65m
openshift-cloud-credential-operator cloud-credential-operator-78d69b894-zckkq 0/2 Pending 0 64m
openshift-cluster-machine-approver machine-approver-85c5bbb65d-qj5qp 0/2 Pending 0 64m
openshift-cluster-node-tuning-operator cluster-node-tuning-operator-59f8d7f977-4k5b7 0/1 Pending 0 64m
openshift-cluster-storage-operator cluster-storage-operator-6759dddb45-4zzjj 0/1 Pending 0 64m
openshift-cluster-storage-operator csi-snapshot-controller-operator-c87897fcd-9zsmg 0/1 Pending 0 64m
openshift-cluster-version cluster-version-operator-69cb5f4d9-4cst9 0/1 ContainerCreating 0 65m
openshift-config-operator openshift-config-operator-5bc57d6bb9-mw5nw 0/1 Pending 0 64m
openshift-controller-manager-operator openshift-controller-manager-operator-75d58df564-hf5nx 0/1 Pending 0 65m
openshift-dns-operator dns-operator-64976bfbd4-qc5gh 0/2 Pending 0 65m
openshift-etcd-operator etcd-operator-648f8d98f8-mlmqp 0/1 Pending 0 64m
openshift-image-registry cluster-image-registry-operator-558969c469-5mpzs 0/1 Pending 0 64m
openshift-ingress-operator ingress-operator-7659fd478-dt5kh 0/2 Pending 0 64m
openshift-insights insights-operator-5544b5d4bd-fx96r 0/1 Pending 0 64m
openshift-kube-apiserver-operator kube-apiserver-operator-5d4bcd74b8-ndrqv 0/1 Pending 0 64m
openshift-kube-controller-manager-operator kube-controller-manager-operator-6cf68dbf5c-4z9bw 0/1 Pending 0 65m
openshift-kube-proxy openshift-kube-proxy-9x6fk 2/2 Running 0 56m
openshift-kube-proxy openshift-kube-proxy-fkgtc 2/2 Running 0 56m
openshift-kube-proxy openshift-kube-proxy-gzkqs 2/2 Running 0 56m
openshift-kube-scheduler-operator openshift-kube-scheduler-operator-755f9b4d4d-kbwpk 0/1 Pending 0 65m
openshift-kube-storage-version-migrator-operator kube-storage-version-migrator-operator-6bb7c975b9-2zd6l 0/1 Pending 0 64m
openshift-machine-api cluster-autoscaler-operator-675d6744c-z2kft 0/2 Pending 0 64m
openshift-machine-api cluster-baremetal-operator-67dc7b9ff-kwzcv 0/2 Pending 0 64m
openshift-machine-api machine-api-operator-597d557ccb-2f8gm 0/2 Pending 0 64m
openshift-machine-config-operator machine-config-operator-88655f6f8-jtjxg 0/1 Pending 0 64m
openshift-marketplace marketplace-operator-668c6756c5-f2lwr 0/1 Pending 0 65m
openshift-monitoring cluster-monitoring-operator-5f788b5f67-cvqqw 0/2 Pending 0 64m
openshift-multus multus-additional-cni-plugins-9jxv5 1/1 Running 0 56m
openshift-multus multus-additional-cni-plugins-clb2j 1/1 Running 0 56m
openshift-multus multus-additional-cni-plugins-qn897 1/1 Running 0 56m
openshift-multus multus-c47r4 1/1 Running 5 56m
openshift-multus multus-c8fzh 1/1 Running 5 56m
openshift-multus multus-lrfwf 1/1 Running 5 56m
openshift-multus network-metrics-daemon-7p8ns 0/2 ContainerCreating 0 56m
openshift-multus network-metrics-daemon-fcx5f 0/2 ContainerCreating 0 56m
openshift-multus network-metrics-daemon-lfrc7 0/2 ContainerCreating 0 56m
openshift-network-diagnostics network-check-source-767dcf4757-7k9m4 0/1 Pending 0 56m
openshift-network-diagnostics network-check-target-56vbq 0/1 ContainerCreating 0 56m
openshift-network-diagnostics network-check-target-78wnv 0/1 ContainerCreating 0 56m
openshift-network-diagnostics network-check-target-k7t9c 0/1 ContainerCreating 0 56m
openshift-network-operator network-operator-67b5f89f89-smrzg 1/1 Running 6 65m
openshift-operator-lifecycle-manager catalog-operator-9b8b8d9bf-pr4jz 0/1 Pending 0 64m
openshift-operator-lifecycle-manager olm-operator-7ff58f46cf-5dgmh 0/1 Pending 0 64m
openshift-service-ca-operator service-ca-operator-56785599f6-vxqrx 0/1 Pending 0 64m
openshift-vsphere-infra coredns-openshift-7h8l6-master-0 2/2 Running 0 57m
openshift-vsphere-infra coredns-openshift-7h8l6-master-1 2/2 Running 0 57m
openshift-vsphere-infra coredns-openshift-7h8l6-master-2 2/2 Running 0 57m
openshift-vsphere-infra haproxy-openshift-7h8l6-master-0 2/2 Running 0 57m
Not sure where its getting hung
It seems like the nsx-system namespace is not created. Normally, when the master nodes come up, an NCP node-agent is placed on each node and one NCP pod is scheduled in the nsx-system namespace. Where did you point the image source for the NCP docker image in operator.yaml? There’s the setting “name: NCP_IMAGE” and its value should point to the registry location where you placed the NCP docker image.
Thanks Jorg,
I have fixed that issue.
But now the bootstrap is stuck at
Dec 15 12:42:26 localhost bootkube.sh[7475]: Tearing down temporary bootstrap control plane…
Dec 15 12:42:26 localhost bootkube.sh[7475]: Sending bootstrap-finished event.Waiting for CEO to finish…
Dec 15 12:42:27 localhost bootkube.sh[7475]: W1215 12:42:27.729354 1 etcd_env.go:287] cipher is not supported for use with etcd: “TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256”
Dec 15 12:42:27 localhost bootkube.sh[7475]: W1215 12:42:27.729494 1 etcd_env.go:287] cipher is not supported for use with etcd: “TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256”
Dec 15 12:42:27 localhost bootkube.sh[7475]: I1215 12:42:27.746223 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
Dec 15 12:42:49 localhost bootkube.sh[7475]: I1215 12:42:49.958173 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
Dec 15 12:44:21 localhost bootkube.sh[7475]: W1215 12:44:21.430492 1 reflector.go:436] k8s.io/client-go@v0.21.1/tools/cache/reflector.go:167: watch of *v1.Etcd ended with: an error on the server (“unable to decode an event from the watch stream: http2: client connection lost”) has prevented the request from succeeding
Dec 15 12:44:22 localhost bootkube.sh[7475]: I1215 12:44:22.815858 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
Dec 15 12:47:58 localhost bootkube.sh[7475]: I1215 12:47:58.574468 1 waitforceo.go:67] waiting on condition EtcdRunningInCluster in etcd CR /cluster to be True.
However all the segments,IP pools, load balancer, SNAT everything is created in the NSX-T
Logged in to each node and found that all the pods are running