PowerVS Frequently Asked Questions

When Building out a Cluster

VM Console shows "grub>"

Not the end of the world. Type normal at the grub> prompt and press ENTER to bring up the regular boot menu.

VM Console shows "grub rescue>"

GRUB 2 was unable to find the grub folder, or the folder's contents are missing or corrupted. The grub folder contains the GRUB 2 menu, modules, and stored environment data. This should not happen very often. Simply reboot the VM from https://cloud.ibm.com.

RHCOS Ignition stalls with "version mismatch"

The RHCOS image that your cluster nodes boot from cannot handle the Ignition config served by the OCP build you are installing. Make sure the RHCOS and OCP versions in your var.tfvars file match at least on major and minor version (e.g. 4.5).
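As a rough illustration of the "match on major and minor" rule, the sketch below compares two version strings on their first two dot-separated fields. The function names and the sample version numbers are hypothetical and not taken from var.tfvars; substitute the RHCOS and OCP versions you actually configured.

```shell
# Print the MAJOR.MINOR part of a dotted version string (e.g. 4.5.4 -> 4.5).
major_minor() {
  printf '%s\n' "$1" | cut -d. -f1,2
}

# Succeed only when both versions share the same MAJOR.MINOR.
versions_compatible() {
  [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Example with made-up versions: first argument RHCOS, second argument OCP.
if versions_compatible "4.5.4" "4.5.11"; then
  echo "RHCOS and OCP agree on major.minor"
else
  echo "version mismatch: align RHCOS and OCP" >&2
fi
```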

Error about auth key/token

Error: Error occured while fetching the auth key for power iaas: "Post https://iam.cloud.ibm.com/identity/token: dial tcp: lookup iam.cloud.ibm.com on 192.168.2.1:53: read udp 192.168.2.25:49597->192.168.2.1:53: i/o timeout"

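You may occasionally hit this error during terraform apply runs. One thing to try is to regenerate your IBM Cloud API key and rerun with the new key. See: https://cloud.ibm.com/docs/account?topic=account-userapikey. The message itself reports a DNS lookup timeout for iam.cloud.ibm.com, so it is also worth checking that the machine running terraform can resolve that host.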
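Because the error shows up only occasionally, one option is simply to rerun the command a few times before giving up. The following is a minimal sketch of such a wrapper; the retry function, its attempt count, and its delay are illustrative assumptions, not part of the deployment tooling.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
# (Hypothetical helper; not shipped with terraform or the PowerVS tooling.)
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Stop as soon as the command succeeds.
    if "$@"; then
      return 0
    fi
    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

For example, retry 3 30 terraform apply -var-file var.tfvars would attempt the apply up to three times, waiting 30 seconds between attempts. A retry will not help if the API key itself is invalid.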

Image Registry not coming up

Toward the end of a cluster deployment, or even after your terraform apply appears to complete successfully, you might find that the image-registry operator is not AVAILABLE. If, on digging in, you find the registry-pvc PersistentVolumeClaim stuck in Pending state while the nfs-client-provisioner pod appears to be Running fine, the backing NFS share may be read-only, in which case PVCs can't be fulfilled. Check the ownership and permissions of the /export directory on your bastion node. It has to be owned by "nobody" and be world-writable. Use the commands below as necessary:

# chown nobody:nobody /export
# chmod 777 /export
# exportfs -r

Note: allow a few minutes for things to sort themselves out after running these commands.
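Before changing anything, it can help to confirm that ownership or permissions are actually the problem. The sketch below reports whether a directory has the expected owner and a mode ending in 777. It assumes GNU stat (the -c option), and export_dir_ok is a hypothetical helper name.

```shell
# export_dir_ok DIR [OWNER]
# Returns 0 when DIR is owned by OWNER (default: nobody) and its mode ends
# in 777; returns 1 on a mismatch and 2 if DIR cannot be inspected.
# Assumes GNU coreutils stat, which provides the -c format option.
export_dir_ok() {
  dir=$1
  want_owner=${2:-nobody}
  owner=$(stat -c '%U' "$dir") || return 2
  mode=$(stat -c '%a' "$dir") || return 2
  status=0
  if [ "$owner" != "$want_owner" ]; then
    echo "$dir: owner is $owner, expected $want_owner"
    status=1
  fi
  # Accept special bits (e.g. setgid) in front of the 777 permission bits.
  case "$mode" in
    *777) ;;
    *)
      echo "$dir: mode is $mode, expected 777"
      status=1
      ;;
  esac
  return "$status"
}
```

On the bastion this would be run as export_dir_ok /export; an exit status of 0 means both ownership and mode already look right, so the cause is probably elsewhere.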

Problems on a Running Cluster

Need to add more CPU/Memory to Node

Go to https://cloud.ibm.com, then to your PowerVS service instance, and find the VM that backs the cluster Node (e.g. myocp-worker-0). Click it to open the details page, choose "Edit details", update the CPU/Memory with the new values, and Save. You should get an "Edit successful" message, and your OCP cluster should recognize the change and show the new values in the Compute section of the Web Console. If something fails along the way, try the "OS Shutdown" action, edit the CPU/Memory as necessary while the VM is in the "Shutoff" state, and then restart the VM.
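The same check can be made from the command line instead of the Web Console. The sketch below wraps an oc query for a node's reported CPU and memory capacity; node_capacity is a hypothetical helper, and the node name you pass must be the one shown by oc get nodes, which may differ from the VM name.

```shell
# node_capacity NODE_NAME
# Prints the CPU count and memory capacity the cluster reports for a node.
# Requires a logged-in oc client; the helper name is illustrative only.
node_capacity() {
  oc get node "$1" -o jsonpath='{.status.capacity.cpu} {.status.capacity.memory}{"\n"}'
}
```

Running it before and after the edit gives a quick way to see whether the new values were picked up.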