Upgrading with ocboot

This article describes how to use ocboot to upgrade the services to a specific version.

tip

ocboot can only upgrade environments that were deployed with ocboot. If your environment was deployed with docker-compose, refer to the docker-compose upgrade document instead.

Prerequisites

It is recommended to upgrade only between adjacent versions. For example, to upgrade from v3.8.x to v3.10.x, go through the following steps:

  1. v3.8.x => v3.9.x
  2. v3.9.x => v3.10.x
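The hop-by-hop rule above can be sketched as a small shell helper that lists the intermediate upgrades between two versions. This is only an illustration: the function name `upgrade_path` is hypothetical, and it assumes both versions share the same major number and that minor numbers increase by one per release (3.8, 3.9, 3.10).

```shell
#!/bin/sh
# Print each adjacent-version hop between two versions, one per line.
# Assumption: same major version, minor numbers increase by one per release.
upgrade_path() {
    from=${1#v}                 # strip an optional leading "v"
    to=${2#v}
    major=${from%%.*}           # "3.8.2" -> "3"
    rest=${from#*.}             # "3.8.2" -> "8.2"
    from_minor=${rest%%.*}      # "8.2"   -> "8"
    rest=${to#*.}
    to_minor=${rest%%.*}
    if [ "${to%%.*}" != "$major" ]; then
        echo "major versions differ" >&2
        return 1
    fi
    m=$from_minor
    while [ "$m" -lt "$to_minor" ]; do
        next=$((m + 1))
        echo "v$major.$m.x => v$major.$next.x"
        m=$next
    done
}
```

For example, `upgrade_path v3.8.2 v3.10.12` prints the two hops listed above, and nothing is printed when both versions are already in the same minor release.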

The upgrade is performed with the ocboot tool, which mainly calls ansible to upgrade all nodes in the cluster. In general, the steps are as follows:

  1. Use git to pull the latest ocboot code and switch to the v3.10.12 tag.
  2. Run the ./ocboot.py upgrade command in the ocboot directory to upgrade the version.

View current version

You can use kubectl to view the current version of the cluster.

# Use kubectl to get the current version of the cluster
$ kubectl -n onecloud get onecloudclusters default -o=jsonpath='{.spec.version}'
3.9.14 # The current version is 3.9.14

Clone ocboot tool

If you already have the ocboot tool locally, skip the clone below and update the code to the corresponding version instead.

# Install ansible locally
$ python3 -m pip install --upgrade pip setuptools wheel
$ python3 -m pip install 'ansible<=9.0.0' paramiko

Clone the ocboot tool to your local environment.

# Clone the ocboot deployment tool locally with git
$ git clone -b release/3.10 https://github.com/yunionio/ocboot && cd ./ocboot

Update ocboot code

$ git fetch
$ git checkout v3.10.12

Upgrade service components

Prerequisites

The upgrade works as follows: the local machine logs in over ssh, without a password, to the first control node of the cluster, collects the information of all nodes, and then runs the upgrade playbooks through ansible. The following requirements must therefore be met:

  1. The local machine can ssh to PRIMARY_MASTER_HOST without a password.
  2. PRIMARY_MASTER_HOST can ssh to the other nodes in the cluster without a password.

If password-free login is not set up, use the command ssh-copy-id -i ~/.ssh/id_rsa.pub root@PRIMARY_MASTER_HOST to distribute the public key to the corresponding nodes in your own environment.
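A quick way to confirm the first requirement before upgrading is to attempt a non-interactive login. The sketch below is an assumption-laden helper, not part of ocboot: the function names `can_ssh_without_password` and `check_hosts` are hypothetical, and the default user `root` is taken from the ssh-copy-id example above. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# Return 0 if key-based (non-interactive) ssh login to the host works.
# BatchMode=yes disables password prompts, so a missing key is a failure.
can_ssh_without_password() {
    host=$1
    user=${2:-root}             # assumed default, as in the ssh-copy-id example
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true
}

# Check every host given as an argument and report the result for each.
# Returns 1 if at least one host cannot be reached without a password.
check_hosts() {
    failed=0
    for h in "$@"; do
        if can_ssh_without_password "$h"; then
            echo "ok: $h"
        else
            echo "FAILED: $h"
            failed=1
        fi
    done
    return "$failed"
}
```

Running `check_hosts <PRIMARY_MASTER_HOST>` from the local machine covers the first requirement. The second requirement can be checked the same way by running the helper on PRIMARY_MASTER_HOST against the other nodes.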

You can find the version number to upgrade to on the CHANGELOG 3.10 page.

Execute upgrade commands

The following command upgrades the services deployed by ocboot to the specified version. It can take a long time because the docker images have to be pulled, so please be patient.

PRIMARY_MASTER_HOST is the IP address of the first node deployed in the cluster. The local machine must be able to log in to it with an ssh key.

$ ./ocboot.py upgrade <PRIMARY_MASTER_HOST> v3.10.12

In addition, you can use ./ocboot.py upgrade --help to view other optional parameters.

  • --user specifies a different ssh user
  • --port specifies the ssh port
  • --key-file specifies a different ssh private key
  • --as-bastion uses PRIMARY_MASTER_HOST as a bastion host for reaching nodes deployed on the intranet
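As an illustration of how the options listed above combine, the following sketch assembles the command line and prints it for review instead of running it. The helper name `build_upgrade_cmd` and its argument order are hypothetical; only the --user, --port and --key-file options come from the list above, and ./ocboot.py upgrade --help remains the authoritative reference for their exact syntax.

```shell
#!/bin/sh
# Build (but do not run) an ocboot upgrade command line.
# Arguments: host version [ssh_user] [ssh_port] [ssh_key_file]
build_upgrade_cmd() {
    host=$1
    version=$2
    user=${3:-}
    port=${4:-}
    key_file=${5:-}
    cmd="./ocboot.py upgrade $host $version"
    # Append each optional flag only when a value was supplied.
    if [ -n "$user" ]; then cmd="$cmd --user $user"; fi
    if [ -n "$port" ]; then cmd="$cmd --port $port"; fi
    if [ -n "$key_file" ]; then cmd="$cmd --key-file $key_file"; fi
    echo "$cmd"
}
```

Printing the command first makes it easy to double-check the host, version and ssh settings before starting a long-running upgrade.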

View upgrade progress

During the upgrade, you can log in to PRIMARY_MASTER_HOST and use kubectl to check the upgrade status of the corresponding pods.

$ kubectl get pods -n onecloud --watch
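When there are many pods, it can help to list only those that are not ready yet. The filter below is a minimal sketch, assuming the usual column layout of kubectl get pods (NAME READY STATUS RESTARTS AGE); the function name `pods_not_ready` is hypothetical.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the name
# and status of every pod that is not fully ready.
# Assumed columns: NAME READY STATUS RESTARTS AGE.
pods_not_ready() {
    awk '{
        split($2, ready, "/")            # "1/1" -> ready[1]=1, ready[2]=1
        if ($3 == "Completed") next      # finished jobs are not a problem
        if ($3 != "Running" || ready[1] != ready[2]) print $1, $3
    }'
}
```

It could be used as `kubectl get pods -n onecloud --no-headers | pods_not_ready`. An empty result means every pod is Running with all of its containers ready.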

Instructions for Downgrading

If the upgraded version has functional problems or bugs, you can roll back with the commands below.

warning
  • Downgrading within the same minor release, such as from v3.10.7 to v3.10.6, generally causes no problems.
  • Downgrading across minor releases, such as from v3.10.7 to v3.9.15, may cause problems.

If you encounter any issues, please submit an issue at GitHub or contact us for help.
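The distinction made in the warning can be checked mechanically before a rollback. The helper below is a sketch with a hypothetical name, `downgrade_kind`, and it assumes three-part version numbers such as v3.10.7.

```shell
#!/bin/sh
# Classify a downgrade by comparing the "major.minor" part of two versions.
# Assumption: versions have three numeric parts, e.g. v3.10.7.
downgrade_kind() {
    cur=${1#v}                  # strip an optional leading "v"
    tgt=${2#v}
    cur_mm=${cur%.*}            # "3.10.7" -> "3.10"
    tgt_mm=${tgt%.*}
    if [ "$cur_mm" = "$tgt_mm" ]; then
        echo "same-minor"       # generally safe, per the warning above
    else
        echo "cross-minor"      # may cause problems, per the warning above
    fi
}
```

Here `downgrade_kind v3.10.7 v3.10.6` reports `same-minor`, while `downgrade_kind v3.10.7 v3.9.15` reports `cross-minor`.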

Executing the Rollback

The rollback works by modifying the image version of each service. For example, if the current version is v3.10.7 and you want to revert to v3.10.6, execute the following command on the first control node:

$ /opt/yunion/bin/ocadm cluster update --version v3.10.6 --wait

Rollback will pull new images. You can open another window, log in to the first control node, and use the following command to check the update status of each pod.

$ kubectl -n onecloud get pods -w

Rollback Operation

Since v3.9.7, every time ocboot upgrades the cluster services it backs up the operator deployment and the onecloudcluster yaml file to /opt/yunion/ocboot/_upgrade/, which makes it easier to roll back components manually later.

If the upgrade fails, or you want to revert to the previous version after the upgrade, use the following steps:

# View the upgrade backup directory in /opt/yunion/ocboot/_upgrade/
# The subdirectory inside is named after the upgrade time
$ ls /opt/yunion/ocboot/_upgrade/
2023.02.20-10:10:07

# For example, if you want to revert to the version before 2023.02.20-10:10:07, perform the following steps.
$ cd /opt/yunion/ocboot/_upgrade/2023.02.20-10\:10\:07/
# View the relevant yaml file
$ ls -lh
-rw-r--r-- 1 root root 61K Feb 20 10:10 oc.yml # oc.yml is the description file for onecloudcluster
-rw-r--r-- 1 root root 3.9K Feb 20 10:10 operator.yml # operator.yml is the description file for the operator deployment

# Roll back the operator first
$ kubectl apply -f operator.yml
deployment.extensions/onecloud-operator configured
# Wait until the newly created operator is Running. If it is in the ContainerCreating state, wait a while.
$ kubectl get pods -n onecloud | grep operator
onecloud-operator-5557d86d74-7s5sl 1/1 Running 0 2s

# Then roll back onecloudcluster
$ kubectl apply -f oc.yml
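Because the backup subdirectories are named after the upgrade time, the most recent one can be found by sorting the names. The helper below is a sketch, not part of ocboot: the name `latest_backup_dir` is hypothetical, and it assumes every entry in the backup directory follows the 2023.02.20-10:10:07 naming shown above, so that plain text sorting matches chronological order.

```shell
#!/bin/sh
# Print the path of the most recent upgrade backup directory.
# Assumption: entries are named like 2023.02.20-10:10:07, so sorting the
# names as text also sorts them by time.
latest_backup_dir() {
    base=${1:-/opt/yunion/ocboot/_upgrade}
    latest=$(ls -1 "$base" | sort | tail -n 1)
    # Fail if the directory is empty or missing.
    [ -n "$latest" ] || return 1
    echo "$base/$latest"
}
```

For example, `cd "$(latest_backup_dir)"` enters the newest backup directory before the kubectl apply steps. After the rollback, the kubectl command from the "View current version" section can be used to confirm the cluster version.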