Maintenance Ceph storage cluster

21-02-2023 00:00:00 - 23-02-2023 07:00:00

Urgency: Planned
Time windows:
21/02/2023 00:00 hrs - 07:00 hrs
21/02/2023 08:30 hrs - 17:30 hrs
22/02/2023 08:30 hrs - 17:30 hrs
23/02/2023 08:30 hrs - 17:30 hrs
24/02/2023 00:00 hrs - 07:00 hrs
Affected services:
- Virtual machines
- Shared storage
- Shared hosting
- MSSQL instances
Expected impact:
- Reduced redundancy
- CephFS shared file system unavailable for short periods, possibly several times
Customer intervention required: No
Reference number: 184090

Summary:

From Tuesday 21 February 00:00 to Friday 24 February 07:00, BIT engineers will perform maintenance on the Ceph storage cluster during the time windows listed above. The cluster will have reduced redundancy during all maintenance windows. The shared file system (CephFS) is expected to be briefly unavailable only during the time window on 24 February.

Details:

The Ceph cluster will receive a software upgrade. The maintenance is not expected to have a noticeable impact on the operation of the cluster, with the exception of the shared file system service (CephFS), which will be unavailable for a short period of time, possibly several times.

The cluster will be upgraded node by node. After each node, the engineers wait until the cluster is fully healthy again (Ceph HEALTH_OK) before upgrading the next one. The cluster is designed so that maintenance on most components can be performed without affecting services, so these components are upgraded during the day. Components that carry an increased risk of impact, or that are certain to affect availability, are upgraded at night.
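The node-by-node procedure with a health gate can be sketched roughly as follows. This is only an illustration, not the tooling BIT uses: the function names, the `upgrade_node` callback, the node names, and the timeout and polling values are all assumptions. The only real command involved is `ceph health`, whose output starts with the status word (for example HEALTH_OK or HEALTH_WARN).

```python
import subprocess
import time
from typing import Callable, Iterable, List


def ceph_health() -> str:
    """Return the status word printed by `ceph health`, e.g. 'HEALTH_OK'."""
    result = subprocess.run(
        ["ceph", "health"], capture_output=True, text=True, check=True
    )
    words = result.stdout.split()
    return words[0] if words else "UNKNOWN"


def wait_for_health_ok(
    get_health: Callable[[], str] = ceph_health,
    timeout_s: float = 3600.0,   # assumed value, not from the notice
    poll_s: float = 30.0,        # assumed value, not from the notice
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until the cluster reports HEALTH_OK; raise TimeoutError otherwise."""
    waited = 0.0
    while True:
        status = get_health()
        if status == "HEALTH_OK":
            return
        if waited >= timeout_s:
            raise TimeoutError(f"cluster still {status} after {waited:.0f}s")
        sleep(poll_s)
        waited += poll_s


def rolling_upgrade(
    nodes: Iterable[str],
    upgrade_node: Callable[[str], None],  # hypothetical per-node upgrade step
    get_health: Callable[[], str] = ceph_health,
    **wait_kwargs,
) -> List[str]:
    """Upgrade nodes one at a time, requiring HEALTH_OK before each node."""
    done: List[str] = []
    for node in nodes:
        # Health gate: never touch a node while the cluster is degraded.
        wait_for_health_ok(get_health, **wait_kwargs)
        upgrade_node(node)
        done.append(node)
    # Confirm the cluster has recovered after the last node as well.
    wait_for_health_ok(get_health, **wait_kwargs)
    return done
```

Passing the health check and the sleep function in as parameters keeps the loop easy to exercise without a live cluster. If the cluster does not recover within the timeout, the loop stops instead of moving on to the next node.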


Update posted on 21/02/2023 at 02:30 hrs

Maintenance on Tuesday 21 February was completed successfully. However, the CephFS shared file system suffered an unforeseen disruption and was unavailable between 00:28 and 00:52 hrs.

Update posted on 23/02/2023 at 01:55 hrs

Maintenance has been successfully completed. No further issues have been noticed.