[gpfsug-discuss] add local nsd back to cluster?
Stephen Ulmer
ulmer at ulmer.org
Fri Jul 29 17:48:44 BST 2022
If there are cluster nodes up, restore from the running nodes instead of the file. I think it’s -p, but look at the manual page.
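Something like this, run on the node being rebuilt (the node name is a placeholder; verify the exact flag against the mmsdrrestore man page for your code level):

# pull this node's configuration from a node that is still running
mmsdrrestore -p <healthy-quorum-node>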
--
Stephen Ulmer
Sent from a mobile device; please excuse auto-correct silliness.
> On Jul 29, 2022, at 11:20 AM, shao feng <shaof777 at gmail.com> wrote:
>
>
> Thanks Olaf
>
> I've set up the mmsdrbackup user exit as described at https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=exits-mmsdrbackup-user-exit. Since my cluster is CCR enabled, it generates a CCR backup file,
> but when I try to restore from this file, it requires the quorum nodes to be shut down. Is it possible to restore without touching the quorum nodes?
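> For reference, I enabled the user exit roughly like this (copying the sample that ships with Scale on my nodes; adjust the path if yours differs):
>
> cp /usr/lpp/mmfs/samples/mmsdrbackup.sample /var/mmfs/etc/mmsdrbackup
> chmod +x /var/mmfs/etc/mmsdrbackup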
>
> [root at tofail ~]# mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz
> Restoring a CCR backup archive is a cluster-wide operation.
> The -a flag is required.
> mmsdrrestore: Command failed. Examine previous error messages to determine cause.
>
> [root at tofail ~]# mmsdrrestore -F CCRBackup.986.2022.07.29.23.06.19.myquorum.tar.gz -a
> Restoring CCR backup
> Verifying that GPFS is inactive on quorum nodes
> mmsdrrestore: GPFS is still active on myquorum
> mmsdrrestore: Unexpected error from mmsdrrestore: CCR restore failed. Return code: 192
> mmsdrrestore: Command failed. Examine previous error messages to determine cause.
>
>
>> On Thu, Jul 28, 2022 at 3:14 PM Olaf Weiser <olaf.weiser at de.ibm.com> wrote:
>>
>>
>> Hi -
>> assuming you'll run it without ECE ?!? ... just with replication on the file system level
>> be aware: every time a node goes offline, you'll have to restart the disks in your file system. This causes a complete scan of the metadata to detect files with missing updates / replication.
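>> roughly, something like this once the node and its NSDs are reachable again (<fsname> is a placeholder):
>>
>> mmchdisk <fsname> start -a    # restart the stopped/down disks
>> mmlsdisk <fsname> -e          # shows any disk that is still not up/ready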
>>
>>
>> apart from that, to your question:
>> you may consider backing up mmsdr
>> additionally, take a look at mmsdrrestore, in case you want to restore a node's SDR configuration
>>
>> quick and dirty: saving the content of /var/mmfs may also help you
>>
>> while the node is "gone", the disk is of course down; after restoring the SDR / node's config, it should be able to start
>> the rest runs as usual
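>> as a rough sketch (untested), on the reinstalled node after the SDR restore:
>>
>> mmstartup           # start GPFS on this node again
>> mmnsddiscover -a    # rediscover the local NSD paths if they were not picked up automatically
>>
>> then restart the down disks with mmchdisk as described above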
>>
>>
>>
>> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of shao feng <shaof777 at gmail.com>
>> Sent: Thursday, July 28, 2022 09:02
>> To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
>> Subject: [EXTERNAL] [gpfsug-discuss] add local nsd back to cluster?
>>
>> Hi all,
>>
>> I am planning to implement a cluster with a bunch of old x86 machines. The disks are not connected to the nodes via a SAN; instead, each x86 machine has some locally attached disks.
>> My question is about node failure, for example when only the operating system disk fails and the NSD disks are still good. In that case I plan to replace the failing OS disk with a new one, install the OS on it, and re-attach the NSD disks to that node. Will this work? How can I add an NSD back to the cluster without restoring data from other replicas, since the data/metadata on the NSD is actually not corrupted?
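>> In other words, once the node is back in the cluster, I would hope that a quick check like
>>
>> mmlsnsd -X
>>
>> shows the old NSDs with their local device names on that node again.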
>>
>> Best regards,