[gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 Re: Re: gpfsug-discuss Digest, Vol 149, Issue 2
Peter Childs
p.childs at qmul.ac.uk
Mon Aug 12 12:41:51 BST 2024
Yes. Our current plan is to:
Deploy 5.2.0-1 on RHEL 9.4, given it's the only version that works. Sure, I'd have gone with 5.1.9 if it worked, but I have no reason to believe it does, and going with the latest version does have a few benefits.
Deploy 5.2.0-1 on our Scale servers but not run mmchfs to upgrade the on-disk data structures. We should be fine to run mmchconfig release=LATEST, given all nodes within the same cluster (we're using multicluster) are running the same version (see the command sketch below).
The legacy CentOS 7 nodes will be HPC compute only, and I'm currently planning to patch them to 5.1.2-15 (from 5.1.2-8) in the hope that this will fix the snapshot issue we've got.
Basically, 5.2.0-1 keeps crashing with a snapshot issue, which I think is caused by a bug in 5.1.2-8 that looks like it got fixed in 5.1.2-15... but if someone tells me to upgrade the CentOS 7 nodes further, I'm more than happy to look at doing so to fix this bug.
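
For reference, this is roughly the sequence I have in mind (a sketch only; "fs1" stands in for the actual file system device):

  mmdiag --version             # daemon build on each node
  mmlsconfig minReleaseLevel   # the cluster's current minimum release level

  # once every node in the local cluster is running 5.2.0-1
  mmchconfig release=LATEST

  # deliberately NOT run yet, as it migrates the on-disk data structures
  # mmchfs fs1 -V full
  mmlsfs fs1 -V                # report the current file system format version
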
Peter Childs
ITS Research Storage
Please contact Research Support via ticket by email: its-research-support at qmul.ac.uk or https://support.research.its.qmul.ac.uk/
Please check the Research blog at https://blog.hpc.qmul.ac.uk/
________________________________________
From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of Jackie Nunes <theunixchick at gmail.com>
Sent: 08 August 2024 6:01 PM
To: gpfsug-discuss at gpfsug.org
Subject: [EXTERNAL] Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 149, Issue 2
CentOS 7 and RHEL 7 are not supported with Storage Scale 5.2.x.
RHEL 9.4 (kernel 5.14.0-427.24.1.el9) has been tested.
Printed page 9 of the gpfsclustersfaq.pdf makes the 5.2.0-1 support pretty clear to me, but yes, it could be presented better.
-Jackie Nunes
> On Aug 8, 2024, at 10:31 AM, gpfsug-discuss-request at gpfsug.org wrote:
>
> Send gpfsug-discuss mailing list submissions to
> gpfsug-discuss at gpfsug.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
> or, via email, send a message with subject or body 'help' to
> gpfsug-discuss-request at gpfsug.org
>
> You can reach the person managing the list at
> gpfsug-discuss-owner at gpfsug.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of gpfsug-discuss digest..."
>
>
> Today's Topics:
>
> 1. Re: [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1
> (Peter Childs)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 8 Aug 2024 16:27:01 +0000
> From: Peter Childs <p.childs at qmul.ac.uk>
> To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>, gpfsug
> main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to
> 5.2.0-1
> Message-ID:
> <DU0PR07MB851373A0090BFEE9A6395ACEA4B92 at DU0PR07MB8513.eurprd07.prod.outlook.com>
>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Last I checked, 5.1.9 did not support RHEL 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL; hopefully it will get better now that IBM and Red Hat are the same company, but who knows.
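>
> (A rough sketch of the usual workaround when the shipped gplbin doesn't match the running kernel, assuming the matching kernel-devel is installed:)
>
>   mmbuildgpl                   # rebuild the portability layer against the running kernel
>   mmbuildgpl --build-package   # or build an installable gpfs.gplbin package for identical nodes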
>
> A full support matrix would be helpful on this one, please, rather than just a note in an FAQ.
>
> I'm fairly sure 5.1.9 has the same issue with the same bug anyway, and 5.2.0-1 should mean we get all the latest features, which is good.
>
> I'm tempted to agree that the statement in the docs that says...
>
> "In multicluster environments, it is recommended to upgrade the home cluster before the cache cluster especially if file audit logging, watch folder, clustered watch, and AFM functions are being used."
>
> agrees with you, and suggests that upgrading the NSD servers early is a good idea. That statement is a little misleading, though, given the term "cache" has no meaning in a plain multicluster setup; it might be better to use the word "remote".
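>
> (A quick sketch of how "remote" shows up in practice on the accessing cluster:)
>
>   mmremotecluster show all   # clusters whose file systems this cluster is allowed to mount
>   mmremotefs show all        # the remote file systems and their local device names / mount points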
>
> However, upgrading the servers is always the riskiest step, given our ESS servers have had a memory leak since the last time we upgraded them, which Nvidia have not yet said is fixed (it might or might not be better, or worse). (Yes, the leak was tracked down to MOFED.)
>
>
>
>
> Peter Childs
>
>
>
>
> ________________________________________
> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of Bolinches, Luis (WorldQuant) <Luis.Bolinches at worldquant.com>
> Sent: 08 August 2024 11:39 AM
> To: gpfsug main discussion list; gpfsug main discussion list
> Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1
>
>
>
> Hi
>
> Have you given any thought to jumping to the 5.1.9 LTS instead of an early PTF of the new 5.2 release?
>
> It seems you are valuing stability over bleeding-edge features, and that is supposedly what LTS releases are for. There are a few PTFs on 5.1.9 already; without knowing the exact issue you are hitting, it seems worth a try.
>
> I would go in the order quorum nodes, file system managers, NSD servers, then gateways, but it's a bit like the vi vs. emacs question.
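>
> (A minimal sketch for mapping those roles onto actual nodes before starting:)
>
>   mmlscluster      # node designations: quorum / manager
>   mmlsmgr          # current cluster manager and per-file-system managers
>   mmgetstate -a    # daemon state on every node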
>
> --
> Ystävällisin terveisin/Regards/Saludos/Salutations/Salutacions
>
> Luis Bolinches
> WQ Aligned Infrastructure
> "If you always give you will always have" -- Anonymous
>
> https://www.credly.com/users/luis-bolinches/badges
>
> -----Original Message-----
> From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> On Behalf Of Peter Childs
> Sent: Thursday, 8 August 2024 13.23
> To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
> Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1
>
> We are attempting to upgrade our Scale cluster, which is currently running 5.1.2-8, a rather old LTS version of Scale, to 5.2.0-1 so we can upgrade from a rather old OS (otherwise known as CentOS 7).
>
> We have an issue whereby Scale is crashing on the freshly deployed 5.2.0-1 nodes, and it looks to be caused by a bug fixed since 5.1.2-8 (I suspect it's the one for hc_flash_7100841_00132_notok, which is in 5.1.2-15, I think).
>
> Assuming that once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade without making things worse before they get better, i.e. we want the upgrade to fix things rather than break them before they work.
>
> I'm trying to work out when to upgrade our NSD servers: whether we need to do them ASAP to help the new nodes running 5.2.0-1, or whether we ought to leave them till last so as not to make critical kit unstable.
>
> I'm also trying to work out whether it's worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work... if 5.2.0-1 were stable I would not be looking at upgrading them at all.
>
> If anyone has worked out a good order in which to upgrade a Scale cluster, that would help, i.e. is it best to upgrade quorum and NSD servers early or late within an upgrade cycle, or leave them till last?
>
> Thanks
>
> Peter Childs
> ITS Research Storage
> Queen Mary University of London.
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
>
>
>
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
>
> ------------------------------
>
> End of gpfsug-discuss Digest, Vol 149, Issue 2
> **********************************************
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
More information about the gpfsug-discuss mailing list