From p.childs at qmul.ac.uk Thu Aug 8 11:23:18 2024 From: p.childs at qmul.ac.uk (Peter Childs) Date: Thu, 8 Aug 2024 10:23:18 +0000 Subject: [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 Message-ID: We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. Thanks Peter Childs ITS Research Storage Queen Mary University of London. From Luis.Bolinches at worldquant.com Thu Aug 8 11:39:17 2024 From: Luis.Bolinches at worldquant.com (Bolinches, Luis (WorldQuant)) Date: Thu, 8 Aug 2024 10:39:17 +0000 Subject: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 In-Reply-To: References: Message-ID: Hi Have you gave a though to jump to TLS 5.1.9? instead of PTF0 of a new release 5.2? Seems that you are valuing stability over bleeding edge features and that is supposedly what TLSs are for. Few PFTs on the 5.1.9 already, without knowing the exact issue that you are hitting, worth the try. I would go first with quorum, fs mgr, nsd servers, gateways order but is like vi emacs question -- Yst?v?llisin terveisin/Regards/Saludos/Salutations/Salutacions Luis Bolinches WQ Aligned Infrastructure "If you always give you will always have" -- Anonymous https://www.credly.com/users/luis-bolinches/badges -----Original Message----- From: gpfsug-discuss On Behalf Of Peter Childs Sent: Thursday, 8 August 2024 13.23 To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. 
I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. Thanks Peter Childs ITS Research Storage Queen Mary University of London. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org ################################################################################### The information contained in this communication is confidential, may be subject to legal privilege, and is intended only for the individual named. If you are not the named addressee, please notify the sender immediately and delete this email from your system. The views expressed in this email are the views of the sender only. Outgoing and incoming electronic communications to this address are electronically archived and subject to review and/or disclosure to someone other than the recipient. ###################################################################################

From p.childs at qmul.ac.uk Thu Aug 8 17:27:01 2024 From: p.childs at qmul.ac.uk (Peter Childs) Date: Thu, 8 Aug 2024 16:27:01 +0000 Subject: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 In-Reply-To: References: Message-ID: Last I checked 5.1.9 did not support 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL, Hopefully it will get better now IBM and Redhat are the same company, but who knows. A Full Support matrix would be helpful on this one please rather than just a note in an FAQ. I'm fairly sure 5.1.9 has the same issue with the same bug anyway. Anyway 5.2.0-1 should mean we get all the latest features, which is good. I'm tempted to agree that the statement in the docs that says... "In multicluster environments, it is recommended to upgrade the home cluster before the cache cluster especially if file audit logging, watch folder, clustered watch, and AFM functions are being used." Agrees with you. And suggest that upgrading the NSD Servers early is a good idea, but that statement is a little misleading given the term "Cache" with "Multicluster" has no meaning, and it might be better to use the word "Remote" However upgrading the servers is always the most risky step given our ESS Servers have a memory leak from the last time we upgraded them that as yet Nvidia have not said is fixed. (But might or might not be better (or worse)) (Yes the leak was tracked down to MOFED) Peter Childs ________________________________________ From: gpfsug-discuss on behalf of Bolinches, Luis (WorldQuant) Sent: 08 August 2024 11:39 AM To: gpfsug main discussion list; gpfsug main discussion list Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 [You don't often get email from luis.bolinches at worldquant.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] CAUTION: This email originated from outside of QMUL. Do not click links, scan QR codes or open attachments unless you recognise the sender and know the content is safe. Hi Have you gave a though to jump to TLS 5.1.9? instead of PTF0 of a new release 5.2?
Seems that you are valuing stability over bleeding edge features and that is supposedly what TLSs are for. Few PFTs on the 5.1.9 already, without knowing the exact issue that you are hitting, worth the try. I would go first with quorum, fs mgr, nsd servers, gateways order but is like vi emacs question -- Yst?v?llisin terveisin/Regards/Saludos/Salutations/Salutacions Luis Bolinches WQ Aligned Infrastructure "If you always give you will always have" -- Anonymous https://www.credly.com/users/luis-bolinches/badges -----Original Message----- From: gpfsug-discuss On Behalf Of Peter Childs Sent: Thursday, 8 August 2024 13.23 To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. Thanks Peter Childs ITS Research Storage Queen Mary University of London. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org ################################################################################### The information contained in this communication is confidential, may be subject to legal privilege, and is intended only for the individual named. If you are not the named addressee, please notify the sender immediately and delete this email from your system. The views expressed in this email are the views of the sender only. Outgoing and incoming electronic communications to this address are electronically archived and subject to review and/or disclosure to someone other than the recipient. ################################################################################### _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From theunixchick at gmail.com Thu Aug 8 18:01:05 2024 From: theunixchick at gmail.com (Jackie Nunes) Date: Thu, 8 Aug 2024 11:01:05 -0600 Subject: [gpfsug-discuss] gpfsug-discuss Digest, Vol 149, Issue 2 In-Reply-To: References: Message-ID: <3AACFE11-218F-4E9F-87B7-8B3BD7F7B15D@gmail.com> CentOS7 and RHEL 7 are not supported with Storage Scale 5.2.# RHEL 9.4 5.14.0-427.24.1.el9 has been tested. Printed page 9 of the gpfsclusterafaq.psf makes the 5.2.0.1 stuff pretty clear to me, but yes a better representation could be done. -Jackie Nunes > On Aug 8, 2024, at 10:31?AM, gpfsug-discuss-request at gpfsug.org wrote: > > ?Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at gpfsug.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at gpfsug.org > > You can reach the person managing the list at > gpfsug-discuss-owner at gpfsug.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > (Peter Childs) > 2. Re: [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > (Peter Childs) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 8 Aug 2024 16:27:01 +0000 > From: Peter Childs > To: gpfsug main discussion list , gpfsug > main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to > 5.2.0-1 > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Last I checked 5.1.9 did not support 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL, Hopefully it will get better now IBM and Redhat are the same company, but who knows. > > A Full Support matrix would be helpful on this one please rather than just a note in an FAQ. > > I'm fairly sure 5.1.9 has the same issue with the same bug anyway.
Anyway 5.2.0-1 should mean we get all the latest features, which is good. > > I'm tempted to agree that the statement in the docs that says... > > "In multicluster environments, it is recommended to upgrade the home cluster before the cache cluster especially if file audit logging, watch folder, clustered watch, and AFM functions are being used." > > Agrees with you. And suggest that upgrading the NSD Servers early is a good idea, but that statement is a little misleading given the term "Cache" with "Multicluster" has no meaning, and it might be better to use the word "Remote" > > However upgrading the servers is always the most risky step given our ESS Servers have a memory leak from the last time we upgraded them that as yet Nvidia have not said is fixed. (But might or might not be better (or worse)) (Yes the leak was tracked down to MOFED) > > > > > Peter Childs > > > > > ________________________________________ > From: gpfsug-discuss on behalf of Bolinches, Luis (WorldQuant) > Sent: 08 August 2024 11:39 AM > To: gpfsug main discussion list; gpfsug main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > [You don't often get email from luis.bolinches at worldquant.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > CAUTION: This email originated from outside of QMUL. Do not click links, scan QR codes or open attachments unless you recognise the sender and know the content is safe. > > > Hi > > Have you gave a though to jump to TLS 5.1.9? instead of PTF0 of a new release 5.2? > > Seems that you are valuing stability over bleeding edge features and that is supposedly what TLSs are for. Few PFTs on the 5.1.9 already, without knowing the exact issue that you are hitting, worth the try. > > I would go first with quorum, fs mgr, nsd servers, gateways order but is like vi emacs question > > -- > Yst?v?llisin terveisin/Regards/Saludos/Salutations/Salutacions > > Luis Bolinches > WQ Aligned Infrastructure > "If you always give you will always have" -- Anonymous > > https://www.credly.com/users/luis-bolinches/badges > > -----Original Message----- > From: gpfsug-discuss On Behalf Of Peter Childs > Sent: Thursday, 8 August 2024 13.23 > To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) > > We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) > > Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. > > I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. > > I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. 
> > If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. > > Thanks > > Peter Childs > ITS Research Storage > Queen Mary University of London. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > ################################################################################### > > The information contained in this communication is confidential, may be > > subject to legal privilege, and is intended only for the individual named. > > If you are not the named addressee, please notify the sender immediately and > > delete this email from your system. The views expressed in this email are > > the views of the sender only. Outgoing and incoming electronic communications > > to this address are electronically archived and subject to review and/or disclosure > > to someone other than the recipient. > > ################################################################################### > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > > ------------------------------ > > Message: 2 > Date: Thu, 8 Aug 2024 16:27:01 +0000 > From: Peter Childs > To: gpfsug main discussion list , gpfsug > main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to > 5.2.0-1 > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Last I checked 5.1.9 did not support 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL, Hopefully it will get better now IBM and Redhat are the same company, but who knows. > > A Full Support matrix would be helpful on this one please rather than just a note in an FAQ. > > I'm fairly sure 5.1.9 has the same issue with the same bug anyway. Anyway 5.2.0-1 should mean we get all the latest features, which is good. > > I'm tempted to agree that the statement in the docs that says... > > "In multicluster environments, it is recommended to upgrade the home cluster before the cache cluster especially if file audit logging, watch folder, clustered watch, and AFM functions are being used." > > Agrees with you. And suggest that upgrading the NSD Servers early is a good idea, but that statement is a little misleading given the term "Cache" with "Multicluster" has no meaning, and it might be better to use the word "Remote" > > However upgrading the servers is always the most risky step given our ESS Servers have a memory leak from the last time we upgraded them that as yet Nvidia have not said is fixed. (But might or might not be better (or worse)) (Yes the leak was tracked down to MOFED) > > > > > Peter Childs > > > > > ________________________________________ > From: gpfsug-discuss on behalf of Bolinches, Luis (WorldQuant) > Sent: 08 August 2024 11:39 AM > To: gpfsug main discussion list; gpfsug main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > [You don't often get email from luis.bolinches at worldquant.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > CAUTION: This email originated from outside of QMUL. 
Do not click links, scan QR codes or open attachments unless you recognise the sender and know the content is safe. > > > Hi > > Have you gave a though to jump to TLS 5.1.9? instead of PTF0 of a new release 5.2? > > Seems that you are valuing stability over bleeding edge features and that is supposedly what TLSs are for. Few PFTs on the 5.1.9 already, without knowing the exact issue that you are hitting, worth the try. > > I would go first with quorum, fs mgr, nsd servers, gateways order but is like vi emacs question > > -- > Yst?v?llisin terveisin/Regards/Saludos/Salutations/Salutacions > > Luis Bolinches > WQ Aligned Infrastructure > "If you always give you will always have" -- Anonymous > > https://www.credly.com/users/luis-bolinches/badges > > -----Original Message----- > From: gpfsug-discuss On Behalf Of Peter Childs > Sent: Thursday, 8 August 2024 13.23 > To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) > > We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) > > Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. > > I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. > > I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. > > If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. > > Thanks > > Peter Childs > ITS Research Storage > Queen Mary University of London. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > ################################################################################### > > The information contained in this communication is confidential, may be > > subject to legal privilege, and is intended only for the individual named. > > If you are not the named addressee, please notify the sender immediately and > > delete this email from your system. The views expressed in this email are > > the views of the sender only. Outgoing and incoming electronic communications > > to this address are electronically archived and subject to review and/or disclosure > > to someone other than the recipient. 
> > ################################################################################### > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > ------------------------------ > > End of gpfsug-discuss Digest, Vol 149, Issue 2 > **********************************************

From klbuter at sandia.gov Thu Aug 8 23:11:32 2024 From: klbuter at sandia.gov (Buterbaugh, Kevin Lynn) Date: Thu, 8 Aug 2024 22:11:32 +0000 Subject: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 In-Reply-To: References: Message-ID: <72848B0F-E466-4EF4-B4EF-1E93DBA2ACEC@sandia.gov> All, IBM may have bought RedHat but they are definitely not the same company! Ask IBM for pricing for RHEL for POWER9 and you'll find that out real quickly! Kevin B. > On Aug 8, 2024, at 10:27 AM, Peter Childs wrote: > > Last I checked 5.1.9 did not support 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL, Hopefully it will get better now IBM and Redhat are the same company, but who knows.

From vladimir.sapunenko at cnaf.infn.it Fri Aug 9 14:51:45 2024 From: vladimir.sapunenko at cnaf.infn.it (Vladimir) Date: Fri, 9 Aug 2024 15:51:45 +0200 Subject: [gpfsug-discuss] Thin provisioning disks in multi-PB filesystem In-Reply-To: <3AACFE11-218F-4E9F-87B7-8B3BD7F7B15D@gmail.com> References: <3AACFE11-218F-4E9F-87B7-8B3BD7F7B15D@gmail.com> Message-ID: <5066afce-b843-4fce-bd60-c3e80d1652d4@cnaf.infn.it> Hello, I have two questions regarding support for thin provisioning in GPFS: 1) Has anybody tried to use thin-provisioned disks in multi-PB file systems? Upon initial observation, it appears that enabling thin provisioning results in noticeable delays during mount operations, roughly equivalent to the time required to read 1% of the total capacity of the file system. I'm concerned about potential drawbacks that could impact performance. Have you encountered any issues related to this? 2) Is it possible to convert a "normal" NSD into a "thin provisioned" one? From the mmchnsd man page it seems to be possible, but when I try, mmchnsd says "This command cannot be used to change the thin provisioning disk type for an NSD." (I'm on PTF 5.1.2-15 on Linux) I would be grateful for any feedback or insights related to this matter.
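For reference, my understanding is that the thin/regular distinction is fixed in the NSD stanza when the NSD is created, which would mean recreating the NSD rather than converting it in place. A minimal sketch of what I mean, assuming the thinDiskType stanza attribute described in the mmcrnsd documentation (the device, server and NSD names and the gpfs0 filesystem device are placeholders, and the accepted values should be checked against the man pages for this PTF level):

%nsd:
  device=/dev/dm-1
  nsd=thin_nsd01
  servers=nsdserver01,nsdserver02
  usage=dataAndMetadata
  pool=system
  # assumption: thinDiskType takes no (default, fully provisioned), scsi, nvme or auto
  thinDiskType=scsi

# create the NSD from the stanza file and add it to the filesystem
mmcrnsd -F thin_nsd.stanza
mmadddisk gpfs0 -F thin_nsd.stanza

# unused blocks on thin-provisioned disks are then returned with mmreclaimspace,
# which is run separately (options vary by release, so check the man page first)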
Thanks, Vladimir From henrik.cednert at onepost.se Fri Aug 9 16:25:41 2024 From: henrik.cednert at onepost.se (Henrik Cednert) Date: Fri, 9 Aug 2024 15:25:41 +0000 Subject: [gpfsug-discuss] GPFS 5.1.9.4 on Windows 11 Pro. Performance issues, write. Message-ID: Hello I have some issues with write performance on a windows 11 pro system and I'm out of ideas here. Hopefully someone here have some bright ideas and/or experience of GPFS on Windows 11? The system is a: Windows 11 Pro 22H2 2 x Intel(R) Xeon(R) Gold 6418H 2.10 GHz 512 GB RAM GPFS 5.1.9.4 Mellanox ConnectX 6 Dx 100GbE connected to Mellanox Switch with 5m Mellanox DAC. Before deploying this workstation we had a single socket system as a test bench where we got 60 GbE in both directons with iPerf and around 6GB/sec write and 3GB/sec read from the system over GPFS (fio tests, same tests as furhter down here). With that system I had loads of issues before getting to that point though. MS Defender had to be forcefully disabled via regedit some other tweaks. All those tweaks have been performed in this new system as well, but I can't get the proper speed out of it. On this new system and with iPerf to the storage servers I get around 50-60GbE in both directions and send and receive. If I mount the storage over SMB and 100GbE via the storage gateway servers I get around 3GB/sec read and write with Blackmagics Disk speed test. I have not tweaked the system for samba performande, just a test to see what it would give and part of the troubleshooting. If I run Blackmagics diskspeed test to the GPFS mount I instead get around 700MB/sec write and 400MB/sec read. Starting to think that the Blackmagic test might not run properly on this machine with these CPUs though. Or it's related to the mmfsd process maybe, how that threads or not threads...? But if we instead look at fio. I have a bat script that loops through a bunch of FIO-tests. A test that I have been using over the years so that we easily can benchmark all deployed systems with the exakt same tests. The tests are named like: seqrw-gb-mb-t The result when I run this is like the below list. Number in parenthesis is the by fio reported latency. Job: seqrw-40gb-1mb-t1 ????????????Write: 162 MB/s (6 ms) ????????????Read: 1940 MB/s (1 ms) Job: seqrw-20gb-1mb-t2 ????????????Write: 286 MB/s (7 ms) ????????????Read: 3952 MB/s (1 ms) Job: seqrw-10gb-1mb-t4 ????????????Write: 549 MB/s (7 ms) ????????????Read: 6987 MB/s (1 ms) Job: seqrw-05gb-1mb-t8 ????????????Write: 989 MB/s (8 ms) ????????????Read: 7721 MB/s (1 ms) Job: seqrw-40gb-2mb-t1 ????????????Write: 161 MB/s (12 ms) ????????????Read: 2261 MB/s (0 ms) Job: seqrw-20gb-2mb-t2 ????????????Write: 348 MB/s (11 ms) ????????????Read: 4266 MB/s (1 ms) Job: seqrw-10gb-2mb-t4 ????????????Write: 626 MB/s (13 ms) ????????????Read: 4949 MB/s (1 ms) Job: seqrw-05gb-2mb-t8 ????????????Write: 1154 MB/s (14 ms) ????????????Read: 7007 MB/s (2 ms) Job: seqrw-40gb-4mb-t1 ????????????Write: 161 MB/s (25 ms) ????????????Read: 2083 MB/s (1 ms) Job: seqrw-20gb-4mb-t2 ????????????Write: 352 MB/s (23 ms) ????????????Read: 4317 MB/s (2 ms) Job: seqrw-10gb-4mb-t4 ????????????Write: 696 MB/s (23 ms) ????????????Read: 7358 MB/s (2 ms) Job: seqrw-05gb-4mb-t8 ????????????Write: 1251 MB/s (25 ms) ????????????Read: 6707 MB/s (5 ms) So with fio I get a very nice read speed, but the write is horrendous and I cannot find what causes it. I have looked at affinity settings for the mmfsd process but not sure I fully understand it. 
But no matter what I set it to, I see no difference. I have "played" with the bios and tried with/without hyperthreading, numa and so on. And nothing affects atleast the blackmagic disk speed test. the current settings for this host is like below. I write "current" because I have tested a few different settings here but nothing affects the write speed. maxTcpConnsPerNodeConn for sure bumped the read speed though. nsdMaxWorkerThreads 16 prefetchPct 60 maxTcpConnsPerNodeConn 8 maxMBpS 14000 Does anyone have any suggestions or ideas on how to troubleshoot this? Thanks -- Henrik Cednert / + 46 704 71 89 54 / CTO / OnePost (formerly Filmlance Post) ?? OnePost, formerly Filmlance's post-production, is now an independent part of the Banijay Group. New name, same team ? business as usual at OnePost. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan at mcwinter.org Mon Aug 12 10:41:08 2024 From: jan at mcwinter.org (Jan Winter) Date: Mon, 12 Aug 2024 10:41:08 +0100 Subject: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 Message-ID: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> Hello, I'm running a 5.1.9 gpfs cluster on Rocky Linux 8, what we recently updated from Centos 7. Since then I notice that ACL inhered permission are not getting applied to new created directory's via NFS. As an example, we exporting a space /path/to/space This space has posix permission + some extra ACL: group:some-extra-groups:rwxc:allow:FileInherit:DirInherit (X)READ/LIST (X)WRITE/CREATE (X)APPEND/MKDIR (X)SYNCHRONIZE (X)READ_ACL (X)READ_ATTR (X)READ_NAMED (X)DELETE (X)DELETE_CHILD (X)CHOWN (X)EXEC/SEARCH (X)WRITE_ACL (X)WRITE_ATTR (X)WRITE_NAMED If I create a new file on the NFS client, the ACL get applied, but when I create a new directory the ACL are missing. I didn't had this problem with Centos 7, does anyone here have an idea what the problem could be, or a way how to debug this issue? Regards Jan From p.childs at qmul.ac.uk Mon Aug 12 12:41:51 2024 From: p.childs at qmul.ac.uk (Peter Childs) Date: Mon, 12 Aug 2024 11:41:51 +0000 Subject: [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 Re: Re: gpfsug-discuss Digest, Vol 149, Issue 2 In-Reply-To: <3AACFE11-218F-4E9F-87B7-8B3BD7F7B15D@gmail.com> References: <3AACFE11-218F-4E9F-87B7-8B3BD7F7B15D@gmail.com> Message-ID: Yes, Out current plan, is to Deploy 5.2.0-1 on 9.4, given its the only version that works, Sure I'd have gone with 5.1.9 if it worked, but I have no reason to believe it does and going with the latest version does have a few benifits. Deploy 5.2.0-1 on our Scale Servers but not run mmchfs to upgrade the on-disk data structures. We should be find to run the mmchconfig release=LATEST given all nodes with in the same cluster (using multicluter are running the same version) The Legacy C7 Nodes, will be HPC compute only and I'm currently planning to patch them to 5.1.2-15 (from 5.1.2-8) in the hope that this will fix the snapshot issue we've got, Basically 5.2.0-1 keeps crashing with an snapshot issue, which I think is caused by a bug in 5.1.2-8 that looks like it got fixed in 5.1.2-15..... but if someone tells me to upgrade the C7 further I'm more than happy to look at doing so, to fix this bug. 
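For the record, the level checks and the two separate commit steps I mean are along these lines (gpfs0 stands in for the real filesystem device, and the exact behaviour should be verified against the mmchconfig and mmchfs documentation for 5.2.0-1):

# code level the local daemon is running (repeat per node, or wrap in mmdsh)
mmdiag --version

# currently committed cluster function level and on-disk filesystem format
mmlsconfig minReleaseLevel
mmlsfs gpfs0 -V

# step 1: commit the cluster to the new function level,
# only once every node in the cluster is running the new code
mmchconfig release=LATEST

# step 2: upgrade the on-disk format -- the step we are deliberately deferring,
# since it cannot be rolled back and can lock out back-level remote clusters
mmchfs gpfs0 -V full
# (or "mmchfs gpfs0 -V compat" to enable only backward-compatible format changes)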
Peter Childs ITS Research Storage Please contact Research Support via Ticket by email:its-research-support at qmul.ac.uk or https://support.research.its.qmul.ac.uk/ Please check the Research blog at https://blog.hpc.qmul.ac.uk/ ________________________________________ From: gpfsug-discuss on behalf of Jackie Nunes Sent: 08 August 2024 6:01 PM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] Re: [gpfsug-discuss] gpfsug-discuss Digest, Vol 149, Issue 2 [You don't often get email from theunixchick at gmail.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] CAUTION: This email originated from outside of QMUL. Do not click links, scan QR codes or open attachments unless you recognise the sender and know the content is safe. CentOS7 and RHEL 7 are not supported with Storage Scale 5.2.# RHEL 9.4 5.14.0-427.24.1.el9 has been tested. Printed page 9 of the gpfsclusterafaq.psf makes the 5.2.0.1 stuff pretty clear to me, but yes a better representation could be done. -Jackie Nunes > On Aug 8, 2024, at 10:31?AM, gpfsug-discuss-request at gpfsug.org wrote: > > ?Send gpfsug-discuss mailing list submissions to > gpfsug-discuss at gpfsug.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > or, via email, send a message with subject or body 'help' to > gpfsug-discuss-request at gpfsug.org > > You can reach the person managing the list at > gpfsug-discuss-owner at gpfsug.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of gpfsug-discuss digest..." > > > Today's Topics: > > 1. Re: [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > (Peter Childs) > 2. Re: [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > (Peter Childs) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Thu, 8 Aug 2024 16:27:01 +0000 > From: Peter Childs > To: gpfsug main discussion list , gpfsug > main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to > 5.2.0-1 > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Last I checked 5.1.9 did not support 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL, Hopefully it will get better now IBM and Redhat are the same company, but who knows. > > A Full Support matrix would be helpful on this one please rather than just a note in an FAQ. > > I'm fairly sure 5.1.9 has the same issue with the same bug anyway. Anyway 5.2.0-1 should mean we get all the latest features, which is good. > > I'm tempted to agree that the statement in the docs that says... > > "In multicluster environments, it is recommended to upgrade the home cluster before the cache cluster especially if file audit logging, watch folder, clustered watch, and AFM functions are being used." > > Agrees with you. And suggest that upgrading the NSD Servers early is a good idea, but that statement is a little misleading given the term "Cache" with "Multicluster" has no meaning, and it might be better to use the word "Remote" > > However upgrading the servers is always the most risky step given our ESS Servers have a memory leak from the last time we upgraded them that as yet Nvidia have not said is fixed. 
(But might or might not be better (or worse)) (Yes the leak was tracked down to MOFED) > > > > > Peter Childs > > > > > ________________________________________ > From: gpfsug-discuss on behalf of Bolinches, Luis (WorldQuant) > Sent: 08 August 2024 11:39 AM > To: gpfsug main discussion list; gpfsug main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > [You don't often get email from luis.bolinches at worldquant.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > CAUTION: This email originated from outside of QMUL. Do not click links, scan QR codes or open attachments unless you recognise the sender and know the content is safe. > > > Hi > > Have you gave a though to jump to TLS 5.1.9? instead of PTF0 of a new release 5.2? > > Seems that you are valuing stability over bleeding edge features and that is supposedly what TLSs are for. Few PFTs on the 5.1.9 already, without knowing the exact issue that you are hitting, worth the try. > > I would go first with quorum, fs mgr, nsd servers, gateways order but is like vi emacs question > > -- > Yst?v?llisin terveisin/Regards/Saludos/Salutations/Salutacions > > Luis Bolinches > WQ Aligned Infrastructure > "If you always give you will always have" -- Anonymous > > https://www.credly.com/users/luis-bolinches/badges > > -----Original Message----- > From: gpfsug-discuss On Behalf Of Peter Childs > Sent: Thursday, 8 August 2024 13.23 > To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) > > We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) > > Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. > > I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. > > I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. > > If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. > > Thanks > > Peter Childs > ITS Research Storage > Queen Mary University of London. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > ################################################################################### > > The information contained in this communication is confidential, may be > > subject to legal privilege, and is intended only for the individual named. > > If you are not the named addressee, please notify the sender immediately and > > delete this email from your system. 
The views expressed in this email are > > the views of the sender only. Outgoing and incoming electronic communications > > to this address are electronically archived and subject to review and/or disclosure > > to someone other than the recipient. > > ################################################################################### > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > > ------------------------------ > > Message: 2 > Date: Thu, 8 Aug 2024 16:27:01 +0000 > From: Peter Childs > To: gpfsug main discussion list , gpfsug > main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to > 5.2.0-1 > Message-ID: > > > Content-Type: text/plain; charset="iso-8859-1" > > Last I checked 5.1.9 did not support 9.4 and only 5.2.0-1 worked. IBM do like breaking the gplbin when they push out a new version of RHEL, Hopefully it will get better now IBM and Redhat are the same company, but who knows. > > A Full Support matrix would be helpful on this one please rather than just a note in an FAQ. > > I'm fairly sure 5.1.9 has the same issue with the same bug anyway. Anyway 5.2.0-1 should mean we get all the latest features, which is good. > > I'm tempted to agree that the statement in the docs that says... > > "In multicluster environments, it is recommended to upgrade the home cluster before the cache cluster especially if file audit logging, watch folder, clustered watch, and AFM functions are being used." > > Agrees with you. And suggest that upgrading the NSD Servers early is a good idea, but that statement is a little misleading given the term "Cache" with "Multicluster" has no meaning, and it might be better to use the word "Remote" > > However upgrading the servers is always the most risky step given our ESS Servers have a memory leak from the last time we upgraded them that as yet Nvidia have not said is fixed. (But might or might not be better (or worse)) (Yes the leak was tracked down to MOFED) > > > > > Peter Childs > > > > > ________________________________________ > From: gpfsug-discuss on behalf of Bolinches, Luis (WorldQuant) > Sent: 08 August 2024 11:39 AM > To: gpfsug main discussion list; gpfsug main discussion list > Subject: Re: [gpfsug-discuss] [EXTERNAL] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > [You don't often get email from luis.bolinches at worldquant.com. Learn why this is important at https://aka.ms/LearnAboutSenderIdentification ] > > CAUTION: This email originated from outside of QMUL. Do not click links, scan QR codes or open attachments unless you recognise the sender and know the content is safe. > > > Hi > > Have you gave a though to jump to TLS 5.1.9? instead of PTF0 of a new release 5.2? > > Seems that you are valuing stability over bleeding edge features and that is supposedly what TLSs are for. Few PFTs on the 5.1.9 already, without knowing the exact issue that you are hitting, worth the try. 
> > I would go first with quorum, fs mgr, nsd servers, gateways order but is like vi emacs question > > -- > Yst?v?llisin terveisin/Regards/Saludos/Salutations/Salutacions > > Luis Bolinches > WQ Aligned Infrastructure > "If you always give you will always have" -- Anonymous > > https://www.credly.com/users/luis-bolinches/badges > > -----Original Message----- > From: gpfsug-discuss On Behalf Of Peter Childs > Sent: Thursday, 8 August 2024 13.23 > To: gpfsug main discussion list > Subject: [EXTERNAL] [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 > > We are attempting to upgrade out Scale cluster which is currently running 5.1.2-8 a rather old LTS version of scale..... To 5.2.0-1 so we can upgrade from a rather old OS. (Otherwise known as CentOS7) > > We have an issue where by the freshly deployed nodes with 5.2.0-1 Scale is Crashing and it looks to be caused by a bug fix since 5.1.2-8 (I suspect its the one for hc_flash_7100841_00132_notok which is in 5.1.2-15 I think) > > Assuming once we've upgraded everything our cluster(s) will be stable again, I'm trying to work out how to upgrade everything without causing everything to be worse before it gets better. ie we want an upgrade to fix stuff rather than breaking it before it works. > > I'm trying to work out when to upgrade our NSD Servers, if we need to do them ASAP to improve the new servers running 5.2.0-1 or if we ought to leave them till last to not cause critical kit to be unstable. > > I'm also trying to work out if its worth the effort of upgrading the old nodes at all, as that's quite a bit of extra work.... and if 5.2.0-1 was stable I would not be looking at upgrading them at all. > > If anyone has worked out a good order to upgrade a scale cluster then that might help ie is it best to upgrade Quorum and NSD servers early or late within any upgrade cycle, or leave them till last. > > Thanks > > Peter Childs > ITS Research Storage > Queen Mary University of London. > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > ################################################################################### > > The information contained in this communication is confidential, may be > > subject to legal privilege, and is intended only for the individual named. > > If you are not the named addressee, please notify the sender immediately and > > delete this email from your system. The views expressed in this email are > > the views of the sender only. Outgoing and incoming electronic communications > > to this address are electronically archived and subject to review and/or disclosure > > to someone other than the recipient. 
> > ################################################################################### > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > ------------------------------ > > End of gpfsug-discuss Digest, Vol 149, Issue 2 > ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org From jonathan.buzzard at strath.ac.uk Mon Aug 12 12:52:42 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 12 Aug 2024 12:52:42 +0100 Subject: [gpfsug-discuss] Upgrade Scale from 5.1.2-8 to 5.2.0-1 Re: Re: gpfsug-discuss Digest, Vol 149, Issue 2 In-Reply-To: References: <3AACFE11-218F-4E9F-87B7-8B3BD7F7B15D@gmail.com> Message-ID: On 12/08/2024 12:41, Peter Childs wrote: > > Yes, Out current plan, is to > > Deploy 5.2.0-1 on 9.4, given its the only version that works, Sure > I'd have gone with 5.1.9 if it worked, but I have no reason to > believe it does and going with the latest version does have a few > benifits. Note that 5.1.9-PTF2-efix9 works on 9.2EUS because that is what DSS-G v5.0b and v5.0c are using. If you have access to EUS versions (it is a cheap add on the server version if you don't) then you could take the EUS kernel SRPM from 9.2 and compile your own RPM and run that on 9.4. I have done things like that in the past and plan to in the future relating to the issues around RedHat and rebuilds. Because you would not be distributing the RedHat SRPM then you stay within the updated licensing :-) JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scl at virginia.edu Mon Aug 12 19:02:50 2024 From: scl at virginia.edu (Losen, Stephen C (scl)) Date: Mon, 12 Aug 2024 18:02:50 +0000 Subject: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 In-Reply-To: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> References: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> Message-ID: Hi, How is the permission change flag set on the fileset? mmlsfileset devname filesetname -Y If it is set to chmodandsetacl then any posix chmod operation completely replaces the ACL. You can use setaclonly but then chmod fails. Your best option is probably chmodandupdateacl which applies the chmod permissions without destroying the ACL. I'm guessing that your fileset is chmodandsetacl and that when a directory is created over NFS, a hidden chmod operation is destroying the directory's ACL. You can change the setting with mmchfileset devname filesetname --allow-permission-change chmodandupdateacl Steve Losen University of Virginia Research Computing ?-----Original Message----- From: gpfsug-discuss > on behalf of Jan Winter > Reply-To: gpfsug main discussion list > Date: Monday, August 12, 2024 at 5:42 AM To: gpfsug main discussion list > Subject: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 Hello, I'm running a 5.1.9 gpfs cluster on Rocky Linux 8, what we recently updated from Centos 7. 
Since then I notice that ACL inhered permission are not getting applied to new created directory's via NFS. As an example, we exporting a space /path/to/space This space has posix permission + some extra ACL: group:some-extra-groups:rwxc:allow:FileInherit:DirInherit (X)READ/LIST (X)WRITE/CREATE (X)APPEND/MKDIR (X)SYNCHRONIZE (X)READ_ACL (X)READ_ATTR (X)READ_NAMED (X)DELETE (X)DELETE_CHILD (X)CHOWN (X)EXEC/SEARCH (X)WRITE_ACL (X)WRITE_ATTR (X)WRITE_NAMED If I create a new file on the NFS client, the ACL get applied, but when I create a new directory the ACL are missing. I didn't had this problem with Centos 7, does anyone here have an idea what the problem could be, or a way how to debug this issue? Regards Jan _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org From scale at us.ibm.com Tue Aug 13 17:24:34 2024 From: scale at us.ibm.com (scale) Date: Tue, 13 Aug 2024 16:24:34 +0000 Subject: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 In-Reply-To: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> References: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> Message-ID: Hi Jan, It could be that the ACL is being overwritten for the directory after it has been created. Another possibility is that the NFSv3 client may not interpret the ACL correctly since NFSv4 ACL is involved. Also, since v3 is used, the Linux kernel NFSv3 server assumes that the filesystem has POSIX ACL. I suggest opening a ticket for this issue with the IBM Scale support team for further investigation. Thanks, Anh Dao From: gpfsug-discuss on behalf of Jan Winter Date: Monday, August 12, 2024 at 4:43?PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] ACL issue with Linux kernel NFSv3 Hello, I'm running a 5.1.9 gpfs cluster on Rocky Linux 8, what we recently updated from Centos 7. Since then I notice that ACL inhered permission are not getting applied to new created directory's via NFS. As an example, we exporting a space /path/to/space This space has posix permission + some extra ACL: group:some-extra-groups:rwxc:allow:FileInherit:DirInherit (X)READ/LIST (X)WRITE/CREATE (X)APPEND/MKDIR (X)SYNCHRONIZE (X)READ_ACL (X)READ_ATTR (X)READ_NAMED (X)DELETE (X)DELETE_CHILD (X)CHOWN (X)EXEC/SEARCH (X)WRITE_ACL (X)WRITE_ATTR (X)WRITE_NAMED If I create a new file on the NFS client, the ACL get applied, but when I create a new directory the ACL are missing. I didn't had this problem with Centos 7, does anyone here have an idea what the problem could be, or a way how to debug this issue? Regards Jan _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From scottg at emailhosting.com Tue Aug 13 17:32:49 2024 From: scottg at emailhosting.com (ScottG) Date: Tue, 13 Aug 2024 12:32:49 -0400 Subject: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 In-Reply-To: References: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> Message-ID: <1296574af146fa49615702896f8ed8e63de3f23a.camel@emailhosting.com> Hi Jan, Can you please try to mount the share with the NOACL option. ?that should/will resolve the issue. Scott Goldman ? 
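As a rough sketch of what that client-side change looks like (the server name, export path and mount point below are placeholders, not taken from Jan's setup):

# NFSv3 mount with client-side ACL processing disabled
mount -t nfs -o vers=3,noacl nfsserver:/path/to/space /mnt/space

# or the equivalent /etc/fstab entry
nfsserver:/path/to/space  /mnt/space  nfs  vers=3,noacl  0 0

The idea is that with noacl the client skips the NFSACL (POSIX draft ACL) side protocol altogether, so a mkdir no longer carries a POSIX ACL/mode update that can override the inherited NFSv4 ACL entries on the GPFS side.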
-----Original Message----- From: scale Reply-To: gpfsug main discussion list To: gpfsug main discussion list , Anh Dao Subject: Re: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 Date: 08/13/2024 12:24:34 PM Hi Jan, It could be that the ACL is being overwritten for the directory after it has been created. Another possibility is that the NFSv3 client may not interpret the ACL correctly since NFSv4 ACL is involved. Also, since v3 is used, the Linux kernel NFSv3 server assumes that the filesystem has POSIX ACL. I suggest opening a ticket for this issue with the IBM Scale support team for further investigation. Thanks, Anh Dao From:gpfsug-discuss on behalf of Jan Winter Date: Monday, August 12, 2024 at 4:43?PM To: gpfsug main discussion list Subject: [EXTERNAL] [gpfsug-discuss] ACL issue with Linux kernel NFSv3 Hello, I'm running a 5.1.9 gpfs cluster on Rocky Linux 8, what we recently updated from Centos 7. Since then I notice that ACL inhered permission are not getting applied to new created directory's via NFS. As an example, we exporting a space /path/to/space This space has posix permission + some extra ACL: group:some-extra-groups:rwxc:allow:FileInherit:DirInherit ? (X)READ/LIST (X)WRITE/CREATE (X)APPEND/MKDIR (X)SYNCHRONIZE (X)READ_ACL? (X)READ_ATTR? (X)READ_NAMED ? (X)DELETE??? (X)DELETE_CHILD (X)CHOWN??????? (X)EXEC/SEARCH (X)WRITE_ACL (X)WRITE_ATTR (X)WRITE_NAMED If I create a new file on the NFS client, the ACL get applied, but when I create a new directory the ACL are missing. I didn't had this problem with Centos 7, does anyone here have an idea what the problem could be, or a way how to debug this issue? Regards Jan _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivano.talamo at psi.ch Wed Aug 14 08:47:07 2024 From: ivano.talamo at psi.ch (Talamo Ivano Giuseppe) Date: Wed, 14 Aug 2024 07:47:07 +0000 Subject: [gpfsug-discuss] DSS-G V5 and GUI In-Reply-To: <1603f202-a548-44d9-a9ae-01d07e3fd4eb@strath.ac.uk> References: <1603f202-a548-44d9-a9ae-01d07e3fd4eb@strath.ac.uk> Message-ID: Same confusion here. Since last week there's the 5.0c. At a quick check same RPMs, OFED... but finally a PDF for the installation of the GUI, that in the 5.0b doc was mentioned but not present. Though the installation of the GUI is via confluent, and I'm not a big fan of it __________________________________________ Paul Scherrer Institut Ivano Talamo OBBA/230 Forschungsstrasse 111 5232 Villigen PSI Schweiz Phone: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch Available: Monday - Wednesday ________________________________ From: gpfsug-discuss on behalf of Jonathan Buzzard Sent: 28 June 2024 15:05 To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] DSS-G V5 and GUI So I skipped the v5.0a release because that only supported the new V3 SR650's. However I have finished the "get off CentOS 7" project (apart from the one server that is now on TuxCare ELS awaiting Ubuntu 24.04 support in GPFS) and so now have the time to look into v5.0b which does support the older V1 and V2 SR650's. However in the release notes I read this The Storage Scale GUI is not supported with DSS-G 5.0. 
Now while I could not care less about the GUI per se, you need it for the RestAPI which I really *do* care about. Please tell me that Lenovo have not just yanked this. I do notice that there are still RPM's for it but really what on earth is going on? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jan at mcwinter.org Wed Aug 14 15:52:19 2024 From: jan at mcwinter.org (Jan Winter) Date: Wed, 14 Aug 2024 15:52:19 +0100 Subject: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 In-Reply-To: <1296574af146fa49615702896f8ed8e63de3f23a.camel@emailhosting.com> References: <7de4ac04-642e-4586-82c7-75ea9dd3b954@mcwinter.org> <1296574af146fa49615702896f8ed8e63de3f23a.camel@emailhosting.com> Message-ID: Hello, I double checked and we have set "chmodandupdateacl" on all our filesets, so that was not the issue for us. As Scott pointed out, setting options "noacl" on the client fix the issue. I had already a call open with IBM, but the short answer was that they don't support Linux kernel NFS server. It seems to be that the NFS client negotiates with the server is going wrong. I may reopen the call with IBM to figure out if they are interested to fix this issue or bring in some light why the behaviour have changed. Thanks for all the help! Jan On 13/08/2024 17:32, ScottG wrote: > > Hi Jan, > Can you please try to mount the share with the NOACL option. ?that > should/will resolve the issue. > > Scott Goldman > > -----Original Message----- > *From*: scale > > *Reply-To*: gpfsug main discussion list > > *To*: gpfsug main discussion list >, Anh Dao > > *Subject*: Re: [gpfsug-discuss] ACL issue with Linux kernel NFSv3 > *Date*: 08/13/2024 12:24:34 PM > > Hi Jan, > > It could be that the ACL is being overwritten for the directory after it > has been created. Another possibility is that the NFSv3 client may not > interpret the ACL correctly since NFSv4 ACL is involved. Also, since v3 > is used, the Linux kernel NFSv3 server assumes that the filesystem has > POSIX ACL. I suggest opening a ticket for this issue with the IBM Scale > support team for further investigation. > > Thanks, > Anh Dao > > *From:*gpfsug-discuss on behalf of > Jan Winter > *Date: *Monday, August 12, 2024 at 4:43?PM > *To: *gpfsug main discussion list > *Subject: *[EXTERNAL] [gpfsug-discuss] ACL issue with Linux kernel NFSv3 > > Hello, > > I'm running a 5.1.9 gpfs cluster on Rocky Linux 8, what we recently > updated from Centos 7. > Since then I notice that ACL inhered permission are not getting applied > to new created directory's via NFS. > > As an example, we exporting a space > /path/to/space > > This space has posix permission + some extra ACL: > > group:some-extra-groups:rwxc:allow:FileInherit:DirInherit > ? (X)READ/LIST (X)WRITE/CREATE (X)APPEND/MKDIR (X)SYNCHRONIZE > (X)READ_ACL? (X)READ_ATTR? (X)READ_NAMED > ? (X)DELETE??? (X)DELETE_CHILD (X)CHOWN??????? (X)EXEC/SEARCH > (X)WRITE_ACL (X)WRITE_ATTR (X)WRITE_NAMED > > If I create a new file on the NFS client, the ACL get applied, but when > I create a new directory the ACL are missing. > > I didn't had this problem with Centos 7, does anyone here have an idea > what the problem could be, or a way how to debug this issue? 
> > Regards > Jan > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org From cdmaestas at us.ibm.com Tue Aug 20 00:46:37 2024 From: cdmaestas at us.ibm.com (CHRIS MAESTAS) Date: Mon, 19 Aug 2024 23:46:37 +0000 Subject: [gpfsug-discuss] Announcement: Scale 5.2.1 is out! Message-ID: For a summary of changes see here! If you miss: dstat ?gpfs ?gpfs-ops Maybe your new favorite command is mmpstat! And if you want those expelled nodes to stay down, now they will! That is unless you run: mmexpelnode -r/?reset The new and improved Cluster Export Services (CES) S3 is here! Think of it as High Performance Object (HPO) 2.0 now also running on VMs and bare metal! Check out the test measurements here where you can get 60 GB/s of read performance. Yes, that?s a byte which is a lot of bits! If you want to get Scale storage services on arm64 platforms now, you can! Unofficially, let?s race to running it on your Pi! Scale training has also been updated and it?s available on IBM training and Coursera! Check out this blog post for more information on classes and subscription options! -- The Chief Troublemaker 8) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Tue Aug 20 15:36:41 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Tue, 20 Aug 2024 15:36:41 +0100 Subject: [gpfsug-discuss] Native Rest API Message-ID: I just had an email from IBM about technology preview of the "Native Rest API" feature in 5.2.1.0 There are at least two interrelated and important questions that are not answered in the web page about this "Native Rest API" feature IMHO. Firstly the page says it "eliminates" the need to administer the Scale cluster with the mm-command layer. Does that mean the mm-command layer is going away? Will I in the future going to forced to use some "naff" GUI layer to administer a GPFS cluster? Frankly I am quite happy using the mm-command layer thank you very much and would like to keep it that way and just be able to ignore the GUI. I do appreciate I might be somewhat old school in that view but never the less I view GUI administration of things with disdain. Secondly at the moment the Rest API requires installing the GUI. Does the "native" bit of the title mean that requirement is going away and there will be a Rest API without the need for the additional complexity of the GUI nodes? Or is the mm-command layer going away and yes you will need the extra complexity of the GUI because you are going to have to suck up administering the system with a GUI? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From aspurdy at us.ibm.com Tue Aug 20 17:06:36 2024 From: aspurdy at us.ibm.com (Amy Hirst) Date: Tue, 20 Aug 2024 16:06:36 +0000 Subject: [gpfsug-discuss] Announcement: Scale 5.2.1 is out! In-Reply-To: References: Message-ID: I?m always happy to see these emails. Well done, team! 
Thank you, Amy (Purdy) Hirst Vice President IBM Storage Software, Site Reliability Engineering, and User Experience She/Her/Hers Assistant: Michelle Garcia Diaz (michelle.garcia.diaz at ibm.com) IBM -- From: gpfsug-discuss on behalf of CHRIS MAESTAS Date: Monday, August 19, 2024 at 7:49?PM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] [gpfsug-discuss] Announcement: Scale 5.2.1 is out! For a summary of changes see here! If you miss: dstat ?gpfs ?gpfs-ops Maybe your new favorite command is mmpstat! And if you want those expelled nodes to stay down, now they will! That is unless you run: mmexpelnode -r/?reset The new and improved For a summary of changes see here! If you miss: dstat ?gpfs ?gpfs-ops Maybe your new favorite command is mmpstat! And if you want those expelled nodes to stay down, now they will! That is unless you run: mmexpelnode -r/?reset The new and improved Cluster Export Services (CES) S3 is here! Think of it as High Performance Object (HPO) 2.0 now also running on VMs and bare metal! Check out the test measurements here where you can get 60 GB/s of read performance. Yes, that?s a byte which is a lot of bits! If you want to get Scale storage services on arm64 platforms now, you can! Unofficially, let?s race to running it on your Pi! Scale training has also been updated and it?s available on IBM training and Coursera! Check out this blog post for more information on classes and subscription options! -- The Chief Troublemaker 8) -------------- next part -------------- An HTML attachment was scrubbed... URL: From Luis.I.Teran at ibm.com Tue Aug 20 19:16:41 2024 From: Luis.I.Teran at ibm.com (Luis I Teran) Date: Tue, 20 Aug 2024 18:16:41 +0000 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: Hello Jonathan, > Does that mean the mm-command layer > is going away? Will I in the future going to forced to use some "naff" > GUI layer to administer a GPFS cluster? Frankly I am quite happy using > the mm-command layer thank you very much and would like to keep it that > way and just be able to ignore the GUI. I do appreciate I might be > somewhat old school in that view but never the less I view GUI > administration of things with disdain. The Native Rest API is a new feature that is meant to replace mm-commands in the long-term. The Native Rest API feature is being delivered in phases. For the 5.2.2.0 GA, not all the functionality that mm-commands expose will be available in the Native Rest API. Due to this limitation, all mm-commands will remain available, and complete co-existence is supported by the Native Rest API with mm-commands (meaning you can run both at the same time). The Native Rest API will not require GUI administration. With the Native Rest API, there is a new CLI that has a similar look to its equivalent mm-command (-N options, -F options with stanza files, and many of the same flags per command, just a different invocation). > Secondly at the moment the Rest API requires installing the GUI. Does > the "native" bit of the title mean that requirement is going away and > there will be a Rest API without the need for the additional complexity > of the GUI nodes? Or is the mm-command layer going away and yes you will > need the extra complexity of the GUI because you are going to have to > suck up administering the system with a GUI? The Native Rest API will not require installing the GUI. It does require a new RPM+service, but it does not expose a GUI itself. 
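(For comparison, the REST interface that exists today is the one served by the GUI stack, reached roughly like this, where GUI-NODE and the admin credentials are placeholders:

curl -k -u admin:PASSWORD -H 'accept: application/json' https://GUI-NODE:443/scalemgmt/v2/filesystems

The intent with the Native Rest API is to answer that kind of request without a GUI node sitting behind it.)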
Once the Native Rest API exposes the functionality that the GUI requires, the GUI itself will rely on the Native Rest API (assuming you want a GUI running), as well as other internal components that today rely on mm-commands. Thanks, Luis Teran IBM Storage Scale Development From: gpfsug-discuss on behalf of gpfsug-discuss-request at gpfsug.org Date: Tuesday, August 20, 2024 at 9:10?AM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] gpfsug-discuss Digest, Vol 149, Issue 13 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at gpfsug.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at gpfsug.org You can reach the person managing the list at gpfsug-discuss-owner at gpfsug.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Native Rest API (Jonathan Buzzard) 2. Re: Announcement: Scale 5.2.1 is out! (Amy Hirst) ---------------------------------------------------------------------- Message: 1 Date: Tue, 20 Aug 2024 15:36:41 +0100 From: Jonathan Buzzard To: gpfsug main discussion list Subject: [gpfsug-discuss] Native Rest API Message-ID: Content-Type: text/plain; charset=UTF-8; format=flowed I just had an email from IBM about technology preview of the "Native Rest API" feature in 5.2.1.0 There are at least two interrelated and important questions that are not answered in the web page about this "Native Rest API" feature IMHO. Firstly the page says it "eliminates" the need to administer the Scale cluster with the mm-command layer. Does that mean the mm-command layer is going away? Will I in the future going to forced to use some "naff" GUI layer to administer a GPFS cluster? Frankly I am quite happy using the mm-command layer thank you very much and would like to keep it that way and just be able to ignore the GUI. I do appreciate I might be somewhat old school in that view but never the less I view GUI administration of things with disdain. Secondly at the moment the Rest API requires installing the GUI. Does the "native" bit of the title mean that requirement is going away and there will be a Rest API without the need for the additional complexity of the GUI nodes? Or is the mm-command layer going away and yes you will need the extra complexity of the GUI because you are going to have to suck up administering the system with a GUI? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG ------------------------------ Message: 2 Date: Tue, 20 Aug 2024 16:06:36 +0000 From: Amy Hirst To: gpfsug main discussion list , CHRIS MAESTAS Subject: Re: [gpfsug-discuss] Announcement: Scale 5.2.1 is out! Message-ID: Content-Type: text/plain; charset="utf-8" I?m always happy to see these emails. Well done, team! Thank you, Amy (Purdy) Hirst Vice President IBM Storage Software, Site Reliability Engineering, and User Experience She/Her/Hers Assistant: Michelle Garcia Diaz (michelle.garcia.diaz at ibm.com) IBM -- From: gpfsug-discuss on behalf of CHRIS MAESTAS Date: Monday, August 19, 2024 at 7:49?PM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] [gpfsug-discuss] Announcement: Scale 5.2.1 is out! For a summary of changes see here! If you miss: dstat ?gpfs ?gpfs-ops Maybe your new favorite command is mmpstat! 
And if you want those expelled nodes to stay down, now they will! That is unless you run: mmexpelnode -r/?reset The new and improved For a summary of changes see here! If you miss: dstat ?gpfs ?gpfs-ops Maybe your new favorite command is mmpstat! And if you want those expelled nodes to stay down, now they will! That is unless you run: mmexpelnode -r/?reset The new and improved Cluster Export Services (CES) S3 is here! Think of it as High Performance Object (HPO) 2.0 now also running on VMs and bare metal! Check out the test measurements here where you can get 60 GB/s of read performance. Yes, that?s a byte which is a lot of bits! If you want to get Scale storage services on arm64 platforms now, you can! Unofficially, let?s race to running it on your Pi! Scale training has also been updated and it?s available on IBM training and Coursera! Check out this blog post for more information on classes and subscription options! -- The Chief Troublemaker 8) -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Subject: Digest Footer _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org ------------------------------ End of gpfsug-discuss Digest, Vol 149, Issue 13 *********************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Tue Aug 20 19:19:25 2024 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 20 Aug 2024 18:19:25 +0000 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: I'm going to be a bit... harsh here. I absolutely hate companies taking away my CLI's and giving me some half-working REST trash. IBM has already done this across a Number of different products(sklm,etc) , so I will not be surprised if it eventually happens here. All the young programming teams across the globe they employ love this kind of thing. I wish they would actually ASK the customers instead of that kind of thing, but it *IS* a thing, and has been happening to more than just IBM products. On the plus side, I don't think Scale is QUITE at that point, so we are probably safe for at least another few versions. They've been pushing the lackluster GUI quite hard for some time now. Many of us out here actually have the GUI disabled, or not installed at all, due to all the CVEs, and/or an inability to move forward for various reasons. For example: There was no path forward on Power8 without 'rolling your own', and I assume again soon for our Power 9s. Your Mileage May Vary, of course. I shudder to think about attempting to diagnose a cluster boot after a major datacenter maintenance outage where a timeout caused various "vdisks" not to get marked active, with REST. Bet that takes much longer than the CLI. Ed Wahl Ohio Supercomputer Center -----Original Message----- From: gpfsug-discuss On Behalf Of Jonathan Buzzard Sent: Tuesday, August 20, 2024 10:37 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Native Rest API I just had an email from IBM about technology preview of the "Native Rest API" feature in 5.2.1.0 There are at least two interrelated and important questions that are not answered in the web page about this "Native Rest API" feature IMHO. Firstly the page says it "eliminates" the need to administer the Scale cluster with the mm-command layer. Does that mean the mm-command layer is going away? 
Will I in the future going to forced to use some "naff" GUI layer to administer a GPFS cluster? Frankly I am quite happy using the mm-command layer thank you very much and would like to keep it that way and just be able to ignore the GUI. I do appreciate I might be somewhat old school in that view but never the less I view GUI administration of things with disdain. Secondly at the moment the Rest API requires installing the GUI. Does the "native" bit of the title mean that requirement is going away and there will be a Rest API without the need for the additional complexity of the GUI nodes? Or is the mm-command layer going away and yes you will need the extra complexity of the GUI because you are going to have to suck up administering the system with a GUI? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!KGKeukY!1tVc1LXRJBbg0sPzWS-JMvWe23NYygl1TXWcS1gkLY7Bct9Y-1bioDGtKMv6lT6xNlimEnEIkLO7wqHRwW5UKg2bCu296w$ From ewahl at osc.edu Tue Aug 20 19:19:25 2024 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 20 Aug 2024 18:19:25 +0000 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: I'm going to be a bit... harsh here. I absolutely hate companies taking away my CLI's and giving me some half-working REST trash. IBM has already done this across a Number of different products(sklm,etc) , so I will not be surprised if it eventually happens here. All the young programming teams across the globe they employ love this kind of thing. I wish they would actually ASK the customers instead of that kind of thing, but it *IS* a thing, and has been happening to more than just IBM products. On the plus side, I don't think Scale is QUITE at that point, so we are probably safe for at least another few versions. They've been pushing the lackluster GUI quite hard for some time now. Many of us out here actually have the GUI disabled, or not installed at all, due to all the CVEs, and/or an inability to move forward for various reasons. For example: There was no path forward on Power8 without 'rolling your own', and I assume again soon for our Power 9s. Your Mileage May Vary, of course. I shudder to think about attempting to diagnose a cluster boot after a major datacenter maintenance outage where a timeout caused various "vdisks" not to get marked active, with REST. Bet that takes much longer than the CLI. Ed Wahl Ohio Supercomputer Center -----Original Message----- From: gpfsug-discuss On Behalf Of Jonathan Buzzard Sent: Tuesday, August 20, 2024 10:37 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Native Rest API I just had an email from IBM about technology preview of the "Native Rest API" feature in 5.2.1.0 There are at least two interrelated and important questions that are not answered in the web page about this "Native Rest API" feature IMHO. Firstly the page says it "eliminates" the need to administer the Scale cluster with the mm-command layer. Does that mean the mm-command layer is going away? Will I in the future going to forced to use some "naff" GUI layer to administer a GPFS cluster? Frankly I am quite happy using the mm-command layer thank you very much and would like to keep it that way and just be able to ignore the GUI. 
I do appreciate I might be somewhat old school in that view but never the less I view GUI administration of things with disdain. Secondly at the moment the Rest API requires installing the GUI. Does the "native" bit of the title mean that requirement is going away and there will be a Rest API without the need for the additional complexity of the GUI nodes? Or is the mm-command layer going away and yes you will need the extra complexity of the GUI because you are going to have to suck up administering the system with a GUI? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!KGKeukY!1tVc1LXRJBbg0sPzWS-JMvWe23NYygl1TXWcS1gkLY7Bct9Y-1bioDGtKMv6lT6xNlimEnEIkLO7wqHRwW5UKg2bCu296w$ From cdmaestas at us.ibm.com Tue Aug 20 20:15:29 2024 From: cdmaestas at us.ibm.com (CHRIS MAESTAS) Date: Tue, 20 Aug 2024 19:15:29 +0000 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: Honest feedback is what is needed! There is no intention to remove mm* commands. There is an intention to solidify the CLI, GUI and REST control paths into a common framework. This has been known as the Modernization of Scale (MOS) work and has been in tech-preview since version 5.1.9 last year. Please feel free to look at: https://www.spectrumscaleug.org/wp-content/uploads/2024/07/SSUG24ISC-Modernisation-of-Storage-Scale.pdf for an overview of the direction that is being taken today. There is a sponsor user group for this tech-preview feature that you are welcome to join. You can participate in new calls and listen to previously recorded calls. --cdm From: gpfsug-discuss on behalf of Wahl, Edward Date: Tuesday, August 20, 2024 at 12:32?PM To: gpfsug main discussion list , gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Native Rest API I'm going to be a bit... harsh here. I absolutely hate companies taking away my CLI's and giving me some half-working REST trash. IBM has already done this across a Number of different products(sklm,etc) , so I will not be surprised if it eventually happens here. All the young programming teams across the globe they employ love this kind of thing. I wish they would actually ASK the customers instead of that kind of thing, but it *IS* a thing, and has been happening to more than just IBM products. On the plus side, I don't think Scale is QUITE at that point, so we are probably safe for at least another few versions. They've been pushing the lackluster GUI quite hard for some time now. Many of us out here actually have the GUI disabled, or not installed at all, due to all the CVEs, and/or an inability to move forward for various reasons. For example: There was no path forward on Power8 without 'rolling your own', and I assume again soon for our Power 9s. Your Mileage May Vary, of course. I shudder to think about attempting to diagnose a cluster boot after a major datacenter maintenance outage where a timeout caused various "vdisks" not to get marked active, with REST. Bet that takes much longer than the CLI. 
Ed Wahl Ohio Supercomputer Center -----Original Message----- From: gpfsug-discuss On Behalf Of Jonathan Buzzard Sent: Tuesday, August 20, 2024 10:37 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Native Rest API I just had an email from IBM about technology preview of the "Native Rest API" feature in 5.2.1.0 There are at least two interrelated and important questions that are not answered in the web page about this "Native Rest API" feature IMHO. Firstly the page says it "eliminates" the need to administer the Scale cluster with the mm-command layer. Does that mean the mm-command layer is going away? Will I in the future going to forced to use some "naff" GUI layer to administer a GPFS cluster? Frankly I am quite happy using the mm-command layer thank you very much and would like to keep it that way and just be able to ignore the GUI. I do appreciate I might be somewhat old school in that view but never the less I view GUI administration of things with disdain. Secondly at the moment the Rest API requires installing the GUI. Does the "native" bit of the title mean that requirement is going away and there will be a Rest API without the need for the additional complexity of the GUI nodes? Or is the mm-command layer going away and yes you will need the extra complexity of the GUI because you are going to have to suck up administering the system with a GUI? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL:
From novosirj at rutgers.edu Fri Aug 23 23:54:58 2024 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 23 Aug 2024 22:54:58 +0000 Subject: [gpfsug-discuss] DSS-G V5 and GUI In-Reply-To: References: <1603f202-a548-44d9-a9ae-01d07e3fd4eb@strath.ac.uk> Message-ID: <0CCD3238-ED74-4D23-A9C1-D90349A68C64@rutgers.edu> Does anyone from Lenovo read this list that can comment on whether or not there's therefore an upgrade path from 5.0b to 5.0c? If they're running all of the same stuff, pretty much, it would be nice not to have to run two slightly different systems for obvious reasons. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr.
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Aug 14, 2024, at 03:47, Talamo Ivano Giuseppe wrote: Same confusion here. Since last week there's the 5.0c. At a quick check same RPMs, OFED... but finally a PDF for the installation of the GUI, that in the 5.0b doc was mentioned but not present. Though the installation of the GUI is via confluent, and I'm not a big fan of it __________________________________________ Paul Scherrer Institut Ivano Talamo OBBA/230 Forschungsstrasse 111 5232 Villigen PSI Schweiz Phone: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch Available: Monday - Wednesday ________________________________ From: gpfsug-discuss on behalf of Jonathan Buzzard Sent: 28 June 2024 15:05 To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] DSS-G V5 and GUI So I skipped the v5.0a release because that only supported the new V3 SR650's. However I have finished the "get off CentOS 7" project (apart from the one server that is now on TuxCare ELS awaiting Ubuntu 24.04 support in GPFS) and so now have the time to look into v5.0b which does support the older V1 and V2 SR650's. However in the release notes I read this The Storage Scale GUI is not supported with DSS-G 5.0. Now while I could not care less about the GUI per se, you need it for the RestAPI which I really *do* care about. Please tell me that Lenovo have not just yanked this. I do notice that there are still RPM's for it but really what on earth is going on? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncalimet at lenovo.com Sat Aug 24 11:41:37 2024 From: ncalimet at lenovo.com (Nicolas CALIMET) Date: Sat, 24 Aug 2024 10:41:37 +0000 Subject: [gpfsug-discuss] [External] Re: DSS-G V5 and GUI In-Reply-To: <0CCD3238-ED74-4D23-A9C1-D90349A68C64@rutgers.edu> References: <1603f202-a548-44d9-a9ae-01d07e3fd4eb@strath.ac.uk> <0CCD3238-ED74-4D23-A9C1-D90349A68C64@rutgers.edu> Message-ID: Hello, The DSS-G 5.0c release mostly brings back the GUI that was not supported in 5.0a and 5.0b as there were problems we had to work out with IBM (and that we eventually resolved ourselves). The GUI-related RPMs shipping with the prior 5.0a and 5.0b releases were kept around in case someone needs them (which seems to be indeed a likely scenario given this thread) but were not supported in the context of deploying a GUI node for DSS-G. Therefore only the GUI documentation was removed in these releases, and is now back again in DSS-G 5.0c. Bottom line - installations already running with DSS-G 5.0b (and 5.0a) do not need upgrading. DSS-G 5.0b and 5.0c are running the same software and firmware levels and can coexist without issue. The release notes for DSS-G 5.0c are pretty short actually: ---------- DSS-G 5.0c ---------- Released 2024-08-06 This DSS-G release supports gen2 to gen4 DSS servers (Lenovo ThinkSystem SR6x0 / SR6x0 V2 / SR655 V3) only. Deployment is supported with a Confluent management server only. 
This release brings back support for the Storage Scale GUI for DSS-G2xy configurations deployed with DSS-G release 5.0 only. Updates and fixes ----------------- - DSS-G code and documentation * Added back and revisited the DSS-G GUI documentation * GUI: restored deployment of GUI nodes with dssg-gui-install via the Confluent management server * GUI: added support for SR655 V3 servers and D4390 external enclosures * onecli.sh: added timeout to work around occasional hangs when checking UEFI settings at bootup * [DSS-G2xy] drive firmware: added LENOVO (vs LENOVO-X) references in firmwareTable.drive for D4390 enclosures The one thing that could be useful to deployments with DSS-G 5.0b is the update to the onecli.sh wrapper. If needed, the script can be copied from the 5.0c tarball into /opt/lenovo/dss/bin/ on those DSS nodes deployed with 5.0b. HTH -- Nicolas Calimet, PhD | HPC System Architect | Lenovo ISG | Meitnerstrasse 9, D-70563 Stuttgart, Germany | +49 71165690146 | https://www.lenovo.com/dssg From: gpfsug-discuss On Behalf Of Ryan Novosielski Sent: Saturday, August 24, 2024 00:55 To: gpfsug main discussion list Subject: [External] Re: [gpfsug-discuss] DSS-G V5 and GUI Does anyone from Lenovo read this list that can comment on whether or not there's therefore an upgrade path from 5.0b to 5.0c? If they're running all of the same stuff, pretty much, it would be nice not to have to run two slightly different systems for obvious reasons. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Aug 14, 2024, at 03:47, Talamo Ivano Giuseppe > wrote: Same confusion here. Since last week there's the 5.0c. At a quick check same RPMs, OFED... but finally a PDF for the installation of the GUI, that in the 5.0b doc was mentioned but not present. Though the installation of the GUI is via confluent, and I'm not a big fan of it __________________________________________ Paul Scherrer Institut Ivano Talamo OBBA/230 Forschungsstrasse 111 5232 Villigen PSI Schweiz Phone: +41 56 310 47 11 E-Mail: ivano.talamo at psi.ch Available: Monday - Wednesday ________________________________ From: gpfsug-discuss > on behalf of Jonathan Buzzard > Sent: 28 June 2024 15:05 To: gpfsug-discuss at gpfsug.org > Subject: [gpfsug-discuss] DSS-G V5 and GUI So I skipped the v5.0a release because that only supported the new V3 SR650's. However I have finished the "get off CentOS 7" project (apart from the one server that is now on TuxCare ELS awaiting Ubuntu 24.04 support in GPFS) and so now have the time to look into v5.0b which does support the older V1 and V2 SR650's. However in the release notes I read this The Storage Scale GUI is not supported with DSS-G 5.0. Now while I could not care less about the GUI per se, you need it for the RestAPI which I really *do* care about. Please tell me that Lenovo have not just yanked this. I do notice that there are still RPM's for it but really what on earth is going on? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From shaof777 at gmail.com Wed Aug 28 03:23:30 2024 From: shaof777 at gmail.com (shao feng) Date: Wed, 28 Aug 2024 10:23:30 +0800 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: If my understanding is correct, the data returned by "native rest api" will also be from GUI. If that is true, will you fix this "bug" which cause current API not a serious API: https://www.ibm.com/docs/en/storage-scale/5.2.1?topic=issues-gui-is-displaying-outdated-information On Wed, Aug 21, 2024 at 3:18?AM CHRIS MAESTAS wrote: > Honest feedback is what is needed! There is no intention to remove mm* > commands. There is an intention to solidify the CLI, GUI and REST control > paths into a common framework. This has been known as the Modernization of > Scale (MOS) work and has been in tech-preview since version 5.1.9 last > year. Please feel free to look at: > https://www.spectrumscaleug.org/wp-content/uploads/2024/07/SSUG24ISC-Modernisation-of-Storage-Scale.pdf > for an overview of the direction that is being taken today. There is a > sponsor user group for this tech-preview feature that you are welcome to > join. You can participate in new calls and listen to previously recorded > calls. > > > > --cdm > > > > > > *From: *gpfsug-discuss on behalf of > Wahl, Edward > *Date: *Tuesday, August 20, 2024 at 12:32?PM > *To: *gpfsug main discussion list , gpfsug > main discussion list > *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Native Rest API > > I'm going to be a bit... harsh here. I absolutely hate companies taking > away my CLI's and giving me some half-working REST trash. IBM has already > done this across a Number of different products(sklm,etc) , so I will not > be surprised if it eventually happens here. All the young programming > teams across the globe they employ love this kind of thing. I wish they > would actually ASK the customers instead of that kind of thing, but it *IS* > a thing, and has been happening to more than just IBM products. > > On the plus side, I don't think Scale is QUITE at that point, so we are > probably safe for at least another few versions. They've been pushing the > lackluster GUI quite hard for some time now. Many of us out here actually > have the GUI disabled, or not installed at all, due to all the CVEs, and/or > an inability to move forward for various reasons. For example: There was > no path forward on Power8 without 'rolling your own', and I assume again > soon for our Power 9s. Your Mileage May Vary, of course. > > I shudder to think about attempting to diagnose a cluster boot after a > major datacenter maintenance outage where a timeout caused various "vdisks" > not to get marked active, with REST. Bet that takes much longer than the > CLI. 
> > Ed Wahl > Ohio Supercomputer Center > > -----Original Message----- > From: gpfsug-discuss On Behalf Of > Jonathan Buzzard > Sent: Tuesday, August 20, 2024 10:37 AM > To: gpfsug main discussion list > Subject: [gpfsug-discuss] Native Rest API > > > I just had an email from IBM about technology preview of the "Native Rest > API" feature in 5.2.1.0 > > There are at least two interrelated and important questions that are not > answered in the web page about this "Native Rest API" feature IMHO. > > Firstly the page says it "eliminates" the need to administer the Scale > cluster with the mm-command layer. Does that mean the mm-command layer is > going away? Will I in the future going to forced to use some "naff" > GUI layer to administer a GPFS cluster? Frankly I am quite happy using the > mm-command layer thank you very much and would like to keep it that way and > just be able to ignore the GUI. I do appreciate I might be somewhat old > school in that view but never the less I view GUI administration of things > with disdain. > > Secondly at the moment the Rest API requires installing the GUI. Does the > "native" bit of the title mean that requirement is going away and there > will be a Rest API without the need for the additional complexity of the > GUI nodes? Or is the mm-command layer going away and yes you will need the > extra complexity of the GUI because you are going to have to suck up > administering the system with a GUI? > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shaof777 at gmail.com Wed Aug 28 03:23:30 2024 From: shaof777 at gmail.com (shao feng) Date: Wed, 28 Aug 2024 10:23:30 +0800 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: If my understanding is correct, the data returned by "native rest api" will also be from GUI. If that is true, will you fix this "bug" which cause current API not a serious API: https://www.ibm.com/docs/en/storage-scale/5.2.1?topic=issues-gui-is-displaying-outdated-information On Wed, Aug 21, 2024 at 3:18?AM CHRIS MAESTAS wrote: > Honest feedback is what is needed! There is no intention to remove mm* > commands. There is an intention to solidify the CLI, GUI and REST control > paths into a common framework. This has been known as the Modernization of > Scale (MOS) work and has been in tech-preview since version 5.1.9 last > year. Please feel free to look at: > https://www.spectrumscaleug.org/wp-content/uploads/2024/07/SSUG24ISC-Modernisation-of-Storage-Scale.pdf > for an overview of the direction that is being taken today. There is a > sponsor user group for this tech-preview feature that you are welcome to > join. You can participate in new calls and listen to previously recorded > calls. 
> > > > --cdm > > > > > > *From: *gpfsug-discuss on behalf of > Wahl, Edward > *Date: *Tuesday, August 20, 2024 at 12:32?PM > *To: *gpfsug main discussion list , gpfsug > main discussion list > *Subject: *[EXTERNAL] Re: [gpfsug-discuss] Native Rest API > > I'm going to be a bit... harsh here. I absolutely hate companies taking > away my CLI's and giving me some half-working REST trash. IBM has already > done this across a Number of different products(sklm,etc) , so I will not > be surprised if it eventually happens here. All the young programming > teams across the globe they employ love this kind of thing. I wish they > would actually ASK the customers instead of that kind of thing, but it *IS* > a thing, and has been happening to more than just IBM products. > > On the plus side, I don't think Scale is QUITE at that point, so we are > probably safe for at least another few versions. They've been pushing the > lackluster GUI quite hard for some time now. Many of us out here actually > have the GUI disabled, or not installed at all, due to all the CVEs, and/or > an inability to move forward for various reasons. For example: There was > no path forward on Power8 without 'rolling your own', and I assume again > soon for our Power 9s. Your Mileage May Vary, of course. > > I shudder to think about attempting to diagnose a cluster boot after a > major datacenter maintenance outage where a timeout caused various "vdisks" > not to get marked active, with REST. Bet that takes much longer than the > CLI. > > Ed Wahl > Ohio Supercomputer Center > > -----Original Message----- > From: gpfsug-discuss On Behalf Of > Jonathan Buzzard > Sent: Tuesday, August 20, 2024 10:37 AM > To: gpfsug main discussion list > Subject: [gpfsug-discuss] Native Rest API > > > I just had an email from IBM about technology preview of the "Native Rest > API" feature in 5.2.1.0 > > There are at least two interrelated and important questions that are not > answered in the web page about this "Native Rest API" feature IMHO. > > Firstly the page says it "eliminates" the need to administer the Scale > cluster with the mm-command layer. Does that mean the mm-command layer is > going away? Will I in the future going to forced to use some "naff" > GUI layer to administer a GPFS cluster? Frankly I am quite happy using the > mm-command layer thank you very much and would like to keep it that way and > just be able to ignore the GUI. I do appreciate I might be somewhat old > school in that view but never the less I view GUI administration of things > with disdain. > > Secondly at the moment the Rest API requires installing the GUI. Does the > "native" bit of the title mean that requirement is going away and there > will be a Rest API without the need for the additional complexity of the > GUI nodes? Or is the mm-command layer going away and yes you will need the > extra complexity of the GUI because you are going to have to suck up > administering the system with a GUI? > > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. 
G4 0NG > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Wed Aug 28 05:38:20 2024 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 28 Aug 2024 04:38:20 +0000 Subject: [gpfsug-discuss] Native Rest API In-Reply-To: References: Message-ID: The new native REST API will not retrieve information from the GUI. The new API is handled by a new ?Admin Daemon?, which will communicate directly with mmfsd, or with peer Admin Daemons, as needed. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss on behalf of shao feng Date: Tuesday, August 27, 2024 at 10:26 PM To: gpfsug main discussion list Cc: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Native Rest API If my understanding is correct, the data returned by "native rest api" will also be from GUI. If that is true, will you fix this "bug" which cause current API not a serious API: https:?//www.?ibm.?com/docs/en/storage-scale/5.?2.?1?topic=issues-gui-is-displaying-outdated-information If my understanding is correct, the data returned by "native rest api" will also be from GUI. If that is true, will you fix this "bug" which cause current API not a serious API: https://www.ibm.com/docs/en/storage-scale/5.2.1?topic=issues-gui-is-displaying-outdated-information On Wed, Aug 21, 2024 at 3:18?AM CHRIS MAESTAS > wrote: Honest feedback is what is needed! There is no intention to remove mm* commands. There is an intention to solidify the CLI, GUI and REST control paths into a common framework. This has been known as the Modernization of Scale (MOS) work and has been in tech-preview since version 5.1.9 last year. Please feel free to look at: https://www.spectrumscaleug.org/wp-content/uploads/2024/07/SSUG24ISC-Modernisation-of-Storage-Scale.pdf for an overview of the direction that is being taken today. There is a sponsor user group for this tech-preview feature that you are welcome to join. You can participate in new calls and listen to previously recorded calls. --cdm From: gpfsug-discuss > on behalf of Wahl, Edward > Date: Tuesday, August 20, 2024 at 12:32?PM To: gpfsug main discussion list >, gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Native Rest API I'm going to be a bit... harsh here. I absolutely hate companies taking away my CLI's and giving me some half-working REST trash. IBM has already done this across a Number of different products(sklm,etc) , so I will not be surprised if it eventually happens here. All the young programming teams across the globe they employ love this kind of thing. I wish they would actually ASK the customers instead of that kind of thing, but it *IS* a thing, and has been happening to more than just IBM products. On the plus side, I don't think Scale is QUITE at that point, so we are probably safe for at least another few versions. They've been pushing the lackluster GUI quite hard for some time now. 
Many of us out here actually have the GUI disabled, or not installed at all, due to all the CVEs and/or an inability to move forward for various reasons. For example: there was no path forward on Power8 without 'rolling your own', and I assume the same will soon be true for our Power 9s. Your Mileage May Vary, of course.

I shudder to think about attempting to diagnose, over REST, a cluster boot after a major datacenter maintenance outage where a timeout caused various "vdisks" not to get marked active. Bet that takes much longer than with the CLI.

Ed Wahl
Ohio Supercomputer Center

-----Original Message-----
From: gpfsug-discuss On Behalf Of Jonathan Buzzard
Sent: Tuesday, August 20, 2024 10:37 AM
To: gpfsug main discussion list
Subject: [gpfsug-discuss] Native Rest API

I just had an email from IBM about the technology preview of the "Native Rest API" feature in 5.2.1.0.

There are at least two interrelated and important questions that are not answered in the web page about this "Native Rest API" feature, IMHO.

Firstly, the page says it "eliminates" the need to administer the Scale cluster with the mm-command layer. Does that mean the mm-command layer is going away? Am I in the future going to be forced to use some "naff" GUI layer to administer a GPFS cluster? Frankly, I am quite happy using the mm-command layer, thank you very much, and would like to keep it that way and just be able to ignore the GUI. I do appreciate I might be somewhat old school in that view, but nevertheless I view GUI administration of things with disdain.

Secondly, at the moment the Rest API requires installing the GUI. Does the "native" bit of the title mean that requirement is going away and there will be a Rest API without the need for the additional complexity of the GUI nodes? Or is the mm-command layer going away, and yes, you will need the extra complexity of the GUI because you are going to have to suck up administering the system with a GUI?

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
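For context on the interface being discussed, the sketch below queries the documented /scalemgmt/v2 REST endpoints that are served today by the GUI node. It is an illustration of the call pattern only: the host name and credentials are placeholders, and whether the tech-preview native API will expose the same paths through the new Admin Daemon is an assumption on the editor's part, not something stated in this thread.

    import requests

    # Placeholders: a hypothetical GUI/REST node and demo credentials.
    BASE_URL = "https://scale-gui.example.org:443/scalemgmt/v2"
    AUTH = ("admin", "changeme")

    # GET /scalemgmt/v2/filesystems is part of the documented management REST
    # API; it returns roughly the information 'mmlsfs all' gives on the CLI.
    # Field names below follow the published v2 schema, but verify them
    # against your own release. verify=False is only for a lab sketch with a
    # self-signed GUI certificate.
    resp = requests.get(f"{BASE_URL}/filesystems", auth=AUTH, verify=False)
    resp.raise_for_status()
    for fs in resp.json().get("filesystems", []):
        print(fs.get("name"))

The same pattern applies to other documented collections such as /scalemgmt/v2/cluster and /scalemgmt/v2/nodes; the open question in the thread is only where such endpoints will be served from once the GUI dependency goes away.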