From st.graf at fz-juelich.de  Thu Jun  2 15:31:43 2022
From: st.graf at fz-juelich.de (Stephan Graf)
Date: Thu, 2 Jun 2022 16:31:43 +0200
Subject: [gpfsug-discuss] Protection against silent data corruption
Message-ID: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>

Hi,

I am wondering if there is an option in SS to enable some checking to 
detect silent data corruption.

Form GNR I know that there is End-to-End integrity. So a checksum is 
stored in addition.

The background is that we are facing an issue where in some files (which 
have data replication =  2) the mmrestripefile is reporting, that one 
block is mismatching it's copy (the storage cluster is running SS 
without GNR).
We have validated that the copied block is fine, but the original one is 
broken (and this is what is returned on read access).
SS right now in our installation is unable to determine which is the 
correct one.
Is there any option to enable this kind of feature in SS? If not, does 
it make sense to create an "IDEA" for it?

Stephan

-- 
Stephan Graf
Juelich Supercomputing Centre

Phone:  +49-2461-61-6578
Fax:    +49-2461-61-6656
E-mail: st.graf at fz-juelich.de
WWW:    http://www.fz-juelich.de/jsc/
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht,
Prof. Dr. Frauke Melchior
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5360 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220602/bbefeb3f/attachment.bin>

From Achim.Rehor at de.ibm.com  Thu Jun  2 18:01:06 2022
From: Achim.Rehor at de.ibm.com (Achim Rehor)
Date: Thu, 2 Jun 2022 17:01:06 +0000
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
Message-ID: <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>

hi Stephan,

there is, see mmchconfig man page :

nsdCksumTraditional
This attribute enables checksum data-integrity checking between a traditional NSD client node and its NSD server. Valid values are yes and no. The default value is no.
(Traditional in this context means that the NSD client and server are configured with IBM Spectrum Scale rather than with IBM Spectrum Scale RAID.
The latter is a component of IBM Elastic Storage Server (ESS) and of IBM GPFS Storage Server (GSS).)

The checksum procedure detects any corruption by the network of the data in the NSD RPCs that are exchanged between the NSD client and the
server. A checksum error triggers a request to retransmit the message.

When this attribute is enabled on a client node, the client indicates in each of its requests to the server that it is using checksums. The server uses checksums only in
response to client requests in which the indicator is set. A client node that accesses a file system that belongs to another cluster can use checksums in the same way.

You can change the value of the this attribute for an entire cluster without shutting down the mmfsd daemon, or for one or more nodes without restarting the nodes.

Note:
* Enabling this feature can result in significant I/O performance degradation and a considerable increase in CPU usage.

* To enable checksums for a subset of the nodes in a cluster, issue a command like the following one:
   mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>

   The -N flag is valid for this attribute.


--

Mit freundlichen Gr??en / Kind regards

Achim Rehor

Technical Support Specialist S?pectrum Scale and ESS (SME)
Advisory Product Services Professional
IBM Systems Storage Support - EMEA

Achim.Rehor at de.ibm.com<mailto:Achim.Rehor at de.ibm.com> +49-170-4521194
IBM Deutschland GmbH
Vorsitzender des Aufsichtsrats: Sebastian Krause
Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer,
Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940


-----Original Message-----
From: Stephan Graf <st.graf at fz-juelich.de<mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e>>
Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org<mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>
To: gpfsug-discuss <gpfsug-discuss at gpfsug.org<mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e>>
Subject: [EXTERNAL] [gpfsug-discuss] Protection against silent data corruption
Date: Thu, 02 Jun 2022 16:31:43 +0200

Hi,

I am wondering if there is an option in SS to enable some checking to
detect silent data corruption.

Form GNR I know that there is End-to-End integrity. So a checksum is
stored in addition.

The background is that we are facing an issue where in some files (which
have data replication =  2) the mmrestripefile is reporting, that one
block is mismatching it's copy (the storage cluster is running SS
without GNR).
We have validated that the copied block is fine, but the original one is
broken (and this is what is returned on read access).
SS right now in our installation is unable to determine which is the
correct one.
Is there any option to enable this kind of feature in SS? If not, does
it make sense to create an "IDEA" for it?

Stephan

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220602/c2c8ee77/attachment.htm>

From ulmer at ulmer.org  Thu Jun  2 18:55:50 2022
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Thu, 2 Jun 2022 13:55:50 -0400
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
 <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
Message-ID: <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>

This only adds a checksum to the NSD wire protocol. The question was about detecting data corruption at rest.

-- 
Stephen


> On Jun 2, 2022, at 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com> wrote:
> 
> hi Stephan, 
> 
> there is, see mmchconfig man page : 
> 
> nsdCksumTraditional
> This attribute enables checksum data-integrity checking between a traditional NSD client node and its NSD server. Valid values are yes and no. The default value is no.
> (Traditional in this context means that the NSD client and server are configured with IBM Spectrum Scale rather than with IBM Spectrum Scale RAID. 
> The latter is a component of IBM Elastic Storage Server (ESS) and of IBM GPFS Storage Server (GSS).)
> 
> The checksum procedure detects any corruption by the network of the data in the NSD RPCs that are exchanged between the NSD client and the 
> server. A checksum error triggers a request to retransmit the message.
> 
> When this attribute is enabled on a client node, the client indicates in each of its requests to the server that it is using checksums. The server uses checksums only in
> response to client requests in which the indicator is set. A client node that accesses a file system that belongs to another cluster can use checksums in the same way.
> 
> You can change the value of the this attribute for an entire cluster without shutting down the mmfsd daemon, or for one or more nodes without restarting the nodes.
> 
> Note:
> * Enabling this feature can result in significant I/O performance degradation and a considerable increase in CPU usage.
> 
> * To enable checksums for a subset of the nodes in a cluster, issue a command like the following one:
>    mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>
> 
>    The -N flag is valid for this attribute.
> 
>  -- 
> Mit freundlichen Gr??en / Kind regards
> 
> Achim Rehor
> 
> Technical Support Specialist S?pectrum Scale and ESS (SME)
> Advisory Product Services Professional
> IBM Systems Storage Support - EMEA
> 
> Achim.Rehor at de.ibm.com <mailto:Achim.Rehor at de.ibm.com> +49-170-4521194
>   
> IBM Deutschland GmbH 
> Vorsitzender des Aufsichtsrats: Sebastian Krause
> Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer, 
> Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
> Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
> Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
> 
> 
> -----Original Message-----
> From: Stephan Graf <st.graf at fz-juelich.de <mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e>>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org <mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>
> To: gpfsug-discuss <gpfsug-discuss at gpfsug.org <mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e>>
> Subject: [EXTERNAL] [gpfsug-discuss] Protection against silent data corruption
> Date: Thu, 02 Jun 2022 16:31:43 +0200
> 
> Hi,
> 
> I am wondering if there is an option in SS to enable some checking to 
> detect silent data corruption.
> 
> Form GNR I know that there is End-to-End integrity. So a checksum is 
> stored in addition.
> 
> The background is that we are facing an issue where in some files (which 
> have data replication =  2) the mmrestripefile is reporting, that one 
> block is mismatching it's copy (the storage cluster is running SS 
> without GNR).
> We have validated that the copied block is fine, but the original one is 
> broken (and this is what is returned on read access).
> SS right now in our installation is unable to determine which is the 
> correct one.
> Is there any option to enable this kind of feature in SS? If not, does 
> it make sense to create an "IDEA" for it?
> 
> Stephan
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220602/ef1a4a34/attachment.htm>

From cdmaestas at us.ibm.com  Fri Jun  3 21:19:25 2022
From: cdmaestas at us.ibm.com (Christopher Maestas)
Date: Fri, 3 Jun 2022 20:19:25 +0000
Subject: [gpfsug-discuss] Spectrum Scale 5.1.4 release notes items!
Message-ID: <MN2PR15MB29607493AA96812EB25DED1680A19@MN2PR15MB2960.namprd15.prod.outlook.com>

Hello everyone!

I know I spoke to some of you at ISC 2022 this week about some of these features. They are officially out!

Check out: https://www.ibm.com/docs/en/spectrum-scale/5.1.4?topic=summary-changes
Summary of changes<https://www.ibm.com/docs/en/spectrum-scale/5.1.4?topic=summary-changes>
This topic summarizes changes to the IBM Spectrum Scale licensed program and the IBM Spectrum Scale library. Within each topic, these markers ( ) surrounding text or illustrations indicate technical changes or additions that are made to the previous edition of the information.
www.ibm.com

Particularly:
---

Control fileset access for remote clusters
Administrators can now configure access to remote cluster nodes for only a subset of filesets instead of the entire file system. For more information, see Fileset access control for remote clusters<https://www.ibm.com/docs/en/STXKQY_5.1.4/com.ibm.spectrum.scale.v5r10.doc/bl1adv_fielsetaccesscontrol.html>.

Increase in the number of independent filesets
In IBM Spectrum Scale the maximum number of independent filesets is increased from 1000 to 3000.
---

We'll talk further about this at the Scale user group in a few weeks in London!

-Chris
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220603/f7d247cc/attachment.htm>

From xhejtman at ics.muni.cz  Fri Jun  3 22:44:22 2022
From: xhejtman at ics.muni.cz (Lukas Hejtmanek)
Date: Fri, 3 Jun 2022 23:44:22 +0200
Subject: [gpfsug-discuss] Spectrum Scale 5.1.4 release notes items!
In-Reply-To: <MN2PR15MB29607493AA96812EB25DED1680A19@MN2PR15MB2960.namprd15.prod.outlook.com>
References: <MN2PR15MB29607493AA96812EB25DED1680A19@MN2PR15MB2960.namprd15.prod.outlook.com>
Message-ID: <YpqAtgCva1fIzfZx@ics.muni.cz>

Hello,

nice to see that only file set can be exported now.

We are running Kubernetes platform together with Spectrum Scale. Beside K8s,
we have also HPC clusters using GPFS/NFS exports. 

We would like to integrate storage from HPC to K8s and vice versa. 

Currently, this is a problem because in K8s almost all users are using UID
1000 for running pods while in HPC they have different UIDs. 

As far as I know, there is no possibility to remap UIDs between K8s and HPC on
the same Spectrum Scale file system. Running pods with different UIDs is hard 
option as many containers assume, they run exactly as UID 1000.

What do you think, is there anything that can be done here?

On Fri, Jun 03, 2022 at 08:19:25PM +0000, Christopher Maestas wrote:
> Hello everyone!
> 
> I know I spoke to some of you at ISC 2022 this week about some of these features. They are officially out!
> 
> Check out: https://www.ibm.com/docs/en/spectrum-scale/5.1.4?topic=summary-changes
> Summary of changes<https://www.ibm.com/docs/en/spectrum-scale/5.1.4?topic=summary-changes>
> This topic summarizes changes to the IBM Spectrum Scale licensed program and the IBM Spectrum Scale library. Within each topic, these markers ( ) surrounding text or illustrations indicate technical changes or additions that are made to the previous edition of the information.
> www.ibm.com
> 
> Particularly:
> ---
> 
> Control fileset access for remote clusters
> Administrators can now configure access to remote cluster nodes for only a subset of filesets instead of the entire file system. For more information, see Fileset access control for remote clusters<https://www.ibm.com/docs/en/STXKQY_5.1.4/com.ibm.spectrum.scale.v5r10.doc/bl1adv_fielsetaccesscontrol.html>.
> 
> Increase in the number of independent filesets
> In IBM Spectrum Scale the maximum number of independent filesets is increased from 1000 to 3000.
> ---
> 
> We'll talk further about this at the Scale user group in a few weeks in London!
> 
> -Chris

> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-- 
Luk?? Hejtm?nek

Linux Administrator only because
  Full Time Multitasking Ninja 
  is not an official job title


From jcatana at gmail.com  Fri Jun  3 22:51:48 2022
From: jcatana at gmail.com (Josh Catana)
Date: Fri, 3 Jun 2022 17:51:48 -0400
Subject: [gpfsug-discuss] Spectrum Scale 5.1.4 release notes items!
In-Reply-To: <YpqAtgCva1fIzfZx@ics.muni.cz>
References: <MN2PR15MB29607493AA96812EB25DED1680A19@MN2PR15MB2960.namprd15.prod.outlook.com>
 <YpqAtgCva1fIzfZx@ics.muni.cz>
Message-ID: <CAJOKg0QDQZyPKHvz3c=QtqM91JQdq+QtBPtjU9okOkek0zDkaw@mail.gmail.com>

I force my users to runAsUser their user ID in order to access storage (
enforced by OPA policy) and maintain POSIX complaince. I put the
responsibility of being able to run as non-root and the container creator.
I feel like this is growing as standard to run as non-root for things that
aren't system level operators in k8s.
If they aren't accessing storage, I don't care what UID they run as.

On Fri, Jun 3, 2022, 5:46 PM Lukas Hejtmanek <xhejtman at ics.muni.cz> wrote:

> Hello,
>
> nice to see that only file set can be exported now.
>
> We are running Kubernetes platform together with Spectrum Scale. Beside
> K8s,
> we have also HPC clusters using GPFS/NFS exports.
>
> We would like to integrate storage from HPC to K8s and vice versa.
>
> Currently, this is a problem because in K8s almost all users are using UID
> 1000 for running pods while in HPC they have different UIDs.
>
> As far as I know, there is no possibility to remap UIDs between K8s and
> HPC on
> the same Spectrum Scale file system. Running pods with different UIDs is
> hard
> option as many containers assume, they run exactly as UID 1000.
>
> What do you think, is there anything that can be done here?
>
> On Fri, Jun 03, 2022 at 08:19:25PM +0000, Christopher Maestas wrote:
> > Hello everyone!
> >
> > I know I spoke to some of you at ISC 2022 this week about some of these
> features. They are officially out!
> >
> > Check out:
> https://www.ibm.com/docs/en/spectrum-scale/5.1.4?topic=summary-changes
> > Summary of changes<
> https://www.ibm.com/docs/en/spectrum-scale/5.1.4?topic=summary-changes>
> > This topic summarizes changes to the IBM Spectrum Scale licensed program
> and the IBM Spectrum Scale library. Within each topic, these markers ( )
> surrounding text or illustrations indicate technical changes or additions
> that are made to the previous edition of the information.
> > www.ibm.com
> >
> > Particularly:
> > ---
> >
> > Control fileset access for remote clusters
> > Administrators can now configure access to remote cluster nodes for only
> a subset of filesets instead of the entire file system. For more
> information, see Fileset access control for remote clusters<
> https://www.ibm.com/docs/en/STXKQY_5.1.4/com.ibm.spectrum.scale.v5r10.doc/bl1adv_fielsetaccesscontrol.html
> >.
> >
> > Increase in the number of independent filesets
> > In IBM Spectrum Scale the maximum number of independent filesets is
> increased from 1000 to 3000.
> > ---
> >
> > We'll talk further about this at the Scale user group in a few weeks in
> London!
> >
> > -Chris
>
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at gpfsug.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
>
> --
> Luk?? Hejtm?nek
>
> Linux Administrator only because
>   Full Time Multitasking Ninja
>   is not an official job title
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220603/059dcf39/attachment.htm>

From leslie.james.elliott at gmail.com  Sat Jun  4 08:26:56 2022
From: leslie.james.elliott at gmail.com (leslie elliott)
Date: Sat, 4 Jun 2022 17:26:56 +1000
Subject: [gpfsug-discuss] Watch folders
Message-ID: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>

Hi all

I was wondering if anyone had any scoping suggestions for enabling this
feature for multiple filesystems with SMB and NFS shares

We are running a standalone kafka cluster, not part of spectrumscale,
and each of the multiple file system watches, update this with individual
topics
for each file system

We have noticed file system access being affected negatively by the watches
when we were running all the 10 filesystems at the same time.

All of the filesets are AFM, some to NFS homes, and some to NSD homes

any feedback appreciated

leslie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220604/470ef1b5/attachment.htm>

From scale at us.ibm.com  Tue Jun  7 20:53:00 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Wed, 8 Jun 2022 01:23:00 +0530
Subject: [gpfsug-discuss] Watch folders
In-Reply-To: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
References: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
Message-ID: <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>


Hi Jake,

Can you or some from your squad please answer the below Watch Folder query.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.


From:	"leslie elliott" <leslie.james.elliott at gmail.com>
To:	"gpfsug main discussion list"
            <gpfsug-discuss at spectrumscale.org>
Date:	04-06-2022 12.58 PM
Subject:	[EXTERNAL] [gpfsug-discuss] Watch folders
Sent by:	"gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi all I was wondering if anyone had any scoping suggestions for enabling
this? feature for multiple filesystems with SMB and NFS shares? We are
running a standalone kafka cluster, not part of spectrumscale,? and each of
the multiple file system
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd
Hi all

I was wondering if anyone had any scoping suggestions for enabling this
feature for multiple filesystems with SMB and NFS shares

We are running a standalone kafka cluster, not part of spectrumscale,
and each of the multiple file system watches, update this with individual
topics
for each file system

We have noticed file system access being affected negatively by the
watches
when we were running all the 10 filesystems at the same time.

All of the filesets are AFM, some to NFS homes, and some to NSD homes

any feedback appreciated

leslie
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220608/8b7e4e28/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220608/8b7e4e28/attachment.gif>

From scale at us.ibm.com  Tue Jun  7 20:53:00 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Wed, 8 Jun 2022 01:23:00 +0530
Subject: [gpfsug-discuss] Watch folders
In-Reply-To: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
References: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
Message-ID: <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>


Hi Jake,

Can you or some from your squad please answer the below Watch Folder query.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.


From:	"leslie elliott" <leslie.james.elliott at gmail.com>
To:	"gpfsug main discussion list"
            <gpfsug-discuss at spectrumscale.org>
Date:	04-06-2022 12.58 PM
Subject:	[EXTERNAL] [gpfsug-discuss] Watch folders
Sent by:	"gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi all I was wondering if anyone had any scoping suggestions for enabling
this? feature for multiple filesystems with SMB and NFS shares? We are
running a standalone kafka cluster, not part of spectrumscale,? and each of
the multiple file system
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd
Hi all

I was wondering if anyone had any scoping suggestions for enabling this
feature for multiple filesystems with SMB and NFS shares

We are running a standalone kafka cluster, not part of spectrumscale,
and each of the multiple file system watches, update this with individual
topics
for each file system

We have noticed file system access being affected negatively by the
watches
when we were running all the 10 filesystems at the same time.

All of the filesets are AFM, some to NFS homes, and some to NSD homes

any feedback appreciated

leslie
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220608/8b7e4e28/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220608/8b7e4e28/attachment-0001.gif>

From scale at us.ibm.com  Wed Jun  8 19:35:05 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Thu, 9 Jun 2022 00:05:05 +0530
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
 <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
 <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>
Message-ID: <OFAF98DBA1.800ECFBC-ON8525885B.0065E15A-6525885B.006616EA@ibm.com>


Hi Stephen,

Currently such a feature is not available in Spectrum Scale product.


Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.


From:	"Stephen Ulmer" <ulmer at ulmer.org>
To:	"gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>
Date:	02-06-2022 11.32 PM
Subject:	[EXTERNAL] Re: [gpfsug-discuss] Protection against silent data
            corruption
Sent by:	"gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


This only adds a checksum to the NSD wire protocol. The question was about
detecting data corruption at rest. -- Stephen On Jun 2, 2022, at 1:01 PM,
Achim Rehor <Achim.Rehor at de.ibm.com> wrote: hi Stephan, ????????????????????????????
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd
This only adds a checksum to the NSD wire protocol. The question was about
detecting data corruption at rest.

--
Stephen


      On Jun 2, 2022, at 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com>
      wrote:

      hi Stephan,

      there is, see mmchconfig man page :

      nsdCksumTraditional
      This attribute enables checksum data-integrity checking between a
      traditional NSD client node and its NSD server. Valid values are yes
      and no. The default value is no.
      (Traditional in this context means that the NSD client and server are
      configured with IBM Spectrum Scale rather than with IBM Spectrum
      Scale RAID.
      The latter is a component of IBM Elastic Storage Server (ESS) and of
      IBM GPFS Storage Server (GSS).)

      The checksum procedure detects any corruption by the network of the
      data in the NSD RPCs that are exchanged between the NSD client and
      the
      server. A checksum error triggers a request to retransmit the
      message.

      When this attribute is enabled on a client node, the client indicates
      in each of its requests to the server that it is using checksums. The
      server uses checksums only in
      response to client requests in which the indicator is set. A client
      node that accesses a file system that belongs to another cluster can
      use checksums in the same way.

      You can change the value of the this attribute for an entire cluster
      without shutting down the mmfsd daemon, or for one or more nodes
      without restarting the nodes.

      Note:
      * Enabling this feature can result in significant I/O performance
      degradation and a considerable increase in CPU usage.

      * To enable checksums for a subset of the nodes in a cluster, issue a
      command like the following one:
         mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>

         The -N flag is valid for this attribute.

      --
      Mit freundlichen Gr??en / Kind regards

      Achim Rehor

      Technical Support Specialist S?pectrum Scale and ESS (SME)
      Advisory Product Services Professional
      IBM Systems Storage Support - EMEA

      Achim.Rehor at de.ibm.com +49-170-4521194
      IBM Deutschland GmbH
      Vorsitzender des Aufsichtsrats: Sebastian Krause
      Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer,
      Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
      Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
      Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940


      -----Original Message-----
      From: Stephan Graf <st.graf at fz-juelich.de>
      Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
      To: gpfsug-discuss <gpfsug-discuss at gpfsug.org>
      Subject: [EXTERNAL] [gpfsug-discuss] Protection against silent data
      corruption
      Date: Thu, 02 Jun 2022 16:31:43 +0200

      Hi,

      I am wondering if there is an option in SS to enable some checking to
      detect silent data corruption.

      Form GNR I know that there is End-to-End integrity. So a checksum is
      stored in addition.

      The background is that we are facing an issue where in some files
      (which
      have data replication =  2) the mmrestripefile is reporting, that one
      block is mismatching it's copy (the storage cluster is running SS
      without GNR).
      We have validated that the copied block is fine, but the original one
      is
      broken (and this is what is returned on read access).
      SS right now in our installation is unable to determine which is the
      correct one.
      Is there any option to enable this kind of feature in SS? If not,
      does
      it make sense to create an "IDEA" for it?

      Stephan

      _______________________________________________
      gpfsug-discuss mailing list
      gpfsug-discuss at gpfsug.org
      http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
      _______________________________________________
      gpfsug-discuss mailing list
      gpfsug-discuss at gpfsug.org
      http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220609/d6acfc3d/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220609/d6acfc3d/attachment.gif>

From st.graf at fz-juelich.de  Thu Jun  9 06:59:13 2022
From: st.graf at fz-juelich.de (Stephan Graf)
Date: Thu, 9 Jun 2022 07:59:13 +0200
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <OFAF98DBA1.800ECFBC-ON8525885B.0065E15A-6525885B.006616EA@ibm.com>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
 <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
 <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>
 <OFAF98DBA1.800ECFBC-ON8525885B.0065E15A-6525885B.006616EA@ibm.com>
Message-ID: <101bf257-ee13-11fd-95f4-523135dbb57b@fz-juelich.de>

Hi,

I have create an IDEA for it: 
https://ibm-sys-storage.ideas.ibm.com/ideas/GPFS-I-851

Stephan


Am 08.06.2022 um 20:35 schrieb IBM Spectrum Scale:
> Hi Stephen,
> 
> Currently such a feature is not available in Spectrum Scale product.
> 
> 
> Regards, The Spectrum Scale (GPFS) team
> 
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of ?Spectrum 
> Scale (GPFS), then please post it to the public IBM developerWroks Forum 
> at 
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 
> <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479>. 
> 
> 
> If your query concerns a potential software error in Spectrum Scale 
> (GPFS) and you have an IBM software maintenance contract please contact 
>  ?1-800-237-5511 in the United States or your local IBM Service Center 
> in other countries.
> 
> The forum is informally monitored as time permits and should not be used 
> for priority messages to the Spectrum Scale (GPFS) team.
> 
> Inactive hide details for "Stephen Ulmer" ---02-06-2022 11.32.27 
> PM---This only adds a checksum to the NSD wire protocol. The q"Stephen 
> Ulmer" ---02-06-2022 11.32.27 PM---This only adds a checksum to the NSD 
> wire protocol. The question was about detecting data corruption
> 
> From: "Stephen Ulmer" <ulmer at ulmer.org>
> To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>
> Date: 02-06-2022 11.32 PM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data 
> corruption
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> 
> ------------------------------------------------------------------------
> 
> 
> 
> This only adds a checksum to the NSD wire protocol. The question was 
> about detecting data corruption at rest. -- Stephen On Jun 2, 2022, at 
> 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com> wrote: hi Stephan, 
> ????????????????????????????
> ZjQcmQRYFpfptBannerStart
> *This Message Is From an External Sender *
> This message came from outside your organization.
> 
> ZjQcmQRYFpfptBannerEnd
> This only adds a checksum to the NSD wire protocol. The question was 
> about detecting data corruption at rest.
> 
> -- 
> Stephen
> 
> 
>     On Jun 2, 2022, at 1:01 PM, Achim Rehor <_Achim.Rehor at de.ibm.com_
>     <mailto:Achim.Rehor at de.ibm.com>> wrote:
> 
>     hi Stephan,
> 
>     there is, see mmchconfig man page :
> 
>     nsdCksumTraditional
>     This attribute enables checksum data-integrity checking between a
>     traditional NSD client node and its NSD server. Valid values are yes
>     and no. The default value is no.
>     (Traditional in this context means that the NSD client and server
>     are configured with IBM Spectrum Scale rather than with IBM Spectrum
>     Scale RAID.
>     The latter is a component of IBM Elastic Storage Server (ESS) and of
>     IBM GPFS Storage Server (GSS).)
> 
>     The checksum procedure detects any corruption by the network of the
>     data in the NSD RPCs that are exchanged between the NSD client and the
>     server. A checksum error triggers a request to retransmit the message.
> 
>     When this attribute is enabled on a client node, the client
>     indicates in each of its requests to the server that it is using
>     checksums. The server uses checksums only in
>     response to client requests in which the indicator is set. A client
>     node that accesses a file system that belongs to another cluster can
>     use checksums in the same way.
> 
>     You can change the value of the this attribute for an entire cluster
>     without shutting down the mmfsd daemon, or for one or more nodes
>     without restarting the nodes.
> 
>     Note:
>     * Enabling this feature can result in significant I/O performance
>     degradation and a considerable increase in CPU usage.
> 
>     * To enable checksums for a subset of the nodes in a cluster, issue
>     a command like the following one:
>      ? ?mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>
> 
>      ? ?The -N flag is valid for this attribute.
> 
>     -- 
>     Mit freundlichen Gr??en / Kind regards
> 
>     Achim Rehor
> 
>     Technical Support Specialist S?pectrum Scale and ESS (SME)
>     Advisory Product Services Professional
>     IBM Systems Storage Support - EMEA
> 
>     _Achim.Rehor at de.ibm.com_
>     <mailto:Achim.Rehor at de.ibm.com>?+49-170-4521194
>     IBM Deutschland GmbH
>     Vorsitzender des Aufsichtsrats: Sebastian Krause
>     Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer,
>     Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
>     Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
>     Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
> 
> 
>     -----Original Message-----
>     *From*: Stephan Graf <_st.graf at fz-juelich.de_
>     <mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e>>
>     *Reply-To*: gpfsug main discussion list <_gpfsug-discuss at gpfsug.org_
>     <mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>
>     *To*: gpfsug-discuss <_gpfsug-discuss at gpfsug.org_
>     <mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e>>
>     *Subject*: [EXTERNAL] [gpfsug-discuss] Protection against silent
>     data corruption
>     *Date*: Thu, 02 Jun 2022 16:31:43 +0200
> 
>     Hi,
> 
>     I am wondering if there is an option in SS to enable some checking to
>     detect silent data corruption.
> 
>     Form GNR I know that there is End-to-End integrity. So a checksum is
>     stored in addition.
> 
>     The background is that we are facing an issue where in some files
>     (which
>     have data replication = ?2) the mmrestripefile is reporting, that one
>     block is mismatching it's copy (the storage cluster is running SS
>     without GNR).
>     We have validated that the copied block is fine, but the original
>     one is
>     broken (and this is what is returned on read access).
>     SS right now in our installation is unable to determine which is the
>     correct one.
>     Is there any option to enable this kind of feature in SS? If not, does
>     it make sense to create an "IDEA" for it?
> 
>     Stephan
> 
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org>
>     _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_
>     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org>_
>     __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_
>     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 
> <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> 
> 
> 
> 
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

-- 
Stephan Graf
Juelich Supercomputing Centre

Phone:  +49-2461-61-6578
Fax:    +49-2461-61-6656
E-mail: st.graf at fz-juelich.de
WWW:    http://www.fz-juelich.de/jsc/
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht,
Prof. Dr. Frauke Melchior
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5360 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220609/cf20f600/attachment.bin>

From scale at us.ibm.com  Thu Jun  9 19:45:40 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Fri, 10 Jun 2022 00:15:40 +0530
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <101bf257-ee13-11fd-95f4-523135dbb57b@fz-juelich.de>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
 <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
 <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>
 <OFAF98DBA1.800ECFBC-ON8525885B.0065E15A-6525885B.006616EA@ibm.com>
 <101bf257-ee13-11fd-95f4-523135dbb57b@fz-juelich.de>
Message-ID: <OF864448A1.6B152580-ON8525885C.0066C439-6525885C.00670ED6@ibm.com>


Thanks Stephan.
This will be looked into and accordingly prioritized by the offering
manager team. Incase the IBM team has any further questions on this then we
will get back to you.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.


From:	"Stephan Graf" <st.graf at fz-juelich.de>
To:	<gpfsug-discuss at gpfsug.org>
Date:	09-06-2022 11.31 AM
Subject:	[EXTERNAL] Re: [gpfsug-discuss] Protection against silent data
            corruption
Sent by:	"gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi,

I have create an IDEA for it:
https://ibm-sys-storage.ideas.ibm.com/ideas/GPFS-I-851

Stephan


Am 08.06.2022 um 20:35 schrieb IBM Spectrum Scale:
> Hi Stephen,
>
> Currently such a feature is not available in Spectrum Scale product.
>
>
> Regards, The Spectrum Scale (GPFS) team
>
>
------------------------------------------------------------------------------------------------------------------

> If you feel that your question can benefit other users of ?Spectrum
> Scale (GPFS), then please post it to the public IBM developerWroks Forum
> at
>
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479

> <
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
>.
>
>
> If your query concerns a potential software error in Spectrum Scale
> (GPFS) and you have an IBM software maintenance contract please contact
>  ?1-800-237-5511 in the United States or your local IBM Service Center
> in other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> Inactive hide details for "Stephen Ulmer" ---02-06-2022 11.32.27
> PM---This only adds a checksum to the NSD wire protocol. The q"Stephen
> Ulmer" ---02-06-2022 11.32.27 PM---This only adds a checksum to the NSD
> wire protocol. The question was about detecting data corruption
>
> From: "Stephen Ulmer" <ulmer at ulmer.org>
> To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>
> Date: 02-06-2022 11.32 PM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data
> corruption
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
>
> ------------------------------------------------------------------------
>
>
>
> This only adds a checksum to the NSD wire protocol. The question was
> about detecting data corruption at rest. -- Stephen On Jun 2, 2022, at
> 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com> wrote: hi Stephan,
> ????????????????????????????
> ZjQcmQRYFpfptBannerStart
> *This Message Is From an External Sender *
> This message came from outside your organization.
>
> ZjQcmQRYFpfptBannerEnd
> This only adds a checksum to the NSD wire protocol. The question was
> about detecting data corruption at rest.
>
> --
> Stephen
>
>
>     On Jun 2, 2022, at 1:01 PM, Achim Rehor <_Achim.Rehor at de.ibm.com_
>     <mailto:Achim.Rehor at de.ibm.com>> wrote:
>
>     hi Stephan,
>
>     there is, see mmchconfig man page :
>
>     nsdCksumTraditional
>     This attribute enables checksum data-integrity checking between a
>     traditional NSD client node and its NSD server. Valid values are yes
>     and no. The default value is no.
>     (Traditional in this context means that the NSD client and server
>     are configured with IBM Spectrum Scale rather than with IBM Spectrum
>     Scale RAID.
>     The latter is a component of IBM Elastic Storage Server (ESS) and of
>     IBM GPFS Storage Server (GSS).)
>
>     The checksum procedure detects any corruption by the network of the
>     data in the NSD RPCs that are exchanged between the NSD client and
the
>     server. A checksum error triggers a request to retransmit the
message.
>
>     When this attribute is enabled on a client node, the client
>     indicates in each of its requests to the server that it is using
>     checksums. The server uses checksums only in
>     response to client requests in which the indicator is set. A client
>     node that accesses a file system that belongs to another cluster can
>     use checksums in the same way.
>
>     You can change the value of the this attribute for an entire cluster
>     without shutting down the mmfsd daemon, or for one or more nodes
>     without restarting the nodes.
>
>     Note:
>     * Enabling this feature can result in significant I/O performance
>     degradation and a considerable increase in CPU usage.
>
>     * To enable checksums for a subset of the nodes in a cluster, issue
>     a command like the following one:
>      ? ?mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>
>
>      ? ?The -N flag is valid for this attribute.
>
>     --
>     Mit freundlichen Gr??en / Kind regards
>
>     Achim Rehor
>
>     Technical Support Specialist S?pectrum Scale and ESS (SME)
>     Advisory Product Services Professional
>     IBM Systems Storage Support - EMEA
>
>     _Achim.Rehor at de.ibm.com_
>     <mailto:Achim.Rehor at de.ibm.com>?+49-170-4521194
>     IBM Deutschland GmbH
>     Vorsitzender des Aufsichtsrats: Sebastian Krause
>     Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer,
>     Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
>     Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
>     Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
>
>
>     -----Original Message-----
>     *From*: Stephan Graf <_st.graf at fz-juelich.de_
>     <mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e>>
>     *Reply-To*: gpfsug main discussion list <_gpfsug-discuss at gpfsug.org_
>     <
mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e
>>
>     *To*: gpfsug-discuss <_gpfsug-discuss at gpfsug.org_
>     <mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e>>
>     *Subject*: [EXTERNAL] [gpfsug-discuss] Protection against silent
>     data corruption
>     *Date*: Thu, 02 Jun 2022 16:31:43 +0200
>
>     Hi,
>
>     I am wondering if there is an option in SS to enable some checking to
>     detect silent data corruption.
>
>     Form GNR I know that there is End-to-End integrity. So a checksum is
>     stored in addition.
>
>     The background is that we are facing an issue where in some files
>     (which
>     have data replication = ?2) the mmrestripefile is reporting, that one
>     block is mismatching it's copy (the storage cluster is running SS
>     without GNR).
>     We have validated that the copied block is fine, but the original
>     one is
>     broken (and this is what is returned on read access).
>     SS right now in our installation is unable to determine which is the
>     correct one.
>     Is there any option to enable this kind of feature in SS? If not,
does
>     it make sense to create an "IDEA" for it?
>
>     Stephan
>
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org>
>     _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_
>     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org>_
>     __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_
>     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
> <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>
>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

--
Stephan Graf
Juelich Supercomputing Centre

Phone:  +49-2461-61-6578
Fax:    +49-2461-61-6656
E-mail: st.graf at fz-juelich.de
WWW:    http://www.fz-juelich.de/jsc/
---------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------

Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht,
Prof. Dr. Frauke Melchior
---------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------

[attachment "smime.p7s" deleted by Huzefa H Pancha/India/IBM]
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220610/4926f1ac/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220610/4926f1ac/attachment.gif>

From ulmer at ulmer.org  Thu Jun  9 20:47:07 2022
From: ulmer at ulmer.org (Stephen Ulmer)
Date: Thu, 9 Jun 2022 15:47:07 -0400
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <OF864448A1.6B152580-ON8525885C.0066C439-6525885C.00670ED6@ibm.com>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
 <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
 <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>
 <OFAF98DBA1.800ECFBC-ON8525885B.0065E15A-6525885B.006616EA@ibm.com>
 <101bf257-ee13-11fd-95f4-523135dbb57b@fz-juelich.de>
 <OF864448A1.6B152580-ON8525885C.0066C439-6525885C.00670ED6@ibm.com>
Message-ID: <6423D118-609A-4767-8F96-79B1D8EB4C8F@ulmer.org>

Just to be clear: any follow-up should be directed to Stephan, who is requesting the feature.

I am well aware that Scale does not provide this feature, and was just clarifying Stephan?s question for Achim, who answered the question with an unrelated reference after which Scale support replied to me.

This is also where I notice that for all that is holy, the generated IDEA links point to DeveloperWorks and don?t even get you to the correct forum thread. Sigh.

-- 
Stephen


> On Jun 9, 2022, at 2:45 PM, IBM Spectrum Scale <scale at us.ibm.com> wrote:
> 
> Thanks Stephan.
> This will be looked into and accordingly prioritized by the offering manager team. Incase the IBM team has any further questions on this then we will get back to you.
> 
> Regards, The Spectrum Scale (GPFS) team
> 
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of  Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479>. 
> 
> If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact  1-800-237-5511 in the United States or your local IBM Service Center in other countries. 
> 
> The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.
> 
> <graycol.gif>"Stephan Graf" ---09-06-2022 11.31.01 AM---Hi, I have create an IDEA for it:
> 
> From: "Stephan Graf" <st.graf at fz-juelich.de>
> To: <gpfsug-discuss at gpfsug.org>
> Date: 09-06-2022 11.31 AM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data corruption
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> 
> 
> 
> 
> Hi,
> 
> I have create an IDEA for it: 
> https://ibm-sys-storage.ideas.ibm.com/ideas/GPFS-I-851 <https://ibm-sys-storage.ideas.ibm.com/ideas/GPFS-I-851>
> 
> Stephan
> 
> 
> Am 08.06.2022 um 20:35 schrieb IBM Spectrum Scale:
> > Hi Stephen,
> > 
> > Currently such a feature is not available in Spectrum Scale product.
> > 
> > 
> > Regards, The Spectrum Scale (GPFS) team
> > 
> > ------------------------------------------------------------------------------------------------------------------
> > If you feel that your question can benefit other users of  Spectrum 
> > Scale (GPFS), then please post it to the public IBM developerWroks Forum 
> > at 
> > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479> 
> > <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479>>. 
> > 
> > 
> > If your query concerns a potential software error in Spectrum Scale 
> > (GPFS) and you have an IBM software maintenance contract please contact 
> >   1-800-237-5511 in the United States or your local IBM Service Center 
> > in other countries.
> > 
> > The forum is informally monitored as time permits and should not be used 
> > for priority messages to the Spectrum Scale (GPFS) team.
> > 
> > Inactive hide details for "Stephen Ulmer" ---02-06-2022 11.32.27 
> > PM---This only adds a checksum to the NSD wire protocol. The q"Stephen 
> > Ulmer" ---02-06-2022 11.32.27 PM---This only adds a checksum to the NSD 
> > wire protocol. The question was about detecting data corruption
> > 
> > From: "Stephen Ulmer" <ulmer at ulmer.org>
> > To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>
> > Date: 02-06-2022 11.32 PM
> > Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data 
> > corruption
> > Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> > 
> > ------------------------------------------------------------------------
> > 
> > 
> > 
> > This only adds a checksum to the NSD wire protocol. The question was 
> > about detecting data corruption at rest. -- Stephen On Jun 2, 2022, at 
> > 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com> wrote: hi Stephan, 
> > ????????????????????????????
> > 
> > This only adds a checksum to the NSD wire protocol. The question was 
> > about detecting data corruption at rest.
> > 
> > -- 
> > Stephen
> > 
> > 
> >     On Jun 2, 2022, at 1:01 PM, Achim Rehor <_Achim.Rehor at de.ibm.com_
> >     <mailto:Achim.Rehor at de.ibm.com <mailto:Achim.Rehor at de.ibm.com>>> wrote:
> > 
> >     hi Stephan,
> > 
> >     there is, see mmchconfig man page :
> > 
> >     nsdCksumTraditional
> >     This attribute enables checksum data-integrity checking between a
> >     traditional NSD client node and its NSD server. Valid values are yes
> >     and no. The default value is no.
> >     (Traditional in this context means that the NSD client and server
> >     are configured with IBM Spectrum Scale rather than with IBM Spectrum
> >     Scale RAID.
> >     The latter is a component of IBM Elastic Storage Server (ESS) and of
> >     IBM GPFS Storage Server (GSS).)
> > 
> >     The checksum procedure detects any corruption by the network of the
> >     data in the NSD RPCs that are exchanged between the NSD client and the
> >     server. A checksum error triggers a request to retransmit the message.
> > 
> >     When this attribute is enabled on a client node, the client
> >     indicates in each of its requests to the server that it is using
> >     checksums. The server uses checksums only in
> >     response to client requests in which the indicator is set. A client
> >     node that accesses a file system that belongs to another cluster can
> >     use checksums in the same way.
> > 
> >     You can change the value of the this attribute for an entire cluster
> >     without shutting down the mmfsd daemon, or for one or more nodes
> >     without restarting the nodes.
> > 
> >     Note:
> >     * Enabling this feature can result in significant I/O performance
> >     degradation and a considerable increase in CPU usage.
> > 
> >     * To enable checksums for a subset of the nodes in a cluster, issue
> >     a command like the following one:
> >         mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>
> > 
> >         The -N flag is valid for this attribute.
> > 
> >     -- 
> >     Mit freundlichen Gr??en / Kind regards
> > 
> >     Achim Rehor
> > 
> >     Technical Support Specialist S?pectrum Scale and ESS (SME)
> >     Advisory Product Services Professional
> >     IBM Systems Storage Support - EMEA
> > 
> >     _Achim.Rehor at de.ibm.com_
> >     <mailto:Achim.Rehor at de.ibm.com <mailto:Achim.Rehor at de.ibm.com>> +49-170-4521194
> >     IBM Deutschland GmbH
> >     Vorsitzender des Aufsichtsrats: Sebastian Krause
> >     Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer,
> >     Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
> >     Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
> >     Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
> > 
> > 
> >     -----Original Message-----
> >     *From*: Stephan Graf <_st.graf at fz-juelich.de_
> >     <mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e <mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e>>>
> >     *Reply-To*: gpfsug main discussion list <_gpfsug-discuss at gpfsug.org_
> >     <mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e <mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>>
> >     *To*: gpfsug-discuss <_gpfsug-discuss at gpfsug.org_
> >     <mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e <mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e>>>
> >     *Subject*: [EXTERNAL] [gpfsug-discuss] Protection against silent
> >     data corruption
> >     *Date*: Thu, 02 Jun 2022 16:31:43 +0200
> > 
> >     Hi,
> > 
> >     I am wondering if there is an option in SS to enable some checking to
> >     detect silent data corruption.
> > 
> >     Form GNR I know that there is End-to-End integrity. So a checksum is
> >     stored in addition.
> > 
> >     The background is that we are facing an issue where in some files
> >     (which
> >     have data replication =  2) the mmrestripefile is reporting, that one
> >     block is mismatching it's copy (the storage cluster is running SS
> >     without GNR).
> >     We have validated that the copied block is fine, but the original
> >     one is
> >     broken (and this is what is returned on read access).
> >     SS right now in our installation is unable to determine which is the
> >     correct one.
> >     Is there any option to enable this kind of feature in SS? If not, does
> >     it make sense to create an "IDEA" for it?
> > 
> >     Stephan
> > 
> >     _______________________________________________
> >     gpfsug-discuss mailing list
> >     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org <http://gpfsug.org/>>
> >     _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_ <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_>
> >     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>>
> >     _______________________________________________
> >     gpfsug-discuss mailing list
> >     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org <http://gpfsug.org/>>_
> >     __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_ <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_>
> >     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>>
> > 
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at gpfsug.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org> 
> > <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>>
> > 
> > 
> > 
> > 
> > 
> > _______________________________________________
> > gpfsug-discuss mailing list
> > gpfsug-discuss at gpfsug.org
> > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> 
> -- 
> Stephan Graf
> Juelich Supercomputing Centre
> 
> Phone:  +49-2461-61-6578
> Fax:    +49-2461-61-6656
> E-mail: st.graf at fz-juelich.de
> WWW:    http://www.fz-juelich.de/jsc/ <http://www.fz-juelich.de/jsc/>
> ---------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------------------------------
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht,
> Prof. Dr. Frauke Melchior
> ---------------------------------------------------------------------------------------------
> ---------------------------------------------------------------------------------------------
> [attachment "smime.p7s" deleted by Huzefa H Pancha/India/IBM] _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> 
> 
> 
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220609/096fa9d5/attachment.htm>

From Achim.Rehor at de.ibm.com  Fri Jun 10 09:01:01 2022
From: Achim.Rehor at de.ibm.com (Achim Rehor)
Date: Fri, 10 Jun 2022 08:01:01 +0000
Subject: [gpfsug-discuss] Protection against silent data corruption
In-Reply-To: <6423D118-609A-4767-8F96-79B1D8EB4C8F@ulmer.org>
References: <804f4f79-e852-9713-6253-f006b1920c11@fz-juelich.de>
 <e55ee73fa20f855f29c9f40e5105b51aba7fe10d.camel@de.ibm.com>
 <8359A397-6332-4791-A153-DF6752EE4806@ulmer.org>
 <OFAF98DBA1.800ECFBC-ON8525885B.0065E15A-6525885B.006616EA@ibm.com>
 <101bf257-ee13-11fd-95f4-523135dbb57b@fz-juelich.de>
 <OF864448A1.6B152580-ON8525885C.0066C439-6525885C.00670ED6@ibm.com>
 <6423D118-609A-4767-8F96-79B1D8EB4C8F@ulmer.org>
Message-ID: <da9bc89dec7eeeb201a5f989fde08983c5c3a84c.camel@de.ibm.com>

Thanks Stephen,

for clarifying, i misread the initial question, and thanks Stefan for raising that IDEA.
The new address for raising RFEs/IDEAs on GPFS now is : https://ibm-sys-storage.ideas.ibm.com/ideas?project=GPFS


--

Mit freundlichen Gr??en / Kind regards

Achim Rehor

-----Original Message-----
From: Stephen Ulmer <ulmer at ulmer.org<mailto:Stephen%20Ulmer%20%3culmer at ulmer.org%3e>>
Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org<mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>
To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org<mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>
Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data corruption
Date: Thu, 09 Jun 2022 15:47:07 -0400

Just to be clear: any follow-up should be directed to Stephan, who is requesting the feature. I am well aware that Scale does not provide this feature, and was just clarifying Stephan?s question for Achim, who answered the question with an
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd
Just to be clear: any follow-up should be directed to Stephan, who is requesting the feature.

I am well aware that Scale does not provide this feature, and was just clarifying Stephan?s question for Achim, who answered the question with an unrelated reference after which Scale support replied to me.

This is also where I notice that for all that is holy, the generated IDEA links point to DeveloperWorks and don?t even get you to the correct forum thread. Sigh.

--
Stephen


On Jun 9, 2022, at 2:45 PM, IBM Spectrum Scale <scale at us.ibm.com<mailto:scale at us.ibm.com>> wrote:


Thanks Stephan.
This will be looked into and accordingly prioritized by the offering manager team. Incase the IBM team has any further questions on this then we will get back to you.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of  Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.

If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact  1-800-237-5511 in the United States or your local IBM Service Center in other countries.

The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team.

<graycol.gif>"Stephan Graf" ---09-06-2022 11.31.01 AM---Hi, I have create an IDEA for it:

From: "Stephan Graf" <st.graf at fz-juelich.de<mailto:st.graf at fz-juelich.de>>
To: <gpfsug-discuss at gpfsug.org<mailto:gpfsug-discuss at gpfsug.org>>
Date: 09-06-2022 11.31 AM
Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data corruption
Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org<mailto:gpfsug-discuss-bounces at gpfsug.org>>

________________________________


Hi,

I have create an IDEA for it:
https://ibm-sys-storage.ideas.ibm.com/ideas/GPFS-I-851

Stephan


Am 08.06.2022 um 20:35 schrieb IBM Spectrum Scale:
> Hi Stephen,
>
> Currently such a feature is not available in Spectrum Scale product.
>
>
> Regards, The Spectrum Scale (GPFS) team
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of  Spectrum
> Scale (GPFS), then please post it to the public IBM developerWroks Forum
> at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
> <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479>.
>
>
> If your query concerns a potential software error in Spectrum Scale
> (GPFS) and you have an IBM software maintenance contract please contact
>   1-800-237-5511 in the United States or your local IBM Service Center
> in other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> Inactive hide details for "Stephen Ulmer" ---02-06-2022 11.32.27
> PM---This only adds a checksum to the NSD wire protocol. The q"Stephen
> Ulmer" ---02-06-2022 11.32.27 PM---This only adds a checksum to the NSD
> wire protocol. The question was about detecting data corruption
>
> From: "Stephen Ulmer" <ulmer at ulmer.org<mailto:ulmer at ulmer.org>>
> To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org<mailto:gpfsug-discuss at gpfsug.org>>
> Date: 02-06-2022 11.32 PM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data
> corruption
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org<mailto:gpfsug-discuss-bounces at gpfsug.org>>
>
> ------------------------------------------------------------------------
>
>
>
> This only adds a checksum to the NSD wire protocol. The question was
> about detecting data corruption at rest. -- Stephen On Jun 2, 2022, at
> 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com<mailto:Achim.Rehor at de.ibm.com>> wrote: hi Stephan,
> ????????????????????????????
>
> This only adds a checksum to the NSD wire protocol. The question was
> about detecting data corruption at rest.
>
> --
> Stephen
>
>
>     On Jun 2, 2022, at 1:01 PM, Achim Rehor <_Achim.Rehor at de.ibm.com_
>     <mailto:Achim.Rehor at de.ibm.com>> wrote:
>
>     hi Stephan,
>
>     there is, see mmchconfig man page :
>
>     nsdCksumTraditional
>     This attribute enables checksum data-integrity checking between a
>     traditional NSD client node and its NSD server. Valid values are yes
>     and no. The default value is no.
>     (Traditional in this context means that the NSD client and server
>     are configured with IBM Spectrum Scale rather than with IBM Spectrum
>     Scale RAID.
>     The latter is a component of IBM Elastic Storage Server (ESS) and of
>     IBM GPFS Storage Server (GSS).)
>
>     The checksum procedure detects any corruption by the network of the
>     data in the NSD RPCs that are exchanged between the NSD client and the
>     server. A checksum error triggers a request to retransmit the message.
>
>     When this attribute is enabled on a client node, the client
>     indicates in each of its requests to the server that it is using
>     checksums. The server uses checksums only in
>     response to client requests in which the indicator is set. A client
>     node that accesses a file system that belongs to another cluster can
>     use checksums in the same way.
>
>     You can change the value of the this attribute for an entire cluster
>     without shutting down the mmfsd daemon, or for one or more nodes
>     without restarting the nodes.
>
>     Note:
>     * Enabling this feature can result in significant I/O performance
>     degradation and a considerable increase in CPU usage.
>
>     * To enable checksums for a subset of the nodes in a cluster, issue
>     a command like the following one:
>         mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>
>
>         The -N flag is valid for this attribute.
>
>     --
>     Mit freundlichen Gr??en / Kind regards
>
>     Achim Rehor
>
>     Technical Support Specialist S?pectrum Scale and ESS (SME)
>     Advisory Product Services Professional
>     IBM Systems Storage Support - EMEA
>
>     _Achim.Rehor at de.ibm.com_
>     <mailto:Achim.Rehor at de.ibm.com> +49-170-4521194
>     IBM Deutschland GmbH
>     Vorsitzender des Aufsichtsrats: Sebastian Krause
>     Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer,
>     Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
>     Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
>     Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
>
>
>     -----Original Message-----
>     *From*: Stephan Graf <_st.graf at fz-juelich.de_
>     <mailto:Stephan%20Graf%20%3cst.graf at fz-juelich.de%3e>>
>     *Reply-To*: gpfsug main discussion list <_gpfsug-discuss at gpfsug.org_
>     <mailto:gpfsug%20main%20discussion%20list%20%3cgpfsug-discuss at gpfsug.org%3e>>
>     *To*: gpfsug-discuss <_gpfsug-discuss at gpfsug.org_
>     <mailto:gpfsug-discuss%20%3cgpfsug-discuss at gpfsug.org%3e>>
>     *Subject*: [EXTERNAL] [gpfsug-discuss] Protection against silent
>     data corruption
>     *Date*: Thu, 02 Jun 2022 16:31:43 +0200
>
>     Hi,
>
>     I am wondering if there is an option in SS to enable some checking to
>     detect silent data corruption.
>
>     Form GNR I know that there is End-to-End integrity. So a checksum is
>     stored in addition.
>
>     The background is that we are facing an issue where in some files
>     (which
>     have data replication =  2) the mmrestripefile is reporting, that one
>     block is mismatching it's copy (the storage cluster is running SS
>     without GNR).
>     We have validated that the copied block is fine, but the original
>     one is
>     broken (and this is what is returned on read access).
>     SS right now in our installation is unable to determine which is the
>     correct one.
>     Is there any option to enable this kind of feature in SS? If not, does
>     it make sense to create an "IDEA" for it?
>
>     Stephan
>
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org<http://gpfsug.org/>>
>     _http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_>
>     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>>
>     _______________________________________________
>     gpfsug-discuss mailing list
>     gpfsug-discuss at _gpfsug.org_ <http://gpfsug.org<http://gpfsug.org/>>_
>     __http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org_>
>     <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org<http://gpfsug.org>
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
> <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>>
>
>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org<http://gpfsug.org>
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>

--
Stephan Graf
Juelich Supercomputing Centre

Phone:  +49-2461-61-6578
Fax:    +49-2461-61-6656
E-mail: st.graf at fz-juelich.de<mailto:st.graf at fz-juelich.de>
WWW:    http://www.fz-juelich.de/jsc/<http://www.fz-juelich.de/jsc/>
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht,
Prof. Dr. Frauke Melchior
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
[attachment "smime.p7s" deleted by Huzefa H Pancha/India/IBM] _______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org<http://gpfsug.org>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org<http://gpfsug.org>
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220610/0904563d/attachment.htm>

From scale at us.ibm.com  Fri Jun 10 19:30:10 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Sat, 11 Jun 2022 00:00:10 +0530
Subject: [gpfsug-discuss] Watch folders
In-Reply-To: <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>
References: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
 <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>
Message-ID: <OFEAA7DB8E.8E5D9036-ON8525885D.006569AA-6525885D.0065A374@ibm.com>


Hi Jake,

Just checking if you or someone from you squad got a chance to respond to
Leslie's Watch folder query.


Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.


From:	IBM Spectrum Scale/Poughkeepsie/IBM at IBMUS
To:	"gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>,
            Jacob M Tick/Tucson/IBM at IBMMail
Cc:	"gpfsug main discussion list"
            <gpfsug-discuss at spectrumscale.org>, "gpfsug-discuss"
            <gpfsug-discuss-bounces at gpfsug.org>
Date:	08-06-2022 01.27 AM
Subject:	[EXTERNAL] Re: [gpfsug-discuss] Watch folders
Sent by:	"gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi Jake, Can you or some from your squad please answer the below Watch
Folder query. Regards, The Spectrum Scale (GPFS) team
------------------------------------------------------------------------------------------------------------------

ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd


Hi Jake,

Can you or some from your squad please answer the below Watch Folder query.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
.

If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.

Inactive hide details for "leslie elliott" ---04-06-2022 12.58.48 PM---Hi
all I was wondering if anyone had any scoping suggest"leslie elliott"
---04-06-2022 12.58.48 PM---Hi all I was wondering if anyone had any
scoping suggestions for enabling this

From: "leslie elliott" <leslie.james.elliott at gmail.com>
To: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
Date: 04-06-2022 12.58 PM
Subject: [EXTERNAL] [gpfsug-discuss] Watch folders
Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi all I was wondering if anyone had any scoping suggestions for enabling
this? feature for multiple filesystems with SMB and NFS shares? We are
running a standalone kafka cluster, not part of spectrumscale,? and each of
the multiple file system
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd
Hi all

I was wondering if anyone had any scoping suggestions for enabling this
feature for multiple filesystems with SMB and NFS shares

We are running a standalone kafka cluster, not part of spectrumscale,
and each of the multiple file system watches, update this with individual
topics
for each file system

We have noticed file system access being affected negatively by the
watches
when we were running all the 10 filesystems at the same time.

All of the filesets are AFM, some to NFS homes, and some to NSD homes

any feedback appreciated

leslie
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/72f8cac9/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/72f8cac9/attachment.gif>

From scale at us.ibm.com  Fri Jun 10 19:30:10 2022
From: scale at us.ibm.com (IBM Spectrum Scale)
Date: Sat, 11 Jun 2022 00:00:10 +0530
Subject: [gpfsug-discuss] Watch folders
In-Reply-To: <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>
References: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
 <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>
Message-ID: <OFEAA7DB8E.8E5D9036-ON8525885D.006569AA-6525885D.0065A374@ibm.com>


Hi Jake,

Just checking if you or someone from you squad got a chance to respond to
Leslie's Watch folder query.


Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.


If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.


From:	IBM Spectrum Scale/Poughkeepsie/IBM at IBMUS
To:	"gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>,
            Jacob M Tick/Tucson/IBM at IBMMail
Cc:	"gpfsug main discussion list"
            <gpfsug-discuss at spectrumscale.org>, "gpfsug-discuss"
            <gpfsug-discuss-bounces at gpfsug.org>
Date:	08-06-2022 01.27 AM
Subject:	[EXTERNAL] Re: [gpfsug-discuss] Watch folders
Sent by:	"gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi Jake, Can you or some from your squad please answer the below Watch
Folder query. Regards, The Spectrum Scale (GPFS) team
------------------------------------------------------------------------------------------------------------------

ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd


Hi Jake,

Can you or some from your squad please answer the below Watch Folder query.

Regards, The Spectrum Scale (GPFS) team

------------------------------------------------------------------------------------------------------------------

If you feel that your question can benefit other users of  Spectrum Scale
(GPFS), then please post it to the public IBM developerWroks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
.

If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.

The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.

Inactive hide details for "leslie elliott" ---04-06-2022 12.58.48 PM---Hi
all I was wondering if anyone had any scoping suggest"leslie elliott"
---04-06-2022 12.58.48 PM---Hi all I was wondering if anyone had any
scoping suggestions for enabling this

From: "leslie elliott" <leslie.james.elliott at gmail.com>
To: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
Date: 04-06-2022 12.58 PM
Subject: [EXTERNAL] [gpfsug-discuss] Watch folders
Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>


Hi all I was wondering if anyone had any scoping suggestions for enabling
this? feature for multiple filesystems with SMB and NFS shares? We are
running a standalone kafka cluster, not part of spectrumscale,? and each of
the multiple file system
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd
Hi all

I was wondering if anyone had any scoping suggestions for enabling this
feature for multiple filesystems with SMB and NFS shares

We are running a standalone kafka cluster, not part of spectrumscale,
and each of the multiple file system watches, update this with individual
topics
for each file system

We have noticed file system access being affected negatively by the
watches
when we were running all the 10 filesystems at the same time.

All of the filesets are AFM, some to NFS homes, and some to NSD homes

any feedback appreciated

leslie
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/72f8cac9/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/72f8cac9/attachment-0001.gif>

From leslie.james.elliott at gmail.com  Fri Jun 10 22:02:20 2022
From: leslie.james.elliott at gmail.com (leslie elliott)
Date: Sat, 11 Jun 2022 07:02:20 +1000
Subject: [gpfsug-discuss] Watch folders
In-Reply-To: <OFEAA7DB8E.8E5D9036-ON8525885D.006569AA-6525885D.0065A374@ibm.com>
References: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
 <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>
 <OFEAA7DB8E.8E5D9036-ON8525885D.006569AA-6525885D.0065A374@ibm.com>
Message-ID: <CANBv+tssYr5nTnoZjX=HiEkxipZCAZk1jFHR260FEZdWcLEOTA@mail.gmail.com>

thanks for chasing this up

I will log a support call if that is easier to track this

was hoping this was something someone had seen already
but doesn't look like it so far

leslie

On Sat, 11 Jun 2022 at 04:30, IBM Spectrum Scale <scale at us.ibm.com> wrote:

> Hi Jake,
>
> Just checking if you or someone from you squad got a chance to respond to
> Leslie's Watch folder query.
>
>
> Regards, The Spectrum Scale (GPFS) team
>
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of  Spectrum Scale
> (GPFS), then please post it to the public IBM developerWroks Forum at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>
>
> If your query concerns a potential software error in Spectrum Scale (GPFS)
> and you have an IBM software maintenance contract please contact
>  1-800-237-5511 in the United States or your local IBM Service Center in
> other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> [image: Inactive hide details for IBM Spectrum Scale---08-06-2022 01.27.29
> AM---Hi Jake, Can you or some from your squad please answer]IBM Spectrum
> Scale---08-06-2022 01.27.29 AM---Hi Jake, Can you or some from your squad
> please answer the below Watch Folder query.
>
> From: IBM Spectrum Scale/Poughkeepsie/IBM at IBMUS
> To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>, Jacob M
> Tick/Tucson/IBM at IBMMail
> Cc: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>,
> "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> Date: 08-06-2022 01.27 AM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Watch folders
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> ------------------------------
>
>
>
> Hi Jake, Can you or some from your squad please answer the below Watch
> Folder query. Regards, The Spectrum Scale (GPFS) team
> ------------------------------------------------------------------------------------------------------------------
>
> ZjQcmQRYFpfptBannerStart
> *This Message Is From an External Sender *
> This message came from outside your organization.
>
> ZjQcmQRYFpfptBannerEnd
>
> Hi Jake,
>
> Can you or some from your squad please answer the below Watch Folder query.
>
> Regards, The Spectrum Scale (GPFS) team
>
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of  Spectrum Scale
> (GPFS), then please post it to the public IBM developerWroks Forum at
> *https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479*
> <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479>.
>
>
> If your query concerns a potential software error in Spectrum Scale (GPFS)
> and you have an IBM software maintenance contract please contact
>  1-800-237-5511 in the United States or your local IBM Service Center in
> other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> [image: Inactive hide details for "leslie elliott" ---04-06-2022 12.58.48
> PM---Hi all I was wondering if anyone had any scoping suggest]"leslie
> elliott" ---04-06-2022 12.58.48 PM---Hi all I was wondering if anyone had
> any scoping suggestions for enabling this
>
> From: "leslie elliott" <leslie.james.elliott at gmail.com>
> To: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
> Date: 04-06-2022 12.58 PM
> Subject: [EXTERNAL] [gpfsug-discuss] Watch folders
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> ------------------------------
>
>
>
> Hi all I was wondering if anyone had any scoping suggestions for enabling
> this  feature for multiple filesystems with SMB and NFS shares  We are
> running a standalone kafka cluster, not part of spectrumscale,  and each of
> the multiple file system
> ZjQcmQRYFpfptBannerStart
> *This Message Is From an External Sender *
> This message came from outside your organization.
>
> ZjQcmQRYFpfptBannerEnd
> Hi all
>
> I was wondering if anyone had any scoping suggestions for enabling this
> feature for multiple filesystems with SMB and NFS shares
>
> We are running a standalone kafka cluster, not part of spectrumscale,
> and each of the multiple file system watches, update this with individual
> topics
> for each file system
>
> We have noticed file system access being affected negatively by the
> watches
> when we were running all the 10 filesystems at the same time.
>
> All of the filesets are AFM, some to NFS homes, and some to NSD homes
>
> any feedback appreciated
>
> leslie
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org*
> <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/23cc681a/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/23cc681a/attachment.gif>

From leslie.james.elliott at gmail.com  Fri Jun 10 22:02:20 2022
From: leslie.james.elliott at gmail.com (leslie elliott)
Date: Sat, 11 Jun 2022 07:02:20 +1000
Subject: [gpfsug-discuss] Watch folders
In-Reply-To: <OFEAA7DB8E.8E5D9036-ON8525885D.006569AA-6525885D.0065A374@ibm.com>
References: <CANBv+tsKqkGPHMCn23ushzQdvCE5+GVASyOoBfbXqFD0QbFy8A@mail.gmail.com>
 <OF7805EC22.8820535F-ON8525885A.006D17FD-6525885A.006D392E@ibm.com>
 <OFEAA7DB8E.8E5D9036-ON8525885D.006569AA-6525885D.0065A374@ibm.com>
Message-ID: <CANBv+tssYr5nTnoZjX=HiEkxipZCAZk1jFHR260FEZdWcLEOTA@mail.gmail.com>

thanks for chasing this up

I will log a support call if that is easier to track this

was hoping this was something someone had seen already
but doesn't look like it so far

leslie

On Sat, 11 Jun 2022 at 04:30, IBM Spectrum Scale <scale at us.ibm.com> wrote:

> Hi Jake,
>
> Just checking if you or someone from you squad got a chance to respond to
> Leslie's Watch folder query.
>
>
> Regards, The Spectrum Scale (GPFS) team
>
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of  Spectrum Scale
> (GPFS), then please post it to the public IBM developerWroks Forum at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>
>
> If your query concerns a potential software error in Spectrum Scale (GPFS)
> and you have an IBM software maintenance contract please contact
>  1-800-237-5511 in the United States or your local IBM Service Center in
> other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> [image: Inactive hide details for IBM Spectrum Scale---08-06-2022 01.27.29
> AM---Hi Jake, Can you or some from your squad please answer]IBM Spectrum
> Scale---08-06-2022 01.27.29 AM---Hi Jake, Can you or some from your squad
> please answer the below Watch Folder query.
>
> From: IBM Spectrum Scale/Poughkeepsie/IBM at IBMUS
> To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>, Jacob M
> Tick/Tucson/IBM at IBMMail
> Cc: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>,
> "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> Date: 08-06-2022 01.27 AM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Watch folders
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> ------------------------------
>
>
>
> Hi Jake, Can you or some from your squad please answer the below Watch
> Folder query. Regards, The Spectrum Scale (GPFS) team
> ------------------------------------------------------------------------------------------------------------------
>
> ZjQcmQRYFpfptBannerStart
> *This Message Is From an External Sender *
> This message came from outside your organization.
>
> ZjQcmQRYFpfptBannerEnd
>
> Hi Jake,
>
> Can you or some from your squad please answer the below Watch Folder query.
>
> Regards, The Spectrum Scale (GPFS) team
>
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of  Spectrum Scale
> (GPFS), then please post it to the public IBM developerWroks Forum at
> *https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479*
> <https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479>.
>
>
> If your query concerns a potential software error in Spectrum Scale (GPFS)
> and you have an IBM software maintenance contract please contact
>  1-800-237-5511 in the United States or your local IBM Service Center in
> other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
> [image: Inactive hide details for "leslie elliott" ---04-06-2022 12.58.48
> PM---Hi all I was wondering if anyone had any scoping suggest]"leslie
> elliott" ---04-06-2022 12.58.48 PM---Hi all I was wondering if anyone had
> any scoping suggestions for enabling this
>
> From: "leslie elliott" <leslie.james.elliott at gmail.com>
> To: "gpfsug main discussion list" <gpfsug-discuss at spectrumscale.org>
> Date: 04-06-2022 12.58 PM
> Subject: [EXTERNAL] [gpfsug-discuss] Watch folders
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
> ------------------------------
>
>
>
> Hi all I was wondering if anyone had any scoping suggestions for enabling
> this  feature for multiple filesystems with SMB and NFS shares  We are
> running a standalone kafka cluster, not part of spectrumscale,  and each of
> the multiple file system
> ZjQcmQRYFpfptBannerStart
> *This Message Is From an External Sender *
> This message came from outside your organization.
>
> ZjQcmQRYFpfptBannerEnd
> Hi all
>
> I was wondering if anyone had any scoping suggestions for enabling this
> feature for multiple filesystems with SMB and NFS shares
>
> We are running a standalone kafka cluster, not part of spectrumscale,
> and each of the multiple file system watches, update this with individual
> topics
> for each file system
>
> We have noticed file system access being affected negatively by the
> watches
> when we were running all the 10 filesystems at the same time.
>
> All of the filesets are AFM, some to NFS homes, and some to NSD homes
>
> any feedback appreciated
>
> leslie
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> *http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org*
> <http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
>
>
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/23cc681a/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220611/23cc681a/attachment-0001.gif>

From chair at gpfsug.org  Mon Jun 13 12:12:41 2022
From: chair at gpfsug.org (chair at gpfsug.org)
Date: Mon, 13 Jun 2022 12:12:41 +0100
Subject: [gpfsug-discuss] UK Spectrum Scale User Group meeting 30th June 2022
Message-ID: <c1713bec979c0b834f78adb1d5c717c8@gpfsug.org>

Hi all,

Just a reminder that the next UK User Group meeting will be taking place 
in London (IBM York Road) on 30th June 2022.

Registration is open at 
https://www.eventbrite.co.uk/e/spectrum-scale-user-group-registration-321290978967

The agenda is below

9:30 ? 10:00 	Arrivals and refreshments
10:00 ? 10:15 	Introductions and committee updates, Paul Tomlinson Group 
Chair and Caroline Bradley, Group Secretary
10:15 ? 10:35 	Strategy Update (IBM)
10:35 ? 10:55 	New S3 Access for AI and Analytics (IBM)
10:55 ? 11:20 	What is new in Spectrum Scale and ESS (IBM)
11:20 ? 11:40 	nvidia GPUDirect Storage (IBM)
11:40 ? 12:00 	New Deplyoment using Ansible and Terraform (IBM)
12:00 ? 13:00 	Buffet Lunch with viewings of :- Quantum, Immersive Room 
and AI Cars
13:00 ? 13:20 	Migrating Spectrum Scale using Atmepo Software (Atempo)
13:20 ? 13:40 	Monitoring and Serviceability Enhancements (IBM)
13:40 ? 14:00 	Spectrum Scale and Spectrum Discover for Data Management 
University of Oslo
14:00 ? 14:30 	Performance update (IBM)
14:30 ? 15:00 	Tea Break with viewing of Boston Dynamics, Spot the Robot 
Dog
15:00 ? 15:30 	Data orchestration across the global data platform (IBM)
15:30 ? 16:00 	AFM Deep Dive (IBM)
16:00 ? 17:00 	Group discussion, Challenges, Experiences and Questions 
Led by Paul Tomlinson
17:00 	Drinks reception

Thanks

Paul


From pinto at scinet.utoronto.ca  Mon Jun 20 19:04:05 2022
From: pinto at scinet.utoronto.ca (Jaime Pinto)
Date: Mon, 20 Jun 2022 14:04:05 -0400
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
Message-ID: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>

I'm wondering if it's possible to shrink GPFS gracefully. I've seen some references to that effect on some presentations, however I can't find detailed instructions on any formal IBM documentation on how to do it.

About 3 years ago we launched a new GPFS deployment with 3 DSS-G enclosures (9.6PB usable).
Some 1.5 years later we added 2 more enclosures, for a total of 16PB, and only 7PB occupancy so far.

Basically I'd like to return to the original 3 enclosures, and still maintain the (8+2p) parity level.

Any suggestions?
Thanks
Jaime

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - www.scinet.utoronto.ca
University of Toronto


From jonathan.buzzard at strath.ac.uk  Mon Jun 20 19:40:16 2022
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Mon, 20 Jun 2022 19:40:16 +0100
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
Message-ID: <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>

On 20/06/2022 19:04, Jaime Pinto wrote:
> 
> I'm wondering if it's possible to shrink GPFS gracefully.

Yes absolutely, been possible since at least version 2.2 and probably older.

> I've seen some 
> references to that effect on some presentations, however I can't find 
> detailed instructions on any formal IBM documentation on how to do it.
>

Use mmdeldisk to remove the NSD(s) from a file system. This will take a 
while so I recommend in the *STRONGEST* possible terms running it in a 
screen or tmux session. By a while it could be days or even weeks 
depending on how much data needs to be moved about.

Once you have removed the NSD's from a file system then you can use 
mmdelnsd to wipe the NSD descriptors from the disks if necessary.


> About 3 years ago we launched a new GPFS deployment with 3 DSS-G 
> enclosures (9.6PB usable).
> Some 1.5 years later we added 2 more enclosures, for a total of 16PB, 
> and only 7PB occupancy so far.
> 
> Basically I'd like to return to the original 3 enclosures, and still 
> maintain the (8+2p) parity level.
> 
> Any suggestions?

Not being sarky but really use Google. Say "gpfs remove nsd from file 
system" and select the first link!


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


From luis.bolinches at fi.ibm.com  Mon Jun 20 19:54:29 2022
From: luis.bolinches at fi.ibm.com (Luis Bolinches)
Date: Mon, 20 Jun 2022 18:54:29 +0000
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
Message-ID: <9D80BF69-B3D0-4B9B-86E5-1CABCEF4F63E@fi.ibm.com>

Hi

I?d like to add that you will be removing, assuming identical systems and networks, 2/5 of your throughput. Hope that is ok too. 

--
Cheers

> On 20. Jun 2022, at 21.42, Jonathan Buzzard <jonathan.buzzard at strath.ac.uk> wrote:
> 
> ?On 20/06/2022 19:04, Jaime Pinto wrote:
>> I'm wondering if it's possible to shrink GPFS gracefully.
> 
> Yes absolutely, been possible since at least version 2.2 and probably older.
> 
>> I've seen some references to that effect on some presentations, however I can't find detailed instructions on any formal IBM documentation on how to do it.
>> 
> 
> Use mmdeldisk to remove the NSD(s) from a file system. This will take a while so I recommend in the *STRONGEST* possible terms running it in a screen or tmux session. By a while it could be days or even weeks depending on how much data needs to be moved about.
> 
> Once you have removed the NSD's from a file system then you can use mmdelnsd to wipe the NSD descriptors from the disks if necessary.
> 
> 
>> About 3 years ago we launched a new GPFS deployment with 3 DSS-G enclosures (9.6PB usable).
>> Some 1.5 years later we added 2 more enclosures, for a total of 16PB, and only 7PB occupancy so far.
>> Basically I'd like to return to the original 3 enclosures, and still maintain the (8+2p) parity level.
>> Any suggestions?
> 
> Not being sarky but really use Google. Say "gpfs remove nsd from file system" and select the first link!
> 
> 
> JAB.
> 
> -- 
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 

Unless otherwise stated above:

Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3
Registered in Finland

From pinto at scinet.utoronto.ca  Mon Jun 20 20:12:22 2022
From: pinto at scinet.utoronto.ca (Jaime Pinto)
Date: Mon, 20 Jun 2022 15:12:22 -0400
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
Message-ID: <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>

Thanks JAB and Luis

I know, there are mmdelnsd, mmdeldisk, mmrestripefs and a few other correlated mm* commands. They are very high-level in work in bulk discreet fashion (I mean, considering the number of NSDs we have, each deletion will shave 4% of the storage at once, that is too much).

Maybe I should have used the term "very gradual" instead of "gracefully" in my original email.

I'm just looking to do this in a very gradual and controlled fashion, just delete(or fail) a couple of hard drives at the time. In fact, I'd like to carefully specify which hard drives (not volumes) are removed from the pool, and in which order, and set which drives should remain in read-only mode (since they will be removed later, so no data is written to them during mmrestripefs), and so on.

I guess I'm looking for an article or a white paper on how to do this under "my absolute control", if that makes sense.

After this exercise I expect the occupancy to be at 68% with the remaining enclosures.

I'll them repurpose the left over enclosures/drives to run some experiments, and later on grow the file system again.

Thanks
Jaime


On 6/20/2022 14:40:16, Jonathan Buzzard wrote:
> On 20/06/2022 19:04, Jaime Pinto wrote:
>>
>> I'm wondering if it's possible to shrink GPFS gracefully.
> 
> Yes absolutely, been possible since at least version 2.2 and probably older.
> 
>> I've seen some references to that effect on some presentations, however I can't find detailed instructions on any formal IBM documentation on how to do it.
>>
> 
> Use mmdeldisk to remove the NSD(s) from a file system. This will take a while so I recommend in the *STRONGEST* possible terms running it in a screen or tmux session. By a while it could be days or even weeks depending on how much data needs to be moved about.
> 
> Once you have removed the NSD's from a file system then you can use mmdelnsd to wipe the NSD descriptors from the disks if necessary.
> 
> 
>> About 3 years ago we launched a new GPFS deployment with 3 DSS-G enclosures (9.6PB usable).
>> Some 1.5 years later we added 2 more enclosures, for a total of 16PB, and only 7PB occupancy so far.
>>
>> Basically I'd like to return to the original 3 enclosures, and still maintain the (8+2p) parity level.
>>
>> Any suggestions?
> 
> Not being sarky but really use Google. Say "gpfs remove nsd from file system" and select the first link!
> 
> 
> JAB.
> 

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - www.scinet.utoronto.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477


From jonathan.buzzard at strath.ac.uk  Mon Jun 20 20:19:15 2022
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Mon, 20 Jun 2022 20:19:15 +0100
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
 <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
Message-ID: <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>

On 20/06/2022 20:12, Jaime Pinto wrote:
> 
> Thanks JAB and Luis
> 
> I know, there are mmdelnsd, mmdeldisk, mmrestripefs and a few other 
> correlated mm* commands. They are very high-level in work in bulk 
> discreet fashion (I mean, considering the number of NSDs we have, each 
> deletion will shave 4% of the storage at once, that is too much).
> 

Then you are goosed. An NSD cannot be changed in size once created and 
can only ever be in a file system or out a file system. The only way to 
change the size of a GPFS file system is by adding or removing NSD's.

I am not a fan of how the DSS-G creates small numbers of huge NSD's. In 
fact the script sucks a *lot* from a systems admin perspective. Then 
again someone at IBM thought redeploying your entire OS every time you 
want to make a point release upgrade was a good idea.


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


From luis.bolinches at fi.ibm.com  Mon Jun 20 20:47:06 2022
From: luis.bolinches at fi.ibm.com (Luis Bolinches)
Date: Mon, 20 Jun 2022 19:47:06 +0000
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
 <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
 <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
Message-ID: <16EFF080-E403-4249-885C-C25532342BD3@fi.ibm.com>

Those redeploys days are gone :)

And on ESS you get a fix number of vdisks per enclosure. To avoid having a 1PB or bigger vdisk. That makes it as you mentioned ? not manageable. 

--
Cheers

> On 20. Jun 2022, at 22.20, Jonathan Buzzard <jonathan.buzzard at strath.ac.uk> wrote:
> 
> ?On 20/06/2022 20:12, Jaime Pinto wrote:
>> Thanks JAB and Luis
>> I know, there are mmdelnsd, mmdeldisk, mmrestripefs and a few other correlated mm* commands. They are very high-level in work in bulk discreet fashion (I mean, considering the number of NSDs we have, each deletion will shave 4% of the storage at once, that is too much).
> 
> Then you are goosed. An NSD cannot be changed in size once created and can only ever be in a file system or out a file system. The only way to change the size of a GPFS file system is by adding or removing NSD's.
> 
> I am not a fan of how the DSS-G creates small numbers of huge NSD's. In fact the script sucks a *lot* from a systems admin perspective. Then again someone at IBM thought redeploying your entire OS every time you want to make a point release upgrade was a good idea.
> 
> 
> JAB.
> 
> -- 
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
> 
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org 

Unless otherwise stated above:

Oy IBM Finland Ab
PL 265, 00101 Helsinki, Finland
Business ID, Y-tunnus: 0195876-3
Registered in Finland

From anacreo at gmail.com  Mon Jun 20 21:07:03 2022
From: anacreo at gmail.com (Alec)
Date: Mon, 20 Jun 2022 13:07:03 -0700
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
 <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
 <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
Message-ID: <CAGhSTwg+OVEGkhCGoJrYe5TNeQN8Qo5FTGozNpkyjj-dbFq=ew@mail.gmail.com>

Okay if you have double parity why not just resize the disk.  And let gpfs
recover using the parity.  And by the way there is a qos setting for
maintenance operations and you can give that higher priority to make the
recovery/deleting/adding operations quicker.  Also I don't know if this
matters in gpfs but you may want to change the affinity for disk
(distribute the primary/first node for each disk/Christmas tree it) to a
different servers to spread the load. I don't know if gpfs will actually
use that to distribute load, but worth checking.

Alec

On Mon, Jun 20, 2022, 12:20 PM Jonathan Buzzard <
jonathan.buzzard at strath.ac.uk> wrote:

> On 20/06/2022 20:12, Jaime Pinto wrote:
> >
> > Thanks JAB and Luis
> >
> > I know, there are mmdelnsd, mmdeldisk, mmrestripefs and a few other
> > correlated mm* commands. They are very high-level in work in bulk
> > discreet fashion (I mean, considering the number of NSDs we have, each
> > deletion will shave 4% of the storage at once, that is too much).
> >
>
> Then you are goosed. An NSD cannot be changed in size once created and
> can only ever be in a file system or out a file system. The only way to
> change the size of a GPFS file system is by adding or removing NSD's.
>
> I am not a fan of how the DSS-G creates small numbers of huge NSD's. In
> fact the script sucks a *lot* from a systems admin perspective. Then
> again someone at IBM thought redeploying your entire OS every time you
> want to make a point release upgrade was a good idea.
>
>
> JAB.
>
> --
> Jonathan A. Buzzard                         Tel: +44141-5483420
> HPC System Administrator, ARCHIE-WeSt.
> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220620/8924f4bc/attachment.htm>

From anacreo at gmail.com  Mon Jun 20 21:10:04 2022
From: anacreo at gmail.com (Alec)
Date: Mon, 20 Jun 2022 13:10:04 -0700
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <CAGhSTwg+OVEGkhCGoJrYe5TNeQN8Qo5FTGozNpkyjj-dbFq=ew@mail.gmail.com>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
 <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
 <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
 <CAGhSTwg+OVEGkhCGoJrYe5TNeQN8Qo5FTGozNpkyjj-dbFq=ew@mail.gmail.com>
Message-ID: <CAGhSTwiqUnSb-WEo7cUOubeaK-ZupAi0W69Hwi6i=a0G88AJQg@mail.gmail.com>

In production we've come up on an FS with missing disks and GPFS just
carries on giving io errors on unusable files.. you could simply stop the
disk and bring up the FS and see what it looks like, maybe do a full backup
to null devices to make sure all the data is truely readable..  then decide
if you want to just delete the disk and add in another disk and let GPFS
recover the situation.

Alec

On Mon, Jun 20, 2022, 1:07 PM Alec <anacreo at gmail.com> wrote:

> Okay if you have double parity why not just resize the disk.  And let gpfs
> recover using the parity.  And by the way there is a qos setting for
> maintenance operations and you can give that higher priority to make the
> recovery/deleting/adding operations quicker.  Also I don't know if this
> matters in gpfs but you may want to change the affinity for disk
> (distribute the primary/first node for each disk/Christmas tree it) to a
> different servers to spread the load. I don't know if gpfs will actually
> use that to distribute load, but worth checking.
>
> Alec
>
> On Mon, Jun 20, 2022, 12:20 PM Jonathan Buzzard <
> jonathan.buzzard at strath.ac.uk> wrote:
>
>> On 20/06/2022 20:12, Jaime Pinto wrote:
>> >
>> > Thanks JAB and Luis
>> >
>> > I know, there are mmdelnsd, mmdeldisk, mmrestripefs and a few other
>> > correlated mm* commands. They are very high-level in work in bulk
>> > discreet fashion (I mean, considering the number of NSDs we have, each
>> > deletion will shave 4% of the storage at once, that is too much).
>> >
>>
>> Then you are goosed. An NSD cannot be changed in size once created and
>> can only ever be in a file system or out a file system. The only way to
>> change the size of a GPFS file system is by adding or removing NSD's.
>>
>> I am not a fan of how the DSS-G creates small numbers of huge NSD's. In
>> fact the script sucks a *lot* from a systems admin perspective. Then
>> again someone at IBM thought redeploying your entire OS every time you
>> want to make a point release upgrade was a good idea.
>>
>>
>> JAB.
>>
>> --
>> Jonathan A. Buzzard                         Tel: +44141-5483420
>> HPC System Administrator, ARCHIE-WeSt.
>> University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
>>
>> _______________________________________________
>> gpfsug-discuss mailing list
>> gpfsug-discuss at gpfsug.org
>> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220620/2ebb7d7a/attachment.htm>

From jonathan.buzzard at strath.ac.uk  Mon Jun 20 21:44:01 2022
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Mon, 20 Jun 2022 21:44:01 +0100
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <16EFF080-E403-4249-885C-C25532342BD3@fi.ibm.com>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
 <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
 <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
 <16EFF080-E403-4249-885C-C25532342BD3@fi.ibm.com>
Message-ID: <2f503ed4-01a3-e8c6-2e5d-808a99e8bc15@strath.ac.uk>

On 20/06/2022 20:47, Luis Bolinches wrote:
> 
> Those redeploys days are gone :)
> 

Not yet for DSS-G unfortunately. However even if they are, it means
someone thought it was an acceptable idea at some point.

> And on ESS you get a fix number of vdisks per enclosure. To avoid
> having a 1PB or bigger vdisk. That makes it as you mentioned ? not
> manageable.
> 

The issue I have is what I want to do is reserve a set number of disks 
per tray/enclosure as spares. Ok not actual disks but the capacity of a 
disk as a spare. As of last February that was not possible, I had to 
mess about creating and destroying the vdisks till I got where I wanted 
to be. What a palaver that was.

I was also super unimpressed that the scripts throw a wobbler because my 
two DSS-G servers where named gpfs1 and gpfs2 which was not a problem on 
4.2.x and not actually documented anywhere.


JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


From jonathan.buzzard at strath.ac.uk  Mon Jun 20 21:56:13 2022
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Mon, 20 Jun 2022 21:56:13 +0100
Subject: [gpfsug-discuss] How to shrink GPFS on DSSG's?
In-Reply-To: <CAGhSTwg+OVEGkhCGoJrYe5TNeQN8Qo5FTGozNpkyjj-dbFq=ew@mail.gmail.com>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
 <1010bbbc-3da6-3e51-9737-8e1063730c44@strath.ac.uk>
 <e2be6162-4599-f600-8097-c64d65676cc7@scinet.utoronto.ca>
 <8e9acc09-fd16-c06e-b637-f95551d2bc11@strath.ac.uk>
 <CAGhSTwg+OVEGkhCGoJrYe5TNeQN8Qo5FTGozNpkyjj-dbFq=ew@mail.gmail.com>
Message-ID: <1794f84a-ce66-32e8-8e2e-5541b3dc1573@strath.ac.uk>

On 20/06/2022 21:07, Alec wrote:
> Also I don't know if 
> this matters in gpfs but you may want to change the affinity for disk 
> (distribute the primary/first node for each disk/Christmas tree it) to a 
> different servers to spread the load. I don't know if gpfs will actually 
> use that to distribute load, but worth checking.
> 
Certainly made a difference historically. Perhaps not so much these days 
as your NSD servers are wildly more powerful than they used to be.


JAB.

-- 
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG


From p.childs at qmul.ac.uk  Wed Jun 22 10:59:50 2022
From: p.childs at qmul.ac.uk (Peter Childs)
Date: Wed, 22 Jun 2022 09:59:50 +0000
Subject: [gpfsug-discuss] [EXTERNAL]  How to shrink GPFS on DSSG's?
In-Reply-To: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
References: <e5ab0c48-9189-cc86-4b69-72d895180216@scinet.utoronto.ca>
Message-ID: <AS8PR07MB77197C45AEA7638C54116667A4B29@AS8PR07MB7719.eurprd07.prod.outlook.com>

Having only just got an ESS I'm still learning how GNR works. as I read it there are currently two "breeds" of GNR, the version on the "DSS and ESS appliances" and the one in "Erasure Code Edition"

As I understand it from past talks using mmdeldisk to remove a disk works fine in none GNR editions but is not the best way to do the task.

My understanding is that you should

mmchdisk suspend/empty # so new data is not put on the disk but the disk remains available for read.
mmrestripefs -m # to move the data off the disk
mmdeldisk # to actually remove the disk which should be fast as its already been emptied.

We have done this with success in the past to migrate data between Raid6 arrays, with success.

I believe there are some commands with mmvdisk to re-shape recovery groups in GNR but I've not as yet worked out how they work.


Peter Childs

________________________________________
From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of Jaime Pinto <pinto at scinet.utoronto.ca>
Sent: Monday, June 20, 2022 7:04 PM
To: gpfsug-discuss at spectrumscale.org
Subject: [EXTERNAL] [gpfsug-discuss] How to shrink GPFS on DSSG's?

CAUTION: This email originated from outside of QMUL. Do not click links or open attachments unless you recognise the sender and know the content is safe.


I'm wondering if it's possible to shrink GPFS gracefully. I've seen some references to that effect on some presentations, however I can't find detailed instructions on any formal IBM documentation on how to do it.

About 3 years ago we launched a new GPFS deployment with 3 DSS-G enclosures (9.6PB usable).
Some 1.5 years later we added 2 more enclosures, for a total of 16PB, and only 7PB occupancy so far.

Basically I'd like to return to the original 3 enclosures, and still maintain the (8+2p) parity level.

Any suggestions?
Thanks
Jaime

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - www.scinet.utoronto.ca
University of Toronto

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org


From chair at gpfsug.org  Sun Jun 26 11:11:29 2022
From: chair at gpfsug.org (chair at gpfsug.org)
Date: Sun, 26 Jun 2022 11:11:29 +0100
Subject: [gpfsug-discuss] Spectrum Scale Users Group 30th June - Logistics
Message-ID: <dbc88d50e217a43c3f6e57aeb1d4f731@gpfsug.org>

Hi all,

For those attending the User Group this week, please bring Photogaphic 
ID for entry into the IBM building.
Also, we will be meeting in "The Mulberry Bush" pub 
(https://www.mulberrybushpub.co.uk/) on the Wednesday evening, if anyone 
wishes to join us.

I look forward to seeing you all next week

Regards

Paul


From tina.friedrich at it.ox.ac.uk  Thu Jun 30 12:31:43 2022
From: tina.friedrich at it.ox.ac.uk (Tina Friedrich)
Date: Thu, 30 Jun 2022 12:31:43 +0100
Subject: [gpfsug-discuss] quickest way to delete all files (and directories)
 in a file system
Message-ID: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>

Hello everyone,

this should be a simple question, but we can't quite figure out how to 
best proceed.

We have some file systems that we want to, basically, empty out. As in 
remove all files and directories currently on them. Both contain a 
pretty large number of files/directories (something like 50,000,000, 
with sometimes silly characters in the file names). 'rm -rf' clearly 
isn't the way to go forward.

We've come up with either 'mmapplypolicy' (i.e. a policy to remove all 
files) or removing and re-creating the file systems as options (open to 
other suggestions!).

We want the file systems still; ideally without having to redo the 
authentication and key swaps etc for the 'remote' clusters using them.

This is a Lenovo DSS, but I don't think it makes much of a difference.

So - what's the best way to proceed?

If it is mmapplypolicy - does anyone have a (tested/known working) 
example of a policy to simply remove all files?

Thanks,
Tina

-- 
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk


From olaf.weiser at de.ibm.com  Thu Jun 30 13:05:33 2022
From: olaf.weiser at de.ibm.com (Olaf Weiser)
Date: Thu, 30 Jun 2022 12:05:33 +0000
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
Message-ID: <DM6PR15MB39903B47D4AFF62300BA5A0DC9BA9@DM6PR15MB3990.namprd15.prod.outlook.com>

Hi Tina,
I think its much faster to recreate the file system
after that .. it is enough to     do

mmauth grant  {RemoteClusterName | all} -f {Device

in my case ..its always ... grant all -f all ?

and every remote mount will  work  as before.. the remote cluster key information is in the cluster CCR  .. not in the filesystem..

Pay attention.. when you 'll create the file system, it will be created with the current code's version... in Case remote cluster is backlevel.. don't forget to specify --version

have fun ?


________________________________
Von: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> im Auftrag von Tina Friedrich <tina.friedrich at it.ox.ac.uk>
Gesendet: Donnerstag, 30. Juni 2022 13:31
An: 'gpfsug main discussion list' <gpfsug-discuss at spectrumscale.org>
Betreff: [EXTERNAL] [gpfsug-discuss] quickest way to delete all files (and directories) in a file system

Hello everyone,

this should be a simple question, but we can't quite figure out how to
best proceed.

We have some file systems that we want to, basically, empty out. As in
remove all files and directories currently on them. Both contain a
pretty large number of files/directories (something like 50,000,000,
with sometimes silly characters in the file names). 'rm -rf' clearly
isn't the way to go forward.

We've come up with either 'mmapplypolicy' (i.e. a policy to remove all
files) or removing and re-creating the file systems as options (open to
other suggestions!).

We want the file systems still; ideally without having to redo the
authentication and key swaps etc for the 'remote' clusters using them.

This is a Lenovo DSS, but I don't think it makes much of a difference.

So - what's the best way to proceed?

If it is mmapplypolicy - does anyone have a (tested/known working)
example of a policy to simply remove all files?

Thanks,
Tina

--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk  http://www.it.ox.ac.uk

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
H
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220630/e9e145df/attachment.htm>

From antony.steel at belisama.com.sg  Thu Jun 30 13:26:16 2022
From: antony.steel at belisama.com.sg (Antony Steel)
Date: Thu, 30 Jun 2022 22:26:16 +1000
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
Message-ID: <181b4938702.33b0dff581274.2969139072384838762@belisama.com.sg>

Hi,

Perhaps use filesets?? Quicker to remove and recreate?


Keep safe,


Antony Steel

CTO Belisama


amailto:ntony.steel at belisama.com.sg

Singapore: +65 9789 6663

Australia +61 4 1980 3049

http://www.belisama.com.sg


---- On Thu, 30 Jun 2022 21:31:43 +1000 Tina Friedrich <tina.friedrich at it.ox.ac.uk> wrote ---


Hello everyone, 
 
this should be a simple question, but we can't quite figure out how to 
best proceed. 
 
We have some file systems that we want to, basically, empty out. As in 
remove all files and directories currently on them. Both contain a 
pretty large number of files/directories (something like 50,000,000, 
with sometimes silly characters in the file names). 'rm -rf' clearly 
isn't the way to go forward. 
 
We've come up with either 'mmapplypolicy' (i.e. a policy to remove all 
files) or removing and re-creating the file systems as options (open to 
other suggestions!). 
 
We want the file systems still; ideally without having to redo the 
authentication and key swaps etc for the 'remote' clusters using them. 
 
This is a Lenovo DSS, but I don't think it makes much of a difference. 
 
So - what's the best way to proceed? 
 
If it is mmapplypolicy - does anyone have a (tested/known working) 
example of a policy to simply remove all files? 
 
Thanks, 
Tina 
 
-- 
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator 
 
Research Computing and Support Services 
IT Services, University of Oxford 
http://www.arc.ox.ac.uk http://www.it.ox.ac.uk 
 
_______________________________________________ 
gpfsug-discuss mailing list 
gpfsug-discuss at gpfsug.org 
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220630/16149dc7/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 1655907620786000_629392709.png
Type: image/png
Size: 12431 bytes
Desc: not available
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220630/16149dc7/attachment.png>

From stockf at us.ibm.com  Thu Jun 30 13:47:45 2022
From: stockf at us.ibm.com (Frederick Stock)
Date: Thu, 30 Jun 2022 12:47:45 +0000
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <DM6PR15MB39903B47D4AFF62300BA5A0DC9BA9@DM6PR15MB3990.namprd15.prod.outlook.com>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
 <DM6PR15MB39903B47D4AFF62300BA5A0DC9BA9@DM6PR15MB3990.namprd15.prod.outlook.com>
Message-ID: <BL3PR15MB5433F91C0D2ADC79577470ECFDBA9@BL3PR15MB5433.namprd15.prod.outlook.com>

For speed Olaf?s recommendation is the best option.  If you really do not want to remove the file systems and recreate them, and the version of Scale is fairly current, you could use the mmfind command to simplify creating a policy to remove the files.  Still removing 50M files will take some time.

Fred

Fred Stock, Spectrum Scale Development Advocacy
stockf at us.ibm.com<mailto:stockf at us.ibm.com> | 720-430-8821


From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of Olaf Weiser <olaf.weiser at de.ibm.com>
Date: Thursday, June 30, 2022 at 8:09 AM
To: 'gpfsug main discussion list' <gpfsug-discuss at spectrumscale.org>
Subject: [EXTERNAL] Re: [gpfsug-discuss] quickest way to delete all files (and directories) in a file system
Hi Tina, I think its much faster to recreate the file system after that .. it is enough to do mmauth grant {RemoteClusterName | all} -f {Device in my case ..its always ... grant all -f all ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.
ZjQcmQRYFpfptBannerEnd
Hi Tina,
I think its much faster to recreate the file system
after that .. it is enough to     do

mmauth grant  {RemoteClusterName | all} -f {Device
in my case ..its always ... grant all -f all ?

and every remote mount will  work  as before.. the remote cluster key information is in the cluster CCR  .. not in the filesystem..

Pay attention.. when you 'll create the file system, it will be created with the current code's version... in Case remote cluster is backlevel.. don't forget to specify --version

have fun ?


________________________________
Von: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> im Auftrag von Tina Friedrich <tina.friedrich at it.ox.ac.uk>
Gesendet: Donnerstag, 30. Juni 2022 13:31
An: 'gpfsug main discussion list' <gpfsug-discuss at spectrumscale.org>
Betreff: [EXTERNAL] [gpfsug-discuss] quickest way to delete all files (and directories) in a file system

Hello everyone,

this should be a simple question, but we can't quite figure out how to
best proceed.

We have some file systems that we want to, basically, empty out. As in
remove all files and directories currently on them. Both contain a
pretty large number of files/directories (something like 50,000,000,
with sometimes silly characters in the file names). 'rm -rf' clearly
isn't the way to go forward.

We've come up with either 'mmapplypolicy' (i.e. a policy to remove all
files) or removing and re-creating the file systems as options (open to
other suggestions!).

We want the file systems still; ideally without having to redo the
authentication and key swaps etc for the 'remote' clusters using them.

This is a Lenovo DSS, but I don't think it makes much of a difference.

So - what's the best way to proceed?

If it is mmapplypolicy - does anyone have a (tested/known working)
example of a policy to simply remove all files?

Thanks,
Tina

--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk<http://www.arc.ox.ac.uk>  http://www.it.ox.ac.uk<http://www.it.ox.ac.uk>

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
H
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220630/9d2c0a4b/attachment.htm>

From stockf at us.ibm.com  Thu Jun 30 13:47:45 2022
From: stockf at us.ibm.com (Frederick Stock)
Date: Thu, 30 Jun 2022 12:47:45 +0000
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <DM6PR15MB39903B47D4AFF62300BA5A0DC9BA9@DM6PR15MB3990.namprd15.prod.outlook.com>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
 <DM6PR15MB39903B47D4AFF62300BA5A0DC9BA9@DM6PR15MB3990.namprd15.prod.outlook.com>
Message-ID: <BL3PR15MB5433F91C0D2ADC79577470ECFDBA9@BL3PR15MB5433.namprd15.prod.outlook.com>

For speed Olaf?s recommendation is the best option.  If you really do not want to remove the file systems and recreate them, and the version of Scale is fairly current, you could use the mmfind command to simplify creating a policy to remove the files.  Still removing 50M files will take some time.

Fred

Fred Stock, Spectrum Scale Development Advocacy
stockf at us.ibm.com<mailto:stockf at us.ibm.com> | 720-430-8821


From: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> on behalf of Olaf Weiser <olaf.weiser at de.ibm.com>
Date: Thursday, June 30, 2022 at 8:09 AM
To: 'gpfsug main discussion list' <gpfsug-discuss at spectrumscale.org>
Subject: [EXTERNAL] Re: [gpfsug-discuss] quickest way to delete all files (and directories) in a file system
Hi Tina, I think its much faster to recreate the file system after that .. it is enough to do mmauth grant {RemoteClusterName | all} -f {Device in my case ..its always ... grant all -f all ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.
ZjQcmQRYFpfptBannerEnd
Hi Tina,
I think its much faster to recreate the file system
after that .. it is enough to     do

mmauth grant  {RemoteClusterName | all} -f {Device
in my case ..its always ... grant all -f all ?

and every remote mount will  work  as before.. the remote cluster key information is in the cluster CCR  .. not in the filesystem..

Pay attention.. when you 'll create the file system, it will be created with the current code's version... in Case remote cluster is backlevel.. don't forget to specify --version

have fun ?


________________________________
Von: gpfsug-discuss <gpfsug-discuss-bounces at gpfsug.org> im Auftrag von Tina Friedrich <tina.friedrich at it.ox.ac.uk>
Gesendet: Donnerstag, 30. Juni 2022 13:31
An: 'gpfsug main discussion list' <gpfsug-discuss at spectrumscale.org>
Betreff: [EXTERNAL] [gpfsug-discuss] quickest way to delete all files (and directories) in a file system

Hello everyone,

this should be a simple question, but we can't quite figure out how to
best proceed.

We have some file systems that we want to, basically, empty out. As in
remove all files and directories currently on them. Both contain a
pretty large number of files/directories (something like 50,000,000,
with sometimes silly characters in the file names). 'rm -rf' clearly
isn't the way to go forward.

We've come up with either 'mmapplypolicy' (i.e. a policy to remove all
files) or removing and re-creating the file systems as options (open to
other suggestions!).

We want the file systems still; ideally without having to redo the
authentication and key swaps etc for the 'remote' clusters using them.

This is a Lenovo DSS, but I don't think it makes much of a difference.

So - what's the best way to proceed?

If it is mmapplypolicy - does anyone have a (tested/known working)
example of a policy to simply remove all files?

Thanks,
Tina

--
Tina Friedrich, Advanced Research Computing Snr HPC Systems Administrator

Research Computing and Support Services
IT Services, University of Oxford
http://www.arc.ox.ac.uk<http://www.arc.ox.ac.uk>  http://www.it.ox.ac.uk<http://www.it.ox.ac.uk>

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org<http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org>
H
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/attachments/20220630/9d2c0a4b/attachment-0001.htm>

From pinto at scinet.utoronto.ca  Thu Jun 30 14:55:02 2022
From: pinto at scinet.utoronto.ca (Jaime Pinto)
Date: Thu, 30 Jun 2022 09:55:02 -0400
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
Message-ID: <d6ee2416-9344-177e-7e54-8f853cac8510@scinet.utoronto.ca>

Hi Tina

Please see attachment for a working version of 'mmrmdir', that I've been 
using for over 10 years. You may have to tweak it a bit for the name of 
the node you want to run it from, and the location of the policy (also 
attached).

I have used all the suggested ways so far on this thread to delete files 
in bulk. I still prefer to use this script when I don't want to disturb 
anything else on the cluster setup, in particular multi-cluster as you 
appear to have. It gives absolute and fine control of what to delete. 
You may also use it in test mode, and gradually delete only subsets of 
directories if you wish. Traversing the inodes database and creating the 
list of files to delete is what takes most of the time, whether deleting 
1M or 50M files.

To that effect, deleting and recreating file systems or filesets still 
takes a very long time, if those areas are populated with files.

Best
Jaime


On 6/30/22 07:31, Tina Friedrich wrote:
> Hello everyone,
> 
> this should be a simple question, but we can't quite figure out how to 
> best proceed.
> 
> We have some file systems that we want to, basically, empty out. As in 
> remove all files and directories currently on them. Both contain a 
> pretty large number of files/directories (something like 50,000,000, 
> with sometimes silly characters in the file names). 'rm -rf' clearly 
> isn't the way to go forward.
> 
> We've come up with either 'mmapplypolicy' (i.e. a policy to remove all 
> files) or removing and re-creating the file systems as options (open to 
> other suggestions!).
> 
> We want the file systems still; ideally without having to redo the 
> authentication and key swaps etc for the 'remote' clusters using them.
> 
> This is a Lenovo DSS, but I don't think it makes much of a difference.
> 
> So - what's the best way to proceed?
> 
> If it is mmapplypolicy - does anyone have a (tested/known working) 
> example of a policy to simply remove all files?
> 
> Thanks,
> Tina
> 

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - www.scinet.utoronto.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
-------------- next part --------------
#!/bin/bash

echo ""
echo "Command issued: "$0" "$@
echo ""

if [ "${HOSTNAME:0:12}" != datamover ]; then
	echo "You can only use mmrmdir on the datamovers"
	echo
	exit
fi

if [ "$1" == "" ] || [ "$1" == "-h" ] || [ "$1" == "-help" ] || [ "$1" == "--h" ] || [ "$1" == "--help" ] || [ $# -gt 2 ] || [ "$1" == "-test" ]; then
	echo "Usage: mmrmdir <directory absolute path> [-test]"
	echo "       -test  to verify what will be deleted"
	echo
	exit
fi

if [ "$2" != "" ] &&  [ "$2" != "-test" ]; then
        echo "Usage: mmrmdir <directory absolute path> [-test]"
        echo "       -test  to verify what will be deleted"
        echo
        exit
fi

echo -n "You have 10 seconds to cancel:"
for a in `seq 0 9`; do
    echo -n "  $a"
    sleep 1;
done
echo "  resuming ..."

LOCATION=$1
slash=`echo $LOCATION | grep /`

if [ "$slash" == "" ]; then
	echo "$LOCATION is not an absolute path"
	echo
	exit
fi

if [ "$2" == "-test" ]; then
	mmapplypolicy $LOCATION -P /usr/lpp/mmfs/bin/mmpolicyRules-DELETE-ALL -I test -L 2
else
	mmapplypolicy $LOCATION -P /usr/lpp/mmfs/bin/mmpolicyRules-DELETE-ALL -I defer -L 2
	if [ "$?" != 0 ]
	then
		echo #### there was an error with mmapplypolicy execution ####
	else
		echo removing empty directories in $LOCATION
		rm -rf $LOCATION
	fi
fi
exit 0
-------------- next part --------------
/* Define deletion rules for aged files in /dev/scratch (system pool by default).
   If the file has not been accessed in 90 days AND not owned by root then delete it. */

RULE 'DelSystem' DELETE
	FROM POOL 'system' 
	FOR FILESET('root')


From pinto at scinet.utoronto.ca  Thu Jun 30 14:55:02 2022
From: pinto at scinet.utoronto.ca (Jaime Pinto)
Date: Thu, 30 Jun 2022 09:55:02 -0400
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
Message-ID: <d6ee2416-9344-177e-7e54-8f853cac8510@scinet.utoronto.ca>

Hi Tina

Please see attachment for a working version of 'mmrmdir', that I've been 
using for over 10 years. You may have to tweak it a bit for the name of 
the node you want to run it from, and the location of the policy (also 
attached).

I have used all the suggested ways so far on this thread to delete files 
in bulk. I still prefer to use this script when I don't want to disturb 
anything else on the cluster setup, in particular multi-cluster as you 
appear to have. It gives absolute and fine control of what to delete. 
You may also use it in test mode, and gradually delete only subsets of 
directories if you wish. Traversing the inodes database and creating the 
list of files to delete is what takes most of the time, whether deleting 
1M or 50M files.

To that effect, deleting and recreating file systems or filesets still 
takes a very long time, if those areas are populated with files.

Best
Jaime


On 6/30/22 07:31, Tina Friedrich wrote:
> Hello everyone,
> 
> this should be a simple question, but we can't quite figure out how to 
> best proceed.
> 
> We have some file systems that we want to, basically, empty out. As in 
> remove all files and directories currently on them. Both contain a 
> pretty large number of files/directories (something like 50,000,000, 
> with sometimes silly characters in the file names). 'rm -rf' clearly 
> isn't the way to go forward.
> 
> We've come up with either 'mmapplypolicy' (i.e. a policy to remove all 
> files) or removing and re-creating the file systems as options (open to 
> other suggestions!).
> 
> We want the file systems still; ideally without having to redo the 
> authentication and key swaps etc for the 'remote' clusters using them.
> 
> This is a Lenovo DSS, but I don't think it makes much of a difference.
> 
> So - what's the best way to proceed?
> 
> If it is mmapplypolicy - does anyone have a (tested/known working) 
> example of a policy to simply remove all files?
> 
> Thanks,
> Tina
> 

---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - www.scinet.utoronto.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
-------------- next part --------------
#!/bin/bash

echo ""
echo "Command issued: "$0" "$@
echo ""

if [ "${HOSTNAME:0:12}" != datamover ]; then
	echo "You can only use mmrmdir on the datamovers"
	echo
	exit
fi

if [ "$1" == "" ] || [ "$1" == "-h" ] || [ "$1" == "-help" ] || [ "$1" == "--h" ] || [ "$1" == "--help" ] || [ $# -gt 2 ] || [ "$1" == "-test" ]; then
	echo "Usage: mmrmdir <directory absolute path> [-test]"
	echo "       -test  to verify what will be deleted"
	echo
	exit
fi

if [ "$2" != "" ] &&  [ "$2" != "-test" ]; then
        echo "Usage: mmrmdir <directory absolute path> [-test]"
        echo "       -test  to verify what will be deleted"
        echo
        exit
fi

echo -n "You have 10 seconds to cancel:"
for a in `seq 0 9`; do
    echo -n "  $a"
    sleep 1;
done
echo "  resuming ..."

LOCATION=$1
slash=`echo $LOCATION | grep /`

if [ "$slash" == "" ]; then
	echo "$LOCATION is not an absolute path"
	echo
	exit
fi

if [ "$2" == "-test" ]; then
	mmapplypolicy $LOCATION -P /usr/lpp/mmfs/bin/mmpolicyRules-DELETE-ALL -I test -L 2
else
	mmapplypolicy $LOCATION -P /usr/lpp/mmfs/bin/mmpolicyRules-DELETE-ALL -I defer -L 2
	if [ "$?" != 0 ]
	then
		echo #### there was an error with mmapplypolicy execution ####
	else
		echo removing empty directories in $LOCATION
		rm -rf $LOCATION
	fi
fi
exit 0
-------------- next part --------------
/* Define deletion rules for aged files in /dev/scratch (system pool by default).
   If the file has not been accessed in 90 days AND not owned by root then delete it. */

RULE 'DelSystem' DELETE
	FROM POOL 'system' 
	FOR FILESET('root')


From harr1 at llnl.gov  Thu Jun 30 16:08:54 2022
From: harr1 at llnl.gov (Cameron Harr)
Date: Thu, 30 Jun 2022 08:08:54 -0700
Subject: [gpfsug-discuss] quickest way to delete all files (and
 directories) in a file system
In-Reply-To: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
References: <15938fc5-9c06-5024-6f5e-9e3d64129b12@it.ox.ac.uk>
Message-ID: <c2d6d8e5-4c82-68af-0adc-c9d6e933ac94@llnl.gov>

If you have MPI infrastructure set up already on some clients, 'drm' 
from the MPI File Utils can delete them fairly quickly (e.g. with 256 procs)

https://github.com/hpc/mpifileutils

On 6/30/22 4:31 AM, Tina Friedrich wrote:
> Hello everyone,
>
> this should be a simple question, but we can't quite figure out how to 
> best proceed.
>
> We have some file systems that we want to, basically, empty out. As in 
> remove all files and directories currently on them. Both contain a 
> pretty large number of files/directories (something like 50,000,000, 
> with sometimes silly characters in the file names). 'rm -rf' clearly 
> isn't the way to go forward.
>
> We've come up with either 'mmapplypolicy' (i.e. a policy to remove all 
> files) or removing and re-creating the file systems as options (open 
> to other suggestions!).
>
> We want the file systems still; ideally without having to redo the 
> authentication and key swaps etc for the 'remote' clusters using them.
>
> This is a Lenovo DSS, but I don't think it makes much of a difference.
>
> So - what's the best way to proceed?
>
> If it is mmapplypolicy - does anyone have a (tested/known working) 
> example of a policy to simply remove all files?
>
> Thanks,
> Tina
>