[gpfsug-discuss] Protection against silent data corruption
Stephan Graf
st.graf at fz-juelich.de
Thu Jun 9 06:59:13 BST 2022
Hi,
I have created an IDEA for it:
https://ibm-sys-storage.ideas.ibm.com/ideas/GPFS-I-851
Stephan
On 08.06.2022 at 20:35, IBM Spectrum Scale wrote:
> Hi Stephen,
>
> Currently such a feature is not available in the Spectrum Scale product.
>
>
> Regards, The Spectrum Scale (GPFS) team
>
> ------------------------------------------------------------------------------------------------------------------
> If you feel that your question can benefit other users of Spectrum
> Scale (GPFS), then please post it to the public IBM developerWorks Forum
> at
> https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479.
>
>
> If your query concerns a potential software error in Spectrum Scale
> (GPFS) and you have an IBM software maintenance contract please contact
> 1-800-237-5511 in the United States or your local IBM Service Center
> in other countries.
>
> The forum is informally monitored as time permits and should not be used
> for priority messages to the Spectrum Scale (GPFS) team.
>
>
> From: "Stephen Ulmer" <ulmer at ulmer.org>
> To: "gpfsug main discussion list" <gpfsug-discuss at gpfsug.org>
> Date: 02-06-2022 11.32 PM
> Subject: [EXTERNAL] Re: [gpfsug-discuss] Protection against silent data
> corruption
> Sent by: "gpfsug-discuss" <gpfsug-discuss-bounces at gpfsug.org>
>
> ------------------------------------------------------------------------
>
>
>
> This only adds a checksum to the NSD wire protocol. The question was
> about detecting data corruption at rest.
>
> --
> Stephen
>
>
> On Jun 2, 2022, at 1:01 PM, Achim Rehor <Achim.Rehor at de.ibm.com> wrote:
>
> hi Stephan,
>
> there is, see the mmchconfig man page:
>
> nsdCksumTraditional
> This attribute enables checksum data-integrity checking between a
> traditional NSD client node and its NSD server. Valid values are yes
> and no. The default value is no.
> (Traditional in this context means that the NSD client and server
> are configured with IBM Spectrum Scale rather than with IBM Spectrum
> Scale RAID.
> The latter is a component of IBM Elastic Storage Server (ESS) and of
> IBM GPFS Storage Server (GSS).)
>
> The checksum procedure detects any corruption by the network of the
> data in the NSD RPCs that are exchanged between the NSD client and the
> server. A checksum error triggers a request to retransmit the message.
>
> When this attribute is enabled on a client node, the client
> indicates in each of its requests to the server that it is using
> checksums. The server uses checksums only in
> response to client requests in which the indicator is set. A client
> node that accesses a file system that belongs to another cluster can
> use checksums in the same way.
>
> You can change the value of this attribute for an entire cluster
> without shutting down the mmfsd daemon, or for one or more nodes
> without restarting the nodes.
>
> Note:
> * Enabling this feature can result in significant I/O performance
> degradation and a considerable increase in CPU usage.
>
> * To enable checksums for a subset of the nodes in a cluster, issue
> a command like the following one:
> mmchconfig nsdCksumTraditional=yes -i -N <subset-of-nodes>
>
> The -N flag is valid for this attribute.
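
To make the mechanism in the man page excerpt concrete, here is a minimal sketch of a checksum on the wire with retransmit on mismatch. It is only an illustration: CRC32 from Python's zlib stands in for whatever checksum the NSD protocol really uses, and the names attach_checksum, is_intact, transfer and FlakyChannel are hypothetical, not part of Spectrum Scale.

```python
import zlib
from typing import Callable, Tuple


def attach_checksum(payload: bytes) -> Tuple[bytes, int]:
    """Sender side: compute a CRC32 over the payload (assumed algorithm)."""
    return payload, zlib.crc32(payload)


def is_intact(payload: bytes, checksum: int) -> bool:
    """Receiver side: recompute the CRC32 and compare it with the sent value."""
    return zlib.crc32(payload) == checksum


def transfer(payload: bytes,
             channel: Callable[[bytes], bytes],
             max_attempts: int = 3) -> Tuple[bytes, int]:
    """Send the payload through the channel; retransmit on a checksum error.

    Returns the verified data and the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        data, checksum = attach_checksum(payload)
        received = channel(data)
        if is_intact(received, checksum):
            return received, attempt
    raise IOError("checksum mismatch after %d attempts" % max_attempts)


class FlakyChannel:
    """Hypothetical network that flips one bit in the first N messages."""

    def __init__(self, corrupt_first_n: int) -> None:
        self.corrupt_first_n = corrupt_first_n
        self.calls = 0

    def __call__(self, data: bytes) -> bytes:
        self.calls += 1
        if self.calls <= self.corrupt_first_n and data:
            return bytes([data[0] ^ 0x01]) + data[1:]
        return data


# First transmission is corrupted, the retransmission arrives intact.
received, attempts = transfer(b"nsd block", FlakyChannel(1))
```

Note that the checksum here only covers the data in flight. Once the block has been written, nothing in this sketch can tell whether it later changes on disk.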
>
> --
> Mit freundlichen Grüßen / Kind regards
>
> Achim Rehor
>
> Technical Support Specialist Spectrum Scale and ESS (SME)
> Advisory Product Services Professional
> IBM Systems Storage Support - EMEA
>
> Achim.Rehor at de.ibm.com +49-170-4521194
> IBM Deutschland GmbH
> Chairman of the Supervisory Board: Sebastian Krause
> Management: Gregor Pillen (Chairman), Nicole Reimer,
> Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
> Registered office: Ehningen / Registration court: Stuttgart District
> Court, HRB 14562 / WEEE-Reg.-Nr. DE 99369940
>
>
> -----Original Message-----
> From: Stephan Graf <st.graf at fz-juelich.de>
> Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
> To: gpfsug-discuss <gpfsug-discuss at gpfsug.org>
> Subject: [EXTERNAL] [gpfsug-discuss] Protection against silent data corruption
> Date: Thu, 02 Jun 2022 16:31:43 +0200
>
> Hi,
>
> I am wondering if there is an option in SS to enable some checking to
> detect silent data corruption.
>
> From GNR I know that there is end-to-end integrity, so a checksum is
> stored in addition to the data.
>
> The background is that we are facing an issue where, for some files
> (which have data replication = 2), mmrestripefile reports that one
> block does not match its copy (the storage cluster is running SS
> without GNR).
> We have validated that the copied block is fine, but the original one
> is broken (and this is what is returned on read access).
> SS in our installation is currently unable to determine which is the
> correct one.
> Is there any option to enable this kind of feature in SS? If not, does
> it make sense to create an "IDEA" for it?
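
The problem described above (two replicas that differ, with no way to tell which one is right) can be sketched as follows. This is only an illustration of why a checksum stored at write time helps. It is not how Spectrum Scale or GNR implement integrity checking; SHA-256 is an arbitrary choice, and block_digest and pick_good_replica are hypothetical names.

```python
import hashlib
from typing import List, Optional


def block_digest(data: bytes) -> str:
    """Digest that would be stored alongside the block at write time."""
    return hashlib.sha256(data).hexdigest()


def pick_good_replica(replicas: List[bytes], stored_digest: str) -> Optional[int]:
    """Return the index of the first replica matching the stored digest.

    Returns None when no replica matches, i.e. every copy is damaged.
    """
    for index, data in enumerate(replicas):
        if block_digest(data) == stored_digest:
            return index
    return None


# At write time: both replicas are identical and the digest is recorded.
original = b"A" * 16
digest_at_write = block_digest(original)

# Later: the first copy has silently changed on disk, the second is fine.
corrupted = b"B" + original[1:]
replicas = [corrupted, original]

# Comparing the replicas only shows that they differ, not which one is right.
replicas_differ = replicas[0] != replicas[1]

# With the stored digest the healthy copy can be identified.
good_index = pick_good_replica(replicas, digest_at_write)
```

Without the stored digest, a comparison of the two copies gives the same answer no matter which one is broken, which is the situation with mmrestripefile described above.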
>
> Stephan
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at gpfsug.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
--
Stephan Graf
Juelich Supercomputing Centre
Phone: +49-2461-61-6578
Fax: +49-2461-61-6656
E-mail: st.graf at fz-juelich.de
WWW: http://www.fz-juelich.de/jsc/
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------
Forschungszentrum Juelich GmbH
52425 Juelich
Registered office: Juelich
Registered in the commercial register of the Dueren District Court, No. HR B 3498
Chairman of the Supervisory Board: MinDir Volker Rieke
Board of Directors: Prof. Dr.-Ing. Wolfgang Marquardt (Chairman),
Karsten Beneke (Deputy Chairman), Dr. Astrid Lambrecht,
Prof. Dr. Frauke Melchior
---------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------