From naren.rajasingam at systemethix.com.au Fri Mar 1 23:20:35 2024
From: naren.rajasingam at systemethix.com.au (Naren Rajasingam)
Date: Fri, 1 Mar 2024 23:20:35 +0000
Subject: [gpfsug-discuss] Greetings
Message-ID:

It's been a while since I was last part of this users group. My name is Naren Rajasingam and I have been working with GPFS/Spectrum Scale/Storage Scale since 2010. I am formerly from IBM (I left the company in 2015 after 16 years there) and now work for Systemethix as a senior technical consultant specialising in Scale.

Cheers, Kind Regards,
-Naren

Naren Rajasingam
Senior Technology Consultant
Mobile: +61 (0)419 513 189
Email: naren.rajasingam at systemethix.com.au
Systemethix Australia
www.systemethix.com.au
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From TROPPENS at de.ibm.com Mon Mar 4 22:51:10 2024
From: TROPPENS at de.ibm.com (Ulf Troppens)
Date: Mon, 4 Mar 2024 22:51:10 +0000
Subject: [gpfsug-discuss] storage-scale-object - summary
In-Reply-To:
References:
Message-ID:

Hi List,
more details can be shared under NDA. Please ask your IBM sales rep to contact me. Or meet me at the German User meeting this Wednesday.
Best, Ulf

Ulf Troppens
Product Manager - IBM Storage for Data and AI, Data-Intensive Workflows
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt / Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: gpfsug-discuss On Behalf Of Alexander Saupp
Sent: Thursday, 29 February 2024 08:28
To: gpfsug-discuss at gpfsug.org
Subject: [EXTERNAL] [gpfsug-discuss] storage-scale-object - summary

Hi all,
there were a couple of very good concerns, questions and statements already. I'd like to summarize my very personal understanding - not necessarily officially speaking for my employer IBM. Be invited to reach out to IBM Client Engineering (a presales investment by IBM) if you have a need to discuss, demo or evaluate in connection to an active Opportunity (I know, but that's the rules of engagement).
* Swift S3 is complex to maintain, as said by Christoph Martin. IBM supports multiple stacks for multiple products. That, along with currency, is the main reason to move away.
* For alternatives a focus on "unified access via file and S3" was set, so the feature set required is something (for you as a customer) to evaluate.
* You can find references on our future architecture publicly available; it is based on the same noobaa stack that is used in Red Hat ODF MCG. I would like to recommend the following blog of my IBM CE peer Nils Haustein:
https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2024/02/21/s3-tiering-to-tape-with-noobaa-part-1-introduction
https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2024/02/21/s3-tiering-to-tape-with-noobaa-part-2-deployment
https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2024/02/26/s3-tiering-to-tape-with-noobaa-part-3-tiering
* Some might consider release timing, but here is the plan, thanks for already outlining, Renar!
https://www.ibm.com/docs/en/storage-scale/5.1.9?topic=summary-changes * IBM Storage Scale 5.1.8 is the last release that has CES Swift Object protocol. * IBM Storage Scale 5.1.9 [EUS release] will tolerate the update of a CES node from IBM Storage Scale 5.1.8 - so if you have it, you can keep it * I?m expecting TechPreview and GA within the next two releases ? technical details as per above blog. I hope this helps to clarify IBM?s plan of record. I?d like to reinvite you to reach out to IBM (via IBM Sales / directly) if you?d like to follow-up. Mit freundlichen Gr??en / Kind regards Alexander Saupp Senior Technology Engineer | IBM Client Engineering EMEA | Software Defined Storage +49 172 725 1072 Mobile alexander.saupp at de.ibm.com IBM Data Privacy Statement IBM Deutschland GmbH Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Wolfgang Wendt (Vorsitzender), Dr. Andreas Buchelt, Dr. Frank Kohls, Christine Rupp Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.Hruska at mcomputers.cz Fri Mar 8 13:02:56 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Fri, 8 Mar 2024 13:02:56 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients Message-ID: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> Hello, I would like to ask if there is any possible way to unify access IDs for Samba share and GPFS client on Windows. The model situation looks like this - we have a GPFS cluster with one filesystem. On the filesystem we have a Samba shared directory. MMauth is configured to Active directory for file access and works. The Windows machine is also part of the GPFS cluster and therefore it is able to mount the filesystem "directly" by mmmount. However the user/group IDs that are used by this access method are not consistent with the IDs used by Samba and access to the same data is not working well. Is there is any solution to this situation? I tried to study the documentation but I didn't find a clear information whether this is or isn't possible but I also didn't find a way to unify the access. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:d08ab7cf0ea126f3f6cd0af2c9a3127280680285.camel at mcomputers.cz-0] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: From cabrillo at ifca.unican.es Fri Mar 8 14:39:50 2024 From: cabrillo at ifca.unican.es (Iban Cabrillo) Date: Fri, 8 Mar 2024 14:39:50 +0000 (UTC) Subject: [gpfsug-discuss] pagepool Message-ID: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. 
We are noticing that some jobs are dying with "Memory cgroup out of memory: Killed process XXX", and my doubt is whether this pagepool is reserving too much memory for the mmfs process to the detriment of the execution of jobs.
Any advice is welcomed,
Regards, I
--
================================================================
Ibán Cabrillo Bartolomé
Instituto de Física de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mjarsulic at bsd.uchicago.edu Fri Mar 8 14:50:02 2024
From: mjarsulic at bsd.uchicago.edu (Jarsulic, Michael [BSD])
Date: Fri, 8 Mar 2024 14:50:02 +0000
Subject: [gpfsug-discuss] pagepool
In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
Message-ID:

Ibán,

What are you using for your scheduler?

On my compute nodes, I am setting the pagepool to 16 GB and setting aside specialized memory for GPFS that will not be allocated to jobs.

--
Mike Jarsulic
Associate Director, Scientific Computing
Center for Research Informatics | Biological Sciences Division
University of Chicago
5454 South Shore Drive, Chicago, IL 60615 | (773) 702-2066

From: gpfsug-discuss on behalf of Iban Cabrillo
Date: Friday, March 8, 2024 at 8:44 AM
To: gpfsug-discuss
Subject: [gpfsug-discuss] pagepool

Good afternoon,
We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value:
pagepool 323908133683
But not only in the DSS servers, but also in the rest of the HPC nodes, and I don't know if it is an excessive value.
We are noticing that some jobs are dying with "Memory cgroup out of memory: Killed process XXX", and my doubt is whether this pagepool is reserving too much memory for the mmfs process to the detriment of the execution of jobs.
Any advice is welcomed,
Regards, I
--
================================================================
Ibán Cabrillo Bartolomé
Instituto de Física de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
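A quick way to sanity-check what a client node is actually reserving before changing anything - a generic sketch run on one compute node, not specific to any particular DSS release:

    # Show the pagepool value that applies to this node (and any nodeclass overrides)
    mmlsconfig pagepool

    # Show how much memory mmfsd has actually allocated, including the page pool
    mmdiag --memory

    # Cross-check against the resident size of the GPFS daemon itself
    ps -o rss,cmd -C mmfsd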
From cabrillo at ifca.unican.es Fri Mar 8 15:14:37 2024
From: cabrillo at ifca.unican.es (Iban Cabrillo)
Date: Fri, 8 Mar 2024 15:14:37 +0000 (UTC)
Subject: Re: [gpfsug-discuss] pagepool
In-Reply-To:
References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
Message-ID: <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es>

Hi Mike,
  Slurm 23.02.7-1.el9
Regards, I
--
================================================================
Ibán Cabrillo Bartolomé
Instituto de Física de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
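Putting together the advice given in the replies further down the thread (the mmcrnodeclass/mmchconfig commands and the MemSpecLimit setting are quoted from those replies; the node names and sizes here are only placeholders), the usual pattern looks roughly like this:

    # GPFS side: group the compute nodes and give them a smaller page pool
    mmcrnodeclass compute -N node001,node002
    mmchconfig pagepool=16G -i -N compute

    # Slurm side (slurm.conf): keep the GPFS and OS memory out of what jobs can request
    NodeName=node[001-002] RealMemory=128000 MemSpecLimit=20000

Note that a pagepool change normally needs the GPFS daemon on the affected nodes to be restarted before it takes full effect.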
From jonathan.buzzard at strath.ac.uk Fri Mar 8 15:50:19 2024
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Fri, 8 Mar 2024 15:50:19 +0000
Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients
In-Reply-To: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz>
References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz>
Message-ID:

On 08/03/2024 13:02, Peter Hruška wrote:
> Hello,
>
> I would like to ask if there is any possible way to unify access IDs for
> Samba share and GPFS client on Windows.
>
> The model situation looks like this - we have a GPFS cluster with one
> filesystem. On the filesystem we have a Samba shared directory. MMauth
> is configured to Active directory for file access and works.
> The Windows machine is also part of the GPFS cluster and therefore it is
> able to mount the filesystem "directly" by mmmount. However the
> user/group IDs that are used by this access method are not consistent
> with the IDs used by Samba and access to the same data is not working well.
> Is there is any solution to this situation? I tried to study the
> documentation but I didn't find a clear information whether this is or
> isn't possible but I also didn't find a way to unify the access.
>

The obvious question is to ask if your Active Directory has its RFC 2307bis fields populated?

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG From mjarsulic at bsd.uchicago.edu Fri Mar 8 16:01:30 2024 From: mjarsulic at bsd.uchicago.edu (Jarsulic, Michael [BSD]) Date: Fri, 8 Mar 2024 16:01:30 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> Message-ID: Ib?n, What I did in my slurm.conf is use the MemSpecLimit option. For a node with 128 GB and 16 GB GPFS pagepool, I set the memspec like this: NodeName=cri22cn[001-156] CPUs=32 RealMemory=128000 MemSpecLimit=20000 -- Mike Jarsulic Associate Director, Scientific Computing Center for Research Informatics | Biological Sciences Division University of Chicago 5454 South Shore Drive, Chicago, IL 60615 | (773) 702-2066 From: gpfsug-discuss on behalf of Iban Cabrillo Date: Friday, March 8, 2024 at 9:19?AM To: gpfsug main discussion list Cc: gpfsug-discuss Subject: Re: [gpfsug-discuss] pagepool HI Mike, Slurm 23.02.7-1.el9 Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://urldefense.com/v3/__https://confluence.ifca.es/display/IC/Information*Security*Policy*for*External*Suppliers__;KysrKys!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZwsd0o3w$ ========================================================================================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZCv9-RzU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mjarsulic at bsd.uchicago.edu Fri Mar 8 16:01:30 2024 From: mjarsulic at bsd.uchicago.edu (Jarsulic, Michael [BSD]) Date: Fri, 8 Mar 2024 16:01:30 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> Message-ID: Ib?n, What I did in my slurm.conf is use the MemSpecLimit option. For a node with 128 GB and 16 GB GPFS pagepool, I set the memspec like this: NodeName=cri22cn[001-156] CPUs=32 RealMemory=128000 MemSpecLimit=20000 -- Mike Jarsulic Associate Director, Scientific Computing Center for Research Informatics | Biological Sciences Division University of Chicago 5454 South Shore Drive, Chicago, IL 60615 | (773) 702-2066 From: gpfsug-discuss on behalf of Iban Cabrillo Date: Friday, March 8, 2024 at 9:19?AM To: gpfsug main discussion list Cc: gpfsug-discuss Subject: Re: [gpfsug-discuss] pagepool HI Mike, Slurm 23.02.7-1.el9 Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? 
Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://urldefense.com/v3/__https://confluence.ifca.es/display/IC/Information*Security*Policy*for*External*Suppliers__;KysrKys!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZwsd0o3w$ ========================================================================================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZCv9-RzU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.Hruska at mcomputers.cz Fri Mar 8 16:08:35 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Fri, 8 Mar 2024 16:08:35 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients In-Reply-To: References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> Message-ID: Hello Jonathan, Thank you for the answer. Since I used Automatic ID-mapping method for the mmauth deployment I didn't do anything regarding RFC2307. I chose this approach because we don't want to use kerberos for NFS authentication (although we will use NFS for separate data access). I'll check on that. If you have any hints I would appreciate them. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:c95a681f19a2dd92eac22c858b7c1d0dfa182335.camel at mcomputers.cz-0] On Fri, 2024-03-08 at 15:50 +0000, Jonathan Buzzard wrote: EXTERN? ODES?LATEL On 08/03/2024 13:02, Peter Hru?ka wrote: Hello, I would like to ask if there is any possible way to unify access IDs for Samba share and GPFS client on Windows. The model situation looks like this - we have a GPFS cluster with one filesystem. On the filesystem we have a Samba shared directory. MMauth is configured to Active directory for file access and works. The Windows machine is also part of the GPFS cluster and therefore it is able to mount the filesystem "directly" by mmmount. However the user/group IDs that are used by this access method are not consistent with the IDs used by Samba and access to the same data is not working well. Is there is any solution to this situation? I tried to study the documentation but I didn't find a clear information whether this is or isn't possible but I also didn't find a way to unify the access. The obvious question is to ask if your Active Directory has it's RFC 23037bis fields populated? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: From jonathan.buzzard at strath.ac.uk Fri Mar 8 16:18:47 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 8 Mar 2024 16:18:47 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients In-Reply-To: References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> Message-ID: <6c2a7f94-fda1-4a4b-893a-0edd55fbda26@strath.ac.uk> On 08/03/2024 16:08, Peter Hru?ka wrote: > Hello Jonathan, > > Thank you for the answer. Since I used Automatic ID-mapping method for > the mmauth deployment I didn't do anything regarding RFC2307. > I chose this approach because we don't want to use kerberos for NFS > authentication (although we will use NFS for separate data access). > I'll check on that. If you have any hints I would appreciate them. > Consistent mapping won't work without RFC2307bis attributes being populated as far as I am aware. Windows knows nothing about the idmap_rid, it only knows about SID's Mixing NFS and Samba out the same file system or at the very least the same directory hierarchy is a mugs game. There in lies a gigantic pit of woe for all those foolish enough to try based on personal experience. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Fri Mar 8 16:25:17 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 8 Mar 2024 16:25:17 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: On 08/03/2024 14:50, Jarsulic, Michael [BSD] wrote: > Ib?n, > > What are you using for your scheduler? > > On my compute nodes, I am setting the pagepool to 16 GB and setting > aside specialized memory for GPFS that will not be allocated to jobs. > What you would normally do is create a node class mmcrnodeclass compute -N node001,node002,node003,node004,..... then set the pagepool appropriately mmchconfig pagepool=16G -i -N compute We then use slurm to limit the maximum amount of RAM a job can have on a node to be physical RAM minus the pagepool size minus a bit more for good measure to allow for the OS. If the OOM is kicking in then you need to reduce the RAM limit in slurm some more till it stops. Note we also twiddle with some other limits for compute nodes mmchconfig maxFilesToCache=8000 -N compute mmchconfig maxStatCache=16000 -N compute We have a slew of node classes where these settings are tweaked to account for their RAM and their role so dssg, compute, gpu,protocol,teaching, and login. All nodes belong to one or more node classes. Which reminds me I need a gui node class now. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scale at us.ibm.com Fri Mar 8 16:31:44 2024 From: scale at us.ibm.com (scale) Date: Fri, 8 Mar 2024 16:31:44 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: To the original question about the size of the pagepool being 323908133683 bytes, I think that should only be applied to the IO nodes (the ones handling all the NSD IOs) not every node in the cluster. 
The large pagepool size is needed on the IO nodes for GNR to function properly. From: gpfsug-discuss on behalf of Jonathan Buzzard Date: Friday, March 8, 2024 at 11:27?AM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] Re: [gpfsug-discuss] pagepool On 08/03/2024 14:50, Jarsulic, Michael [BSD] wrote: > Ib?n, > > What are you using for your scheduler? > > On my compute nodes, I am setting the pagepool to 16 GB and setting > aside specialized memory for GPFS that will not be allocated to jobs. > What you would normally do is create a node class mmcrnodeclass compute -N node001,node002,node003,node004,..... then set the pagepool appropriately mmchconfig pagepool=16G -i -N compute We then use slurm to limit the maximum amount of RAM a job can have on a node to be physical RAM minus the pagepool size minus a bit more for good measure to allow for the OS. If the OOM is kicking in then you need to reduce the RAM limit in slurm some more till it stops. Note we also twiddle with some other limits for compute nodes mmchconfig maxFilesToCache=8000 -N compute mmchconfig maxStatCache=16000 -N compute We have a slew of node classes where these settings are tweaked to account for their RAM and their role so dssg, compute, gpu,protocol,teaching, and login. All nodes belong to one or more node classes. Which reminds me I need a gui node class now. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Mar 8 16:32:57 2024 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 8 Mar 2024 16:32:57 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: Yikes! Those must be some mighty large memory compute nodes! That is an OK setting for a large memory ESS/DSS server but NOT the compute nodes at my site, as that is in bytes. (so ~324 GB) Even on our 1TB+ memory machines we do not tune it that high. You can set pagepool for nodeclass machines such as all your compute, but pagepool is one of those settings where you will have to restart the clients for it to take effect. (such as most all the rdma settings, etc) You should look into creating a ?nodeclass? for each of your ?node types? if you have not already, so you can avoid OOM issues from just the pagepool, and tune other settings per node-type (rdma/network settings, etc) I would address this here, rather than on the Slurm side. Then you can address (total memory minus the pagepool) for the overall addressability to Slurm for user jobs. Leave some spare memory for the system itself or you will see more memory issues and whatnot when users get close to OOM, even in their cgroup. Example from a cross mounted compute-side cluster. Default is 1GB: [root at nostorage-manager1 ~]# mmlsconfig pagepool pagepool 1024M pagepool 4G [k8,pitzer] pagepool 64G [ascend] pagepool 16G [ib-spire-login,owenslogin,pitzerlogin] pagepool 48G [dm] pagepool 4G [cardinal] pagepool 64G [cardinal_quadport] example from the ESS/DSS server side. Later ESS versions set things by mmvdisk groups, rather than server type. 
# mmlsconfig pagepool pagepool 32G pagepool 358G [gss_ppc64] pagepool 16384M [ibmems11-hs,ems] pagepool 324383477760 [ess3200_mmvdisk_ibmessio13_hs_ibmessio14_hs,ess3200_mmvdisk_ibmessio15_hs_ibmessio16_hs,ess3200_mmvdisk_ibmessio17_hs_ibmessio18_hs] pagepool 64G [sp] pagepool 384399572992 [ibmgssio1_hsibmgssio2_hs,ibmgssio3_hsibmgssio4_hs,ibmgssio5_hsibmgssio6_hs] pagepool 573475966156 [ess5k_mmvdisk_ibmessio11_hs_ibmessio12_hs] pagepool 96G [ces] example of nodeclasses used to address other settings, such as what Infiniband port(s) to use. # mmlsconfig verbsports verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_2 [pitzer_dualport] verbsPorts mlx4_1/1 mlx4_1/2 [dm] verbsPorts mlx5_0 mlx5_2 [k8_dualport] verbsPorts mlx5_0 mlx5_1 mlx5_2 mlx5_3 [cardinal_quadport] Ed Wahl Ohio Supercomputer Center From: gpfsug-discuss On Behalf Of Iban Cabrillo Sent: Friday, March 8, 2024 9:40 AM To: gpfsug-discuss Subject: [gpfsug-discuss] pagepool Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs. Any advice is welcomed, Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers ========================================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Mar 8 16:35:47 2024 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 8 Mar 2024 16:35:47 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: <7B6AC75B-43AF-4696-AAA9-0296D7211B20@rutgers.edu> What are the units on that ? is that 323GB? Zero chance you need it that high on clients. Just for perspective, our pagepool on our clients is 4GB and on the DSS-G, it is 242GB. I would suggest that you start with the settings in /opt/lenovo/dss/bin/dssClientConfig.sh (the settings themselves are in v5.worker.dssClientConfig in the same directory), if you have a brand new config and don?t have to worry about breaking your system with the wrong values (I have to be more careful with that as some of our values are higher than those defaults already). 
You just made me worry that perhaps I was still running with an out-of-date value there, but the default is still to raise the pagepool for clients to 4GB if you don?t specify otherwise.

What I was told by Lenovo years ago was that this is about the level where you start not to notice any difference when you go larger. You may want to test values for this for your workloads/see whether you fill it up if it?s set to that Lenovo default and then reconsider. You can change it for a single node with -N , if you want to test.

--
#BlackLivesMatter
____
|| \\UTGERS,    |---------------------------*O*---------------------------
||_// the State |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ | Office of Advanced Research Computing - MSB A555B, Newark
     `'

On Mar 8, 2024, at 09:39, Iban Cabrillo wrote:

Good afternoon,
We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value:
pagepool 323908133683
But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value.
We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs.
Any advice is welcomed,
Regards, I
--
================================================================
Ib?n Cabrillo Bartolom?
Instituto de F?sica de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From novosirj at rutgers.edu Fri Mar 8 16:50:25 2024
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Fri, 8 Mar 2024 16:50:25 +0000
Subject: Re: [gpfsug-discuss] pagepool
In-Reply-To:
References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
Message-ID:

Curious, if you could say something about how you ended up with some page pool values on your client side that are that high. For what use cases does 64GB, for example, make a difference?

--
#BlackLivesMatter
____
|| \\UTGERS,    |---------------------------*O*---------------------------
||_// the State |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr.
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Mar 8, 2024, at 11:32, Wahl, Edward wrote: Yikes! Those must be some mighty large memory compute nodes! That is an OK setting for a large memory ESS/DSS server but NOT the compute nodes at my site, as that is in bytes. (so ~324 GB) Even on our 1TB+ memory machines we do not tune it that high. You can set pagepool for nodeclass machines such as all your compute, but pagepool is one of those settings where you will have to restart the clients for it to take effect. (such as most all the rdma settings, etc) You should look into creating a ?nodeclass? for each of your ?node types? if you have not already, so you can avoid OOM issues from just the pagepool, and tune other settings per node-type (rdma/network settings, etc) I would address this here, rather than on the Slurm side. Then you can address (total memory minus the pagepool) for the overall addressability to Slurm for user jobs. Leave some spare memory for the system itself or you will see more memory issues and whatnot when users get close to OOM, even in their cgroup. Example from a cross mounted compute-side cluster. Default is 1GB: [root at nostorage-manager1 ~]# mmlsconfig pagepool pagepool 1024M pagepool 4G [k8,pitzer] pagepool 64G [ascend] pagepool 16G [ib-spire-login,owenslogin,pitzerlogin] pagepool 48G [dm] pagepool 4G [cardinal] pagepool 64G [cardinal_quadport] example from the ESS/DSS server side. Later ESS versions set things by mmvdisk groups, rather than server type. # mmlsconfig pagepool pagepool 32G pagepool 358G [gss_ppc64] pagepool 16384M [ibmems11-hs,ems] pagepool 324383477760 [ess3200_mmvdisk_ibmessio13_hs_ibmessio14_hs,ess3200_mmvdisk_ibmessio15_hs_ibmessio16_hs,ess3200_mmvdisk_ibmessio17_hs_ibmessio18_hs] pagepool 64G [sp] pagepool 384399572992 [ibmgssio1_hsibmgssio2_hs,ibmgssio3_hsibmgssio4_hs,ibmgssio5_hsibmgssio6_hs] pagepool 573475966156 [ess5k_mmvdisk_ibmessio11_hs_ibmessio12_hs] pagepool 96G [ces] example of nodeclasses used to address other settings, such as what Infiniband port(s) to use. # mmlsconfig verbsports verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_2 [pitzer_dualport] verbsPorts mlx4_1/1 mlx4_1/2 [dm] verbsPorts mlx5_0 mlx5_2 [k8_dualport] verbsPorts mlx5_0 mlx5_1 mlx5_2 mlx5_3 [cardinal_quadport] Ed Wahl Ohio Supercomputer Center From: gpfsug-discuss > On Behalf Of Iban Cabrillo Sent: Friday, March 8, 2024 9:40 AM To: gpfsug-discuss > Subject: [gpfsug-discuss] pagepool Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs. Any advice is welcomed, Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? 
Instituto de F?sica de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From Achim.Rehor at de.ibm.com Fri Mar 8 16:57:30 2024 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Fri, 8 Mar 2024 16:57:30 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: <0650cb15204a93cccc1b7d4845bf2b75f93ab759.camel@de.ibm.com> we do ship a sample file under : /usr/lpp/mmfs/samples/gss with the gpfs.gnr rpm for both Servers and Clients, which should be taking care of some settings, including pagepool : gssClientConfig.sh gssServerConfig.sh These are described here : https://www.ibm.com/docs/en/storage-scale-system/6.0.2?topic=guide-client-node-tuning-recommendations -- Mit freundlichen Gr??en / Kind regards Achim Rehor Technical Support Specialist S?pectrum Scale and ESS (SME) Advisory Product Services Professional IBM Systems Storage Support - EMEA Achim.Rehor at de.ibm.com +49-170-4521194 IBM Deutschland GmbH Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer, Gabriele Schwarenthorer, Christine Rupp, Frank Theisen Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 -----Original Message----- From: Iban Cabrillo > Reply-To: gpfsug main discussion list > To: gpfsug-discuss > Subject: [EXTERNAL] [gpfsug-discuss] pagepool Date: Fri, 08 Mar 2024 14:39:50 +0000 Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. Report Suspicious ZjQcmQRYFpfptBannerEnd Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs. Any advice is welcomed, Regards, I _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From cabrillo at ifca.unican.es Fri Mar 8 19:19:46 2024 From: cabrillo at ifca.unican.es (Iban Cabrillo) Date: Fri, 8 Mar 2024 19:19:46 +0000 (UTC) Subject: [gpfsug-discuss] pagepool In-Reply-To: <0650cb15204a93cccc1b7d4845bf2b75f93ab759.camel@de.ibm.com> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> <0650cb15204a93cccc1b7d4845bf2b75f93ab759.camel@de.ibm.com> Message-ID: <1257158574.6655748.1709925586705.JavaMail.zimbra@ifca.unican.es> HI Guys, Thanks a lot!! for all these usefull information Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? 
Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers ========================================================================================== From sarah.walters at uq.edu.au Mon Mar 11 03:09:44 2024 From: sarah.walters at uq.edu.au (Sarah Walters) Date: Mon, 11 Mar 2024 03:09:44 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients In-Reply-To: <6c2a7f94-fda1-4a4b-893a-0edd55fbda26@strath.ac.uk> References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> <6c2a7f94-fda1-4a4b-893a-0edd55fbda26@strath.ac.uk> Message-ID: It works just fine at UQ, using an AFM cache. We have NFS-only at the 'home' but we have thousands of filesets coming out of NFS and SMB on our cache. Not, technically, a preferred configuration to have that many of them, but it's possible. Sarah Walters BCompSc Research Computing Systems Engineer Research Computing Centre The University of Queensland Brisbane QLD 4072 Australia E sarah.walters at uq.edu.au W www.rcc.uq.edu.au CRICOS code: 00025B The University of Queensland is embracing the Green Office philosophy. Please consider the environment before printing this email. This email (including any attached files) is intended solely for the addressee and may contain confidential information of The University of Queensland. If you are not the addressee, you are notified that any transmission, distribution, printing or photocopying of this email is prohibited. If you have received this email in error, please delete and notify me. Unless explicitly stated, the opinions expressed in this email do not represent the official position of The University of Queensland. ________________________________ From: gpfsug-discuss on behalf of Jonathan Buzzard Sent: Saturday, 9 March 2024 02:18 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients On 08/03/2024 16:08, Peter Hru?ka wrote: > Hello Jonathan, > > Thank you for the answer. Since I used Automatic ID-mapping method for > the mmauth deployment I didn't do anything regarding RFC2307. > I chose this approach because we don't want to use kerberos for NFS > authentication (although we will use NFS for separate data access). > I'll check on that. If you have any hints I would appreciate them. > Consistent mapping won't work without RFC2307bis attributes being populated as far as I am aware. Windows knows nothing about the idmap_rid, it only knows about SID's Mixing NFS and Samba out the same file system or at the very least the same directory hierarchy is a mugs game. There in lies a gigantic pit of woe for all those foolish enough to try based on personal experience. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From Peter.Hruska at mcomputers.cz  Mon Mar 11 13:21:32 2024
From: Peter.Hruska at mcomputers.cz (Peter Hruška)
Date: Mon, 11 Mar 2024 13:21:32 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
Message-ID: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz>

Hello,
We encountered a problem with write performance on GPFS when the application uses direct I/O. To simulate the issue it is enough to run fio with the option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct I/O by using "disableDIO=yes", but the directive didn't have any effect. Is there any way to make GPFS ignore direct I/O requests and use caching for everything?

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:0e66df54ea6e2d2372ddf5fb3417f35a416a893f.camel at mcomputers.cz-0]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: mcomputers_podpis_2024.png
URL: 

From Renar.Grunenberg at huk-coburg.de  Mon Mar 11 13:56:04 2024
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Mon, 11 Mar 2024 13:56:04 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
In-Reply-To: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz>
References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz>
Message-ID: <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de>

Hallo Peter,
my two cents on this: set the disableDIO parameter back to its default and use

dioSmallSeqWriteBatching 1
disableDIO 0

and give it a try.
Regards Renar

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Helen Reck, Dr. Jörg Rheinländer, Thomas Sehn, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: gpfsug-discuss Im Auftrag von Peter Hruška
Gesendet: Montag, 11.
M?rz 2024 14:22 An: gpfsug-discuss at gpfsug.org Betreff: [gpfsug-discuss] Slow performance on writes when using direct io Hello, We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:image001.png at 01DA73C4.13BF9D80] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 13955 bytes Desc: image001.png URL: From Peter.Hruska at mcomputers.cz Mon Mar 11 15:13:49 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Mon, 11 Mar 2024 15:13:49 +0000 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de> Message-ID: <3952cd1487255f84236f987ae5c6aba7d24c256c.camel@mcomputers.cz> Hello Renar, Thank you for the suggestion. I tried the configuration changes. They however do not seem to have any effect - the performance seems identical. I also tried all 4 combinations. I checked the documentation on the parameter "dioSmallSeqWriteBatching" and it states that small IOs are considered under 64kb. I've also found a parameter "dioSmallSeqWriteThreshold" but my GPFS version (5.1.9.0) claims that it's an unknown attribute. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:7960eefed7c420f97bb5d3ea3ccc383e18a57bcd.camel at mcomputers.cz-0] On Mon, 2024-03-11 at 13:56 +0000, Grunenberg, Renar wrote: EXTERN? ODES?LATEL Hallo Peter, my to cents to this. Set the diasbleDIO=yes Parameter to DEFAULT and use the ! dioSmallSeqWriteBatching 1 disableDIO 0 And give it a try. Regards Renar Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. Helen Reck, Dr. J?rg Rheinl?nder, Thomas Sehn, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. 
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss Im Auftrag von Peter Hru?ka Gesendet: Montag, 11. M?rz 2024 14:22 An: gpfsug-discuss at gpfsug.org Betreff: [gpfsug-discuss] Slow performance on writes when using direct io Hello, We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:image001.png at 01DA73C4.13BF9D80] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 13955 bytes Desc: image001.png URL: From Peter.Hruska at mcomputers.cz Mon Mar 11 16:43:03 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Mon, 11 Mar 2024 16:43:03 +0000 Subject: [gpfsug-discuss] Rewriting existing files is incredibly slow Message-ID: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz> Hello, We've encountered yet another performance flaw. We have a GPFS filesystem mounted using GPFS binaries on Windows. When we try to rewrite a file on the GPFS filesystem rewriting speed is much slower than writing to a new file. The difference ratio we measured is about 3.5 times. From the task manager it is visible that there is excessive amount of reading from the network when rewriting. This is even visible on the NDS server as io activity. However when rewriting on a linux client there are no reads while rewriting. Has anyone encountered such problems? To replicate the issue is is possidle to run fio twice with the same configuration to achieve rewriting or to run iozone. Both tools return similar outputs. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:d720b14ac4f99022ef331403ba1fb9d89b0f0d64.camel at mcomputers.cz-0] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: mcomputers_podpis_2024.png
URL: 

From Renar.Grunenberg at huk-coburg.de  Mon Mar 11 17:04:57 2024
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Mon, 11 Mar 2024 17:04:57 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
In-Reply-To: <3952cd1487255f84236f987ae5c6aba7d24c256c.camel@mcomputers.cz>
References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de> <3952cd1487255f84236f987ae5c6aba7d24c256c.camel@mcomputers.cz>
Message-ID: 

Hallo Peter,
it's difficult to give the right recommendation here; what is relevant is the I/O size, the current I/O pattern, the current configuration of your daemons, and so on. Best is to open a performance ticket and work on this. There is a presentation from the user group that hopefully sheds some light on the parameters mentioned. You can find it here:
https://www.spectrumscaleug.org/wp-content/uploads/2020/09/004-spectrum-scale-performance-update.pdf

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Helen Reck, Dr. Jörg Rheinländer, Thomas Sehn, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: gpfsug-discuss Im Auftrag von Peter Hruška
Gesendet: Montag, 11. März 2024 16:14
An: gpfsug-discuss at gpfsug.org
Betreff: Re: [gpfsug-discuss] Slow performance on writes when using direct io

Hello Renar,
Thank you for the suggestion. I tried the configuration changes. They however do not seem to have any effect - the performance seems identical. I also tried all 4 combinations. I checked the documentation on the parameter "dioSmallSeqWriteBatching" and it states that small IOs are considered under 64kb. I've also found a parameter "dioSmallSeqWriteThreshold" but my GPFS version (5.1.9.0) claims that it's an unknown attribute.

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:image001.png at 01DA73DD.B7502320]

On Mon, 2024-03-11 at 13:56 +0000, Grunenberg, Renar wrote:
Hallo Peter, my to cents to this. Set the diasbleDIO=yes Parameter to DEFAULT and use the !
dioSmallSeqWriteBatching 1 disableDIO 0 And give it a try. Regards Renar Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. Helen Reck, Dr. J?rg Rheinl?nder, Thomas Sehn, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss >Im Auftrag von Peter Hru?ka Gesendet: Montag, 11. M?rz 2024 14:22 An: gpfsug-discuss at gpfsug.org Betreff: [gpfsug-discuss] Slow performance on writes when using direct io Hello, We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:image001.png at 01DA73DD.B7502320] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 13955 bytes Desc: image001.png URL: From jonathan.buzzard at strath.ac.uk Mon Mar 11 17:35:12 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 11 Mar 2024 17:35:12 +0000 Subject: [gpfsug-discuss] Rewriting existing files is incredibly slow In-Reply-To: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz> References: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz> Message-ID: On 11/03/2024 16:43, Peter Hru?ka wrote: > Hello, > > We've encountered yet another performance flaw. We have a GPFS > filesystem mounted using GPFS binaries on Windows. When we try to > rewrite a file on the GPFS filesystem rewriting speed is much slower > than writing to a new file. The difference ratio we measured is about > 3.5 times. 
From the task manager it is visible that there is excessive > amount of reading from the network when rewriting. This is even visible > on the NDS server as io activity. However when rewriting on a linux > client there are no reads while rewriting. Has anyone encountered such > problems? > To replicate the issue is is possidle to run fio twice with the same > configuration to achieve rewriting or to run iozone. Both tools return > similar outputs. Kind of yes. What are you using to "rewrite" the file? What we saw initially over Samba was certain Microsoft applications when rewriting a file had truly abysmal performance. The same application when saving the same document to a new file and the performance was as expected. After much digging into it the cause (it was weeks of person effort) it was determined to be down to the application writing the file *one* byte at a time. Basically some idiot C++ developer at Microsoft decided to ignore the C++ library because it has "bugs" and write their own formatted output routines. It was not noticeable saving to a local disk, but the instant you tried saving to a network drive the performance was truly awful. Basically the increased latency of single character IO was the issue. Note that the issue was not confined to Samba and GPFS, as we verified the same abysmal performance with a Windows 2008 R2 server running on File and Print Sharing on NTFS on bare metal hardware. It was also not confined to just Windows the same awful performance happened on Macs too. In fact that is where it first came to light. Might be the cause of your problem, might not. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From salvet at ics.muni.cz Tue Mar 12 08:59:14 2024 From: salvet at ics.muni.cz (Zdenek Salvet) Date: Tue, 12 Mar 2024 09:59:14 +0100 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> Message-ID: <20240312085914.GM10934@horn.ics.muni.cz> On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hru?ka wrote: > We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? Hello, did you use pre-allocated file(s) (was it re-write) ? libaio traffic is not really asynchronous with respect to necessary metadata operations (allocating new space and writing allocation structures to disk) in most Linux filesystems and I guess this case is not heavily optimized in GPFS either (dioSmallSeqWriteBatching feature may help a little but it targets different scenario I think). Best regards, Zdenek Salvet salvet at ics.muni.cz Institute of Computer Science of Masaryk University, Brno, Czech Republic and CESNET, z.s.p.o., Prague, Czech Republic Phone: ++420-549 49 6534 Fax: ++420-541 212 747 ---------------------------------------------------------------------------- Teamwork is essential -- it allows you to blame someone else. 
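Peter's reproduction and Zdenek's question can be combined into one small experiment: lay the file out completely first, then re-run the identical job with direct=1 so that every direct write lands in already-allocated blocks. A minimal fio job along these lines - the directory, file name, size and queue depth are illustrative placeholders, not values from this thread:

  [gpfs-dio-rewrite]
  directory=/gpfs/fs1/fiotest
  filename=dio.testfile
  size=10g
  bs=1m
  rw=write
  ioengine=libaio
  iodepth=16
  direct=1
  overwrite=1

With overwrite=1 fio lays the file out before the measured write phase; alternatively run the job twice, or do a first pass with direct=0. If the pass over the already-allocated file is noticeably faster, the penalty is dominated by block allocation and the switching in and out of the optimized direct-I/O path rather than by the direct writes themselves.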
From Peter.Hruska at mcomputers.cz Tue Mar 12 10:59:34 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Tue, 12 Mar 2024 10:59:34 +0000 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <20240312085914.GM10934@horn.ics.muni.cz> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <20240312085914.GM10934@horn.ics.muni.cz> Message-ID: <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz> Hello, The direct writes are problematic on both writes and rewrites. Rewrites alone are another issue we have noticed. Since indirect (direct=0) workloads are fine, it seems that the easiest solution could be to force indirect IO operations for all workloads. However we didn't find such possibility. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:28fa35781b86a55c01b26ed5221a254b716e5f82.camel at mcomputers.cz-0] On Tue, 2024-03-12 at 09:59 +0100, Zdenek Salvet wrote: EXTERN? ODES?LATEL On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hru?ka wrote: We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? Hello, did you use pre-allocated file(s) (was it re-write) ? libaio traffic is not really asynchronous with respect to necessary metadata operations (allocating new space and writing allocation structures to disk) in most Linux filesystems and I guess this case is not heavily optimized in GPFS either (dioSmallSeqWriteBatching feature may help a little but it targets different scenario I think). Best regards, Zdenek Salvet salvet at ics.muni.cz Institute of Computer Science of Masaryk University, Brno, Czech Republic and CESNET, z.s.p.o., Prague, Czech Republic Phone: ++420-549 49 6534 Fax: ++420-541 212 747 ---------------------------------------------------------------------------- Teamwork is essential -- it allows you to blame someone else. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: From uwe.falke at kit.edu Tue Mar 12 11:21:49 2024 From: uwe.falke at kit.edu (Uwe Falke) Date: Tue, 12 Mar 2024 12:21:49 +0100 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <20240312085914.GM10934@horn.ics.muni.cz> <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz> Message-ID: Just thinking: an application should do direct IO for a good reason, and only then. 
"Forcing DIO" is probably not the right thing to do - rather check why an app does DIO and either change the app's behaviour if reasonable are maybe use a special? pool for it using mirrored SSDs or so. BTW, the ESS have some nice mechanism to do small IOs (also direct ones I suppose) quickly by buffering them on flash/NVRAM (where the data is considered persistently stored, hence the IO requests are completed quickly). Uwe On 12.03.24 11:59, Peter Hru?ka wrote: > Hello, > > The direct writes are problematic on both writes and rewrites. > Rewrites alone are another issue we have noticed. > Since indirect (direct=0) workloads are fine, it seems that the > easiest solution could be to force indirect IO operations for all > workloads. However we didn't find such possibility. > > -- > S p??n?m p?kn?ho dne / Best regards > > *Mgr. Peter Hru?ka* > IT specialista > > *M Computers s.r.o.* > ?lehlova 3100/10, 628 00 Brno-L??e? (mapa ) > T:+420 515 538 136 > E: peter.hruska at mcomputers.cz > > www.mcomputers.cz > www.lenovoshop.cz > > > > On Tue, 2024-03-12 at 09:59 +0100, Zdenek Salvet wrote: >> EXTERN? ODES?LATEL >> >> >> On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hru?ka wrote: >>> We encountered a problem with performance of writes on GPFS when the >>> application uses direct io access. To simulate the issue it is >>> enough to run fio with option direct=1. The performance drop is >>> quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct >>> GPFS to ignore direct IO by using "disableDIO=yes". The directive >>> didn't have any effect. Is there any possibility how to achieve that >>> GPFS would ignore direct IO requests and use caching for everything? >> >> Hello, >> did you use pre-allocated file(s) (was it re-write) ? >> libaio traffic is not really asynchronous with respect to necessary >> metadata >> operations (allocating new space and writing allocation structures to >> disk) >> in most Linux filesystems and I guess this case is not heavily optimized >> in GPFS either (dioSmallSeqWriteBatching feature may help a little but >> it targets different scenario I think). >> >> Best regards, >> Zdenek Salvet salvet at ics.muni.cz >> Institute of Computer Science of Masaryk University, Brno, Czech Republic >> and CESNET, z.s.p.o., Prague, Czech Republic >> Phone: ++420-549 49 6534?????????????????????????? Fax: ++420-541 212 747 >> ---------------------------------------------------------------------------- >> ????? Teamwork is essential -- it allows you to blame someone else. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -- Karlsruhe Institute of Technology (KIT) Scientific Computing Centre (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5814 bytes
Desc: S/MIME Cryptographic Signature
URL: 

From Peter.Hruska at mcomputers.cz  Tue Mar 12 12:03:46 2024
From: Peter.Hruska at mcomputers.cz (Peter Hruška)
Date: Tue, 12 Mar 2024 12:03:46 +0000
Subject: [gpfsug-discuss] Rewriting existing files is incredibly slow
In-Reply-To: 
References: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz>
Message-ID: <911aa0dc39fe50dd5f3de2038cc6239f294f1a6a.camel@mcomputers.cz>

Hello,
For the ease of testing and reproduction we use the fio and iozone benchmarking tools. The issue you describe looks pretty similar. However, it occurs using the GPFS mount only. Here are some results from our testing environment:

Linux NSD server:  writes 3472 MiB/s, re-writes 3476 MiB/s
Windows GPFS:      writes 2487 MiB/s, re-writes  250 MiB/s
Windows Samba:     writes  997 MiB/s, re-writes 1269 MiB/s

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:a57a2d9ed2ea460d8bb2abefe678c4fab3cdaeeb.camel at mcomputers.cz-0]

On Mon, 2024-03-11 at 17:35 +0000, Jonathan Buzzard wrote:
On 11/03/2024 16:43, Peter Hruška wrote:

Hello,

We've encountered yet another performance flaw. We have a GPFS filesystem mounted using GPFS binaries on Windows. When we try to rewrite a file on the GPFS filesystem rewriting speed is much slower than writing to a new file. The difference ratio we measured is about 3.5 times. From the task manager it is visible that there is an excessive amount of reading from the network when rewriting. This is even visible on the NSD server as io activity. However when rewriting on a linux client there are no reads while rewriting. Has anyone encountered such problems?
To replicate the issue it is possible to run fio twice with the same configuration to achieve rewriting, or to run iozone. Both tools return similar outputs.

Kind of yes. What are you using to "rewrite" the file?

What we saw initially over Samba was certain Microsoft applications when rewriting a file had truly abysmal performance. The same application when saving the same document to a new file and the performance was as expected.

After much digging into it the cause (it was weeks of person effort) it was determined to be down to the application writing the file *one* byte at a time. Basically some idiot C++ developer at Microsoft decided to ignore the C++ library because it has "bugs" and write their own formatted output routines.

It was not noticeable saving to a local disk, but the instant you tried saving to a network drive the performance was truly awful. Basically the increased latency of single character IO was the issue.

Note that the issue was not confined to Samba and GPFS, as we verified the same abysmal performance with a Windows 2008 R2 server running on File and Print Sharing on NTFS on bare metal hardware. It was also not confined to just Windows the same awful performance happened on Macs too. In fact that is where it first came to light.

Might be the cause of your problem, might not.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: mcomputers_podpis_2024.png
URL: 

From olaf.weiser at de.ibm.com  Tue Mar 12 12:12:24 2024
From: olaf.weiser at de.ibm.com (Olaf Weiser)
Date: Tue, 12 Mar 2024 12:12:24 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
In-Reply-To: 
References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <20240312085914.GM10934@horn.ics.muni.cz> <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz>
Message-ID: 

Just for completeness, let's use this thread to document an undocumented parameter which should not be used any more: disableDIO=yes is NOT "disabling DIO" - the parameter is a bit misleading by its name.

Direct I/O is a hint from an application to bypass the cache. However, application programmers mostly expect the I/O to be safely on disk when it gets acknowledged. To make that behaviour absolutely sure there is the O_SYNC flag for writes; still, it is pretty common to expect O_DIRECT to act as a synonym. More details here: https://man7.org/linux/man-pages/man2/open.2.html

GPFS handles O_DIRECT in a way that follows this expectation. For direct I/O (O_DIRECT, bypassing the caches) GPFS can use a so-called optimized I/O path, which takes advantage of Linux AIO. However, there are multiple situations where you cannot write directly, e.g. if the I/O is not aligned, or if you append a file and there is no block yet (no disk address allocated to that part of the file). In those cases the I/O cannot be processed in the optimized direct-I/O path. (Some other aspects: for data that is accessed without caching, prefetching is disabled, and, more relevant here, you need different - more efficient - tokens than for buffered writes.)

Let's say an application appends to a file (similar to creating a file and writing it). Usually direct I/Os are rather small. As long as you write small I/Os into an existing block, direct I/O is fine - but once you fill the last block completely, a new block has to be allocated. You cannot write directly to nowhere, so GPFS has to allocate a new block (as any other file system would), which means leaving the optimized direct-I/O path and allocating the corresponding buffers first. When this is done, the data from the "direct" I/O is finally synced before the I/O is acknowledged, to follow the semantics expected by the application programmer who used O_DIRECT.

For workloads that mostly create, write and append files this happens on every block - so, frequently - causing GPFS to keep switching in and out of the optimized I/O path, which also means changing tokens. To avoid that, a very long time ago disableDIO was introduced as a quick and efficient workaround. In the meantime there is a heuristic in our code that automatically detects such cases, so this parameter should NOT be used any more.

Since GPFS 5.0.x we introduced dioSmallSeqWriteBatching=yes. PLEASE use this parameter.
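As an illustration only (not taken from the thread), the parameter can be set and verified with the usual configuration commands; whether a given release accepts the change online should be checked against the documentation:

  mmchconfig dioSmallSeqWriteBatching=yes
  mmlsconfig dioSmallSeqWriteBatching
  mmdiag --config | grep -i dioSmallSeqWrite    # value the running daemon uses on this node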
By default, the optimization kicks in when we see three AIO/DIO writes that are no larger than 64 KB each and no more than one write length apart. If you know that your application does larger DIO writes, let us know - open a SF ticket, there are further options.

Back to the original question - you may consider:
- preallocate blocks to the file(s)
- double check "active snapshots" (copy-on-write for direct IO is very expensive)
- adjust your block size / RAID config to lower write amplification
- check network RTT for token traffic
- try to avoid an HDD-based backend, as its number of IOPS is very limited
(a sketch of the matching commands follows below, after the quoted thread)

Last but not least - talk to your application programmers and ask whether they really need what they programmed.
________________________________
Von: gpfsug-discuss im Auftrag von Uwe Falke
Gesendet: Dienstag, 12. März 2024 12:21
An: gpfsug-discuss at gpfsug.org
Betreff: [EXTERNAL] Re: [gpfsug-discuss] Slow performance on writes when using direct io

Just thinking: an application should do direct IO for a good reason, and only then. "Forcing DIO" is probably not the right thing to do - rather check why an app does DIO and either change the app's behaviour if reasonable or maybe use a special pool for it using mirrored SSDs or so.

BTW, the ESS have some nice mechanism to do small IOs (also direct ones I suppose) quickly by buffering them on flash/NVRAM (where the data is considered persistently stored, hence the IO requests are completed quickly).

Uwe

On 12.03.24 11:59, Peter Hruška wrote:
Hello,
The direct writes are problematic on both writes and rewrites. Rewrites alone are another issue we have noticed. Since indirect (direct=0) workloads are fine, it seems that the easiest solution could be to force indirect IO operations for all workloads. However we didn't find such possibility.

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:04ece728-7e44-4ab2-b1c6-1928bb3c29ee]

On Tue, 2024-03-12 at 09:59 +0100, Zdenek Salvet wrote:
On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hruška wrote:
We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything?

Hello,
did you use pre-allocated file(s) (was it re-write) ? libaio traffic is not really asynchronous with respect to necessary metadata operations (allocating new space and writing allocation structures to disk) in most Linux filesystems and I guess this case is not heavily optimized in GPFS either (dioSmallSeqWriteBatching feature may help a little but it targets different scenario I think).

Best regards,
Zdenek Salvet                                              salvet at ics.muni.cz
Institute of Computer Science of Masaryk University, Brno, Czech Republic
and CESNET, z.s.p.o., Prague, Czech Republic
Phone: ++420-549 49 6534                           Fax: ++420-541 212 747
----------------------------------------------------------------------------
      Teamwork is essential -- it allows you to blame someone else.
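A sketch of the commands behind that checklist, with fs1 used as a placeholder file system name; whether fallocate-style preallocation is honoured depends on the file system and release, so treat the last line as an assumption to verify:

  mmlssnapshot fs1                          # any active snapshots forcing copy-on-write?
  mmlsfs fs1 -B                             # file system block size
  mmdiag --iohist                           # recent I/O sizes and latencies on this node
  mmdiag --network                          # round-trip times to the other cluster nodes
  fallocate -l 10G /gpfs/fs1/dio.testfile   # preallocate the target file before the DIO run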
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

--
Karlsruhe Institute of Technology (KIT)
Scientific Computing Centre (SCC)
Scientific Data Management (SDM)

Uwe Falke
Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
D-76344 Eggenstein-Leopoldshafen
Tel: +49 721 608 28024
Email: uwe.falke at kit.edu
www.scc.kit.edu

Registered office: Kaiserstraße 12, 76131 Karlsruhe, Germany
KIT - The Research University in the Helmholtz Association
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Outlook-fvfsez5b.png
Type: image/png
Size: 13955 bytes
Desc: Outlook-fvfsez5b.png
URL: 

From p.ward at nhm.ac.uk  Tue Mar 26 15:28:37 2024
From: p.ward at nhm.ac.uk (Paul Ward)
Date: Tue, 26 Mar 2024 15:28:37 +0000
Subject: [gpfsug-discuss] TCT - anyone else using?
Message-ID: 

We've just added AWS as a second cloud container pool to our on-premises COS. Apparently, we're the first to do this!
I'd like to hear from other people using TCT, especially if you have more than one destination.

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk
[cid:image001.png at 01DA7F92.4567E340]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 12974 bytes
Desc: image001.png
URL: 

From vladimir.sapunenko at cnaf.infn.it  Wed Mar 27 14:58:35 2024
From: vladimir.sapunenko at cnaf.infn.it (Vladimir Sapunenko)
Date: Wed, 27 Mar 2024 15:58:35 +0100
Subject: [gpfsug-discuss] occupancy percentage for a pool
Message-ID: 

Hello,
Since the early days of GPFS there was a limitation that only integer values could be used in the LIMIT directive of placement policy rules. That now seems outdated, and the GPFS documentation says: "You can specify OccupancyPercentage as a floating point number, as in the following example:

RULE 'r' RESTORE to pool 'x' limit(8.9e1)"

However, 8.9e1 = 89, and I'm wondering whether it is acceptable to define a limit of 99.5% in a file placement policy like this:

RULE 'DATA3' SET POOL 'data2' LIMIT(99.5)

mmchpolicy does not complain about such values. My doubt is whether GPFS really uses fractions of a percent in the occupancy calculation. Is anybody using such limits? (0.5% in my case makes a difference because it corresponds to about 60TB.)

Thanks,
Vladimir
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vladimir_sapunenko.vcf
Type: text/vcard
Size: 235 bytes
Desc: not available
URL:
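As a closing illustration for the policy question (not part of the original message - fs1, data1 and data2 are placeholder names, and the fallback rule is an assumption), a fractional LIMIT can at least be validated and its effect observed like this:

  cat > placement.pol <<'EOF'
  RULE 'DATA3' SET POOL 'data2' LIMIT(99.5)
  RULE 'DEFAULT' SET POOL 'data1'
  EOF
  mmchpolicy fs1 placement.pol -I test    # parse and validate the rules without installing them
  mmchpolicy fs1 placement.pol            # install the placement policy
  mmlspolicy fs1 -L                       # show the installed rules
  mmdf fs1 -P data2                       # watch the occupancy of the target pool

Whether the occupancy test really honours the extra 0.5% or rounds to whole percent is exactly the open question above, so the behaviour is best confirmed on a test file system before relying on it for ~60TB of headroom.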