From naren.rajasingam at systemethix.com.au Fri Mar 1 23:20:35 2024
From: naren.rajasingam at systemethix.com.au (Naren Rajasingam)
Date: Fri, 1 Mar 2024 23:20:35 +0000
Subject: [gpfsug-discuss] Greetings
Message-ID:

It's been a while since I was last part of this users group. My name is Naren Rajasingam and I have been working with GPFS/Spectrum Scale/Storage Scale since 2010. I am formerly from IBM (I left the company in 2015 after 16 years there) and now work for Systemethix as a senior technical consultant specialising in Scale.

Cheers, Kind Regards,
-Naren

Naren Rajasingam
Senior Technology Consultant
Mobile: +61 (0)419 513 189
Email: naren.rajasingam at systemethix.com.au
Systemethix Australia
www.systemethix.com.au
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From TROPPENS at de.ibm.com Mon Mar 4 22:51:10 2024
From: TROPPENS at de.ibm.com (Ulf Troppens)
Date: Mon, 4 Mar 2024 22:51:10 +0000
Subject: [gpfsug-discuss] storage-scale-object - summary
In-Reply-To:
References:
Message-ID:

Hi List,
more details can be shared under NDA. Please ask your IBM sales rep to contact me. Or meet me at the German User meeting this Wednesday.
Best, Ulf

Ulf Troppens
Product Manager - IBM Storage for Data and AI, Data-Intensive Workflows
IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Wolfgang Wendt / Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

From: gpfsug-discuss On Behalf Of Alexander Saupp
Sent: Thursday, 29 February 2024 08:28
To: gpfsug-discuss at gpfsug.org
Subject: [EXTERNAL] [gpfsug-discuss] storage-scale-object - summary

Hi all,
there were a couple of very good concerns, questions and statements already. I'd like to summarize my very personal understanding - not necessarily officially speaking for my employer IBM. Be invited to reach out to IBM Client Engineering (a presales investment by IBM) if you have a need to discuss, demo or evaluate in connection to an active Opportunity (I know, but that's the rules of engagement).
* Swift S3 is complex to maintain, as said by Christoph Martin. IBM supports multiple stacks for multiple products. That, along with currency, is the main reason to move away.
* For alternatives a focus on "unified access via file and S3" was set, so the feature set required is something (for you as a customer) to evaluate.
* You can find references on our future architecture publicly available; it is based on the same noobaa stack that is used in Red Hat ODF MCG. I would like to recommend the following blog of my IBM CE peer Nils Haustein:
https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2024/02/21/s3-tiering-to-tape-with-noobaa-part-1-introduction
https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2024/02/21/s3-tiering-to-tape-with-noobaa-part-2-deployment
https://community.ibm.com/community/user/storage/blogs/nils-haustein1/2024/02/26/s3-tiering-to-tape-with-noobaa-part-3-tiering
* Some might consider release timing, but here is the plan, thanks for already outlining, Renar!
https://www.ibm.com/docs/en/storage-scale/5.1.9?topic=summary-changes * IBM Storage Scale 5.1.8 is the last release that has CES Swift Object protocol. * IBM Storage Scale 5.1.9 [EUS release] will tolerate the update of a CES node from IBM Storage Scale 5.1.8 - so if you have it, you can keep it * I?m expecting TechPreview and GA within the next two releases ? technical details as per above blog. I hope this helps to clarify IBM?s plan of record. I?d like to reinvite you to reach out to IBM (via IBM Sales / directly) if you?d like to follow-up. Mit freundlichen Gr??en / Kind regards Alexander Saupp Senior Technology Engineer | IBM Client Engineering EMEA | Software Defined Storage +49 172 725 1072 Mobile alexander.saupp at de.ibm.com IBM Data Privacy Statement IBM Deutschland GmbH Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Wolfgang Wendt (Vorsitzender), Dr. Andreas Buchelt, Dr. Frank Kohls, Christine Rupp Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.Hruska at mcomputers.cz Fri Mar 8 13:02:56 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Fri, 8 Mar 2024 13:02:56 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients Message-ID: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> Hello, I would like to ask if there is any possible way to unify access IDs for Samba share and GPFS client on Windows. The model situation looks like this - we have a GPFS cluster with one filesystem. On the filesystem we have a Samba shared directory. MMauth is configured to Active directory for file access and works. The Windows machine is also part of the GPFS cluster and therefore it is able to mount the filesystem "directly" by mmmount. However the user/group IDs that are used by this access method are not consistent with the IDs used by Samba and access to the same data is not working well. Is there is any solution to this situation? I tried to study the documentation but I didn't find a clear information whether this is or isn't possible but I also didn't find a way to unify the access. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:d08ab7cf0ea126f3f6cd0af2c9a3127280680285.camel at mcomputers.cz-0] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: From cabrillo at ifca.unican.es Fri Mar 8 14:39:50 2024 From: cabrillo at ifca.unican.es (Iban Cabrillo) Date: Fri, 8 Mar 2024 14:39:50 +0000 (UTC) Subject: [gpfsug-discuss] pagepool Message-ID: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. 
We are noticing that some jobs are dying with "Memory cgroup out of memory: Killed process XXX", and my doubt is whether this pagepool is reserving too much memory for the mmfs process to the detriment of the execution of jobs.
Any advice is welcomed,
Regards, I
--
================================================================
Ibán Cabrillo Bartolomé
Instituto de Física de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mjarsulic at bsd.uchicago.edu Fri Mar 8 14:50:02 2024
From: mjarsulic at bsd.uchicago.edu (Jarsulic, Michael [BSD])
Date: Fri, 8 Mar 2024 14:50:02 +0000
Subject: [gpfsug-discuss] pagepool
In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
Message-ID:

Ibán,

What are you using for your scheduler?

On my compute nodes, I am setting the pagepool to 16 GB and setting aside specialized memory for GPFS that will not be allocated to jobs.

--
Mike Jarsulic
Associate Director, Scientific Computing
Center for Research Informatics | Biological Sciences Division
University of Chicago
5454 South Shore Drive, Chicago, IL 60615 | (773) 702-2066

From: gpfsug-discuss on behalf of Iban Cabrillo
Date: Friday, March 8, 2024 at 8:44 AM
To: gpfsug-discuss
Subject: [gpfsug-discuss] pagepool

Good afternoon,
We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value:
pagepool 323908133683
But not only in the DSS servers, but also in the rest of the HPC nodes, and I don't know if it is an excessive value.
We are noticing that some jobs are dying with "Memory cgroup out of memory: Killed process XXX", and my doubt is whether this pagepool is reserving too much memory for the mmfs process to the detriment of the execution of jobs.
Any advice is welcomed,
Regards, I
--
================================================================
Ibán Cabrillo Bartolomé
Instituto de Física de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
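A quick way to sanity-check what a client node is actually reserving before changing anything - a generic sketch run on one compute node, not specific to any particular DSS release:

    # Show the pagepool value that applies to this node (and any nodeclass overrides)
    mmlsconfig pagepool

    # Show how much memory mmfsd has actually allocated, including the page pool
    mmdiag --memory

    # Cross-check against the resident size of the GPFS daemon itself
    ps -o rss,cmd -C mmfsd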
From cabrillo at ifca.unican.es Fri Mar 8 15:14:37 2024
From: cabrillo at ifca.unican.es (Iban Cabrillo)
Date: Fri, 8 Mar 2024 15:14:37 +0000 (UTC)
Subject: Re: [gpfsug-discuss] pagepool
In-Reply-To:
References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
Message-ID: <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es>

Hi Mike,
  Slurm 23.02.7-1.el9
Regards, I
--
================================================================
Ibán Cabrillo Bartolomé
Instituto de Física de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
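Putting together the advice given in the replies further down the thread (the mmcrnodeclass/mmchconfig commands and the MemSpecLimit setting are quoted from those replies; the node names and sizes here are only placeholders), the usual pattern looks roughly like this:

    # GPFS side: group the compute nodes and give them a smaller page pool
    mmcrnodeclass compute -N node001,node002
    mmchconfig pagepool=16G -i -N compute

    # Slurm side (slurm.conf): keep the GPFS and OS memory out of what jobs can request
    NodeName=node[001-002] RealMemory=128000 MemSpecLimit=20000

Note that a pagepool change normally needs the GPFS daemon on the affected nodes to be restarted before it takes full effect.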
From jonathan.buzzard at strath.ac.uk Fri Mar 8 15:50:19 2024
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Fri, 8 Mar 2024 15:50:19 +0000
Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients
In-Reply-To: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz>
References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz>
Message-ID:

On 08/03/2024 13:02, Peter Hruška wrote:
> Hello,
>
> I would like to ask if there is any possible way to unify access IDs for
> Samba share and GPFS client on Windows.
>
> The model situation looks like this - we have a GPFS cluster with one
> filesystem. On the filesystem we have a Samba shared directory. MMauth
> is configured to Active directory for file access and works.
> The Windows machine is also part of the GPFS cluster and therefore it is
> able to mount the filesystem "directly" by mmmount. However the
> user/group IDs that are used by this access method are not consistent
> with the IDs used by Samba and access to the same data is not working well.
> Is there is any solution to this situation? I tried to study the
> documentation but I didn't find a clear information whether this is or
> isn't possible but I also didn't find a way to unify the access.
>

The obvious question is to ask if your Active Directory has its RFC 2307bis fields populated?

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG From mjarsulic at bsd.uchicago.edu Fri Mar 8 16:01:30 2024 From: mjarsulic at bsd.uchicago.edu (Jarsulic, Michael [BSD]) Date: Fri, 8 Mar 2024 16:01:30 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> Message-ID: Ib?n, What I did in my slurm.conf is use the MemSpecLimit option. For a node with 128 GB and 16 GB GPFS pagepool, I set the memspec like this: NodeName=cri22cn[001-156] CPUs=32 RealMemory=128000 MemSpecLimit=20000 -- Mike Jarsulic Associate Director, Scientific Computing Center for Research Informatics | Biological Sciences Division University of Chicago 5454 South Shore Drive, Chicago, IL 60615 | (773) 702-2066 From: gpfsug-discuss on behalf of Iban Cabrillo Date: Friday, March 8, 2024 at 9:19?AM To: gpfsug main discussion list Cc: gpfsug-discuss Subject: Re: [gpfsug-discuss] pagepool HI Mike, Slurm 23.02.7-1.el9 Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://urldefense.com/v3/__https://confluence.ifca.es/display/IC/Information*Security*Policy*for*External*Suppliers__;KysrKys!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZwsd0o3w$ ========================================================================================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZCv9-RzU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mjarsulic at bsd.uchicago.edu Fri Mar 8 16:01:30 2024 From: mjarsulic at bsd.uchicago.edu (Jarsulic, Michael [BSD]) Date: Fri, 8 Mar 2024 16:01:30 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> <1760522486.6628375.1709910877004.JavaMail.zimbra@ifca.unican.es> Message-ID: Ib?n, What I did in my slurm.conf is use the MemSpecLimit option. For a node with 128 GB and 16 GB GPFS pagepool, I set the memspec like this: NodeName=cri22cn[001-156] CPUs=32 RealMemory=128000 MemSpecLimit=20000 -- Mike Jarsulic Associate Director, Scientific Computing Center for Research Informatics | Biological Sciences Division University of Chicago 5454 South Shore Drive, Chicago, IL 60615 | (773) 702-2066 From: gpfsug-discuss on behalf of Iban Cabrillo Date: Friday, March 8, 2024 at 9:19?AM To: gpfsug main discussion list Cc: gpfsug-discuss Subject: Re: [gpfsug-discuss] pagepool HI Mike, Slurm 23.02.7-1.el9 Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? 
Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://urldefense.com/v3/__https://confluence.ifca.es/display/IC/Information*Security*Policy*for*External*Suppliers__;KysrKys!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZwsd0o3w$ ========================================================================================== _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!MvNZe7V6M35iZPhbgng-hfU!xuWKdbbUROCKcfINm9E-WYHhYly8NscrBz7y_8d1oaPKZScUu2x13tMmr3irlIdsPoN7qWk_fscBL4Do79Xh1AeZCv9-RzU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Peter.Hruska at mcomputers.cz Fri Mar 8 16:08:35 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Fri, 8 Mar 2024 16:08:35 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients In-Reply-To: References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> Message-ID: Hello Jonathan, Thank you for the answer. Since I used Automatic ID-mapping method for the mmauth deployment I didn't do anything regarding RFC2307. I chose this approach because we don't want to use kerberos for NFS authentication (although we will use NFS for separate data access). I'll check on that. If you have any hints I would appreciate them. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:c95a681f19a2dd92eac22c858b7c1d0dfa182335.camel at mcomputers.cz-0] On Fri, 2024-03-08 at 15:50 +0000, Jonathan Buzzard wrote: EXTERN? ODES?LATEL On 08/03/2024 13:02, Peter Hru?ka wrote: Hello, I would like to ask if there is any possible way to unify access IDs for Samba share and GPFS client on Windows. The model situation looks like this - we have a GPFS cluster with one filesystem. On the filesystem we have a Samba shared directory. MMauth is configured to Active directory for file access and works. The Windows machine is also part of the GPFS cluster and therefore it is able to mount the filesystem "directly" by mmmount. However the user/group IDs that are used by this access method are not consistent with the IDs used by Samba and access to the same data is not working well. Is there is any solution to this situation? I tried to study the documentation but I didn't find a clear information whether this is or isn't possible but I also didn't find a way to unify the access. The obvious question is to ask if your Active Directory has it's RFC 23037bis fields populated? JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: From jonathan.buzzard at strath.ac.uk Fri Mar 8 16:18:47 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 8 Mar 2024 16:18:47 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients In-Reply-To: References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> Message-ID: <6c2a7f94-fda1-4a4b-893a-0edd55fbda26@strath.ac.uk> On 08/03/2024 16:08, Peter Hru?ka wrote: > Hello Jonathan, > > Thank you for the answer. Since I used Automatic ID-mapping method for > the mmauth deployment I didn't do anything regarding RFC2307. > I chose this approach because we don't want to use kerberos for NFS > authentication (although we will use NFS for separate data access). > I'll check on that. If you have any hints I would appreciate them. > Consistent mapping won't work without RFC2307bis attributes being populated as far as I am aware. Windows knows nothing about the idmap_rid, it only knows about SID's Mixing NFS and Samba out the same file system or at the very least the same directory hierarchy is a mugs game. There in lies a gigantic pit of woe for all those foolish enough to try based on personal experience. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From jonathan.buzzard at strath.ac.uk Fri Mar 8 16:25:17 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Fri, 8 Mar 2024 16:25:17 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: On 08/03/2024 14:50, Jarsulic, Michael [BSD] wrote: > Ib?n, > > What are you using for your scheduler? > > On my compute nodes, I am setting the pagepool to 16 GB and setting > aside specialized memory for GPFS that will not be allocated to jobs. > What you would normally do is create a node class mmcrnodeclass compute -N node001,node002,node003,node004,..... then set the pagepool appropriately mmchconfig pagepool=16G -i -N compute We then use slurm to limit the maximum amount of RAM a job can have on a node to be physical RAM minus the pagepool size minus a bit more for good measure to allow for the OS. If the OOM is kicking in then you need to reduce the RAM limit in slurm some more till it stops. Note we also twiddle with some other limits for compute nodes mmchconfig maxFilesToCache=8000 -N compute mmchconfig maxStatCache=16000 -N compute We have a slew of node classes where these settings are tweaked to account for their RAM and their role so dssg, compute, gpu,protocol,teaching, and login. All nodes belong to one or more node classes. Which reminds me I need a gui node class now. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From scale at us.ibm.com Fri Mar 8 16:31:44 2024 From: scale at us.ibm.com (scale) Date: Fri, 8 Mar 2024 16:31:44 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: To the original question about the size of the pagepool being 323908133683 bytes, I think that should only be applied to the IO nodes (the ones handling all the NSD IOs) not every node in the cluster. 
The large pagepool size is needed on the IO nodes for GNR to function properly. From: gpfsug-discuss on behalf of Jonathan Buzzard Date: Friday, March 8, 2024 at 11:27?AM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] Re: [gpfsug-discuss] pagepool On 08/03/2024 14:50, Jarsulic, Michael [BSD] wrote: > Ib?n, > > What are you using for your scheduler? > > On my compute nodes, I am setting the pagepool to 16 GB and setting > aside specialized memory for GPFS that will not be allocated to jobs. > What you would normally do is create a node class mmcrnodeclass compute -N node001,node002,node003,node004,..... then set the pagepool appropriately mmchconfig pagepool=16G -i -N compute We then use slurm to limit the maximum amount of RAM a job can have on a node to be physical RAM minus the pagepool size minus a bit more for good measure to allow for the OS. If the OOM is kicking in then you need to reduce the RAM limit in slurm some more till it stops. Note we also twiddle with some other limits for compute nodes mmchconfig maxFilesToCache=8000 -N compute mmchconfig maxStatCache=16000 -N compute We have a slew of node classes where these settings are tweaked to account for their RAM and their role so dssg, compute, gpu,protocol,teaching, and login. All nodes belong to one or more node classes. Which reminds me I need a gui node class now. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Fri Mar 8 16:32:57 2024 From: ewahl at osc.edu (Wahl, Edward) Date: Fri, 8 Mar 2024 16:32:57 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: Yikes! Those must be some mighty large memory compute nodes! That is an OK setting for a large memory ESS/DSS server but NOT the compute nodes at my site, as that is in bytes. (so ~324 GB) Even on our 1TB+ memory machines we do not tune it that high. You can set pagepool for nodeclass machines such as all your compute, but pagepool is one of those settings where you will have to restart the clients for it to take effect. (such as most all the rdma settings, etc) You should look into creating a ?nodeclass? for each of your ?node types? if you have not already, so you can avoid OOM issues from just the pagepool, and tune other settings per node-type (rdma/network settings, etc) I would address this here, rather than on the Slurm side. Then you can address (total memory minus the pagepool) for the overall addressability to Slurm for user jobs. Leave some spare memory for the system itself or you will see more memory issues and whatnot when users get close to OOM, even in their cgroup. Example from a cross mounted compute-side cluster. Default is 1GB: [root at nostorage-manager1 ~]# mmlsconfig pagepool pagepool 1024M pagepool 4G [k8,pitzer] pagepool 64G [ascend] pagepool 16G [ib-spire-login,owenslogin,pitzerlogin] pagepool 48G [dm] pagepool 4G [cardinal] pagepool 64G [cardinal_quadport] example from the ESS/DSS server side. Later ESS versions set things by mmvdisk groups, rather than server type. 
# mmlsconfig pagepool pagepool 32G pagepool 358G [gss_ppc64] pagepool 16384M [ibmems11-hs,ems] pagepool 324383477760 [ess3200_mmvdisk_ibmessio13_hs_ibmessio14_hs,ess3200_mmvdisk_ibmessio15_hs_ibmessio16_hs,ess3200_mmvdisk_ibmessio17_hs_ibmessio18_hs] pagepool 64G [sp] pagepool 384399572992 [ibmgssio1_hsibmgssio2_hs,ibmgssio3_hsibmgssio4_hs,ibmgssio5_hsibmgssio6_hs] pagepool 573475966156 [ess5k_mmvdisk_ibmessio11_hs_ibmessio12_hs] pagepool 96G [ces] example of nodeclasses used to address other settings, such as what Infiniband port(s) to use. # mmlsconfig verbsports verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_2 [pitzer_dualport] verbsPorts mlx4_1/1 mlx4_1/2 [dm] verbsPorts mlx5_0 mlx5_2 [k8_dualport] verbsPorts mlx5_0 mlx5_1 mlx5_2 mlx5_3 [cardinal_quadport] Ed Wahl Ohio Supercomputer Center From: gpfsug-discuss On Behalf Of Iban Cabrillo Sent: Friday, March 8, 2024 9:40 AM To: gpfsug-discuss Subject: [gpfsug-discuss] pagepool Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs. Any advice is welcomed, Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers ========================================================================================== -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Mar 8 16:35:47 2024 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 8 Mar 2024 16:35:47 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: <7B6AC75B-43AF-4696-AAA9-0296D7211B20@rutgers.edu> What are the units on that ? is that 323GB? Zero chance you need it that high on clients. Just for perspective, our pagepool on our clients is 4GB and on the DSS-G, it is 242GB. I would suggest that you start with the settings in /opt/lenovo/dss/bin/dssClientConfig.sh (the settings themselves are in v5.worker.dssClientConfig in the same directory), if you have a brand new config and don?t have to worry about breaking your system with the wrong values (I have to be more careful with that as some of our values are higher than those defaults already). 
You just made me worry that perhaps I was still running with an out-of-date value there, but the default is still to raise the pagepool for clients to 4GB if you don?t specify otherwise.

What I was told by Lenovo years ago was that this is about the level where you start not to notice any difference when you go larger. You may want to test values for this for your workloads/see whether you fill it up if it?s set to that Lenovo default and then reconsider. You can change it for a single node with -N , if you want to test.

--
#BlackLivesMatter
____
|| \\UTGERS,    |---------------------------*O*---------------------------
||_// the State |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ | Office of Advanced Research Computing - MSB A555B, Newark
     `'

On Mar 8, 2024, at 09:39, Iban Cabrillo wrote:

Good afternoon,
We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value:
pagepool 323908133683
But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value.
We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs.
Any advice is welcomed,
Regards, I
--
================================================================
Ib?n Cabrillo Bartolom?
Instituto de F?sica de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From novosirj at rutgers.edu Fri Mar 8 16:50:25 2024
From: novosirj at rutgers.edu (Ryan Novosielski)
Date: Fri, 8 Mar 2024 16:50:25 +0000
Subject: Re: [gpfsug-discuss] pagepool
In-Reply-To:
References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es>
Message-ID:

Curious, if you could say something about how you ended up with some page pool values on your client side that are that high. For what use cases does 64GB, for example, make a difference?

--
#BlackLivesMatter
____
|| \\UTGERS,    |---------------------------*O*---------------------------
||_// the State |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr.
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Mar 8, 2024, at 11:32, Wahl, Edward wrote: Yikes! Those must be some mighty large memory compute nodes! That is an OK setting for a large memory ESS/DSS server but NOT the compute nodes at my site, as that is in bytes. (so ~324 GB) Even on our 1TB+ memory machines we do not tune it that high. You can set pagepool for nodeclass machines such as all your compute, but pagepool is one of those settings where you will have to restart the clients for it to take effect. (such as most all the rdma settings, etc) You should look into creating a ?nodeclass? for each of your ?node types? if you have not already, so you can avoid OOM issues from just the pagepool, and tune other settings per node-type (rdma/network settings, etc) I would address this here, rather than on the Slurm side. Then you can address (total memory minus the pagepool) for the overall addressability to Slurm for user jobs. Leave some spare memory for the system itself or you will see more memory issues and whatnot when users get close to OOM, even in their cgroup. Example from a cross mounted compute-side cluster. Default is 1GB: [root at nostorage-manager1 ~]# mmlsconfig pagepool pagepool 1024M pagepool 4G [k8,pitzer] pagepool 64G [ascend] pagepool 16G [ib-spire-login,owenslogin,pitzerlogin] pagepool 48G [dm] pagepool 4G [cardinal] pagepool 64G [cardinal_quadport] example from the ESS/DSS server side. Later ESS versions set things by mmvdisk groups, rather than server type. # mmlsconfig pagepool pagepool 32G pagepool 358G [gss_ppc64] pagepool 16384M [ibmems11-hs,ems] pagepool 324383477760 [ess3200_mmvdisk_ibmessio13_hs_ibmessio14_hs,ess3200_mmvdisk_ibmessio15_hs_ibmessio16_hs,ess3200_mmvdisk_ibmessio17_hs_ibmessio18_hs] pagepool 64G [sp] pagepool 384399572992 [ibmgssio1_hsibmgssio2_hs,ibmgssio3_hsibmgssio4_hs,ibmgssio5_hsibmgssio6_hs] pagepool 573475966156 [ess5k_mmvdisk_ibmessio11_hs_ibmessio12_hs] pagepool 96G [ces] example of nodeclasses used to address other settings, such as what Infiniband port(s) to use. # mmlsconfig verbsports verbsPorts mlx5_0 verbsPorts mlx5_0 mlx5_2 [pitzer_dualport] verbsPorts mlx4_1/1 mlx4_1/2 [dm] verbsPorts mlx5_0 mlx5_2 [k8_dualport] verbsPorts mlx5_0 mlx5_1 mlx5_2 mlx5_3 [cardinal_quadport] Ed Wahl Ohio Supercomputer Center From: gpfsug-discuss > On Behalf Of Iban Cabrillo Sent: Friday, March 8, 2024 9:40 AM To: gpfsug-discuss > Subject: [gpfsug-discuss] pagepool Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs. Any advice is welcomed, Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? 
Instituto de F?sica de Cantabria (IFCA-CSIC)
Santander, Spain
Tel: +34942200969/+34669930421
Responsible for advanced computing service (RSC)
=========================================================================================
=========================================================================================
All our suppliers must know and accept IFCA policy available at:
https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers
==========================================================================================
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From Achim.Rehor at de.ibm.com Fri Mar 8 16:57:30 2024 From: Achim.Rehor at de.ibm.com (Achim Rehor) Date: Fri, 8 Mar 2024 16:57:30 +0000 Subject: [gpfsug-discuss] pagepool In-Reply-To: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> Message-ID: <0650cb15204a93cccc1b7d4845bf2b75f93ab759.camel@de.ibm.com> we do ship a sample file under : /usr/lpp/mmfs/samples/gss with the gpfs.gnr rpm for both Servers and Clients, which should be taking care of some settings, including pagepool : gssClientConfig.sh gssServerConfig.sh These are described here : https://www.ibm.com/docs/en/storage-scale-system/6.0.2?topic=guide-client-node-tuning-recommendations -- Mit freundlichen Gr??en / Kind regards Achim Rehor Technical Support Specialist S?pectrum Scale and ESS (SME) Advisory Product Services Professional IBM Systems Storage Support - EMEA Achim.Rehor at de.ibm.com +49-170-4521194 IBM Deutschland GmbH Vorsitzender des Aufsichtsrats: Sebastian Krause Gesch?ftsf?hrung: Gregor Pillen (Vorsitzender), Nicole Reimer, Gabriele Schwarenthorer, Christine Rupp, Frank Theisen Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940 -----Original Message----- From: Iban Cabrillo > Reply-To: gpfsug main discussion list > To: gpfsug-discuss > Subject: [EXTERNAL] [gpfsug-discuss] pagepool Date: Fri, 08 Mar 2024 14:39:50 +0000 Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. Report Suspicious ZjQcmQRYFpfptBannerEnd Good afternoon, We are new to the DSS system configurations. Reviewing the configuration I have seen that the default pagepool is set to this value: pagepool 323908133683 But not only in the DSS servers, but also in the rest of the HPC nodes and I don't know if it is an excessive value. We are noticing that some jobs are dying by "Memory cgroup out of memory: Killed process XXX", and my doubt is if this pagepool is reserving too much memory for the mmfs process in decripento of the execution of jobs. Any advice is welcomed, Regards, I _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From cabrillo at ifca.unican.es Fri Mar 8 19:19:46 2024 From: cabrillo at ifca.unican.es (Iban Cabrillo) Date: Fri, 8 Mar 2024 19:19:46 +0000 (UTC) Subject: [gpfsug-discuss] pagepool In-Reply-To: <0650cb15204a93cccc1b7d4845bf2b75f93ab759.camel@de.ibm.com> References: <1642528025.6621892.1709908790701.JavaMail.zimbra@ifca.unican.es> <0650cb15204a93cccc1b7d4845bf2b75f93ab759.camel@de.ibm.com> Message-ID: <1257158574.6655748.1709925586705.JavaMail.zimbra@ifca.unican.es> HI Guys, Thanks a lot!! for all these usefull information Regards, I -- ================================================================ Ib?n Cabrillo Bartolom? 
Instituto de F?sica de Cantabria (IFCA-CSIC) Santander, Spain Tel: +34942200969/+34669930421 Responsible for advanced computing service (RSC) ========================================================================================= ========================================================================================= All our suppliers must know and accept IFCA policy available at: https://confluence.ifca.es/display/IC/Information+Security+Policy+for+External+Suppliers ========================================================================================== From sarah.walters at uq.edu.au Mon Mar 11 03:09:44 2024 From: sarah.walters at uq.edu.au (Sarah Walters) Date: Mon, 11 Mar 2024 03:09:44 +0000 Subject: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients In-Reply-To: <6c2a7f94-fda1-4a4b-893a-0edd55fbda26@strath.ac.uk> References: <13674aa9d727df5c6f23710a33d540e76640a540.camel@mcomputers.cz> <6c2a7f94-fda1-4a4b-893a-0edd55fbda26@strath.ac.uk> Message-ID: It works just fine at UQ, using an AFM cache. We have NFS-only at the 'home' but we have thousands of filesets coming out of NFS and SMB on our cache. Not, technically, a preferred configuration to have that many of them, but it's possible. Sarah Walters BCompSc Research Computing Systems Engineer Research Computing Centre The University of Queensland Brisbane QLD 4072 Australia E sarah.walters at uq.edu.au W www.rcc.uq.edu.au CRICOS code: 00025B The University of Queensland is embracing the Green Office philosophy. Please consider the environment before printing this email. This email (including any attached files) is intended solely for the addressee and may contain confidential information of The University of Queensland. If you are not the addressee, you are notified that any transmission, distribution, printing or photocopying of this email is prohibited. If you have received this email in error, please delete and notify me. Unless explicitly stated, the opinions expressed in this email do not represent the official position of The University of Queensland. ________________________________ From: gpfsug-discuss on behalf of Jonathan Buzzard Sent: Saturday, 9 March 2024 02:18 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Active direcotry based ACLs for Samba and Windows GPFS clients On 08/03/2024 16:08, Peter Hru?ka wrote: > Hello Jonathan, > > Thank you for the answer. Since I used Automatic ID-mapping method for > the mmauth deployment I didn't do anything regarding RFC2307. > I chose this approach because we don't want to use kerberos for NFS > authentication (although we will use NFS for separate data access). > I'll check on that. If you have any hints I would appreciate them. > Consistent mapping won't work without RFC2307bis attributes being populated as far as I am aware. Windows knows nothing about the idmap_rid, it only knows about SID's Mixing NFS and Samba out the same file system or at the very least the same directory hierarchy is a mugs game. There in lies a gigantic pit of woe for all those foolish enough to try based on personal experience. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From Peter.Hruska at mcomputers.cz  Mon Mar 11 13:21:32 2024
From: Peter.Hruska at mcomputers.cz (Peter Hruška)
Date: Mon, 11 Mar 2024 13:21:32 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
Message-ID: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz>

Hello,
We encountered a problem with write performance on GPFS when the application uses direct I/O. To simulate the issue it is enough to run fio with the option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct I/O by using "disableDIO=yes", but the directive didn't have any effect. Is there any way to make GPFS ignore direct I/O requests and use caching for everything?

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:0e66df54ea6e2d2372ddf5fb3417f35a416a893f.camel at mcomputers.cz-0]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: mcomputers_podpis_2024.png
URL: 

From Renar.Grunenberg at huk-coburg.de  Mon Mar 11 13:56:04 2024
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Mon, 11 Mar 2024 13:56:04 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
In-Reply-To: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz>
References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz>
Message-ID: <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de>

Hallo Peter,
my two cents on this: set the disableDIO parameter back to its default and use

dioSmallSeqWriteBatching 1
disableDIO 0

and give it a try.
Regards Renar

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Helen Reck, Dr. Jörg Rheinländer, Thomas Sehn, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: gpfsug-discuss Im Auftrag von Peter Hruška
Gesendet: Montag, 11.
M?rz 2024 14:22 An: gpfsug-discuss at gpfsug.org Betreff: [gpfsug-discuss] Slow performance on writes when using direct io Hello, We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:image001.png at 01DA73C4.13BF9D80] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 13955 bytes Desc: image001.png URL: From Peter.Hruska at mcomputers.cz Mon Mar 11 15:13:49 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Mon, 11 Mar 2024 15:13:49 +0000 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de> Message-ID: <3952cd1487255f84236f987ae5c6aba7d24c256c.camel@mcomputers.cz> Hello Renar, Thank you for the suggestion. I tried the configuration changes. They however do not seem to have any effect - the performance seems identical. I also tried all 4 combinations. I checked the documentation on the parameter "dioSmallSeqWriteBatching" and it states that small IOs are considered under 64kb. I've also found a parameter "dioSmallSeqWriteThreshold" but my GPFS version (5.1.9.0) claims that it's an unknown attribute. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:7960eefed7c420f97bb5d3ea3ccc383e18a57bcd.camel at mcomputers.cz-0] On Mon, 2024-03-11 at 13:56 +0000, Grunenberg, Renar wrote: EXTERN? ODES?LATEL Hallo Peter, my to cents to this. Set the diasbleDIO=yes Parameter to DEFAULT and use the ! dioSmallSeqWriteBatching 1 disableDIO 0 And give it a try. Regards Renar Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. Helen Reck, Dr. J?rg Rheinl?nder, Thomas Sehn, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. 
Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss Im Auftrag von Peter Hru?ka Gesendet: Montag, 11. M?rz 2024 14:22 An: gpfsug-discuss at gpfsug.org Betreff: [gpfsug-discuss] Slow performance on writes when using direct io Hello, We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:image001.png at 01DA73C4.13BF9D80] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 13955 bytes Desc: image001.png URL: From Peter.Hruska at mcomputers.cz Mon Mar 11 16:43:03 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Mon, 11 Mar 2024 16:43:03 +0000 Subject: [gpfsug-discuss] Rewriting existing files is incredibly slow Message-ID: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz> Hello, We've encountered yet another performance flaw. We have a GPFS filesystem mounted using GPFS binaries on Windows. When we try to rewrite a file on the GPFS filesystem rewriting speed is much slower than writing to a new file. The difference ratio we measured is about 3.5 times. From the task manager it is visible that there is excessive amount of reading from the network when rewriting. This is even visible on the NDS server as io activity. However when rewriting on a linux client there are no reads while rewriting. Has anyone encountered such problems? To replicate the issue is is possidle to run fio twice with the same configuration to achieve rewriting or to run iozone. Both tools return similar outputs. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:d720b14ac4f99022ef331403ba1fb9d89b0f0d64.camel at mcomputers.cz-0] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: mcomputers_podpis_2024.png
URL: 

From Renar.Grunenberg at huk-coburg.de  Mon Mar 11 17:04:57 2024
From: Renar.Grunenberg at huk-coburg.de (Grunenberg, Renar)
Date: Mon, 11 Mar 2024 17:04:57 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
In-Reply-To: <3952cd1487255f84236f987ae5c6aba7d24c256c.camel@mcomputers.cz>
References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <3d57a03c74984640b7d78b1a40e27843@huk-coburg.de> <3952cd1487255f84236f987ae5c6aba7d24c256c.camel@mcomputers.cz>
Message-ID: 

Hallo Peter,
it's difficult to give the right recommendation here; what is relevant is the I/O size, the current I/O pattern, the current configuration of your daemons, and so on. Best is to open a performance ticket and work on this. There is a presentation from the user group that hopefully sheds some light on the parameters mentioned. You can find it here:
https://www.spectrumscaleug.org/wp-content/uploads/2020/09/004-spectrum-scale-performance-update.pdf

Renar Grunenberg
Abteilung Informatik - Betrieb
HUK-COBURG
Bahnhofsplatz
96444 Coburg
Telefon: 09561 96-44110
Telefax: 09561 96-44104
E-Mail: Renar.Grunenberg at huk-coburg.de
Internet: www.huk.de
________________________________
HUK-COBURG Haftpflicht-Unterstützungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg
Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021
Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg
Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin.
Vorstand: Klaus-Jürgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Herøy, Dr. Helen Reck, Dr. Jörg Rheinländer, Thomas Sehn, Daniel Thomas.
________________________________
Diese Nachricht enthält vertrauliche und/oder rechtlich geschützte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet.
This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden.
________________________________
Von: gpfsug-discuss Im Auftrag von Peter Hruška
Gesendet: Montag, 11. März 2024 16:14
An: gpfsug-discuss at gpfsug.org
Betreff: Re: [gpfsug-discuss] Slow performance on writes when using direct io

Hello Renar,
Thank you for the suggestion. I tried the configuration changes. They however do not seem to have any effect - the performance seems identical. I also tried all 4 combinations. I checked the documentation on the parameter "dioSmallSeqWriteBatching" and it states that small IOs are considered under 64kb. I've also found a parameter "dioSmallSeqWriteThreshold" but my GPFS version (5.1.9.0) claims that it's an unknown attribute.

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:image001.png at 01DA73DD.B7502320]

On Mon, 2024-03-11 at 13:56 +0000, Grunenberg, Renar wrote:
Hallo Peter, my to cents to this. Set the diasbleDIO=yes Parameter to DEFAULT and use the !
dioSmallSeqWriteBatching 1 disableDIO 0 And give it a try. Regards Renar Renar Grunenberg Abteilung Informatik - Betrieb HUK-COBURG Bahnhofsplatz 96444 Coburg Telefon: 09561 96-44110 Telefax: 09561 96-44104 E-Mail: Renar.Grunenberg at huk-coburg.de Internet: www.huk.de ________________________________ HUK-COBURG Haftpflicht-Unterst?tzungs-Kasse kraftfahrender Beamter Deutschlands a. G. in Coburg Reg.-Gericht Coburg HRB 100; St.-Nr. 9212/101/00021 Sitz der Gesellschaft: Bahnhofsplatz, 96444 Coburg Vorsitzender des Aufsichtsrats: Prof. Dr. Heinrich R. Schradin. Vorstand: Klaus-J?rgen Heitmann (Sprecher), Stefan Gronbach, Dr. Hans Olav Her?y, Dr. Helen Reck, Dr. J?rg Rheinl?nder, Thomas Sehn, Daniel Thomas. ________________________________ Diese Nachricht enth?lt vertrauliche und/oder rechtlich gesch?tzte Informationen. Wenn Sie nicht der richtige Adressat sind oder diese Nachricht irrt?mlich erhalten haben, informieren Sie bitte sofort den Absender und vernichten Sie diese Nachricht. Das unerlaubte Kopieren sowie die unbefugte Weitergabe dieser Nachricht ist nicht gestattet. This information may contain confidential and/or privileged information. If you are not the intended recipient (or have received this information in error) please notify the sender immediately and destroy this information. Any unauthorized copying, disclosure or distribution of the material in this information is strictly forbidden. ________________________________ Von: gpfsug-discuss >Im Auftrag von Peter Hru?ka Gesendet: Montag, 11. M?rz 2024 14:22 An: gpfsug-discuss at gpfsug.org Betreff: [gpfsug-discuss] Slow performance on writes when using direct io Hello, We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:image001.png at 01DA73DD.B7502320] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 13955 bytes Desc: image001.png URL: From jonathan.buzzard at strath.ac.uk Mon Mar 11 17:35:12 2024 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 11 Mar 2024 17:35:12 +0000 Subject: [gpfsug-discuss] Rewriting existing files is incredibly slow In-Reply-To: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz> References: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz> Message-ID: On 11/03/2024 16:43, Peter Hru?ka wrote: > Hello, > > We've encountered yet another performance flaw. We have a GPFS > filesystem mounted using GPFS binaries on Windows. When we try to > rewrite a file on the GPFS filesystem rewriting speed is much slower > than writing to a new file. The difference ratio we measured is about > 3.5 times. 
From the task manager it is visible that there is excessive > amount of reading from the network when rewriting. This is even visible > on the NDS server as io activity. However when rewriting on a linux > client there are no reads while rewriting. Has anyone encountered such > problems? > To replicate the issue is is possidle to run fio twice with the same > configuration to achieve rewriting or to run iozone. Both tools return > similar outputs. Kind of yes. What are you using to "rewrite" the file? What we saw initially over Samba was certain Microsoft applications when rewriting a file had truly abysmal performance. The same application when saving the same document to a new file and the performance was as expected. After much digging into it the cause (it was weeks of person effort) it was determined to be down to the application writing the file *one* byte at a time. Basically some idiot C++ developer at Microsoft decided to ignore the C++ library because it has "bugs" and write their own formatted output routines. It was not noticeable saving to a local disk, but the instant you tried saving to a network drive the performance was truly awful. Basically the increased latency of single character IO was the issue. Note that the issue was not confined to Samba and GPFS, as we verified the same abysmal performance with a Windows 2008 R2 server running on File and Print Sharing on NTFS on bare metal hardware. It was also not confined to just Windows the same awful performance happened on Macs too. In fact that is where it first came to light. Might be the cause of your problem, might not. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From salvet at ics.muni.cz Tue Mar 12 08:59:14 2024 From: salvet at ics.muni.cz (Zdenek Salvet) Date: Tue, 12 Mar 2024 09:59:14 +0100 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> Message-ID: <20240312085914.GM10934@horn.ics.muni.cz> On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hru?ka wrote: > We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? Hello, did you use pre-allocated file(s) (was it re-write) ? libaio traffic is not really asynchronous with respect to necessary metadata operations (allocating new space and writing allocation structures to disk) in most Linux filesystems and I guess this case is not heavily optimized in GPFS either (dioSmallSeqWriteBatching feature may help a little but it targets different scenario I think). Best regards, Zdenek Salvet salvet at ics.muni.cz Institute of Computer Science of Masaryk University, Brno, Czech Republic and CESNET, z.s.p.o., Prague, Czech Republic Phone: ++420-549 49 6534 Fax: ++420-541 212 747 ---------------------------------------------------------------------------- Teamwork is essential -- it allows you to blame someone else. 
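Peter's reproduction and Zdenek's question can be combined into one small experiment: lay the file out completely first, then re-run the identical job with direct=1 so that every direct write lands in already-allocated blocks. A minimal fio job along these lines - the directory, file name, size and queue depth are illustrative placeholders, not values from this thread:

  [gpfs-dio-rewrite]
  directory=/gpfs/fs1/fiotest
  filename=dio.testfile
  size=10g
  bs=1m
  rw=write
  ioengine=libaio
  iodepth=16
  direct=1
  overwrite=1

With overwrite=1 fio lays the file out before the measured write phase; alternatively run the job twice, or do a first pass with direct=0. If the pass over the already-allocated file is noticeably faster, the penalty is dominated by block allocation and the switching in and out of the optimized direct-I/O path rather than by the direct writes themselves.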
From Peter.Hruska at mcomputers.cz Tue Mar 12 10:59:34 2024 From: Peter.Hruska at mcomputers.cz (=?utf-8?B?UGV0ZXIgSHJ1xaFrYQ==?=) Date: Tue, 12 Mar 2024 10:59:34 +0000 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <20240312085914.GM10934@horn.ics.muni.cz> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <20240312085914.GM10934@horn.ics.muni.cz> Message-ID: <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz> Hello, The direct writes are problematic on both writes and rewrites. Rewrites alone are another issue we have noticed. Since indirect (direct=0) workloads are fine, it seems that the easiest solution could be to force indirect IO operations for all workloads. However we didn't find such possibility. -- S p??n?m p?kn?ho dne / Best regards Mgr. Peter Hru?ka IT specialista M Computers s.r.o. ?lehlova 3100/10, 628 00 Brno-L??e? (mapa) T:+420 515 538 136 E: peter.hruska at mcomputers.cz www.mcomputers.cz www.lenovoshop.cz [cid:28fa35781b86a55c01b26ed5221a254b716e5f82.camel at mcomputers.cz-0] On Tue, 2024-03-12 at 09:59 +0100, Zdenek Salvet wrote: EXTERN? ODES?LATEL On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hru?ka wrote: We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything? Hello, did you use pre-allocated file(s) (was it re-write) ? libaio traffic is not really asynchronous with respect to necessary metadata operations (allocating new space and writing allocation structures to disk) in most Linux filesystems and I guess this case is not heavily optimized in GPFS either (dioSmallSeqWriteBatching feature may help a little but it targets different scenario I think). Best regards, Zdenek Salvet salvet at ics.muni.cz Institute of Computer Science of Masaryk University, Brno, Czech Republic and CESNET, z.s.p.o., Prague, Czech Republic Phone: ++420-549 49 6534 Fax: ++420-541 212 747 ---------------------------------------------------------------------------- Teamwork is essential -- it allows you to blame someone else. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mcomputers_podpis_2024.png Type: image/png Size: 13955 bytes Desc: mcomputers_podpis_2024.png URL: From uwe.falke at kit.edu Tue Mar 12 11:21:49 2024 From: uwe.falke at kit.edu (Uwe Falke) Date: Tue, 12 Mar 2024 12:21:49 +0100 Subject: [gpfsug-discuss] Slow performance on writes when using direct io In-Reply-To: <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz> References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <20240312085914.GM10934@horn.ics.muni.cz> <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz> Message-ID: Just thinking: an application should do direct IO for a good reason, and only then. 
"Forcing DIO" is probably not the right thing to do - rather check why an app does DIO and either change the app's behaviour if reasonable are maybe use a special? pool for it using mirrored SSDs or so. BTW, the ESS have some nice mechanism to do small IOs (also direct ones I suppose) quickly by buffering them on flash/NVRAM (where the data is considered persistently stored, hence the IO requests are completed quickly). Uwe On 12.03.24 11:59, Peter Hru?ka wrote: > Hello, > > The direct writes are problematic on both writes and rewrites. > Rewrites alone are another issue we have noticed. > Since indirect (direct=0) workloads are fine, it seems that the > easiest solution could be to force indirect IO operations for all > workloads. However we didn't find such possibility. > > -- > S p??n?m p?kn?ho dne / Best regards > > *Mgr. Peter Hru?ka* > IT specialista > > *M Computers s.r.o.* > ?lehlova 3100/10, 628 00 Brno-L??e? (mapa ) > T:+420 515 538 136 > E: peter.hruska at mcomputers.cz > > www.mcomputers.cz > www.lenovoshop.cz > > > > On Tue, 2024-03-12 at 09:59 +0100, Zdenek Salvet wrote: >> EXTERN? ODES?LATEL >> >> >> On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hru?ka wrote: >>> We encountered a problem with performance of writes on GPFS when the >>> application uses direct io access. To simulate the issue it is >>> enough to run fio with option direct=1. The performance drop is >>> quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct >>> GPFS to ignore direct IO by using "disableDIO=yes". The directive >>> didn't have any effect. Is there any possibility how to achieve that >>> GPFS would ignore direct IO requests and use caching for everything? >> >> Hello, >> did you use pre-allocated file(s) (was it re-write) ? >> libaio traffic is not really asynchronous with respect to necessary >> metadata >> operations (allocating new space and writing allocation structures to >> disk) >> in most Linux filesystems and I guess this case is not heavily optimized >> in GPFS either (dioSmallSeqWriteBatching feature may help a little but >> it targets different scenario I think). >> >> Best regards, >> Zdenek Salvet salvet at ics.muni.cz >> Institute of Computer Science of Masaryk University, Brno, Czech Republic >> and CESNET, z.s.p.o., Prague, Czech Republic >> Phone: ++420-549 49 6534?????????????????????????? Fax: ++420-541 212 747 >> ---------------------------------------------------------------------------- >> ????? Teamwork is essential -- it allows you to blame someone else. >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at gpfsug.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -- Karlsruhe Institute of Technology (KIT) Scientific Computing Centre (SCC) Scientific Data Management (SDM) Uwe Falke Hermann-von-Helmholtz-Platz 1, Building 442, Room 187 D-76344 Eggenstein-Leopoldshafen Tel: +49 721 608 28024 Email:uwe.falke at kit.edu www.scc.kit.edu Registered office: Kaiserstra?e 12, 76131 Karlsruhe, Germany KIT ? The Research University in the Helmholtz Association -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: not available
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 5814 bytes
Desc: S/MIME Cryptographic Signature
URL: 

From Peter.Hruska at mcomputers.cz  Tue Mar 12 12:03:46 2024
From: Peter.Hruska at mcomputers.cz (Peter Hruška)
Date: Tue, 12 Mar 2024 12:03:46 +0000
Subject: [gpfsug-discuss] Rewriting existing files is incredibly slow
In-Reply-To: 
References: <7af86269539605e69b060b4d0ab8bbf946f96959.camel@mcomputers.cz>
Message-ID: <911aa0dc39fe50dd5f3de2038cc6239f294f1a6a.camel@mcomputers.cz>

Hello,
For the ease of testing and reproduction we use the fio and iozone benchmarking tools. The issue you describe looks pretty similar. However, it occurs using the GPFS mount only. Here are some results from our testing environment:

Linux NSD server:  writes 3472 MiB/s, re-writes 3476 MiB/s
Windows GPFS:      writes 2487 MiB/s, re-writes  250 MiB/s
Windows Samba:     writes  997 MiB/s, re-writes 1269 MiB/s

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:a57a2d9ed2ea460d8bb2abefe678c4fab3cdaeeb.camel at mcomputers.cz-0]

On Mon, 2024-03-11 at 17:35 +0000, Jonathan Buzzard wrote:
On 11/03/2024 16:43, Peter Hruška wrote:

Hello,

We've encountered yet another performance flaw. We have a GPFS filesystem mounted using GPFS binaries on Windows. When we try to rewrite a file on the GPFS filesystem rewriting speed is much slower than writing to a new file. The difference ratio we measured is about 3.5 times. From the task manager it is visible that there is an excessive amount of reading from the network when rewriting. This is even visible on the NSD server as io activity. However when rewriting on a linux client there are no reads while rewriting. Has anyone encountered such problems?
To replicate the issue it is possible to run fio twice with the same configuration to achieve rewriting, or to run iozone. Both tools return similar outputs.

Kind of yes. What are you using to "rewrite" the file?

What we saw initially over Samba was certain Microsoft applications when rewriting a file had truly abysmal performance. The same application when saving the same document to a new file and the performance was as expected.

After much digging into it the cause (it was weeks of person effort) it was determined to be down to the application writing the file *one* byte at a time. Basically some idiot C++ developer at Microsoft decided to ignore the C++ library because it has "bugs" and write their own formatted output routines.

It was not noticeable saving to a local disk, but the instant you tried saving to a network drive the performance was truly awful. Basically the increased latency of single character IO was the issue.

Note that the issue was not confined to Samba and GPFS, as we verified the same abysmal performance with a Windows 2008 R2 server running on File and Print Sharing on NTFS on bare metal hardware. It was also not confined to just Windows the same awful performance happened on Macs too. In fact that is where it first came to light.

Might be the cause of your problem, might not.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow.
G4 0NG
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mcomputers_podpis_2024.png
Type: image/png
Size: 13955 bytes
Desc: mcomputers_podpis_2024.png
URL: 

From olaf.weiser at de.ibm.com  Tue Mar 12 12:12:24 2024
From: olaf.weiser at de.ibm.com (Olaf Weiser)
Date: Tue, 12 Mar 2024 12:12:24 +0000
Subject: [gpfsug-discuss] Slow performance on writes when using direct io
In-Reply-To: 
References: <6eea2f7a2e2341e0d8d5164edd8eb4c5e87f28f6.camel@mcomputers.cz> <20240312085914.GM10934@horn.ics.muni.cz> <7b539383643e14c1be8aafd9775b322eef0c22bc.camel@mcomputers.cz>
Message-ID: 

Just for completeness, let's use this thread to document an undocumented parameter which should not be used any more: disableDIO=yes is NOT "disabling DIO" - the parameter is a bit misleading by its name.

Direct I/O is a hint from an application to bypass the cache. However, application programmers mostly expect the I/O to be safely on disk when it gets acknowledged. To make that behaviour absolutely sure there is the O_SYNC flag for writes; still, it is pretty common to expect O_DIRECT to act as a synonym. More details here: https://man7.org/linux/man-pages/man2/open.2.html

GPFS handles O_DIRECT in a way that follows this expectation. For direct I/O (O_DIRECT, bypassing the caches) GPFS can use a so-called optimized I/O path, which takes advantage of Linux AIO. However, there are multiple situations where you cannot write directly, e.g. if the I/O is not aligned, or if you append a file and there is no block yet (no disk address allocated to that part of the file). In those cases the I/O cannot be processed in the optimized direct-I/O path. (Some other aspects: for data that is accessed without caching, prefetching is disabled, and, more relevant here, you need different - more efficient - tokens than for buffered writes.)

Let's say an application appends to a file (similar to creating a file and writing it). Usually direct I/Os are rather small. As long as you write small I/Os into an existing block, direct I/O is fine - but once you fill the last block completely, a new block has to be allocated. You cannot write directly to nowhere, so GPFS has to allocate a new block (as any other file system would), which means leaving the optimized direct-I/O path and allocating the corresponding buffers first. When this is done, the data from the "direct" I/O is finally synced before the I/O is acknowledged, to follow the semantics expected by the application programmer who used O_DIRECT.

For workloads that mostly create, write and append files this happens on every block - so, frequently - causing GPFS to keep switching in and out of the optimized I/O path, which also means changing tokens. To avoid that, a very long time ago disableDIO was introduced as a quick and efficient workaround. In the meantime there is a heuristic in our code that automatically detects such cases, so this parameter should NOT be used any more.

Since GPFS 5.0.x we introduced dioSmallSeqWriteBatching=yes. PLEASE use this parameter.
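As an illustration only (not taken from the thread), the parameter can be set and verified with the usual configuration commands; whether a given release accepts the change online should be checked against the documentation:

  mmchconfig dioSmallSeqWriteBatching=yes
  mmlsconfig dioSmallSeqWriteBatching
  mmdiag --config | grep -i dioSmallSeqWrite    # value the running daemon uses on this node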
By default, the optimization kicks in when we see three AIO/DIO writes that are no larger than 64 KB each and no more than one write length apart. If you know that your application does larger DIO writes, let us know - open a SF ticket, there are further options.

Back to the original question - you may consider:
- preallocate blocks to the file(s)
- double check "active snapshots" (copy-on-write for direct IO is very expensive)
- adjust your block size / RAID config to lower write amplification
- check network RTT for token traffic
- try to avoid an HDD-based backend, as its number of IOPS is very limited
(a sketch of the matching commands follows below, after the quoted thread)

Last but not least - talk to your application programmers and ask whether they really need what they programmed.
________________________________
Von: gpfsug-discuss im Auftrag von Uwe Falke
Gesendet: Dienstag, 12. März 2024 12:21
An: gpfsug-discuss at gpfsug.org
Betreff: [EXTERNAL] Re: [gpfsug-discuss] Slow performance on writes when using direct io

Just thinking: an application should do direct IO for a good reason, and only then. "Forcing DIO" is probably not the right thing to do - rather check why an app does DIO and either change the app's behaviour if reasonable or maybe use a special pool for it using mirrored SSDs or so.

BTW, the ESS have some nice mechanism to do small IOs (also direct ones I suppose) quickly by buffering them on flash/NVRAM (where the data is considered persistently stored, hence the IO requests are completed quickly).

Uwe

On 12.03.24 11:59, Peter Hruška wrote:
Hello,
The direct writes are problematic on both writes and rewrites. Rewrites alone are another issue we have noticed. Since indirect (direct=0) workloads are fine, it seems that the easiest solution could be to force indirect IO operations for all workloads. However we didn't find such possibility.

--
S přáním pěkného dne / Best regards

Mgr. Peter Hruška
IT specialista
M Computers s.r.o.
Úlehlova 3100/10, 628 00 Brno-Líšeň (mapa)
T:+420 515 538 136
E: peter.hruska at mcomputers.cz
www.mcomputers.cz
www.lenovoshop.cz
[cid:04ece728-7e44-4ab2-b1c6-1928bb3c29ee]

On Tue, 2024-03-12 at 09:59 +0100, Zdenek Salvet wrote:
On Mon, Mar 11, 2024 at 01:21:32PM +0000, Peter Hruška wrote:
We encountered a problem with performance of writes on GPFS when the application uses direct io access. To simulate the issue it is enough to run fio with option direct=1. The performance drop is quite dramatic - 250 MiB/s vs. 2955 MiB/s. We've tried to instruct GPFS to ignore direct IO by using "disableDIO=yes". The directive didn't have any effect. Is there any possibility how to achieve that GPFS would ignore direct IO requests and use caching for everything?

Hello,
did you use pre-allocated file(s) (was it re-write) ? libaio traffic is not really asynchronous with respect to necessary metadata operations (allocating new space and writing allocation structures to disk) in most Linux filesystems and I guess this case is not heavily optimized in GPFS either (dioSmallSeqWriteBatching feature may help a little but it targets different scenario I think).

Best regards,
Zdenek Salvet                                              salvet at ics.muni.cz
Institute of Computer Science of Masaryk University, Brno, Czech Republic
and CESNET, z.s.p.o., Prague, Czech Republic
Phone: ++420-549 49 6534                           Fax: ++420-541 212 747
----------------------------------------------------------------------------
      Teamwork is essential -- it allows you to blame someone else.
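A sketch of the commands behind that checklist, with fs1 used as a placeholder file system name; whether fallocate-style preallocation is honoured depends on the file system and release, so treat the last line as an assumption to verify:

  mmlssnapshot fs1                          # any active snapshots forcing copy-on-write?
  mmlsfs fs1 -B                             # file system block size
  mmdiag --iohist                           # recent I/O sizes and latencies on this node
  mmdiag --network                          # round-trip times to the other cluster nodes
  fallocate -l 10G /gpfs/fs1/dio.testfile   # preallocate the target file before the DIO run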
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

--
Karlsruhe Institute of Technology (KIT)
Scientific Computing Centre (SCC)
Scientific Data Management (SDM)

Uwe Falke
Hermann-von-Helmholtz-Platz 1, Building 442, Room 187
D-76344 Eggenstein-Leopoldshafen
Tel: +49 721 608 28024
Email: uwe.falke at kit.edu
www.scc.kit.edu

Registered office: Kaiserstraße 12, 76131 Karlsruhe, Germany
KIT - The Research University in the Helmholtz Association
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Outlook-fvfsez5b.png
Type: image/png
Size: 13955 bytes
Desc: Outlook-fvfsez5b.png
URL: 

From p.ward at nhm.ac.uk  Tue Mar 26 15:28:37 2024
From: p.ward at nhm.ac.uk (Paul Ward)
Date: Tue, 26 Mar 2024 15:28:37 +0000
Subject: [gpfsug-discuss] TCT - anyone else using?
Message-ID: 

We've just added AWS as a second cloud container pool to our on-premises COS. Apparently, we're the first to do this!
I'd like to hear from other people using TCT, especially if you have more than one destination.

Kindest regards,
Paul

Paul Ward
TS Infrastructure Architect
Natural History Museum
T: 02079426450
E: p.ward at nhm.ac.uk
[cid:image001.png at 01DA7F92.4567E340]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 12974 bytes
Desc: image001.png
URL: 

From vladimir.sapunenko at cnaf.infn.it  Wed Mar 27 14:58:35 2024
From: vladimir.sapunenko at cnaf.infn.it (Vladimir Sapunenko)
Date: Wed, 27 Mar 2024 15:58:35 +0100
Subject: [gpfsug-discuss] occupancy percentage for a pool
Message-ID: 

Hello,
Since the early days of GPFS there was a limitation that only integer values could be used in the LIMIT directive of placement policy rules. That now seems outdated, and the GPFS documentation says: "You can specify OccupancyPercentage as a floating point number, as in the following example:

RULE 'r' RESTORE to pool 'x' limit(8.9e1)"

However, 8.9e1 = 89, and I'm wondering whether it is acceptable to define a limit of 99.5% in a file placement policy like this:

RULE 'DATA3' SET POOL 'data2' LIMIT(99.5)

mmchpolicy does not complain about such values. My doubt is whether GPFS really uses fractions of a percent in the occupancy calculation. Is anybody using such limits? (0.5% in my case makes a difference because it corresponds to about 60TB.)

Thanks,
Vladimir
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: vladimir_sapunenko.vcf
Type: text/vcard
Size: 235 bytes
Desc: not available
URL:
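As a closing illustration for the policy question (not part of the original message - fs1, data1 and data2 are placeholder names, and the fallback rule is an assumption), a fractional LIMIT can at least be validated and its effect observed like this:

  cat > placement.pol <<'EOF'
  RULE 'DATA3' SET POOL 'data2' LIMIT(99.5)
  RULE 'DEFAULT' SET POOL 'data1'
  EOF
  mmchpolicy fs1 placement.pol -I test    # parse and validate the rules without installing them
  mmchpolicy fs1 placement.pol            # install the placement policy
  mmlspolicy fs1 -L                       # show the installed rules
  mmdf fs1 -P data2                       # watch the occupancy of the target pool

Whether the occupancy test really honours the extra 0.5% or rounds to whole percent is exactly the open question above, so the behaviour is best confirmed on a test file system before relying on it for ~60TB of headroom.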