From oehmes at gmail.com Fri Mar 1 01:33:58 2019 From: oehmes at gmail.com (Sven Oehme) Date: Thu, 28 Feb 2019 17:33:58 -0800 Subject: [gpfsug-discuss] Clarification of mmdiag --iohist output In-Reply-To: References: <9338621C-3F85-48DF-AE42-64998680E14C@vanderbilt.edu> Message-ID: Hi, using nsdSmallThreadRatio 1 is not necessarily correct, as it 'significant depends' (most used word combination of performance engineers) on your workload. to give some more background - on reads you need much more threads for small i/os than for large i/os to get maximum performance, the reason is a small i/o usually only reads one strip of data (sitting on one physical device) while a large i/o reads an entire stripe (which typically spans multiple devices). as a more concrete example, in a 8+2p raid setup a single full stripe read will trigger internal reads in parallel to 8 different targets at the same time, so for small i/os you would need 8 times as many small read requests (and therefore threads) to keep the drives busy at the same level. on writes its even more complex, a large full stripe write usually just writes to all target disks, while a tiny small write in the middle might force a read / modify / write which can have a huge write amplification and cause more work than a large full track i/o. raid controller caches also play a significant role here and make this especially hard to optimize as you need to know exactly what and where to measure when you tune to get improvements for real world workload and not just improve your synthetic test but actually hurt your real application performance. i should write a book about this some day ;-) hope that helps. Sven On Thu, Feb 21, 2019 at 4:23 AM Frederick Stock wrote: > Kevin I'm assuming you have seen the article on IBM developerWorks about > the GPFS NSD queues. It provides useful background for analyzing the dump > nsd information. Here I'll list some thoughts for items that you can > investigate/consider. > > If your NSD servers are doing both large (greater than 64K) and small (64K > or less) IOs then you want to have the nsdSmallThreadRatio set to 1 as it > seems you do for the NSD servers. This provides an equal number of SMALL > and LARGE NSD queues. You can also increase the total number of queues > (currently 256) but I cannot determine if that is necessary from the data > you provided. Only on rare occasions have I seen a need to increase the > number of queues. > > The fact that you have 71 highest pending on your LARGE queues and 73 > highest pending on your SMALL queues would imply your IOs are queueing for > a good while either waiting for resources in GPFS or waiting for IOs to > complete. Your maximum buffer size is 16M which is defined to be the > largest IO that can be requested by GPFS. This is the buffer size that > GPFS will use for LARGE IOs. You indicated you had sufficient memory on > the NSD servers but what is the value for the pagepool on those servers, > and what is the value of the nsdBufSpace parameter? If the NSD server is > just that then usually nsdBufSpace is set to 70. The IO buffers used by > the NSD server come from the pagepool so you need sufficient space there > for the maximum number of LARGE IO buffers that would be used concurrently > by GPFS or threads will need to wait for those buffers to become > available. Essentially you want to ensure you have sufficient memory for > the maximum number of IOs all doing a large IO and that value being less > than 70% of the pagepool size. 
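A rough back-of-the-envelope check of the sizing rule described above (a sketch only, not an official formula; the figures are the ones quoted elsewhere in this thread and should be replaced with your own):

# show the relevant settings in effect on an NSD server
mmdiag --config | grep -E 'pagepool|nsdBufSpace|nsdMaxWorkerThreads'

# worst case: every NSD worker thread holds one 16M LARGE buffer at once
#   1024 threads x 16 MiB = 16 GiB of buffer space
# with nsdBufSpace at 70, the pagepool would need to be at least
#   16 GiB / 0.70 = roughly 23 GiB
# for that worst case to fit without threads waiting for buffers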
> > You could look at the settings for the FC cards to ensure they are > configured to do the largest IOs possible. I forget the actual values > (have not done this for awhile) but there are settings for the adapters > that control the maximum IO size that will be sent. I think you want this > to be as large as the adapter can handle to reduce the number of messages > needed to complete the large IOs done by GPFS. > > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 <(720)%20430-8821> > stockf at us.ibm.com > > > > ----- Original message ----- > From: "Buterbaugh, Kevin L" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > > Cc: > Subject: Re: [gpfsug-discuss] Clarification of mmdiag --iohist output > Date: Thu, Feb 21, 2019 6:39 AM > > Hi All, > > My thanks to Aaron, Sven, Steve, and whoever responded for the GPFS team. > You confirmed what I suspected ? my example 10 second I/O was _from an NSD > server_ ? and since we?re in a 8 Gb FC SAN environment, it therefore means > - correct me if I?m wrong about this someone - that I?ve got a problem > somewhere in one (or more) of the following 3 components: > > 1) the NSD servers > 2) the SAN fabric > 3) the storage arrays > > I?ve been looking at all of the above and none of them are showing any > obvious problems. I?ve actually got a techie from the storage array vendor > stopping by on Thursday, so I?ll see if he can spot anything there. Our FC > switches are QLogic?s, so I?m kinda screwed there in terms of getting any > help. But I don?t see any errors in the switch logs and ?show perf? on the > switches is showing I/O rates of 50-100 MB/sec on the in use ports, so I > don?t _think_ that?s the issue. > > And this is the GPFS mailing list, after all ? so let?s talk about the NSD > servers. Neither memory (64 GB) nor CPU (2 x quad-core Intel Xeon E5620?s) > appear to be an issue. But I have been looking at the output of ?mmfsadm > saferdump nsd? based on what Aaron and then Steve said. Here?s some fairly > typical output from one of the SMALL queues (I?ve checked several of my 8 > NSD servers and they?re all showing similar output): > > Queue NSD type NsdQueueTraditional [244]: SMALL, threads started 12, > active 3, highest 12, deferred 0, chgSize 0, draining 0, is_chg 0 > requests pending 0, highest pending 73, total processed 4859732 > mutex 0x7F3E449B8F10, reqCond 0x7F3E449B8F58, thCond 0x7F3E449B8F98, > queue 0x7F3E449B8EF0, nFreeNsdRequests 29 > > And for a LARGE queue: > > Queue NSD type NsdQueueTraditional [8]: LARGE, threads started 12, > active 1, highest 12, deferred 0, chgSize 0, draining 0, is_chg 0 > requests pending 0, highest pending 71, total processed 2332966 > mutex 0x7F3E441F3890, reqCond 0x7F3E441F38D8, thCond 0x7F3E441F3918, > queue 0x7F3E441F3870, nFreeNsdRequests 31 > > So my large queues seem to be slightly less utilized than my small queues > overall ? i.e. I see more inactive large queues and they generally have a > smaller ?highest pending? value. > > Question: are those non-zero ?highest pending? values something to be > concerned about? > > I have the following thread-related parameters set: > > [common] > maxReceiverThreads 12 > nsdMaxWorkerThreads 640 > nsdThreadsPerQueue 4 > nsdSmallThreadRatio 3 > workerThreads 128 > > [serverLicense] > nsdMaxWorkerThreads 1024 > nsdThreadsPerQueue 12 > nsdSmallThreadRatio 1 > pitWorkerThreadsPerNode 3 > workerThreads 1024 > > Also, at the top of the ?mmfsadm saferdump nsd? 
output I see: > > Total server worker threads: running 1008, desired 147, forNSD 147, forGNR > 0, nsdBigBufferSize 16777216 > nsdMultiQueue: 256, nsdMultiQueueType: 1, nsdMinWorkerThreads: 16, > nsdMaxWorkerThreads: 1024 > > Question: is the fact that 1008 is pretty close to 1024 a concern? > > Anything jump out at anybody? I don?t mind sharing full output, but it is > rather lengthy. Is this worthy of a PMR? > > Thanks! > > -- > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 <(615)%20875-9633> > > > On Feb 17, 2019, at 1:01 PM, IBM Spectrum Scale wrote: > > Hi Kevin, > > The I/O hist shown by the command mmdiag --iohist actually depends on the > node on which you are running this command from. > If you are running this on a NSD server node then it will show the time > taken to complete/serve the read or write I/O operation sent from the > client node. > And if you are running this on a client (or non NSD server) node then it > will show the complete time taken by the read or write I/O operation > requested by the client node to complete. > So in a nut shell for the NSD server case it is just the latency of the > I/O done on disk by the server whereas for the NSD client case it also the > latency of send and receive of I/O request to the NSD server along with the > latency of I/O done on disk by the NSD server. > I hope this answers your query. > > > Regards, The Spectrum Scale (GPFS) team > > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale > (GPFS), then please post it to the public IBM developerWroks Forum at > https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 > > . > > If your query concerns a potential software error in Spectrum Scale (GPFS) > and you have an IBM software maintenance contract please contact > 1-800-237-5511 <(800)%20237-5511> in the United States or your local IBM > Service Center in other countries. > > The forum is informally monitored as time permits and should not be used > for priority messages to the Spectrum Scale (GPFS) team. > > > > From: "Buterbaugh, Kevin L" > To: gpfsug main discussion list > Date: 02/16/2019 08:18 PM > Subject: [gpfsug-discuss] Clarification of mmdiag --iohist output > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi All, > > Been reading man pages, docs, and Googling, and haven?t found a definitive > answer to this question, so I knew exactly where to turn? ;-) > > I?m dealing with some slow I/O?s to certain storage arrays in our > environments ? like really, really slow I/O?s ? here?s just one example > from one of my NSD servers of a 10 second I/O: > > 08:49:34.943186 W data 30:41615622144 2048 10115.192 srv > dm-92 > > So here?s my question ? when mmdiag ?iohist tells me that that I/O took > slightly over 10 seconds, is that: > > 1. The time from when the NSD server received the I/O request from the > client until it shipped the data back onto the wire towards the client? > 2. The time from when the client issued the I/O request until it received > the data back from the NSD server? > 3. Something else? > > I?m thinking it?s #1, but want to confirm. Which one it is has very > obvious implications for our troubleshooting steps. Thanks in advance? > > Kevin > ? 
> Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and > Education > *Kevin.Buterbaugh at vanderbilt.edu* - > (615)875-9633 <(615)%20875-9633> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > > https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C2bfb2e8e30e64fa06c0f08d6959b2d38%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636860891056297114&sdata=5pL67mhVyScJovkRHRqZog9bM5BZG8F2q972czIYAbA%3D&reserved=0 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdratlif at iu.edu Tue Mar 5 16:21:18 2019 From: jdratlif at iu.edu (Ratliff, John) Date: Tue, 5 Mar 2019 16:21:18 +0000 Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another Message-ID: <827394bcbb794a0d9bd5bd8341fc1593@IN-CCI-D1S14.ads.iu.edu> We use a GPFS file system for our computing clusters and we're working on moving to a new SAN. We originally tried AFM, but it didn't seem to work very well. We tried to do a prefetch on a test policy scan of 100 million files, and after 24 hours it hadn't pre-fetched anything. It wasn't clear what was happening. Some smaller tests succeeded, but the NFSv4 ACLs did not seem to be transferred. Since then we started using rsync with the GPFS attrs patch. We have over 600 million files and 700 TB. I split up the rsync tasks with lists of files generated by the policy engine and we transferred the original data in about 2 weeks. Now we're working on final synchronization. I'd like to use one of the delete options to remove files that were sync'd earlier and then deleted. This can't be combined with the files-from option, so it's harder to break up the rsync tasks. Some of the directories I'm running this against have 30-150 million files each. This can take quite some time with a single rsync process. I'm also wondering if any of my rsync options are unnecessary. I was using avHAXS and numeric-ids. I'm thinking the A (acls) and X (xatttrs) might be unnecessary with GPFS->GPFS. We're only using NFSv4 GPFS ACLs. I don't know if GPFS uses any xattrs that rsync would sync or not. Removing those two options removed several system calls, which should make it much faster, but I want to make sure I'm syncing correctly. Also, it seems there is a problem with the GPFS patch on rsync where it will always give an error trying to get GPFS attributes on a symlink, which means it doesn't sync any symlinks when using that option. So you can rsync symlinks or GPFS attrs, but not both at the same time. This has lead to me running two rsyncs, one to get all files and one to get all attributes. Thanks for any ideas or suggestions. 
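One way to split up the --delete pass, since --delete cannot be combined with --files-from, is to run one rsync per top-level directory and drive a handful of them in parallel. A sketch (paths and the parallel job count are placeholders, and the GPFS-attribute option from the patched rsync is omitted; whether that option also carries the NFSv4 ACLs is exactly the open question above, so verify on a small test tree first):

# one rsync job per top-level directory, at most 8 running at a time
cd /gpfs/oldfs/projects
ls -d */ | xargs -P 8 -I{} \
  rsync -aHS --numeric-ids --delete ./{} /gpfs/newfs/projects/{}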
John Ratliff | Pervasive Technology Institute | UITS | Research Storage - Indiana University | http://pti.iu.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5670 bytes Desc: not available URL: From S.J.Thompson at bham.ac.uk Tue Mar 5 16:38:31 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 5 Mar 2019 16:38:31 +0000 Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another In-Reply-To: <827394bcbb794a0d9bd5bd8341fc1593@IN-CCI-D1S14.ads.iu.edu> References: <827394bcbb794a0d9bd5bd8341fc1593@IN-CCI-D1S14.ads.iu.edu> Message-ID: I wrote a patch to mpifileutils which will copy gpfs attributes, but when we played with it with rsync, something was obviously still different about the attrs from each, so use with care. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Ratliff, John [jdratlif at iu.edu] Sent: 05 March 2019 16:21 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another We use a GPFS file system for our computing clusters and we?re working on moving to a new SAN. We originally tried AFM, but it didn?t seem to work very well. We tried to do a prefetch on a test policy scan of 100 million files, and after 24 hours it hadn?t pre-fetched anything. It wasn?t clear what was happening. Some smaller tests succeeded, but the NFSv4 ACLs did not seem to be transferred. Since then we started using rsync with the GPFS attrs patch. We have over 600 million files and 700 TB. I split up the rsync tasks with lists of files generated by the policy engine and we transferred the original data in about 2 weeks. Now we?re working on final synchronization. I?d like to use one of the delete options to remove files that were sync?d earlier and then deleted. This can?t be combined with the files-from option, so it?s harder to break up the rsync tasks. Some of the directories I?m running this against have 30-150 million files each. This can take quite some time with a single rsync process. I?m also wondering if any of my rsync options are unnecessary. I was using avHAXS and numeric-ids. I?m thinking the A (acls) and X (xatttrs) might be unnecessary with GPFS->GPFS. We?re only using NFSv4 GPFS ACLs. I don?t know if GPFS uses any xattrs that rsync would sync or not. Removing those two options removed several system calls, which should make it much faster, but I want to make sure I?m syncing correctly. Also, it seems there is a problem with the GPFS patch on rsync where it will always give an error trying to get GPFS attributes on a symlink, which means it doesn?t sync any symlinks when using that option. So you can rsync symlinks or GPFS attrs, but not both at the same time. This has lead to me running two rsyncs, one to get all files and one to get all attributes. Thanks for any ideas or suggestions. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? 
Indiana University | http://pti.iu.edu From Robert.Oesterlin at nuance.com Tue Mar 5 19:57:45 2019 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 5 Mar 2019 19:57:45 +0000 Subject: [gpfsug-discuss] Reminder - Registration now open - US User Group Meeting, April 16-17th, NCAR Boulder Message-ID: Registration is now open: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2019-meeting-tickets-57035376346 Please note that agenda details are not set yet but these will be finalized in the next few weeks - when they are I will post to the registration page and the mailing list. - April 15th: Informal social gather on Monday for those arriving early (location TBD) - April 16th: Full day of talks from IBM and the user community, Social and Networking Event (details TBD) - April 17th: Talks and breakout sessions (If you have any topics for the breakout sessions, let us know) Looking forward to seeing everyone in Boulder! Bob Oesterlin/Kristy Kallback-Rose -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Mar 5 21:38:52 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Tue, 5 Mar 2019 21:38:52 +0000 Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another In-Reply-To: References: <827394bcbb794a0d9bd5bd8341fc1593@IN-CCI-D1S14.ads.iu.edu>, Message-ID: DDN also have a paid for product for doing moving of data (data flow) We found out about it after we did a massive data migration... I can't comment on it other than being aware of it. Sure your local DDN sales person can help. But if only IBM supported some sort of restripe to new block size, we wouldn't have to do this mass migration :-P Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Simon Thompson [S.J.Thompson at bham.ac.uk] Sent: 05 March 2019 16:38 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] suggestions forwar copying one GPFS file system into another I wrote a patch to mpifileutils which will copy gpfs attributes, but when we played with it with rsync, something was obviously still different about the attrs from each, so use with care. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Ratliff, John [jdratlif at iu.edu] Sent: 05 March 2019 16:21 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another We use a GPFS file system for our computing clusters and we?re working on moving to a new SAN. We originally tried AFM, but it didn?t seem to work very well. We tried to do a prefetch on a test policy scan of 100 million files, and after 24 hours it hadn?t pre-fetched anything. It wasn?t clear what was happening. Some smaller tests succeeded, but the NFSv4 ACLs did not seem to be transferred. Since then we started using rsync with the GPFS attrs patch. We have over 600 million files and 700 TB. I split up the rsync tasks with lists of files generated by the policy engine and we transferred the original data in about 2 weeks. Now we?re working on final synchronization. I?d like to use one of the delete options to remove files that were sync?d earlier and then deleted. This can?t be combined with the files-from option, so it?s harder to break up the rsync tasks. 
Some of the directories I?m running this against have 30-150 million files each. This can take quite some time with a single rsync process. I?m also wondering if any of my rsync options are unnecessary. I was using avHAXS and numeric-ids. I?m thinking the A (acls) and X (xatttrs) might be unnecessary with GPFS->GPFS. We?re only using NFSv4 GPFS ACLs. I don?t know if GPFS uses any xattrs that rsync would sync or not. Removing those two options removed several system calls, which should make it much faster, but I want to make sure I?m syncing correctly. Also, it seems there is a problem with the GPFS patch on rsync where it will always give an error trying to get GPFS attributes on a symlink, which means it doesn?t sync any symlinks when using that option. So you can rsync symlinks or GPFS attrs, but not both at the same time. This has lead to me running two rsyncs, one to get all files and one to get all attributes. Thanks for any ideas or suggestions. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? Indiana University | http://pti.iu.edu _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From Robert.Oesterlin at nuance.com Tue Mar 5 21:56:54 2019 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 5 Mar 2019 21:56:54 +0000 Subject: [gpfsug-discuss] Migrating billions of files? Message-ID: <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com> I?m looking at migration 3-4 Billion files, maybe 3PB of data between GPFS clusters. Most of the files are small - 60% 8K or less. Ideally I?d like to copy at least 15-20M files per day - ideally 50M. Any thoughts on how achievable this is? Or what to use? Either with AFM, mpifileutils, rsync.. other? Many of these files would be in 4k inodes. Destination is ESS. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From YARD at il.ibm.com Wed Mar 6 09:01:16 2019 From: YARD at il.ibm.com (Yaron Daniel) Date: Wed, 6 Mar 2019 11:01:16 +0200 Subject: [gpfsug-discuss] Migrating billions of files? In-Reply-To: <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com> References: <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com> Message-ID: Hi What permissions you have ? Do u have only Posix , or also SMB attributes ? If only posix attributes you can do the following: - rsync (which will work on different filesets/directories in parallel. - AFM (but in case you need rollback - it will be problematic) Regards Yaron Daniel 94 Em Ha'Moshavot Rd Storage Architect ? IL Lab Services (Storage) Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 03/05/2019 11:57 PM Subject: [gpfsug-discuss] Migrating billions of files? Sent by: gpfsug-discuss-bounces at spectrumscale.org I?m looking at migration 3-4 Billion files, maybe 3PB of data between GPFS clusters. Most of the files are small - 60% 8K or less. Ideally I?d like to copy at least 15-20M files per day - ideally 50M. Any thoughts on how achievable this is? Or what to use? Either with AFM, mpifileutils, rsync.. other? Many of these files would be in 4k inodes. Destination is ESS. 
Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=uXadyLeBnskK8mq-S8OjwY-ESxuNxXme9Akj9QaQBiE&s=UdKoJNySkr8itrQaRD9XMkVjBGnVaU8XnyxuKCldX-8&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4746 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4557 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4786 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5054 bytes Desc: not available URL: From S.J.Thompson at bham.ac.uk Wed Mar 6 09:08:21 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Wed, 6 Mar 2019 09:08:21 +0000 Subject: [gpfsug-discuss] Migrating billions of files? In-Reply-To: References: <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com> Message-ID: <011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk> AFM doesn?t work well if you have dependent filesets though .. which we did for quota purposes. Simon From: on behalf of "YARD at il.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 6 March 2019 at 09:01 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Migrating billions of files? Hi What permissions you have ? Do u have only Posix , or also SMB attributes ? If only posix attributes you can do the following: - rsync (which will work on different filesets/directories in parallel. - AFM (but in case you need rollback - it will be problematic) Regards ________________________________ Yaron Daniel 94 Em Ha'Moshavot Rd [cid:_1_0FC36C500FC3669C00318CDBC22583B5] Storage Architect ? IL Lab Services (Storage) Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel [IBM Storage Strategy and Solutions v1][IBM Storage Management and Data Protection v1][cid:_1_0FA0428C0FA03A6C00318CDBC22583B5][cid:_1_0FA044940FA03A6C00318CDBC22583B5] [https://acclaim-production-app.s3.amazonaws.com/images/6c2c3858-6df8-45be-ac2b-f93b8da74e20/Data%2BDriven%2BMulti%2BCloud%2BStrategy%2BV1%2Bver%2B4.png] [FlashSystem A9000/R Foundations] [All Flash Storage Foundations V2] From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 03/05/2019 11:57 PM Subject: [gpfsug-discuss] Migrating billions of files? 
Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ I?m looking at migration 3-4 Billion files, maybe 3PB of data between GPFS clusters. Most of the files are small - 60% 8K or less. Ideally I?d like to copy at least 15-20M files per day - ideally 50M. Any thoughts on how achievable this is? Or what to use? Either with AFM, mpifileutils, rsync.. other? Many of these files would be in 4k inodes. Destination is ESS. Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.gif Type: image/gif Size: 1852 bytes Desc: image001.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.gif Type: image/gif Size: 4377 bytes Desc: image002.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.gif Type: image/gif Size: 5094 bytes Desc: image003.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.gif Type: image/gif Size: 4747 bytes Desc: image004.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image005.gif Type: image/gif Size: 4558 bytes Desc: image005.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image006.gif Type: image/gif Size: 5094 bytes Desc: image006.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image007.gif Type: image/gif Size: 4787 bytes Desc: image007.gif URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image008.gif Type: image/gif Size: 5055 bytes Desc: image008.gif URL: From YARD at il.ibm.com Wed Mar 6 09:13:18 2019 From: YARD at il.ibm.com (Yaron Daniel) Date: Wed, 6 Mar 2019 11:13:18 +0200 Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another In-Reply-To: References: <827394bcbb794a0d9bd5bd8341fc1593@IN-CCI-D1S14.ads.iu.edu>, Message-ID: Hi U can also use today Aspera - which will replicate gpfs extended attr. Integration of IBM Aspera Sync with IBM Spectrum Scale: Protecting and Sharing Files Globally http://www.redbooks.ibm.com/redpieces/abstracts/redp5527.html?Open I used in the past the arsync - used for Sonas - i think this is now the Regards Yaron Daniel 94 Em Ha'Moshavot Rd Storage Architect ? IL Lab Services (Storage) Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Simon Thompson To: gpfsug main discussion list Date: 03/05/2019 11:39 PM Subject: Re: [gpfsug-discuss] suggestions for copying one GPFS file system into another Sent by: gpfsug-discuss-bounces at spectrumscale.org DDN also have a paid for product for doing moving of data (data flow) We found out about it after we did a massive data migration... I can't comment on it other than being aware of it. Sure your local DDN sales person can help. 
But if only IBM supported some sort of restripe to new block size, we wouldn't have to do this mass migration :-P Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Simon Thompson [S.J.Thompson at bham.ac.uk] Sent: 05 March 2019 16:38 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] suggestions forwar copying one GPFS file system into another I wrote a patch to mpifileutils which will copy gpfs attributes, but when we played with it with rsync, something was obviously still different about the attrs from each, so use with care. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Ratliff, John [jdratlif at iu.edu] Sent: 05 March 2019 16:21 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another We use a GPFS file system for our computing clusters and we?re working on moving to a new SAN. We originally tried AFM, but it didn?t seem to work very well. We tried to do a prefetch on a test policy scan of 100 million files, and after 24 hours it hadn?t pre-fetched anything. It wasn?t clear what was happening. Some smaller tests succeeded, but the NFSv4 ACLs did not seem to be transferred. Since then we started using rsync with the GPFS attrs patch. We have over 600 million files and 700 TB. I split up the rsync tasks with lists of files generated by the policy engine and we transferred the original data in about 2 weeks. Now we?re working on final synchronization. I?d like to use one of the delete options to remove files that were sync?d earlier and then deleted. This can?t be combined with the files-from option, so it?s harder to break up the rsync tasks. Some of the directories I?m running this against have 30-150 million files each. This can take quite some time with a single rsync process. I?m also wondering if any of my rsync options are unnecessary. I was using avHAXS and numeric-ids. I?m thinking the A (acls) and X (xatttrs) might be unnecessary with GPFS->GPFS. We?re only using NFSv4 GPFS ACLs. I don?t know if GPFS uses any xattrs that rsync would sync or not. Removing those two options removed several system calls, which should make it much faster, but I want to make sure I?m syncing correctly. Also, it seems there is a problem with the GPFS patch on rsync where it will always give an error trying to get GPFS attributes on a symlink, which means it doesn?t sync any symlinks when using that option. So you can rsync symlinks or GPFS attrs, but not both at the same time. This has lead to me running two rsyncs, one to get all files and one to get all attributes. Thanks for any ideas or suggestions. John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? 
Indiana University | https://urldefense.proofpoint.com/v2/url?u=http-3A__pti.iu.edu&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=Yz-c0LCo_QGBe4pgbJEr_zzSX4Q1ttDOaHYmcfLln5U&s=gNzUpbvNUfVteTqZ3zpzpbC4M1lQiopyrIfr46h4Okc&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=Yz-c0LCo_QGBe4pgbJEr_zzSX4Q1ttDOaHYmcfLln5U&s=pG-g3zRAtaMwcmwoabY4dvuI1j3jbLk-uGHZ6nz6TlU&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwIF-g&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=Yz-c0LCo_QGBe4pgbJEr_zzSX4Q1ttDOaHYmcfLln5U&s=pG-g3zRAtaMwcmwoabY4dvuI1j3jbLk-uGHZ6nz6TlU&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4746 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4557 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4786 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5054 bytes Desc: not available URL: From YARD at il.ibm.com Wed Mar 6 09:17:59 2019 From: YARD at il.ibm.com (Yaron Daniel) Date: Wed, 6 Mar 2019 11:17:59 +0200 Subject: [gpfsug-discuss] Migrating billions of files? In-Reply-To: <011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk> References: <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com> <011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk> Message-ID: Hi U can also use today Aspera - which will replicate gpfs extended attr. Integration of IBM Aspera Sync with IBM Spectrum Scale: Protecting and Sharing Files Globally http://www.redbooks.ibm.com/redpieces/abstracts/redp5527.html?Open Regards Yaron Daniel 94 Em Ha'Moshavot Rd Storage Architect ? IL Lab Services (Storage) Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: Simon Thompson To: gpfsug main discussion list Date: 03/06/2019 11:08 AM Subject: Re: [gpfsug-discuss] Migrating billions of files? Sent by: gpfsug-discuss-bounces at spectrumscale.org AFM doesn?t work well if you have dependent filesets though .. which we did for quota purposes. 
Simon From: on behalf of "YARD at il.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Wednesday, 6 March 2019 at 09:01 To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Migrating billions of files? Hi What permissions you have ? Do u have only Posix , or also SMB attributes ? If only posix attributes you can do the following: - rsync (which will work on different filesets/directories in parallel. - AFM (but in case you need rollback - it will be problematic) Regards Yaron Daniel 94 Em Ha'Moshavot Rd Storage Architect ? IL Lab Services (Storage) Petach Tiqva, 49527 IBM Global Markets, Systems HW Sales Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 03/05/2019 11:57 PM Subject: [gpfsug-discuss] Migrating billions of files? Sent by: gpfsug-discuss-bounces at spectrumscale.org I?m looking at migration 3-4 Billion files, maybe 3PB of data between GPFS clusters. Most of the files are small - 60% 8K or less. Ideally I?d like to copy at least 15-20M files per day - ideally 50M. Any thoughts on how achievable this is? Or what to use? Either with AFM, mpifileutils, rsync.. other? Many of these files would be in 4k inodes. Destination is ESS. Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=Bn1XE9uK2a9CZQ8qKnJE3Q&m=B2e9s5aGSXZvMOkd4ZPk_EIjfTloX7O_ExWsyR0RGP8&s=wwIfs_8RrX5Z7mGp2Mehj5z7z2yUhr0r-vO7TMyNUeE&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4376 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4746 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4557 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5093 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 4786 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 5054 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 1852 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 4377 bytes Desc: not available URL: 

From alvise.dorigo at psi.ch Wed Mar 6 09:24:08 2019
From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI))
Date: Wed, 6 Mar 2019 09:24:08 +0000
Subject: [gpfsug-discuss] Memory accounting for processes writing to GPFS
Message-ID: <83A6EEB0EC738F459A39439733AE80452682711C@MBX214.d.ethz.ch>

Hello to everyone,
Here at PSI we're observing something that in principle seems strange (at least to me). We run a Java application writing to disk by means of a standard AsynchronousFileChannel, whose internals I do not know. There are two instances of this application: one runs on a node writing to a local drive, the other writes to a GPFS-mounted filesystem (this node is part of the cluster, no remote-mounting).

What we see is that in the former case the application has a lower VIRT+RES memory sum and the OS shows very heavy cache usage; in the latter, the OS cache is negligible while VIRT+RES is very (even too) high, with VIRT especially high. So I wonder what the difference is... Writing into a GPFS-mounted filesystem, as far as I understand, means "talking" to the local mmfsd daemon, which fills up its own pagepool, and the system then asynchronously writes those pages out to the real pdisks. But why does the Linux kernel account so much memory to the process itself? And why is so much of that memory VIRT rather than RES?

thanks in advance,
Alvise
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mladen.portak at hr.ibm.com Wed Mar 6 09:49:13 2019
From: mladen.portak at hr.ibm.com (Mladen Portak)
Date: Wed, 6 Mar 2019 10:49:13 +0100
Subject: [gpfsug-discuss] Question about inodes incrise
Message-ID: 

Dear all, is the process of increasing inodes disruptive?

Thank you

Mladen Portak
Lab Service SEE Storage Consultant
mladen.portak at hr.ibm.com
+385 91 6308 293
IBM Hrvatska d.o.o. za proizvodnju i trgovinu
Miramarska 23, 10 000 Zagreb, Hrvatska
Registered with the Commercial Court in Zagreb under no. 080011422; share capital HRK 788,000.00, paid in full; Director: Željka Tišić
Bank account at RAIFFEISENBANK AUSTRIA d.d. Zagreb, Magazinska cesta 69, 10000 Zagreb, Hrvatska
IBAN: HR5424840081100396574 (SWIFT RZBHHR2X); OIB 43331467622
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chris.schlipalius at pawsey.org.au Wed Mar 6 09:56:31 2019
From: chris.schlipalius at pawsey.org.au (Chris Schlipalius)
Date: Wed, 06 Mar 2019 17:56:31 +0800
Subject: [gpfsug-discuss] Migrating billions of files?
Message-ID: <8D4D1060-A52E-4A7B-AE2F-25AD44FF141A@pawsey.org.au> Hi Bob, so Simon has hit the nail on the head. So it?s a challenge, we used dcp with multiple parallel threads per nsd with mmdsh - 2PB and millions of files, it?s worth a test as it does look after xattribs, but test it. See https://github.com/hpc/dcp Test the preserve: -p, --preserve Preserve the original files' owner, group, permissions (including the setuid and setgid bits), time of last modification and time of last access. In case duplication of owner or group fails, the setuid and setgid bits are cleared. ------- We migrated between 12K storage FS a few years back. My colleague also has tested https://www.nersc.gov/users/storage-and-file-systems/transferring-data/bbcp/ or http://www.slac.stanford.edu/~abh/bbcp/ It?s excellent I hear with xattribs and recursive small files copy. I steer clear of rsync, different versions do not preserve xattribs and this is a bit of an issue some have found Regards, Chris Schlipalius Team Lead, Data Storage Infrastructure, Data & Visualisation, Pawsey Supercomputing Centre (CSIRO) 13 Burvill Court Kensington WA 6151 Australia From jake.carroll at uq.edu.au Wed Mar 6 11:06:49 2019 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Wed, 6 Mar 2019 11:06:49 +0000 Subject: [gpfsug-discuss] SLURM scripts/policy for data movement into a flash pool? In-Reply-To: References: Message-ID: Hi Scale-folk. I have an IBM ESS GH14S building block currently configured for my HPC workloads. I've got about 1PB of /scratch filesystem configured in mechanical spindles via GNR and about 20TB of SSD/flash sitting in another GNR filesystem at the moment. My intention is to destroy that stand-alone flash filesystem eventually and use storage pools coupled with GPFS policy to warm up workloads into that flash storage: https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_storagepool.htm A little dated, but that kind of thing. Does anyone have any experience in this space in using flash storage inside a pool with pre/post flight SLURM scripts to puppeteer GPFS policy to warm data up? I had a few ideas for policy construction around file size, file count, file access intensity. Someone mentioned heat map construction and mmdiag --iohist to me the other day. Could use some background there. If anyone has any SLURM specific integration tips for the scheduler or pre/post flight bits for SBATCH, it'd be really very much appreciated. This array really does fly along and surpassed my expectations - but, I want to get the most out of it that I can for my users - and I think storage pool automation and good file placement management is going to be an important part of that. Thank you. -jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Wed Mar 6 11:13:50 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 6 Mar 2019 11:13:50 +0000 Subject: [gpfsug-discuss] Question about inodes incrise In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Wed Mar 6 11:20:11 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 6 Mar 2019 11:20:11 +0000 Subject: [gpfsug-discuss] Migrating billions of files? In-Reply-To: References: , <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com><011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Image._1_0D9491980D948BE4003315D3C22583B5.gif Type: image/gif Size: 1851 bytes Desc: not available URL: 

From abeattie at au1.ibm.com Wed Mar 6 11:22:53 2019
From: abeattie at au1.ibm.com (Andrew Beattie)
Date: Wed, 6 Mar 2019 11:22:53 +0000
Subject: [gpfsug-discuss] Migrating billions of files?
In-Reply-To: 
References: , , <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com><011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk>
Message-ID: 

An HTML attachment was scrubbed...
URL: 

From Robert.Oesterlin at nuance.com Wed Mar 6 12:44:24 2019
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Wed, 6 Mar 2019 12:44:24 +0000
Subject: [gpfsug-discuss] Follow-up: migrating billions of files
Message-ID: 

Some of you had questions to my original post.
More information:

Source:
- Files are straight GPFS/Posix - no extended NFSv4 ACLs
- A solution that requires $'s to be spent on software (i.e., Aspera) isn't a very viable option
- Both source and target clusters are in the same DC
- Source is stand-alone NSD servers (bonded 10g-E) and 8 Gb FC SAN storage
- Approx 40 file systems, a few large ones with 300M-400M files each, others smaller - no independent filesets
- Migration must pose minimal disruption to existing users

Target architecture is a small number of file systems (2-3) on ESS with independent filesets
- Target (ESS) will have multiple 40gb-E links on each NSD server (GS4)

My current thinking is AFM with a pre-populate of the file space and switch the clients over to have them pull data they need (most of the data is older and less active) and then let AFM populate the rest in the background.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jjdoherty at yahoo.com Wed Mar 6 12:59:23 2019
From: jjdoherty at yahoo.com (Jim Doherty)
Date: Wed, 6 Mar 2019 12:59:23 +0000 (UTC)
Subject: [gpfsug-discuss] Memory accounting for processes writing to GPFS
In-Reply-To: <83A6EEB0EC738F459A39439733AE80452682711C@MBX214.d.ethz.ch>
References: <83A6EEB0EC738F459A39439733AE80452682711C@MBX214.d.ethz.ch>
Message-ID: <410609032.929267.1551877163983@mail.yahoo.com>

For any process with a large number of threads the VMM size has become an imaginary number ever since the glibc change to allocate a heap per thread. I look to /proc/$pid/status to find the memory used by a proc: RSS + Swap + kernel page tables.

Jim

On Wednesday, March 6, 2019, 4:25:48 AM EST, Dorigo Alvise (PSI) wrote:

Hello to everyone, Here at PSI we're observing something that in principle seems strange (at least to me). We run a Java application writing to disk by means of a standard AsynchronousFileChannel, whose internals I do not know. There are two instances of this application: one runs on a node writing to a local drive, the other writes to a GPFS-mounted filesystem (this node is part of the cluster, no remote-mounting). What we see is that in the former case the application has a lower VIRT+RES memory sum and the OS shows very heavy cache usage; in the latter, the OS cache is negligible while VIRT+RES is very (even too) high, with VIRT especially high. So I wonder what the difference is... Writing into a GPFS-mounted filesystem, as far as I understand, means "talking" to the local mmfsd daemon, which fills up its own pagepool, and the system then asynchronously writes those pages out to the real pdisks. But why does the Linux kernel account so much memory to the process itself? And why is so much of that memory VIRT rather than RES?

thanks in advance,
Alvise
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From UWEFALKE at de.ibm.com Wed Mar 6 13:13:16 2019 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 6 Mar 2019 14:13:16 +0100 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: Message-ID: Hi, in that case I'd open several tar pipes in parallel, maybe using directories carefully selected, like tar -c | ssh "tar -x" I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but along these lines might be a good efficient method. target_hosts should be all nodes haveing the target file system mounted, and you should start those pipes on the nodes with the source file system. It is best to start with the largest directories, and use some masterscript to start the tar pipes controlled by semaphores to not overload anything. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 06/03/2019 13:44 Subject: [gpfsug-discuss] Follow-up: migrating billions of files Sent by: gpfsug-discuss-bounces at spectrumscale.org Some of you had questions to my original post. More information: Source: - Files are straight GPFS/Posix - no extended NFSV4 ACLs - A solution that requires $?s to be spent on software (ie, Aspera) isn?t a very viable option - Both source and target clusters are in the same DC - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage - Approx 40 file systems, a few large ones with 300M-400M files each, others smaller - no independent file sets - migration must pose minimal disruption to existing users Target architecture is a small number of file systems (2-3) on ESS with independent filesets - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) My current thinking is AFM with a pre-populate of the file space and switch the clients over to have them pull data they need (most of the data is older and less active) and them let AFM populate the rest in the background. 
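For the AFM route described in the quoted plan, the cache-side commands on the new cluster look roughly like the sketch below. This is only an outline: the device, fileset and target names are invented, the gpfs:// target assumes the old file system is remote-mounted on the ESS cluster, and the --directory form of prefetch is the newer (5.0.2-era) variant, so check the mmafmctl options for your release.

    # Read-only AFM cache fileset on the new file system, backed by the old one.
    mmcrfileset essfs labdata -p afmTarget=gpfs:///gpfs/oldfs/labdata \
        -p afmMode=ro --inode-space new
    mmlinkfileset essfs labdata -J /gpfs/essfs/labdata

    # Pre-populate (warm) the cache in the background.
    mmafmctl essfs prefetch -j labdata --directory /gpfs/essfs/labdata

    # Once most data is cached, switch the mode before cutting users over
    # (mode conversion may require unlinking/relinking the fileset).
    mmchfileset essfs labdata -p afmMode=independent-writer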
Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=fTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8&m=J5RpIj-EzFyU_dM9I4P8SrpHMikte_pn9sbllFcOvyM&s=fEwDQyDSL7hvOVPbg_n8o_LDz-cLqSI6lQtSzmhaSoI&e= From TOMP at il.ibm.com Wed Mar 6 13:14:47 2019 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 6 Mar 2019 07:14:47 -0600 Subject: [gpfsug-discuss] Memory accounting for processes writing to GPFS In-Reply-To: <410609032.929267.1551877163983@mail.yahoo.com> References: <83A6EEB0EC738F459A39439733AE80452682711C@MBX214.d.ethz.ch> <410609032.929267.1551877163983@mail.yahoo.com> Message-ID: It might be the case that AsynchronousFileChannel is actually doing mmap access to the files. Thus, the memory management will be completely different with GPFS in compare to local fs. Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Jim Doherty To: gpfsug main discussion list Date: 06/03/2019 06:59 Subject: Re: [gpfsug-discuss] Memory accounting for processes writing to GPFS Sent by: gpfsug-discuss-bounces at spectrumscale.org For any process with a large number of threads the VMM size has become an imaginary number ever since the glibc change to allocate a heap per thread. I look to /proc/$pid/status to find the memory used by a proc RSS + Swap + kernel page tables. Jim On Wednesday, March 6, 2019, 4:25:48 AM EST, Dorigo Alvise (PSI) wrote: Hello to everyone, Here a PSI we're observing something that in principle seems strange (at least to me). We run a Java application writing into disk by mean of a standard AsynchronousFileChannel, whose I do not the details. There are two instances of this application: one runs on a node writing on a local drive, the other one runs writing on a GPFS mounted filesystem (this node is part of the cluster, no remote-mounting). What we do see is that in the former the application has a lower sum VIRT+RES memory and the OS shows a really big cache usage; in the latter, OS's cache is negligible while VIRT+RES is very (even too) high (with VIRT very high). So I wonder what is the difference... Writing into a GPFS mounted filesystem, as far as I understand, implies "talking" to the local mmfsd daemon which fills up its own pagepool... and then the system will asynchronously handle these pages to be written on real pdisk. But why the Linux kernel accounts so much memory to the process itself ? And why this large amount of memory is much more VIRT than RES ? thanks in advance, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=mLPyKeOa1gNDrORvEXBgMw&m=cm3DTOcac__Y20DdtIZcwEXYG9GqlDxlHFTLeSAUOdE&s=hxak8mqRwAQuN7BaF-B9gvTQu1PGnCFF8am1GvMu3bI&e= -------------- next part -------------- An HTML attachment was scrubbed... 
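If you want to test the mmap hypothesis on a running JVM, the kernel's per-process maps are enough to see it (a quick sketch; /gpfs01 stands in for the real mount point):

    # File-backed mappings that live on the GPFS mount; many large entries
    # here would explain a huge VIRT with a comparatively small RES.
    pid=$1
    grep ' /gpfs01/' /proc/"$pid"/maps | awk '{ print $1, $NF }'
    # /proc/<pid>/smaps shows per-mapping Size vs. Rss if more detail is needed.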
URL: From makaplan at us.ibm.com Wed Mar 6 15:01:57 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 6 Mar 2019 10:01:57 -0500 Subject: [gpfsug-discuss] Migrating billions of files? mmfind ... mmxcp In-Reply-To: References: , , <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com><011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk> Message-ID: mmxcp may be in samples/ilm if not, perhaps we can put it on an approved file sharing service ... + mmxcp script, for use with mmfind ... -xargs mmxcp ... Which makes parallelized file copy relatively easy and super fast! Usage: /gh/bin/mmxcp -t target -p strip_count source_pathname1 source_pathname2 ... Run "cp" in a mmfind ... -xarg ... pipeline, e.g. mmfind -polFlags '-N all -g /gpfs/tmp' /gpfs/source -gpfsWeight DIRECTORY_HASH -xargs mmxcp -t /target -p 2 Options: -t target_path : Copy files to this path. -p strip_count : Remove this many directory names from the pathnames of the source files. -a : pass -a to cp -v : pass -v to cp -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 21994 bytes Desc: not available URL: From S.J.Thompson at bham.ac.uk Wed Mar 6 15:07:09 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Wed, 6 Mar 2019 15:07:09 +0000 Subject: [gpfsug-discuss] Migrating billions of files? mmfind ... mmxcp In-Reply-To: References: , , <4D433B18-3B14-4DFB-8954-868E67DA566D@nuance.com><011E924E-9FE6-4049-94B5-2D7EEB659D86@bham.ac.uk> , Message-ID: Last time this was mentioned, it doesn't do ACLs? Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of makaplan at us.ibm.com [makaplan at us.ibm.com] Sent: 06 March 2019 15:01 To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Subject: Re: [gpfsug-discuss] Migrating billions of files? mmfind ... mmxcp mmxcp may be in samples/ilm if not, perhaps we can put it on an approved file sharing service ... + mmxcp script, for use with mmfind ... -xargs mmxcp ... Which makes parallelized file copy relatively easy and super fast! Usage: /gh/bin/mmxcp -t target -p strip_count source_pathname1 source_pathname2 ... Run "cp" in a mmfind ... -xarg ... pipeline, e.g. mmfind -polFlags '-N all -g /gpfs/tmp' /gpfs/source -gpfsWeight DIRECTORY_HASH -xargs mmxcp -t /target -p 2 Options: -t target_path : Copy files to this path. -p strip_count : Remove this many directory names from the pathnames of the source files. -a : pass -a to cp -v : pass -v to cp [Marc A Kaplan] -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.gif Type: image/gif Size: 21994 bytes Desc: ATT00001.gif URL: From oehmes at gmail.com Wed Mar 6 15:30:31 2019 From: oehmes at gmail.com (Sven Oehme) Date: Wed, 06 Mar 2019 07:30:31 -0800 Subject: [gpfsug-discuss] Question about inodes incrise In-Reply-To: References: Message-ID: <8140D183-2D50-4FF9-8BEA-329F4C9A5977@gmail.com> While Fred is right, in most cases you shouldn?t see this, under heavy burst create workloads before 5.0.2 you can even trigger out of space errors even you have plenty of space in the filesystem (very hard to reproduce so unlikely to hit for a normal enduser). to address the issues there have been significant enhancements in this area in 5.0.2. 
prior the changes expansions under heavy load many times happened in the foreground (means the application waits for the expansion to finish before it proceeds) especially if many nodes create lots of files in parallel. Since the changes you now see messages on the filesystem manager in its mmfs log when a expansion happens with details including if somebody had to wait for it or not. Sven From: on behalf of Mladen Portak Reply-To: gpfsug main discussion list Date: Wednesday, March 6, 2019 at 1:49 AM To: Subject: [gpfsug-discuss] Question about inodes incrise Dear. is it process of increasing inodes disruptiv? Thank You Mladen Portak Lab Service SEE Storage Consultant mladen.portak at hr.ibm.com +385 91 6308 293 IBM Hrvatska d.o.o. za proizvodnju i trgovinu Miramarska 23, 10 000 Zagreb, Hrvatska Upisan kod Trgova?kog suda u Zagrebu pod br. 080011422 Temeljni kapital: 788,000.00 kuna - upla?en u cijelosti Direktor: ?eljka Ti?i? ?iro ra?un kod: RAIFFEISENBANK AUSTRIA d.d. Zagreb, Magazinska cesta 69, 10000 Zagreb, Hrvatska IBAN: HR5424840081100396574 (SWIFT RZBHHR2X); OIB 43331467622 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From eboyd at us.ibm.com Wed Mar 6 15:41:54 2019 From: eboyd at us.ibm.com (Edward Boyd) Date: Wed, 6 Mar 2019 15:41:54 +0000 Subject: [gpfsug-discuss] gpfsug-discuss mmxcp In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Wed Mar 6 15:49:32 2019 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 6 Mar 2019 10:49:32 -0500 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: Message-ID: <66B1E0A0-723A-4D08-B6D1-D99392E3DE71@ulmer.org> In the case where tar -C doesn?t work, you can always use a subshell (I do this regularly): tar -cf . | ssh someguy at otherhost "(cd targetdir; tar -xvf - )" Only use -v on one end. :) Also, for parallel work that?s not designed that way, don't underestimate the -P option to GNU and BSD xargs! With the amount of stuff to be copied, making sure a subjob doesn?t finish right after you go home leaving a slot idle for several hours is a medium deal. In Bob?s case, however, treating it like a DR exercise where users "restore" their own files by accessing them (using AFM instead of HSM) is probably the most convenient. -- Stephen > On Mar 6, 2019, at 8:13 AM, Uwe Falke > wrote: > > Hi, in that case I'd open several tar pipes in parallel, maybe using > directories carefully selected, like > > tar -c | ssh "tar -x" > > I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but > along these lines might be a good efficient method. target_hosts should be > all nodes haveing the target file system mounted, and you should start > those pipes on the nodes with the source file system. > It is best to start with the largest directories, and use some > masterscript to start the tar pipes controlled by semaphores to not > overload anything. > > > > Mit freundlichen Gr??en / Kind regards > > > Dr. Uwe Falke > > IT Specialist > High Performance Computing Services / Integrated Technology Services / > Data Center Services > ------------------------------------------------------------------------------------------------------------------------------------------- > IBM Deutschland > Rathausstr. 
7 > 09111 Chemnitz > Phone: +49 371 6978 2165 > Mobile: +49 175 575 2877 > E-Mail: uwefalke at de.ibm.com > ------------------------------------------------------------------------------------------------------------------------------------------- > IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: > Thomas Wolter, Sven Schoo? > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > HRB 17122 > > > > > From: "Oesterlin, Robert" > > To: gpfsug main discussion list > > Date: 06/03/2019 13:44 > Subject: [gpfsug-discuss] Follow-up: migrating billions of files > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Some of you had questions to my original post. More information: > > Source: > - Files are straight GPFS/Posix - no extended NFSV4 ACLs > - A solution that requires $?s to be spent on software (ie, Aspera) isn?t > a very viable option > - Both source and target clusters are in the same DC > - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage > - Approx 40 file systems, a few large ones with 300M-400M files each, > others smaller > - no independent file sets > - migration must pose minimal disruption to existing users > > Target architecture is a small number of file systems (2-3) on ESS with > independent filesets > - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) > > My current thinking is AFM with a pre-populate of the file space and > switch the clients over to have them pull data they need (most of the data > is older and less active) and them let AFM populate the rest in the > background. > > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=fTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8&m=J5RpIj-EzFyU_dM9I4P8SrpHMikte_pn9sbllFcOvyM&s=fEwDQyDSL7hvOVPbg_n8o_LDz-cLqSI6lQtSzmhaSoI&e= > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Wed Mar 6 15:59:55 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Wed, 6 Mar 2019 10:59:55 -0500 Subject: [gpfsug-discuss] gpfsug-discuss mmxcp In-Reply-To: References: Message-ID: Basically yes. If you can't find the scripts in 4.2 samples... You can copy them over from 5.x to the 4.2 system... Should work except perhaps for some of the more esoteric find conditionals... From: "Edward Boyd" To: gpfsug-discuss at spectrumscale.org Date: 03/06/2019 10:42 AM Subject: Re: [gpfsug-discuss] gpfsug-discuss mmxcp Sent by: gpfsug-discuss-bounces at spectrumscale.org Curious if this command would be suitable for migration from Scale 4.2 file system to 5.x file system? What is lost or left behind? Edward L. 
Boyd ( Ed ), Client Technical Specialist IBM Systems Storage Solutions US Federal 407-271-9210 Office / Cell / Office / Text eboyd at us.ibm.com email -----gpfsug-discuss-bounces at spectrumscale.org wrote: ----- To: gpfsug-discuss at spectrumscale.org From: gpfsug-discuss-request at spectrumscale.org Sent by: gpfsug-discuss-bounces at spectrumscale.org Date: 03/06/2019 10:03AM Subject: gpfsug-discuss Digest, Vol 86, Issue 11 Send gpfsug-discuss mailing list submissions to gpfsug-discuss at spectrumscale.org To subscribe or unsubscribe via the World Wide Web, visit http://gpfsug.org/mailman/listinfo/gpfsug-discuss or, via email, send a message with subject or body 'help' to gpfsug-discuss-request at spectrumscale.org You can reach the person managing the list at gpfsug-discuss-owner at spectrumscale.org When replying, please edit your Subject line so it is more specific than "Re: Contents of gpfsug-discuss digest..." Today's Topics: 1. Re: Migrating billions of files? mmfind ... mmxcp (Marc A Kaplan) ---------------------------------------------------------------------- Message: 1 Date: Wed, 6 Mar 2019 10:01:57 -0500 From: "Marc A Kaplan" To: gpfsug main discussion list Cc: gpfsug-discuss-bounces at spectrumscale.org Subject: Re: [gpfsug-discuss] Migrating billions of files? mmfind ... mmxcp Message-ID: < OF18FDF6D8.C850134F-ON852583B5.005243D0-852583B5.0052961B at notes.na.collabserv.com > Content-Type: text/plain; charset="us-ascii" mmxcp may be in samples/ilm if not, perhaps we can put it on an approved file sharing service ... + mmxcp script, for use with mmfind ... -xargs mmxcp ... Which makes parallelized file copy relatively easy and super fast! Usage: /gh/bin/mmxcp -t target -p strip_count source_pathname1 source_pathname2 ... Run "cp" in a mmfind ... -xarg ... pipeline, e.g. mmfind -polFlags '-N all -g /gpfs/tmp' /gpfs/source -gpfsWeight DIRECTORY_HASH -xargs mmxcp -t /target -p 2 Options: -t target_path : Copy files to this path. -p strip_count : Remove this many directory names from the pathnames of the source files. -a : pass -a to cp -v : pass -v to cp -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20190306/0361a3dd/attachment.html > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 21994 bytes Desc: not available URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20190306/0361a3dd/attachment.gif > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 86, Issue 11 ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=UpQuMLyiY5RYAlgIz4tU_Ou1f0vzJQeW3YhaTsUNNjg&s=UG74CyaXta-G7ib_KTNz0_ypCbmqWveCUFnV-oPaDYY&e= -------------- next part -------------- An HTML attachment was scrubbed... 
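On Simon's earlier point that mmxcp only drives cp and so does not carry GPFS ACLs: one hedged workaround is to copy the ACLs in a second pass with mmgetacl/mmputacl, roughly as below. This is a sketch only; it assumes the same relative paths exist on both sides, has no error handling, and at "billions of files" scale you would drive it from the same policy/mmfind scan rather than a plain find.

    #!/bin/sh
    # Copy the ACL of every file under $SRC onto its already-copied twin under $TGT.
    SRC=/gpfs/source
    TGT=/target
    find "$SRC" | while read -r f; do
        [ "$f" = "$SRC" ] && continue
        rel=${f#$SRC/}
        mmgetacl -o /tmp/acl.$$ "$f" && mmputacl -i /tmp/acl.$$ "$TGT/$rel"
    done
    rm -f /tmp/acl.$$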
URL: From jmsing at us.ibm.com Wed Mar 6 16:37:01 2019 From: jmsing at us.ibm.com (John M Sing) Date: Wed, 6 Mar 2019 11:37:01 -0500 Subject: [gpfsug-discuss] Question about inodes increase - how to increase non-disruptively - orig question by Mladen Portak on 3/6/19 - 09:46 GMT In-Reply-To: References: Message-ID: Upon further thought, it occurs to me that Spectrum Scale V5's introduction of variable sub-blocks must by necessity have changed the inode calculation that I describe below. I would be interested to know how exactly in Spectrum Scale V5 formatted file systems, how one may need to change the information I document below. I would imagine the pre-V5 file system format probably still uses the inode allocation schema that I document below. John Sing IBM Offering Evangelist, Spectrum Scale, ESS Venice FL From: John M Sing/Tampa/IBM To: gpfsug-discuss at spectrumscale.org Date: 03/06/2019 11:23 AM Subject: Question about inodes increase - how to increase non-disruptively - orig question by Mladen Portak on 3/6/19 - 09:46 GMT Hi, all, Mladen, (This is my first post to the GPFSug-discuss list. I am IBMer, am the IBM worldwide technical support Evangelist on Spectrum Scale/ESS. I am based in Florida. Apologies if my attachment is not permitted or if I did not reply properly to tie my reply to the original poster - pls let me know if there are more instructions or rules for using GPFSug-discuss (I could not find any such guidelines)). ------------- Mladen, Increasing or changing inodes in a GPFS/Spectrum Scale file system can be done non-disruptively, within the boundaries of how GPFS / Spectrum Scale works. I wrote and delivered the following presentation on this topic back in 2013 in the GPFS V4.1 timeframe. While older IBM technologies SONAS/V7000 Unified are the reason the preso was written, and the commands shown are from those now-withdrawn products, the GPFS concepts involved as far as I know have not changed, and you can simply use the GPFS/Spectrum Scale equivalent commands such as mmcrfs, mmcrfileset, mmchfileset, etc to allocate, add, or change inodes non-disruptively, within the boundaries of how GPFS / Spectrum Scale works. There's lots of diagrams. [attachment "sDS05_John_Sing_SONAS_V7000_GPFS_Unified_Independent_Filesets_Inode_Planning.ppt" deleted by John M Sing/Tampa/IBM] The PPT is handy because there is animation in Slideshow mode to better explain (at least in my mind) how GPFS allocates inodes, and how you extend or under what circumstances you can change the number of inodes in either a file system or an independent file set. Here is a Box link to download this 8.7MB preso, should the attachment not come thru or be too big for the list. https://ibm.box.com/shared/static/phn9dypcdbzyn2ei6hy2hc79lgmch904.ppt This Box link, which anyone who has the link can use to download, will expire on Dec 31, 2019. If you are reading this post past that date, just email me and I will be happy to reshare the preso with you. I wrote this up because I myself needed to remember inode allocation especially in light of how GPFS independent filesets works, should I ever need to refer back to it. Happy to hear feedback on the above preso from all of you out there. Corrections/comments/update suggestions welcome. Regards, John M. 
Sing Offering Evangelist, IBM Spectrum Scale, Elastic Storage Server, Spectrum NAS Venice, Florida https://www.linkedin.com/in/johnsing/ jmsing at us.ibm.com office: 941-492-2998 ------------------------------------------------------------------------------------------------------------------------------------------------------------- Mladen Portak?mladen.portak at hr.ibm.com? wrote on Wed Mar 6 09:49:13 GMT 2019 Dear. is it process of increasing inodes disruptive? Thank You Mladen Portak Lab Service SEE Storage Consultant mladen.portak at hr.ibm.com +385 91 6308 293 IBM Hrvatska d.o.o. za proizvodnju i trgovinu Miramarska 23, 10 000 Zagreb, Hrvatska Upisan kod Trgova?kog suda u Zagrebu pod br. 080011422 Temeljni kapital: 788,000.00 kuna - upla?en u cijelosti Direktor: ?eljka Ti?i? ?iro ra?un kod: RAIFFEISENBANK AUSTRIA d.d. Zagreb, Magazinska cesta 69, 10000 Zagreb, Hrvatska IBAN: HR5424840081100396574 (SWIFT RZBHHR2X); OIB 43331467622 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jmsing at us.ibm.com Wed Mar 6 16:42:11 2019 From: jmsing at us.ibm.com (John M Sing) Date: Wed, 6 Mar 2019 11:42:11 -0500 Subject: [gpfsug-discuss] Fw: Question about inodes increase - how to increase non-disruptively - orig question by Mladen Portak on 3/6/19 - 09:46 GMT Message-ID: Hi, all, Mladen, (This is my first post to the GPFSug-discuss list. I am IBMer, am the IBM worldwide technical support Evangelist on Spectrum Scale/ESS. I am based in Florida. Apologies if my attachment URL is not permitted or if I did not reply properly to tie my reply to the original poster - pls let me know if there are more instructions or rules for using GPFSug-discuss (I could not find any such guidelines)). ------------- Mladen, Increasing or changing inodes in a GPFS/Spectrum Scale file system can be done non-disruptively, within the boundaries of how GPFS / Spectrum Scale works. I wrote and delivered the following presentation on this topic back in 2013 in the GPFS V4.1 timeframe. While older IBM technologies SONAS/V7000 Unified are the reason the preso was written, and the commands shown are from those now-withdrawn products, the GPFS concepts involved as far as I know have not changed, and you can simply use the GPFS/Spectrum Scale equivalent commands such as mmcrfs, mmcrfileset, mmchfileset, etc to allocate, add, or change inodes non-disruptively, within the boundaries of how GPFS / Spectrum Scale works. There's lots of diagrams. Here is a Box link to download this 8.7MB preso which anyone who has the link can use to download : https://ibm.box.com/shared/static/phn9dypcdbzyn2ei6hy2hc79lgmch904.ppt This should apply to any Spectrum Scale / GPFS file system this is the the Spectrum Scale V4.x or older format. I would imagine a file system with the newer Scale V5 variable sub-blocks has a modification to the above schema. I'd be interested to know what that is and how V5 users should modify the above diagrams/information. The PPT is handy because there is animation in Slideshow mode to better explain (at least in my mind) how GPFS / Spectrum Scale V4.x and older allocates inodes, and how you extend or under what circumstances you can change the number of inodes in either a file system or an independent file set. This Box link, will expire on Dec 31, 2019. If you are reading this post past that date, just email me and I will be happy to reshare the preso with you. 
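For reference, the pre-V5 mechanics described above come down to a handful of commands, all of which can be run with the file system online (a sketch; the device, fileset name and numbers are examples, and the value after the colon is an optional preallocation):

    # Current usage and limits.
    mmdf gpfs01 -F            # free/allocated inodes per inode space
    mmlsfileset gpfs01 -L     # per-fileset maximum and allocated inodes

    # Raise the limit of an independent fileset.
    mmchfileset gpfs01 labdata --inode-limit 4000000:1000000

    # Raise the file-system-wide (root fileset) limit.
    mmchfs gpfs01 --inode-limit 20000000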
I wrote this up because I myself needed to remember inode allocation especially in light of how GPFS independent filesets works, should I ever need to refer back to it. Happy to hear feedback on the above preso from all of you out there. Corrections/comments/update suggestions welcome. Regards, John M. Sing Offering Evangelist, IBM Spectrum Scale, Elastic Storage Server, Spectrum NAS Venice, Florida https://www.linkedin.com/in/johnsing/ jmsing at us.ibm.com office: 941-492-2998 ------------------------------------------------------------------------------------------------------------------------------------------------------------- Mladen Portak?mladen.portak at hr.ibm.com? wrote on Wed Mar 6 09:49:13 GMT 2019 Dear. is it process of increasing inodes disruptive? Thank You Mladen Portak Lab Service SEE Storage Consultant mladen.portak at hr.ibm.com +385 91 6308 293 IBM Hrvatska d.o.o. za proizvodnju i trgovinu Miramarska 23, 10 000 Zagreb, Hrvatska Upisan kod Trgova?kog suda u Zagrebu pod br. 080011422 Temeljni kapital: 788,000.00 kuna - upla?en u cijelosti Direktor: ?eljka Ti?i? ?iro ra?un kod: RAIFFEISENBANK AUSTRIA d.d. Zagreb, Magazinska cesta 69, 10000 Zagreb, Hrvatska IBAN: HR5424840081100396574 (SWIFT RZBHHR2X); OIB 43331467622 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alex at calicolabs.com Wed Mar 6 17:13:18 2019 From: alex at calicolabs.com (Alex Chekholko) Date: Wed, 6 Mar 2019 09:13:18 -0800 Subject: [gpfsug-discuss] SLURM scripts/policy for data movement into a flash pool? In-Reply-To: References: Message-ID: Hi, I have tried this before and I would like to temper your expectations. If you use a placement policy to allow users to write any files into your "small" pool (e.g. by directory), they will get E_NOSPC when your small pool fills up. And they will be confused because they can't see the pool configuration, they just see a large filesystem with lots of space. I think there may now be an "overflow" policy but it will only work for new files, not if someone keeps writing into an existing file in an existing pool. If you use a migration policy (even based on heat map) it is still a periodic scheduled data movement and not anything that happens "on the fly". Also, "fileheat" only gets updated at some interval anyway. If you use a migration policy to move data between pools, you may starve users of I/O which will confuse your users because suddenly things are slow. I think there is now a QOS way to throttle your data migration. I guess it depends on how much of your disk I/O throughput is not used; if your disks are already churning, migrations will just slow everything down. Think of it less like a cache layer and more like two separate storage locations. If a bunch of jobs want to read the same files from your big pool, it's probably faster to just have them read from the big pool directly rather than have some kind of prologue job to read the data from the big pool, write it into the small poool, then have the jobs read from the small pool. Also, my experience was with pool ratios of like 10%/90%, yours is more like 2%/98%. However, mine were with write-heavy workloads (typical university environment with quickly growing capacity utilization). Hope these anecdotes help. Also, it could be that things work a bit differently now in new versions. Regards, Alex On Wed, Mar 6, 2019 at 3:13 AM Jake Carroll wrote: > Hi Scale-folk. > > I have an IBM ESS GH14S building block currently configured for my HPC > workloads. 
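To make the placement-versus-migration distinction in Alex's reply concrete, a hedged sketch of the two policy pieces follows. Pool, fileset and path names are invented, FILE_HEAT only works once fileHeatPeriodMinutes is configured, and the mmapplypolicy invocation is what a SLURM prolog or a cron job would call; treat the rule details as a starting point rather than a recipe.

    # Placement policy: evaluated at create time, installed with mmchpolicy.
    cat > /tmp/placement.pol <<'EOF'
    RULE 'hot' SET POOL 'flash' FOR FILESET ('hotdata')
    RULE 'default' SET POOL 'data'
    EOF
    mmchpolicy gpfs01 /tmp/placement.pol

    # Migration policy: run periodically, e.g. from a SLURM prolog or cron.
    cat > /tmp/migrate.pol <<'EOF'
    RULE 'warm_up' MIGRATE FROM POOL 'data' WEIGHT(FILE_HEAT)
         TO POOL 'flash' LIMIT(90) WHERE KB_ALLOCATED < 1048576
    RULE 'cool_off' MIGRATE FROM POOL 'flash' THRESHOLD(80,60)
         WEIGHT(CURRENT_TIMESTAMP - ACCESS_TIME) TO POOL 'data'
    EOF
    mmapplypolicy gpfs01 -P /tmp/migrate.pol -N nsdNodes -g /gpfs01/tmp \
        --qos maintenance -I yes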
> > I've got about 1PB of /scratch filesystem configured in mechanical > spindles via GNR and about 20TB of SSD/flash sitting in another GNR > filesystem at the moment. My intention is to destroy that stand-alone flash > filesystem eventually and use storage pools coupled with GPFS policy to > warm up workloads into that flash storage: > > > https://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.3/com.ibm.spectrum.scale.v4r23.doc/bl1adv_storagepool.htm > > A little dated, but that kind of thing. > > Does anyone have any experience in this space in using flash storage > inside a pool with pre/post flight SLURM scripts to puppeteer GPFS policy > to warm data up? > > I had a few ideas for policy construction around file size, file count, > file access intensity. Someone mentioned heat map construction and mmdiag > --iohist to me the other day. Could use some background there. > > If anyone has any SLURM specific integration tips for the scheduler or > pre/post flight bits for SBATCH, it'd be really very much appreciated. > > This array really does fly along and surpassed my expectations - but, I > want to get the most out of it that I can for my users - and I think > storage pool automation and good file placement management is going to be > an important part of that. > > Thank you. > > -jc > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Wed Mar 6 12:55:15 2019 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Wed, 6 Mar 2019 12:55:15 +0000 Subject: [gpfsug-discuss] Question about inodes incrise In-Reply-To: References: , Message-ID: <75613CE6-602B-4792-9F01-E736E7AFF0EA@rutgers.edu> They hadn?t asked, but neither is the process of raising the maximum, which could be what they?re asking about (might be some momentary performance hit ? can?t recall, but I don?t believe it?s significant if so). -- ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Mar 6, 2019, at 06:14, Frederick Stock > wrote: No. It happens automatically and generally without notice to end users, that is they do not see any noticeable pause in operations. If you are asking the question because you are considering pre-allocating all of your inodes I would advise you not take that option. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Mladen Portak" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Question about inodes incrise Date: Wed, Mar 6, 2019 4:49 AM Dear. is it process of increasing inodes disruptiv? Thank You Mladen Portak Lab Service SEE Storage Consultant mladen.portak at hr.ibm.com +385 91 6308 293 IBM Hrvatska d.o.o. za proizvodnju i trgovinu Miramarska 23, 10 000 Zagreb, Hrvatska Upisan kod Trgova?kog suda u Zagrebu pod br. 080011422 Temeljni kapital: 788,000.00 kuna - upla?en u cijelosti Direktor: ?eljka Ti?i? ?iro ra?un kod: RAIFFEISENBANK AUSTRIA d.d. 
Zagreb, Magazinska cesta 69, 10000 Zagreb, Hrvatska IBAN: HR5424840081100396574 (SWIFT RZBHHR2X); OIB 43331467622 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Thu Mar 7 10:15:16 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Thu, 7 Mar 2019 10:15:16 +0000 Subject: [gpfsug-discuss] Memory accounting for processes writing to GPFS In-Reply-To: References: <83A6EEB0EC738F459A39439733AE80452682711C@MBX214.d.ethz.ch> <410609032.929267.1551877163983@mail.yahoo.com>, Message-ID: <83A6EEB0EC738F459A39439733AE80452682C54A@MBX214.d.ethz.ch> Thanks to all for clarification. A ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Tomer Perry [TOMP at il.ibm.com] Sent: Wednesday, March 06, 2019 2:14 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Memory accounting for processes writing to GPFS It might be the case that AsynchronousFileChannelis actually doing mmap access to the files. Thus, the memory management will be completely different with GPFS in compare to local fs. Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Jim Doherty To: gpfsug main discussion list Date: 06/03/2019 06:59 Subject: Re: [gpfsug-discuss] Memory accounting for processes writing to GPFS Sent by: gpfsug-discuss-bounces at spectrumscale.org ________________________________ For any process with a large number of threads the VMM size has become an imaginary number ever since the glibc change to allocate a heap per thread. I look to /proc/$pid/status to find the memory used by a proc RSS + Swap + kernel page tables. Jim On Wednesday, March 6, 2019, 4:25:48 AM EST, Dorigo Alvise (PSI) wrote: Hello to everyone, Here a PSI we're observing something that in principle seems strange (at least to me). We run a Java application writing into disk by mean of a standard AsynchronousFileChannel, whose I do not the details. There are two instances of this application: one runs on a node writing on a local drive, the other one runs writing on a GPFS mounted filesystem (this node is part of the cluster, no remote-mounting). What we do see is that in the former the application has a lower sum VIRT+RES memory and the OS shows a really big cache usage; in the latter, OS's cache is negligible while VIRT+RES is very (even too) high (with VIRT very high). So I wonder what is the difference... Writing into a GPFS mounted filesystem, as far as I understand, implies "talking" to the local mmfsd daemon which fills up its own pagepool... and then the system will asynchronously handle these pages to be written on real pdisk. But why the Linux kernel accounts so much memory to the process itself ? And why this large amount of memory is much more VIRT than RES ? 
thanks in advance, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Thu Mar 7 11:41:18 2019 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Thu, 7 Mar 2019 12:41:18 +0100 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: <66B1E0A0-723A-4D08-B6D1-D99392E3DE71@ulmer.org> References: <66B1E0A0-723A-4D08-B6D1-D99392E3DE71@ulmer.org> Message-ID: As for "making sure a subjob doesn't finish right after you go home leaving a slot idle for several hours ". That's the reason for the masterscript / control script / whatever. There would be a list of directories sorted to decreasing size, the master script would have a counter for each participating source host (a semaphore) and start as many parallel copy jobs, each with the currently topmost directory in the list, removing that directory (best possibly to an intermediary "in-work" list), counting down the semaphore on each start , unless 0. As soon as a job returns successfully, count up the semaphore, and if >0, start the next job, and so on. I suppose you can easily run about 8 to 12 such jobs per server (maybe best to use dedicated source server - dest server pairs). So, no worries about leaving at any time WRT jobs ending and idle job slots . of course, some precautions should be taken to ensure each job succeeds and gets repeated if not , and a lot of logging should take place to be sure you would know what's happened. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: Stephen Ulmer To: gpfsug main discussion list Date: 06/03/2019 16:55 Subject: Re: [gpfsug-discuss] Follow-up: migrating billions of files Sent by: gpfsug-discuss-bounces at spectrumscale.org In the case where tar -C doesn?t work, you can always use a subshell (I do this regularly): tar -cf . | ssh someguy at otherhost "(cd targetdir; tar -xvf - )" Only use -v on one end. :) Also, for parallel work that?s not designed that way, don't underestimate the -P option to GNU and BSD xargs! With the amount of stuff to be copied, making sure a subjob doesn?t finish right after you go home leaving a slot idle for several hours is a medium deal. In Bob?s case, however, treating it like a DR exercise where users "restore" their own files by accessing them (using AFM instead of HSM) is probably the most convenient. 
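Putting the semaphore idea and the xargs -P tip together, a minimal driver could look like the sketch below. Assumptions: GNU xargs, password-less ssh to a node that has the target file system mounted, and dirs.txt containing one absolute source directory per line sorted largest-first. Note that the pipe needs tar to write to stdout ("tar -cf - ..."), which the quoted one-liner above leaves out.

    #!/bin/sh
    # copydir.sh -- copy one directory tree to the target cluster.
    d="$1"                       # absolute source directory
    TARGET=essio1                # node with the new file system mounted
    TGTROOT=/gpfs/essfs
    tar -C "$(dirname "$d")" -cf - "$(basename "$d")" \
        | ssh "$TARGET" "tar -C $TGTROOT -xf -" \
        && echo "$(date) done $d" || echo "$(date) FAIL $d"

    # Driver: keep 8 tar pipes busy, refilling a slot as soon as one finishes.
    xargs -a dirs.txt -d '\n' -n 1 -P 8 ./copydir.sh >> /var/tmp/copy.log 2>&1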
-- Stephen On Mar 6, 2019, at 8:13 AM, Uwe Falke wrote: Hi, in that case I'd open several tar pipes in parallel, maybe using directories carefully selected, like tar -c | ssh "tar -x" I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but along these lines might be a good efficient method. target_hosts should be all nodes haveing the target file system mounted, and you should start those pipes on the nodes with the source file system. It is best to start with the largest directories, and use some masterscript to start the tar pipes controlled by semaphores to not overload anything. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 06/03/2019 13:44 Subject: [gpfsug-discuss] Follow-up: migrating billions of files Sent by: gpfsug-discuss-bounces at spectrumscale.org Some of you had questions to my original post. More information: Source: - Files are straight GPFS/Posix - no extended NFSV4 ACLs - A solution that requires $?s to be spent on software (ie, Aspera) isn?t a very viable option - Both source and target clusters are in the same DC - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage - Approx 40 file systems, a few large ones with 300M-400M files each, others smaller - no independent file sets - migration must pose minimal disruption to existing users Target architecture is a small number of file systems (2-3) on ESS with independent filesets - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) My current thinking is AFM with a pre-populate of the file space and switch the clients over to have them pull data they need (most of the data is older and less active) and them let AFM populate the rest in the background. 
Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=fTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8&m=J5RpIj-EzFyU_dM9I4P8SrpHMikte_pn9sbllFcOvyM&s=fEwDQyDSL7hvOVPbg_n8o_LDz-cLqSI6lQtSzmhaSoI&e= _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=fTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8&m=4gYLFpEqhJ4XD4RdqwClWf14hrSb2JKrH_EirNxZtuY&s=InZvoRUosC8y-cfwNsRiXvN3fujTLLf4U_uDvPGupoc&e= From jonathan.buzzard at strath.ac.uk Thu Mar 7 11:39:45 2019 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 07 Mar 2019 11:39:45 +0000 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: Message-ID: On Wed, 2019-03-06 at 12:44 +0000, Oesterlin, Robert wrote: > Some of you had questions to my original post. More information: > > Source: > - Files are straight GPFS/Posix - no extended NFSV4 ACLs > - A solution that requires $?s to be spent on software (ie, Aspera) > isn?t a very viable option > - Both source and target clusters are in the same DC > - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN > storage > - Approx 40 file systems, a few large ones with 300M-400M files each, > others smaller > - no independent file sets > - migration must pose minimal disruption to existing users > > Target architecture is a small number of file systems (2-3) on ESS > with independent filesets > - Target (ESS) will have multiple 40gb-E links on each NSD server > (GS4) > > My current thinking is AFM with a pre-populate of the file space and > switch the clients over to have them pull data they need (most of the > data is older and less active) and them let AFM populate the rest in > the background. > As it's not been mentioned yet "dsmc restore" or equivalent depending on your backup solution. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From stefan.dietrich at desy.de Thu Mar 7 12:05:13 2019 From: stefan.dietrich at desy.de (Dietrich, Stefan) Date: Thu, 7 Mar 2019 13:05:13 +0100 (CET) Subject: [gpfsug-discuss] CES Ganesha netgroup caching? In-Reply-To: References: <2121724779.6221169.1551340616921.JavaMail.zimbra@desy.de> Message-ID: <1829516345.7374115.1551960313390.JavaMail.zimbra@desy.de> Hi Malhal, thanks for the quick answer! Regards, Stefan ----- Original Message ----- > From: "Malahal R Naineni" > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Sent: Thursday, February 28, 2019 1:33:50 PM > Subject: Re: [gpfsug-discuss] CES Ganesha netgroup caching? > Ganesha maintains negative and positive cache. Maybe, we should remove negative > cache. A cache entry (either negative or positive) auto expires after 30 > minutes. "ganesha_mgr purge netgroup" removes the entire netgroup cache. > So, if you add a host to the netgroup, it should be able to access exports > immediately provided the host never tried to access in the past. 
If it did, > then it would have been part of negative cache entry and you may need to wait > for 30 minutes. If you remove a host from a netgroups, it may take about 30 > minutes to revoke the access. > Added, "ganesha_mgr purge netgroup" to purge the cache to make the cache > consistent with the actual configuration. It needs to be run on each node. > Regards, Malahal. > > > ----- Original message ----- > From: "Dietrich, Stefan" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug-discuss at spectrumscale.org > Cc: > Subject: [gpfsug-discuss] CES Ganesha netgroup caching? > Date: Thu, Feb 28, 2019 1:36 PM > Hi, > > I am currently playing around with LDAP netgroups for NFS exports via CES. > However, I could not figure out how long Ganesha is caching the netgroup > entries? > > There is definitely some caching, as adding a host to the netgroup does not > immediately grant access to the share. > A "getent netgroup " on the CES node returns the correct result, so > this is not some other caching effect. > > Resetting the cache via "ganesha_mgr purge netgroup" works, but is probably not > officially supported. > > The CES nodes are running with GPFS 5.0.2.3 and > gpfs.nfs-ganesha-2.5.3-ibm030.01.el7. > CES authentication is set to user-defined, the nodes just use SSSD with a > rfc2307bis LDAP server. > > Regards, > Stefan > > -- > ------------------------------------------------------------------------ > Stefan Dietrich Deutsches Elektronen-Synchrotron (IT-Systems) > Ein Forschungszentrum der Helmholtz-Gemeinschaft > Notkestr. 85 > phone: +49-40-8998-4696 22607 Hamburg > e-mail: stefan.dietrich at desy.de Germany > ------------------------------------------------------------------------ > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > [ http://gpfsug.org/mailman/listinfo/gpfsug-discuss | > http://gpfsug.org/mailman/listinfo/gpfsug-discuss ] > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss From vpuvvada at in.ibm.com Thu Mar 7 13:52:12 2019 From: vpuvvada at in.ibm.com (Venkateswara R Puvvada) Date: Thu, 7 Mar 2019 19:22:12 +0530 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: Message-ID: AFM based migration provides near-zero downtime and supports migrating EAs/ACLs including immutability attributes (if home is Scale/ESS). I would recommend starting migration in read-only mode, prefetch most of the data and convert the fileset to local-updates (if backup is not needed during the migration) or independent-writer mode before moving the applications to the AFM cache filesets. AFM now supports (from 5.0.2) directory level prefetch with many performance improvements and does not require list-files to be specified. ~Venkat (vpuvvada at in.ibm.com) From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 03/06/2019 06:14 PM Subject: [gpfsug-discuss] Follow-up: migrating billions of files Sent by: gpfsug-discuss-bounces at spectrumscale.org Some of you had questions to my original post. 
More information: Source: - Files are straight GPFS/Posix - no extended NFSV4 ACLs - A solution that requires $?s to be spent on software (ie, Aspera) isn?t a very viable option - Both source and target clusters are in the same DC - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage - Approx 40 file systems, a few large ones with 300M-400M files each, others smaller - no independent file sets - migration must pose minimal disruption to existing users Target architecture is a small number of file systems (2-3) on ESS with independent filesets - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) My current thinking is AFM with a pre-populate of the file space and switch the clients over to have them pull data they need (most of the data is older and less active) and them let AFM populate the rest in the background. Bob Oesterlin Sr Principal Storage Engineer, Nuance _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=92LOlNh2yLzrrGTDA7HnfF8LFr55zGxghLZtvZcZD7A&m=YkRmc5bZTZ4O8u_y9PwCjhzuvVXZmhm-_SNQzKhDt0g&s=DUBqVmYz6ycQjkr-PZk4r5hndMIB1-FVzan1CCzlxRg&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Thu Mar 7 18:59:03 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 7 Mar 2019 12:59:03 -0600 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share Message-ID: <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Hello All, We are thinking of exporting ?remote" GPFS mounts on a remote GPFS 5.0 cluster through a SMB share. I have heard in a previous thread that it is not a good idea to export NFS/SMB share on a remote GPFS mount, and make it writable. The issue that could be caused by making it writable would be metanode swapping between the GPFS clusters. May i understand this better and the seriousness of this issue? The possibility of a single file being written at the same time from a GPFS node and NFS/SMB node is minimum - however it is possible that a file is written at the same time from multiple protocols by mistake and we cannot prevent it. This is the setup: GPFS storage cluster: /gpfs01 GPFS CES cluster ( does not have any storage) : /gpfs01 -> mounted remotely . NFS export /gpfs01 as part of CES cluster GPFS client for CES cluster -> Acts as SMB server and exports /gpfs01 over SMB Are there any other limitations that i need to know for the above setup? We cannot use GPFS CES SMB as of now for few other reasons such as LDAP/AD id mapping and authentication complications. Regards, Lohit -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Thu Mar 7 20:44:59 2019 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 7 Mar 2019 20:44:59 +0000 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> References: <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Message-ID: An HTML attachment was scrubbed... 
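For reference, the supported pattern for protocols on a remotely mounted file system is to remote-mount the storage cluster's file system on the CES cluster and export it from there, rather than from a plain client node. Very roughly (cluster, device, path and client names are invented, and the mmauth key exchange between the two clusters is omitted):

    # On the CES (protocol) cluster: remote-mount the storage cluster's fs.
    mmremotecluster add storage.example.org -n nsd1,nsd2 -k storage_key.pub
    mmremotefs add gpfs01 -f gpfs01 -C storage.example.org -T /gpfs01
    mmmount gpfs01 -a

    # Then export through CES instead of a hand-rolled Samba on a client.
    mmnfs export add /gpfs01/projects --client "10.0.0.0/24(Access_Type=RW)"
    mmsmb export add projects /gpfs01/projects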
URL: From valleru at cbio.mskcc.org Thu Mar 7 21:02:54 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 7 Mar 2019 15:02:54 -0600 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Message-ID: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> Thank you Andrew. However, we are not using SMB from the CES cluster but instead running a Redhat based SMB on a GPFS client of the CES cluster and exporting it from the GPFS client. Is the above supported, and not known to cause any issues? Regards, Lohit On Mar 7, 2019, 2:45 PM -0600, Andrew Beattie , wrote: > > https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/bl1adv_configprotocolsonremotefs.htm -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Thu Mar 7 21:12:31 2019 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 7 Mar 2019 21:12:31 +0000 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> References: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark>, <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Message-ID: An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Thu Mar 7 22:10:25 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 7 Mar 2019 16:10:25 -0600 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Message-ID: We have many current usernames from LDAP that do not exactly match with the usernames from AD. Unfortunately, i guess CES SMB will need us to use either AD or LDAP or use the same usernames in both AD and LDAP. I have been looking for a solution where could map the different usernames from LDAP and AD but have not found a solution. So exploring ways to do this from RHEL SMB. I would appreciate if you have any solution to this issue. As of now we use LDAP uids/gids and SSH keys for authentication to the HPC cluster. We want to use CES SMB to export the same mounts which have LDAP usernames/uids/gids however because of different usernames in AD - it has become a challenge. Even if we do find a solution to this, i want to be able to use AD authentication for SMB and ssh key authentication for NFS. The above are the reasons we are just using CES with NFS and user defined authentication for users to have access with login through ssh keys. Regards, Lohit On Mar 7, 2019, 3:12 PM -0600, Andrew Beattie , wrote: > That would not be supported > > You shouldn't publish a remote mount Protocol cluster , and then connect a native client to that cluster and create a non CES protocol export > if you are going to use a Protocol cluster that's how you present your protocols. > otherwise don't set up the remote mount cluster. > > Why are you trying to publish a non HA RHEL SMB share instead of using the HA CES protocols? 
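For the LDAP-versus-AD username mismatch driving the stand-alone RHEL Samba approach, plain Samba's username map parameter may already be enough: it rewrites the incoming AD logon name to the local (LDAP) account at session setup, so files keep their LDAP UIDs on disk. A hedged fragment (domain, share and user names are invented, and this helps SMB only, not NFS or ssh):

    # /etc/samba/smb.conf (fragment)
    [global]
        security     = ads
        realm        = AD.EXAMPLE.ORG
        workgroup    = ADDOM
        username map = /etc/samba/username.map

    [gpfs01]
        path      = /gpfs01
        read only = no

    # /etc/samba/username.map  --  "unix (LDAP) user = AD user"
    alice = ADDOM\alice.smith
    bob   = ADDOM\robert.jones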
> Andrew Beattie > File and Object Storage Technical Specialist - A/NZ > IBM Systems - Storage > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > > > ----- Original message ----- > > From: valleru at cbio.mskcc.org > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > To: gpfsug-discuss at spectrumscale.org, gpfsug main discussion list > > Cc: > > Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share > > Date: Fri, Mar 8, 2019 7:05 AM > > > > Thank you Andrew. > > > > However, we are not using SMB from the CES cluster but instead running a Redhat based SMB on a GPFS client of the CES cluster and exporting it from the GPFS client. > > Is the above supported, and not known to cause any issues? > > > > Regards, > > Lohit > > > > On Mar 7, 2019, 2:45 PM -0600, Andrew Beattie , wrote: > > > > > > https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/bl1adv_configprotocolsonremotefs.htm > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Thu Mar 7 22:52:28 2019 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 7 Mar 2019 22:52:28 +0000 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: , <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark><9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Message-ID: An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Thu Mar 7 23:00:46 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 7 Mar 2019 23:00:46 +0000 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> , Message-ID: There is a custom Auth mode I think that allows you to use ad for Auth and LDAP for identity. You'd could do what you wanted but you'd need another LDAP instance that mapped the ad usernames to the UID that is only used by SMB. Hack yes. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of valleru at cbio.mskcc.org [valleru at cbio.mskcc.org] Sent: 07 March 2019 22:10 To: gpfsug-discuss at spectrumscale.org; gpfsug main discussion list Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share We have many current usernames from LDAP that do not exactly match with the usernames from AD. Unfortunately, i guess CES SMB will need us to use either AD or LDAP or use the same usernames in both AD and LDAP. I have been looking for a solution where could map the different usernames from LDAP and AD but have not found a solution. So exploring ways to do this from RHEL SMB. I would appreciate if you have any solution to this issue. As of now we use LDAP uids/gids and SSH keys for authentication to the HPC cluster. We want to use CES SMB to export the same mounts which have LDAP usernames/uids/gids however because of different usernames in AD - it has become a challenge. 
Even if we do find a solution to this, i want to be able to use AD authentication for SMB and ssh key authentication for NFS. The above are the reasons we are just using CES with NFS and user defined authentication for users to have access with login through ssh keys. Regards, Lohit On Mar 7, 2019, 3:12 PM -0600, Andrew Beattie , wrote: That would not be supported You shouldn't publish a remote mount Protocol cluster , and then connect a native client to that cluster and create a non CES protocol export if you are going to use a Protocol cluster that's how you present your protocols. otherwise don't set up the remote mount cluster. Why are you trying to publish a non HA RHEL SMB share instead of using the HA CES protocols? Andrew Beattie File and Object Storage Technical Specialist - A/NZ IBM Systems - Storage Phone: 614-2133-7927 E-mail: abeattie at au1.ibm.com ----- Original message ----- From: valleru at cbio.mskcc.org Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org, gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share Date: Fri, Mar 8, 2019 7:05 AM Thank you Andrew. However, we are not using SMB from the CES cluster but instead running a Redhat based SMB on a GPFS client of the CES cluster and exporting it from the GPFS client. Is the above supported, and not known to cause any issues? Regards, Lohit On Mar 7, 2019, 2:45 PM -0600, Andrew Beattie , wrote: https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/bl1adv_configprotocolsonremotefs.htm _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From valleru at cbio.mskcc.org Thu Mar 7 23:29:49 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Thu, 7 Mar 2019 17:29:49 -0600 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> Message-ID: <0736385e-0371-4295-b665-a745af85ab29@Spark> Thanks a lot Andrew. It does look promising but It does not strike me immediately on how this could solve the SMB export where user authenticates with an AD username but the gpfs files that are present are owned by LDAP username. May be you are saying that if i enable GPFS to use these scripts - then GPFS will map the AD username to the LDAP username? I found this url too.. https://www.ibm.com/support/knowledgecenter/en/SSFKCN/com.ibm.cluster.gpfs.doc/gpfs_uid/uid_gpfs.html I will give it a read, try to understand how to implement it and get back if i have any more questions. If this works, it should help me configure and use the CES SMB. (Hopefully, CES file based authentication will allow both ssh key authentication for NFS and AD for SMB in same CES cluster). Regards, Lohit On Mar 7, 2019, 4:52 PM -0600, Andrew Beattie , wrote: > Lohit > > Have you looked at mmUIDtoName mmNametoUID > > Yes it will require some custom scripting on your behalf but it would be a far more elegant solution and not run the risk of data corruption issues. 
> > There is at least one university on this mailing list that is doing exactly what you are talking about, and they successfully use > mmUIDtoName / mmNametoUID? to provide the relevant mapping between different authentication environments - both internally in the university and externally from other institutions. > > They use AFM to move data between different storage clusters, and mmUIDtoName / mmNametoUID, to manage the ACL and permissions, they then move the data from the AFM filesystem to the HPC scratch filesystem for processing by the HPC (different filesystems within the same cluster) > > > Regards, > Andrew Beattie > File and Object Storage Technical Specialist - A/NZ > IBM Systems - Storage > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > > > ----- Original message ----- > > From: valleru at cbio.mskcc.org > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > To: gpfsug-discuss at spectrumscale.org, gpfsug main discussion list > > Cc: > > Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share > > Date: Fri, Mar 8, 2019 8:21 AM > > > > We have many current usernames from LDAP that do not exactly match with the usernames from AD. > > Unfortunately, i guess CES SMB will need us to use either AD or LDAP or use the same usernames in both AD and LDAP. > > I have been looking for a solution where could map the different usernames from LDAP and AD but have not found a solution. So exploring ways to do this from RHEL SMB. > > I would appreciate if you have any solution to this issue. > > > > As of now we use LDAP uids/gids and SSH keys for authentication to the HPC cluster. > > We want to use CES SMB to export the same mounts which have LDAP usernames/uids/gids however because of different usernames in AD - it has become a challenge. > > Even if we do find a solution to this, i want to be able to use AD authentication for SMB and ssh key authentication for NFS. > > > > The above are the reasons we are just using CES with NFS and user defined authentication for users to have access with login through ssh keys. > > > > Regards, > > Lohit > > > > On Mar 7, 2019, 3:12 PM -0600, Andrew Beattie , wrote: > > > That would not be supported > > > > > > You shouldn't publish a remote mount Protocol cluster , and then connect a native client to that cluster and create a non CES protocol export > > > if you are going to use a Protocol cluster that's how you present your protocols. > > > otherwise don't set up the remote mount cluster. > > > > > > Why are you trying to publish a non HA RHEL SMB share instead of using the HA CES protocols? > > > Andrew Beattie > > > File and Object Storage Technical Specialist - A/NZ > > > IBM Systems - Storage > > > Phone: 614-2133-7927 > > > E-mail: abeattie at au1.ibm.com > > > > > > > > > > ----- Original message ----- > > > > From: valleru at cbio.mskcc.org > > > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > To: gpfsug-discuss at spectrumscale.org, gpfsug main discussion list > > > > Cc: > > > > Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share > > > > Date: Fri, Mar 8, 2019 7:05 AM > > > > > > > > Thank you Andrew. > > > > > > > > However, we are not using SMB from the CES cluster but instead running a Redhat based SMB on a GPFS client of the CES cluster and exporting it from the GPFS client. > > > > Is the above supported, and not known to cause any issues? 
> > > > > > > > Regards, > > > > Lohit > > > > > > > > On Mar 7, 2019, 2:45 PM -0600, Andrew Beattie , wrote: > > > > > > > > > > https://www.ibm.com/support/knowledgecenter/en/STXKQY_5.0.2/com.ibm.spectrum.scale.v5r02.doc/bl1adv_configprotocolsonremotefs.htm > > > > _______________________________________________ > > > > gpfsug-discuss mailing list > > > > gpfsug-discuss at spectrumscale.org > > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > > > > > _______________________________________________ > > > gpfsug-discuss mailing list > > > gpfsug-discuss at spectrumscale.org > > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Fri Mar 8 13:05:13 2019 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Fri, 8 Mar 2019 13:05:13 +0000 Subject: [gpfsug-discuss] US Spring User Group Meeting update - April 16-17th, NCAR Boulder Co Message-ID: <76731BC4-8C08-4D18-A052-1E3D34F6111A@nuance.com> Less than 6 weeks until the US Spring user group meeting! Thanks to the team at NCAR and IBM, we have an excellent facility and we?ll be able to offer breakfast, lunch, and evening social event on site. All at no charge to attendees. Detailed agenda coming soon. Register here: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2019-meeting-tickets-57035376346 (directions, locations, and suggested hotels) Topics will include: - User Talks - Breakout sessions - Spectrum Scale: The past, the present, the future - Accelerating AI workloads with IBM Spectrum Scale - AI ecosystem and solutions with IBM Spectrum Scale - Spectrum Scale Update - ESS Update - Support Update - Container & Cloud Update - AFM Update - High Performance Tier - Memory Consumption in Spectrum Scale - Spectrum Scale Use Cases - New storage options for Spectrum Scale - Overview - Introduction to Spectrum Scale (For Beginners) Bob Oesterlin/Kristy Kallback-Rose -------------- next part -------------- An HTML attachment was scrubbed... URL: From babbott at rutgers.edu Wed Mar 6 15:11:15 2019 From: babbott at rutgers.edu (William Abbott) Date: Wed, 6 Mar 2019 15:11:15 +0000 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: Message-ID: <54153a80-efef-4757-df89-69df1751648e@rutgers.edu> We had a similar situation and ended up using parsyncfp, which generates multiple parallel rsyncs based on file lists. If they're on the same IB fabric (as ours were) you can use that instead of ethernet, and it worked pretty well. One caveat is that you need to follow the parallel transfers with a final single rsync, so you can use --delete. For the initial transfer you can also use bbcp. It can get very good performance but isn't nearly as convenient as rsync for subsequent transfers. The performance isn't good with small files but you can use tar on both ends to deal with that, in a similar way to what Uwe suggests below. The bbcp documentation outlines how to do that. 
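As a rough, untested sketch of that tar-on-both-ends idea over plain ssh, one pipe per top-level directory with a crude throttle (the host name, paths and concurrency figure below are placeholders, not details from this thread; the same tar wrapping applies around bbcp):

#!/bin/bash
# one tar | ssh | tar pipe per top-level source directory; tar on both
# ends avoids the per-file overhead that hurts small-file transfers
SRC=/gpfs/oldfs          # placeholder source mount
DST=/gpfs/newfs          # placeholder target mount on the remote side
TARGET=newnsd01          # placeholder node with the target file system mounted

cd "$SRC" || exit 1
for d in */ ; do
    tar -C "$SRC" -cf - "$d" | ssh "$TARGET" "tar -C $DST -xf -" &
    # crude throttle: never more than 4 pipes in flight at once
    while [ "$(jobs -rp | wc -l)" -ge 4 ]; do sleep 10; done
done
wait

Starting with the largest directories first, as Uwe suggests below, keeps the last few pipes from running long on their own.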
Bill On 3/6/19 8:13 AM, Uwe Falke wrote: > Hi, in that case I'd open several tar pipes in parallel, maybe using > directories carefully selected, like > > tar -c | ssh "tar -x" > > I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but > along these lines might be a good efficient method. target_hosts should be > all nodes haveing the target file system mounted, and you should start > those pipes on the nodes with the source file system. > It is best to start with the largest directories, and use some > masterscript to start the tar pipes controlled by semaphores to not > overload anything. > > > > Mit freundlichen Gr??en / Kind regards > > > Dr. Uwe Falke > > IT Specialist > High Performance Computing Services / Integrated Technology Services / > Data Center Services > ------------------------------------------------------------------------------------------------------------------------------------------- > IBM Deutschland > Rathausstr. 7 > 09111 Chemnitz > Phone: +49 371 6978 2165 > Mobile: +49 175 575 2877 > E-Mail: uwefalke at de.ibm.com > ------------------------------------------------------------------------------------------------------------------------------------------- > IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: > Thomas Wolter, Sven Schoo? > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > HRB 17122 > > > > > From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 06/03/2019 13:44 > Subject: [gpfsug-discuss] Follow-up: migrating billions of files > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Some of you had questions to my original post. More information: > > Source: > - Files are straight GPFS/Posix - no extended NFSV4 ACLs > - A solution that requires $?s to be spent on software (ie, Aspera) isn?t > a very viable option > - Both source and target clusters are in the same DC > - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage > - Approx 40 file systems, a few large ones with 300M-400M files each, > others smaller > - no independent file sets > - migration must pose minimal disruption to existing users > > Target architecture is a small number of file systems (2-3) on ESS with > independent filesets > - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) > > My current thinking is AFM with a pre-populate of the file space and > switch the clients over to have them pull data they need (most of the data > is older and less active) and them let AFM populate the rest in the > background. 
> > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DfTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8%26m%3DJ5RpIj-EzFyU_dM9I4P8SrpHMikte_pn9sbllFcOvyM%26s%3DfEwDQyDSL7hvOVPbg_n8o_LDz-cLqSI6lQtSzmhaSoI%26e&data=02%7C01%7Cbabbott%40rutgers.edu%7C8cbda3d651584119393808d6a2358544%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636874748092821399&sdata=W06i8IWqrxgEmdp3htxad0euiRhA6%2Bexd3YAziSrUhg%3D&reserved=0= > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7Cbabbott%40rutgers.edu%7C8cbda3d651584119393808d6a2358544%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636874748092821399&sdata=Pjf4RhUchThoFvWI7hLJO4eWhoTXnIYd9m7Mvf809iE%3D&reserved=0 > From valleru at cbio.mskcc.org Fri Mar 8 15:01:13 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Fri, 8 Mar 2019 09:01:13 -0600 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: <54153a80-efef-4757-df89-69df1751648e@rutgers.edu> References: <54153a80-efef-4757-df89-69df1751648e@rutgers.edu> Message-ID: I had to do this twice too. Once i had to copy a 4 PB filesystem as fast as possible when NSD disk descriptors were corrupted and shutting down GPFS would have led to me loosing those files forever, and the other was a regular maintenance but had to copy similar data in less time. In both the cases, i just used GPFS provided util scripts in?/usr/lpp/mmfs/samples/util/ ?. These could be run only as root i believe. I wish i could give them to users to use. I had used few of those scripts like?tsreaddir which used to be really fast in listing all the paths in the directories. It prints full paths of all files along with there inodes etc. I had modified it to print just the full file paths. I then use these paths and group them up in different groups which gets fed into a array jobs to the SGE/LSF cluster. Each array jobs basically uses GNU parallel and running something similar to rsync -avR . The ?-R? option basically creates the directories as given. Of course this worked because i was using the fast private network to transfer between the storage systems. Also i know that cp or tar might be better than rsync with respect to speed, but rsync was convenient and i could always start over again without checkpointing or remembering where i left off previously. Similar to how Bill mentioned in the previous email, but i used gpfs util scripts and basic GNU parallel/rsync, SGE/LSF to submit jobs to the cluster as superuser. It used to work pretty well. Since then - I constantly use parallel and rsync to copy large directories. Thank you, Lohit On Mar 8, 2019, 7:43 AM -0600, William Abbott , wrote: > We had a similar situation and ended up using parsyncfp, which generates > multiple parallel rsyncs based on file lists. If they're on the same IB > fabric (as ours were) you can use that instead of ethernet, and it > worked pretty well. One caveat is that you need to follow the parallel > transfers with a final single rsync, so you can use --delete. 
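A minimal sketch of that pattern, splitting one big file list into chunks, running the chunks in parallel and finishing with the single --delete pass (paths, host name and the 16/8 figures are placeholders; add -H/-A/-X or the GPFS-patched rsync as needed for links, ACLs and attributes):

# split a policy-engine (or tsreaddir) file list into 16 line-aligned chunks
split -n l/16 /tmp/filelist /tmp/chunk.

# one rsync per chunk, at most 8 at a time
ls /tmp/chunk.* | xargs -P 8 -I{} \
    rsync -a --files-from={} /gpfs/oldfs/ newnsd01:/gpfs/newfs/

# --files-from and --delete do not combine usefully, so one final
# whole-tree pass handles files removed at the source since the copy
rsync -a --delete /gpfs/oldfs/ newnsd01:/gpfs/newfs/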
> > For the initial transfer you can also use bbcp. It can get very good > performance but isn't nearly as convenient as rsync for subsequent > transfers. The performance isn't good with small files but you can use > tar on both ends to deal with that, in a similar way to what Uwe > suggests below. The bbcp documentation outlines how to do that. > > Bill > > On 3/6/19 8:13 AM, Uwe Falke wrote: > > Hi, in that case I'd open several tar pipes in parallel, maybe using > > directories carefully selected, like > > > > tar -c | ssh "tar -x" > > > > I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but > > along these lines might be a good efficient method. target_hosts should be > > all nodes haveing the target file system mounted, and you should start > > those pipes on the nodes with the source file system. > > It is best to start with the largest directories, and use some > > masterscript to start the tar pipes controlled by semaphores to not > > overload anything. > > > > > > > > Mit freundlichen Gr??en / Kind regards > > > > > > Dr. Uwe Falke > > > > IT Specialist > > High Performance Computing Services / Integrated Technology Services / > > Data Center Services > > ------------------------------------------------------------------------------------------------------------------------------------------- > > IBM Deutschland > > Rathausstr. 7 > > 09111 Chemnitz > > Phone: +49 371 6978 2165 > > Mobile: +49 175 575 2877 > > E-Mail: uwefalke at de.ibm.com > > ------------------------------------------------------------------------------------------------------------------------------------------- > > IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: > > Thomas Wolter, Sven Schoo? > > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > > HRB 17122 > > > > > > > > > > From: "Oesterlin, Robert" > To: gpfsug main discussion list > Date: 06/03/2019 13:44 > > Subject: [gpfsug-discuss] Follow-up: migrating billions of files > > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > > > > > Some of you had questions to my original post. More information: > > > > Source: > > - Files are straight GPFS/Posix - no extended NFSV4 ACLs > > - A solution that requires $?s to be spent on software (ie, Aspera) isn?t > > a very viable option > > - Both source and target clusters are in the same DC > > - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage > > - Approx 40 file systems, a few large ones with 300M-400M files each, > > others smaller > > - no independent file sets > > - migration must pose minimal disruption to existing users > > > > Target architecture is a small number of file systems (2-3) on ESS with > > independent filesets > > - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) > > > > My current thinking is AFM with a pre-populate of the file space and > > switch the clients over to have them pull data they need (most of the data > > is older and less active) and them let AFM populate the rest in the > > background. 
> > > > > > Bob Oesterlin > > Sr Principal Storage Engineer, Nuance > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DfTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8%26m%3DJ5RpIj-EzFyU_dM9I4P8SrpHMikte_pn9sbllFcOvyM%26s%3DfEwDQyDSL7hvOVPbg_n8o_LDz-cLqSI6lQtSzmhaSoI%26e&data=02%7C01%7Cbabbott%40rutgers.edu%7C8cbda3d651584119393808d6a2358544%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636874748092821399&sdata=W06i8IWqrxgEmdp3htxad0euiRhA6%2Bexd3YAziSrUhg%3D&reserved=0= > > > > > > > > > > > > _______________________________________________ > > gpfsug-discuss mailing list > > gpfsug-discuss at spectrumscale.org > > https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7Cbabbott%40rutgers.edu%7C8cbda3d651584119393808d6a2358544%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636874748092821399&sdata=Pjf4RhUchThoFvWI7hLJO4eWhoTXnIYd9m7Mvf809iE%3D&reserved=0 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtucker at pixitmedia.com Fri Mar 8 16:08:14 2019 From: jtucker at pixitmedia.com (Jez Tucker) Date: Fri, 8 Mar 2019 16:08:14 +0000 Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another In-Reply-To: References: <827394bcbb794a0d9bd5bd8341fc1593@IN-CCI-D1S14.ads.iu.edu> Message-ID: <7432d104-4780-2224-19a3-e080078dbe74@pixitmedia.com> Hi ? I feel as an 'other products do exist' I should also mention Ngenea and APSync which could meet the technical requirements of these use cases. Ngenea allows you to bring data in from 'cloud' and also of interest in this specific use case, POSIX filesystems or filer islands.? You can present the remote data available locally and then inflate the data either on demand or via enacted process.? Massively parallel, multi-node, highly threaded with extremely granular rules based control.?? You can also migrate data back to your filer re-utilising such islands as tiers.? You can even use it to 'virtually tier' within GPFS/Scale filesystems, alike a 'hardlink across independent filesets'.? Or even across Global WANs for true 24x7 follow-the-sun working practices. APSync also provides a differently patched version of rsync and builds on top of the 'SnapDiff' technology previously presented at the UG whereby you don't need to re-scan your entire filesystem for each sync and thus can do incremental changes for create, modified, deleted and _track moved files_.? Handy and extremely time saving over regularised full runs.? Massively parallel, multi-node, highly threaded (a common theme with our tools...). As I don't do sales; if anyone wants to talk tech nuts-and-bolts with me about these, or you have challenges (and I love a challenge..) by all means hit me up directly.? I like solving people's blockers :-) Happy Friday ppl, Jez On 05/03/2019 21:38, Simon Thompson wrote: > DDN also have a paid for product for doing moving of data (data flow) We found out about it after we did a massive data migration... > > I can't comment on it other than being aware of it. 
Sure your local DDN sales person can help. > > But if only IBM supported some sort of restripe to new block size, we wouldn't have to do this mass migration :-P > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Simon Thompson [S.J.Thompson at bham.ac.uk] > Sent: 05 March 2019 16:38 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] suggestions forwar copying one GPFS file system into another > > I wrote a patch to mpifileutils which will copy gpfs attributes, but when we played with it with rsync, something was obviously still different about the attrs from each, so use with care. > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Ratliff, John [jdratlif at iu.edu] > Sent: 05 March 2019 16:21 > To: gpfsug-discuss at spectrumscale.org > Subject: [gpfsug-discuss] suggestions for copying one GPFS file system into another > > We use a GPFS file system for our computing clusters and we?re working on moving to a new SAN. > > We originally tried AFM, but it didn?t seem to work very well. We tried to do a prefetch on a test policy scan of 100 million files, and after 24 hours it hadn?t pre-fetched anything. It wasn?t clear what was happening. Some smaller tests succeeded, but the NFSv4 ACLs did not seem to be transferred. > > Since then we started using rsync with the GPFS attrs patch. We have over 600 million files and 700 TB. I split up the rsync tasks with lists of files generated by the policy engine and we transferred the original data in about 2 weeks. Now we?re working on final synchronization. I?d like to use one of the delete options to remove files that were sync?d earlier and then deleted. This can?t be combined with the files-from option, so it?s harder to break up the rsync tasks. Some of the directories I?m running this against have 30-150 million files each. This can take quite some time with a single rsync process. > > I?m also wondering if any of my rsync options are unnecessary. I was using avHAXS and numeric-ids. I?m thinking the A (acls) and X (xatttrs) might be unnecessary with GPFS->GPFS. We?re only using NFSv4 GPFS ACLs. I don?t know if GPFS uses any xattrs that rsync would sync or not. Removing those two options removed several system calls, which should make it much faster, but I want to make sure I?m syncing correctly. Also, it seems there is a problem with the GPFS patch on rsync where it will always give an error trying to get GPFS attributes on a symlink, which means it doesn?t sync any symlinks when using that option. So you can rsync symlinks or GPFS attrs, but not both at the same time. This has lead to me running two rsyncs, one to get all files and one to get all attributes. > > Thanks for any ideas or suggestions. > > John Ratliff | Pervasive Technology Institute | UITS | Research Storage ? 
Indiana University | http://pti.iu.edu > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- *Jez Tucker* Head of Research and Development, Pixit Media 07764193820 | jtucker at pixitmedia.com www.pixitmedia.com | Tw:@pixitmedia.com -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From valleru at cbio.mskcc.org Fri Mar 8 16:42:17 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Fri, 8 Mar 2019 10:42:17 -0600 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: <186F603F-A278-433F-AE2C-7080EBA94AC9@bham.ac.uk> References: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> <036e8839-90be-421a-844c-fd7d299f92d4@Spark> <186F603F-A278-433F-AE2C-7080EBA94AC9@bham.ac.uk> Message-ID: Thank you Simon. I do remember reading your page about few years back, when i was researching this issue. When you mentioned Custom Auth. I assumed it to be user-defined authentication from CES. However, looks like i need to hack it a bit to get SMB working with AD? I did not feel comfortable hacking the SMB from the CES cluster, and thus i was trying to bring up SMB outside the CES cluster. I almost hack with everything in the cluster but i leave GPFS and any of its configuration in the supported config, because if things break - i felt it might mess up things real bad. I wish we do not have to hack our way out of this, and IBM supported this config out of the box. I do not understand the current requirements from CES with respect to AD or user defined authentication where either both SMB and NFS should be AD/LDAP authenticated or both of them user defined. I believe many places do use just ssh-key as authentication for linux machines including the cloud instances, while SMB obviously cannot be used with ssh-key authentication and has to be used either with LDAP or AD authentication. Did anyone try to raise this as a feature request? Even if i do figure to hack this thing and make sure that updating CES won?t mess it up badly. I think i will have to do few things to get the SIDs to Uids match as you mentioned. We do not use passwords to authenticate to LDAP and I do not want to be creating another set of passwords apart from AD which is already existing, and users authenticate to it when they login to machines. I was thinking to bring up something like Redhat IDM that could sync with AD and get all the usernames/sids and password hashes. I could then enter my current LDAP uids/gids in the Redhat IDM. IDM will automatically create uids/gids for usernames that do not have them i believe. 
In this way, when SMB authenticates with Redhat IDM - users can use there current AD kerberos tickets or the same passwords and i do not have to change the passwords. It will also automatically sync with AD and create UIDs/GIDs and thus i don?t have to manually script something to create one for every person in AD. I however need to see if i could get to make this work with institutional AD and it might not be as smooth. So which of the below cases will IBM most probably support? :) 1. Run SMB outside the CES cluster with the above configuration. 2. Hack SMB inside the CES cluster Is it that running SMB outside the CES cluster with R/W has a possibility of corrupting the GPFS filesystem? We do not necessarily need HA with SMB and so apart from HA - What does IBM SMB do that would prevent such corruption from happening? The reason i was expecting the usernames to be same in LDAP and AD is because - if they are, then SMB will do uid mapping by default. i.e SMB will automatically map windows sids to ldap uids. I will not have to bring up Redhat IDM if this was the case. But unfortunately we have many users who have different ldap usernames from AD usernames - so i guess the practical way would be to use Redhat IDM to map windows sids to ldap uids. I have read about mmname2uid and mmuid2name that Andrew mentioned but looks like it is made to work between 2 gpfs clusters with different uids. Not exactly to make SMB map windows SIDs to ldap uids. Regards, Lohit On Mar 8, 2019, 2:41 AM -0600, Simon Thompson , wrote: > Hi Lohit, > > Custom auth sounds like it would work. > > NFS uses the ?system? ldap, SMB can use LDAP or AD, or you can fudge it and actually use both. We came at this very early in CES and I think some of this is better in mixed mode now, but we do something vaguely related to what you need. > > What you?d need is data in your ldap server to map windows usernames and SIDs to Unix IDs. So for example we have in our mmsmb config: > idmap config * : backend?????????? ldap > idmap config * : bind_path_group?? ou=SidMap,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > idmap config * : ldap_base_dn????? ou=SidMap,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > idmap config * : ldap_server?????? stand-alone > idmap config * : ldap_url????????? ldap://localhost > idmap config * : ldap_user_dn????? uid=nslcd,ou=People,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > idmap config * : range???????????? 1000-9999999 > idmap config * : rangesize???????? 1000000 > idmap config * : read only???????? 
yes > > You then need entries in the LDAP server, it could be a different server or somewhere else in the schema, but basically LDAP entries that map windows username/sid to underlying UID, e.g: > > dn: uid=USERNAME,ou=People,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > uid: USERNAME > objectClass: top > objectClass: posixAccount > objectClass: account > objectClass: shadowAccount > loginShell: /bin/bash > uidNumber: 605436 > shadowMax: 99999 > gidNumber: 100 > homeDirectory: /rds/homes/u/USERNAME > cn: USERS DISPLAY NAME > structuralObjectClass: account > entryUUID: 85a18df0-88bd-1037-9152-418eb0c7777 > creatorsName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > createTimestamp: 20180108124516Z > entryCSN: 20180108124516.623983Z#000000#001#000000 > modifiersName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > modifyTimestamp: 20180108124516Z > > dn: sambaSID=S-1-5-21-1390067357-308236825-725345543-498888,ou=SidMap,dc=rds > ,dc=adf,dc=bham,dc=ac,dc=uk > objectClass: sambaIdmapEntry > objectClass: sambaSidEntry > sambaSID: S-1-5-21-1390067357-308236825-725345543-498888 > uidNumber: 605436 > structuralObjectClass: sambaSidEntry > entryUUID: 85efa490-88bd-1037-9153-418eb0c9999 > creatorsName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > createTimestamp: 20180108124517Z > entryCSN: 20180108124517.135744Z#000000#001#000000 > modifiersName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > modifyTimestamp: 20180108124517Z > > I don?t think SMB actually cares about the username matching, what it needs to be able to do is resolve the Windows SID presented to the Unix UID underneath which is how it then accesses files. i.e. it doesn?t really matter what the username in the middle is ? > > Supported config? No. Works for what you need? Probably ... > > I wrote this: https://www.roamingzebra.co.uk/2015/07/smb-protocol-support-with-spectrum.html back in 2015 about what we were doing, probably much of it stands, but you might want to look at proper supported mixed mode. That is our plan at some point. > > Simon > > From: "valleru at cbio.mskcc.org" > Date: Friday, 8 March 2019 at 00:08 > To: "Simon Thompson (IT Research Support)" > Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share > > Thank you Simon. > > First issue: > I believe what i would need is a combination of user-defined authentication and ad authentication. > > User-defined authentication to help me export NFS and have the linux clients authenticate users with ssh keys. > AD based authentication to help me export SMB with AD authentication/kerberos to mount filesystem on windows connected to just AD. > > At first look, it looked like CES either supports user-defined authentication or AD based authentication - which would not work. We do not use kerberos or ldap passwords for accessing the HPC clusters. > > Second issue: > AD username to LDAP username mapping. I could bring up another AD/LDAP server that has the AD usernames and LDAP uids just for SMB authentication but i would need to do this for all the users in the agency. > I will try and research if this way is easier or the mmNametoUID. > > > Regards, > Lohit > > On Mar 7, 2019, 5:00 PM -0600, Simon Thompson , wrote: > > > > > custom Auth mode -------------- next part -------------- An HTML attachment was scrubbed... 
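One quick way to sanity-check the SID-to-UID mapping described in the idmap and LDAP entries above is to ask winbind directly from a protocol node, assuming the wbinfo shipped with the gpfs.smb packages is available under /usr/lpp/mmfs/bin (the domain and user name below are placeholders; the SID is the one from the example entry above):

# resolve the Windows name to its SID against the domain
/usr/lpp/mmfs/bin/wbinfo -n 'DOMAIN\someuser'

# resolve that SID to a UID through the idmap ldap backend entries
/usr/lpp/mmfs/bin/wbinfo -S S-1-5-21-1390067357-308236825-725345543-498888

# the UID returned should match what the ssh/NFS side resolves
id someuser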
URL: From valleru at cbio.mskcc.org Fri Mar 8 16:52:13 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Fri, 8 Mar 2019 10:52:13 -0600 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark> <9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark> <036e8839-90be-421a-844c-fd7d299f92d4@Spark> <186F603F-A278-433F-AE2C-7080EBA94AC9@bham.ac.uk> Message-ID: Well, reading the user-defined authentication documentation again. It is basically left to sysadmins to deal with authentication and it looks like it would not be so much of a hack, to customize smb on CES nodes according to our needs. I will see if i could do this without much trouble. Regards, Lohit On Mar 8, 2019, 10:42 AM -0600, valleru at cbio.mskcc.org, wrote: > Thank you Simon. > > I do remember reading your page about few years back, when i was researching this issue. > When you mentioned Custom Auth. I assumed it to be user-defined authentication from CES. However, looks like i need to hack it a bit to get SMB working with AD? > > I did not feel comfortable hacking the SMB from the CES cluster, and thus i was trying to bring up SMB outside the CES cluster. I almost hack with everything in the cluster but i leave GPFS and any of its configuration in the supported config, because if things break - i felt it might mess up things real bad. > I wish we do not have to hack our way out of this, and IBM supported this config out of the box. > > I do not understand the current requirements from CES with respect to AD or user defined authentication where either both SMB and NFS should be AD/LDAP authenticated or both of them user defined. > > I believe many places do use just ssh-key as authentication for linux machines including the cloud instances, while SMB obviously cannot be used with ssh-key authentication and has to be used either with LDAP or AD authentication. > > Did anyone try to raise this as a feature request? > > Even if i do figure to hack this thing and make sure that updating CES won?t mess it up badly. I think i will have to do few things to get the SIDs to Uids match as you mentioned. > We do not use passwords to authenticate to LDAP and I do not want to be creating another set of passwords apart from AD which is already existing, and users authenticate to it when they login to machines. > > I was thinking to bring up something like Redhat IDM that could sync with AD and get all the usernames/sids and password hashes. I could then enter my current LDAP uids/gids in the Redhat IDM. IDM will automatically create uids/gids for usernames that do not have them i believe. > In this way, when SMB authenticates with Redhat IDM - users can use there current AD kerberos tickets or the same passwords and i do not have to change the passwords. > It will also automatically sync with AD and create UIDs/GIDs and thus i don?t have to manually script something to create one for every person in AD. > I however need to see if i could get to make this work with institutional AD and it might not be as smooth. > > So which of the below cases will IBM most probably support? :) > > 1. Run SMB outside the CES cluster with the above configuration. > 2. Hack SMB inside the CES cluster > > Is it that running SMB outside the CES cluster with R/W has a possibility of corrupting the GPFS filesystem? > We do not necessarily need HA with SMB and so apart from HA - What does IBM SMB do that would prevent such corruption from happening? 
> > The reason i was expecting the usernames to be same in LDAP and AD is because - if they are, then SMB will do uid mapping by default. i.e SMB will automatically map windows sids to ldap uids. I will not have to bring up Redhat IDM if this was the case. But unfortunately we have many users who have different ldap usernames from AD usernames - so i guess the practical way would be to use Redhat IDM to map windows sids to ldap uids. > > I have read about mmname2uid and mmuid2name that Andrew mentioned but looks like it is made to work between 2 gpfs clusters with different uids. Not exactly to make SMB map windows SIDs to ldap uids. > > Regards, > Lohit > > On Mar 8, 2019, 2:41 AM -0600, Simon Thompson , wrote: > > Hi Lohit, > > > > Custom auth sounds like it would work. > > > > NFS uses the ?system? ldap, SMB can use LDAP or AD, or you can fudge it and actually use both. We came at this very early in CES and I think some of this is better in mixed mode now, but we do something vaguely related to what you need. > > > > What you?d need is data in your ldap server to map windows usernames and SIDs to Unix IDs. So for example we have in our mmsmb config: > > idmap config * : backend?????????? ldap > > idmap config * : bind_path_group?? ou=SidMap,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > idmap config * : ldap_base_dn????? ou=SidMap,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > idmap config * : ldap_server?????? stand-alone > > idmap config * : ldap_url????????? ldap://localhost > > idmap config * : ldap_user_dn????? uid=nslcd,ou=People,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > idmap config * : range???????????? 1000-9999999 > > idmap config * : rangesize???????? 1000000 > > idmap config * : read only???????? yes > > > > You then need entries in the LDAP server, it could be a different server or somewhere else in the schema, but basically LDAP entries that map windows username/sid to underlying UID, e.g: > > > > dn: uid=USERNAME,ou=People,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > uid: USERNAME > > objectClass: top > > objectClass: posixAccount > > objectClass: account > > objectClass: shadowAccount > > loginShell: /bin/bash > > uidNumber: 605436 > > shadowMax: 99999 > > gidNumber: 100 > > homeDirectory: /rds/homes/u/USERNAME > > cn: USERS DISPLAY NAME > > structuralObjectClass: account > > entryUUID: 85a18df0-88bd-1037-9152-418eb0c7777 > > creatorsName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > createTimestamp: 20180108124516Z > > entryCSN: 20180108124516.623983Z#000000#001#000000 > > modifiersName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > modifyTimestamp: 20180108124516Z > > > > dn: sambaSID=S-1-5-21-1390067357-308236825-725345543-498888,ou=SidMap,dc=rds > > ,dc=adf,dc=bham,dc=ac,dc=uk > > objectClass: sambaIdmapEntry > > objectClass: sambaSidEntry > > sambaSID: S-1-5-21-1390067357-308236825-725345543-498888 > > uidNumber: 605436 > > structuralObjectClass: sambaSidEntry > > entryUUID: 85efa490-88bd-1037-9153-418eb0c9999 > > creatorsName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > createTimestamp: 20180108124517Z > > entryCSN: 20180108124517.135744Z#000000#001#000000 > > modifiersName: cn=Manager,dc=rds,dc=adf,dc=bham,dc=ac,dc=uk > > modifyTimestamp: 20180108124517Z > > > > I don?t think SMB actually cares about the username matching, what it needs to be able to do is resolve the Windows SID presented to the Unix UID underneath which is how it then accesses files. i.e. it doesn?t really matter what the username in the middle is ? > > > > Supported config? No. Works for what you need? 
Probably ... > > > > I wrote this: https://www.roamingzebra.co.uk/2015/07/smb-protocol-support-with-spectrum.html back in 2015 about what we were doing, probably much of it stands, but you might want to look at proper supported mixed mode. That is our plan at some point. > > > > Simon > > > > From: "valleru at cbio.mskcc.org" > > Date: Friday, 8 March 2019 at 00:08 > > To: "Simon Thompson (IT Research Support)" > > Subject: Re: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share > > > > Thank you Simon. > > > > First issue: > > I believe what i would need is a combination of user-defined authentication and ad authentication. > > > > User-defined authentication to help me export NFS and have the linux clients authenticate users with ssh keys. > > AD based authentication to help me export SMB with AD authentication/kerberos to mount filesystem on windows connected to just AD. > > > > At first look, it looked like CES either supports user-defined authentication or AD based authentication - which would not work. We do not use kerberos or ldap passwords for accessing the HPC clusters. > > > > Second issue: > > AD username to LDAP username mapping. I could bring up another AD/LDAP server that has the AD usernames and LDAP uids just for SMB authentication but i would need to do this for all the users in the agency. > > I will try and research if this way is easier or the mmNametoUID. > > > > > > Regards, > > Lohit > > > > On Mar 7, 2019, 5:00 PM -0600, Simon Thompson , wrote: > > > > > > > > custom Auth mode > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Mar 8 21:43:36 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 8 Mar 2019 16:43:36 -0500 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: <54153a80-efef-4757-df89-69df1751648e@rutgers.edu> Message-ID: Lohit... Any and all of those commands and techniques should still work with newer version of GPFS. But mmapplypolicy is the supported command for generating file lists. It uses the GPFS APIs and some parallel processing tricks. mmfind is a script that make it easier to write GPFS "policy rules" and runs mmapplypolicy for you. mmxcp can be used with mmfind (and/or mmapplypolicy) to make it easy to run a cp (or other command) in parallel on those filelists ... --marc K of GPFS From: valleru at cbio.mskcc.org To: ""gpfsug-discuss<""gpfsug-discuss at spectrumscale.org ", gpfsug main discussion list Date: 03/08/2019 10:13 AM Subject: Re: [gpfsug-discuss] Follow-up: migrating billions of files Sent by: gpfsug-discuss-bounces at spectrumscale.org I had to do this twice too. Once i had to copy a 4 PB filesystem as fast as possible when NSD disk descriptors were corrupted and shutting down GPFS would have led to me loosing those files forever, and the other was a regular maintenance but had to copy similar data in less time. In both the cases, i just used GPFS provided util scripts in /usr/lpp/mmfs/samples/util/ . These could be run only as root i believe. I wish i could give them to users to use. I had used few of those scripts like tsreaddir which used to be really fast in listing all the paths in the directories. It prints full paths of all files along with there inodes etc. I had modified it to print just the full file paths. 
I then use these paths and group them up in different groups which gets fed into a array jobs to the SGE/LSF cluster. Each array jobs basically uses GNU parallel and running something similar to rsync -avR . The ?-R? option basically creates the directories as given. Of course this worked because i was using the fast private network to transfer between the storage systems. Also i know that cp or tar might be better than rsync with respect to speed, but rsync was convenient and i could always start over again without checkpointing or remembering where i left off previously. Similar to how Bill mentioned in the previous email, but i used gpfs util scripts and basic GNU parallel/rsync, SGE/LSF to submit jobs to the cluster as superuser. It used to work pretty well. Since then - I constantly use parallel and rsync to copy large directories. Thank you, Lohit On Mar 8, 2019, 7:43 AM -0600, William Abbott , wrote: We had a similar situation and ended up using parsyncfp, which generates multiple parallel rsyncs based on file lists. If they're on the same IB fabric (as ours were) you can use that instead of ethernet, and it worked pretty well. One caveat is that you need to follow the parallel transfers with a final single rsync, so you can use --delete. For the initial transfer you can also use bbcp. It can get very good performance but isn't nearly as convenient as rsync for subsequent transfers. The performance isn't good with small files but you can use tar on both ends to deal with that, in a similar way to what Uwe suggests below. The bbcp documentation outlines how to do that. Bill On 3/6/19 8:13 AM, Uwe Falke wrote: Hi, in that case I'd open several tar pipes in parallel, maybe using directories carefully selected, like tar -c | ssh "tar -x" I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but along these lines might be a good efficient method. target_hosts should be all nodes haveing the target file system mounted, and you should start those pipes on the nodes with the source file system. It is best to start with the largest directories, and use some masterscript to start the tar pipes controlled by semaphores to not overload anything. Mit freundlichen Gr??en / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: Thomas Wolter, Sven Schoo? Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Oesterlin, Robert" From valleru at cbio.mskcc.org Fri Mar 8 22:40:32 2019 From: valleru at cbio.mskcc.org (valleru at cbio.mskcc.org) Date: Fri, 8 Mar 2019 16:40:32 -0600 Subject: [gpfsug-discuss] Follow-up: migrating billions of files In-Reply-To: References: <54153a80-efef-4757-df89-69df1751648e@rutgers.edu> Message-ID: Thank you Marc. I was just trying to suggest another approach to this email thread. However i believe, we cannot run mmfind/mmapplypolicy with remote filesystems and can only be run on the owning cluster? 
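When it is run on the cluster that owns the file system, the list generation Marc refers to is only a couple of commands; a rough sketch with placeholder paths and node names (the deferred list can then be split and fed to the same parallel rsync/cp jobs):

# policy that lists every file but executes nothing
cat > /tmp/listall.pol <<'EOF'
RULE EXTERNAL LIST 'allfiles' EXEC ''
RULE 'listEverything' LIST 'allfiles'
EOF

# -I defer writes the list under the -f prefix instead of acting on it
mmapplypolicy /gpfs/oldfs -P /tmp/listall.pol \
    -f /gpfs/oldfs/tmp/flist -I defer -N nsd01,nsd02

# result: /gpfs/oldfs/tmp/flist.list.allfiles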
In our clusters - All the gpfs clients are generally in there own compute clusters and mount filesystems from other storage clusters - which i thought is one of the recommended designs. The scripts in the /usr/lpp/mmfs/samples/util folder do work with remote filesystems, and thus on the compute nodes. I was also trying to find something that could be used by users and not by superuser? but i guess none of these tools are meant to be run by a user without superuser privileges. Regards, Lohit On Mar 8, 2019, 3:54 PM -0600, Marc A Kaplan , wrote: > Lohit... Any and all of those commands and techniques should still work with newer version of GPFS. > > But mmapplypolicy is the supported command for generating file lists. ?It uses the GPFS APIs and some parallel processing tricks. > > mmfind is a script that make it easier to write GPFS "policy rules" and runs mmapplypolicy for you. > > mmxcp can be used with mmfind (and/or mmapplypolicy) to make it easy to run a cp (or other command) in parallel on those filelists ... > > --marc K of GPFS > > > > From: ? ? ? ?valleru at cbio.mskcc.org > To: ? ? ? ?""gpfsug-discuss<""gpfsug-discuss at spectrumscale.org ? ? ? ? ", gpfsug main discussion list > Date: ? ? ? ?03/08/2019 10:13 AM > Subject: ? ? ? ?Re: [gpfsug-discuss] Follow-up: migrating billions of files > Sent by: ? ? ? ?gpfsug-discuss-bounces at spectrumscale.org > > > > I had to do this twice too. Once i had to copy a 4 PB filesystem as fast as possible when NSD disk descriptors were corrupted and shutting down GPFS would have led to me loosing those files forever, and the other was a regular maintenance but had to copy similar data in less time. > > In both the cases, i just used GPFS provided util scripts in /usr/lpp/mmfs/samples/util/ ?. These could be run only as root i believe. I wish i could give them to users to use. > > I had used few of those scripts like tsreaddir which used to be really fast in listing all the paths in the directories. It prints full paths of all files along with there inodes etc. I had modified it to print just the full file paths. > > I then use these paths and group them up in different groups which gets fed into a array jobs to the SGE/LSF cluster. > Each array jobs basically uses GNU parallel and running something similar to rsync -avR . The ?-R? option basically creates the directories as given. > Of course this worked because i was using the fast private network to transfer between the storage systems. Also i know that cp or tar might be better than rsync with respect to speed, but rsync was convenient and i could always start over again without checkpointing or remembering where i left off previously. > > Similar to how Bill mentioned in the previous email, but i used gpfs util scripts and basic GNU parallel/rsync, SGE/LSF to submit jobs to the cluster as superuser. It used to work pretty well. > > Since then - I constantly use parallel and rsync to copy large directories. > > Thank you, > Lohit > > On Mar 8, 2019, 7:43 AM -0600, William Abbott , wrote: > We had a similar situation and ended up using parsyncfp, which generates > multiple parallel rsyncs based on file lists. If they're on the same IB > fabric (as ours were) you can use that instead of ethernet, and it > worked pretty well. One caveat is that you need to follow the parallel > transfers with a final single rsync, so you can use --delete. > > For the initial transfer you can also use bbcp. It can get very good > performance but isn't nearly as convenient as rsync for subsequent > transfers. 
The performance isn't good with small files but you can use > tar on both ends to deal with that, in a similar way to what Uwe > suggests below. The bbcp documentation outlines how to do that. > > Bill > > On 3/6/19 8:13 AM, Uwe Falke wrote: > Hi, in that case I'd open several tar pipes in parallel, maybe using > directories carefully selected, like > > tar -c | ssh "tar -x" > > I am not quite sure whether "-C /" for tar works here ("tar -C / -x"), but > along these lines might be a good efficient method. target_hosts should be > all nodes haveing the target file system mounted, and you should start > those pipes on the nodes with the source file system. > It is best to start with the largest directories, and use some > masterscript to start the tar pipes controlled by semaphores to not > overload anything. > > > > Mit freundlichen Gr??en / Kind regards > > > Dr. Uwe Falke > > IT Specialist > High Performance Computing Services / Integrated Technology Services / > Data Center Services > ------------------------------------------------------------------------------------------------------------------------------------------- > IBM Deutschland > Rathausstr. 7 > 09111 Chemnitz > Phone: +49 371 6978 2165 > Mobile: +49 175 575 2877 > E-Mail: uwefalke at de.ibm.com > ------------------------------------------------------------------------------------------------------------------------------------------- > IBM Deutschland Business & Technology Services GmbH / Gesch?ftsf?hrung: > Thomas Wolter, Sven Schoo? > Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, > HRB 17122 > > > > > From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 06/03/2019 13:44 > Subject: [gpfsug-discuss] Follow-up: migrating billions of files > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Some of you had questions to my original post. More information: > > Source: > - Files are straight GPFS/Posix - no extended NFSV4 ACLs > - A solution that requires $?s to be spent on software (ie, Aspera) isn?t > a very viable option > - Both source and target clusters are in the same DC > - Source is stand-alone NSD servers (bonded 10g-E) and 8gb FC SAN storage > - Approx 40 file systems, a few large ones with 300M-400M files each, > others smaller > - no independent file sets > - migration must pose minimal disruption to existing users > > Target architecture is a small number of file systems (2-3) on ESS with > independent filesets > - Target (ESS) will have multiple 40gb-E links on each NSD server (GS4) > > My current thinking is AFM with a pre-populate of the file space and > switch the clients over to have them pull data they need (most of the data > is older and less active) and them let AFM populate the rest in the > background. 
> > > Bob Oesterlin > Sr Principal Storage Engineer, Nuance > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss%26d%3DDwICAg%26c%3Djf_iaSHvJObTbx-siA1ZOg%26r%3DfTuVGtgq6A14KiNeaGfNZzOOgtHW5Lm4crZU6lJxtB8%26m%3DJ5RpIj-EzFyU_dM9I4P8SrpHMikte_pn9sbllFcOvyM%26s%3DfEwDQyDSL7hvOVPbg_n8o_LDz-cLqSI6lQtSzmhaSoI%26e&data=02%7C01%7Cbabbott%40rutgers.edu%7C8cbda3d651584119393808d6a2358544%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636874748092821399&sdata=W06i8IWqrxgEmdp3htxad0euiRhA6%2Bexd3YAziSrUhg%3D&reserved=0= > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7Cbabbott%40rutgers.edu%7C8cbda3d651584119393808d6a2358544%7Cb92d2b234d35447093ff69aca6632ffe%7C1%7C0%7C636874748092821399&sdata=Pjf4RhUchThoFvWI7hLJO4eWhoTXnIYd9m7Mvf809iE%3D&reserved=0 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Fri Mar 8 22:58:59 2019 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Fri, 8 Mar 2019 22:58:59 +0000 Subject: [gpfsug-discuss] Exporting remote GPFS mounts on a non-ces SMB share In-Reply-To: References: , <2b02981b-185b-4f64-988f-ce2c19b55c29@Spark><9ceb9b16-18ad-4137-a8d1-f0b34966d7cf@Spark><036e8839-90be-421a-844c-fd7d299f92d4@Spark><186F603F-A278-433F-AE2C-7080EBA94AC9@bham.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Fri Mar 8 16:24:40 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Fri, 8 Mar 2019 16:24:40 +0000 Subject: [gpfsug-discuss] SSDs for data - DWPD? Message-ID: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> Hi All, This is kind of a survey if you will, so for this one it might be best if you responded directly to me and I?ll summarize the results next week. Question 1 - do you use SSDs for data? If not - i.e. if you only use SSDs for metadata (as we currently do) - thanks, that?s all! If, however, you do use SSDs for data, please see Question 2. Question 2 - what is the DWPD (daily writes per day) of the SSDs that you use for data? Question 3 - is that different than the DWPD of the SSDs for metadata? Question 4 - any pertinent information in regards to your answers above (i.e. if you?ve got a filesystem that data is uploaded to only once and never modified after that then that?s useful to know!)? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pinto at scinet.utoronto.ca Tue Mar 12 13:15:24 2019 From: pinto at scinet.utoronto.ca (Jaime Pinto) Date: Tue, 12 Mar 2019 09:15:24 -0400 Subject: [gpfsug-discuss] mmbackup: how to keep list(expiredFiles, updatedFiles) files Message-ID: <20190312091524.10175q4zufaqley4@support.scinet.utoronto.ca> How can I instruct mmbackup to *NOT* delete the temporary directories and files created inside the FILESET/.mmbackupCfg folder? I can see that during the process the folders expiredFiles & updatedFiles are there, and contain the lists I'm interested in for post-analysis. Thanks Jaime --- Jaime Pinto - Storage Analyst SciNet HPC Consortium - Compute/Calcul Canada www.scinet.utoronto.ca - www.computecanada.ca University of Toronto 661 University Ave. (MaRS), Suite 1140 Toronto, ON, M5G1M1 P: 416-978-2755 C: 416-505-1477 ---------------------------------------------------------------- This message was sent using IMP at SciNet Consortium, University of Toronto. From stockf at us.ibm.com Tue Mar 12 15:19:41 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Tue, 12 Mar 2019 15:19:41 +0000 Subject: [gpfsug-discuss] mmbackup: how to keep list(expiredFiles, updatedFiles) files In-Reply-To: <20190312091524.10175q4zufaqley4@support.scinet.utoronto.ca> References: <20190312091524.10175q4zufaqley4@support.scinet.utoronto.ca> Message-ID: An HTML attachment was scrubbed... URL: From pinto at scinet.utoronto.ca Tue Mar 12 16:07:49 2019 From: pinto at scinet.utoronto.ca (JAIME PINTO) Date: Tue, 12 Mar 2019 12:07:49 -0400 Subject: [gpfsug-discuss] mmbackup: how to keep list(expiredFiles, updatedFiles) files In-Reply-To: References: <20190312091524.10175q4zufaqley4@support.scinet.utoronto.ca> Message-ID: <16972a8f408.27e4.eccefdca38f81ee5a57d11a4aad4f6a6@scinet.utoronto.ca> Thanks Fred. I'll try that. Jaime On March 12, 2019 11:21:49 AM "Frederick Stock" wrote: > In the mmbackup man page look at the settings for the DEBUGmmbackup > variable. There is a value that will keep the temporary files. > > Fred > __________________________________________________ > Fred Stock | IBM Pittsburgh Lab | 720-430-8821 > stockf at us.ibm.com > > > ----- Original message ----- > From: "Jaime Pinto" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "gpfsug main discussion list" > Cc: > Subject: [gpfsug-discuss] mmbackup: how to keep list(expiredFiles, > updatedFiles) files > Date: Tue, Mar 12, 2019 10:28 AM > > How can I instruct mmbackup to *NOT* delete the temporary directories > and files created inside the FILESET/.mmbackupCfg folder? > > I can see that during the process the folders expiredFiles & > updatedFiles are there, and contain the lists I'm interested in for > post-analysis. > > Thanks > Jaime > > > > --- > Jaime Pinto - Storage Analyst > SciNet HPC Consortium - Compute/Calcul Canada > www.scinet.utoronto.ca - www.computecanada.ca > University of Toronto > 661 University Ave. (MaRS), Suite 1140 > Toronto, ON, M5G1M1 > P: 416-978-2755 > C: 416-505-1477 > > ---------------------------------------------------------------- > This message was sent using IMP at SciNet Consortium, University of Toronto. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
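For reference, the DEBUGmmbackup variable Fred points to is simply set in the environment of the mmbackup invocation. A hedged example follows; the value 2 is my assumption for the documented "keep temporary files" setting and should be checked against the mmbackup man page for your release, and the file system path and options are illustrative:

# Keep the .mmbackupCfg working files (expiredFiles, updatedFiles, ...) after the run
DEBUGmmbackup=2 mmbackup /gpfs/fs1 -t incremental -s /tmp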
URL: From chair at spectrumscale.org Thu Mar 14 14:44:35 2019 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale User Group Chair)) Date: Thu, 14 Mar 2019 14:44:35 +0000 Subject: [gpfsug-discuss] UK Spectrum Scale user group Message-ID: Registration is now open for the UK Spectrum Scale user group, taking place on 8th and 9th May 2019. Details and registration are available at: https://www.spectrumscaleug.org/event/uk-user-group-meeting/ We?re still looking for some user/customer talks to form part of the agenda and finalising the agenda speakers. If you are interested in presenting your use case of Spectrum Scale, please let me know. Thanks go out to our sponsors OCF, E8 storage, Lenovo and DDN storage and as always to IBM for supporting the event. Thanks Simon -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen.buchanan at us.ibm.com Thu Mar 14 19:58:04 2019 From: stephen.buchanan at us.ibm.com (Stephen R Buchanan) Date: Thu, 14 Mar 2019 19:58:04 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Message-ID: An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Mar 14 20:17:28 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 14 Mar 2019 20:17:28 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: From bbanister at jumptrading.com Thu Mar 14 20:47:35 2019 From: bbanister at jumptrading.com (Bryan Banister) Date: Thu, 14 Mar 2019 20:47:35 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: Message-ID: We use a site specific systemd unit which we call gpfs_fs.service that `BindsTo=gpfs.service` and `After=gpfs.service`. This service basically waits for GPFS to become active and then once active attempts to mount the required file systems. The list of file systems is determined by our own system configuration software (e.g. puppet/cfengine/salt-stack/ansible). We have also added a custom extension to gpfs.service (/usr/lib/systemd/system/gpfs.service.d/gpfs.service.conf) which adds a ExecStartPre to the IBM provided unit (we don?t want to mess with this IBM provide file). This ExecStartPre will make sure the node has the required version of GPFS installed and do some other basic checks. We have other systemd controlled process then both `BindsTo` and `After` the gpfs_fs.service. This works pretty well for us. Hope that helps, -Bryan From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Frederick Stock Sent: Thursday, March 14, 2019 3:17 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem [EXTERNAL EMAIL] It is not systemd based but you might want to look at the user callback feature in GPFS (mmaddcallback). There is a file system mount callback you could register. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Stephen R Buchanan" > Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 3:58 PM I searched the list archives with no obvious results. 
I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details) My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted. What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit. Any suggestions are welcome and appreciated. dev:(path names have been slightly generalized, but the structure is identical) SS filesystem: /dev01 full path: /dev01/app-bin/user-tree/builds/ soft link: /appbin/ -> /dev01/app-bin/user-tree/ test: SS filesystem: /tst01 full path: /tst01/app-bin/user-tree/builds/ soft link: /appbin/ -> /tst01/app-bin/user-tree/ prod: SS filesystem: /prd01 full path: /prd01/app-bin/user-tree/builds/ soft link: /appbin/ -> /prd01/app-bin/user-tree/ Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential, or privileged information and/or personal data. If you are not the intended recipient, you are hereby notified that any review, dissemination, or copying of this email is strictly prohibited, and requested to notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request, or solicitation of any kind to buy, sell, subscribe, redeem, or perform any type of transaction of a financial product. Personal data, as defined by applicable data privacy laws, contained in this email may be processed by the Company, and any of its affiliated or related companies, for potential ongoing compliance and/or business-related purposes. You may have rights regarding your personal data; for information on exercising these rights or the Company?s treatment of personal data, please email datarequests at jumptrading.com. 
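A rough sketch of the pattern Bryan describes above: a site-specific unit that binds to gpfs.service and waits for the required file systems, plus a drop-in that extends the IBM-supplied unit without editing it. The unit contents, helper script names and file system list are assumptions, not Bryan's actual files:

# /etc/systemd/system/gpfs_fs.service (sketch)
[Unit]
Description=Wait for GPFS to become active and mount required file systems
BindsTo=gpfs.service
After=gpfs.service

[Service]
Type=oneshot
RemainAfterExit=yes
# Assumed helper: loops on 'mmgetstate' until the node is active, then runs
# 'mmmount' for the file systems listed by the site's configuration management.
ExecStart=/usr/local/sbin/gpfs_wait_and_mount

[Install]
WantedBy=multi-user.target

# /usr/lib/systemd/system/gpfs.service.d/gpfs.service.conf (drop-in sketch)
[Service]
ExecStartPre=/usr/local/sbin/gpfs_preflight_checks

Services that need the file systems then declare BindsTo=gpfs_fs.service and After=gpfs_fs.service, as Bryan notes.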
-------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen.buchanan at us.ibm.com Thu Mar 14 20:52:32 2019 From: stephen.buchanan at us.ibm.com (Stephen R Buchanan) Date: Thu, 14 Mar 2019 20:52:32 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Thu Mar 14 21:04:24 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Thu, 14 Mar 2019 21:04:24 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: , , Message-ID: An HTML attachment was scrubbed... URL: From tyler.trafford at yale.edu Thu Mar 14 21:36:07 2019 From: tyler.trafford at yale.edu (Trafford, Tyler) Date: Thu, 14 Mar 2019 21:36:07 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: Message-ID: I use the following: [Unit] Description=Foo After=gpfs.service [Service] ExecStartPre=/bin/bash -c 'until [ -d /gpfs/%I/apps/services/foo ]; do sleep 20s; done' ExecStart=/usr/sbin/runuser -u root /gpfs/%I/apps/services/foo/bin/runme [Install] WantedBy=multi-user.target Then I can drop it on multiple systems (with the same app layout), and run: systemctl enable foo at fs1 or systemctl enable foo at fs2 The "%I" gets replaced by what is after that "@". -- Tyler Trafford tyler.trafford at yale.edu ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Stephen R Buchanan Sent: Thursday, March 14, 2019 3:58 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem I searched the list archives with no obvious results. I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details) My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted. What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit. Any suggestions are welcome and appreciated. 
dev:(path names have been slightly generalized, but the structure is identical) SS filesystem: /dev01 full path: /dev01/app-bin/user-tree/builds/ soft link: /appbin/ -> /dev01/app-bin/user-tree/ test: SS filesystem: /tst01 full path: /tst01/app-bin/user-tree/builds/ soft link: /appbin/ -> /tst01/app-bin/user-tree/ prod: SS filesystem: /prd01 full path: /prd01/app-bin/user-tree/builds/ soft link: /appbin/ -> /prd01/app-bin/user-tree/ Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com From tyler.trafford at yale.edu Thu Mar 14 21:38:32 2019 From: tyler.trafford at yale.edu (Trafford, Tyler) Date: Thu, 14 Mar 2019 21:38:32 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: , Message-ID: I forgot to mention that you need to name the unit file something like foo at .service -Tyler ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Trafford, Tyler Sent: Thursday, March 14, 2019 5:36 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem I use the following: [Unit] Description=Foo After=gpfs.service [Service] ExecStartPre=/bin/bash -c 'until [ -d /gpfs/%I/apps/services/foo ]; do sleep 20s; done' ExecStart=/usr/sbin/runuser -u root /gpfs/%I/apps/services/foo/bin/runme [Install] WantedBy=multi-user.target Then I can drop it on multiple systems (with the same app layout), and run: systemctl enable foo at fs1 or systemctl enable foo at fs2 The "%I" gets replaced by what is after that "@". -- Tyler Trafford tyler.trafford at yale.edu ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Stephen R Buchanan Sent: Thursday, March 14, 2019 3:58 PM To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem I searched the list archives with no obvious results. I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details) My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted. What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit. 
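Since the list software renders '@' as ' at ', Tyler's two mails read more clearly put together: the template file is named foo@.service, and each instance name replaces %I in the unit. The commands below are a sketch; the mount-point layout is carried over from his example:

# Save the template as /etc/systemd/system/foo@.service, then:
systemctl daemon-reload
systemctl enable --now foo@fs1    # %I expands to 'fs1', so the unit waits on /gpfs/fs1/...
systemctl enable --now foo@fs2

With Stephen's layout (file systems mounted at /dev01, /tst01, /prd01 rather than under /gpfs), the test path in ExecStartPre would need the leading /gpfs/ dropped, e.g. /%I/app-bin/user-tree/builds.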
Any suggestions are welcome and appreciated. dev:(path names have been slightly generalized, but the structure is identical) SS filesystem: /dev01 full path: /dev01/app-bin/user-tree/builds/ soft link: /appbin/ -> /dev01/app-bin/user-tree/ test: SS filesystem: /tst01 full path: /tst01/app-bin/user-tree/builds/ soft link: /appbin/ -> /tst01/app-bin/user-tree/ prod: SS filesystem: /prd01 full path: /prd01/app-bin/user-tree/builds/ soft link: /appbin/ -> /prd01/app-bin/user-tree/ Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam05.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7Ctyler.trafford%40yale.edu%7Cb3be6639c012419aaf3908d6a8c518c9%7Cdd8cbebb21394df8b4114e3e87abeb5c%7C0%7C0%7C636881961822668372&sdata=4rU%2BWIv1tpJiTOGmliWba2vvTxeJf5gYyJ7xrbdf6wE%3D&reserved=0 From makaplan at us.ibm.com Thu Mar 14 21:40:39 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 14 Mar 2019 16:40:39 -0500 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: Message-ID: K.I.S.S. Try to open and read a file that you stored in GPFS. If good, proceed. Otherwise wait a second and retry. Nope, nothing GPFS specific about that. Need not be privileged or root either. K.I.S.S. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Fri Mar 15 02:37:18 2019 From: ulmer at ulmer.org (Stephen Ulmer) Date: Thu, 14 Mar 2019 22:37:18 -0400 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: Message-ID: +1 ? This is the best solution. The only thing I would change would be to add: TimeoutStartSec=300 Or something similar. This leaves the maintenance of starting applications where it belongs (in systems, not in GPFS). You can use the same technique for other VFS types (like NFS if you needed). You can check for any file on the file system you want, so you could just put a dotfile in the root of each waited-for file system and look for that. You an even chase your symlink if you want (removing the parameter completely). As a recovering sysadmin, this makes me smile. -- Stephen > On Mar 14, 2019, at 5:36 PM, Trafford, Tyler wrote: > > I use the following: > > [Unit] > Description=Foo > After=gpfs.service > > [Service] > ExecStartPre=/bin/bash -c 'until [ -d /gpfs/%I/apps/services/foo ]; do sleep 20s; done' > ExecStart=/usr/sbin/runuser -u root /gpfs/%I/apps/services/foo/bin/runme > > [Install] > WantedBy=multi-user.target > > > Then I can drop it on multiple systems (with the same app layout), and run: > > systemctl enable foo at fs1 > or > systemctl enable foo at fs2 > > The "%I" gets replaced by what is after that "@". > > -- > Tyler Trafford > tyler.trafford at yale.edu > > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Stephen R Buchanan > Sent: Thursday, March 14, 2019 3:58 PM > To: gpfsug-discuss at spectrumscale.org > Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem > > I searched the list archives with no obvious results. 
> > I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details) > > My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted. > > What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit. > > Any suggestions are welcome and appreciated. > > dev:(path names have been slightly generalized, but the structure is identical) > SS filesystem: /dev01 > full path: /dev01/app-bin/user-tree/builds/ > soft link: /appbin/ -> /dev01/app-bin/user-tree/ > > test: > SS filesystem: /tst01 > full path: /tst01/app-bin/user-tree/builds/ > soft link: /appbin/ -> /tst01/app-bin/user-tree/ > > prod: > SS filesystem: /prd01 > full path: /prd01/app-bin/user-tree/builds/ > soft link: /appbin/ -> /prd01/app-bin/user-tree/ > > > Stephen R. Wall Buchanan > Sr. IT Specialist > IBM Data & AI North America Government Expert Labs > +1 (571) 299-4601 > stephen.buchanan at us.ibm.com > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Mar 15 08:49:44 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Fri, 15 Mar 2019 08:49:44 +0000 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: References: Message-ID: <433E6EBE-069B-4600-BA91-79E052050705@bham.ac.uk> +1 for using callbacks, we use the Mount and preUnMount callbacks on various things, e.g. before unmount, shutdown all the VMs running on the host, i.e. start and stop other things cleanly when the FS arrives/before it goes away. Simon From: on behalf of "stockf at us.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 14 March 2019 at 21:04 To: "gpfsug-discuss at spectrumscale.org" Cc: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem But if all you are waiting for is the mount to occur the invocation of the callback informs you the file system has been mounted. You would be free to start a command in the background, with appropriate protection, and exit the callback script. 
Also, making the callback script run asynchronous means GPFS will not wait for it to complete and that greatly mitigates any potential problems with GPFS commands, if you need to run them from the script. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Stephen R Buchanan" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 4:52 PM The man page for mmaddcallback specifically cautions against running "commands that involve GPFS files" because it "may cause unexpected and undesired results, including loss of file system availability." While I can imagine some kind of Rube Goldberg-esque chain of commands that I could run locally that would trigger the GPFS-filesystem-based commands I really want, I don't think mmaddcallback is the droid I'm looking for. Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com ----- Original message ----- From: "Frederick Stock" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 4:17 PM It is not systemd based but you might want to look at the user callback feature in GPFS (mmaddcallback). There is a file system mount callback you could register. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Stephen R Buchanan" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 3:58 PM I searched the list archives with no obvious results. I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details) My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted. What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. 
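For completeness, registering the callbacks Fred and Simon describe looks roughly like the following; the identifiers and script paths are assumptions, and, given the caution quoted above, the scripts should background any work that touches GPFS files and return quickly:

# Start the application whenever a file system is mounted on this node
mmaddcallback startAppOnMount --command /usr/local/sbin/start_app.sh \
    --event mount --parms "%eventName %fsName"

# Simon's pattern: stop things cleanly before the file system goes away
mmaddcallback stopAppBeforeUnmount --command /usr/local/sbin/stop_app.sh \
    --event preUnmount --parms "%eventName %fsName"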
There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit. Any suggestions are welcome and appreciated. dev:(path names have been slightly generalized, but the structure is identical) SS filesystem: /dev01 full path: /dev01/app-bin/user-tree/builds/ soft link: /appbin/ -> /dev01/app-bin/user-tree/ test: SS filesystem: /tst01 full path: /tst01/app-bin/user-tree/builds/ soft link: /appbin/ -> /tst01/app-bin/user-tree/ prod: SS filesystem: /prd01 full path: /prd01/app-bin/user-tree/builds/ soft link: /appbin/ -> /prd01/app-bin/user-tree/ Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From TOMP at il.ibm.com Fri Mar 15 18:12:14 2019 From: TOMP at il.ibm.com (Tomer Perry) Date: Fri, 15 Mar 2019 20:12:14 +0200 Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem In-Reply-To: <433E6EBE-069B-4600-BA91-79E052050705@bham.ac.uk> References: <433E6EBE-069B-4600-BA91-79E052050705@bham.ac.uk> Message-ID: I also for using callbacks ( as that's the "right" way for GPFS to report an event) instead of polling for status. One exception is the umount case, in which when using bind mounts ( which is quite common for "namespace virtualization") one should use the preunmount user exit instead of callback ( callback wouldn't work on these cases) for more info check https://www.ibm.com/developerworks/community/wikis/home?lang=en-gb#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/User%20Exits Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Simon Thompson To: gpfsug main discussion list Date: 15/03/2019 10:53 Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Sent by: gpfsug-discuss-bounces at spectrumscale.org +1 for using callbacks, we use the Mount and preUnMount callbacks on various things, e.g. before unmount, shutdown all the VMs running on the host, i.e. start and stop other things cleanly when the FS arrives/before it goes away. Simon From: on behalf of "stockf at us.ibm.com" Reply-To: "gpfsug-discuss at spectrumscale.org" Date: Thursday, 14 March 2019 at 21:04 To: "gpfsug-discuss at spectrumscale.org" Cc: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem But if all you are waiting for is the mount to occur the invocation of the callback informs you the file system has been mounted. You would be free to start a command in the background, with appropriate protection, and exit the callback script. 
Also, making the callback script run asynchronous means GPFS will not wait for it to complete and that greatly mitigates any potential problems with GPFS commands, if you need to run them from the script. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Stephen R Buchanan" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 4:52 PM The man page for mmaddcallback specifically cautions against running "commands that involve GPFS files" because it "may cause unexpected and undesired results, including loss of file system availability." While I can imagine some kind of Rube Goldberg-esque chain of commands that I could run locally that would trigger the GPFS-filesystem-based commands I really want, I don't think mmaddcallback is the droid I'm looking for. Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com ----- Original message ----- From: "Frederick Stock" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 4:17 PM It is not systemd based but you might want to look at the user callback feature in GPFS (mmaddcallback). There is a file system mount callback you could register. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Stephen R Buchanan" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug-discuss at spectrumscale.org Cc: Subject: [gpfsug-discuss] Systemd configuration to wait for mount of SS filesystem Date: Thu, Mar 14, 2019 3:58 PM I searched the list archives with no obvious results. I have an application that runs completely from a Spectrum Scale filesystem that I would like to start automatically on boot, obviously after the SS filesystem mounts, on multiple nodes. There are groups of nodes for dev, test, and production, (separate clusters) and the target filesystems are different between them (and are named differently, so the paths are different), but all nodes have an identical soft link from root (/) that points to the environment-specific path. (see below for details) My first effort before I did any research was to try to simply use a directive of After=gpfs.service which anyone who has tried it will know that the gpfs.service returns as "started" far in advance (and independently of) when filesystems are actually mounted. What I want is to be able to deploy a systemd service-unit and path-unit pair of files (that are as close to identical as possible across the environments) that wait for /appbin/builds/ to be available (/[dev|tst|prd]01/ to be mounted) and then starts the application. The problem is that systemd.path units, specifically the 'PathExists=' directive, don't follow symbolic links, so I would need to customize the path unit file for each environment with the full (real) path. 
There are other differences between the environments that I believe I can handle by specifying an EnvironmentFile directive -- but that would come from the SS filesystem so as to be a single reference point, so it can't help with the path unit. Any suggestions are welcome and appreciated. dev:(path names have been slightly generalized, but the structure is identical) SS filesystem: /dev01 full path: /dev01/app-bin/user-tree/builds/ soft link: /appbin/ -> /dev01/app-bin/user-tree/ test: SS filesystem: /tst01 full path: /tst01/app-bin/user-tree/builds/ soft link: /appbin/ -> /tst01/app-bin/user-tree/ prod: SS filesystem: /prd01 full path: /prd01/app-bin/user-tree/builds/ soft link: /appbin/ -> /prd01/app-bin/user-tree/ Stephen R. Wall Buchanan Sr. IT Specialist IBM Data & AI North America Government Expert Labs +1 (571) 299-4601 stephen.buchanan at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=mLPyKeOa1gNDrORvEXBgMw&m=F9Tf-JhgNwLBBIROpBcPVceJFINblVd6CHoSA1tOhmw&s=bhtWnfg7iqmu6Xu6_pJePULJ8jw8-4mFHftkHZ_bZho&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Mon Mar 18 19:09:34 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 18 Mar 2019 19:09:34 +0000 Subject: [gpfsug-discuss] SSDs for data - DWPD? In-Reply-To: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> References: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> Message-ID: <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> Hi All, Just wanted to follow up with the results of my survey ? I received a grand total of two responses (Thanks Alex and John). In their case, they?re using SSDs with a 10 DWPD rating. The motivation behind my asking this question was ? money! ;-). Seriously, 10 DWPD drives are still very expensive, while 3 DWPD drives are significantly less expensive and 1 DWPD drives are even cheaper still. While we would NOT feel comfortable using anything less than 10 DWPD drives for metadata, we?re wondering about using less expensive drives for data. For example, let?s just say that you?re getting ready to set up a brand new GPFS 5 formatted filesystem of 1-2 PB in size. You?re considering having 3 pools: 1) a metadata only system pool of 10 DWPD SSDs. 4K inodes, and a ton of small files that?ll fit in the inode. 2) a data only ?hot? pool (i.e. the default pool for writes) of SSDs. 3) a data only ?capacity? pool of 12 TB spinning disks. And let?s just say that you have looked back at the historical data you?ve collected and you see that over the last 6 months or so you?ve been averaging 10-12 TB of data being written into your existing filesystem per day. You want to do migrations between pools only on the weekends if at all possible. 12 * 7 = 84 TB. So if you had somewhere between 125 - 150 TB of SSDs ... 1 DWPD SSDs ? 
then in theory you should easily be able to handle your anticipated workload without coming close to exceeding the 1 DWPD rating of the SSDs. However, as the saying goes, while in theory there's no difference between theory and practice, in practice there is ... so am I overlooking anything here from a GPFS perspective??? If anybody still wants to respond on the DWPD rating of the SSDs they use for data, I'm still listening. Thanks... Kevin P.S. I still have a couple of "outstanding issues" to respond to that I've posted to the list about previously: 1) the long I/Os we see occasionally in the output of 'mmdiag --iohist' on our NSD servers. We're still trying to track that down - it seems to happen only with a subset of our hardware - most of the time at least - but we're still working to track down what triggers it - i.e. at this point I can't say whether it's really the hardware or a user abusing the hardware. 2) I promised to post benchmark results of 3 different metadata configs: a) RAID 1 mirrors, b) a RAID 5 stripe, c) no RAID, but GPFS metadata replication of 3. That benchmarking has been put on hold for reasons I can't really discuss on this mailing list at this time - but hopefully soon. I haven't forgotten the above and will respond back on the list when it's appropriate. Thanks... On Mar 8, 2019, at 10:24 AM, Buterbaugh, Kevin L > wrote: Hi All, This is kind of a survey if you will, so for this one it might be best if you responded directly to me and I'll summarize the results next week. Question 1 - do you use SSDs for data? If not - i.e. if you only use SSDs for metadata (as we currently do) - thanks, that's all! If, however, you do use SSDs for data, please see Question 2. Question 2 - what is the DWPD (drive writes per day) of the SSDs that you use for data? Question 3 - is that different than the DWPD of the SSDs for metadata? Question 4 - any pertinent information in regards to your answers above (i.e. if you've got a filesystem that data is uploaded to only once and never modified after that then that's useful to know!)? Thanks... Kevin -- Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Mon Mar 18 21:13:58 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Mon, 18 Mar 2019 21:13:58 +0000 Subject: [gpfsug-discuss] SSDs for data - DWPD? In-Reply-To: <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> References: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu>, <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> Message-ID: Did you look at pricing larger SSDs than you need and only using partial capacity to get more DWPD out of them? I.e. 1TB drive 3 DWPD = 3TBpd 2TB drive (using 1/2 capacity) = 6TBpd Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Buterbaugh, Kevin L [Kevin.Buterbaugh at Vanderbilt.Edu] Sent: 18 March 2019 19:09 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] SSDs for data - DWPD? Hi All, Just wanted to follow up with the results of my survey - I received a grand total of two responses (Thanks Alex and John). In their case, they're using SSDs with a 10 DWPD rating. The motivation behind my asking this question was ... money! ;-).
Seriously, 10 DWPD drives are still very expensive, while 3 DWPD drives are significantly less expensive and 1 DWPD drives are even cheaper still. While we would NOT feel comfortable using anything less than 10 DWPD drives for metadata, we?re wondering about using less expensive drives for data. For example, let?s just say that you?re getting ready to set up a brand new GPFS 5 formatted filesystem of 1-2 PB in size. You?re considering having 3 pools: 1) a metadata only system pool of 10 DWPD SSDs. 4K inodes, and a ton of small files that?ll fit in the inode. 2) a data only ?hot? pool (i.e. the default pool for writes) of SSDs. 3) a data only ?capacity? pool of 12 TB spinning disks. And let?s just say that you have looked back at the historical data you?ve collected and you see that over the last 6 months or so you?ve been averaging 10-12 TB of data being written into your existing filesystem per day. You want to do migrations between pools only on the weekends if at all possible. 12 * 7 = 84 TB. So if you had somewhere between 125 - 150 TB of SSDs ... 1 DWPD SSDs ? then in theory you should easily be able to handle your anticipated workload without coming close to exceeding the 1 DWPD rating of the SSDs. However, as the saying goes, while in theory there?s no difference between theory and practice, in practice there is ... so am I overlooking anything here from a GPFS perspective??? If anybody still wants to respond on the DWPD rating of the SSDs they use for data, I?m still listening. Thanks? Kevin P.S. I still have a couple of ?outstanding issues? to respond to that I?ve posted to the list about previously: 1) the long I/O?s we see occasionally in the output of ?mmdiag ?iohist? on our NSD servers. We?re still trying to track that down ? it seems to happen only with a subset of our hardware - most of the time at least - but we?re still working to track down what triggers it ? i.e. at this point I can?t say whether it?s really the hardware or a user abusing the hardware. 2) I promised to post benchmark results of 3 different metadata configs: a) RAID 1 mirrors, b) a RAID 5 stripe, c) no RAID, but GPFS metadata replication of 3. That benchmarking has been put on hold for reasons I can?t really discuss on this mailing list at this time ? but hopefully soon. I haven?t forgotten the above and will respond back on the list when it?s appropriate. Thanks... On Mar 8, 2019, at 10:24 AM, Buterbaugh, Kevin L > wrote: Hi All, This is kind of a survey if you will, so for this one it might be best if you responded directly to me and I?ll summarize the results next week. Question 1 - do you use SSDs for data? If not - i.e. if you only use SSDs for metadata (as we currently do) - thanks, that?s all! If, however, you do use SSDs for data, please see Question 2. Question 2 - what is the DWPD (daily writes per day) of the SSDs that you use for data? Question 3 - is that different than the DWPD of the SSDs for metadata? Question 4 - any pertinent information in regards to your answers above (i.e. if you?ve got a filesystem that data is uploaded to only once and never modified after that then that?s useful to know!)? Thanks? Kevin ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 From Kevin.Buterbaugh at Vanderbilt.Edu Mon Mar 18 22:45:03 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Mon, 18 Mar 2019 22:45:03 +0000 Subject: [gpfsug-discuss] SSDs for data - DWPD? In-Reply-To: References: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> Message-ID: Thanks for the suggestion, Simon. Yes, we?ve looked at that, but we think that we?re going to potentially be in a situation where we?re using fairly big SSDs already. For example, if we bought 30 6.4 TB SSDs rated at 1 DWPD and configured them as 6 4+1P RAID 5 LUNs, then we?d end up with a usable capacity of 6 * 4 * 6 = ~144 TB usable space in our ?hot? pool. That would satisfy our capacity needs and also not exceed the 1 DWPD rating of the drives. BTW, we noticed with one particular vendor that their 3 DWPD drives were exactly 1/3rd the size of their 1 DWPD drives ? which makes us wonder if that?s coincidence or not. Anybody know for sure? Thanks? Kevin > On Mar 18, 2019, at 4:13 PM, Simon Thompson wrote: > > Did you look at pricing larger SSDs than you need and only using partial capacity to get more DWPD out of them? > > I.e. 1TB drive 3dpwd = 3TBpd > 2TB drive (using 1/2 capacity) = 6TBpd > > Simon > ________________________________________ > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Buterbaugh, Kevin L [Kevin.Buterbaugh at Vanderbilt.Edu] > Sent: 18 March 2019 19:09 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] SSDs for data - DWPD? > > Hi All, > > Just wanted to follow up with the results of my survey ? I received a grand total of two responses (Thanks Alex and John). In their case, they?re using SSDs with a 10 DWPD rating. > > The motivation behind my asking this question was ? money! ;-). Seriously, 10 DWPD drives are still very expensive, while 3 DWPD drives are significantly less expensive and 1 DWPD drives are even cheaper still. While we would NOT feel comfortable using anything less than 10 DWPD drives for metadata, we?re wondering about using less expensive drives for data. > > For example, let?s just say that you?re getting ready to set up a brand new GPFS 5 formatted filesystem of 1-2 PB in size. You?re considering having 3 pools: > > 1) a metadata only system pool of 10 DWPD SSDs. 4K inodes, and a ton of small files that?ll fit in the inode. > 2) a data only ?hot? pool (i.e. the default pool for writes) of SSDs. > 3) a data only ?capacity? pool of 12 TB spinning disks. > > And let?s just say that you have looked back at the historical data you?ve collected and you see that over the last 6 months or so you?ve been averaging 10-12 TB of data being written into your existing filesystem per day. You want to do migrations between pools only on the weekends if at all possible. > > 12 * 7 = 84 TB. So if you had somewhere between 125 - 150 TB of SSDs ... 1 DWPD SSDs ? then in theory you should easily be able to handle your anticipated workload without coming close to exceeding the 1 DWPD rating of the SSDs. > > However, as the saying goes, while in theory there?s no difference between theory and practice, in practice there is ... so am I overlooking anything here from a GPFS perspective??? 
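A quick sanity check of the arithmetic in this thread, as shell arithmetic; the RAID-5 amplification factor is an assumption (full-stripe writes only, ignoring read-modify-write on partial stripes), and note that 30 x 6.4 TB drives in 4+1 RAID 5 give about 154 TB usable rather than the ~144 TB quoted:

# Assumed inputs from the thread: ~12 TB/day of host writes, 30 x 6.4 TB
# drives rated 1 DWPD, arranged as 6 LUNs of 4+1 RAID 5.
awk 'BEGIN {
  daily_tb  = 12;            # host writes per day, TB
  raw_tb    = 30 * 6.4;      # 192 TB of raw flash
  usable_tb = 6 * 4 * 6.4;   # 153.6 TB usable
  raid_amp  = 1.25;          # 5 drive writes per 4 data blocks on full stripes
  printf "drive writes per day vs raw capacity:    %.3f DWPD\n", daily_tb * raid_amp / raw_tb;
  printf "drive writes per day vs usable capacity: %.3f DWPD\n", daily_tb * raid_amp / usable_tb;
}'

Either way the estimate sits far below 1 DWPD, which supports Kevin's reasoning; the caveat is that the real write amplification should be measured rather than assumed, which is what Jonathan's smartctl figures below are about.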
> > If anybody still wants to respond on the DWPD rating of the SSDs they use for data, I?m still listening. > > Thanks? > > Kevin > > P.S. I still have a couple of ?outstanding issues? to respond to that I?ve posted to the list about previously: > > 1) the long I/O?s we see occasionally in the output of ?mmdiag ?iohist? on our NSD servers. We?re still trying to track that down ? it seems to happen only with a subset of our hardware - most of the time at least - but we?re still working to track down what triggers it ? i.e. at this point I can?t say whether it?s really the hardware or a user abusing the hardware. > > 2) I promised to post benchmark results of 3 different metadata configs: a) RAID 1 mirrors, b) a RAID 5 stripe, c) no RAID, but GPFS metadata replication of 3. That benchmarking has been put on hold for reasons I can?t really discuss on this mailing list at this time ? but hopefully soon. > > I haven?t forgotten the above and will respond back on the list when it?s appropriate. Thanks... > > On Mar 8, 2019, at 10:24 AM, Buterbaugh, Kevin L > wrote: > > Hi All, > > This is kind of a survey if you will, so for this one it might be best if you responded directly to me and I?ll summarize the results next week. > > Question 1 - do you use SSDs for data? If not - i.e. if you only use SSDs for metadata (as we currently do) - thanks, that?s all! If, however, you do use SSDs for data, please see Question 2. > > Question 2 - what is the DWPD (daily writes per day) of the SSDs that you use for data? > > Question 3 - is that different than the DWPD of the SSDs for metadata? > > Question 4 - any pertinent information in regards to your answers above (i.e. if you?ve got a filesystem that data is uploaded to only once and never modified after that then that?s useful to know!)? > > Thanks? > > Kevin > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C274d56e2906e4df3340a08d6abe6a61e%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C0%7C636885404456477052&sdata=eJ6XKuMQ3H4y8V1kyTd8%2ByGJX0rhlTqfcl0fce14pYA%3D&reserved=0 From alvise.dorigo at psi.ch Tue Mar 19 09:25:37 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Tue, 19 Mar 2019 09:25:37 +0000 Subject: [gpfsug-discuss] Calculate evicted space with a policy Message-ID: <83A6EEB0EC738F459A39439733AE804526840922@MBX214.d.ethz.ch> Dear users, is there a way (through a policy) to list the files (and their size) that are actually completely evicted by AFM from the cache filesystem ? I used a policy with the clause KB_ALLOCATED=0, but it is clearly not precise, because it also includes files that are not evicted, but are so small that they fit into their inodes (I'm assuming that GPFS inode structure has this feature similar to some regular filesystems, like ext4... otherwise I could not explain some non empty file with 0 allocated KB that have been fetched, i.e. non-evicted). Many thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stockf at us.ibm.com Tue Mar 19 11:29:30 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Tue, 19 Mar 2019 11:29:30 +0000 Subject: [gpfsug-discuss] Calculate evicted space with a policy In-Reply-To: <83A6EEB0EC738F459A39439733AE804526840922@MBX214.d.ethz.ch> References: <83A6EEB0EC738F459A39439733AE804526840922@MBX214.d.ethz.ch> Message-ID: An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Tue Mar 19 12:10:08 2019 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Tue, 19 Mar 2019 12:10:08 +0000 Subject: [gpfsug-discuss] SSDs for data - DWPD? In-Reply-To: <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> References: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> Message-ID: <93a0eecad50c925708aa60f15aecb361d0a022df.camel@strath.ac.uk> On Mon, 2019-03-18 at 19:09 +0000, Buterbaugh, Kevin L wrote: [SNIP] > > 12 * 7 = 84 TB. So if you had somewhere between 125 - 150 TB of SSDs > ... 1 DWPD SSDs ? then in theory you should easily be able to handle > your anticipated workload without coming close to exceeding the 1 > DWPD rating of the SSDs. > > However, as the saying goes, while in theory there?s no difference > between theory and practice, in practice there is ... so am I > overlooking anything here from a GPFS perspective??? > > If anybody still wants to respond on the DWPD rating of the SSDs they > use for data, I?m still listening. I would be weary of write amplification in RAID coming to bite you in the ass. Just because you write 1TB of data to the file system does not mean the drives write 1TB of data, it could be 2TB of data. I would if you can look at the data written to the drives using smartctl if you are on a DSS or ESS or something similar if they are behind a conventional storage array. So for example on my DSS-G picking a random drive used for data which is an 8TB NL-SAS for the record shows the following in the output of smartctl -a Error counter log: Errors Corrected by Total Correction Gigabytes Total ECC rereads/ errors algorithm processed uncorrected fast | delayed rewrites corrected invocations [10^9 bytes] errors read: 682040270 0 0 682040270 0 116208.442 0 write: 0 0 0 0 0 34680.694 0 Looking at the gigabytes processed shows that 33TB has been written to the drive. These are lifetime figures for the drive, so there is no under reporting/estimation going on. If you can get these figures back you can calculate what drive writes you need because they encapsulate the RAID write amplification. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From alvise.dorigo at psi.ch Tue Mar 19 12:09:10 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Tue, 19 Mar 2019 12:09:10 +0000 Subject: [gpfsug-discuss] Calculate evicted space with a policy In-Reply-To: References: <83A6EEB0EC738F459A39439733AE804526840922@MBX214.d.ethz.ch>, Message-ID: <83A6EEB0EC738F459A39439733AE8045268409B0@MBX214.d.ethz.ch> Thanks Fred. 
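A sketch of the scan Fred suggests (his reply is quoted just below): an mmapplypolicy list rule that selects files whose data is not cached, keyed on the 'u' flag in MISC_ATTRIBUTES rather than on KB_ALLOCATED, so small data-in-inode files that have been fetched are not miscounted. The list name, SHOW columns and invocation are illustrative assumptions:

/* evicted.pol -- run with: mmapplypolicy <fs-or-fileset-path> -P evicted.pol -I defer -f /tmp/evicted */
RULE EXTERNAL LIST 'evicted' EXEC ''
RULE 'listEvicted' LIST 'evicted'
     SHOW( VARCHAR(FILE_SIZE) || ' ' || VARCHAR(KB_ALLOCATED) )
     WHERE MISC_ATTRIBUTES LIKE '%F%'          /* regular files only */
       AND MISC_ATTRIBUTES NOT LIKE '%u%'      /* data not cached locally */

Summing the FILE_SIZE column of the resulting list file then gives the logical size of the evicted data.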
A ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Frederick Stock [stockf at us.ibm.com] Sent: Tuesday, March 19, 2019 12:29 PM To: gpfsug-discuss at spectrumscale.org Cc: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] Calculate evicted space with a policy You can scan for files using the MISC_ATTRIBUTES and look for those that are not cached, that is without the 'u' setting, and track their file size. I think that should work. Fred __________________________________________________ Fred Stock | IBM Pittsburgh Lab | 720-430-8821 stockf at us.ibm.com ----- Original message ----- From: "Dorigo Alvise (PSI)" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: "gpfsug-discuss at spectrumscale.org" Cc: Subject: [gpfsug-discuss] Calculate evicted space with a policy Date: Tue, Mar 19, 2019 5:27 AM Dear users, is there a way (through a policy) to list the files (and their size) that are actually completely evicted by AFM from the cache filesystem ? I used a policy with the clause KB_ALLOCATED=0, but it is clearly not precise, because it also includes files that are not evicted, but are so small that they fit into their inodes (I'm assuming that GPFS inode structure has this feature similar to some regular filesystems, like ext4... otherwise I could not explain some non empty file with 0 allocated KB that have been fetched, i.e. non-evicted). Many thanks, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From nathan.harper at cfms.org.uk Tue Mar 19 13:36:26 2019 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Tue, 19 Mar 2019 13:36:26 +0000 Subject: [gpfsug-discuss] SSDs for data - DWPD? In-Reply-To: <93a0eecad50c925708aa60f15aecb361d0a022df.camel@strath.ac.uk> References: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> <93a0eecad50c925708aa60f15aecb361d0a022df.camel@strath.ac.uk> Message-ID: It has been interesting to watch the evolution of the same discussion over on the Ceph Users mailing list over the last few years. Obviously GPFS and Ceph are used differently, so the comparison isn't direct, but the attitudes have generally shifted from recommending only high DWPD drives to the lower (or sometimes even lowest) tiers. The reasoning tends to be that often you will write less data than you think, and also drives often last longer than their rating. We have an all SSD (Samsung SM863a) Ceph cluster backing an Openstack system that's been in production for ~12 months, and the drives are all reporting 97%+ endurance remaining. It's not the busiest of storage backends, but Ceph is a notorious write amplifier and I'm more than happy with the ongoing endurance that I expect to see. On Tue, 19 Mar 2019 at 12:10, Jonathan Buzzard < jonathan.buzzard at strath.ac.uk> wrote: > On Mon, 2019-03-18 at 19:09 +0000, Buterbaugh, Kevin L wrote: > > [SNIP] > > > > > 12 * 7 = 84 TB. So if you had somewhere between 125 - 150 TB of SSDs > > ... 1 DWPD SSDs ? then in theory you should easily be able to handle > > your anticipated workload without coming close to exceeding the 1 > > DWPD rating of the SSDs. > > > > However, as the saying goes, while in theory there?s no difference > > between theory and practice, in practice there is ... 
so am I > > overlooking anything here from a GPFS perspective??? > > > > If anybody still wants to respond on the DWPD rating of the SSDs they > > use for data, I?m still listening. > > I would be weary of write amplification in RAID coming to bite you in > the ass. Just because you write 1TB of data to the file system does not > mean the drives write 1TB of data, it could be 2TB of data. > > I would if you can look at the data written to the drives using > smartctl if you are on a DSS or ESS or something similar if they are > behind a conventional storage array. > > So for example on my DSS-G picking a random drive used for data which > is an 8TB NL-SAS for the record shows the following in the output of > smartctl -a > > Error counter log: > Errors Corrected by Total Correction Gigabytes > Total > ECC rereads/ errors algorithm processed > uncorrected > fast | delayed rewrites corrected invocations [10^9 > bytes] errors > read: 682040270 0 0 682040270 0 116208.442 > 0 > write: 0 0 0 0 0 34680.694 > 0 > > > Looking at the gigabytes processed shows that 33TB has been written to > the drive. These are lifetime figures for the drive, so there is no > under reporting/estimation going on. > > If you can get these figures back you can calculate what drive writes > you need because they encapsulate the RAID write amplification. > > JAB. > > -- > Jonathan A. Buzzard Tel: +44141-5483420 > HPC System Administrator, ARCHIE-WeSt. > University of Strathclyde, John Anderson Building, Glasgow. G4 0NG > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From valdis.kletnieks at vt.edu Tue Mar 19 17:22:32 2019 From: valdis.kletnieks at vt.edu (Valdis Kl=?utf-8?Q?=c4=93?=tnieks) Date: Tue, 19 Mar 2019 13:22:32 -0400 Subject: [gpfsug-discuss] SSDs for data - DWPD? In-Reply-To: <93a0eecad50c925708aa60f15aecb361d0a022df.camel@strath.ac.uk> References: <7B8A565F-94B7-419E-A2D0-35FE1C898BB6@vanderbilt.edu> <9C6708C6-B45D-4947-A8FD-07FEBE9CE131@vanderbilt.edu> <93a0eecad50c925708aa60f15aecb361d0a022df.camel@strath.ac.uk> Message-ID: <28273.1553016152@turing-police> On Tue, 19 Mar 2019 12:10:08 -0000, Jonathan Buzzard said: > I would be weary of write amplification in RAID coming to bite you in > the ass. Just because you write 1TB of data to the file system does not > mean the drives write 1TB of data, it could be 2TB of data. Right, but that 2T would be across multiple drives. That's part of why write amplification can cause problems - many RAID subsystems are unable to do the writes in true parallel across the drives involved. From chris.schlipalius at pawsey.org.au Wed Mar 20 23:10:11 2019 From: chris.schlipalius at pawsey.org.au (Chris Schlipalius) Date: Thu, 21 Mar 2019 07:10:11 +0800 Subject: [gpfsug-discuss] SCA19 presentations are live! Message-ID: <694B7DC7-4746-4DE2-B478-50CA2D33F906@pawsey.org.au> Please see https://www.spectrumscaleug.org/presentations/ for the latest uploads. There are some excellent contemporary talks. Regards, Chris Schlipalius Team Lead, Data Storage Infrastructure, Data & Visualisation, Pawsey Supercomputing Centre (CSIRO) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chair at spectrumscale.org Thu Mar 21 09:08:43 2019 From: chair at spectrumscale.org (Simon Thompson (Spectrum Scale UG Chair)) Date: Thu, 21 Mar 2019 09:08:43 +0000 Subject: [gpfsug-discuss] SSUG Website updates Message-ID: <1d056d6a-98ba-40e5-9c8b-265c9946574d@email.android.com> An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Thu Mar 21 13:22:45 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Thu, 21 Mar 2019 13:22:45 +0000 Subject: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Message-ID: <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> Hi, I'm a little bit puzzled about different meanings of blocksize for different GPFS installation (standard and gnr). >From this page https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/File%20System%20Planning I read: * The blocksize is the largest size IO that GPFS can issue to the underlying device * A subblock is 1/32nd of blocksize. This is the smallest allocation to a single file For non-gnr GPFS device is quite clear to me (I hope): it is a single spinning disk (or ssd). And I verified this on a small cluster composed of nsd using their local hard drive. Can someone explain what is the "device" in the case of GNR ? a single pdisk ? Thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Mar 21 13:32:04 2019 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Thu, 21 Mar 2019 09:32:04 -0400 Subject: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR In-Reply-To: <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> References: <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> Message-ID: The underlying device in this context is the NSD, network storage device. This has relation at all to 512 byte or 4K disk blocks. Usually around a meg, always a power of two. -- ddj Dave Johnson > On Mar 21, 2019, at 9:22 AM, Dorigo Alvise (PSI) wrote: > > Hi, > I'm a little bit puzzled about different meanings of blocksize for different GPFS installation (standard and gnr). > > From this page https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/File%20System%20Planning > > I read: > The blocksize is the largest size IO that GPFS can issue to the underlying device > A subblock is 1/32nd of blocksize. This is the smallest allocation to a single file > For non-gnr GPFS device is quite clear to me (I hope): it is a single spinning disk (or ssd). And I verified this on a small cluster composed of nsd using their local hard drive. > > Can someone explain what is the "device" in the case of GNR ? a single pdisk ? > > Thanks, > > Alvise > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Thu Mar 21 15:05:14 2019 From: david_johnson at brown.edu (David Johnson) Date: Thu, 21 Mar 2019 11:05:14 -0400 Subject: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 download broken Message-ID: <9D0D3D23-923D-4C48-B775-BD9818CB2DD6@brown.edu> I tried twice to download the latest PTF, but the md5sum did not match and the package will not install. Succeeded with the Protocols version. 
There is no link on the web page to report problems, so I'm posting here hoping someone can get it fixed. -- ddj From constance.rice at us.ibm.com Thu Mar 21 15:43:39 2019 From: constance.rice at us.ibm.com (Constance M Rice) Date: Thu, 21 Mar 2019 15:43:39 +0000 Subject: [gpfsug-discuss] interested in people who have implemented immutability. what was your use case? Message-ID: Hello, I'm trying to understand how and why people have been using the immutability (WORM) functions that became available in scale 4.2. I don't need a company name, just an idea of why and how you use it. TIA Connie Rice Storage Specialist Washington Systems Center Mobile: 202-821-6747 E-mail: constance.rice at us.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 93815 bytes Desc: not available URL: From rpergamin at ddn.com Thu Mar 21 15:52:56 2019 From: rpergamin at ddn.com (Ran Pergamin) Date: Thu, 21 Mar 2019 15:52:56 +0000 Subject: [gpfsug-discuss] interested in people who have implemented immutability. what was your use case? In-Reply-To: References: Message-ID: <153DB3CA-9EB1-4F09-BB16-BBDE345137EE@ddn.com> Hi, A customer of mine uses it to store its own customers Digital receipts that they need to store for 12years, due to government regulations, without ability to delete whatsoever. It?s also replicated across two sites. Another uses it to protect critical video digital assets. Regards, Ran On 21 Mar 2019, at 17:43, Constance M Rice > wrote: Hello, I'm trying to understand how and why people have been using the immutability (WORM) functions that became available in scale 4.2. I don't need a company name, just an idea of why and how you use it. TIA Connie Rice Storage Specialist Washington Systems Center Mobile: 202-821-6747 E-mail: constance.rice at us.ibm.com _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.jpg Type: image/jpeg Size: 93815 bytes Desc: ATT00001.jpg URL: From S.J.Thompson at bham.ac.uk Thu Mar 21 17:05:25 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Thu, 21 Mar 2019 17:05:25 +0000 Subject: [gpfsug-discuss] interested in people who have implemented immutability. what was your use case? In-Reply-To: References: Message-ID: We've used soft immutability before for data associated with specific research projects. Soft so that we can remove data as "root" - sometimes research data is a partner's and there are cases when they require us to be able to remove their data. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of constance.rice at us.ibm.com [constance.rice at us.ibm.com] Sent: 21 March 2019 15:43 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] interested in people who have implemented immutability. what was your use case? Hello, I'm trying to understand how and why people have been using the immutability (WORM) functions that became available in scale 4.2. I don't need a company name, just an idea of why and how you use it. 
TIA Connie Rice Storage Specialist Washington Systems Center Mobile: 202-821-6747 E-mail: constance.rice at us.ibm.com [cid:_2_73D2E57873D2E1D80056641B852583C4] -------------- next part -------------- A non-text attachment was scrubbed... Name: ATT00001.jpg Type: image/jpeg Size: 93815 bytes Desc: ATT00001.jpg URL: From daniel.kidger at uk.ibm.com Thu Mar 21 17:15:06 2019 From: daniel.kidger at uk.ibm.com (Daniel Kidger) Date: Thu, 21 Mar 2019 17:15:06 +0000 Subject: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR In-Reply-To: References: , <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> Message-ID: An HTML attachment was scrubbed... URL: From oehmes at gmail.com Thu Mar 21 17:35:13 2019 From: oehmes at gmail.com (Sven Oehme) Date: Thu, 21 Mar 2019 10:35:13 -0700 Subject: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR In-Reply-To: References: <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> Message-ID: <6216BA50-4921-43C7-8021-97DC6C0553E1@gmail.com> Lots of details in a presentation I did last year before I left IBM ? http://files.gpfsug.org/presentations/2018/Singapore/Sven_Oehme_ESS_in_CORAL_project_update.pdf Sven From: on behalf of Daniel Kidger Reply-To: gpfsug main discussion list Date: Thursday, March 21, 2019 at 10:15 AM To: Cc: Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Alvise, Also note that that DeveloperWorks page was maintained by Scott Faddon. He has since left IBM and that page has unfortunately not been updated for almost 2 years. :-( This page predates the current version 5.x of SpectrumScale which has been available since the beginning of 2018. In version 5.x. the statement that there are 32 sub-blocks in one block is no longer true. Now, by default you get a 4MiB Filesystem blocksize that has 512 sub0clocks, each 8192 bytes long. Daniel _________________________________________________________ Daniel Kidger IBM Technical Sales Specialist Spectrum Scale, Spectrum NAS and IBM Cloud Object Store +44-(0)7818 522 266 daniel.kidger at uk.ibm.com ----- Original message ----- From: david_johnson at brown.edu Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Date: Thu, Mar 21, 2019 1:32 PM The underlying device in this context is the NSD, network storage device. This has relation at all to 512 byte or 4K disk blocks. Usually around a meg, always a power of two. -- ddj Dave Johnson On Mar 21, 2019, at 9:22 AM, Dorigo Alvise (PSI) wrote: Hi, I'm a little bit puzzled about different meanings of blocksize for different GPFS installation (standard and gnr). >From this page https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/File%20System%20Planning I read: The blocksize is the largest size IO that GPFS can issue to the underlying device A subblock is 1/32nd of blocksize. This is the smallest allocation to a single file For non-gnr GPFS device is quite clear to me (I hope): it is a single spinning disk (or ssd). And I verified this on a small cluster composed of nsd using their local hard drive. Can someone explain what is the "device" in the case of GNR ? a single pdisk ? 
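To put Daniel's figures above into arithmetic, a few lines of Python (purely illustrative; the 4 MiB and 8 KiB values are the ones he gives, the rest just follows from them):

BLOCK = 4 * 1024 * 1024      # 4 MiB default filesystem blocksize in Scale 5.x, per Daniel above
SUBBLOCK = 8192              # 8 KiB subblock size he quotes
print(BLOCK // SUBBLOCK)     # 512 subblocks per block -- no longer the fixed 32
print(BLOCK // 32)           # 131072 bytes: what the old 1/32nd rule would have given (128 KiB)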
Thanks, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Fri Mar 22 07:44:13 2019 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Fri, 22 Mar 2019 15:44:13 +0800 Subject: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 download broken In-Reply-To: <9D0D3D23-923D-4C48-B775-BD9818CB2DD6@brown.edu> References: <9D0D3D23-923D-4C48-B775-BD9818CB2DD6@brown.edu> Message-ID: David, Thanks for reporting this download issue. The mail is forwarded Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479 . If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: David Johnson To: gpfsug main discussion list Date: 03/21/2019 11:08 PM Subject: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 download broken Sent by: gpfsug-discuss-bounces at spectrumscale.org I tried twice to download the latest PTF, but the md5sum did not match and the package will not install. Succeeded with the Protocols version. There is no link on the web page to report problems, so I'm posting here hoping someone can get it fixed. -- ddj _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=IbxtjdkPAM2Sbon4Lbbi4w&m=Fe4BhsvYhtd9TWJ6GNNKMfHPfXt3A2XPD3h-62brD7w&s=NyLt02TR8u1dYAv22BEaCX-n2N_gdQ-9MBqVrwbBATc&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Fri Mar 22 09:16:06 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Fri, 22 Mar 2019 09:16:06 +0000 Subject: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR In-Reply-To: <6216BA50-4921-43C7-8021-97DC6C0553E1@gmail.com> References: <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> , <6216BA50-4921-43C7-8021-97DC6C0553E1@gmail.com> Message-ID: <83A6EEB0EC738F459A39439733AE80452684260B@MBX214.d.ethz.ch> Thank you all guys for the answers. Just a quick question: what is the difference between gpfsperf and tsqosperf ? I knew the former, but not the latter (mentioned in your presentation). 
Do they do I/O test in different ways ? thanks, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sven Oehme [oehmes at gmail.com] Sent: Thursday, March 21, 2019 6:35 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Lots of details in a presentation I did last year before I left IBM --> http://files.gpfsug.org/presentations/2018/Singapore/Sven_Oehme_ESS_in_CORAL_project_update.pdf Sven From: on behalf of Daniel Kidger Reply-To: gpfsug main discussion list Date: Thursday, March 21, 2019 at 10:15 AM To: Cc: Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Alvise, Also note that that DeveloperWorks page was maintained by Scott Faddon. He has since left IBM and that page has unfortunately not been updated for almost 2 years. :-( This page predates the current version 5.x of SpectrumScale which has been available since the beginning of 2018. In version 5.x. the statement that there are 32 sub-blocks in one block is no longer true. Now, by default you get a 4MiB Filesystem blocksize that has 512 sub0clocks, each 8192 bytes long. Daniel _________________________________________________________ Daniel Kidger IBM Technical Sales Specialist Spectrum Scale, Spectrum NAS and IBM Cloud Object Store +44-(0)7818 522 266 daniel.kidger at uk.ibm.com ----- Original message ----- From: david_johnson at brown.edu Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Date: Thu, Mar 21, 2019 1:32 PM The underlying device in this context is the NSD, network storage device. This has relation at all to 512 byte or 4K disk blocks. Usually around a meg, always a power of two. -- ddj Dave Johnson On Mar 21, 2019, at 9:22 AM, Dorigo Alvise (PSI) > wrote: Hi, I'm a little bit puzzled about different meanings of blocksize for different GPFS installation (standard and gnr). >From this page https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/File%20System%20Planning I read: * The blocksize is the largest size IO that GPFS can issue to the underlying device * A subblock is 1/32nd of blocksize. This is the smallest allocation to a single file For non-gnr GPFS device is quite clear to me (I hope): it is a single spinning disk (or ssd). And I verified this on a small cluster composed of nsd using their local hard drive. Can someone explain what is the "device" in the case of GNR ? a single pdisk ? Thanks, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nnasef at us.ibm.com Fri Mar 22 13:18:55 2019 From: nnasef at us.ibm.com (Nariman Nasef) Date: Fri, 22 Mar 2019 13:18:55 +0000 Subject: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 downloadbroken In-Reply-To: References: , <9D0D3D23-923D-4C48-B775-BD9818CB2DD6@brown.edu> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15532604919330.png Type: image/png Size: 15543 bytes Desc: not available URL: From oehmes at gmail.com Fri Mar 22 13:44:21 2019 From: oehmes at gmail.com (Sven Oehme) Date: Fri, 22 Mar 2019 06:44:21 -0700 Subject: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR In-Reply-To: <83A6EEB0EC738F459A39439733AE80452684260B@MBX214.d.ethz.ch> References: <83A6EEB0EC738F459A39439733AE80452684217D@MBX214.d.ethz.ch> <6216BA50-4921-43C7-8021-97DC6C0553E1@gmail.com> <83A6EEB0EC738F459A39439733AE80452684260B@MBX214.d.ethz.ch> Message-ID: Hi, They are slightly different versions of the same tool. You should use gpfsperf as this is the one pre-packaged with newer versions of Scale. Sven From: on behalf of "Dorigo Alvise (PSI)" Reply-To: gpfsug main discussion list Date: Friday, March 22, 2019 at 2:18 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Thank you all guys for the answers. Just a quick question: what is the difference between gpfsperf and tsqosperf ? I knew the former, but not the latter (mentioned in your presentation). Do they do I/O test in different ways ? thanks, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sven Oehme [oehmes at gmail.com] Sent: Thursday, March 21, 2019 6:35 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Lots of details in a presentation I did last year before I left IBM ? http://files.gpfsug.org/presentations/2018/Singapore/Sven_Oehme_ESS_in_CORAL_project_update.pdf Sven From: on behalf of Daniel Kidger Reply-To: gpfsug main discussion list Date: Thursday, March 21, 2019 at 10:15 AM To: Cc: Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Alvise, Also note that that DeveloperWorks page was maintained by Scott Faddon. He has since left IBM and that page has unfortunately not been updated for almost 2 years. :-( This page predates the current version 5.x of SpectrumScale which has been available since the beginning of 2018. In version 5.x. the statement that there are 32 sub-blocks in one block is no longer true. Now, by default you get a 4MiB Filesystem blocksize that has 512 sub0clocks, each 8192 bytes long. Daniel _________________________________________________________ Daniel Kidger IBM Technical Sales Specialist Spectrum Scale, Spectrum NAS and IBM Cloud Object Store +44-(0)7818 522 266 daniel.kidger at uk.ibm.com ----- Original message ----- From: david_johnson at brown.edu Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] Clarification about blocksize in stardanrd gpfs and GNR Date: Thu, Mar 21, 2019 1:32 PM The underlying device in this context is the NSD, network storage device. This has relation at all to 512 byte or 4K disk blocks. Usually around a meg, always a power of two. 
-- ddj Dave Johnson On Mar 21, 2019, at 9:22 AM, Dorigo Alvise (PSI) wrote: Hi, I'm a little bit puzzled about different meanings of blocksize for different GPFS installation (standard and gnr). >From this page https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General%20Parallel%20File%20System%20(GPFS)/page/File%20System%20Planning I read: ? The blocksize is the largest size IO that GPFS can issue to the underlying device ? A subblock is 1/32nd of blocksize. This is the smallest allocation to a single file For non-gnr GPFS device is quite clear to me (I hope): it is a single spinning disk (or ssd). And I verified this on a small cluster composed of nsd using their local hard drive. Can someone explain what is the "device" in the case of GNR ? a single pdisk ? Thanks, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_johnson at brown.edu Fri Mar 22 14:12:54 2019 From: david_johnson at brown.edu (david_johnson at brown.edu) Date: Fri, 22 Mar 2019 10:12:54 -0400 Subject: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 downloadbroken In-Reply-To: References: <9D0D3D23-923D-4C48-B775-BD9818CB2DD6@brown.edu> Message-ID: <96F378B5-4239-420F-B212-09E97E97DF47@brown.edu> Thank you, I trust that it is fixed now, will check it when I have a chance. The protocols version allowed me to proceed, just didn?t want others to run into the same issue. -- ddj Dave Johnson > On Mar 22, 2019, at 9:18 AM, Nariman Nasef wrote: > > This issue has been addressed yesterday, as per David's message. > David, you should have also received confirmation email from me. > > We have identified the issue, and reposted the packages. Please let us know if this has not been resolved. > > Thanks > Nariman > > > Nariman Nasef, M.Eng., PMP?, PMI-ACP? > Senior Program Manager > Spectrum Scale, IBM Systems > Phone: 1-626-390-6124 > > "If you do not change direction, you will end up where you are heading". > > > > ----- Original message ----- > From: "IBM Spectrum Scale" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 download broken > Date: Fri, Mar 22, 2019 12:44 AM > > David, > > Thanks for reporting this download issue. 
The mail is forwarded > > Regards, The Spectrum Scale (GPFS) team > > ------------------------------------------------------------------------------------------------------------------ > If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. > > If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. > > The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. > > > > From: David Johnson > To: gpfsug main discussion list > Date: 03/21/2019 11:08 PM > Subject: [gpfsug-discuss] Spectrum Scale Standard 4.2.3-13 download broken > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > I tried twice to download the latest PTF, but the md5sum did not match and the package will not install. > Succeeded with the Protocols version. There is no link on the web page to report problems, so I'm posting > here hoping someone can get it fixed. > > -- ddj > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.childs at qmul.ac.uk Mon Mar 25 09:38:17 2019 From: p.childs at qmul.ac.uk (Peter Childs) Date: Mon, 25 Mar 2019 09:38:17 +0000 Subject: [gpfsug-discuss] mmlsquota output Message-ID: <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> Can someone tell me I'm not reading this wrong. This is using Spectrum Scale 5.0.2-1 It looks like the output from mmlsquota is not what it says In the man page it says, mmlsquota [-u User | -g Group] [-v | -q] [-e] [-C ClusterName] [-Y] [--block-size {BlockSize | auto}] [Device[:Fileset] ...] however mmlsquota -u username fs:fileset Return the output for every fileset, not just the "fileset" I've asked for, this is same output as mmlsquota -u username fs Where I've not said the fileset. I can work around this, but I'm just checking this is not actually a bug, that ought to be fixed. Long story is that I'm working on rewriting our quota report util that used be a long bash/awk script into a more easy to understand python script, and I want to get the user quota info for just one fileset. Thanks in advance. -- Peter Childs ITS Research Storage Queen Mary, University of London From robert.horton at icr.ac.uk Mon Mar 25 09:52:18 2019 From: robert.horton at icr.ac.uk (Robert Horton) Date: Mon, 25 Mar 2019 09:52:18 +0000 Subject: [gpfsug-discuss] mmlsquota output In-Reply-To: <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> References: <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> Message-ID: I don't know the answer to your actual question, but have you thought about using the REST-API rather than parsing the command outputs? 
I can send over the Python stuff we're using if you mail me off list. Rob On Mon, 2019-03-25 at 09:38 +0000, Peter Childs wrote: > Can someone tell me I'm not reading this wrong. > > This is using Spectrum Scale 5.0.2-1 > > It looks like the output from mmlsquota is not what it says > > In the man page it says, > > mmlsquota [-u User | -g Group] [-v | -q] [-e] [-C ClusterName] > [-Y] [--block-size {BlockSize | auto}] [Device[:Fileset] > ...] > > however > > mmlsquota -u username fs:fileset > > Return the output for every fileset, not just the "fileset" I've > asked > for, this is same output as > > mmlsquota -u username fs > > Where I've not said the fileset. > > I can work around this, but I'm just checking this is not actually a > bug, that ought to be fixed. > > Long story is that I'm working on rewriting our quota report util > that > used be a long bash/awk script into a more easy to understand python > script, and I want to get the user quota info for just one fileset. > > Thanks in advance. > > -- Robert Horton | Research Data Storage Lead The Institute of Cancer Research | 237 Fulham Road | London | SW3 6JB T +44 (0)20 7153 5350 | E robert.horton at icr.ac.uk | W www.icr.ac.uk | Twitter @ICR_London Facebook: www.facebook.com/theinstituteofcancerresearch The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. From stockf at us.ibm.com Mon Mar 25 10:34:04 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Mon, 25 Mar 2019 10:34:04 +0000 Subject: [gpfsug-discuss] mmlsquota output In-Reply-To: References: , <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Mon Mar 25 11:40:56 2019 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 25 Mar 2019 11:40:56 +0000 Subject: [gpfsug-discuss] Reminder - Registration now open - US User Group Meeting, April 16-17th, NCAR Boulder In-Reply-To: References: Message-ID: <4AD46697-CA3E-4D7D-B73F-D33DCA66686D@nuance.com> 3 weeks away! Registration is now open: https://www.eventbrite.com/e/spectrum-scale-gpfs-user-group-us-spring-2019-meeting-tickets-57035376346 This is a FREE event and gives a great opportunity to interact with your peers and get direct access to the IBM team. - April 15th: Informal social gathering on Monday for those arriving early (location TBD) - April 16th: Full day of talks from IBM and the user community, Social and Networking Event (and breakfast, lunch) - April 17th: Talks and breakout sessions Looking forward to seeing everyone in Boulder! Bob Oesterlin/Kristy Kallback-Rose -------------- next part -------------- An HTML attachment was scrubbed... URL: From marc.caubet at psi.ch Tue Mar 26 15:39:30 2019 From: marc.caubet at psi.ch (Caubet Serrabou Marc (PSI)) Date: Tue, 26 Mar 2019 15:39:30 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Message-ID: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> Hi all, according to several GPFS presentations as well as according to the man pages: Table 1. 
Block sizes and subblock sizes +-------------------------------+-------------------------------+ | Block size | Subblock size | +-------------------------------+-------------------------------+ | 64 KiB | 2 KiB | +-------------------------------+-------------------------------+ | 128 KiB | 4 KiB | +-------------------------------+-------------------------------+ | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | | MiB, 4 MiB | | +-------------------------------+-------------------------------+ | 8 MiB, 16 MiB | 16 KiB | +-------------------------------+-------------------------------+ A block size of 8MiB or 16MiB should contain subblocks of 16KiB. However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: [root at merlindssio01 ~]# mmlsfs merlin flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 131072 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes . . . -n 128 Estimated number of nodes that will mount file system -B 1048576 Block size (system pool) 16777216 Block size (other pools) . . . What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? Thanks a lot and best regards, Marc _________________________________________ Paul Scherrer Institut High Performance Computing Marc Caubet Serrabou Building/Room: WHGA/019A Forschungsstrasse, 111 5232 Villigen PSI Switzerland Telephone: +41 56 310 46 67 E-Mail: marc.caubet at psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From oehmes at gmail.com Tue Mar 26 16:08:47 2019 From: oehmes at gmail.com (Sven Oehme) Date: Tue, 26 Mar 2019 09:08:47 -0700 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Message-ID: I know this will be very confusing, but the code works different than one would think (not sure this is documented anywhere). The number of subblocks across pools of a fileystsem is calculated based on the smallest pools blocksize. So given you have a 1MB blocksize in the system pool you will end up with 128 subblocks, now you have a 2nd pool (data) which will inherit the 128 subblocks and the code calculates a subblock size of 128k (16M/128=128k). Sven From: on behalf of "Caubet Serrabou Marc (PSI)" Reply-To: gpfsug main discussion list Date: Tuesday, March 26, 2019 at 8:46 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Hi all, according to several GPFS presentations as well as according to the man pages: Table 1. Block sizes and subblock sizes +-------------------------------+-------------------------------+ | Block size | Subblock size | +-------------------------------+-------------------------------+ | 64 KiB | 2 KiB | +-------------------------------+-------------------------------+ | 128 KiB | 4 KiB | +-------------------------------+-------------------------------+ | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | | MiB, 4 MiB | | +-------------------------------+-------------------------------+ | 8 MiB, 16 MiB | 16 KiB | +-------------------------------+-------------------------------+ A block size of 8MiB or 16MiB should contain subblocks of 16KiB. 
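Sven's rule above can be checked against Marc's mmlsfs output with a few lines of Python. The per-blocksize subblock table is the one quoted from the man page in this thread, and the rest is just the division Sven describes; this is a sketch of the arithmetic, not the actual Scale code.

# Subblock size (KiB) a pool would get on its own, from the man-page table quoted in this thread
SUBBLOCK_KIB = {64: 2, 128: 4, 256: 8, 512: 8, 1024: 8, 2048: 8, 4096: 8, 8192: 16, 16384: 16}

def fragment_sizes_bytes(pool_blocksizes_kib):
    # The pool with the smallest blocksize fixes the subblock count for the whole filesystem
    smallest = min(pool_blocksizes_kib)
    n_subblocks = smallest // SUBBLOCK_KIB[smallest]          # 1024 / 8 -> 128 subblocks
    return {bs: (bs // n_subblocks) * 1024 for bs in pool_blocksizes_kib}

# Marc's filesystem: 1 MiB system pool, 16 MiB data pool
print(fragment_sizes_bytes([1024, 16384]))   # {1024: 8192, 16384: 131072} -- matches the mmlsfs -f values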
However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: [root at merlindssio01 ~]# mmlsfs merlin flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 131072 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes . . . -n 128 Estimated number of nodes that will mount file system -B 1048576 Block size (system pool) 16777216 Block size (other pools) . . . What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? Thanks a lot and best regards, Marc _________________________________________ Paul Scherrer Institut High Performance Computing Marc Caubet Serrabou Building/Room: WHGA/019A Forschungsstrasse, 111 5232 Villigen PSI Switzerland Telephone: +41 56 310 46 67 E-Mail: marc.caubet at psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhennecke at lenovo.com Tue Mar 26 15:56:37 2019 From: mhennecke at lenovo.com (Michael Hennecke) Date: Tue, 26 Mar 2019 15:56:37 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Message-ID: <3AACBCBC085C694CAF2FBC38C3B9DC48DDDA4661@usmailmbx04> Hi, you have two storage pools. The calculation of the number of subblocks is performed on the system pool with 1MiB blocksize --> subblock size of 8kiB --> 128 subblocks. The other pool inherits the "128 subblocks", as all pools in a filesystem will have the same number of subblocks. Mit freundlichen Gr?ssen / Best regards, Michael Hennecke HPC Chief Technologist - HPC and AI Business Unit -- Lenovo Global Technology (Germany) GmbH * Am Zehnthof 77 * D-45307 Essen * Germany Gesch?ftsf?hrung: Colm Gleeson, Christophe Laurent * Sitz der Gesellschaft: Stuttgart * HRB-Nr.: 758298, AG Stuttgart From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of Caubet Serrabou Marc (PSI) Sent: Tuesday, 26 March, 2019 16:40 To: gpfsug main discussion list Subject: [External] [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Hi all, according to several GPFS presentations as well as according to the man pages: Table 1. Block sizes and subblock sizes +-------------------------------+-------------------------------+ | Block size | Subblock size | +-------------------------------+-------------------------------+ | 64 KiB | 2 KiB | +-------------------------------+-------------------------------+ | 128 KiB | 4 KiB | +-------------------------------+-------------------------------+ | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | | MiB, 4 MiB | | +-------------------------------+-------------------------------+ | 8 MiB, 16 MiB | 16 KiB | +-------------------------------+-------------------------------+ A block size of 8MiB or 16MiB should contain subblocks of 16KiB. 
However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: [root at merlindssio01 ~]# mmlsfs merlin flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 131072 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes . . . -n 128 Estimated number of nodes that will mount file system -B 1048576 Block size (system pool) 16777216 Block size (other pools) . . . What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? Thanks a lot and best regards, Marc _________________________________________ Paul Scherrer Institut High Performance Computing Marc Caubet Serrabou Building/Room: WHGA/019A Forschungsstrasse, 111 5232 Villigen PSI Switzerland Telephone: +41 56 310 46 67 E-Mail: marc.caubet at psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From alvise.dorigo at psi.ch Tue Mar 26 16:27:12 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Tue, 26 Mar 2019 16:27:12 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> References: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> Message-ID: <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> Hi Marc, "Indirect block size" is well explained in this presentation: http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf pages 37-41 Cheers, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Caubet Serrabou Marc (PSI) [marc.caubet at psi.ch] Sent: Tuesday, March 26, 2019 4:39 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Hi all, according to several GPFS presentations as well as according to the man pages: Table 1. Block sizes and subblock sizes +-------------------------------+-------------------------------+ | Block size | Subblock size | +-------------------------------+-------------------------------+ | 64 KiB | 2 KiB | +-------------------------------+-------------------------------+ | 128 KiB | 4 KiB | +-------------------------------+-------------------------------+ | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | | MiB, 4 MiB | | +-------------------------------+-------------------------------+ | 8 MiB, 16 MiB | 16 KiB | +-------------------------------+-------------------------------+ A block size of 8MiB or 16MiB should contain subblocks of 16KiB. However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: [root at merlindssio01 ~]# mmlsfs merlin flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 131072 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes . . . -n 128 Estimated number of nodes that will mount file system -B 1048576 Block size (system pool) 16777216 Block size (other pools) . . . What am I missing? 
According to documentation, I expect this to be a fixed value, or it isn't at all? On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? Thanks a lot and best regards, Marc _________________________________________ Paul Scherrer Institut High Performance Computing Marc Caubet Serrabou Building/Room: WHGA/019A Forschungsstrasse, 111 5232 Villigen PSI Switzerland Telephone: +41 56 310 46 67 E-Mail: marc.caubet at psi.ch -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.childs at qmul.ac.uk Wed Mar 27 09:09:20 2019 From: p.childs at qmul.ac.uk (Peter Childs) Date: Wed, 27 Mar 2019 09:09:20 +0000 Subject: [gpfsug-discuss] mmlsquota output In-Reply-To: References: <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> Message-ID: <3c78ad05d319cdb56839a3e12407d645febbe255.camel@qmul.ac.uk> On Mon, 2019-03-25 at 09:52 +0000, Robert Horton wrote: > I don't know the answer to your actual question, but have you thought > about using the REST-API rather than parsing the command outputs? I > can > send over the Python stuff we're using if you mail me off list. Thanks, We don't currently run the REST-API, partly I've never got around to getting the monitoring overhead working, and working out which extra packages we need to go round our 300 nodes and install. Out cluster has been gradually upgraded over the years from 3.5 and we don't routinely install all the new packages the GUI needs on every node. It might be nice to see a list of which Spectrum Scale packages are needed for the different added value features in Scale. I'm currently working on re-writing the cli quota reporting program which was originally written in a combination of bash and awk. Its a strict Linux Cli util for reporting quota's and hence I'd prefer to avoid the overhead of using a Rest-API. With reference to the issue people reported not being able to run "mmlsfileset" as a user a few weeks ago, I've found a handy work-around using "mmlsattr" instead, and yes it does use the -Y flag all the time. I'd like to share the code, once its gone though some internal code review...... With reference to the other post, I will I think raise a PMR for this as it does not look like mmlsquota is working as documented. Thanks Peter Childs > > Rob > > On Mon, 2019-03-25 at 09:38 +0000, Peter Childs wrote: > > Can someone tell me I'm not reading this wrong. > > > > This is using Spectrum Scale 5.0.2-1 > > > > It looks like the output from mmlsquota is not what it says > > > > In the man page it says, > > > > mmlsquota [-u User | -g Group] [-v | -q] [-e] [-C ClusterName] > > [-Y] [--block-size {BlockSize | auto}] [Device[:Fileset] > > ...] > > > > however > > > > mmlsquota -u username fs:fileset > > > > Return the output for every fileset, not just the "fileset" I've > > asked > > for, this is same output as > > > > mmlsquota -u username fs > > > > Where I've not said the fileset. > > > > I can work around this, but I'm just checking this is not actually > > a > > bug, that ought to be fixed. > > > > Long story is that I'm working on rewriting our quota report util > > that > > used be a long bash/awk script into a more easy to understand > > python > > script, and I want to get the user quota info for just one > > fileset. > > > > Thanks in advance. 
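For what it's worth, a rough sketch of the parsing approach Peter describes above, filtering on the fileset client-side since the Device:Fileset argument behaves as he reports. It keys everything off the -Y HEADER record rather than fixed column positions; 'filesetname' is my guess at the relevant field name, so check it against the HEADER line your release actually prints.

import subprocess

def user_quota(user, device, fileset=None):
    """Parse 'mmlsquota -u <user> -Y <device>' and optionally keep one fileset.
    Field names come from the -Y HEADER record at run time; 'filesetname' is
    an assumed column name -- verify it against the real HEADER output."""
    out = subprocess.run(["/usr/lpp/mmfs/bin/mmlsquota", "-u", user, "-Y", device],
                         check=True, capture_output=True, text=True).stdout
    header, rows = None, []
    for line in out.splitlines():
        fields = line.strip().split(":")
        if "HEADER" in fields[:3]:
            header = fields                      # remember the column names
        elif header and len(fields) > 3:
            row = dict(zip(header, fields))      # map names to values for this record
            if fileset is None or row.get("filesetname") == fileset:
                rows.append(row)
    return rows

# e.g. user_quota("username", "fs", fileset="fileset")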
> > > > -- Peter Childs ITS Research Storage Queen Mary, University of London From A.Wolf-Reber at de.ibm.com Wed Mar 27 11:51:50 2019 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Wed, 27 Mar 2019 11:51:50 +0000 Subject: [gpfsug-discuss] mmlsquota output In-Reply-To: <3c78ad05d319cdb56839a3e12407d645febbe255.camel@qmul.ac.uk> References: <3c78ad05d319cdb56839a3e12407d645febbe255.camel@qmul.ac.uk>, <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15536731432843.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15536731432844.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15536731432845.png Type: image/png Size: 1134 bytes Desc: not available URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Mar 27 14:32:46 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 27 Mar 2019 14:32:46 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> References: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> Message-ID: <6604491E-7A94-4EEA-937C-0AA719324F78@vanderbilt.edu> Hi All, So I was looking at the presentation referenced below and it states - on multiple slides - that there is one system storage pool per cluster. Really? Shouldn?t that be one system storage pool per filesystem?!? If not, please explain how in my GPFS cluster with two (local) filesystems I see two different system pools with two different sets of NSDs, two different capacities, and two different percentages full??? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 26, 2019, at 11:27 AM, Dorigo Alvise (PSI) > wrote: Hi Marc, "Indirect block size" is well explained in this presentation: http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf pages 37-41 Cheers, Alvise ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Caubet Serrabou Marc (PSI) [marc.caubet at psi.ch] Sent: Tuesday, March 26, 2019 4:39 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Hi all, according to several GPFS presentations as well as according to the man pages: Table 1. Block sizes and subblock sizes +-------------------------------+-------------------------------+ | Block size | Subblock size | +-------------------------------+-------------------------------+ | 64 KiB | 2 KiB | +-------------------------------+-------------------------------+ | 128 KiB | 4 KiB | +-------------------------------+-------------------------------+ | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | | MiB, 4 MiB | | +-------------------------------+-------------------------------+ | 8 MiB, 16 MiB | 16 KiB | +-------------------------------+-------------------------------+ A block size of 8MiB or 16MiB should contain subblocks of 16KiB. 
However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: [root at merlindssio01 ~]# mmlsfs merlin flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 131072 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes . . . -n 128 Estimated number of nodes that will mount file system -B 1048576 Block size (system pool) 16777216 Block size (other pools) . . . What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? Thanks a lot and best regards, Marc _________________________________________ Paul Scherrer Institut High Performance Computing Marc Caubet Serrabou Building/Room: WHGA/019A Forschungsstrasse, 111 5232 Villigen PSI Switzerland Telephone: +41 56 310 46 67 E-Mail: marc.caubet at psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C5b28a9a0d39a47fd3f0608d6b208186a%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892145179836634&sdata=23F22sUiyCYEg0H3AdbkBAnhPpLVBVTh39zRr%2FLYCmc%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stockf at us.ibm.com Wed Mar 27 14:51:07 2019 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 27 Mar 2019 14:51:07 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: <6604491E-7A94-4EEA-937C-0AA719324F78@vanderbilt.edu> References: <6604491E-7A94-4EEA-937C-0AA719324F78@vanderbilt.edu>, <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch><83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> Message-ID: An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Wed Mar 27 14:57:25 2019 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 27 Mar 2019 10:57:25 -0400 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> References: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> Message-ID: <6B2ACF47-294E-4EA7-BEF0-6FEC394CF57C@ulmer.org> This presentation contains lots of good information about file system structure in general, and GPFS in specific, and I appreciate that and enjoyed reading it. However, it states outright (both graphically and in text) that storage pools are a feature of the cluster, not of a file system ? which I believe to be completely incorrect. For example, it states that there is "only one system pool per cluster", rather than one per file system. Given that this was written by IBMers and presented at an actual users? group, can someone please weigh in on this? I?m asking because it represents a fundamental misunderstanding of a very basic GPFS concept, which makes me wonder how authoritative the rest of it is... 
-- Stephen > On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) > wrote: > > Hi Marc, > "Indirect block size" is well explained in this presentation: > > http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf > > pages 37-41 > > Cheers, > > Alvise > > From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org ] on behalf of Caubet Serrabou Marc (PSI) [marc.caubet at psi.ch ] > Sent: Tuesday, March 26, 2019 4:39 PM > To: gpfsug main discussion list > Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks > > Hi all, > > according to several GPFS presentations as well as according to the man pages: > > Table 1. Block sizes and subblock sizes > > +-------------------------------+-------------------------------+ > | Block size | Subblock size | > +-------------------------------+-------------------------------+ > | 64 KiB | 2 KiB | > +-------------------------------+-------------------------------+ > | 128 KiB | 4 KiB | > +-------------------------------+-------------------------------+ > | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | > | MiB, 4 MiB | | > +-------------------------------+-------------------------------+ > | 8 MiB, 16 MiB | 16 KiB | > +-------------------------------+-------------------------------+ > > A block size of 8MiB or 16MiB should contain subblocks of 16KiB. > > However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: > > [root at merlindssio01 ~]# mmlsfs merlin > flag value description > ------------------- ------------------------ ----------------------------------- > -f 8192 Minimum fragment (subblock) size in bytes (system pool) > 131072 Minimum fragment (subblock) size in bytes (other pools) > -i 4096 Inode size in bytes > -I 32768 Indirect block size in bytes > . > . > . > -n 128 Estimated number of nodes that will mount file system > -B 1048576 Block size (system pool) > 16777216 Block size (other pools) > . > . > . > > What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? > > On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? > > Thanks a lot and best regards, > Marc > _________________________________________ > Paul Scherrer Institut > High Performance Computing > Marc Caubet Serrabou > Building/Room: WHGA/019A > Forschungsstrasse, 111 > 5232 Villigen PSI > Switzerland > > Telephone: +41 56 310 46 67 > E-Mail: marc.caubet at psi.ch _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric.wonderley at vt.edu Wed Mar 27 15:20:58 2019 From: eric.wonderley at vt.edu (J. Eric Wonderley) Date: Wed, 27 Mar 2019 11:20:58 -0400 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: <6B2ACF47-294E-4EA7-BEF0-6FEC394CF57C@ulmer.org> References: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> <6B2ACF47-294E-4EA7-BEF0-6FEC394CF57C@ulmer.org> Message-ID: mmlspool might suggest there's only 1 system pool per cluster. We have 2 clusters and it has id=0 on both. 
One of our clusters has 2 filesystems that have same id for two different dataonly pools: [root at cl001 ~]# mmlspool home all Name Id system 0 fc_8T 65537 fc_ssd400G 65538 [root at cl001 ~]# mmlspool work all Name Id system 0 sas_6T 65537 I know md lives in the system pool and if you do encryption you can forget about putting data into you inodes for small files On Wed, Mar 27, 2019 at 10:57 AM Stephen Ulmer wrote: > This presentation contains lots of good information about file system > structure in general, and GPFS in specific, and I appreciate that and > enjoyed reading it. > > However, it states outright (both graphically and in text) that storage > pools are a feature of the cluster, not of a file system ? which I believe > to be completely incorrect. For example, it states that there is "only one > system pool per cluster", rather than one per file system. > > Given that this was written by IBMers and presented at an actual users? > group, can someone please weigh in on this? I?m asking because it > represents a fundamental misunderstanding of a very basic GPFS concept, > which makes me wonder how authoritative the rest of it is... > > -- > Stephen > > > > On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) > wrote: > > Hi Marc, > "Indirect block size" is well explained in this presentation: > > > http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf > > pages 37-41 > > Cheers, > > Alvise > > ------------------------------ > *From:* gpfsug-discuss-bounces at spectrumscale.org [ > gpfsug-discuss-bounces at spectrumscale.org] on behalf of Caubet Serrabou > Marc (PSI) [marc.caubet at psi.ch] > *Sent:* Tuesday, March 26, 2019 4:39 PM > *To:* gpfsug main discussion list > *Subject:* [gpfsug-discuss] GPFS v5: Blocksizes and subblocks > > Hi all, > > according to several GPFS presentations as well as according to the man > pages: > > Table 1. Block sizes and subblock sizes > > +-------------------------------+-------------------------------+ > | Block size | Subblock size | > +-------------------------------+-------------------------------+ > | 64 KiB | 2 KiB | > +-------------------------------+-------------------------------+ > | 128 KiB | 4 KiB | > +-------------------------------+-------------------------------+ > | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | > | MiB, 4 MiB | | > +-------------------------------+-------------------------------+ > | 8 MiB, 16 MiB | 16 KiB | > +-------------------------------+-------------------------------+ > > A block size of 8MiB or 16MiB should contain subblocks of 16KiB. > > However, when creating a new filesystem with 16MiB blocksize, looks like > is using 128KiB subblocks: > > [root at merlindssio01 ~]# mmlsfs merlin > flag value description > ------------------- ------------------------ > ----------------------------------- > -f 8192 Minimum fragment (subblock) > size in bytes (system pool) > 131072 Minimum fragment (subblock) > size in bytes (other pools) > -i 4096 Inode size in bytes > -I 32768 Indirect block size in bytes > . > . > . > -n 128 Estimated number of nodes > that will mount file system > -B 1048576 Block size (system pool) > 16777216 Block size (other pools) > . > . > . > > What am I missing? According to documentation, I expect this to be a fixed > value, or it isn't at all? > > On the other hand, I don't really understand the concept 'Indirect block > size in bytes', can somebody clarify or provide some details about this > setting? 
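For reference, the 131072-byte fragment size in the mmlsfs output above does follow a fixed rule as far as I understand it (this is a hedged reading of the 5.0 documentation, not an official statement): the number of subblocks per block is set once per file system, from the pool with the smallest block size, and every other pool inherits that subblock count rather than the per-blocksize value in the table. A quick sanity check with the numbers quoted above:

echo $(( 1048576 / 8192 ))     # system pool: 1 MiB block / 8 KiB subblock = 128 subblocks per block
echo $(( 16777216 / 128 ))     # data pools: 16 MiB block / 128 subblocks  = 131072 bytes (128 KiB)

So the 16 KiB subblock in the table only applies when 8 MiB or 16 MiB is also the smallest block size anywhere in the file system. As for the indirect block size, that is the size of the metadata blocks used to hold lists of data block addresses for files too large to be described by the pointers in the inode itself; the presentation Alvise linked covers it in more detail.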
> > Thanks a lot and best regards, > Marc > _________________________________________ > Paul Scherrer Institut > High Performance Computing > Marc Caubet Serrabou > Building/Room: WHGA/019A > Forschungsstrasse, 111 > 5232 Villigen PSI > Switzerland > > Telephone: +41 56 310 46 67 > E-Mail: marc.caubet at psi.ch > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Wed Mar 27 15:52:43 2019 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 27 Mar 2019 11:52:43 -0400 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: References: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch> <83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch> <6B2ACF47-294E-4EA7-BEF0-6FEC394CF57C@ulmer.org> Message-ID: <52575FD1-82B7-48BF-A0FD-04C783C99B8F@ulmer.org> Hmmm? I was going to ask what structures are actually shared by "two" pools that are in different file systems, and you provided the answer before I asked. So all disks which are labelled with a particular storage pool name share some metadata: pool id, the name, possibly other items. I was confused because the NSD is labelled with the pool when it?s added to the file system ? not when it?s created. So I thought that the pool was a property of a disk+fs, not the NSD itself. The more I talk this out the more I think that pools aren?t real, but just another label that happens to be orthogonal to all of the other labels: Only disks have pools ? NSDs do not, because there is no way to give them one at creation time. Disks are NSDs that are in file systems. A disk is in exactly one file system. All disks that have the same "pool name" will have the same "pool id", and possibly other pool-related metadata. It appears that the disks in a pool have absolutely nothing in common other than that they have been labelled as being in the same pool when added to a file system, right? I mean, literally everything but the pool name/id could be different ? or is there more there? Do we do anything to pools outside of the context of a file system? Even when we list them we have to provide a file system. Does GPFS keep statistics about pools that aren?t related to file systems? (I love learning things, even when I look like an idiot?) -- Stephen > On Mar 27, 2019, at 11:20 AM, J. Eric Wonderley > wrote: > > mmlspool might suggest there's only 1 system pool per cluster. We have 2 clusters and it has id=0 on both. > > One of our clusters has 2 filesystems that have same id for two different dataonly pools: > [root at cl001 ~]# mmlspool home all > Name Id > system 0 > fc_8T 65537 > fc_ssd400G 65538 > [root at cl001 ~]# mmlspool work all > Name Id > system 0 > sas_6T 65537 > > I know md lives in the system pool and if you do encryption you can forget about putting data into you inodes for small files > > > > On Wed, Mar 27, 2019 at 10:57 AM Stephen Ulmer > wrote: > This presentation contains lots of good information about file system structure in general, and GPFS in specific, and I appreciate that and enjoyed reading it. > > However, it states outright (both graphically and in text) that storage pools are a feature of the cluster, not of a file system ? 
which I believe to be completely incorrect. For example, it states that there is "only one system pool per cluster", rather than one per file system. > > Given that this was written by IBMers and presented at an actual users? group, can someone please weigh in on this? I?m asking because it represents a fundamental misunderstanding of a very basic GPFS concept, which makes me wonder how authoritative the rest of it is... > > -- > Stephen > > > >> On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) > wrote: >> >> Hi Marc, >> "Indirect block size" is well explained in this presentation: >> >> http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf >> >> pages 37-41 >> >> Cheers, >> >> Alvise >> >> From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org ] on behalf of Caubet Serrabou Marc (PSI) [marc.caubet at psi.ch ] >> Sent: Tuesday, March 26, 2019 4:39 PM >> To: gpfsug main discussion list >> Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks >> >> Hi all, >> >> according to several GPFS presentations as well as according to the man pages: >> >> Table 1. Block sizes and subblock sizes >> >> +-------------------------------+-------------------------------+ >> | Block size | Subblock size | >> +-------------------------------+-------------------------------+ >> | 64 KiB | 2 KiB | >> +-------------------------------+-------------------------------+ >> | 128 KiB | 4 KiB | >> +-------------------------------+-------------------------------+ >> | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | >> | MiB, 4 MiB | | >> +-------------------------------+-------------------------------+ >> | 8 MiB, 16 MiB | 16 KiB | >> +-------------------------------+-------------------------------+ >> >> A block size of 8MiB or 16MiB should contain subblocks of 16KiB. >> >> However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: >> >> [root at merlindssio01 ~]# mmlsfs merlin >> flag value description >> ------------------- ------------------------ ----------------------------------- >> -f 8192 Minimum fragment (subblock) size in bytes (system pool) >> 131072 Minimum fragment (subblock) size in bytes (other pools) >> -i 4096 Inode size in bytes >> -I 32768 Indirect block size in bytes >> . >> . >> . >> -n 128 Estimated number of nodes that will mount file system >> -B 1048576 Block size (system pool) >> 16777216 Block size (other pools) >> . >> . >> . >> >> What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? >> >> On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? 
>> >> Thanks a lot and best regards, >> Marc >> _________________________________________ >> Paul Scherrer Institut >> High Performance Computing >> Marc Caubet Serrabou >> Building/Room: WHGA/019A >> Forschungsstrasse, 111 >> 5232 Villigen PSI >> Switzerland >> >> Telephone: +41 56 310 46 67 >> E-Mail: marc.caubet at psi.ch _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Mar 27 15:59:17 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 27 Mar 2019 15:59:17 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL Message-ID: Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Wed Mar 27 16:02:48 2019 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Wed, 27 Mar 2019 16:02:48 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: Message-ID: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> Try mmeditacl. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [1553012336789_download] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. 
Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Mar 27 16:19:03 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 27 Mar 2019 16:19:03 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> Message-ID: Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan > wrote: Try mmeditacl. 
-- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [X] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From TOMP at il.ibm.com Wed Mar 27 16:19:40 2019 From: TOMP at il.ibm.com (Tomer Perry) Date: Wed, 27 Mar 2019 16:19:40 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: <52575FD1-82B7-48BF-A0FD-04C783C99B8F@ulmer.org> References: <0081EB235765E14395278B9AE1DF34180A8403EB@MBX214.d.ethz.ch><83A6EEB0EC738F459A39439733AE8045268438A9@MBX214.d.ethz.ch><6B2ACF47-294E-4EA7-BEF0-6FEC394CF57C@ulmer.org> <52575FD1-82B7-48BF-A0FD-04C783C99B8F@ulmer.org> Message-ID: Hi, Not sure how will it work over the mailing list... Since its a popular question, I've prepared a slide explaining all of that - ( pasted/attached below, but I'll try to explain in text as well...). On the right we can see the various "layers": - OS disks ( what looks to the OS/GPFS as a physical disk) - its properties are size, media, device name etc. ( we actually won't know what media means, but we don't really care) - NSD: When introduced to GPFS, so later on we can use it for "something". Two interesting properties at this stage: name and through which servers we can get to it... - FS disk: When NSD is being added to a filesystem, then we start caring about stuff like type ( data, metadata, data+metadata, desconly etc.), to what pool we add the disk, what failure groups etc. That's true on a per filesystem basis. With the exception that nsd name must be unique across the cluster. All the rest is in a filesystem context. So: - Each filesystem will have its own "system pool" which will store that filesystem metadata ( can also store data - which of course belong to that filesystem..not others...not the cluster) - Pool exist just because several filesystem disks were told that they belong to that pool ( and hopefully there is some policy that brings data to that pool). And since filesystem disks, exist only in the context of their filesystem - a pool exist inside a single filesystem only ( other filesystems might have their own pools of course). Regards, Tomer Perry Scalable I/O Development (Spectrum Scale) email: tomp at il.ibm.com 1 Azrieli Center, Tel Aviv 67021, Israel Global Tel: +1 720 3422758 Israel Tel: +972 3 9188625 Mobile: +972 52 2554625 From: Stephen Ulmer To: gpfsug main discussion list Date: 27/03/2019 17:53 Subject: Re: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Sent by: gpfsug-discuss-bounces at spectrumscale.org Hmmm? I was going to ask what structures are actually shared by "two" pools that are in different file systems, and you provided the answer before I asked. So all disks which are labelled with a particular storage pool name share some metadata: pool id, the name, possibly other items. I was confused because the NSD is labelled with the pool when it?s added to the file system ? not when it?s created. So I thought that the pool was a property of a disk+fs, not the NSD itself. The more I talk this out the more I think that pools aren?t real, but just another label that happens to be orthogonal to all of the other labels: Only disks have pools ? NSDs do not, because there is no way to give them one at creation time. Disks are NSDs that are in file systems. A disk is in exactly one file system. All disks that have the same "pool name" will have the same "pool id", and possibly other pool-related metadata. It appears that the disks in a pool have absolutely nothing in common other than that they have been labelled as being in the same pool when added to a file system, right? I mean, literally everything but the pool name/id could be different ? or is there more there? 
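To make the layering Tomer describes concrete (the names below are made up, so treat this as a sketch rather than a recipe), the same stanza file is typically fed to both steps, but the pool assignment only means something once the disk joins a particular file system:

%nsd: nsd=nsd_data_01
  device=/dev/dm-12
  servers=nsdsrv01,nsdsrv02
  usage=dataOnly
  failureGroup=1
  pool=fc_8T

mmcrnsd -F stanza.txt          # creates the NSD: a name plus the servers that can reach it
mmadddisk fs0 -F stanza.txt    # usage/failureGroup/pool only take effect here, in fs0's context

If the same pool name shows up in a stanza added to a different file system, that is a separate pool that merely happens to share the name, which is also why mmlspool always takes a file system device as its first argument.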
Do we do anything to pools outside of the context of a file system? Even when we list them we have to provide a file system. Does GPFS keep statistics about pools that aren?t related to file systems? (I love learning things, even when I look like an idiot?) -- Stephen On Mar 27, 2019, at 11:20 AM, J. Eric Wonderley wrote: mmlspool might suggest there's only 1 system pool per cluster. We have 2 clusters and it has id=0 on both. One of our clusters has 2 filesystems that have same id for two different dataonly pools: [root at cl001 ~]# mmlspool home all Name Id system 0 fc_8T 65537 fc_ssd400G 65538 [root at cl001 ~]# mmlspool work all Name Id system 0 sas_6T 65537 I know md lives in the system pool and if you do encryption you can forget about putting data into you inodes for small files On Wed, Mar 27, 2019 at 10:57 AM Stephen Ulmer wrote: This presentation contains lots of good information about file system structure in general, and GPFS in specific, and I appreciate that and enjoyed reading it. However, it states outright (both graphically and in text) that storage pools are a feature of the cluster, not of a file system ? which I believe to be completely incorrect. For example, it states that there is "only one system pool per cluster", rather than one per file system. Given that this was written by IBMers and presented at an actual users? group, can someone please weigh in on this? I?m asking because it represents a fundamental misunderstanding of a very basic GPFS concept, which makes me wonder how authoritative the rest of it is... -- Stephen On Mar 26, 2019, at 12:27 PM, Dorigo Alvise (PSI) wrote: Hi Marc, "Indirect block size" is well explained in this presentation: http://files.gpfsug.org/presentations/2016/south-bank/D2_P2_A_spectrum_scale_metadata_dark_V2a.pdf pages 37-41 Cheers, Alvise From: gpfsug-discuss-bounces at spectrumscale.org [ gpfsug-discuss-bounces at spectrumscale.org] on behalf of Caubet Serrabou Marc (PSI) [marc.caubet at psi.ch] Sent: Tuesday, March 26, 2019 4:39 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks Hi all, according to several GPFS presentations as well as according to the man pages: Table 1. Block sizes and subblock sizes +-------------------------------+-------------------------------+ | Block size | Subblock size | +-------------------------------+-------------------------------+ | 64 KiB | 2 KiB | +-------------------------------+-------------------------------+ | 128 KiB | 4 KiB | +-------------------------------+-------------------------------+ | 256 KiB, 512 KiB, 1 MiB, 2 | 8 KiB | | MiB, 4 MiB | | +-------------------------------+-------------------------------+ | 8 MiB, 16 MiB | 16 KiB | +-------------------------------+-------------------------------+ A block size of 8MiB or 16MiB should contain subblocks of 16KiB. However, when creating a new filesystem with 16MiB blocksize, looks like is using 128KiB subblocks: [root at merlindssio01 ~]# mmlsfs merlin flag value description ------------------- ------------------------ ----------------------------------- -f 8192 Minimum fragment (subblock) size in bytes (system pool) 131072 Minimum fragment (subblock) size in bytes (other pools) -i 4096 Inode size in bytes -I 32768 Indirect block size in bytes . . . -n 128 Estimated number of nodes that will mount file system -B 1048576 Block size (system pool) 16777216 Block size (other pools) . . . What am I missing? According to documentation, I expect this to be a fixed value, or it isn't at all? 
On the other hand, I don't really understand the concept 'Indirect block size in bytes', can somebody clarify or provide some details about this setting? Thanks a lot and best regards, Marc _________________________________________ Paul Scherrer Institut High Performance Computing Marc Caubet Serrabou Building/Room: WHGA/019A Forschungsstrasse, 111 5232 Villigen PSI Switzerland Telephone: +41 56 310 46 67 E-Mail: marc.caubet at psi.ch _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=mLPyKeOa1gNDrORvEXBgMw&m=bg9EailWWZuz9EdTQO1uOk21naHNDRFX4LSAi9ehmXU&s=fwy_H6JVRBfBQJWU_LfPyKtSsHaKnuRJ9DO-ghnKIaM&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/gif Size: 30588 bytes Desc: not available URL: From olaf.weiser at de.ibm.com Wed Mar 27 16:36:18 2019 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 27 Mar 2019 17:36:18 +0100 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> Message-ID: An HTML attachment was scrubbed... URL: From cblack at nygenome.org Wed Mar 27 16:07:04 2019 From: cblack at nygenome.org (Christopher Black) Date: Wed, 27 Mar 2019 16:07:04 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL Message-ID: <45C03412-61C6-4F19-BAB8-FF0143786044@nygenome.org> I don?t have a solution, just similar experience with mmputacl vs setfacl. IMO, needing to dump and reapply full ACLs rather than just specifying what is to be added is one of a few reasons mmputacl is inferior to setfacl. We do all our extended ACL manipulation with setfacl from a gpfs native client and keep filesystem acl sematics set to -k all rather than -k nfs4. I?d see if you can use setfacl or nfs4_setfacl. This might not work for your use case. Best, Chris From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Wednesday, March 27, 2019 at 11:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Adding to an existing GPFS ACL Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. 
mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ This message is for the recipient?s use only, and may contain confidential, privileged or protected information. Any unauthorized use or dissemination of this communication is prohibited. If you received this message in error, please immediately notify the sender and destroy all copies of this message. The recipient should check this email and any attachments for the presence of viruses, as we accept no liability for any damage caused by any virus transmitted by this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Wed Mar 27 16:33:07 2019 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Wed, 27 Mar 2019 16:33:07 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org>, Message-ID: I misunderstood you. Pretty much what we've been doing is maintaining "ACL template" files based on how our filesystem hierarchy is set up. Basically, fileset foo has a foo.acl file that contains what the ACL is supposed to be. If we need to change the ACL, we modify that file with the new ACL and then pass it through a simple (and expensive, I'm sure) script. This wouldn't be necessary if in heritance flowed down on existing files and directories. If you have CIFS access, you can also use Windows to do this, but it is MUCH slower. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [1553012336789_download] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 11:19:03 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan > wrote: Try mmeditacl. 
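A template-reapply script of that sort is presumably little more than find feeding mmputacl; a hedged sketch with made-up paths, using separate templates for directories and files since the inherit flags normally differ:

#!/bin/bash
# Reapply the fileset's template ACLs to everything under it (sketch only - test on a copy first).
FILESET=/gpfs/fs0/foo
find "$FILESET" -type d -print0 | xargs -0 -n 1 mmputacl -i /root/acl-templates/foo.dir.acl
find "$FILESET" -type f -print0 | xargs -0 -n 1 mmputacl -i /root/acl-templates/foo.file.acl

Expensive is right, though: that is one mmputacl invocation per object, and it only works when everything in the fileset is meant to end up with the same ACL, which Kevin's original note says is not the case for this particular fileset.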
-- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [X] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. 
If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Wed Mar 27 16:52:37 2019 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Wed, 27 Mar 2019 16:52:37 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> Message-ID: <5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> Hi Jonathan, Thanks. We have done a very similar thing when we?re dealing with a situation where: 1) all files and directories in the fileset are starting out with the same existing ACL, and 2) all need the same modification made to them. Unfortunately, in this situation item 2 is true, but item 1 is _not_. That?s what?s making this one a bit thorny? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:33 AM, Fosburgh,Jonathan > wrote: I misunderstood you. Pretty much what we've been doing is maintaining "ACL template" files based on how our filesystem hierarchy is set up. Basically, fileset foo has a foo.acl file that contains what the ACL is supposed to be. If we need to change the ACL, we modify that file with the new ACL and then pass it through a simple (and expensive, I'm sure) script. This wouldn't be necessary if in heritance flowed down on existing files and directories. If you have CIFS access, you can also use Windows to do this, but it is MUCH slower. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [X] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > Sent: Wednesday, March 27, 2019 11:19:03 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan > wrote: Try mmeditacl. 
-- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [X] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. 
If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C06b6070313d74610e17208d6b2d34b57%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636893017903174312&sdata=OX51kSL5fs8CqW9u0y7MK1omYGqkx%2F3K%2Bwvn9iKjFM8%3D&reserved=0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Wed Mar 27 16:53:08 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Wed, 27 Mar 2019 16:53:08 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: <45C03412-61C6-4F19-BAB8-FF0143786044@nygenome.org> References: <45C03412-61C6-4F19-BAB8-FF0143786044@nygenome.org> Message-ID: Unless you have CES and SMB in which case you have to set -k nfs4. Well technically you can set it, create a share and then set it back. But then you can't create more shares. AFAIK SMB actually understands the POSIX ACL and represents it to Windows in some way (just don't try and change it from Windows). Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of cblack at nygenome.org [cblack at nygenome.org] Sent: 27 March 2019 16:07 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL I don?t have a solution, just similar experience with mmputacl vs setfacl. IMO, needing to dump and reapply full ACLs rather than just specifying what is to be added is one of a few reasons mmputacl is inferior to setfacl. We do all our extended ACL manipulation with setfacl from a gpfs native client and keep filesystem acl sematics set to -k all rather than -k nfs4. I?d see if you can use setfacl or nfs4_setfacl. This might not work for your use case. Best, Chris From: on behalf of "Buterbaugh, Kevin L" Reply-To: gpfsug main discussion list Date: Wednesday, March 27, 2019 at 11:59 AM To: gpfsug main discussion list Subject: [gpfsug-discuss] Adding to an existing GPFS ACL Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? 
Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 ________________________________ This message is for the recipient?s use only, and may contain confidential, privileged or protected information. Any unauthorized use or dissemination of this communication is prohibited. If you received this message in error, please immediately notify the sender and destroy all copies of this message. The recipient should check this email and any attachments for the presence of viruses, as we accept no liability for any damage caused by any virus transmitted by this email. From ckerner at illinois.edu Wed Mar 27 16:54:38 2019 From: ckerner at illinois.edu (Kerner, Chad A) Date: Wed, 27 Mar 2019 16:54:38 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL Message-ID: I have a python module that I am nearing the completion of for a project that wraps all of that. It also contains another python script for the easy manipulation of the ACLs from the command line. Once I have that wraped up, hopefully this week, I would be happy to share. Chad -- Chad Kerner ? ckerner at illinois.edu Senior Storage Engineer, Storage Enabling Technologies National Center for Supercomputing Applications University of Illinois, Urbana-Champaign From: on behalf of "Fosburgh,Jonathan" Reply-To: gpfsug main discussion list Date: Wednesday, March 27, 2019 at 11:13 AM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Try mmeditacl. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 Error! Filename not specified. ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? 
Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Wed Mar 27 16:59:18 2019 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Wed, 27 Mar 2019 16:59:18 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: <5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> , <5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> Message-ID: <056864a79b2443499efc8b6ffc769013@mdanderson.org> This is going to be difficult, regardless of the tool used. And it's made worse by inheritance not flowing automatically to existing files and directories. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [1553012336789_download] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 11:52:37 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks. We have done a very similar thing when we?re dealing with a situation where: 1) all files and directories in the fileset are starting out with the same existing ACL, and 2) all need the same modification made to them. Unfortunately, in this situation item 2 is true, but item 1 is _not_. That?s what?s making this one a bit thorny? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:33 AM, Fosburgh,Jonathan > wrote: I misunderstood you. Pretty much what we've been doing is maintaining "ACL template" files based on how our filesystem hierarchy is set up. Basically, fileset foo has a foo.acl file that contains what the ACL is supposed to be. If we need to change the ACL, we modify that file with the new ACL and then pass it through a simple (and expensive, I'm sure) script. This wouldn't be necessary if in heritance flowed down on existing files and directories. 
If you have CIFS access, you can also use Windows to do this, but it is MUCH slower. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [X] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > Sent: Wednesday, March 27, 2019 11:19:03 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan > wrote: Try mmeditacl. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [X] ________________________________ From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. 
If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C06b6070313d74610e17208d6b2d34b57%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636893017903174312&sdata=OX51kSL5fs8CqW9u0y7MK1omYGqkx%2F3K%2Bwvn9iKjFM8%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nfalk at us.ibm.com Wed Mar 27 17:04:23 2019 From: nfalk at us.ibm.com (Nathan Falk) Date: Wed, 27 Mar 2019 12:04:23 -0500 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: <5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> <5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> Message-ID: Hello Kevin, No, you're not missing something. GPFS doesn't provide a means of recursively modifying ACLs. It's not even all that easy to just modify one ACL for one file (it's either mmeditacl, or mmgetacl > /tmp/acl.txt; vi /tmp/acl.txt; mmputacl -i /tmp/acl.txt). 
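For a single file, that flow can be made non-interactive along these lines. This is only a sketch: the group name, the r-x permission bits, and the paths are placeholders, and the appended ACE copies the layout that mmgetacl itself prints.

#!/bin/bash
# Hedged sketch: append one group ACE to one file's existing NFSv4 ACL.
# "examplegrp" and the r-x permissions are placeholders; adjust as needed.
f="$1"
tmp=$(mktemp /tmp/acl.XXXXXX)

# Dump the current ACL in NFSv4 form.
mmgetacl -k nfs4 -o "$tmp" "$f"

# Append the new ACE to the dumped copy.
cat >> "$tmp" <<'EOF'
group:examplegrp:r-x-:allow
 (X)READ/LIST (-)WRITE/CREATE (-)APPEND/MKDIR (X)SYNCHRONIZE (X)READ_ACL (X)READ_ATTR (X)READ_NAMED
 (-)DELETE (-)DELETE_CHILD (-)CHOWN (X)EXEC/SEARCH (-)WRITE_ACL (-)WRITE_ATTR (-)WRITE_NAMED
EOF

# mmputacl expects the complete ACL, so write the merged copy back.
mmputacl -i "$tmp" "$f"
rm -f "$tmp"

Wrapped in a find loop, the same few commands become the recursive version discussed further down the thread.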
I've had a few queries along these lines over the years and decided to publish a little bit of a guide here: https://www-prd-trops.events.ibm.com/node/how-recursively-set-nfsv4-acls-gpfs-filesystem There's a sample script there for the recursive part, but that would still have to be tweaked in your case to append just a single ACE to the existing ACL rather than replace the whole ACL. Or as others have noted, export the fileset via NFS and go to an NFS client and use nfs4_setfacl instead. Thanks, Nate Falk IBM Spectrum Scale Level 2 Support Software Defined Infrastructure, IBM Systems E-mail: nfalk at us.ibm.com Find me on: From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 03/27/2019 12:53 PM Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Jonathan, Thanks. We have done a very similar thing when we?re dealing with a situation where: 1) all files and directories in the fileset are starting out with the same existing ACL, and 2) all need the same modification made to them. Unfortunately, in this situation item 2 is true, but item 1 is _not_. That?s what?s making this one a bit thorny? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:33 AM, Fosburgh,Jonathan wrote: I misunderstood you. Pretty much what we've been doing is maintaining "ACL template" files based on how our filesystem hierarchy is set up. Basically, fileset foo has a foo.acl file that contains what the ACL is supposed to be. If we need to change the ACL, we modify that file with the new ACL and then pass it through a simple (and expensive, I'm sure) script. This wouldn't be necessary if in heritance flowed down on existing files and directories. If you have CIFS access, you can also use Windows to do this, but it is MUCH slower. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 From: gpfsug-discuss-bounces at spectrumscale.org < gpfsug-discuss-bounces at spectrumscale.org> on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 11:19:03 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan wrote: Try mmeditacl. 
-- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 From: gpfsug-discuss-bounces at spectrumscale.org < gpfsug-discuss-bounces at spectrumscale.org> on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. 
If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C06b6070313d74610e17208d6b2d34b57%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636893017903174312&sdata=OX51kSL5fs8CqW9u0y7MK1omYGqkx%2F3K%2Bwvn9iKjFM8%3D&reserved=0 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p3ZFejMgr8nrtvkuBSxsXg&m=tWa7c7_Nu1t7-zUozpFd8c1XSV7N7TShOBelxQS3POM&s=Q_tZmc5wSfixdoNnqTzBUuG9b4iW2vMUOUHy7DZXdRU&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From nfalk at us.ibm.com Wed Mar 27 17:07:29 2019 From: nfalk at us.ibm.com (Nathan Falk) Date: Wed, 27 Mar 2019 12:07:29 -0500 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org><5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> Message-ID: I think I gave an internal link. Try this instead: http://www.ibm.com/support/docview.wss?uid=ibm10716323 Nate Falk IBM Spectrum Scale Level 2 Support Software Defined Infrastructure, IBM Systems E-mail: nfalk at us.ibm.com Find me on: From: "Nathan Falk" To: gpfsug main discussion list Date: 03/27/2019 01:04 PM Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Kevin, No, you're not missing something. GPFS doesn't provide a means of recursively modifying ACLs. It's not even all that easy to just modify one ACL for one file (it's either mmeditacl, or mmgetacl > /tmp/acl.txt; vi /tmp/acl.txt; mmputacl -i /tmp/acl.txt). I've had a few queries along these lines over the years and decided to publish a little bit of a guide here: https://www-prd-trops.events.ibm.com/node/how-recursively-set-nfsv4-acls-gpfs-filesystem There's a sample script there for the recursive part, but that would still have to be tweaked in your case to append just a single ACE to the existing ACL rather than replace the whole ACL. Or as others have noted, export the fileset via NFS and go to an NFS client and use nfs4_setfacl instead. Thanks, Nate Falk IBM Spectrum Scale Level 2 Support Software Defined Infrastructure, IBM Systems E-mail:nfalk at us.ibm.com Find me on: From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 03/27/2019 12:53 PM Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Jonathan, Thanks. We have done a very similar thing when we?re dealing with a situation where: 1) all files and directories in the fileset are starting out with the same existing ACL, and 2) all need the same modification made to them. Unfortunately, in this situation item 2 is true, but item 1 is _not_. That?s what?s making this one a bit thorny? Kevin ? 
Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu- (615)875-9633 On Mar 27, 2019, at 11:33 AM, Fosburgh,Jonathan wrote: I misunderstood you. Pretty much what we've been doing is maintaining "ACL template" files based on how our filesystem hierarchy is set up. Basically, fileset foo has a foo.acl file that contains what the ACL is supposed to be. If we need to change the ACL, we modify that file with the new ACL and then pass it through a simple (and expensive, I'm sure) script. This wouldn't be necessary if in heritance flowed down on existing files and directories. If you have CIFS access, you can also use Windows to do this, but it is MUCH slower. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 From: gpfsug-discuss-bounces at spectrumscale.org< gpfsug-discuss-bounces at spectrumscale.org> on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 11:19:03 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING:This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu- (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan wrote: Try mmeditacl. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 From: gpfsug-discuss-bounces at spectrumscale.org< gpfsug-discuss-bounces at spectrumscale.org> on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING:This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? 
Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu- (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C06b6070313d74610e17208d6b2d34b57%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636893017903174312&sdata=OX51kSL5fs8CqW9u0y7MK1omYGqkx%2F3K%2Bwvn9iKjFM8%3D&reserved=0 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=p3ZFejMgr8nrtvkuBSxsXg&m=3civslLJ9p1g1obgFb08ZEV5pKUtHmsZfA1sB23rrOA&s=jEVB15lqgaHC0sRH4P3BNVs0PlGUHVPDWML3oS_xZBo&e= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chetkulk at in.ibm.com Wed Mar 27 18:24:26 2019 From: chetkulk at in.ibm.com (Chetan R Kulkarni) Date: Wed, 27 Mar 2019 23:54:26 +0530 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org><5328E360-D085-4C98-965B-76B95ADFFB42@vanderbilt.edu> Message-ID: Hi Kevin, Small script herewith (append.acl.sh ) - appends one group ace (append.acl) to all the files/dirs under . You may try it for small directory first to check it's usefulness in your case. (tried along the same lines as discussed by others - mmgetacl, append then mmputacl). $ cat append.acl # add ace as per your setup in this file group:bgroup1:r-x-:allow (X)READ/LIST (-)WRITE/CREATE (-)APPEND/MKDIR (X)SYNCHRONIZE (X)READ_ACL (X)READ_ATTR (X)READ_NAMED (-)DELETE (-)DELETE_CHILD (-)CHOWN (X)EXEC/SEARCH (-)WRITE_ACL (-)WRITE_ATTR (-)WRITE_NAMED $ cat append.acl.sh [[ $# -eq 1 ]] && dir=$1 || { echo "Usage: ./append.acl.sh "; exit 1; } appendAclFile="/tmp/append.acl" newAclFile="/tmp/new.acl" cd $dir for filename in $(find -follow | grep -v ^.$) do echo "Applying ACL to $filename..." mmgetacl -k nfs4 $filename -o $newAclFile cat $appendAclFile >> $newAclFile mmputacl $filename -i $newAclFile done rm -f $newAclFile $ chmod +x append.acl.sh $ ./append.acl.sh Usage: ./append.acl.sh $ time ./append.acl.sh /ibm/fs1/fset2 Applying ACL to ./dir30... Applying ACL to ./dir30/file10... Applying ACL to ./dir30/file9... ... ... $ Thanks, Chetan. From: "Nathan Falk" To: gpfsug main discussion list Date: 03/27/2019 10:37 PM Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Sent by: gpfsug-discuss-bounces at spectrumscale.org I think I gave an internal link. Try this instead: http://www.ibm.com/support/docview.wss?uid=ibm10716323 Nate Falk IBM Spectrum Scale Level 2 Support Software Defined Infrastructure, IBM Systems IBM E-mail:nfalk at us.ibm.com Find me on:LinkedIn: https://www.linkedin.com/in/nathan-falk-078ba5125 Twitter: https://twitter.com/natefalk922 From: "Nathan Falk" To: gpfsug main discussion list Date: 03/27/2019 01:04 PM Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Sent by: gpfsug-discuss-bounces at spectrumscale.org Hello Kevin, No, you're not missing something. GPFS doesn't provide a means of recursively modifying ACLs. It's not even all that easy to just modify one ACL for one file (it's either mmeditacl, or mmgetacl > /tmp/acl.txt; vi /tmp/acl.txt; mmputacl -i /tmp/acl.txt). I've had a few queries along these lines over the years and decided to publish a little bit of a guide here: https://www-prd-trops.events.ibm.com/node/how-recursively-set-nfsv4-acls-gpfs-filesystem There's a sample script there for the recursive part, but that would still have to be tweaked in your case to append just a single ACE to the existing ACL rather than replace the whole ACL. Or as others have noted, export the fileset via NFS and go to an NFS client and use nfs4_setfacl instead. Thanks, Nate Falk IBM Spectrum Scale Level 2 Support Software Defined Infrastructure, IBM Systems IBM E-mail:nfalk at us.ibm.com Find me on:LinkedIn: https://www.linkedin.com/in/nathan-falk-078ba5125 Twitter: https://twitter.com/natefalk922 From: "Buterbaugh, Kevin L" To: gpfsug main discussion list Date: 03/27/2019 12:53 PM Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Jonathan, Thanks. 
We have done a very similar thing when we?re dealing with a situation where: 1) all files and directories in the fileset are starting out with the same existing ACL, and 2) all need the same modification made to them. Unfortunately, in this situation item 2 is true, but item 1 is _not_. That?s what?s making this one a bit thorny? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu- (615)875-9633 On Mar 27, 2019, at 11:33 AM, Fosburgh,Jonathan wrote: I misunderstood you. Pretty much what we've been doing is maintaining "ACL template" files based on how our filesystem hierarchy is set up. Basically, fileset foo has a foo.acl file that contains what the ACL is supposed to be. If we need to change the ACL, we modify that file with the new ACL and then pass it through a simple (and expensive, I'm sure) script. This wouldn't be necessary if in heritance flowed down on existing files and directories. If you have CIFS access, you can also use Windows to do this, but it is MUCH slower. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 From: gpfsug-discuss-bounces at spectrumscale.org< gpfsug-discuss-bounces at spectrumscale.org> on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 11:19:03 AM To: gpfsug main discussion list Subject: [EXT] Re: [gpfsug-discuss] Adding to an existing GPFS ACL WARNING:This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi Jonathan, Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. Am I missing something? Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu- (615)875-9633 On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan wrote: Try mmeditacl. -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 From: gpfsug-discuss-bounces at spectrumscale.org< gpfsug-discuss-bounces at spectrumscale.org> on behalf of Buterbaugh, Kevin L Sent: Wednesday, March 27, 2019 10:59:17 AM To: gpfsug main discussion list Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL WARNING:This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. Hi All, First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. 
setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I?m the one writing the script! Thanks? Kevin ? Kevin Buterbaugh - Senior System Administrator Vanderbilt University - Advanced Computing Center for Research and Education Kevin.Buterbaugh at vanderbilt.edu- (615)875-9633 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. 
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7C06b6070313d74610e17208d6b2d34b57%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636893017903174312&sdata=OX51kSL5fs8CqW9u0y7MK1omYGqkx%2F3K%2Bwvn9iKjFM8%3D&reserved=0 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=uic-29lyJ5TCiTRi0FyznYhKJx5I7Vzu80WyYuZ4_iM&m=ivmdoowntUbUm9ifHIf9wdvGUMfmSn_5krX1obsqqkU&s=3VRVobm0YuPyznasor5EQsdASSWQHckCwSfoY6FBg3I&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 18498113.jpg Type: image/jpeg Size: 518 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 18442256.jpg Type: image/jpeg Size: 638 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 18963353.gif Type: image/gif Size: 1851 bytes Desc: not available URL: From INDULISB at uk.ibm.com Wed Mar 27 18:31:24 2019 From: INDULISB at uk.ibm.com (Indulis Bernsteins1) Date: Wed, 27 Mar 2019 18:31:24 +0000 Subject: [gpfsug-discuss] GPFS v5: Blocksizes and subblocks In-Reply-To: References: Message-ID: I'm the author of the presentation. I'll bow to Tomer's knowledge about how the internals of Spectrum Scale (GPFS) work. I've been working with GPFS since V1.3 so it was a bit of a shock to think I had a fundamental misunderstanding. In this case both viewpoints are actually equivalent because of the way Spectrum Scale works. Both ways of visualising what happens work in exactly the same way from a "user perspective". The 2 actions of allocating an NSD into a filesystem, and also allocating it into a storage pool occur as part of the same single atomic transaction. An NSD is either in both a filesystem and a storage pool, or it is in neither. You can visualise one part of the operation first- "allocate NSD into filesystem"- and then second part of the operation is"allocate into System storage pool within the filesystem" (Stephen's perspective). Or you can visualise the actions happening the other way around "allocate NSD into System storage pool within the cluster", then "allocate into filesystem" (Indulis' perspective). The output of mmdf always made me think of it in this way. Because the 2 transactions on the NSD- allocate to filesystem and allocate to storage pool- are atomic, and there is a 1:1 mapping in each operation, who cares? I can take the viewpoint that the NSD goes into a cluster-wide System pool, or someone else can take the view that there is a System pool per filesystem. 
There is no external way to distinguish which is right or wrong. The "visual and mental models" are different but it makes no nevermind in terms of how things work. Though now that I have had to think about it, it is simpler to visualise each filesystem having its own System pool, and the fact that Tomer says this is how it works internally is a good reason to change the visualisation as well :-D Regards, Indulis Bernsteins Systems Architect IBM New Generation Storage Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From ulmer at ulmer.org Wed Mar 27 18:47:24 2019 From: ulmer at ulmer.org (Stephen Ulmer) Date: Wed, 27 Mar 2019 14:47:24 -0400 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: <131058852bb14b529e7fa2bf6244b837@mdanderson.org> Message-ID: mmeditacl passes a temporary file containing the ACLs to $EDITOR. You can write $EDITOR if you want. :) -- Stephen > On Mar 27, 2019, at 12:19 PM, Buterbaugh, Kevin L > wrote: > > Hi Jonathan, > > Thanks for the response. I did look at mmeditacl, but unless I?m missing something it?s interactive (kind of like mmedquota is by default). If I had only a handful of files / directories to modify that would be fine, but in this case there are thousands of ACL?s that need modifying. > > Am I missing something? Thanks? > > Kevin > > ? > Kevin Buterbaugh - Senior System Administrator > Vanderbilt University - Advanced Computing Center for Research and Education > Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 > >> On Mar 27, 2019, at 11:02 AM, Fosburgh,Jonathan > wrote: >> >> Try mmeditacl. >> >> -- >> Jonathan Fosburgh >> Principal Application Systems Analyst >> IT Operations Storage Team >> The University of Texas MD Anderson Cancer Center >> (713) 745-9346 >> >> From: gpfsug-discuss-bounces at spectrumscale.org > on behalf of Buterbaugh, Kevin L > >> Sent: Wednesday, March 27, 2019 10:59:17 AM >> To: gpfsug main discussion list >> Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL >> >> WARNING: This email originated from outside of MD Anderson. Please validate the sender's email address before clicking on links or attachments as they may not be safe. >> Hi All, >> >> First off, I have very limited experience with GPFS ACL?s, so please forgive me if I?m missing something obvious here. AFAIK, this is the first time we?ve hit something like this? >> >> We have a fileset where all the files / directories have GPFS NFSv4 ACL?s set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACL?s on different files / directories. Now we have the need to add to the existing ACL?s ? another group needs access. Unlike regular Unix / Linux ACL?s where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I?m not seeing where GPFS has a similar command ? i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That?s obviously problematic in this scenario. >> >> So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? 
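Picking up Stephen's $EDITOR suggestion above: mmeditacl hands the temporary ACL file to whatever $EDITOR names, so a tiny non-interactive "editor" can append an ACE fragment instead of opening vi. A rough sketch, with all paths and file names assumed:

#!/bin/bash
# Hedged sketch, saved as e.g. /usr/local/bin/append-ace.sh.
# mmeditacl invokes $EDITOR with the temporary ACL file as the argument;
# /root/acls/extra.ace is an assumed fragment holding the new group ACE.
cat /root/acls/extra.ace >> "$1"

It would then be driven per file with something like EDITOR=/usr/local/bin/append-ace.sh mmeditacl -k nfs4 <file>, looped over a find listing.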
That seems very cumbersome and error prone, especially if I?m the one writing the script! >> >> Thanks? >> >> Kevin >> ? >> Kevin Buterbaugh - Senior System Administrator >> Vanderbilt University - Advanced Computing Center for Research and Education >> Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633 >> >> The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fgpfsug.org%2Fmailman%2Flistinfo%2Fgpfsug-discuss&data=02%7C01%7CKevin.Buterbaugh%40vanderbilt.edu%7Cb2040f23087c4aac0b4908d6b2cf11ed%7Cba5a7f39e3be4ab3b45067fa80faecad%7C0%7C1%7C636892999763011551&sdata=pXhLlRfQuJ4bKfib4bQBlWY4OP5WoZh1YQ%2Bjne2ycEY%3D&reserved=0 > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan.buzzard at strath.ac.uk Wed Mar 27 22:58:15 2019 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Wed, 27 Mar 2019 22:58:15 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: Message-ID: On 27/03/2019 15:59, Buterbaugh, Kevin L wrote: [SNIP] > So am I missing something? Nope you are not missing anything. Setting NFSv4 ACL's on GPFS on *LINUX* has always been a steaming pile of Brontosaurus droppings. I have been on about since 2011... Search the mailing list archives. > ?Is there an easier solution than writing a > script which recurses over the fileset, gets the existing ACL with > mmgetacl and outputs that to a file, edits that file to add in the new > group, and passes that as input to mmputacl? ?That seems very cumbersome > and error prone, especially if I?m the one writing the script! > The best option is to get yourself a pSeries machine, install AIX and GPFS and use the native AIX ACL command to set the ACL's. This works because AIX has a mechanism for passing NFSv4 ACL's through it's VFS interface. The RichACL kernel patches for Linux to give it the same functionality went nowhere. Noting that the XFS and JFS file systems, internally have NFSv4 ACL support. The next best option is to export it as an NSFv4 file system and use a Linux/FreeBSD machine to set the ACL's (a Mac might even work). Expect performance to not be great. The next best option is to do an SMB export, mount it on Linux and use setcifsacl or map it on Windows and use cacls command. Some experimentation on working out exactly how NFSv4 ACLS get mapped to Windows ACLS would be advisable before a mass apply though. I don't think it is possible to set all NFSv4 ACL options using this method. 
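As a rough illustration of the NFS-export route mentioned a little earlier (not the SMB one), assuming the fileset is exported and mounted over NFSv4 on a client with nfs4-acl-tools installed; the mount point, group, domain, and permission letters are all assumptions to adjust:

# Hedged sketch: from an NFSv4 client, add a read/execute group ACE
# to everything under an exported fileset.
find /mnt/fset -exec nfs4_setfacl -a "A:g:examplegrp@yourdomain:rxtncy" {} \;

Directories that should pass the entry on to new content would also want the d and f inheritance flags in the flags field, and as noted above this is unlikely to be quick on a large tree.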
Probably the best option, but which is not publicly available is to use my modified version of the Linux nfs4_setacl command :-) You just modify nfs4_acl_for_path.c and nfs4_set_acl.c so they read/write the GPFS ACL struct and convert between the GPFS representation and the internal data structure used by the nfs4-acl-tools to hold NFSv4 ACL's. However I have not put it any where public because the GPFS API documentation is incomplete when it comes to ACL's. Consequently I can't be sure it is safe so I am not releasing it. I have two questions that I would like answering before I make it public. I will ask them for the third time, in hopes someone at IBM is actually listening. 1. What's the purpose of a special flag to indicate that it is smbd setting the ACL? Does this tie in with the undocumented "mmchfs -k samba" feature? 2. There is a whole bunch of stuff in the documentation about v4.1 ACL's. How does one trigger that. All I seem to be able to do is get POSIX and v4 ACL's. Do you get v4.1 ACL's if you set the file system to "Samba" ACL's or am I missing something. The other option is to write a script. Personally I would use Perl/Python rather than a shell script as it would be easier to read the result of mmgetacl into a buffer, append the extra bits and write it out again with mmputacl. It is horribly slow however if you have millions of files to iterate over. Trust me back in 2011 I had Perl scripts for setting ACL's. The final option though not quick would be for IBM to actually implement a mmsetfacl command. Surely it would not be too hard to take the code from AIX and modify the bits that set ACL's to use the GPFS API. Alternatively take the FreeBSD ACL commands and use them as a starting point. However I would not hold your breath for IBM if you expect them to fix the situation. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From A.Wolf-Reber at de.ibm.com Thu Mar 28 08:24:52 2019 From: A.Wolf-Reber at de.ibm.com (Alexander Wolf) Date: Thu, 28 Mar 2019 08:24:52 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15537580138880.png Type: image/png Size: 1134 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15537580138881.png Type: image/png Size: 6645 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.15537580138882.png Type: image/png Size: 1134 bytes Desc: not available URL: From alvise.dorigo at psi.ch Thu Mar 28 09:01:21 2019 From: alvise.dorigo at psi.ch (Dorigo Alvise (PSI)) Date: Thu, 28 Mar 2019 09:01:21 +0000 Subject: [gpfsug-discuss] Getting which files are store fully in inodes Message-ID: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> Hello, to get the list (and size) of files that fit into inodes what I do, using a policy, is listing "online" (not evicted) files that have zero allocated KB. Is this correct or there could be some exception I'm missing ? Does it exists a smarter/faster way ? thanks, Alvise -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.horton at icr.ac.uk Thu Mar 28 11:21:20 2019 From: robert.horton at icr.ac.uk (Robert Horton) Date: Thu, 28 Mar 2019 11:21:20 +0000 Subject: [gpfsug-discuss] mmlsquota output In-Reply-To: References: <3c78ad05d319cdb56839a3e12407d645febbe255.camel@qmul.ac.uk> , <245fe541e001b27016ea13287cee72e930330977.camel@qmul.ac.uk> Message-ID: On Wed, 2019-03-27 at 11:51 +0000, Alexander Wolf wrote: The requirements for GUI & ReST API aren't actually that dramatic. It boils down to three things: 1) CCR. This is part of the base package but you need to migrate you config from server based to CCR which comes with the added benefit that your cluster config is now truly HA. 2) mmhealth/mmsysmonc. This comes with the base package as well. 3) mmperfmon. This comes with the pm-sensor packages that need to be distributed accross all nodes. And the pm-collector that sits on the GUI node (in large configuration you might want to have more than one collector). So from a package point of view it is basically just the pm-sensor package that needs to be installed all accross your cluster. It's possibly worth point out you don't actually *need* the mmhealth/mmperfmon stuff to make the GUI work if you just want the basic API functionality (for managing filesets/quotas etc) - although obviously the actual GUI won't look as attractive in that case. Rob -- Robert Horton | Research Data Storage Lead The Institute of Cancer Research | 237 Fulham Road | London | SW3 6JB T +44 (0)20 7153 5350 | E robert.horton at icr.ac.uk | W www.icr.ac.uk | Twitter @ICR_London Facebook: www.facebook.com/theinstituteofcancerresearch Making the discoveries that defeat cancer The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jfosburg at mdanderson.org Thu Mar 28 11:55:05 2019 From: jfosburg at mdanderson.org (Fosburgh,Jonathan) Date: Thu, 28 Mar 2019 11:55:05 +0000 Subject: [gpfsug-discuss] [EXT] Re: Adding to an existing GPFS ACL In-Reply-To: References: , Message-ID: Sometimes, you just need the right channels in order to get IBM to implement changes.... -- Jonathan Fosburgh Principal Application Systems Analyst IT Operations Storage Team The University of Texas MD Anderson Cancer Center (713) 745-9346 [1553012336789_download] The final option though not quick would be for IBM to actually implement a mmsetfacl command. Surely it would not be too hard to take the code from AIX and modify the bits that set ACL's to use the GPFS API. Alternatively take the FreeBSD ACL commands and use them as a starting point. However I would not hold your breath for IBM if you expect them to fix the situation. The information contained in this e-mail message may be privileged, confidential, and/or protected from disclosure. This e-mail message may contain protected health information (PHI); dissemination of PHI should comply with applicable federal and state laws. 
If you are not the intended recipient, or an authorized representative of the intended recipient, any further review, disclosure, use, dissemination, distribution, or copying of this message or any attachment (or the information contained therein) is strictly prohibited. If you think that you have received this e-mail message in error, please notify the sender by return e-mail and delete all references to it and its contents from your systems. -------------- next part -------------- An HTML attachment was scrubbed... URL: From olaf.weiser at de.ibm.com Thu Mar 28 13:08:58 2019 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 28 Mar 2019 14:08:58 +0100 Subject: [gpfsug-discuss] Getting which files are store fully in inodes In-Reply-To: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> References: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> Message-ID: An HTML attachment was scrubbed... URL: From janfrode at tanso.net Thu Mar 28 14:06:40 2019 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 28 Mar 2019 15:06:40 +0100 Subject: [gpfsug-discuss] Getting which files are store fully in inodes In-Reply-To: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> References: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> Message-ID: I've been looking for a good way of listing this as well. Could you please share your policy ? -jf On Thu, Mar 28, 2019 at 1:52 PM Dorigo Alvise (PSI) wrote: > Hello, > to get the list (and size) of files that fit into inodes what I do, using > a policy, is listing "online" (not evicted) files that have zero allocated > KB. > Is this correct or there could be some exception I'm missing ? > Does it exists a smarter/faster way ? > > thanks, > > Alvise > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Mar 28 15:09:51 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 28 Mar 2019 10:09:51 -0500 Subject: [gpfsug-discuss] Getting which files are store fully in inodes In-Reply-To: References: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> Message-ID: This will select files that have some data, but it must all be in the inode, because no (part of) any data block has been assigned ... WHERE KB_ALLOCATED=0 AND FILE_SIZE>0 From: Jan-Frode Myklebust To: gpfsug main discussion list Date: 03/28/2019 10:08 AM Subject: Re: [gpfsug-discuss] Getting which files are store fully in inodes Sent by: gpfsug-discuss-bounces at spectrumscale.org I've been looking for a good way of listing this as well. Could you please share your policy ? -jf On Thu, Mar 28, 2019 at 1:52 PM Dorigo Alvise (PSI) wrote: Hello, to get the list (and size) of files that fit into inodes what I do, using a policy, is listing "online" (not evicted) files that have zero allocated KB. Is this correct or there could be some exception I'm missing ? Does it exists a smarter/faster way ? 
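For reference, Marc's clause drops into a LIST rule roughly like this; the rule, list, and path names are invented, and the defer/list-file invocation is just one common way to drive it:

# Hedged sketch: list files whose data is held entirely in the inode.
cat > /tmp/inode.pol <<'EOF'
RULE EXTERNAL LIST 'inodefiles' EXEC ''
RULE 'inodeResident' LIST 'inodefiles'
     SHOW(VARCHAR(FILE_SIZE))
     WHERE KB_ALLOCATED = 0 AND FILE_SIZE > 0
EOF

# Scan the file system (or a subtree) and only write the match list.
mmapplypolicy /gpfs/fs1 -P /tmp/inode.pol -I defer -f /tmp/inodefiles

As Marc adds below, an HSM or HSM-like manager that migrates files would call for extra tests on the file's migration state before trusting the zero-allocation check alone.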
thanks, Alvise _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org https://urldefense.proofpoint.com/v2/url?u=http-3A__gpfsug.org_mailman_listinfo_gpfsug-2Ddiscuss&d=DwICAg&c=jf_iaSHvJObTbx-siA1ZOg&r=cvpnBBH0j41aQy0RPiG2xRL_M8mTc1izuQD3_PmtjZ8&m=FJOFQY-uW5quZzSVAkmAGgRYQM6vt1fxlIgTdGe3QkE&s=IQr4_65VYJTwHvgili5gUV-d6ieys7IhsLBq5Aofg0U&e= -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Thu Mar 28 15:23:03 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Thu, 28 Mar 2019 10:23:03 -0500 Subject: [gpfsug-discuss] Getting which files are store fully in inodes In-Reply-To: References: <83A6EEB0EC738F459A39439733AE804526844181@MBX214.d.ethz.ch> Message-ID: ... WHERE KB_ALLOCATED=0 AND FILE_SIZE>0 Oh, if you are also working with an HSM or HSM-like manager that can migrate files -- then you might have to add some additional tests... -------------- next part -------------- An HTML attachment was scrubbed... URL: From christof.schmitt at us.ibm.com Thu Mar 28 16:56:31 2019 From: christof.schmitt at us.ibm.com (Christof Schmitt) Date: Thu, 28 Mar 2019 16:56:31 +0000 Subject: [gpfsug-discuss] Adding to an existing GPFS ACL In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From L.R.Sudbery at bham.ac.uk Thu Mar 28 17:43:03 2019 From: L.R.Sudbery at bham.ac.uk (Luke Sudbery) Date: Thu, 28 Mar 2019 17:43:03 +0000 Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR Message-ID: We have a 2 site Lenovo DSS-G based filesystem, which with (some of the data) replicated across the 2 sites. We'd like to a 3rd filesystem descriptor disk so we can lose one site and still have filesystem descriptor quorum. But this seems incompatible unless we make new vdisks - which we can't do on non native raid servers. We've added a new NSD which now has a free disc, but can't add that disk as descOnly. Adding the new descriptor disk to the system pool says: mmadddisk: A storage pool may not contain both vdisk NSDs and non-vdisk NSDs. Adding the new disk to a new system_desc pool says: mmadddisk: Disk usage descOnly is incompatible with storage pool system_desc. Tried adding config to specify the system_desc pool is for metadataOnly and it still says it's incompatible. There is no mention of descOnly disks here: https://www.ibm.com/support/knowledgecenter/en/SSFKCN_4.1.0/com.ibm.cluster.gpfs.v4r1.gpfs200.doc/bl1adv_planning.htm Is it possible add a non-vdisk descOnly NSD to a dss-g/GNR solution? Cheers, Luke -- Luke Sudbery Architecture, Infrastructure and Systems Advanced Research Computing, IT Services Room 103, Computer Centre G5, Elms Road Please note I don't work on Monday and work from home on Friday. From janfrode at tanso.net Thu Mar 28 18:33:17 2019 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Thu, 28 Mar 2019 19:33:17 +0100 Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR In-Reply-To: References: Message-ID: There seems to be some changes or bug here.. But try usage=dataOnly pool=neverused failureGroup=xx.. and it should have the same function as long as you never place anything in this pool. -jf tor. 28. mar. 2019 kl. 18:43 skrev Luke Sudbery : > We have a 2 site Lenovo DSS-G based filesystem, which with (some of the > data) replicated across the 2 sites. 
We'd like to a 3rd filesystem > descriptor disk so we can lose one site and still have filesystem > descriptor quorum. But this seems incompatible unless we make new vdisks - > which we can't do on non native raid servers. > > We've added a new NSD which now has a free disc, but can't add that disk > as descOnly. > > Adding the new descriptor disk to the system pool says: > mmadddisk: A storage pool may not contain both vdisk NSDs and non-vdisk > NSDs. > > Adding the new disk to a new system_desc pool says: > mmadddisk: Disk usage descOnly is incompatible with storage pool > system_desc. > > Tried adding config to specify the system_desc pool is for metadataOnly > and it still says it's incompatible. > > There is no mention of descOnly disks here: > https://www.ibm.com/support/knowledgecenter/en/SSFKCN_4.1.0/com.ibm.cluster.gpfs.v4r1.gpfs200.doc/bl1adv_planning.htm > > Is it possible add a non-vdisk descOnly NSD to a dss-g/GNR solution? > > Cheers, > > Luke > > -- > Luke Sudbery > Architecture, Infrastructure and Systems > Advanced Research Computing, IT Services > Room 103, Computer Centre G5, Elms Road > > Please note I don't work on Monday and work from home on Friday. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From scale at us.ibm.com Thu Mar 28 20:08:40 2019 From: scale at us.ibm.com (IBM Spectrum Scale) Date: Thu, 28 Mar 2019 15:08:40 -0500 Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR In-Reply-To: References: Message-ID: This is a known issue. The workaround is to use --force-nsd-mismatch option. Just make sure that the failure group is different from those used by the vdisk NSDs Regards, The Spectrum Scale (GPFS) team ------------------------------------------------------------------------------------------------------------------ If you feel that your question can benefit other users of Spectrum Scale (GPFS), then please post it to the public IBM developerWroks Forum at https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479. If your query concerns a potential software error in Spectrum Scale (GPFS) and you have an IBM software maintenance contract please contact 1-800-237-5511 in the United States or your local IBM Service Center in other countries. The forum is informally monitored as time permits and should not be used for priority messages to the Spectrum Scale (GPFS) team. From: Luke Sudbery To: "gpfsug-discuss at spectrumscale.org" Date: 03/28/2019 01:45 PM Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR Sent by: gpfsug-discuss-bounces at spectrumscale.org We have a 2 site Lenovo DSS-G based filesystem, which with (some of the data) replicated across the 2 sites. We'd like to a 3rd filesystem descriptor disk so we can lose one site and still have filesystem descriptor quorum. But this seems incompatible unless we make new vdisks - which we can't do on non native raid servers. We've added a new NSD which now has a free disc, but can't add that disk as descOnly. Adding the new descriptor disk to the system pool says: mmadddisk: A storage pool may not contain both vdisk NSDs and non-vdisk NSDs. Adding the new disk to a new system_desc pool says: mmadddisk: Disk usage descOnly is incompatible with storage pool system_desc. 
From L.R.Sudbery at bham.ac.uk Thu Mar 28 23:01:24 2019
From: L.R.Sudbery at bham.ac.uk (Luke Sudbery)
Date: Thu, 28 Mar 2019 23:01:24 +0000
Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR
In-Reply-To: References: Message-ID:

Thanks, that worked.

Will this be addressed in a future update?

Cheers,

Luke

--
Luke Sudbery
Architecture, Infrastructure and Systems
Advanced Research Computing, IT Services
Room 103, Computer Centre G5, Elms Road

Please note I don't work on Monday and work from home on Friday.

From: Truong Vu On Behalf Of scale at us.ibm.com
Sent: 28 March 2019 20:09
To: gpfsug main discussion list ; Luke Sudbery (IT Research Support)
Subject: Re: [gpfsug-discuss] Filesystem descriptor discs for GNR
From L.R.Sudbery at bham.ac.uk Thu Mar 28 23:03:52 2019
From: L.R.Sudbery at bham.ac.uk (Luke Sudbery)
Date: Thu, 28 Mar 2019 23:03:52 +0000
Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR
In-Reply-To: References: Message-ID:

Thanks - I haven't tried this, but the docs for mmadddisk actually say "Only the system storage pool can contain metadataOnly, dataAndMetadata, or descOnly disks."

Truong's --force-nsd-mismatch option worked.

Cheers,

Luke

--
Luke Sudbery
Architecture, Infrastructure and Systems
Advanced Research Computing, IT Services
Room 103, Computer Centre G5, Elms Road

Please note I don't work on Monday and work from home on Friday.

From: gpfsug-discuss-bounces at spectrumscale.org On Behalf Of janfrode at tanso.net
Sent: 28 March 2019 18:33
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Filesystem descriptor discs for GNR
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From ckerner at illinois.edu Fri Mar 29 08:05:15 2019
From: ckerner at illinois.edu (Kerner, Chad A)
Date: Fri, 29 Mar 2019 08:05:15 +0000
Subject: [gpfsug-discuss] Adding to an existing GPFS ACL
In-Reply-To: References: Message-ID:

I got this code cleaned up a little bit and posted the initial version out to https://github.com/ckerner/ssacl.git . There are detailed examples in the README, but I listed a few quick ones below. I will be merging in the default ACL code, recursion, and backup/restoration of ACL branches hopefully over the next few days.

Usage Examples:
- List the ACLs on a file
  > ssacl --list /data/acl/testfile
- Set the ACL to the contents of a specified ACL file
  > ssacl --set -f acl.testfile /data/acl/testfile
- Add a user ACL to a file
  > ssacl --add -u ckerner -a='rwx-' /data/acl/testfile
- Add a group ACL to a file
  > ssacl --add -g nfsnobody -a='r-x-' /data/acl/testfile
- Clear the ACLs on a file, leaving the permissions alone
  > ssacl --clear /data/acl/testfile
- Clear the ACLs on a file and reset the permissions to 760
  > ssacl --clear -U=rwxc --GID=r-x- -O=---- /data/acl/testfile
- Delete a user ACL from a file
  > ssacl --del -u ckerner /data/acl/testfile
- Delete a group ACL from a file
  > ssacl --del -g nfsnobody /data/acl/testfile

Chad

--
Chad Kerner - ckerner at illinois.edu
Senior Storage Engineer, Storage Enabling Technologies
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
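Until the recursion support Chad mentions is merged, one interim approach is to drive ssacl from find. This is only a sketch - the fileset path, group name and access string are placeholders, and it assumes ssacl handles directories the same way it handles files:

    # add a read/execute entry for an extra group to everything under a fileset;
    # try it on a small test subtree first
    find /gpfs/fs0/somefileset -exec ssacl --add -g newgroup -a='r-x-' {} \;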
From: "Kerner, Chad A"
Date: Wednesday, March 27, 2019 at 11:53 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL

I have a python module that I am nearing the completion of for a project that wraps all of that. It also contains another python script for the easy manipulation of the ACLs from the command line. Once I have that wrapped up, hopefully this week, I would be happy to share.

Chad

--
Chad Kerner - ckerner at illinois.edu
Senior Storage Engineer, Storage Enabling Technologies
National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign

From: on behalf of "Fosburgh,Jonathan"
Reply-To: gpfsug main discussion list
Date: Wednesday, March 27, 2019 at 11:13 AM
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Adding to an existing GPFS ACL

Try mmeditacl.

--
Jonathan Fosburgh
Principal Application Systems Analyst
IT Operations Storage Team
The University of Texas MD Anderson Cancer Center
(713) 745-9346

From: gpfsug-discuss-bounces at spectrumscale.org on behalf of Buterbaugh, Kevin L
Sent: Wednesday, March 27, 2019 10:59:17 AM
To: gpfsug main discussion list
Subject: [EXT] [gpfsug-discuss] Adding to an existing GPFS ACL

Hi All,

First off, I have very limited experience with GPFS ACLs, so please forgive me if I'm missing something obvious here. AFAIK, this is the first time we've hit something like this... We have a fileset where all the files / directories have GPFS NFSv4 ACLs set on them. However, unlike most of our filesets where the same ACL is applied to every file / directory in the share, this one has different ACLs on different files / directories. Now we have the need to add to the existing ACLs - another group needs access.

Unlike regular Unix / Linux ACLs where setfacl can be used to just add to an ACL (i.e. setfacl -R g:group_name:rwx), I'm not seeing where GPFS has a similar command - i.e. mmputacl seems to expect the _entire_ new ACL to be supplied via either manual entry or an input file. That's obviously problematic in this scenario.

So am I missing something? Is there an easier solution than writing a script which recurses over the fileset, gets the existing ACL with mmgetacl and outputs that to a file, edits that file to add in the new group, and passes that as input to mmputacl? That seems very cumbersome and error prone, especially if I'm the one writing the script!

Thanks...

Kevin

Kevin Buterbaugh - Senior System Administrator
Vanderbilt University - Advanced Computing Center for Research and Education
Kevin.Buterbaugh at vanderbilt.edu - (615)875-9633

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From Robert.Oesterlin at nuance.com Fri Mar 29 12:22:07 2019
From: Robert.Oesterlin at nuance.com (Oesterlin, Robert)
Date: Fri, 29 Mar 2019 12:22:07 +0000
Subject: [gpfsug-discuss] IBM ESS: Error on ConnectX-4 card, "Power budget Exceeded"
Message-ID: <1737B795-61D9-4A65-9219-4BB3CEBB0D93@nuance.com>

Anyone come across this? Or know what I might look at to fix it? I did find a reference online for ConnectX-5 cards, but nothing for X-4. The SFP (Cisco SFP-10G-SR) is certified by Mellanox to work.

[ 3.807332] mlx5_core 0000:01:00.1: Port module event[error]: module 1, Cable error, Power budget exceeded
[ 7.454895] mlx5_core 0002:01:00.1: Port module event[error]: module 1, Cable error, Power budget exceeded
[ 1585.090315] mlx5_core 0002:01:00.0: Port module event[error]: module 0, Cable error, Power budget exceeded
[ 1610.688397] mlx5_core 0000:01:00.0: Port module event[error]: module 0, Cable error, Power budget exceeded

Bob Oesterlin
Sr Principal Storage Engineer, Nuance
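One low-level check that may help narrow this down is to read the transceiver EEPROM and compare what the module reports against what the port will supply. This is only a sketch: the PCI address comes from the log lines above, the interface name (enp1s0f1) is invented, and the exact fields shown depend on the module type:

    # find the interface name behind one of the complaining PCI functions
    ls /sys/bus/pci/devices/0000:01:00.1/net/

    # dump the module EEPROM (identifier, vendor, part number, power class if reported)
    ethtool -m enp1s0f1 | egrep -i 'identifier|vendor|part|power'

    # watch for the error again after reseating or swapping the optic/adapter
    dmesg | grep -i 'power budget'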
From jonathan.buzzard at strath.ac.uk Fri Mar 29 12:50:35 2019
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Fri, 29 Mar 2019 12:50:35 +0000
Subject: [gpfsug-discuss] IBM ESS: Error on ConnectX-4 card, "Power budget Exceeded"
In-Reply-To: <1737B795-61D9-4A65-9219-4BB3CEBB0D93@nuance.com>
References: <1737B795-61D9-4A65-9219-4BB3CEBB0D93@nuance.com>
Message-ID: <9160c3539a8c9206376b350a1f2f5325eb9cd022.camel@strath.ac.uk>

On Fri, 2019-03-29 at 12:22 +0000, Oesterlin, Robert wrote:
> Anyone come across this? Or know what I might look at to fix it? I
> did find a reference online for ConnectX-5 cards, but nothing for X-4.
> The SFP (Cisco SFP-10G-SR) is certified by Mellanox to work.
>
> [ 3.807332] mlx5_core 0000:01:00.1: Port module event[error]:
> module 1, Cable error, Power budget exceeded
> [ 7.454895] mlx5_core 0002:01:00.1: Port module event[error]:
> module 1, Cable error, Power budget exceeded
> [ 1585.090315] mlx5_core 0002:01:00.0: Port module event[error]:
> module 0, Cable error, Power budget exceeded
> [ 1610.688397] mlx5_core 0000:01:00.0: Port module event[error]:
> module 0, Cable error, Power budget exceeded

My understanding is that the ConnectX-4 cards were QSFP28, not SFP? Well, at least ours are, and all the documentation says the same.

Presumably you have some sort of adaptor to make that work; could that be the issue?

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From S.J.Thompson at bham.ac.uk Fri Mar 29 12:54:10 2019
From: S.J.Thompson at bham.ac.uk (Simon Thompson)
Date: Fri, 29 Mar 2019 12:54:10 +0000
Subject: [gpfsug-discuss] IBM ESS: Error on ConnectX-4 card, "Power budget Exceeded"
In-Reply-To: <9160c3539a8c9206376b350a1f2f5325eb9cd022.camel@strath.ac.uk>
References: <1737B795-61D9-4A65-9219-4BB3CEBB0D93@nuance.com>, <9160c3539a8c9206376b350a1f2f5325eb9cd022.camel@strath.ac.uk>
Message-ID:

You mean like the Mellanox QSA?
https://store.mellanox.com/products/mellanox-mam1q00a-qsa-sp-single-pack-mam1q00a-qsa-ethernet-cable-adapter-40gb-s-to-10gb-s-qsfp-to-sfp.html

We use hundreds of these in CX-4 cards. But not with the SFP+ Bob mentioned, we normally use breakout cables from the SN2100 switches.

Simon

_______________________________________
From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Jonathan Buzzard [jonathan.buzzard at strath.ac.uk]
Sent: 29 March 2019 12:50
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] IBM ESS: Error on ConnectX-4 card, "Power budget Exceeded"
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From duersch at us.ibm.com Fri Mar 29 14:54:08 2019
From: duersch at us.ibm.com (Steve Duersch)
Date: Fri, 29 Mar 2019 09:54:08 -0500
Subject: [gpfsug-discuss] Filesystem descriptor discs for GNR
In-Reply-To: References: Message-ID:

Yes, this will be fixed in the next ESS release.

Steve Duersch
Spectrum Scale/ESS
IBM Poughkeepsie, New York
From Mark.Bush at siriuscom.com Fri Mar 29 16:10:57 2019
From: Mark.Bush at siriuscom.com (Mark Bush)
Date: Fri, 29 Mar 2019 16:10:57 +0000
Subject: [gpfsug-discuss] A net new cluster
Message-ID:

So I have an ESS today and it's nearing end of life (just our own timeline/depreciation etc.) and I will be purchasing a new ESS. I'm working through the logistics of this.
Here is my thinking so far: This is just a big Data Lake and not an HPC environment Option A Purchase new ESS and set it up as a new cluster Remote mount old cluster and copy data (rsync or AFM) Option B Purchase new ESS and set it up to join the old cluster same filesystems Evacuate old NSDs and expel Old nodes Option C Purchase new ESS and set it up as a new cluster Point all new data to new cluster and age out old cluster over time (leverage TCT or Archive) Any/All feedback welcome Mark This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law or may constitute as attorney work product. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this communication in error, notify us immediately by telephone and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cblack at nygenome.org Fri Mar 29 16:37:04 2019 From: cblack at nygenome.org (Christopher Black) Date: Fri, 29 Mar 2019 16:37:04 +0000 Subject: [gpfsug-discuss] A net new cluster Message-ID: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> I suggest option A. We are facing a similar transition and are going with a new cluster and then 4.x cluster to 5.x cluster migration of existing data. An extra wrinkle for us is we are going to join some of the old hardware to the new cluster once it is free of serving current data. Main reasoning of the new cluster for us is to be able to make a fully V19+ filesystem with sub-block allocation. Our understanding from talking to IBM is that there is no way to upgrade a pool to be SBA-compatible, nor is it advisable to try to create a new pool or filesystem in same cluster and then migrate (partially because migrating between filesystems within a cluster with afm would require going through nfs stack afaik). Option B would be much less work and work fine if you are not concerned with things like sub-block allocation that can?t be gained easily with an in-place upgrade. Best, Chris From: on behalf of Mark Bush Reply-To: gpfsug main discussion list Date: Friday, March 29, 2019 at 12:11 PM To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] A net new cluster So I have an ESS today and it?s nearing end of life (just our own timeline/depreciation etc) and I will be purchasing a new ESS. I?m working through the logistics of this. 
Here is my thinking so far: This is just a big Data Lake and not an HPC environment Option A Purchase new ESS and set it up as a new cluster Remote mount old cluster and copy data (rsync or AFM) Option B Purchase new ESS and set it up to join the old cluster same filesystems Evacuate old NSDs and expel Old nodes Option C Purchase new ESS and set it up as a new cluster Point all new data to new cluster and age out old cluster over time (leverage TCT or Archive) Any/All feedback welcome Mark This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law or may constitute as attorney work product. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this communication in error, notify us immediately by telephone and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you. ________________________________ This message is for the recipient?s use only, and may contain confidential, privileged or protected information. Any unauthorized use or dissemination of this communication is prohibited. If you received this message in error, please immediately notify the sender and destroy all copies of this message. The recipient should check this email and any attachments for the presence of viruses, as we accept no liability for any damage caused by any virus transmitted by this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cowan at bnl.gov Fri Mar 29 17:03:25 2019 From: cowan at bnl.gov (Matt Cowan) Date: Fri, 29 Mar 2019 13:03:25 -0400 (EDT) Subject: [gpfsug-discuss] A net new cluster In-Reply-To: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> References: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> Message-ID: On Fri, 29 Mar 2019, Christopher Black wrote: ... > Main reasoning of the new cluster for us is to be able to make a fully V19+ filesystem with > sub-block allocation. > > Our understanding from talking to IBM is that there is no way to upgrade a pool to be > SBA-compatible, nor is it advisable to try to create a new pool or filesystem in same > cluster and then migrate (partially because migrating between filesystems within a cluster > with afm would require going through nfs stack afaik). ... Could you just use policy to migrate to the new pool? no afm required. "partially because"... what are the other reasons? From cblack at nygenome.org Fri Mar 29 17:45:40 2019 From: cblack at nygenome.org (Christopher Black) Date: Fri, 29 Mar 2019 17:45:40 +0000 Subject: [gpfsug-discuss] A net new cluster In-Reply-To: References: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> Message-ID: <3C834EA5-1235-4FD3-832A-015A52DDF0AC@nygenome.org> Our understanding is there is no supported way to create an SBA-enabled pool other than as part of a filesystem that is already at a sufficient 5.x level (we've heard this a couple times in discussions with multiple IBMers). The other reasons for us include the perceived difficulty of in-place ESS upgrades compared to reprovisioning them as fresh building blocks in a new cluster. 
We've complained about how labor-intensive ESS upgrades are before - last time we did an upgrade there were over 15 manual steps per ESS I/O node (many with 10+ minute waits between them and a non-automated check step), in addition to the 25+ steps on the management server. This becomes a multi-week project when you have 20+ ESS I/O nodes and can only do 2-3 a day with careful babysitting. We'd be very interested in hearing the ESS upgrade experiences of other large sites, but perhaps this is diluting the original purpose of the thread.

Best,
Chris

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From makaplan at us.ibm.com Fri Mar 29 19:04:53 2019
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Fri, 29 Mar 2019 14:04:53 -0500
Subject: [gpfsug-discuss] A net new cluster
In-Reply-To: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org>
References: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org>
Message-ID:

I don't know the particulars of the case in question, nor much about ESS rules... But for a vanilla Spectrum Scale cluster:

1) There is nothing wrong or ill-advised about upgrading software and then creating a new version 5.x file system... keeping any older file systems in place.

2) I thought AFM was improved years ago to support GPFS native access -- it need not go through the NFS stack...?

Whereas you wrote: ... nor is it advisable to try to create a new pool or filesystem in same cluster and then migrate (partially because migrating between filesystems within a cluster with afm would require going through nfs stack afaik) ...

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From cblack at nygenome.org Fri Mar 29 19:13:19 2019 From: cblack at nygenome.org (Christopher Black) Date: Fri, 29 Mar 2019 19:13:19 +0000 Subject: [gpfsug-discuss] A net new cluster In-Reply-To: References: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> Message-ID: <0A0CFBD4-35C3-4334-B3FA-7DD7DD4AF7E9@nygenome.org> I was under the impression that AFM could not move between filesystems in the same cluster without going through NFS, but perhaps that is outdated. We?ve only used it in the past to move data between clusters. Could someone with more experience with AFM within a cluster comment? Our goal is to keep the same namespace/path for the users (and ideally keep the same filesystem name) by switching all clients to point to new cluster after a subset of (active) data had been migrated. Best, Chris From: on behalf of Marc A Kaplan Reply-To: gpfsug main discussion list Date: Friday, March 29, 2019 at 3:05 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] A net new cluster I don't know the particulars of the case in question, nor much about ESS rules... But for a vanilla Spectrum Scale cluster -. 1) There is nothing wrong or ill-advised about upgrading software and then creating a new version 5.x file system... keeping any older file systems in place. 2) I thought AFM was improved years ago to support GPFS native access -- need not go through NFS stack...? Whereas your wrote: ... nor is it advisable to try to create a new pool or filesystem in same cluster and then migrate (partially because migrating between filesystems within a cluster with afm would require going through nfs stack afaik) ... ________________________________ This message is for the recipient?s use only, and may contain confidential, privileged or protected information. Any unauthorized use or dissemination of this communication is prohibited. If you received this message in error, please immediately notify the sender and destroy all copies of this message. The recipient should check this email and any attachments for the presence of viruses, as we accept no liability for any damage caused by any virus transmitted by this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From makaplan at us.ibm.com Fri Mar 29 19:58:21 2019 From: makaplan at us.ibm.com (Marc A Kaplan) Date: Fri, 29 Mar 2019 14:58:21 -0500 Subject: [gpfsug-discuss] A net new cluster In-Reply-To: <0A0CFBD4-35C3-4334-B3FA-7DD7DD4AF7E9@nygenome.org> References: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> <0A0CFBD4-35C3-4334-B3FA-7DD7DD4AF7E9@nygenome.org> Message-ID: If one googles "GPFS AFM Migration" you'll find several IBM presentations, white papers and docs on the subject. Also, I thought one can run AFM between two file systems, both file systems in the same cluster. Yes I'm saying local cluster == remote cluster == same cluster. I thought I did that some years ago, just as an exercise to set up AFM and I only had one cluster conveniently available... An expert will confirm or deny. -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Fri Mar 29 20:29:03 2019 From: S.J.Thompson at bham.ac.uk (Simon Thompson) Date: Fri, 29 Mar 2019 20:29:03 +0000 Subject: [gpfsug-discuss] A net new cluster In-Reply-To: References: <7F92D137-07D4-4136-9182-9C5E165704FE@nygenome.org> <0A0CFBD4-35C3-4334-B3FA-7DD7DD4AF7E9@nygenome.org>, Message-ID: I heard that works. But didn't think it was supported for production use. 
But I guess data migration isn't really production in that sense. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of makaplan at us.ibm.com [makaplan at us.ibm.com] Sent: 29 March 2019 19:58 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] A net new cluster If one googles "GPFS AFM Migration" you'll find several IBM presentations, white papers and docs on the subject. Also, I thought one can run AFM between two file systems, both file systems in the same cluster. Yes I'm saying local cluster == remote cluster == same cluster. I thought I did that some years ago, just as an exercise to set up AFM and I only had one cluster conveniently available... An expert will confirm or deny.