From UWEFALKE at de.ibm.com  Tue Nov  1 08:41:37 2016
From: UWEFALKE at de.ibm.com (Uwe Falke)
Date: Tue, 1 Nov 2016 09:41:37 +0100
Subject: [gpfsug-discuss] Recent Whitepapers from Yuri Volobuev
In-Reply-To:
References:
Message-ID:

Another serious loss ...

Mit freundlichen Grüßen / Kind regards

Dr. Uwe Falke
IT Specialist
High Performance Computing Services / Integrated Technology Services / Data Center Services
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland
Rathausstr. 7
09111 Chemnitz
Phone: +49 371 6978 2165
Mobile: +49 175 575 2877
E-Mail: uwefalke at de.ibm.com
-------------------------------------------------------------------------------------------------------------------------------------------
IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Frank Hammer, Thorsten Moehring
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122

From:   "Oesterlin, Robert"
To:     gpfsug main discussion list
Date:   10/31/2016 10:54 AM
Subject:        [gpfsug-discuss] Recent Whitepapers from Yuri Volobuev
Sent by:        gpfsug-discuss-bounces at spectrumscale.org

For those of you who may not know, Yuri Volobuev has left IBM to pursue new challenges. I, along with many others, received so much help and keen insight from Yuri on all things GPFS. He will be missed.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss

From makaplan at us.ibm.com  Tue Nov  1 16:37:17 2016
From: makaplan at us.ibm.com (Marc A Kaplan)
Date: Tue, 1 Nov 2016 11:37:17 -0500
Subject: [gpfsug-discuss] wanted...gpfs policy that places larger files onto a pool based on size
In-Reply-To:
References: <21BC488F0AEA2245B2C3E83FC0B33DBB063A1D4A@CHI-EXCHANGEW1.w2k.jumptrading.com>
Message-ID:

Placement policy rules, SET POOL ..., are evaluated at open/create time, before any write(2) calls have been made, so GPFS has no "idea" how big the file will ultimately be. In other words, at file creation time FILE_SIZE, had we implemented it, would always be 0. Rather than mislead you and have to answer the question "why is FILE_SIZE==0?", we left FILE_SIZE undefined in SET POOL rules.

Of course, we at IBM have thought of at least some other scenarios, and we are listening here... As the last so many years show, GPFS continues to add features, etc, etc.

-- marc

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
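To illustrate the distinction Marc describes, here is a minimal policy sketch (the pool names are hypothetical): a SET POOL placement rule fires at file creation and cannot test the eventual size, while a MIGRATE rule evaluated later by mmapplypolicy can use FILE_SIZE.

/* Placement: evaluated at open/create, before any data is written */
RULE 'place-all' SET POOL 'fast'

/* Migration: evaluated later by mmapplypolicy, when FILE_SIZE is known */
RULE 'move-big' MIGRATE FROM POOL 'fast' TO POOL 'capacity'
     WHERE FILE_SIZE > 1073741824   /* roughly, files larger than 1 GiB */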
From billowen at us.ibm.com  Wed Nov  2 14:28:15 2016
From: billowen at us.ibm.com (Bill Owen)
Date: Wed, 2 Nov 2016 07:28:15 -0700
Subject: [gpfsug-discuss] unified file and object
In-Reply-To:
References:
Message-ID:

Hi Leslie,
Can you also send the /etc/swift/object-server-sof.conf file from this system?

Here is a sample of the file from my working system - it sounds like the config file may not be complete on your system:

[root at spectrumscale ~]# cat /etc/swift/object-server-sof.conf
[DEFAULT]
bind_ip = 127.0.0.1
bind_port = 6203
workers = 3
mount_check = false
log_name = object-server-sof
log_level = ERROR
id_mgmt = unified_mode
retain_acl = yes
retain_winattr = yes
retain_xattr = yes
retain_owner = yes
tempfile_prefix = .ibmtmp_
disable_fallocate = true
log_statsd_host = localhost
log_statsd_port = 8125
log_statsd_default_sample_rate = 1.0
log_statsd_sample_rate_factor = 1.0
log_statsd_metric_prefix =
devices = /gpfs/fs1/object_fileset/o

[pipeline:main]
pipeline = object-server

[app:object-server]
use = egg:swiftonfile#object
disk_chunk_size = 65536
network_chunk_size = 65536

[object-replicator]

[object-updater]

[object-auditor]

[object-reconstructor]


Bill Owen
billowen at us.ibm.com
Spectrum Scale Object Storage
520-799-4829


From:   leslie elliott
To:     gpfsug main discussion list
Date:   10/29/2016 03:53 AM
Subject:        Re: [gpfsug-discuss] unified file and object
Sent by:        gpfsug-discuss-bounces at spectrumscale.org

Bill

to be clear the file access I mentioned was in relation to SMB and NFS using mmuserauth rather than the unification with the object store since it is required as well

but I did try to do this for object as well using the Administration and Programming Reference from page 142, was using unified_mode rather than local_mode

mmobj config change --ccrfile spectrum-scale-object.conf --section capabilities --property file-access-enabled --value true

the mmuserauth failed as you are aware, we have created test accounts without spaces in the DN and were successful with this step, so eagerly await a fix to be able to use the correct accounts

mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property id_mgmt --value unified_mode
mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property ad_domain --value DOMAIN

we have successfully tested object stores on this cluster with simple auth

the output you asked for is as follows

[root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf
[DEFAULT]
devices = /gpfs/pren01/ObjectFileset/o
log_level = ERROR

[root at pren-gs7k-vm4 ~]# systemctl -l status openstack-swift-object-sof
● openstack-swift-object-sof.service - OpenStack Object Storage (swift) - Object Server
   Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-sof.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sat 2016-10-29 10:30:22 UTC; 27s ago
  Process: 8086 ExecStart=/usr/bin/swift-object-server-sof /etc/swift/object-server-sof.conf (code=exited, status=1/FAILURE)
 Main PID: 8086 (code=exited, status=1/FAILURE)

Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Started OpenStack Object Storage (swift) - Object Server.
Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Starting OpenStack Object Storage (swift) - Object Server...
Oct 29 10:30:22 pren-gs7k-vm4 swift-object-server-sof[8086]: Error trying to load config from /etc/swift/object-server-sof.conf: No section 'object-server' (prefixed by 'app' or 'application' or 'composite' or 'composit' or 'pipeline' or 'filter-app') found in config /etc/swift/object-server-sof.conf
Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service: main process exited, code=exited, status=1/FAILURE
Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Unit openstack-swift-object-sof.service entered failed state.
Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service failed. I am happy to help you or for you to help to debug this problem via a short call thanks leslie On 29 October 2016 at 00:37, Bill Owen wrote: 2. Can you provide more details on how you configured file access? The normal procedure is to use "mmobj file-access enable", and this will set up the required settings in the config file. Can you send us: - the steps used to configure file access - the resulting /etc/swift/object-server-sof.conf - log files from /var/log/swift or output of "systemctl status openstack-swift-object-sof" We can schedule a short call to help debug if needed. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From leslie.james.elliott at gmail.com Wed Nov 2 22:00:25 2016 From: leslie.james.elliott at gmail.com (leslie elliott) Date: Thu, 3 Nov 2016 08:00:25 +1000 Subject: [gpfsug-discuss] unified file and object In-Reply-To: References: Message-ID: Bill you are correct about it missing details [root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf [DEFAULT] devices = /gpfs/pren01/ObjectFileset/o log_level = ERROR now that I have yours for reference I have updated the file and the service starts, but I am unsure why it was not provisioned correctly initially leslie On 3 November 2016 at 00:28, Bill Owen wrote: > Hi Leslie, > Can you also send the /etc/swift/object-server-sof.conf file from this > system? 
> > Here is a sample of the file from my working system - it sounds like the > config file may not be complete on your system: > [root at spectrumscale ~]# cat /etc/swift/object-server-sof.conf > [DEFAULT] > bind_ip = 127.0.0.1 > bind_port = 6203 > workers = 3 > mount_check = false > log_name = object-server-sof > log_level = ERROR > id_mgmt = unified_mode > retain_acl = yes > retain_winattr = yes > retain_xattr = yes > retain_owner = yes > tempfile_prefix = .ibmtmp_ > disable_fallocate = true > log_statsd_host = localhost > log_statsd_port = 8125 > log_statsd_default_sample_rate = 1.0 > log_statsd_sample_rate_factor = 1.0 > log_statsd_metric_prefix = > devices = /gpfs/fs1/object_fileset/o > > [pipeline:main] > pipeline = object-server > > [app:object-server] > use = egg:swiftonfile#object > disk_chunk_size = 65536 > network_chunk_size = 65536 > > [object-replicator] > > [object-updater] > > [object-auditor] > > [object-reconstructor] > > > Bill Owen > billowen at us.ibm.com > Spectrum Scale Object Storage > 520-799-4829 > > > [image: Inactive hide details for leslie elliott ---10/29/2016 03:53:48 > AM---Bill to be clear the file access I mentioned was in relat]leslie > elliott ---10/29/2016 03:53:48 AM---Bill to be clear the file access I > mentioned was in relation to SMB and NFS > > From: leslie elliott > To: gpfsug main discussion list > Date: 10/29/2016 03:53 AM > Subject: Re: [gpfsug-discuss] unified file and object > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Bill > > to be clear the file access I mentioned was in relation to SMB and NFS > using mmuserauth rather than the unification with the object store since it > is required as well > > but I did try to do this for object as well using the Administration and > Programming Reference from page 142, was using unified_mode rather than > local_mode > > mmobj config change --ccrfile spectrum-scale-object.conf --section > capabilities --property file-access-enabled --value true > > the mmuserauth failed as you are aware, we have created test accounts > without spaces in the DN and were successful with this step, so eagerly > await a fix to be able to use the correct accounts > > mmobj config change --ccrfile object-server-sof.conf --section DEFAULT > --property id_mgmt --value unified_mode > mmobj config change --ccrfile object-server-sof.conf --section DEFAULT > --property ad_domain --value DOMAIN > > > we have successfully tested object stores on this cluster with simple auth > > > the output you asked for is as follows > > [root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf > [DEFAULT] > devices = /gpfs/pren01/ObjectFileset/o > log_level = ERROR > > > [root at pren-gs7k-vm4 ~]# systemctl -l status openstack-swift-object-sof > ? openstack-swift-object-sof.service - OpenStack Object Storage (swift) - > Object Server > Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-sof.service; > disabled; vendor preset: disabled) > Active: failed (Result: exit-code) since Sat 2016-10-29 10:30:22 UTC; > 27s ago > Process: 8086 ExecStart=/usr/bin/swift-object-server-sof > /etc/swift/object-server-sof.conf (code=exited, status=1/FAILURE) > Main PID: 8086 (code=exited, status=1/FAILURE) > > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Started OpenStack Object Storage > (swift) - Object Server. > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Starting OpenStack Object > Storage (swift) - Object Server... 
> Oct 29 10:30:22 pren-gs7k-vm4 swift-object-server-sof[8086]: Error trying > to load config from /etc/swift/object-server-sof.conf: No section > 'object-server' (prefixed by 'app' or 'application' or 'composite' or > 'composit' or 'pipeline' or 'filter-app') found in config > /etc/swift/object-server-sof.conf > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service: > main process exited, code=exited, status=1/FAILURE > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Unit openstack-swift-object-sof.service > entered failed state. > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service > failed. > > > > > I am happy to help you or for you to help to debug this problem via a > short call > > > thanks > > leslie > > > > On 29 October 2016 at 00:37, Bill Owen <*billowen at us.ibm.com* > > wrote: > > > 2. Can you provide more details on how you configured file access? The > normal procedure is to use "mmobj file-access enable", and this will set up > the required settings in the config file. Can you send us: > - the steps used to configure file access > - the resulting /etc/swift/object-server-sof.conf > - log files from /var/log/swift or output of "systemctl status > openstack-swift-object-sof" > > We can schedule a short call to help debug if needed. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From billowen at us.ibm.com Wed Nov 2 23:39:48 2016 From: billowen at us.ibm.com (Bill Owen) Date: Wed, 2 Nov 2016 16:39:48 -0700 Subject: [gpfsug-discuss] unified file and object In-Reply-To: References: Message-ID: > now that I have yours for reference I have updated the file and the service starts, but I am unsure why it was not provisioned correctly initially Do you have the log from the original installation? Did you install using the spectrumscale install toolkit? Thanks, Bill Owen billowen at us.ibm.com Spectrum Scale Object Storage 520-799-4829 From: leslie elliott To: gpfsug main discussion list Date: 11/02/2016 03:00 PM Subject: Re: [gpfsug-discuss] unified file and object Sent by: gpfsug-discuss-bounces at spectrumscale.org Bill you are correct about it missing details [root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf [DEFAULT] devices = /gpfs/pren01/ObjectFileset/o log_level = ERROR now that I have yours for reference I have updated the file and the service starts, but I am unsure why it was not provisioned correctly initially leslie On 3 November 2016 at 00:28, Bill Owen wrote: Hi Leslie, Can you also send the /etc/swift/object-server-sof.conf file from this system? 
Here is a sample of the file from my working system - it sounds like the config file may not be complete on your system: [root at spectrumscale ~]# cat /etc/swift/object-server-sof.conf [DEFAULT] bind_ip = 127.0.0.1 bind_port = 6203 workers = 3 mount_check = false log_name = object-server-sof log_level = ERROR id_mgmt = unified_mode retain_acl = yes retain_winattr = yes retain_xattr = yes retain_owner = yes tempfile_prefix = .ibmtmp_ disable_fallocate = true log_statsd_host = localhost log_statsd_port = 8125 log_statsd_default_sample_rate = 1.0 log_statsd_sample_rate_factor = 1.0 log_statsd_metric_prefix = devices = /gpfs/fs1/object_fileset/o [pipeline:main] pipeline = object-server [app:object-server] use = egg:swiftonfile#object disk_chunk_size = 65536 network_chunk_size = 65536 [object-replicator] [object-updater] [object-auditor] [object-reconstructor] Bill Owen billowen at us.ibm.com Spectrum Scale Object Storage 520-799-4829 Inactive hide details for leslie elliott ---10/29/2016 03:53:48 AM---Bill to be clear the file access I mentioned was in relatleslie elliott ---10/29/2016 03:53:48 AM---Bill to be clear the file access I mentioned was in relation to SMB and NFS From: leslie elliott To: gpfsug main discussion list Date: 10/29/2016 03:53 AM Subject: Re: [gpfsug-discuss] unified file and object Sent by: gpfsug-discuss-bounces at spectrumscale.org Bill to be clear the file access ?I mentioned was in relation to SMB and NFS using mmuserauth rather than the unification with the object store since it is required as well but I did try to do this for object as well using the Administration and Programming Reference from page 142, was using unified_mode rather than local_mode mmobj config change --ccrfile spectrum-scale-object.conf --section capabilities --property file-access-enabled --value true the mmuserauth failed as you are aware, we have created test accounts without spaces in the DN and were successful with this step, so eagerly await a fix to be able to use the correct accounts mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property id_mgmt --value unified_mode mmobj config change --ccrfile object-server-sof.conf --section DEFAULT --property ad_domain --value DOMAIN we have successfully tested object stores on this cluster with simple auth the output you asked for is as follows [root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf [DEFAULT] devices = /gpfs/pren01/ObjectFileset/o log_level = ERROR [root at pren-gs7k-vm4 ~]# systemctl -l status openstack-swift-object-sof ? openstack-swift-object-sof.service - OpenStack Object Storage (swift) - Object Server ? ?Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-sof.service; disabled; vendor preset: disabled) ? ?Active: failed (Result: exit-code) since Sat 2016-10-29 10:30:22 UTC; 27s ago ? Process: 8086 ExecStart=/usr/bin/swift-object-server-sof /etc/swift/object-server-sof.conf (code=exited, status=1/FAILURE) ?Main PID: 8086 (code=exited, status=1/FAILURE) Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Started OpenStack Object Storage (swift) - Object Server. Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Starting OpenStack Object Storage (swift) - Object Server... 
Oct 29 10:30:22 pren-gs7k-vm4 swift-object-server-sof[8086]: Error trying to load config from /etc/swift/object-server-sof.conf: No section 'object-server' (prefixed by 'app' or 'application' or 'composite' or 'composit' or 'pipeline' or 'filter-app') found in config /etc/swift/object-server-sof.conf Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service: main process exited, code=exited, status=1/FAILURE Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Unit openstack-swift-object-sof.service entered failed state. Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service failed. I am happy to help you or for you to help to debug this problem via a short call thanks leslie On 29 October 2016 at 00:37, Bill Owen wrote: 2. Can you provide more details on how you configured file access? The normal procedure is to use "mmobj file-access enable", and this will set up the required settings in the config file. Can you send us: - the steps used to configure file access - the resulting /etc/swift/object-server-sof.conf - log files from /var/log/swift or output of "systemctl status openstack-swift-object-sof" We can schedule a short call to help debug if needed. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From leslie.james.elliott at gmail.com Thu Nov 3 04:37:43 2016 From: leslie.james.elliott at gmail.com (leslie elliott) Date: Thu, 3 Nov 2016 14:37:43 +1000 Subject: [gpfsug-discuss] unified file and object In-Reply-To: References: Message-ID: Sorry I don't have an install log This is a DDN installation so while I believe the use the spectrum scale toolkit I can not confirm this Thanks Leslie On Thursday, 3 November 2016, Bill Owen wrote: > > now that I have yours for reference I have updated the file and the > service starts, but I am unsure why it was not provisioned correctly > initially > Do you have the log from the original installation? Did you install using > the spectrumscale install toolkit? 
> > Thanks, > Bill Owen > billowen at us.ibm.com > Spectrum Scale Object Storage > 520-799-4829 > > > [image: Inactive hide details for leslie elliott ---11/02/2016 03:00:55 > PM---Bill you are correct about it missing details]leslie elliott > ---11/02/2016 03:00:55 PM---Bill you are correct about it missing details > > From: leslie elliott > > To: gpfsug main discussion list > > Date: 11/02/2016 03:00 PM > Subject: Re: [gpfsug-discuss] unified file and object > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > ------------------------------ > > > > Bill > > you are correct about it missing details > > > [root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf > [DEFAULT] > devices = /gpfs/pren01/ObjectFileset/o > log_level = ERROR > > > > now that I have yours for reference I have updated the file and the > service starts, but I am unsure why it was not provisioned correctly > initially > > leslie > > > On 3 November 2016 at 00:28, Bill Owen <*billowen at us.ibm.com* > > wrote: > > Hi Leslie, > Can you also send the /etc/swift/object-server-sof.conf file from this > system? > > Here is a sample of the file from my working system - it sounds like > the config file may not be complete on your system: > [root at spectrumscale ~]# cat /etc/swift/object-server-sof.conf > [DEFAULT] > bind_ip = 127.0.0.1 > bind_port = 6203 > workers = 3 > mount_check = false > log_name = object-server-sof > log_level = ERROR > id_mgmt = unified_mode > retain_acl = yes > retain_winattr = yes > retain_xattr = yes > retain_owner = yes > tempfile_prefix = .ibmtmp_ > disable_fallocate = true > log_statsd_host = localhost > log_statsd_port = 8125 > log_statsd_default_sample_rate = 1.0 > log_statsd_sample_rate_factor = 1.0 > log_statsd_metric_prefix = > devices = /gpfs/fs1/object_fileset/o > > [pipeline:main] > pipeline = object-server > > [app:object-server] > use = egg:swiftonfile#object > disk_chunk_size = 65536 > network_chunk_size = 65536 > > [object-replicator] > > [object-updater] > > [object-auditor] > > [object-reconstructor] > > > Bill Owen > *billowen at us.ibm.com* > > Spectrum Scale Object Storage > 520-799-4829 > > > [image: Inactive hide details for leslie elliott ---10/29/2016 > 03:53:48 AM---Bill to be clear the file access I mentioned was in relat]leslie > elliott ---10/29/2016 03:53:48 AM---Bill to be clear the file access I > mentioned was in relation to SMB and NFS > > From: leslie elliott <*leslie.james.elliott at gmail.com* > > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Date: 10/29/2016 03:53 AM > Subject: Re: [gpfsug-discuss] unified file and object > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > ------------------------------ > > > > Bill > > to be clear the file access I mentioned was in relation to SMB and > NFS using mmuserauth rather than the unification with the object store > since it is required as well > > but I did try to do this for object as well using the Administration > and Programming Reference from page 142, was using unified_mode rather than > local_mode > > mmobj config change --ccrfile spectrum-scale-object.conf --section > capabilities --property file-access-enabled --value true > > the mmuserauth failed as you are aware, we have created test accounts > without spaces in the DN and were successful with this step, so eagerly > await a fix to be able to use the correct accounts > > mmobj config change --ccrfile object-server-sof.conf --section DEFAULT > --property id_mgmt --value unified_mode > mmobj config change 
--ccrfile object-server-sof.conf --section DEFAULT > --property ad_domain --value DOMAIN > > > we have successfully tested object stores on this cluster with simple > auth > > > the output you asked for is as follows > > [root at pren-gs7k-vm4 ~]# cat /etc/swift/object-server-sof.conf > [DEFAULT] > devices = /gpfs/pren01/ObjectFileset/o > log_level = ERROR > > > [root at pren-gs7k-vm4 ~]# systemctl -l status openstack-swift-object-sof > ? openstack-swift-object-sof.service - OpenStack Object Storage > (swift) - Object Server > Loaded: loaded (/usr/lib/systemd/system/openstack-swift-object-sof.service; > disabled; vendor preset: disabled) > Active: failed (Result: exit-code) since Sat 2016-10-29 10:30:22 > UTC; 27s ago > Process: 8086 ExecStart=/usr/bin/swift-object-server-sof > /etc/swift/object-server-sof.conf (code=exited, status=1/FAILURE) > Main PID: 8086 (code=exited, status=1/FAILURE) > > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Started OpenStack Object > Storage (swift) - Object Server. > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Starting OpenStack Object > Storage (swift) - Object Server... > Oct 29 10:30:22 pren-gs7k-vm4 swift-object-server-sof[8086]: Error > trying to load config from /etc/swift/object-server-sof.conf: No > section 'object-server' (prefixed by 'app' or 'application' or 'composite' > or 'composit' or 'pipeline' or 'filter-app') found in config > /etc/swift/object-server-sof.conf > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service: > main process exited, code=exited, status=1/FAILURE > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: Unit > openstack-swift-object-sof.service entered failed state. > Oct 29 10:30:22 pren-gs7k-vm4 systemd[1]: openstack-swift-object-sof.service > failed. > > > > > I am happy to help you or for you to help to debug this problem via a > short call > > > thanks > > leslie > > > > On 29 October 2016 at 00:37, Bill Owen <*billowen at us.ibm.com* > > wrote: > > 2. Can you provide more details on how you configured file > access? The normal procedure is to use "mmobj file-access enable", and this > will set up the required settings in the config file. Can you send us: > - the steps used to configure file access > - the resulting /etc/swift/object-server-sof.conf > - log files from /var/log/swift or output of "systemctl status > openstack-swift-object-sof" > > We can schedule a short call to help debug if needed. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL:

From Mark.Bush at siriuscom.com  Fri Nov  4 16:18:17 2016
From: Mark.Bush at siriuscom.com (Mark.Bush at siriuscom.com)
Date: Fri, 4 Nov 2016 16:18:17 +0000
Subject: [gpfsug-discuss] CES and IP's that disappear
Message-ID: <451248FD-D116-4C3D-A439-73967C287F6C@siriuscom.com>

I continue to run into a problem where after I get CES setup properly and the ces-ip addresses show up, I then add an NFS export and all of a sudden the ces-ips disappear from the protocol nodes. I have been scouring the problem determination guide but can't seem to find out what is going on. It makes no sense to me that the IPs would disappear, especially after a simple task of just adding an NFS export. I first thought this was just a GUI issue, but I just got done trying it all from the CLI and the same thing happens. Has anyone seen anything like this?

Mark

This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. This message may be viewed by parties at Sirius Computer Solutions other than those named in the message header. This message does not contain an official representation of Sirius Computer Solutions. If you have received this communication in error, notify Sirius Computer Solutions immediately and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you.

Sirius Computer Solutions
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From leslie.james.elliott at gmail.com  Sat Nov  5 06:09:34 2016
From: leslie.james.elliott at gmail.com (leslie elliott)
Date: Sat, 5 Nov 2016 16:09:34 +1000
Subject: [gpfsug-discuss] HAWC and LROC
Message-ID:

Hi, I am curious if anyone has run these together on a client and whether it helped

If we wanted to have these functions out at the client to optimise compute IO in a couple of special cases

can both exist at the same time on the same nonvolatile hardware or do the two functions need independent devices

and what would be the process to disestablish them on the clients as the requirement was satisfied

thanks

leslie
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From olaf.weiser at de.ibm.com  Sat Nov  5 13:39:58 2016
From: olaf.weiser at de.ibm.com (Olaf Weiser)
Date: Sat, 5 Nov 2016 13:39:58 +0000
Subject: [gpfsug-discuss] HAWC and LROC
Message-ID:

You can use both - HAWC, LROC - on the same node... but you need dedicated, independent block devices ...
In addition, for HAWC you could consider replication and use 2 devices, even across 2 nodes. ...

Gesendet von IBM Verse

leslie elliott --- [gpfsug-discuss] HAWC and LROC ---

Von: "leslie elliott"
An: "gpfsug main discussion list"
Datum: Sa. 05.11.2016 02:09
Betreff: [gpfsug-discuss] HAWC and LROC

Hi, I am curious if anyone has run these together on a client and whether it helped

If we wanted to have these functions out at the client to optimise compute IO in a couple of special cases

can both exist at the same time on the same nonvolatile hardware or do the two functions need independent devices

and what would be the process to disestablish them on the clients as the requirement was satisfied

thanks

leslie
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
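As a rough sketch of Olaf's point (and of the follow-ups below), the two functions are set up through separate NSD stanzas on the client node. The device names, NSD names and node name here are made up, and the exact attributes for the HAWC log disk (system.log pool) should be checked against the LROC/HAWC documentation:

# LROC: a local partition used only as a read cache on this node
%nsd: device=/dev/sdb2 nsd=node1_lroc servers=node1 usage=localCache

# HAWC: a partition holding a replica of the recovery log in the system.log pool
%nsd: device=/dev/sdb1 nsd=node1_hawc servers=node1 usage=metadataOnly pool=system.log

After mmcrnsd (and, for the HAWC disk, mmadddisk into the file system), HAWC is switched on by raising the write cache threshold, for example "mmchfs fs1 --write-cache-threshold 64K"; setting the threshold back to 0 and removing the disks again is, broadly, the way to disestablish it on a client when no longer needed.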
From oehmes at gmail.com  Sat Nov  5 16:17:52 2016
From: oehmes at gmail.com (Sven Oehme)
Date: Sat, 05 Nov 2016 16:17:52 +0000
Subject: Re: [gpfsug-discuss] HAWC and LROC
In-Reply-To:
References:
Message-ID:

Yes and no :)

While Olaf is right that it needs two independent block devices, partitions are just fine. So one could in fact have a 200g SSD as a boot device and partition it, let's say 30g OS, 20g HAWC, 150g LROC.

You have to keep in mind that LROC and HAWC have 2 very different requirements on the 'device'. If you lose HAWC, you lose one copy of critical data (that's why the log needs to be replicated); if you lose LROC, you only lose cached data stored somewhere else. So the recommendation is to use somewhat reliable 'devices' for HAWC, while for LROC it could be simple consumer-grade SSDs. So if you use one device for both, it should be reliable.

Sven

On Sat, Nov 5, 2016, 6:40 AM Olaf Weiser wrote:

> You can use both - HAWC, LROC - on the same node... but you need
> dedicated, independent block devices ...
> In addition, for HAWC you could consider replication and use 2 devices,
> even across 2 nodes. ...
>
> Gesendet von IBM Verse
>
> leslie elliott --- [gpfsug-discuss] HAWC and LROC ---
>
> Von: "leslie elliott"
> An: "gpfsug main discussion list"
> Datum: Sa. 05.11.2016 02:09
> Betreff: [gpfsug-discuss] HAWC and LROC
> ------------------------------
>
> Hi I am curious if anyone has run these together on a client and whether
> it helped
>
> If we wanted to have these functions out at the client to optimise compute
> IO in a couple of special cases
>
> can both exist at the same time on the same nonvolatile hardware or do the
> two functions need independent devices
>
> and what would be the process to disestablish them on the clients as the
> requirement was satisfied
>
> thanks
>
> leslie
>
> _______________________________________________
> gpfsug-discuss mailing list
> gpfsug-discuss at spectrumscale.org
> http://gpfsug.org/mailman/listinfo/gpfsug-discuss
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From r.sobey at imperial.ac.uk  Mon Nov  7 11:08:00 2016
From: r.sobey at imperial.ac.uk (Sobey, Richard A)
Date: Mon, 7 Nov 2016 11:08:00 +0000
Subject: [gpfsug-discuss] How to clear stale entries in GUI log
Message-ID:

Hi all

Since upgrading to 4.2.1 and all the in-between work that comes with it, I've got an error (warning) in my GUI with the following:

Event ID: MS8071
Time: 13/09/2016 17:47:00
Message: DISK [disk_down] ICSAN_GPFS_FSD_QUORUM is DEGRADED
Details:
System response: -
Administrator response: Use the command 'mmhealth node show' to get further details.
Component NSD There's no fix for it, and neither has it recognised the problem doesn't exist anymore: [root at quorum ~]# mmhealth node show Node name: icgpfsq1 Node status: HEALTHY Component Status Reasons ------------------------------------------------------------------- GPFS HEALTHY - NETWORK HEALTHY - FILESYSTEM HEALTHY - DISK HEALTHY - GUI HEALTHY - Apart from possibly restarting the GUI services, shouldn't this just go away by itself? Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From dhildeb at us.ibm.com Mon Nov 7 19:29:27 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Mon, 7 Nov 2016 11:29:27 -0800 Subject: [gpfsug-discuss] HAWC and LROC In-Reply-To: References: Message-ID: Just adding in that with HAWC, you can also use a shared fast storage device (instead of a node local SSD). So, for example, if you already have your metadata stored in a shared SSD server, then you can just enable HAWC without any additional replication requirements. Dean From: Sven Oehme To: gpfsug main discussion list Date: 11/05/2016 09:18 AM Subject: Re: [gpfsug-discuss] HAWC and LROC Sent by: gpfsug-discuss-bounces at spectrumscale.org Yes and no :) While olaf is right, it needs two independent blockdevices, partitions are just fine. So one could have in fact have a 200g ssd as a boot device and partitions it lets say 30g os 20g hawc 150g lroc you have to keep in mind that lroc and hawc have 2 very different requirements on the 'device'. if you loose hawc, you loose one copy of critical data (that's why the log needs to be replicated), if you loose lroc, you only loose cached data stored somewhere else, so the recommendation is to use soemwhat reliable 'devices' for hawc, while for lroc it could be simple consumer grade ssd's. So if you use one for both, it should be reliable. Sven On Sat, Nov 5, 2016, 6:40 AM Olaf Weiser wrote: You can use both -HAWC ,LROC- on the same node... but you need dedicated ,independent ,block devices ... In addition for hawc, you could consider replication and use 2 devices, even across 2 nodes. ... Gesendet von IBM Verse leslie elliott --- [gpfsug-discuss] HAWC and LROC --- Von: "leslie elliott" An: "gpfsug main discussion list" Datum: Sa. 05.11.2016 02:09 Betreff [gpfsug-discuss] HAWC and LROC : Hi I am curious if anyone has run these together on a client and whether it helped If we wanted to have these functions out at the client to optimise compute IO in a couple of special cases can both exist at the same time on the same nonvolatile hardware or do the two functions need independent devices and what would be the process to disestablish them on the clients as the requirement was satisfied thanks leslie _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From MDIETZ at de.ibm.com Mon Nov 7 20:14:10 2016 From: MDIETZ at de.ibm.com (Mathias Dietz) Date: Mon, 7 Nov 2016 21:14:10 +0100 Subject: [gpfsug-discuss] CES and IP's that disappear In-Reply-To: <451248FD-D116-4C3D-A439-73967C287F6C@siriuscom.com> References: <451248FD-D116-4C3D-A439-73967C287F6C@siriuscom.com> Message-ID: Hi Mark, this sounds like a CES IP failover happened in the background. With Spectrum Scale 4.2.1 you can use the command "mmhealth node eventlog" on the failing node to see if a failover happened and what has triggered the failover. Prior to 4.2.1 use the command "mmces events list" or look into the mmfs.log for errors. Mit freundlichen Gr??en / Kind regards Mathias Dietz Spectrum Scale - Release Lead Architect (4.2.2 Release) System Health and Problem Determination Architect IBM Certified Software Engineer ---------------------------------------------------------------------------------------------------------- IBM Deutschland Hechtsheimer Str. 2 55131 Mainz Phone: +49-6131-84-2027 Mobile: +49-15152801035 E-Mail: mdietz at de.ibm.com ---------------------------------------------------------------------------------------------------------- IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martina Koederitz, Gesch?ftsf?hrung: Dirk Wittkopp Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294 From: "Mark.Bush at siriuscom.com" To: "gpfsug-discuss at spectrumscale.org" Date: 11/04/2016 05:18 PM Subject: [gpfsug-discuss] CES and IP's that disappear Sent by: gpfsug-discuss-bounces at spectrumscale.org I continue to run into a problem where after I get CES setup properly and the ces-ip addresses show up, I then add an NFS export and all of a sudden the ces-ip?s disappear from the protocol nodes. I have been scouring the problem determination guide but can?t seem to find out what is going on. It makes no sense to me that the IP?s would disappear. Especially after a simple task of just adding an nfs export. I first thought this was just a gui issue but just got done trying it all from the cli and the same thing happens. Has anyone seen anything like this? Mark This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. This message may be viewed by parties at Sirius Computer Solutions other than those named in the message header. This message does not contain an official representation of Sirius Computer Solutions. If you have received this communication in error, notify Sirius Computer Solutions immediately and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you. Sirius Computer Solutions _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Robert.Oesterlin at nuance.com Mon Nov 7 21:12:50 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Mon, 7 Nov 2016 21:12:50 +0000 Subject: [gpfsug-discuss] SC16: GPFS User Group Meeting location Information Message-ID: <55E57C1F-B20C-4AAE-9FE9-A2D87C9AE61C@nuance.com> IBM Spectrum Scale User Group Meeting - SC16 Please register if you have not done so: https://www-01.ibm.com/events/wwe/grp/grp305.nsf/Registration.xsp?openform&seminar=357M7UES&locale=en_US&auth=anonymous Date: Sunday, November 13th Time: 12:30p - 5:30p - Please be on time! Location: Grand Ballroom Salon F Reception Following: Salon E Salt Lake Marriott Downtown at City Creek 75 S W Temple Salt Lake City, Utah 84101 United States The Salt Lake Marriott Downtown at City Creek is located across the street from the Salt Palace Convention Center. Bob Oesterlin Sr Principal Storage Engineer, Nuance -------------- next part -------------- An HTML attachment was scrubbed... URL: From mimarsh2 at vt.edu Tue Nov 8 13:40:43 2016 From: mimarsh2 at vt.edu (Brian Marshall) Date: Tue, 8 Nov 2016 08:40:43 -0500 Subject: [gpfsug-discuss] subnets confusion Message-ID: All, I have a tricky (at least to me) subnets question. I have 2 NSD Server clusters: Serv1 -> daemon on 10.51 with high speed network on 10.82 Serv2 -> daemon on 10.42 a high speed network and 2 client clusters: Cli1 -> daemon on 10.81 with high speed network on 10.82 Cli2 -> daemon on 10.41 with high speed network on 10.42 Serv1 has the following subnets operand: subnets 10.82.0.0/Serv1;Cli1 10.41.0.0/Cli2 Cli1 has the following subnets subnets 10.82.0.0/Serv1;Cli1 Cli2 has the following subnets subnets 10.51.0.0/Serv1 10.41.0.0/Cli2 10.42.0.0/Serv2 Problem: Sometimes Serv1 will try to contact Cli2 nodes on the 10.42 address which they don't have access to. I get errors like Close connection to 10.42.1 0.1 hs001.cluster.ib (Connection timed out) Cli2 nodes can connect/re-connect to Serv1 once the server cluster kicks them out. Serv1 has Cli2 listed on its 10.41 subnets operand, so I don't fully understand why Serv1 does not use 10.41 to connect Possible Solution?? I think to fix this I either need to add Serv1 to the 10.41 subnet of Cli2 OR move the 10.42 operand on Cli2 to the front of the list. I am working from this link https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+(GPFS)/page/GPFS+Network+Communication+Overview Please let me know if you need more info. I have tried to strip this down to the bare minimum and in doing so may have left out good details. Thank you, Brian Marshall -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Tue Nov 8 13:56:10 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Tue, 8 Nov 2016 13:56:10 +0000 Subject: [gpfsug-discuss] "waiting for exclusive use of connection for sending msg" Message-ID: <24DB6954-F972-4022-9A7C-539E048E0680@nuance.com> This is one of those RPC waiters whose real cause and solution elude me. I know from various sources that it's network congestion related. But the documentation doesn't *really* give me a clue as to where to look next. If the NSD server is running well, with no obvious network issues then this may be a simple matter of network congestion. Any member out there who might know in detail where I should be looking? 
Bob Oesterlin Sr Principal Storage Engineer, Nuance 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rohwedder at de.ibm.com Tue Nov 8 14:51:02 2016 From: rohwedder at de.ibm.com (Markus Rohwedder) Date: Tue, 8 Nov 2016 15:51:02 +0100 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: References: Message-ID: Hello, you ran into a defect which is fixed with the upcoming 4.2.1.2 PTF Here is a workaround: You can clear the eventlog of the system health component using mmsysmonc clearDB This is a per node database, so you need to run this on all the nodes which have stale entries. It will clear all the events on this node, if you want to save them run: mmhealth node eventlog > log.save On the GUI node, run systemctl restart gpfsgui afterwards. The mmhealth command suppresses events during startup. So in case a bad condition turns OK during a restart phase, the bad event will remain stale. Regards, Markus Rohwedder IBM Spectrum Scale GUI development -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 8 15:09:51 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 8 Nov 2016 15:09:51 +0000 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: References: Message-ID: Thanks. I've run that on, I assume, our quorum server where this disk is mounted, but the error is still showing up. The event itself doesn't say which node is affected. ICSAN_GPFS_FSD_QUORUM nsd 512 103 no no ready up system That looks ok to me. Maybe I misunderstood your line "This is a per node database, so you need to run this on all the nodes which have stale entries.". Should I just run it on all the nodes in the cluster instead... there's not many so won't take long but wondering if that's really necessary? Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Markus Rohwedder Sent: 08 November 2016 14:51 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] How to clear stale entries in GUI log Hello, you ran into a defect which is fixed with the upcoming 4.2.1.2 PTF Here is a workaround: You can clear the eventlog of the system health component using mmsysmonc clearDB This is a per node database, so you need to run this on all the nodes which have stale entries. It will clear all the events on this node, if you want to save them run: mmhealth node eventlog > log.save On the GUI node, run systemctl restart gpfsgui afterwards. The mmhealth command suppresses events during startup. So in case a bad condition turns OK during a restart phase, the bad event will remain stale. Regards, Markus Rohwedder IBM Spectrum Scale GUI development -------------- next part -------------- An HTML attachment was scrubbed... URL: From andreas.koeninger at de.ibm.com Tue Nov 8 16:49:50 2016 From: andreas.koeninger at de.ibm.com (Andreas Koeninger) Date: Tue, 8 Nov 2016 16:49:50 +0000 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... 
URL:

From S.J.Thompson at bham.ac.uk  Wed Nov  9 09:10:47 2016
From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services))
Date: Wed, 9 Nov 2016 09:10:47 +0000
Subject: [gpfsug-discuss] SC16: GPFS User Group Meeting location Information
Message-ID:

Please do sign up and come along to the user group meeting.

Important note: the start time is 12:30pm (it was originally advertised as 1pm on the IBM website).

I was on a planning call yesterday, and I'm very happy to hear that IBM are specifically vetting their slides to ensure they are technical talks (not sales) for the user group session. We also have some great sounding user talks on the agenda.

The programme is available on the Spectrum Scale UG website as well at:
http://www.spectrumscale.org/ssug-at-sc16/

We're all looking forward to seeing you on Sunday

Simon

From: on behalf of "Oesterlin, Robert"
Reply-To: "gpfsug-discuss at spectrumscale.org"
Date: Monday, 7 November 2016 at 21:12
To: "gpfsug-discuss at spectrumscale.org"
Subject: [gpfsug-discuss] SC16: GPFS User Group Meeting location Information

IBM Spectrum Scale User Group Meeting - SC16

Please register if you have not done so:
https://www-01.ibm.com/events/wwe/grp/grp305.nsf/Registration.xsp?openform&seminar=357M7UES&locale=en_US&auth=anonymous

Date: Sunday, November 13th
Time: 12:30p - 5:30p - Please be on time!
Location: Grand Ballroom Salon F
Reception Following: Salon E

Salt Lake Marriott Downtown at City Creek
75 S W Temple
Salt Lake City, Utah 84101 United States

The Salt Lake Marriott Downtown at City Creek is located across the street from the Salt Palace Convention Center.

Bob Oesterlin
Sr Principal Storage Engineer, Nuance

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pascal+gpfsug at blue-onyx.ch  Wed Nov  9 16:23:06 2016
From: pascal+gpfsug at blue-onyx.ch (Pascal Jermini)
Date: Wed, 9 Nov 2016 17:23:06 +0100
Subject: [gpfsug-discuss] LROC and Spectrum Scale Express
Message-ID: <7d2dcea5-0d28-2357-e5fe-8419aeaaf30b@blue-onyx.ch>

Dear all,

by looking at the documentation it is not clear whether Spectrum Scale Express edition supports the LROC feature. As far as I understand it, the client license is sufficient; however, no word is given about which edition supports that feature.

Any idea and/or pointer?

Many thanks,
Pascal

From jake.carroll at uq.edu.au  Wed Nov  9 17:39:05 2016
From: jake.carroll at uq.edu.au (Jake Carroll)
Date: Wed, 9 Nov 2016 17:39:05 +0000
Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances
Message-ID: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au>

Hi.

I've got a GPFS to GPFS AFM cache/home (IW) relationship set up over a really long distance. About 180ms of latency between the two clusters and around 13,000km of optical path. Fortunately for me, I've actually got near theoretical maximum IO over the NICs between the clusters and I'm iPerf'ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 all the way through.

Anyway - I'm finding my AFM traffic to be dragging its feet and I don't really understand why that might be. I've verified the links and transports ability as I said above with iPerf, and CERN's FDT to near 10Gbit/sec.

I also verified the clusters on both sides in terms of disk IO and they both seem easily capable in IOZone and IOR tests of multiple GB/sec of throughput.

So - my questions:

1. Are there very specific tunings AFM needs for high latency/long distance IO?
2.
Are there very specific NIC/TCP-stack tunings (beyond the type of thing we already have in place) that benefits AFM over really long distances and high latency? 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in the home mount. It sometimes takes 20 to 30 seconds before the command line will report back with a long listing of files. Any ideas why it?d take that long to get a response from ?home?. We?ve got our TCP stack setup fairly aggressively, on all hosts that participate in these two clusters. ethtool -C enp2s0f0 adaptive-rx off ifconfig enp2s0f0 txqueuelen 10000 sysctl -w net.core.rmem_max=536870912 sysctl -w net.core.wmem_max=536870912 sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" sysctl -w net.core.netdev_max_backlog=250000 sysctl -w net.ipv4.tcp_congestion_control=htcp sysctl -w net.ipv4.tcp_mtu_probing=1 I modified a couple of small things on the AFM ?cache? side to see if it?d make a difference such as: mmchconfig afmNumWriteThreads=4 mmchconfig afmNumReadThreads=4 But no difference so far. Thoughts would be appreciated. I?ve done this before over much shorter distances (30Km) and I?ve flattened a 10GbE wire without really tuning?anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement semantics going to hurt here? I really thought AFM might be well designed for exactly this kind of work at long distance *and* high throughput ? so I must be missing something! -jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From janfrode at tanso.net Wed Nov 9 18:05:21 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Wed, 09 Nov 2016 18:05:21 +0000 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances In-Reply-To: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> References: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> Message-ID: Mostly curious, don't have experience in such environments, but ... Is this AFM over NFS or NSD protocol? Might be interesting to try the other option -- and also check how nsdperf performs over such distance/latency. -jf ons. 9. nov. 2016 kl. 18.39 skrev Jake Carroll : > Hi. > > > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a > really long distance. About 180ms of latency between the two clusters and > around 13,000km of optical path. Fortunately for me, I?ve actually got near > theoretical maximum IO over the NIC?s between the clusters and I?m > iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 > all the way through. > > > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t > really understand why that might be. I?ve verified the links and transports > ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. > > > > I also verified the clusters on both sides in terms of disk IO and they > both seem easily capable in IOZone and IOR tests of multiple GB/sec of > throughput. > > > > So ? my questions: > > > > 1. Are there very specific tunings AFM needs for high latency/long > distance IO? > > 2. Are there very specific NIC/TCP-stack tunings (beyond the type > of thing we already have in place) that benefits AFM over really long > distances and high latency? > > 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? > in the home mount. It sometimes takes 20 to 30 seconds before the command > line will report back with a long listing of files. 
Any ideas why it?d take > that long to get a response from ?home?. > > > > We?ve got our TCP stack setup fairly aggressively, on all hosts that > participate in these two clusters. > > > > ethtool -C enp2s0f0 adaptive-rx off > > ifconfig enp2s0f0 txqueuelen 10000 > > sysctl -w net.core.rmem_max=536870912 > > sysctl -w net.core.wmem_max=536870912 > > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > > sysctl -w net.core.netdev_max_backlog=250000 > > sysctl -w net.ipv4.tcp_congestion_control=htcp > > sysctl -w net.ipv4.tcp_mtu_probing=1 > > > > I modified a couple of small things on the AFM ?cache? side to see if it?d > make a difference such as: > > > > mmchconfig afmNumWriteThreads=4 > > mmchconfig afmNumReadThreads=4 > > > > But no difference so far. > > > > Thoughts would be appreciated. I?ve done this before over much shorter > distances (30Km) and I?ve flattened a 10GbE wire without really > tuning?anything. Are my large in-flight-packets > numbers/long-time-to-acknowledgement semantics going to hurt here? I really > thought AFM might be well designed for exactly this kind of work at long > distance **and** high throughput ? so I must be missing something! > > > > -jc > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sfadden at us.ibm.com Wed Nov 9 18:08:42 2016 From: sfadden at us.ibm.com (Scott Fadden) Date: Wed, 9 Nov 2016 10:08:42 -0800 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances In-Reply-To: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> References: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> Message-ID: Jake, If AFM is using NFS it is all about NFS tuning. The copy from one side to the other is basically just a client writing to an NFS mount. Thee are a few things you can look at: 1. NFS Transfer size (Make is 1MiB, I think that is the max) 2. TCP Tuning for large window size. This is discussed on Tuning active file management home communications in the docs. On this page you will find some discussion on increasing gateway threads, and other things similar that may help as well. We can discuss further as I understand we will be meeting at SC16. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: Jake Carroll To: "gpfsug-discuss at spectrumscale.org" Date: 11/09/2016 09:39 AM Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi. I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a really long distance. About 180ms of latency between the two clusters and around 13,000km of optical path. Fortunately for me, I?ve actually got near theoretical maximum IO over the NIC?s between the clusters and I?m iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 all the way through. Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t really understand why that might be. I?ve verified the links and transports ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. 
I also verified the clusters on both sides in terms of disk IO and they both seem easily capable in IOZone and IOR tests of multiple GB/sec of throughput. So ? my questions: 1. Are there very specific tunings AFM needs for high latency/long distance IO? 2. Are there very specific NIC/TCP-stack tunings (beyond the type of thing we already have in place) that benefits AFM over really long distances and high latency? 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in the home mount. It sometimes takes 20 to 30 seconds before the command line will report back with a long listing of files. Any ideas why it?d take that long to get a response from ?home?. We?ve got our TCP stack setup fairly aggressively, on all hosts that participate in these two clusters. ethtool -C enp2s0f0 adaptive-rx off ifconfig enp2s0f0 txqueuelen 10000 sysctl -w net.core.rmem_max=536870912 sysctl -w net.core.wmem_max=536870912 sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" sysctl -w net.core.netdev_max_backlog=250000 sysctl -w net.ipv4.tcp_congestion_control=htcp sysctl -w net.ipv4.tcp_mtu_probing=1 I modified a couple of small things on the AFM ?cache? side to see if it?d make a difference such as: mmchconfig afmNumWriteThreads=4 mmchconfig afmNumReadThreads=4 But no difference so far. Thoughts would be appreciated. I?ve done this before over much shorter distances (30Km) and I?ve flattened a 10GbE wire without really tuning?anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement semantics going to hurt here? I really thought AFM might be well designed for exactly this kind of work at long distance *and* high throughput ? so I must be missing something! -jc _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From jake.carroll at uq.edu.au Wed Nov 9 18:09:14 2016 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Wed, 9 Nov 2016 18:09:14 +0000 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances (Jan-Frode Myklebust) Message-ID: <5D327C63-84EC-4F59-86E7-158308E91013@uq.edu.au> Hi jf? >> Mostly curious, don't have experience in such environments, but ... Is this AFM over NFS or NSD protocol? Might be interesting to try the other option -- and also check how nsdperf performs over such distance/latency. As it turns out, it seems, very few people do. I will test nsdperf over it and see how it performs. And yes, it is AFM ? AFM. No NFS involved here! -jc ------------------------------ Message: 2 Date: Wed, 9 Nov 2016 17:39:05 +0000 From: Jake Carroll To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances Message-ID: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0 at uq.edu.au> Content-Type: text/plain; charset="utf-8" Hi. I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a really long distance. About 180ms of latency between the two clusters and around 13,000km of optical path. Fortunately for me, I?ve actually got near theoretical maximum IO over the NIC?s between the clusters and I?m iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 all the way through. Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t really understand why that might be. 
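For reference, the sort of first nsdperf pass I have in mind is sketched below. This is only a rough outline: nsdperf ships as source under /usr/lpp/mmfs/samples/net, the build line and interactive subcommand names are from memory of the comments and help in nsdperf.C, and the hostnames, thread count and socket size are placeholders rather than recommendations. The motivation is the bandwidth-delay product: at roughly 10Gbit/sec (about 1.25GB/sec) and 180ms RTT there is on the order of 220MB in flight, so the interesting question is whether a small number of TCP connections can open windows that large between them.

cd /usr/lpp/mmfs/samples/net
g++ -O2 -o nsdperf -lpthread -lrt nsdperf.C   # assumed build line; check the header of nsdperf.C
./nsdperf -s                                  # start in server mode on the far-side test node(s)
./nsdperf                                     # on a near-side node, then at the prompt:
  server far-node01                           # placeholder hostnames
  client near-node01
  ttime 60
  threads 8
  socksize 268435456                          # illustrative; bounded by net.core.rmem_max/wmem_max
  test
  quit

If nsdperf can fill the pipe over this latency but AFM still cannot, that points at the AFM/gateway layer rather than the network itself.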
I?ve verified the links and transports ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. I also verified the clusters on both sides in terms of disk IO and they both seem easily capable in IOZone and IOR tests of multiple GB/sec of throughput. So ? my questions: 1. Are there very specific tunings AFM needs for high latency/long distance IO? 2. Are there very specific NIC/TCP-stack tunings (beyond the type of thing we already have in place) that benefits AFM over really long distances and high latency? 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in the home mount. It sometimes takes 20 to 30 seconds before the command line will report back with a long listing of files. Any ideas why it?d take that long to get a response from ?home?. We?ve got our TCP stack setup fairly aggressively, on all hosts that participate in these two clusters. ethtool -C enp2s0f0 adaptive-rx off ifconfig enp2s0f0 txqueuelen 10000 sysctl -w net.core.rmem_max=536870912 sysctl -w net.core.wmem_max=536870912 sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" sysctl -w net.core.netdev_max_backlog=250000 sysctl -w net.ipv4.tcp_congestion_control=htcp sysctl -w net.ipv4.tcp_mtu_probing=1 I modified a couple of small things on the AFM ?cache? side to see if it?d make a difference such as: mmchconfig afmNumWriteThreads=4 mmchconfig afmNumReadThreads=4 But no difference so far. Thoughts would be appreciated. I?ve done this before over much shorter distances (30Km) and I?ve flattened a 10GbE wire without really tuning?anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement semantics going to hurt here? I really thought AFM might be well designed for exactly this kind of work at long distance *and* high throughput ? so I must be missing something! -jc -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 3 Date: Wed, 09 Nov 2016 18:05:21 +0000 From: Jan-Frode Myklebust To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances Message-ID: Content-Type: text/plain; charset="utf-8" Mostly curious, don't have experience in such environments, but ... Is this AFM over NFS or NSD protocol? Might be interesting to try the other option -- and also check how nsdperf performs over such distance/latency. -jf ons. 9. nov. 2016 kl. 18.39 skrev Jake Carroll : > Hi. > > > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a > really long distance. About 180ms of latency between the two clusters and > around 13,000km of optical path. Fortunately for me, I?ve actually got near > theoretical maximum IO over the NIC?s between the clusters and I?m > iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 > all the way through. > > > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t > really understand why that might be. I?ve verified the links and transports > ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. > > > > I also verified the clusters on both sides in terms of disk IO and they > both seem easily capable in IOZone and IOR tests of multiple GB/sec of > throughput. > > > > So ? my questions: > > > > 1. Are there very specific tunings AFM needs for high latency/long > distance IO? > > 2. 
Are there very specific NIC/TCP-stack tunings (beyond the type > of thing we already have in place) that benefits AFM over really long > distances and high latency? > > 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? > in the home mount. It sometimes takes 20 to 30 seconds before the command > line will report back with a long listing of files. Any ideas why it?d take > that long to get a response from ?home?. > > > > We?ve got our TCP stack setup fairly aggressively, on all hosts that > participate in these two clusters. > > > > ethtool -C enp2s0f0 adaptive-rx off > > ifconfig enp2s0f0 txqueuelen 10000 > > sysctl -w net.core.rmem_max=536870912 > > sysctl -w net.core.wmem_max=536870912 > > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > > sysctl -w net.core.netdev_max_backlog=250000 > > sysctl -w net.ipv4.tcp_congestion_control=htcp > > sysctl -w net.ipv4.tcp_mtu_probing=1 > > > > I modified a couple of small things on the AFM ?cache? side to see if it?d > make a difference such as: > > > > mmchconfig afmNumWriteThreads=4 > > mmchconfig afmNumReadThreads=4 > > > > But no difference so far. > > > > Thoughts would be appreciated. I?ve done this before over much shorter > distances (30Km) and I?ve flattened a 10GbE wire without really > tuning?anything. Are my large in-flight-packets > numbers/long-time-to-acknowledgement semantics going to hurt here? I really > thought AFM might be well designed for exactly this kind of work at long > distance **and** high throughput ? so I must be missing something! > > > > -jc > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 58, Issue 12 ********************************************** From sfadden at us.ibm.com Wed Nov 9 18:24:15 2016 From: sfadden at us.ibm.com (Scott Fadden) Date: Wed, 9 Nov 2016 10:24:15 -0800 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances (Jan-Frode Myklebust) In-Reply-To: <5D327C63-84EC-4F59-86E7-158308E91013@uq.edu.au> References: <5D327C63-84EC-4F59-86E7-158308E91013@uq.edu.au> Message-ID: So you are using the NSD protocol for data transfers over multi-cluster? If so the TCP and thread tuning should help as well. Scott Fadden Spectrum Scale - Technical Marketing Phone: (503) 880-5833 sfadden at us.ibm.com http://www.ibm.com/systems/storage/spectrum/scale From: Jake Carroll To: "gpfsug-discuss at spectrumscale.org" Date: 11/09/2016 10:09 AM Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances (Jan-Frode Myklebust) Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi jf? >> Mostly curious, don't have experience in such environments, but ... Is this AFM over NFS or NSD protocol? Might be interesting to try the other option -- and also check how nsdperf performs over such distance/latency. As it turns out, it seems, very few people do. I will test nsdperf over it and see how it performs. And yes, it is AFM ? AFM. No NFS involved here! 
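To make the "thread tuning" part of that concrete, the knobs that usually come up for a link with this much bandwidth-delay product are the per-fileset flush threads and the parallel data transfer settings, roughly along these lines (a sketch only: parameter names as in the AFM tuning documentation, the file system and fileset names are placeholders, and the values are illustrative rather than recommendations -- check the documented defaults and units for your release):

mmchfileset cachefs afmcache01 -p afmNumFlushThreads=32
mmchconfig afmParallelWriteThreshold=1024 -i
mmchconfig afmParallelWriteChunkSize=134217728 -i
mmchconfig afmParallelReadThreshold=1024 -i
mmchconfig afmParallelReadChunkSize=134217728 -i

Parallel transfers only help once more than one gateway node is available to the fileset and the files are large enough to cross the thresholds, so this is something to measure rather than assume.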
-jc ------------------------------ Message: 2 Date: Wed, 9 Nov 2016 17:39:05 +0000 From: Jake Carroll To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances Message-ID: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0 at uq.edu.au> Content-Type: text/plain; charset="utf-8" Hi. I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a really long distance. About 180ms of latency between the two clusters and around 13,000km of optical path. Fortunately for me, I?ve actually got near theoretical maximum IO over the NIC?s between the clusters and I?m iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 all the way through. Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t really understand why that might be. I?ve verified the links and transports ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. I also verified the clusters on both sides in terms of disk IO and they both seem easily capable in IOZone and IOR tests of multiple GB/sec of throughput. So ? my questions: 1. Are there very specific tunings AFM needs for high latency/long distance IO? 2. Are there very specific NIC/TCP-stack tunings (beyond the type of thing we already have in place) that benefits AFM over really long distances and high latency? 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in the home mount. It sometimes takes 20 to 30 seconds before the command line will report back with a long listing of files. Any ideas why it?d take that long to get a response from ?home?. We?ve got our TCP stack setup fairly aggressively, on all hosts that participate in these two clusters. ethtool -C enp2s0f0 adaptive-rx off ifconfig enp2s0f0 txqueuelen 10000 sysctl -w net.core.rmem_max=536870912 sysctl -w net.core.wmem_max=536870912 sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" sysctl -w net.core.netdev_max_backlog=250000 sysctl -w net.ipv4.tcp_congestion_control=htcp sysctl -w net.ipv4.tcp_mtu_probing=1 I modified a couple of small things on the AFM ?cache? side to see if it?d make a difference such as: mmchconfig afmNumWriteThreads=4 mmchconfig afmNumReadThreads=4 But no difference so far. Thoughts would be appreciated. I?ve done this before over much shorter distances (30Km) and I?ve flattened a 10GbE wire without really tuning?anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement semantics going to hurt here? I really thought AFM might be well designed for exactly this kind of work at long distance *and* high throughput ? so I must be missing something! -jc -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20161109/d4f4d9a7/attachment-0001.html > ------------------------------ Message: 3 Date: Wed, 09 Nov 2016 18:05:21 +0000 From: Jan-Frode Myklebust To: "gpfsug-discuss at spectrumscale.org" Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances Message-ID: Content-Type: text/plain; charset="utf-8" Mostly curious, don't have experience in such environments, but ... Is this AFM over NFS or NSD protocol? Might be interesting to try the other option -- and also check how nsdperf performs over such distance/latency. -jf ons. 9. nov. 2016 kl. 18.39 skrev Jake Carroll : > Hi. 
> > > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a > really long distance. About 180ms of latency between the two clusters and > around 13,000km of optical path. Fortunately for me, I?ve actually got near > theoretical maximum IO over the NIC?s between the clusters and I?m > iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 > all the way through. > > > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t > really understand why that might be. I?ve verified the links and transports > ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. > > > > I also verified the clusters on both sides in terms of disk IO and they > both seem easily capable in IOZone and IOR tests of multiple GB/sec of > throughput. > > > > So ? my questions: > > > > 1. Are there very specific tunings AFM needs for high latency/long > distance IO? > > 2. Are there very specific NIC/TCP-stack tunings (beyond the type > of thing we already have in place) that benefits AFM over really long > distances and high latency? > > 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? > in the home mount. It sometimes takes 20 to 30 seconds before the command > line will report back with a long listing of files. Any ideas why it?d take > that long to get a response from ?home?. > > > > We?ve got our TCP stack setup fairly aggressively, on all hosts that > participate in these two clusters. > > > > ethtool -C enp2s0f0 adaptive-rx off > > ifconfig enp2s0f0 txqueuelen 10000 > > sysctl -w net.core.rmem_max=536870912 > > sysctl -w net.core.wmem_max=536870912 > > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > > sysctl -w net.core.netdev_max_backlog=250000 > > sysctl -w net.ipv4.tcp_congestion_control=htcp > > sysctl -w net.ipv4.tcp_mtu_probing=1 > > > > I modified a couple of small things on the AFM ?cache? side to see if it?d > make a difference such as: > > > > mmchconfig afmNumWriteThreads=4 > > mmchconfig afmNumReadThreads=4 > > > > But no difference so far. > > > > Thoughts would be appreciated. I?ve done this before over much shorter > distances (30Km) and I?ve flattened a 10GbE wire without really > tuning?anything. Are my large in-flight-packets > numbers/long-time-to-acknowledgement semantics going to hurt here? I really > thought AFM might be well designed for exactly this kind of work at long > distance **and** high throughput ? so I must be missing something! > > > > -jc > > > > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20161109/f44369ab/attachment.html > ------------------------------ _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss End of gpfsug-discuss Digest, Vol 58, Issue 12 ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jake.carroll at uq.edu.au Wed Nov 9 18:27:50 2016 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Wed, 9 Nov 2016 18:27:50 +0000 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long In-Reply-To: References: Message-ID: <88B892F7-75AA-4881-B1E3-DDC7500456CD@uq.edu.au> Scott, Nar, very much pure AFM to AFM here, hence we are a little surprised. Last time we did this over a longish link we almost caused an outage with the ease at which we attained throughput - but maybe there are some magic tolerances we are hitting in latency and in flight IO semantics that SS/GPFS/AFM is not well tweaked for (yet...)... Yes - we are catching up at SC. I think it's all been arranged? We are also talking to one of your resources about this AFM throughput behaviour this afternoon. John I believe his name is? Anyway - if you've got any ideas, am all ears! > > > Today's Topics: > > 1. Re: Tuning AFM for high throughput/high IO over _really_ long > distances (Scott Fadden) > 2. Re: Tuning AFM for high throughput/high IO over _really_ long > distances (Jan-Frode Myklebust) (Jake Carroll) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 9 Nov 2016 10:08:42 -0800 > From: "Scott Fadden" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances > Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > Jake, > > If AFM is using NFS it is all about NFS tuning. The copy from one side to > the other is basically just a client writing to an NFS mount. Thee are a > few things you can look at: > 1. NFS Transfer size (Make is 1MiB, I think that is the max) > 2. TCP Tuning for large window size. This is discussed on Tuning active > file management home communications in the docs. On this page you will > find some discussion on increasing gateway threads, and other things > similar that may help as well. > > We can discuss further as I understand we will be meeting at SC16. > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > From: Jake Carroll > To: "gpfsug-discuss at spectrumscale.org" > > Date: 11/09/2016 09:39 AM > Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi. > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a > really long distance. About 180ms of latency between the two clusters and > around 13,000km of optical path. Fortunately for me, I?ve actually got > near theoretical maximum IO over the NIC?s between the clusters and I?m > iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 > all the way through. > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t > really understand why that might be. I?ve verified the links and > transports ability as I said above with iPerf, and CERN?s FDT to near > 10Gbit/sec. > > I also verified the clusters on both sides in terms of disk IO and they > both seem easily capable in IOZone and IOR tests of multiple GB/sec of > throughput. > > So ? my questions: > > 1. Are there very specific tunings AFM needs for high latency/long > distance IO? > 2. Are there very specific NIC/TCP-stack tunings (beyond the type of > thing we already have in place) that benefits AFM over really long > distances and high latency? > 3. 
We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in > the home mount. It sometimes takes 20 to 30 seconds before the command > line will report back with a long listing of files. Any ideas why it?d > take that long to get a response from ?home?. > > We?ve got our TCP stack setup fairly aggressively, on all hosts that > participate in these two clusters. > > ethtool -C enp2s0f0 adaptive-rx off > ifconfig enp2s0f0 txqueuelen 10000 > sysctl -w net.core.rmem_max=536870912 > sysctl -w net.core.wmem_max=536870912 > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > sysctl -w net.core.netdev_max_backlog=250000 > sysctl -w net.ipv4.tcp_congestion_control=htcp > sysctl -w net.ipv4.tcp_mtu_probing=1 > > I modified a couple of small things on the AFM ?cache? side to see if it?d > make a difference such as: > > mmchconfig afmNumWriteThreads=4 > mmchconfig afmNumReadThreads=4 > > But no difference so far. > > Thoughts would be appreciated. I?ve done this before over much shorter > distances (30Km) and I?ve flattened a 10GbE wire without really > tuning?anything. Are my large in-flight-packets > numbers/long-time-to-acknowledgement semantics going to hurt here? I > really thought AFM might be well designed for exactly this kind of work at > long distance *and* high throughput ? so I must be missing something! > > -jc > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 2 > Date: Wed, 9 Nov 2016 18:09:14 +0000 > From: Jake Carroll > To: "gpfsug-discuss at spectrumscale.org" > > Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances (Jan-Frode Myklebust) > Message-ID: <5D327C63-84EC-4F59-86E7-158308E91013 at uq.edu.au> > Content-Type: text/plain; charset="utf-8" > > Hi jf? > > >>> Mostly curious, don't have experience in such environments, but ... Is this > AFM over NFS or NSD protocol? Might be interesting to try the other option > -- and also check how nsdperf performs over such distance/latency. > > As it turns out, it seems, very few people do. > > I will test nsdperf over it and see how it performs. And yes, it is AFM ? AFM. No NFS involved here! > > -jc > > > > ------------------------------ > > Message: 2 > Date: Wed, 9 Nov 2016 17:39:05 +0000 > From: Jake Carroll > To: "gpfsug-discuss at spectrumscale.org" > > Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over > _really_ long distances > Message-ID: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0 at uq.edu.au> > Content-Type: text/plain; charset="utf-8" > > Hi. > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a really long distance. About 180ms of latency between the two clusters and around 13,000km of optical path. Fortunately for me, I?ve actually got near theoretical maximum IO over the NIC?s between the clusters and I?m iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 all the way through. > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t really understand why that might be. I?ve verified the links and transports ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. 
> > I also verified the clusters on both sides in terms of disk IO and they both seem easily capable in IOZone and IOR tests of multiple GB/sec of throughput. > > So ? my questions: > > > 1. Are there very specific tunings AFM needs for high latency/long distance IO? > > 2. Are there very specific NIC/TCP-stack tunings (beyond the type of thing we already have in place) that benefits AFM over really long distances and high latency? > > 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in the home mount. It sometimes takes 20 to 30 seconds before the command line will report back with a long listing of files. Any ideas why it?d take that long to get a response from ?home?. > > We?ve got our TCP stack setup fairly aggressively, on all hosts that participate in these two clusters. > > ethtool -C enp2s0f0 adaptive-rx off > ifconfig enp2s0f0 txqueuelen 10000 > sysctl -w net.core.rmem_max=536870912 > sysctl -w net.core.wmem_max=536870912 > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > sysctl -w net.core.netdev_max_backlog=250000 > sysctl -w net.ipv4.tcp_congestion_control=htcp > sysctl -w net.ipv4.tcp_mtu_probing=1 > > I modified a couple of small things on the AFM ?cache? side to see if it?d make a difference such as: > > mmchconfig afmNumWriteThreads=4 > mmchconfig afmNumReadThreads=4 > > But no difference so far. > > Thoughts would be appreciated. I?ve done this before over much shorter distances (30Km) and I?ve flattened a 10GbE wire without really tuning?anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement semantics going to hurt here? I really thought AFM might be well designed for exactly this kind of work at long distance *and* high throughput ? so I must be missing something! > > -jc > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > Message: 3 > Date: Wed, 09 Nov 2016 18:05:21 +0000 > From: Jan-Frode Myklebust > To: "gpfsug-discuss at spectrumscale.org" > > Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > Mostly curious, don't have experience in such environments, but ... Is this > AFM over NFS or NSD protocol? Might be interesting to try the other option > -- and also check how nsdperf performs over such distance/latency. > > > > -jf >> ons. 9. nov. 2016 kl. 18.39 skrev Jake Carroll : >> >> Hi. >> >> >> >> I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a >> really long distance. About 180ms of latency between the two clusters and >> around 13,000km of optical path. Fortunately for me, I?ve actually got near >> theoretical maximum IO over the NIC?s between the clusters and I?m >> iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 >> all the way through. >> >> >> >> Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t >> really understand why that might be. I?ve verified the links and transports >> ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. >> >> >> >> I also verified the clusters on both sides in terms of disk IO and they >> both seem easily capable in IOZone and IOR tests of multiple GB/sec of >> throughput. >> >> >> >> So ? my questions: >> >> >> >> 1. Are there very specific tunings AFM needs for high latency/long >> distance IO? >> >> 2. 
Are there very specific NIC/TCP-stack tunings (beyond the type >> of thing we already have in place) that benefits AFM over really long >> distances and high latency? >> >> 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? >> in the home mount. It sometimes takes 20 to 30 seconds before the command >> line will report back with a long listing of files. Any ideas why it?d take >> that long to get a response from ?home?. >> >> >> >> We?ve got our TCP stack setup fairly aggressively, on all hosts that >> participate in these two clusters. >> >> >> >> ethtool -C enp2s0f0 adaptive-rx off >> >> ifconfig enp2s0f0 txqueuelen 10000 >> >> sysctl -w net.core.rmem_max=536870912 >> >> sysctl -w net.core.wmem_max=536870912 >> >> sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" >> >> sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" >> >> sysctl -w net.core.netdev_max_backlog=250000 >> >> sysctl -w net.ipv4.tcp_congestion_control=htcp >> >> sysctl -w net.ipv4.tcp_mtu_probing=1 >> >> >> >> I modified a couple of small things on the AFM ?cache? side to see if it?d >> make a difference such as: >> >> >> >> mmchconfig afmNumWriteThreads=4 >> >> mmchconfig afmNumReadThreads=4 >> >> >> >> But no difference so far. >> >> >> >> Thoughts would be appreciated. I?ve done this before over much shorter >> distances (30Km) and I?ve flattened a 10GbE wire without really >> tuning?anything. Are my large in-flight-packets >> numbers/long-time-to-acknowledgement semantics going to hurt here? I really >> thought AFM might be well designed for exactly this kind of work at long >> distance **and** high throughput ? so I must be missing something! >> >> >> >> -jc >> >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 58, Issue 12 > ********************************************** > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 58, Issue 13 > ********************************************** From olaf.weiser at de.ibm.com Wed Nov 9 20:53:11 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Wed, 9 Nov 2016 21:53:11 +0100 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances In-Reply-To: References: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> Message-ID: An HTML attachment was scrubbed... URL: From kdball at us.ibm.com Wed Nov 9 22:03:04 2016 From: kdball at us.ibm.com (Keith D Ball) Date: Wed, 9 Nov 2016 22:03:04 +0000 Subject: [gpfsug-discuss] LROC and Spectrum Scale Express In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From dhildeb at us.ibm.com Wed Nov 9 22:28:21 2016 From: dhildeb at us.ibm.com (Dean Hildebrand) Date: Wed, 9 Nov 2016 14:28:21 -0800 Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long In-Reply-To: <88B892F7-75AA-4881-B1E3-DDC7500456CD@uq.edu.au> References: <88B892F7-75AA-4881-B1E3-DDC7500456CD@uq.edu.au> Message-ID: Hi Jake, I would tackle this programmatically: a) Mount the remote FS directly from a GPFS client (using multi-cluster) and evaluate the performance using GPFS directly. The key factors affecting performance here will be - number of nsd servers at remote site (home site): The client creates a new TCP connection to each NSD server. The TCP connections will slowly expand their send/receive window as more data is read/written. The more TCP connections the better since it allows multiple windows to better fill the pipe. Note that TCP windows close quickly when no data is sent, and must resume tcp slow start on each benchmark run. - size of tcp buffers that gpfs is using: You want to be able to have the sum of the TCP windows fill the pipe. - type of workload you are running: Small files incur the full latency, heavy metadata incurs the full latency, but large files allow tcp slow start to expand the tcp window to the full size and fill the pipe. b) Once this is done, then move to AFM. Note that with writes the only way to evaluate performance is to monitor the network B/W on the GW. With reads, note that the GW writes data to storage before clients read it off the local disk. So you can either monitor read network B/W on the GW, or run your read tests directly on the GW (since then data is delivered to the application benchmark directly from the pagepool). Also note that AFM writes data in at most (by default) 1GB chunks. This can be increased with afmMaxWriteMergeLen, but be careful since if the network fails in the middle of the write to the home, it may need to restart from the beginning when the network connection is fixed. c) Using multiple GWs with AFM parallel I/O. This allows more available nodes the opportunity to create more TCP connections, all giving a better chance to fill the pipe. The additional TCP connections also mitigate the impact of packet loss or delay since it will only affect one of the connections. Playing with chunk size here can be useful. Dean From: Jake Carroll To: "gpfsug-discuss at spectrumscale.org" Date: 11/09/2016 10:28 AM Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long Sent by: gpfsug-discuss-bounces at spectrumscale.org Scott, Nar, very much pure AFM to AFM here, hence we are a little surprised. Last time we did this over a longish link we almost caused an outage with the ease at which we attained throughput - but maybe there are some magic tolerances we are hitting in latency and in flight IO semantics that SS/GPFS/AFM is not well tweaked for (yet...)... Yes - we are catching up at SC. I think it's all been arranged? We are also talking to one of your resources about this AFM throughput behaviour this afternoon. John I believe his name is? Anyway - if you've got any ideas, am all ears! > > > Today's Topics: > > 1. Re: Tuning AFM for high throughput/high IO over _really_ long > distances (Scott Fadden) > 2.
Re: Tuning AFM for high throughput/high IO over _really_ long > distances (Jan-Frode Myklebust) (Jake Carroll) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 9 Nov 2016 10:08:42 -0800 > From: "Scott Fadden" > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances > Message-ID: > > > Content-Type: text/plain; charset="utf-8" > > Jake, > > If AFM is using NFS it is all about NFS tuning. The copy from one side to > the other is basically just a client writing to an NFS mount. Thee are a > few things you can look at: > 1. NFS Transfer size (Make is 1MiB, I think that is the max) > 2. TCP Tuning for large window size. This is discussed on Tuning active > file management home communications in the docs. On this page you will > find some discussion on increasing gateway threads, and other things > similar that may help as well. > > We can discuss further as I understand we will be meeting at SC16. > > Scott Fadden > Spectrum Scale - Technical Marketing > Phone: (503) 880-5833 > sfadden at us.ibm.com > http://www.ibm.com/systems/storage/spectrum/scale > > > > From: Jake Carroll > To: "gpfsug-discuss at spectrumscale.org" > > Date: 11/09/2016 09:39 AM > Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances > Sent by: gpfsug-discuss-bounces at spectrumscale.org > > > > Hi. > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a > really long distance. About 180ms of latency between the two clusters and > around 13,000km of optical path. Fortunately for me, I?ve actually got > near theoretical maximum IO over the NIC?s between the clusters and I?m > iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 > all the way through. > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t > really understand why that might be. I?ve verified the links and > transports ability as I said above with iPerf, and CERN?s FDT to near > 10Gbit/sec. > > I also verified the clusters on both sides in terms of disk IO and they > both seem easily capable in IOZone and IOR tests of multiple GB/sec of > throughput. > > So ? my questions: > > 1. Are there very specific tunings AFM needs for high latency/long > distance IO? > 2. Are there very specific NIC/TCP-stack tunings (beyond the type of > thing we already have in place) that benefits AFM over really long > distances and high latency? > 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in > the home mount. It sometimes takes 20 to 30 seconds before the command > line will report back with a long listing of files. Any ideas why it?d > take that long to get a response from ?home?. > > We?ve got our TCP stack setup fairly aggressively, on all hosts that > participate in these two clusters. > > ethtool -C enp2s0f0 adaptive-rx off > ifconfig enp2s0f0 txqueuelen 10000 > sysctl -w net.core.rmem_max=536870912 > sysctl -w net.core.wmem_max=536870912 > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > sysctl -w net.core.netdev_max_backlog=250000 > sysctl -w net.ipv4.tcp_congestion_control=htcp > sysctl -w net.ipv4.tcp_mtu_probing=1 > > I modified a couple of small things on the AFM ?cache? side to see if it?d > make a difference such as: > > mmchconfig afmNumWriteThreads=4 > mmchconfig afmNumReadThreads=4 > > But no difference so far. > > Thoughts would be appreciated. 
I?ve done this before over much shorter > distances (30Km) and I?ve flattened a 10GbE wire without really > tuning?anything. Are my large in-flight-packets > numbers/long-time-to-acknowledgement semantics going to hurt here? I > really thought AFM might be well designed for exactly this kind of work at > long distance *and* high throughput ? so I must be missing something! > > -jc > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20161109/c775cf5a/attachment-0001.html > > > ------------------------------ > > Message: 2 > Date: Wed, 9 Nov 2016 18:09:14 +0000 > From: Jake Carroll > To: "gpfsug-discuss at spectrumscale.org" > > Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances (Jan-Frode Myklebust) > Message-ID: <5D327C63-84EC-4F59-86E7-158308E91013 at uq.edu.au> > Content-Type: text/plain; charset="utf-8" > > Hi jf? > > >>> Mostly curious, don't have experience in such environments, but ... Is this > AFM over NFS or NSD protocol? Might be interesting to try the other option > -- and also check how nsdperf performs over such distance/latency. > > As it turns out, it seems, very few people do. > > I will test nsdperf over it and see how it performs. And yes, it is AFM ? AFM. No NFS involved here! > > -jc > > > > ------------------------------ > > Message: 2 > Date: Wed, 9 Nov 2016 17:39:05 +0000 > From: Jake Carroll > To: "gpfsug-discuss at spectrumscale.org" > > Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over > _really_ long distances > Message-ID: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0 at uq.edu.au> > Content-Type: text/plain; charset="utf-8" > > Hi. > > I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a really long distance. About 180ms of latency between the two clusters and around 13,000km of optical path. Fortunately for me, I?ve actually got near theoretical maximum IO over the NIC?s between the clusters and I?m iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 all the way through. > > Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t really understand why that might be. I?ve verified the links and transports ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. > > I also verified the clusters on both sides in terms of disk IO and they both seem easily capable in IOZone and IOR tests of multiple GB/sec of throughput. > > So ? my questions: > > > 1. Are there very specific tunings AFM needs for high latency/long distance IO? > > 2. Are there very specific NIC/TCP-stack tunings (beyond the type of thing we already have in place) that benefits AFM over really long distances and high latency? > > 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? in the home mount. It sometimes takes 20 to 30 seconds before the command line will report back with a long listing of files. Any ideas why it?d take that long to get a response from ?home?. > > We?ve got our TCP stack setup fairly aggressively, on all hosts that participate in these two clusters. 
> > ethtool -C enp2s0f0 adaptive-rx off > ifconfig enp2s0f0 txqueuelen 10000 > sysctl -w net.core.rmem_max=536870912 > sysctl -w net.core.wmem_max=536870912 > sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" > sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" > sysctl -w net.core.netdev_max_backlog=250000 > sysctl -w net.ipv4.tcp_congestion_control=htcp > sysctl -w net.ipv4.tcp_mtu_probing=1 > > I modified a couple of small things on the AFM ?cache? side to see if it?d make a difference such as: > > mmchconfig afmNumWriteThreads=4 > mmchconfig afmNumReadThreads=4 > > But no difference so far. > > Thoughts would be appreciated. I?ve done this before over much shorter distances (30Km) and I?ve flattened a 10GbE wire without really tuning?anything. Are my large in-flight-packets numbers/long-time-to-acknowledgement semantics going to hurt here? I really thought AFM might be well designed for exactly this kind of work at long distance *and* high throughput ? so I must be missing something! > > -jc > > > > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20161109/d4f4d9a7/attachment-0001.html > > > ------------------------------ > > Message: 3 > Date: Wed, 09 Nov 2016 18:05:21 +0000 > From: Jan-Frode Myklebust > To: "gpfsug-discuss at spectrumscale.org" > > Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO > over _really_ long distances > Message-ID: > > Content-Type: text/plain; charset="utf-8" > > Mostly curious, don't have experience in such environments, but ... Is this > AFM over NFS or NSD protocol? Might be interesting to try the other option > -- and also check how nsdperf performs over such distance/latency. > > > > -jf >> ons. 9. nov. 2016 kl. 18.39 skrev Jake Carroll : >> >> Hi. >> >> >> >> I?ve got an GPFS to GPFS AFM cache/home (IW) relationship set up over a >> really long distance. About 180ms of latency between the two clusters and >> around 13,000km of optical path. Fortunately for me, I?ve actually got near >> theoretical maximum IO over the NIC?s between the clusters and I?m >> iPerf?ing at around 8.90 to 9.2Gbit/sec over a 10GbE circuit. All MTU9000 >> all the way through. >> >> >> >> Anyway ? I?m finding my AFM traffic to be dragging its feet and I don?t >> really understand why that might be. I?ve verified the links and transports >> ability as I said above with iPerf, and CERN?s FDT to near 10Gbit/sec. >> >> >> >> I also verified the clusters on both sides in terms of disk IO and they >> both seem easily capable in IOZone and IOR tests of multiple GB/sec of >> throughput. >> >> >> >> So ? my questions: >> >> >> >> 1. Are there very specific tunings AFM needs for high latency/long >> distance IO? >> >> 2. Are there very specific NIC/TCP-stack tunings (beyond the type >> of thing we already have in place) that benefits AFM over really long >> distances and high latency? >> >> 3. We are seeing on the ?cache? side really lazy/sticky ?ls ?als? >> in the home mount. It sometimes takes 20 to 30 seconds before the command >> line will report back with a long listing of files. Any ideas why it?d take >> that long to get a response from ?home?. >> >> >> >> We?ve got our TCP stack setup fairly aggressively, on all hosts that >> participate in these two clusters. 
>> >> >> >> ethtool -C enp2s0f0 adaptive-rx off >> >> ifconfig enp2s0f0 txqueuelen 10000 >> >> sysctl -w net.core.rmem_max=536870912 >> >> sysctl -w net.core.wmem_max=536870912 >> >> sysctl -w net.ipv4.tcp_rmem="4096 87380 268435456" >> >> sysctl -w net.ipv4.tcp_wmem="4096 65536 268435456" >> >> sysctl -w net.core.netdev_max_backlog=250000 >> >> sysctl -w net.ipv4.tcp_congestion_control=htcp >> >> sysctl -w net.ipv4.tcp_mtu_probing=1 >> >> >> >> I modified a couple of small things on the AFM ?cache? side to see if it?d >> make a difference such as: >> >> >> >> mmchconfig afmNumWriteThreads=4 >> >> mmchconfig afmNumReadThreads=4 >> >> >> >> But no difference so far. >> >> >> >> Thoughts would be appreciated. I?ve done this before over much shorter >> distances (30Km) and I?ve flattened a 10GbE wire without really >> tuning?anything. Are my large in-flight-packets >> numbers/long-time-to-acknowledgement semantics going to hurt here? I really >> thought AFM might be well designed for exactly this kind of work at long >> distance **and** high throughput ? so I must be missing something! >> >> >> >> -jc >> >> >> >> >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> > -------------- next part -------------- > An HTML attachment was scrubbed... > URL: < http://gpfsug.org/pipermail/gpfsug-discuss/attachments/20161109/f44369ab/attachment.html > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 58, Issue 12 > ********************************************** > > > > ------------------------------ > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > End of gpfsug-discuss Digest, Vol 58, Issue 13 > ********************************************** _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graycol.gif Type: image/gif Size: 105 bytes Desc: not available URL: From tortay at cc.in2p3.fr Thu Nov 10 06:38:35 2016 From: tortay at cc.in2p3.fr (Loic Tortay) Date: Thu, 10 Nov 2016 07:38:35 +0100 Subject: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances In-Reply-To: References: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> Message-ID: <582415EB.1030208@cc.in2p3.fr> On 09/11/2016 21:53, Olaf Weiser wrote: > let's say you have a RRT of 180 ms > what you then need is your theoretical link speed - let's say 10 Gbit/s ... > easily let's take 1 GB/s > > this means, you socket must be capable to take your bandwidth (data stream) > during the "first" 180ms because it will take at least this time to get back the > first ACKs .. . > so 1 GB / s x 0,180 s = 1024 MB/s x 0,180 s ==>> 185 MB this means, you have > to allow the operating system to accept socketsizes in that range... 
> > set something like this - but increase these values to 185 MB > sysctl -w net.ipv4.tcp_rmem="12194304 12194304 12194304" > sysctl -w net.ipv4.tcp_wmem="12194304 12194304 12194304" > sysctl -w net.ipv4.tcp_mem="12194304 12194304 12194304" > sysctl -w net.core.rmem_max=12194304 > sysctl -w net.core.wmem_max=12194304 > sysctl -w net.core.rmem_default=12194304 > sysctl -w net.core.wmem_default=12194304 > sysctl -w net.core.optmem_max=12194304 > Hello, In my opinion, some of these changes are, at best, misguided. For instance, the unit for "tcp_mem" is not bytes but pages. It's also not a parameter for buffers but a parameter influencing global kernel memory management for TCP (source: Linux kernel documentation/source). Or setting the maximum TCP ancillary data buffer size ("optmem_max") to a very large value when, as far as I know/saw when testing AFM w/ NFS, there is no ancillary data used. Setting the min, default and max to the same value for the buffers is also, in my opinion, highly debatable (do you really want, for instance, each and every SSH connection to have 185 MB TCP buffers? -- 185 MB being the value suggested above). I have seen the same suggestions in the AFM documentation, and in my opinion, along with the unhelpful "nfsPrefetchStrategy" recommendation ("it's critical: set it to at least 5 to 10", OK but how do I choose between 5 and 10, or should I use 42?, what's the unit?, what are the criteria?), these do not contribute to a good understanding of the configuration (let alone "optimization") required for AFM over NFS. I must add that, in my opinion, I have "enough" experience with setting these "sysctl" parameters of NFS "tuning" (so I'm not overwhelmed by the complexity or whatever), to think something is really not right in that part of the AFM documentation. Loïc. -- | Loïc Tortay - IN2P3 Computing Centre | -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 2931 bytes Desc: S/MIME Cryptographic Signature URL: From olaf.weiser at de.ibm.com Thu Nov 10 08:01:56 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Thu, 10 Nov 2016 09:01:56 +0100 Subject: Re: [gpfsug-discuss] Tuning AFM for high throughput/high IO over _really_ long distances In-Reply-To: <582415EB.1030208@cc.in2p3.fr> References: <83652C3D-0802-4CC2-B636-9FAA31EF5AF0@uq.edu.au> <582415EB.1030208@cc.in2p3.fr> Message-ID: An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Thu Nov 10 09:52:10 2016 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Thu, 10 Nov 2016 09:52:10 +0000 Subject: [gpfsug-discuss] AFM Licensing Message-ID: Hi All, I have a tantalisingly interesting question about licensing... When installing a couple of AFM gateway nodes into a cluster for data migration, where the AFM filesets will only ever be local-updates, those nodes should just require a client license, right? No GPFS data will leave through those nodes, so I can't see any valid argument for them being server licensed. Anyone want to disagree? Cheers, Luke. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Thu Nov 10 12:07:25 2016 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 10 Nov 2016 12:07:25 +0000 Subject: Re: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
Name: Image.14787785499920.png Type: image/png Size: 30777 bytes Desc: not available URL: From ulmer at ulmer.org Thu Nov 10 13:10:06 2016 From: ulmer at ulmer.org (Stephen Ulmer) Date: Thu, 10 Nov 2016 08:10:06 -0500 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: <6F30DFAF-1BD5-48D0-855E-FB9A3187AD0A@ulmer.org> The table you included was about Editions, not License types. -- Stephen > On Nov 10, 2016, at 7:07 AM, Andrew Beattie wrote: > > I think you will find that AFM in any flavor is a function of the Server license, not a client license. > > i've always found this to be a pretty good guide, although you now need to add Transparent Cloud Tiering into the bottom column > > > > > > Andrew Beattie > Software Defined Storage - IT Specialist > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > > ----- Original message ----- > From: Luke Raimbach > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 8:22 PM > > HI All, > > I have a tantalisingly interesting question about licensing... > > When installing a couple of AFM gateway nodes into a cluster for data migration, where the AFM filesets will only ever be local-updates, those nodes should just require a client license, right? No GPFS data will leave through those nodes, so I can't see any valid argument for them being server licensed. > > Anyone want to disagree? > > Cheers, > Luke. > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From luke.raimbach at googlemail.com Thu Nov 10 14:11:57 2016 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Thu, 10 Nov 2016 14:11:57 +0000 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: Thanks for the feature matrix, but it doesn't really say anything about client / server licenses. Surely you can have clients and servers in all three flavours - Express, Standard and Advanced. On Thu, 10 Nov 2016 at 12:07 Andrew Beattie wrote: > I think you will find that AFM in any flavor is a function of the Server > license, not a client license. > > i've always found this to be a pretty good guide, although you now need to > add Transparent Cloud Tiering into the bottom column > > > > > Andrew Beattie > Software Defined Storage - IT Specialist > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > > > ----- Original message ----- > From: Luke Raimbach > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 8:22 PM > > HI All, > > I have a tantalisingly interesting question about licensing... > > When installing a couple of AFM gateway nodes into a cluster for data > migration, where the AFM filesets will only ever be local-updates, those > nodes should just require a client license, right? No GPFS data will leave > through those nodes, so I can't see any valid argument for them being > server licensed. > > Anyone want to disagree? > > Cheers, > Luke. 
> > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787785499920.png Type: image/png Size: 30777 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787785499920.png Type: image/png Size: 30777 bytes Desc: not available URL: From kevindjo at us.ibm.com Thu Nov 10 14:20:23 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Thu, 10 Nov 2016 14:20:23 +0000 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787856423282.png Type: image/png Size: 30777 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787856423283.png Type: image/png Size: 30777 bytes Desc: not available URL: From luke.raimbach at googlemail.com Thu Nov 10 14:37:02 2016 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Thu, 10 Nov 2016 14:37:02 +0000 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: Hi Kevin, Thanks for the response, but that page is still not helpful. We will not be exporting any data from the GPFS cluster through the AFM gateways. Data will be coming from external NFS data sources, through the gateway nodes INTO the GPFS file systems. Reading that licensing page suggests a client license is acceptable in this situation. There is no mention of AFM explicitly as a function of the server license. Cheers, Luke. On Thu, 10 Nov 2016 at 14:20 Kevin D Johnson wrote: > An AFM gateway node would definitely be a server licensed node. Here are > the working definitions, and yes, this would be true for the various > editions of IBM Spectrum Scale: > > > http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_gpfslicensedesignation.htm > > Kevin D. Johnson, MBA, MAFM > Spectrum Computing, Senior Managing Consultant > > IBM Certified Deployment Professional - Spectrum Scale V4.1.1 > IBM Certified Deployment Professional - Cloud Object Storage V3.8 > IBM Certified Solution Advisor - Spectrum Computing V1 > > 720.349.6199 - kevindjo at us.ibm.com > > > > > ----- Original message ----- > From: Luke Raimbach > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > > Subject: Re: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 9:12 AM > > Thanks for the feature matrix, but it doesn't really say anything about > client / server licenses. Surely you can have clients and servers in all > three flavours - Express, Standard and Advanced. > > On Thu, 10 Nov 2016 at 12:07 Andrew Beattie wrote: > > I think you will find that AFM in any flavor is a function of the Server > license, not a client license. 
> > i've always found this to be a pretty good guide, although you now need to > add Transparent Cloud Tiering into the bottom column > > > > > Andrew Beattie > Software Defined Storage - IT Specialist > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > > > > > ----- Original message ----- > From: Luke Raimbach > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 8:22 PM > > HI All, > > I have a tantalisingly interesting question about licensing... > > When installing a couple of AFM gateway nodes into a cluster for data > migration, where the AFM filesets will only ever be local-updates, those > nodes should just require a client license, right? No GPFS data will leave > through those nodes, so I can't see any valid argument for them being > server licensed. > > Anyone want to disagree? > > Cheers, > Luke. > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > [image: Image.14787785499920.png][image: Image.14787785499920.png] > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787856423282.png Type: image/png Size: 30777 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787856423283.png Type: image/png Size: 30777 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.14787856423283.png Type: image/png Size: 30777 bytes Desc: not available URL: From gcorneau at us.ibm.com Thu Nov 10 15:02:55 2016 From: gcorneau at us.ibm.com (Glen Corneau) Date: Thu, 10 Nov 2016 09:02:55 -0600 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: The FAQ item does list "sharing data via NFS" as a Server license function (which is what the gateway node does): The IBM Spectrum Scale Server license permits the licensed virtual server to perform IBM Spectrum Scale management functions such as cluster configuration manager, quorum node, manager node, and Network Shared Disk (NSD) server. In addition, the IBM Spectrum Scale Server license permits the licensed virtual server to share IBM Spectrum Scale data directly through any application, service protocol or method such as Network File System (NFS), Common Internet File System (CIFS), File Transfer Protocol (FTP), Hypertext Transfer Protocol (HTTP), or OpenStack Swift. 
http://www.ibm.com/support/knowledgecenter/en/SSFKCN/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfsclustersfaq.html?view=kc#lic41 ------------------ Glen Corneau Washington Systems Center - Power Systems gcorneau at us.ibm.com From: Luke Raimbach To: gpfsug main discussion list Date: 11/10/2016 08:37 AM Subject: Re: [gpfsug-discuss] AFM Licensing Sent by: gpfsug-discuss-bounces at spectrumscale.org Hi Kevin, Thanks for the response, but that page is still not helpful. We will not be exporting any data from the GPFS cluster through the AFM gateways. Data will be coming from external NFS data sources, through the gateway nodes INTO the GPFS file systems. Reading that licensing page suggests a client license is acceptable in this situation. There is no mention of AFM explicitly as a function of the server license. Cheers, Luke. On Thu, 10 Nov 2016 at 14:20 Kevin D Johnson wrote: An AFM gateway node would definitely be a server licensed node. Here are the working definitions, and yes, this would be true for the various editions of IBM Spectrum Scale: http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_gpfslicensedesignation.htm Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage V3.8 IBM Certified Solution Advisor - Spectrum Computing V1 720.349.6199 - kevindjo at us.ibm.com ----- Original message ----- From: Luke Raimbach Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] AFM Licensing Date: Thu, Nov 10, 2016 9:12 AM Thanks for the feature matrix, but it doesn't really say anything about client / server licenses. Surely you can have clients and servers in all three flavours - Express, Standard and Advanced. On Thu, 10 Nov 2016 at 12:07 Andrew Beattie wrote: I think you will find that AFM in any flavor is a function of the Server license, not a client license. i've always found this to be a pretty good guide, although you now need to add Transparent Cloud Tiering into the bottom column Andrew Beattie Software Defined Storage - IT Specialist Phone: 614-2133-7927 E-mail: abeattie at au1.ibm.com ----- Original message ----- From: Luke Raimbach Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: [gpfsug-discuss] AFM Licensing Date: Thu, Nov 10, 2016 8:22 PM HI All, I have a tantalisingly interesting question about licensing... When installing a couple of AFM gateway nodes into a cluster for data migration, where the AFM filesets will only ever be local-updates, those nodes should just require a client license, right? No GPFS data will leave through those nodes, so I can't see any valid argument for them being server licensed. Anyone want to disagree? Cheers, Luke. 
_______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss[attachment "Image.14787856423282.png" deleted by Glen Corneau/Austin/IBM] [attachment "Image.14787856423283.png" deleted by Glen Corneau/Austin/IBM] [attachment "Image.14787856423283.png" deleted by Glen Corneau/Austin/IBM] _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 26117 bytes Desc: not available URL: From kevindjo at us.ibm.com Thu Nov 10 15:11:34 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Thu, 10 Nov 2016 15:11:34 +0000 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.147878564232819.png Type: image/png Size: 30777 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.147878564232820.png Type: image/png Size: 30777 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Image.147878564232821.png Type: image/png Size: 30777 bytes Desc: not available URL: From luke.raimbach at googlemail.com Thu Nov 10 15:17:13 2016 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Thu, 10 Nov 2016 15:17:13 +0000 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: The gateway nodes will be mounting an external NFS server as a *client*. There will be NO NFS exports from these two AFM nodes. AFM Local Update filesets will cache the remote NFS exported file systems (pretend they are ReiserFS not GPFS to make things easier). On Thu, 10 Nov 2016 at 15:07 Glen Corneau wrote: > The FAQ item does list "sharing data via NFS" as a Server license function > (which is what the gateway node does): > > The IBM Spectrum Scale Server license permits the licensed virtual server > to perform IBM Spectrum Scale management functions such as cluster > configuration manager, quorum node, manager node, and Network Shared Disk > (NSD) server. In addition, the IBM Spectrum Scale Server license permits > the licensed virtual server to *share IBM Spectrum Scale data*directly > through any application, service protocol or method s*uch as Network File > System (NFS)*, Common Internet File System (CIFS), File Transfer Protocol > (FTP), Hypertext Transfer Protocol (HTTP), or OpenStack Swift. 
> > > http://www.ibm.com/support/knowledgecenter/en/SSFKCN/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfsclustersfaq.html?view=kc#lic41 > > ------------------ > Glen Corneau > Washington Systems Center - Power Systems > gcorneau at us.ibm.com > > > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/10/2016 08:37 AM > Subject: Re: [gpfsug-discuss] AFM Licensing > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Kevin, > > Thanks for the response, but that page is still not helpful. > > We will not be exporting any data from the GPFS cluster through the AFM > gateways. Data will be coming from external NFS data sources, through the > gateway nodes INTO the GPFS file systems. > > Reading that licensing page suggests a client license is acceptable in > this situation. There is no mention of AFM explicitly as a function of the > server license. > > Cheers, > Luke. > > On Thu, 10 Nov 2016 at 14:20 Kevin D Johnson <*kevindjo at us.ibm.com* > > wrote: > An AFM gateway node would definitely be a server licensed node. Here are > the working definitions, and yes, this would be true for the various > editions of IBM Spectrum Scale: > > > *http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_gpfslicensedesignation.htm* > > > *Kevin D. Johnson, MBA, MAFM* > > *Spectrum Computing, Senior Managing Consultant* > > *IBM Certified Deployment Professional - Spectrum Scale V4.1.1IBM > Certified Deployment Professional - Cloud Object Storage V3.8* > *IBM Certified Solution Advisor - Spectrum Computing V1* > > *720.349.6199 - **kevindjo at us.ibm.com* > > > > ----- Original message ----- > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 9:12 AM > > Thanks for the feature matrix, but it doesn't really say anything about > client / server licenses. Surely you can have clients and servers in all > three flavours - Express, Standard and Advanced. > > On Thu, 10 Nov 2016 at 12:07 Andrew Beattie <*abeattie at au1.ibm.com* > > wrote: > I think you will find that AFM in any flavor is a function of the Server > license, not a client license. > > i've always found this to be a pretty good guide, although you now need to > add Transparent Cloud Tiering into the bottom column > > > > > > *Andrew Beattie* > *Software Defined Storage - IT Specialist* > *Phone: *614-2133-7927 > *E-mail: **abeattie at au1.ibm.com* > > > > ----- Original message ----- > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 8:22 PM > > HI All, > > I have a tantalisingly interesting question about licensing... > > When installing a couple of AFM gateway nodes into a cluster for data > migration, where the AFM filesets will only ever be local-updates, those > nodes should just require a client license, right? No GPFS data will leave > through those nodes, so I can't see any valid argument for them being > server licensed. > > Anyone want to disagree? > > Cheers, > Luke. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > *[attachment > "Image.14787856423282.png" deleted by Glen Corneau/Austin/IBM] [attachment > "Image.14787856423283.png" deleted by Glen Corneau/Austin/IBM] [attachment > "Image.14787856423283.png" deleted by Glen Corneau/Austin/IBM] * > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 26117 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 26117 bytes Desc: not available URL: From luke.raimbach at googlemail.com Thu Nov 10 15:55:33 2016 From: luke.raimbach at googlemail.com (Luke Raimbach) Date: Thu, 10 Nov 2016 15:55:33 +0000 Subject: [gpfsug-discuss] AFM Licensing In-Reply-To: References: Message-ID: Thanks! That's what I was looking for. Cheers, Luke. On Thu, 10 Nov 2016 at 15:17 Luke Raimbach wrote: > The gateway nodes will be mounting an external NFS server as a *client*. > There will be NO NFS exports from these two AFM nodes. > > AFM Local Update filesets will cache the remote NFS exported file systems > (pretend they are ReiserFS not GPFS to make things easier). > > > > On Thu, 10 Nov 2016 at 15:07 Glen Corneau wrote: > > The FAQ item does list "sharing data via NFS" as a Server license function > (which is what the gateway node does): > > The IBM Spectrum Scale Server license permits the licensed virtual server > to perform IBM Spectrum Scale management functions such as cluster > configuration manager, quorum node, manager node, and Network Shared Disk > (NSD) server. In addition, the IBM Spectrum Scale Server license permits > the licensed virtual server to *share IBM Spectrum Scale data*directly > through any application, service protocol or method s*uch as Network File > System (NFS)*, Common Internet File System (CIFS), File Transfer Protocol > (FTP), Hypertext Transfer Protocol (HTTP), or OpenStack Swift. 
> > > http://www.ibm.com/support/knowledgecenter/en/SSFKCN/com.ibm.cluster.gpfs.doc/gpfs_faqs/gpfsclustersfaq.html?view=kc#lic41 > > ------------------ > Glen Corneau > Washington Systems Center - Power Systems > gcorneau at us.ibm.com > > > > > > From: Luke Raimbach > To: gpfsug main discussion list > Date: 11/10/2016 08:37 AM > Subject: Re: [gpfsug-discuss] AFM Licensing > Sent by: gpfsug-discuss-bounces at spectrumscale.org > ------------------------------ > > > > Hi Kevin, > > Thanks for the response, but that page is still not helpful. > > We will not be exporting any data from the GPFS cluster through the AFM > gateways. Data will be coming from external NFS data sources, through the > gateway nodes INTO the GPFS file systems. > > Reading that licensing page suggests a client license is acceptable in > this situation. There is no mention of AFM explicitly as a function of the > server license. > > Cheers, > Luke. > > On Thu, 10 Nov 2016 at 14:20 Kevin D Johnson <*kevindjo at us.ibm.com* > > wrote: > An AFM gateway node would definitely be a server licensed node. Here are > the working definitions, and yes, this would be true for the various > editions of IBM Spectrum Scale: > > > *http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.0/com.ibm.spectrum.scale.v4r2.ins.doc/bl1ins_gpfslicensedesignation.htm* > > > *Kevin D. Johnson, MBA, MAFM* > > *Spectrum Computing, Senior Managing Consultant* > > *IBM Certified Deployment Professional - Spectrum Scale V4.1.1IBM > Certified Deployment Professional - Cloud Object Storage V3.8* > *IBM Certified Solution Advisor - Spectrum Computing V1* > > *720.349.6199 - **kevindjo at us.ibm.com* > > > > ----- Original message ----- > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: Re: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 9:12 AM > > Thanks for the feature matrix, but it doesn't really say anything about > client / server licenses. Surely you can have clients and servers in all > three flavours - Express, Standard and Advanced. > > On Thu, 10 Nov 2016 at 12:07 Andrew Beattie <*abeattie at au1.ibm.com* > > wrote: > I think you will find that AFM in any flavor is a function of the Server > license, not a client license. > > i've always found this to be a pretty good guide, although you now need to > add Transparent Cloud Tiering into the bottom column > > > > > > *Andrew Beattie* > *Software Defined Storage - IT Specialist* > *Phone: *614-2133-7927 > *E-mail: **abeattie at au1.ibm.com* > > > > ----- Original message ----- > From: Luke Raimbach <*luke.raimbach at googlemail.com* > > > Sent by: *gpfsug-discuss-bounces at spectrumscale.org* > > To: gpfsug main discussion list <*gpfsug-discuss at spectrumscale.org* > > > Cc: > Subject: [gpfsug-discuss] AFM Licensing > Date: Thu, Nov 10, 2016 8:22 PM > > HI All, > > I have a tantalisingly interesting question about licensing... > > When installing a couple of AFM gateway nodes into a cluster for data > migration, where the AFM filesets will only ever be local-updates, those > nodes should just require a client license, right? No GPFS data will leave > through those nodes, so I can't see any valid argument for them being > server licensed. > > Anyone want to disagree? > > Cheers, > Luke. 
> _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at *spectrumscale.org* > *http://gpfsug.org/mailman/listinfo/gpfsug-discuss* > *[attachment > "Image.14787856423282.png" deleted by Glen Corneau/Austin/IBM] [attachment > "Image.14787856423283.png" deleted by Glen Corneau/Austin/IBM] [attachment > "Image.14787856423283.png" deleted by Glen Corneau/Austin/IBM] * > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Thu Nov 10 19:14:57 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Thu, 10 Nov 2016 19:14:57 +0000 Subject: [gpfsug-discuss] Local Read-Only cache: Undocumented Config options Message-ID: Can anyone tell me what the highlighted options are? Some of them look "interesting". lrocChecksum 0 lrocData 1 lrocDataMaxBufferSize 0 ! lrocDataMaxFileSize -1 ! lrocDataStubFileSize -1 lrocDeviceMaxSectorsKB 64 lrocDeviceNrRequests 1024 lrocDeviceQueueDepth 31 lrocDevices 0A1E183A5824A9ED#/dev/sdb; lrocDeviceScheduler deadline lrocDeviceSetParams 1 lrocDirectories 1 lrocInodes 1 Bob Oesterlin Sr Principal Storage Engineer, Nuance 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Fri Nov 11 08:50:00 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Fri, 11 Nov 2016 08:50:00 +0000 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: References: , Message-ID: That?s worked, thanks Andreas. Question: when I upgrade to the new PTF when it?s available, can I install it first on just the GUI node (which happens to be the Quorum server for the cluster) and the fixes will go in, or do I need to deploy the new pmsensors packages? From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Andreas Koeninger Sent: 08 November 2016 16:50 To: gpfsug-discuss at spectrumscale.org Subject: Re: [gpfsug-discuss] How to clear stale entries in GUI log Hello Richard, without the PTF (which is not yet available) you will have to manually clear the GUI database as well by running the following command on the GUI node: psql postgres postgres -c "delete from fscc.gss_state where sensor like 'H\_%';" This will remove all events from the GUI database coming from mmhealth. 
To repopulate this table with all the currently reported events from mmhealth please run: /usr/lpp/mmfs/gui/cli/runtask HEALTH_STATES Let me know if that helps, Andreas Koeninger Spectrum Scale GUI Development ----- Original message ----- From: "Sobey, Richard A" Sent by: gpfsug-discuss-bounces at spectrumscale.org To: gpfsug main discussion list Cc: Subject: Re: [gpfsug-discuss] How to clear stale entries in GUI log Date: Tue, Nov 8, 2016 4:10 PM Thanks. I?ve run that on, I assume, our quorum server where this disk is mounted, but the error is still showing up. The event itself doesn?t say which node is affected. ICSAN_GPFS_FSD_QUORUM nsd 512 103 no no ready up system That looks ok to me. Maybe I misunderstood your line ?This is a per node database, so you need to run this on all the nodes which have stale entries.?. Should I just run it on all the nodes in the cluster instead? there?s not many so won?t take long but wondering if that?s really necessary? Thanks Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Markus Rohwedder Sent: 08 November 2016 14:51 To: gpfsug-discuss at spectrumscale.org Subject: [gpfsug-discuss] How to clear stale entries in GUI log Hello, you ran into a defect which is fixed with the upcoming 4.2.1.2 PTF Here is a workaround: You can clear the eventlog of the system health component using mmsysmonc clearDB This is a per node database, so you need to run this on all the nodes which have stale entries. It will clear all the events on this node, if you want to save them run: mmhealth node eventlog > log.save On the GUI node, run systemctl restart gpfsgui afterwards. The mmhealth command suppresses events during startup. So in case a bad condition turns OK during a restart phase, the bad event will remain stale. Regards, Markus Rohwedder IBM Spectrum Scale GUI development _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy_parker1 at uk.ibm.com Fri Nov 11 16:20:52 2016 From: andy_parker1 at uk.ibm.com (Andy Parker1) Date: Fri, 11 Nov 2016 16:20:52 +0000 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Message-ID: We have setup a small cluster to test, play & learn about the protocol servers. We have setup mmuserauth for AD + RFC2307 and we can share and access data via SMB and access is on windows clients with no issues. The file DAC of a file created via windows looks like this from the SS cesNode: $ ls -l total 0 -rwxr--r-- 1 SPECTRUMSCALE\newmanjo SPECTRUMSCALE\ces-admins 33 Nov 10 17:29 helloworld.txt The NFS protocol is also exported for NFS 3,4 and when mount using NFS version '3' from an AIX 7.1 server I see also OK DAC names uid / group, so the UID mapping is working. The AIX is linked to the AD for LDAP account services and I can query accounts and get shell logon for accounts defined within AD for unix services. # ls -l ( from AIX client NFS V3) total 0 -rwxr--r-- 1 newmanjo ces-admi 33 10 Nov 17:29 helloworld.txt Now the Problem: When I mount the AIX client as NFS4 I do no see the user/group names. I know NFS4 passes names and not UID/GID numbers so I guess this is linked. 
# pwd /mnt/ibm/hurss/share1 # ls -l ( from AIX client NFS V4) total 0 -rwxr--r-- 1 nobody nobody 33 10 Nov 17:29 helloworld.txt On the AIX server I have set NFS domain to virtual1.com # chnfsdom Current local domain: virtual1.com This matches the DOMAIN from the mmnfs config list domain ( not 100% sure this is correct) [root at hurss4 ~]# mmnfs config list NFS Ganesha Configuration: ========================== NFS_PROTOCOLS: 3,4 NFS_PORT: 2049 MNT_PORT: 0 NLM_PORT: 0 RQUOTA_PORT: 0 SHORT_FILE_HANDLE: FALSE LEASE_LIFETIME: 60 DOMAINNAME: VIRTUAL1.COM DELEGATIONS: Disabled Also the 'nfsrgyd' a name translation service for NFS servers and clients is running. lssrc -s nfsrgyd Subsystem Group PID Status nfsrgyd nfs 8585412 active Summary / Question: Can anybody explain why I do not see userID / Group names when viewing via a NFS4 client and ideally how to fix this. Rgds Andy P Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From Valdis.Kletnieks at vt.edu Sat Nov 12 20:23:42 2016 From: Valdis.Kletnieks at vt.edu (Valdis.Kletnieks at vt.edu) Date: Sat, 12 Nov 2016 15:23:42 -0500 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: References: , Message-ID: <157403.1478982222@turing-police.cc.vt.edu> On Fri, 11 Nov 2016 08:50:00 +0000, "Sobey, Richard A" said: > Question: when I upgrade to the new PTF when it???s available, can I install it > first on just the GUI node (which happens to be the Quorum server for the > cluster) *the* quorum server, not "one of the quorum nodes"? Best practice is to have enough nodes designated as quorum nodes so even if one of them is taken down for upgrade or maintenance, the cluster as a whole remains up and serving data. That way, you can do rolling installs of patches without taking an outage. The number to pick depends on your config - we have one cluster with 4 NSD servers, where we've defined all 4 as quorum nodes. That way, as long as 3 of them (half plus 1) are up, the cluster stays up. We have another stretch cluster with 10 servers (5 at each node), and we defined 3 quorum nodes at our main site, and 2 at the remote site, specifically so that if we did lose the 10G link between sites, the main site would retain quorum and stay up. (Losing the remote site is, in our setup, *much* less critical than ensuring the main site stays up. We replicate between the two, and if the remote is down, and thus falls behind, mmrestripefs is available for cleaning up) -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 484 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Sat Nov 12 20:39:07 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Sat, 12 Nov 2016 20:39:07 +0000 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: <157403.1478982222@turing-police.cc.vt.edu> References: , <157403.1478982222@turing-police.cc.vt.edu> Message-ID: Sorry... one of the quorum nodes. 
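As a generic aside on the quorum-node practice described above, a minimal sketch of how one might review and adjust the quorum designation with standard GPFS administration commands (the node name below is a placeholder, not a node from any cluster in this thread):

# Show which nodes currently carry the quorum designation
mmlscluster | grep -i quorum

# Confirm the cluster still has quorum before taking a quorum node down for patching
mmgetstate -a -L

# Give the quorum designation to an additional node, so that one quorum
# node being down for maintenance no longer threatens cluster quorum
mmchnode --quorum -N nsdserver4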
-----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Valdis.Kletnieks at vt.edu Sent: 12 November 2016 20:24 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] How to clear stale entries in GUI log On Fri, 11 Nov 2016 08:50:00 +0000, "Sobey, Richard A" said: > Question: when I upgrade to the new PTF when it???s available, can I > install it first on just the GUI node (which happens to be the Quorum > server for the > cluster) *the* quorum server, not "one of the quorum nodes"? Best practice is to have enough nodes designated as quorum nodes so even if one of them is taken down for upgrade or maintenance, the cluster as a whole remains up and serving data. That way, you can do rolling installs of patches without taking an outage. The number to pick depends on your config - we have one cluster with 4 NSD servers, where we've defined all 4 as quorum nodes. That way, as long as 3 of them (half plus 1) are up, the cluster stays up. We have another stretch cluster with 10 servers (5 at each node), and we defined 3 quorum nodes at our main site, and 2 at the remote site, specifically so that if we did lose the 10G link between sites, the main site would retain quorum and stay up. (Losing the remote site is, in our setup, *much* less critical than ensuring the main site stays up. We replicate between the two, and if the remote is down, and thus falls behind, mmrestripefs is available for cleaning up) From laurence at qsplace.co.uk Sat Nov 12 20:53:37 2016 From: laurence at qsplace.co.uk (Laurence Horrocks-Barlow) Date: Sat, 12 Nov 2016 20:53:37 +0000 Subject: [gpfsug-discuss] How to clear stale entries in GUI log In-Reply-To: References: <157403.1478982222@turing-police.cc.vt.edu> Message-ID: The Quorum buster node :P -- Lauz On 12/11/2016 20:39, Sobey, Richard A wrote: > Sorry... one of the quorum nodes. > > -----Original Message----- > From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Valdis.Kletnieks at vt.edu > Sent: 12 November 2016 20:24 > To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] How to clear stale entries in GUI log > > On Fri, 11 Nov 2016 08:50:00 +0000, "Sobey, Richard A" said: >> Question: when I upgrade to the new PTF when it???s available, can I >> install it first on just the GUI node (which happens to be the Quorum >> server for the >> cluster) > *the* quorum server, not "one of the quorum nodes"? > > Best practice is to have enough nodes designated as quorum nodes so even if one of them is taken down for upgrade or maintenance, the cluster as a whole remains up and serving data. That way, you can do rolling installs of patches without taking an outage. > > The number to pick depends on your config - we have one cluster with 4 NSD servers, where we've defined all 4 as quorum nodes. That way, as long as 3 of them (half plus 1) are up, the cluster stays up. We have another stretch cluster with 10 servers (5 at each node), and we defined 3 quorum nodes at our main site, and 2 at the remote site, specifically so that if we did lose the 10G link between sites, the main site would retain quorum and stay up. > (Losing the remote site is, in our setup, *much* less critical than ensuring the main site stays up. 
We replicate between the two, and if the remote is down, and thus falls behind, mmrestripefs is available for cleaning up) > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus From jake.carroll at uq.edu.au Sun Nov 13 14:18:38 2016 From: jake.carroll at uq.edu.au (Jake Carroll) Date: Sun, 13 Nov 2016 14:18:38 +0000 Subject: [gpfsug-discuss] Achieving high parallelism with AFM using NFS? Message-ID: <025F8914-F7A0-465F-9B99-961F70DA2B03@uq.edu.au> Hi all. After some help from IBM, we?ve concluded (and been told) that AFM over the NSD protocol when latency is greater than around 50ms on the RTT is effectively unusable. We?ve proven that now, so it is time to move on from the NSD protocol being an effective option in those conditions (unless IBM can consider it something worthy of an RFE and can fix it!). The problem we face now, is one of parallelism and filling that 10GbE/40GbE/100GbE pipe efficiently, when using NFS as the transport provider for AFM. On my test cluster at ?Cache? side I?ve got two or three gateways: [root at mc-5 ~]# mmlscluster GPFS cluster information ======================== GPFS cluster name: sdx-gpfs.xxxxxxxxxxxxxxxx GPFS cluster id: 12880500218013865782 GPFS UID domain: sdx-gpfs. xxxxxxxxxxxxxxxx Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation --------------------------------------------------------------------------------------- 1 mc-5. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-5.hidden.net quorum-manager 2 mc-6. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-6. hidden.net quorum-manager-gateway 3 mc-7. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-7. hidden.net quorum-manager-gateway 4 mc-8. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-8. hidden.net quorum-manager-gateway The bit I really don?t get is: 1. Why no traffic ever seems to go through mc-6 or mc-8 back to my ?home? directly and 2. Why it only ever lists my AFM-cache fileset being associated with one gateway (mc-7). I can see traffic flowing through mc-6 sometimes?but when it does, it all seems to channel back through mc-7 THEN back to the AFM-home. Am I missing something? This is where I see one of the gateway?s listed (but never the others?). [root at mc-5 ~]# mmafmctl afmcachefs getstate Fileset Name Fileset Target Cache State Gateway Node Queue Length Queue numExec ------------ -------------- ------------- ------------ ------------ ------------- afm-home nfs://omnipath2/gpfs-flash/afm-home Active mc-7 0 746636 I got told I needed to setup ?explicit maps? back to my home cluster to achieve parallelism: [root at mc-5 ~]# mmafmconfig show Map name: omnipath1 Export server map: address.is.hidden.100/mc-6.ip.address.hidden Map name: omnipath2 Export server map: address.is.hidden.101/mc-7.ip.address.hidden But ? I have never seen any traffic come back from mc-6 to omnipath1. What am I missing, and how do I actually achieve significant enough parallelism over an NFS transport to fill my 10GbE pipe? I?ve seen maybe a couple of gigabits per second from the mc-7 host writing back to the omnipath2 host ? 
and that was really trying my level best to put as many files onto the afm-cache at this side and hoping that enough threads pick up enough different files to start transferring files down the AFM simultaneously ? but what I?d really like is those large files (or small, up to the thresholds set) to break into parallel chunks and ALL transfer as fast as possible, utilising as much of the 10GbE as they can. Maybe I am missing fundamental principles in the way AFM works? Thanks. -jc PS: NB The link is easily capable of 10GbE. We?ve tested it all the way up to about 9.67Gbit/sec transferring data from these sets of hosts using other protocols such as fDT and Globus Grid FTP Et al. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kallbac at iu.edu Mon Nov 14 00:49:09 2016 From: kallbac at iu.edu (Kallback-Rose, Kristy A) Date: Sun, 13 Nov 2016 19:49:09 -0500 Subject: [gpfsug-discuss] SC16 SSUG Event Exit Poll Message-ID: <187C0352-CB2D-42AA-A75F-43B1D073D519@iu.edu> If you attended the SC16 meeting today please complete this quick exit poll so we can make improvements and keep what you liked. https://www.surveymonkey.com/r/SSUGSC16 Thanks, Kristy -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From kallbac at iu.edu Mon Nov 14 00:50:05 2016 From: kallbac at iu.edu (Kallback-Rose, Kristy A) Date: Sun, 13 Nov 2016 19:50:05 -0500 Subject: [gpfsug-discuss] Next US In-person Spectrum Scale Users Group meeting Message-ID: <4F396324-FCBC-4DF7-B47B-5F0EA6A42EE7@iu.edu> Hi all, We need to start planning for a spring-time(ish?) in-person meeting in the US. It would be great if we can have the event at a user site, so before I send out a survey about where to have the next meeting, I need to know who would be willing to host. So, please reach out and let us know. If you?re at SC16 feel free to reach out to Bob or me in person. Best, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 495 bytes Desc: Message signed with OpenPGP using GPGMail URL: From bbanister at jumptrading.com Mon Nov 14 01:32:56 2016 From: bbanister at jumptrading.com (Bryan Banister) Date: Mon, 14 Nov 2016 01:32:56 +0000 Subject: [gpfsug-discuss] Next US In-person Spectrum Scale Users Group meeting In-Reply-To: <4F396324-FCBC-4DF7-B47B-5F0EA6A42EE7@iu.edu> References: <4F396324-FCBC-4DF7-B47B-5F0EA6A42EE7@iu.edu> Message-ID: <21BC488F0AEA2245B2C3E83FC0B33DBB063D7717@CHI-EXCHANGEW1.w2k.jumptrading.com> I nominate LBNL/NERSC... I remember them saying they could host in a previous UG meeting. Only a year open now in their new building and I still need to come check it out. ;o) How about it LBNL/NERSC?? -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kallback-Rose, Kristy A Sent: Sunday, November 13, 2016 6:50 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] Next US In-person Spectrum Scale Users Group meeting Hi all, We need to start planning for a spring-time(ish?) in-person meeting in the US. It would be great if we can have the event at a user site, so before I send out a survey about where to have the next meeting, I need to know who would be willing to host. 
So, please reach out and let us know. If you?re at SC16 feel free to reach out to Bob or me in person. Best, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. From kallbac at iu.edu Mon Nov 14 05:34:22 2016 From: kallbac at iu.edu (Kallback-Rose, Kristy A) Date: Mon, 14 Nov 2016 05:34:22 +0000 Subject: [gpfsug-discuss] Next US In-person Spectrum Scale Users Group meeting Message-ID: <655cafd8-6446-4878-8268-127d23fc2a80@email.android.com> Thanks Brian. It's on the list. Others? On Nov 13, 2016 6:33 PM, Bryan Banister wrote: I nominate LBNL/NERSC... I remember them saying they could host in a previous UG meeting. Only a year open now in their new building and I still need to come check it out. ;o) How about it LBNL/NERSC?? -Bryan -----Original Message----- From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Kallback-Rose, Kristy A Sent: Sunday, November 13, 2016 6:50 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] Next US In-person Spectrum Scale Users Group meeting Hi all, We need to start planning for a spring-time(ish?) in-person meeting in the US. It would be great if we can have the event at a user site, so before I send out a survey about where to have the next meeting, I need to know who would be willing to host. So, please reach out and let us know. If you?re at SC16 feel free to reach out to Bob or me in person. Best, Kristy Kristy Kallback-Rose Manager, Research Storage Indiana University ________________________________ Note: This email is for the confidential use of the named addressee(s) only and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you are hereby notified that any review, dissemination or copying of this email is strictly prohibited, and to please notify the sender immediately and destroy this email and any attachments. Email transmission cannot be guaranteed to be secure or error-free. The Company, therefore, does not make any guarantees as to the completeness or accuracy of this email or any attachments. This email is for informational purposes only and does not constitute a recommendation, offer, request or solicitation of any kind to buy, sell, subscribe, redeem or perform any type of transaction of a financial product. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From radhika.p at in.ibm.com Mon Nov 14 09:59:26 2016 From: radhika.p at in.ibm.com (Radhika A Parameswaran) Date: Mon, 14 Nov 2016 15:29:26 +0530 Subject: [gpfsug-discuss] Achieving high parallelism with AFM using NFS? Message-ID: Hello Jake, You will have to set the mapping to include all the GW's that you want to involve in the transfer. Please refer to the example provided in the Knowledge Centre: http://www.ibm.com/support/knowledgecenter/en/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_paralleldatatransfersafm.htm Thanks and Regards Radhika Message: 1 Date: Sun, 13 Nov 2016 14:18:38 +0000 From: Jake Carroll To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] Achieving high parallelism with AFM using NFS? Message-ID: <025F8914-F7A0-465F-9B99-961F70DA2B03 at uq.edu.au> Content-Type: text/plain; charset="utf-8" Hi all. After some help from IBM, we?ve concluded (and been told) that AFM over the NSD protocol when latency is greater than around 50ms on the RTT is effectively unusable. We?ve proven that now, so it is time to move on from the NSD protocol being an effective option in those conditions (unless IBM can consider it something worthy of an RFE and can fix it!). The problem we face now, is one of parallelism and filling that 10GbE/40GbE/100GbE pipe efficiently, when using NFS as the transport provider for AFM. On my test cluster at ?Cache? side I?ve got two or three gateways: [root at mc-5 ~]# mmlscluster GPFS cluster information ======================== GPFS cluster name: sdx-gpfs.xxxxxxxxxxxxxxxx GPFS cluster id: 12880500218013865782 GPFS UID domain: sdx-gpfs. xxxxxxxxxxxxxxxx Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation --------------------------------------------------------------------------------------- 1 mc-5. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-5.hidden.net quorum-manager 2 mc-6. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-6. hidden.net quorum-manager-gateway 3 mc-7. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-7. hidden.net quorum-manager-gateway 4 mc-8. xxxxxxxxxxxxxxxx.net ip.addresses.hidden mc-8. hidden.net quorum-manager-gateway The bit I really don?t get is: 1. Why no traffic ever seems to go through mc-6 or mc-8 back to my ?home? directly and 2. Why it only ever lists my AFM-cache fileset being associated with one gateway (mc-7). I can see traffic flowing through mc-6 sometimes?but when it does, it all seems to channel back through mc-7 THEN back to the AFM-home. Am I missing something? This is where I see one of the gateway?s listed (but never the others?). [root at mc-5 ~]# mmafmctl afmcachefs getstate Fileset Name Fileset Target Cache State Gateway Node Queue Length Queue numExec ------------ -------------- ------------- ------------ ------------ ------------- afm-home nfs://omnipath2/gpfs-flash/afm-home Active mc-7 0 746636 I got told I needed to setup ?explicit maps? back to my home cluster to achieve parallelism: [root at mc-5 ~]# mmafmconfig show Map name: omnipath1 Export server map: address.is.hidden.100/mc-6.ip.address.hidden Map name: omnipath2 Export server map: address.is.hidden.101/mc-7.ip.address.hidden But ? I have never seen any traffic come back from mc-6 to omnipath1. What am I missing, and how do I actually achieve significant enough parallelism over an NFS transport to fill my 10GbE pipe? I?ve seen maybe a couple of gigabits per second from the mc-7 host writing back to the omnipath2 host ? 
and that was really trying my level best to put as many files onto the afm-cache at this side and hoping that enough threads pick up enough different files to start transferring files down the AFM simultaneously ? but what I?d really like is those large files (or small, up to the thresholds set) to break into parallel chunks and ALL transfer as fast as possible, utilising as much of the 10GbE as they can. Maybe I am missing fundamental principles in the way AFM works? Thanks. -jc -------------- next part -------------- An HTML attachment was scrubbed... URL: From mweil at wustl.edu Mon Nov 14 19:04:53 2016 From: mweil at wustl.edu (Matt Weil) Date: Mon, 14 Nov 2016 13:04:53 -0600 Subject: [gpfsug-discuss] dependency problem with python-dnspython and python-dns Message-ID: <790b2761-dfde-3047-9f85-3a38f16335f4@wustl.edu> > #manual install protocal nodes > yum install nfs-ganesha-2.3.2-0.ibm24_2.el7.x86_64 > nfs-ganesha-gpfs-2.3.2-0.ibm24_2.el7.x86_64 > nfs-ganesha-utils-2.3.2-0.ibm24_2.el7.x86_64 > gpfs.smb-4.3.11_gpfs_21-8.el7.x86_64 spectrum-scale-object-4.2.1-1.noarch > > there is a dependancy problem with python-dns > Transaction check error: > file /usr/lib/python2.7/site-packages/dns/__init__.pyc from install > of python-dnspython-1.11.1-1.ibm.noarch conflicts with file from > package python-dns-1.12.0-2.20150617git465785f.el7.noarch > file /usr/lib/python2.7/site-packages/dns/rdtypes/ANY/__init__.pyc > from install of python-dnspython-1.11.1-1.ibm.noarch conflicts with > file from package python-dns-1.12.0-2.20150617git465785f.el7.noarch > file /usr/lib/python2.7/site-packages/dns/rdtypes/IN/__init__.pyc > from install of python-dnspython-1.11.1-1.ibm.noarch conflicts with > file from package python-dns-1.12.0-2.20150617git465785f.el7.noarch > file /usr/lib/python2.7/site-packages/dns/rdtypes/__init__.pyc from > install of python-dnspython-1.11.1-1.ibm.noarch conflicts with file > from package python-dns-1.12.0-2.20150617git465785f.el7.noarch > file /usr/lib/python2.7/site-packages/dns/__init__.pyo from install > of python-dnspython-1.11.1-1.ibm.noarch conflicts with file from > package python-dns-1.12.0-2.20150617git465785f.el7.noarch > > yum remove python-dns-1.12.0-2.20150617git465785f.el7.noarch -y > yum install nfs-ganesha-2.3.2-0.ibm24_2.el7.x86_64 > nfs-ganesha-gpfs-2.3.2-0.ibm24_2.el7.x86_64 > nfs-ganesha-utils-2.3.2-0.ibm24_2.el7.x86_64 > gpfs.smb-4.3.11_gpfs_21-8.el7.x86_64 spectrum-scale-object-4.2.1-1.noarch > #install IPA with nodeps > yum install --downloadonly ipa-client > then > rpm -Uvh --nodeps > /var/cache/yum/x86_64/7Server/rhel-7-server-rpms/packages/*ipa* ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. From Mark.Bush at siriuscom.com Mon Nov 14 19:48:04 2016 From: Mark.Bush at siriuscom.com (Mark.Bush at siriuscom.com) Date: Mon, 14 Nov 2016 19:48:04 +0000 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: References: Message-ID: <87251B6E-99A4-4493-9A28-1F794E624216@siriuscom.com> I don?t have the exact answer to this issue but I had dealt with something similar before. 
I?m thinking this may have something to do with NFSv4 needing to be kerberized to work with AD? Again, not really sure on the SpecScale specifics here but worth seeing if you need Kerberos as well to get this to authenticate properly with AD and NFSv4. From: on behalf of Andy Parker1 Reply-To: gpfsug main discussion list Date: Friday, November 11, 2016 at 10:20 AM To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB We have setup a small cluster to test, play & learn about the protocol servers. We have setup mmuserauth for AD + RFC2307 and we can share and access data via SMB and access is on windows clients with no issues. The file DAC of a file created via windows looks like this from the SS cesNode: $ ls -l total 0 -rwxr--r-- 1 SPECTRUMSCALE\newmanjo SPECTRUMSCALE\ces-admins 33 Nov 10 17:29 helloworld.txt The NFS protocol is also exported for NFS 3,4 and when mount using NFS version '3' from an AIX 7.1 server I see also OK DAC names uid / group, so the UID mapping is working. The AIX is linked to the AD for LDAP account services and I can query accounts and get shell logon for accounts defined within AD for unix services. # ls -l ( from AIX client NFS V3) total 0 -rwxr--r-- 1 newmanjo ces-admi 33 10 Nov 17:29 helloworld.txt Now the Problem: When I mount the AIX client as NFS4 I do no see the user/group names. I know NFS4 passes names and not UID/GID numbers so I guess this is linked. # pwd /mnt/ibm/hurss/share1 # ls -l ( from AIX client NFS V4) total 0 -rwxr--r-- 1 nobody nobody 33 10 Nov 17:29 helloworld.txt On the AIX server I have set NFS domain to virtual1.com # chnfsdom Current local domain: virtual1.com This matches the DOMAIN from the mmnfs config list domain ( not 100% sure this is correct) [root at hurss4 ~]# mmnfs config list NFS Ganesha Configuration: ========================== NFS_PROTOCOLS: 3,4 NFS_PORT: 2049 MNT_PORT: 0 NLM_PORT: 0 RQUOTA_PORT: 0 SHORT_FILE_HANDLE: FALSE LEASE_LIFETIME: 60 DOMAINNAME: VIRTUAL1.COM DELEGATIONS: Disabled Also the 'nfsrgyd' a name translation service for NFS servers and clients is running. lssrc -s nfsrgyd Subsystem Group PID Status nfsrgyd nfs 8585412 active Summary / Question: Can anybody explain why I do not see userID / Group names when viewing via a NFS4 client and ideally how to fix this. Rgds Andy P Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. This message may be viewed by parties at Sirius Computer Solutions other than those named in the message header. This message does not contain an official representation of Sirius Computer Solutions. If you have received this communication in error, notify Sirius Computer Solutions immediately and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you. Sirius Computer Solutions -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From xhejtman at ics.muni.cz Mon Nov 14 19:57:22 2016 From: xhejtman at ics.muni.cz (Lukas Hejtmanek) Date: Mon, 14 Nov 2016 20:57:22 +0100 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: References: Message-ID: <20161114195722.v3vmflie7dji7tgk@ics.muni.cz> On Fri, Nov 11, 2016 at 04:20:52PM +0000, Andy Parker1 wrote: > When I mount the AIX client as NFS4 I do no see the user/group names. I > know NFS4 passes names and not UID/GID numbers so I > guess this is linked. > Can anybody explain why I do not see userID / Group names when viewing > via a NFS4 client and ideally how to fix this. Use tcpdump or something similar to catch the NFS traffic. From the client do: 1) mount 2) ls -l nfs_mount_dir See the dumped traffic (Wireshark is your friend) and see what names are in the dump. You can see either nobody at virtualdomain1, which means that the problem is on the server (i.e., the server is not able to translate UID/GID to a name, resulting in nobody/nogroup), or something like SPECTRUMSCALE\newmanjo at virtualdomain1. I would expect the latter case: you see 'nobody' because the client does not understand the SPECTRUMSCALE\newmanjo user. Perhaps set SMB so that the domain is stripped off from names, i.e., you should see only newmanjo instead of SPECTRUMSCALE\newmanjo on the server. -- Lukáš Hejtmánek From YARD at il.ibm.com Mon Nov 14 20:05:11 2016 From: YARD at il.ibm.com (Yaron Daniel) Date: Mon, 14 Nov 2016 22:05:11 +0200 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: <87251B6E-99A4-4493-9A28-1F794E624216@siriuscom.com> References: <87251B6E-99A4-4493-9A28-1F794E624216@siriuscom.com> Message-ID: Hi The CES protocol nodes are configured to get user data from AD, so all files there show as "DOMAIN\User". Files created over NFSv3 will have the same UID as on the CES nodes - but a different user name - and there is a mismatch when you work with NFSv4. Since NFSv4 checks for the "Domain\user" format, you must have the same username on both the server (CES) and client nodes. Now - if files were created from the CIFS share, I guess you will not have a problem defining inherited ACL permissions so that each file is created with Domain\User ownership, and when you mount it from NFSv3 it will take the UID - and have the right permissions. One more thing - in case you see permissions for files created from CIFS like this: d --- --- --- set the OWNER USER + OWNER GROUP inherited ACL permissions on the CIFS share; this will show you the right permissions when working with NFSv3. Regards Yaron Daniel 94 Em Ha'Moshavot Rd Server, Storage and Data Services - Team Leader Petach Tiqva, 49527 Global Technology Services Israel Phone: +972-3-916-5672 Fax: +972-3-916-5672 Mobile: +972-52-8395593 e-mail: yard at il.ibm.com IBM Israel From: "Mark.Bush at siriuscom.com" To: gpfsug main discussion list Date: 11/14/2016 09:48 PM Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org I don't have the exact answer to this issue but I had dealt with something similar before. I'm thinking this may have something to do with NFSv4 needing to be kerberized to work with AD? Again, not really sure on the SpecScale specifics here but worth seeing if you need Kerberos as well to get this to authenticate properly with AD and NFSv4.
From: on behalf of Andy Parker1 Reply-To: gpfsug main discussion list Date: Friday, November 11, 2016 at 10:20 AM To: "gpfsug-discuss at spectrumscale.org" Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB We have setup a small cluster to test, play & learn about the protocol servers. We have setup mmuserauth for AD + RFC2307 and we can share and access data via SMB and access is on windows clients with no issues. The file DAC of a file created via windows looks like this from the SS cesNode: $ ls -l total 0 -rwxr--r-- 1 SPECTRUMSCALE\newmanjo SPECTRUMSCALE\ces-admins 33 Nov 10 17:29 helloworld.txt The NFS protocol is also exported for NFS 3,4 and when mount using NFS version '3' from an AIX 7.1 server I see also OK DAC names uid / group, so the UID mapping is working. The AIX is linked to the AD for LDAP account services and I can query accounts and get shell logon for accounts defined within AD for unix services. # ls -l ( from AIX client NFS V3) total 0 -rwxr--r-- 1 newmanjo ces-admi 33 10 Nov 17:29 helloworld.txt Now the Problem: When I mount the AIX client as NFS4 I do no see the user/group names. I know NFS4 passes names and not UID/GID numbers so I guess this is linked. # pwd /mnt/ibm/hurss/share1 # ls -l ( from AIX client NFS V4) total 0 -rwxr--r-- 1 nobody nobody 33 10 Nov 17:29 helloworld.txt On the AIX server I have set NFS domain to virtual1.com # chnfsdom Current local domain: virtual1.com This matches the DOMAIN from the mmnfs config list domain ( not 100% sure this is correct) [root at hurss4 ~]# mmnfs config list NFS Ganesha Configuration: ========================== NFS_PROTOCOLS: 3,4 NFS_PORT: 2049 MNT_PORT: 0 NLM_PORT: 0 RQUOTA_PORT: 0 SHORT_FILE_HANDLE: FALSE LEASE_LIFETIME: 60 DOMAINNAME: VIRTUAL1.COM DELEGATIONS: Disabled Also the 'nfsrgyd' a name translation service for NFS servers and clients is running. lssrc -s nfsrgyd Subsystem Group PID Status nfsrgyd nfs 8585412 active Summary / Question: Can anybody explain why I do not see userID / Group names when viewing via a NFS4 client and ideally how to fix this. Rgds Andy P Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU This message (including any attachments) is intended only for the use of the individual or entity to which it is addressed and may contain information that is non-public, proprietary, privileged, confidential, and exempt from disclosure under applicable law. If you are not the intended recipient, you are hereby notified that any use, dissemination, distribution, or copying of this communication is strictly prohibited. This message may be viewed by parties at Sirius Computer Solutions other than those named in the message header. This message does not contain an official representation of Sirius Computer Solutions. If you have received this communication in error, notify Sirius Computer Solutions immediately and (i) destroy this message if a facsimile or (ii) delete this message immediately if this is an electronic communication. Thank you. Sirius Computer Solutions _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: image/gif Size: 1851 bytes Desc: not available URL: From chetkulk at in.ibm.com Tue Nov 15 06:00:41 2016 From: chetkulk at in.ibm.com (Chetan R Kulkarni) Date: Tue, 15 Nov 2016 11:30:41 +0530 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Message-ID: >> Summary / Question: >> Can anybody explain why I do not see userID / Group names when viewing >> via a NFS4 client and ideally how to fix this. This is not supported by Spectrum Scale (i.e. NFSv4 mount/access on AIX clients with AD+RFC2307 file authentication). Reason being AIX client integrates with AD like LDAP i.e. AIX client can't resolve the user in format "DOMAIN\user". NFSv4 server returns user in "DOMAIN\user" format and as AIX client doesn't understand "DOMAIN\user"; it translates to "nobody". Hence you see "nobody" under AIX NFSv4 mount. Please note that; with RHEL clients we see correct ownership under NFSv4 mounts. This is because RHEL clients integrate with AD as pure AD client (using winbind or SSSD) i.e. users resolve successfully in "DOMAIN\user" format on RHEL clients. Thanks, Chetan. -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.holliday at crick.ac.uk Tue Nov 15 09:47:33 2016 From: michael.holliday at crick.ac.uk (Michael Holliday) Date: Tue, 15 Nov 2016 09:47:33 +0000 Subject: [gpfsug-discuss] Quotas on Multiple Filesets Message-ID: Hey Everyone, I have a GPFS system which contain several groups of filesets. Each group has a root fileset, along with a number of other files sets. All of the filesets share the inode space with the root fileset. The file sets are linked to create a tree structure as shown: Fileset Root -> /root Fileset a -> /root/a Fileset B -> /root/b Fileset C -> /root/c I have applied a quota of 5TB to the root fileset. Could someone tell me if the quota will only take into account the files in the root fileset, or if it would include the sub filesets aswell. eg If have 3TB in A and 2TB in B - would that hit the 5TB quota on root? Thanks Michael The Francis Crick Institute Limited is a registered charity in England and Wales no. 1140062 and a company registered in England and Wales no. 06885462, with its registered office at 215 Euston Road, London NW1 2BE. -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy_parker1 at uk.ibm.com Tue Nov 15 15:34:49 2016 From: andy_parker1 at uk.ibm.com (Andy Parker1) Date: Tue, 15 Nov 2016 15:34:49 +0000 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: References: Message-ID: Thanks for the responses, using iptrace on AIX I was able to confirm that indeed the following is passed and cannot be matched by the AIX NFSV4 client. SPECTRUMSCALE\testuser1 at virtual1.com . This is in the response packet sent back from the CES server to the AIX NFSV4 client. Sent by Spectrum CES SPECTRUMSCALE\testuser1 at virtual1.com Expected by AIX NFSV4 testuser1 at virtual1.com !!!!!!!! NO MATCH !!!!!!! 00000200 00000180 00000001 00000024 53504543 |...........$SPEC| 00000210 5452554d 5343414c 455c7465 73747573 |TRUMSCALE\testus| 00000220 65723140 76697274 75616c31 2e636f6d |er1 at virtual1.com| 00000230 0000001f 53504543 5452554d 5343414c |....SPECTRUMSCAL| 00000240 455c7465 73744076 69727475 616c312e |E\test at virtual1.| 00000250 636f6d00 00000000 00000000 00000000 |com.............| Out of interest I setup an AIX 7.1 NFSV4 Server and AIX 7.1 NFSV4 client both authenticating against the AD LDAP. This worked fine. 
I suspect this is because the AIX LDAP (Posix) does attribute mapping so we only see the UID not DOMAIN\uid .. vi /etc/security/ldap/ldap.cfg # AIX-LDAP attribute map path. userattrmappath:/etc/security/ldap/sfur2user.map groupattrmappath:/etc/security/ldap/sfur2group.map # grep -i uid sfur2user.map username SEC_CHAR uid s na yes id SEC_INT uidNumber s na yes I wonder if Solaris 10/11 and HP-UX 11 are also not supported using NFSv4. Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_authenticationlimitations.htm # Going Forward Initially we want to provide only NFS and SMB CesNode services. So we based our decision to use AD + RFC2307 based on this table, believing that it would provide what we need today and future proof us a little by potentially allowing expansion to OBJ in the future. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_authconcept.htm NFSv4 is pretty mandatory in our design, we want to get rid of using Netgroup's and NFS V3 UID/GID mapping which as weak security. Ideally on day one we would want NFSV4 and Kerberos to provide better security for our clients. Its also likely that in the future corporate security policies may ban netgroup's for NFS authorization so using NFSv4 + kerberos would position my department well for future changes. Based on the table I guess I need to setup LDAP / TLS / Kerberos as the authentication service which will cover all bases expect OBJECT. Thanks again for everyone's comments, this was my first post and the responses were all very welcome. Rgds Andy Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: "Chetan R Kulkarni" To: gpfsug-discuss at spectrumscale.org Date: 15/11/2016 06:01 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org >> Summary / Question: >> Can anybody explain why I do not see userID / Group names when viewing >> via a NFS4 client and ideally how to fix this. This is not supported by Spectrum Scale (i.e. NFSv4 mount/access on AIX clients with AD+RFC2307 file authentication). Reason being AIX client integrates with AD like LDAP i.e. AIX client can't resolve the user in format "DOMAIN\user". NFSv4 server returns user in "DOMAIN\user" format and as AIX client doesn't understand "DOMAIN\user"; it translates to "nobody". Hence you see "nobody" under AIX NFSv4 mount. Please note that; with RHEL clients we see correct ownership under NFSv4 mounts. This is because RHEL clients integrate with AD as pure AD client (using winbind or SSSD) i.e. users resolve successfully in "DOMAIN\user" format on RHEL clients. Thanks, Chetan._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jasonbennett at us.ibm.com Tue Nov 15 16:36:51 2016 From: jasonbennett at us.ibm.com (Jason Bennett) Date: Tue, 15 Nov 2016 11:36:51 -0500 Subject: [gpfsug-discuss] multipath.conf for EMC V-max Message-ID: Trying to help a customer resolve a multipath.conf issue.... I am at a major stopping point regarding my deployment on Linux.?? I?m working with our SAN Team to get a good /etc/multipath.conf configuration with a stanza for our EMC V-max SAN presented.????? I have PowerPath installed and the SAN disk can been seen from both linux nodes but I need to set three items in a custom devices stanza to ensure no disks locking.??? The three items required for concurrent disk without locks are:?? feature=0,?? failback=immediate & no_path_retry=fail.?? If you were to reply with an example of the multipath.conf where you?ve used EMC V-max I would just be tickled pink. Thanks. Jason Bennett IBM -------------- next part -------------- An HTML attachment was scrubbed... URL: From ewahl at osc.edu Tue Nov 15 16:59:31 2016 From: ewahl at osc.edu (Wahl, Edward) Date: Tue, 15 Nov 2016 16:59:31 +0000 Subject: [gpfsug-discuss] multipath.conf for EMC V-max In-Reply-To: References: Message-ID: <9DA9EC7A281AC7428A9618AFDC4904995901FBDF@CIO-KRC-D1MBX02.osuad.osu.edu> Hey Jason if you want to get me an lsscsi output I can probably whip up a multi-path.conf block for your customer or talk to them on the phone if you like. Ed ----- Reply message ----- From: "Jason Bennett" To: "gpfsug main discussion list" Subject: [gpfsug-discuss] multipath.conf for EMC V-max Date: Tue, Nov 15, 2016 11:37 AM Trying to help a customer resolve a multipath.conf issue.... I am at a major stopping point regarding my deployment on Linux. I?m working with our SAN Team to get a good /etc/multipath.conf configuration with a stanza for our EMC V-max SAN presented. I have PowerPath installed and the SAN disk can been seen from both linux nodes but I need to set three items in a custom devices stanza to ensure no disks locking. The three items required for concurrent disk without locks are: feature=0, failback=immediate & no_path_retry=fail. If you were to reply with an example of the multipath.conf where you?ve used EMC V-max I would just be tickled pink. Thanks. Jason Bennett IBM -------------- next part -------------- An HTML attachment was scrubbed... URL: From mweil at wustl.edu Tue Nov 15 17:18:50 2016 From: mweil at wustl.edu (Matt Weil) Date: Tue, 15 Nov 2016 11:18:50 -0600 Subject: [gpfsug-discuss] multipath.conf for EMC V-max In-Reply-To: <9DA9EC7A281AC7428A9618AFDC4904995901FBDF@CIO-KRC-D1MBX02.osuad.osu.edu> References: <9DA9EC7A281AC7428A9618AFDC4904995901FBDF@CIO-KRC-D1MBX02.osuad.osu.edu> Message-ID: http://www.emc.com/collateral/TechnicalDocument/docu5128.pdf page 219 this is the default in rhel. device { vendor "EMC" product "SYMMETRIX" path_grouping_policy multibus getuid_callout "/lib/udev/scsi_id --page=pre-spc3-83 --whitelisted --device=/dev/%n" path_selector "round-robin 0" path_checker tur features "0" hardware_handler "0" prio const rr_weight uniform no_path_retry 6 rr_min_io 1000 rr_min_io_rq 1 } my defaults defaults { user_friendly_names yes find_multipaths yes udev_dir /dev polling_interval 60 path_grouping_policy multibus path_checker readsector0 rr_min_io_rq 100 rr_weight priorities failback immediate max_fds max features "0" } run multipathd show config to see the defaults. 
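For the three settings Jason listed, a devices override along these lines is probably what you want - treat it as a sketch built from the stock SYMMETRIX entry above rather than an EMC-blessed configuration, and check it against the EMC host connectivity guide for your array code level:

devices {
    device {
        vendor "EMC"
        product "SYMMETRIX"
        path_grouping_policy multibus
        path_checker tur
        features "0"
        hardware_handler "0"
        prio const
        failback immediate
        no_path_retry fail
        rr_min_io_rq 1
    }
}

After editing, reload with "multipathd reconfigure" and confirm the values took effect with "multipath -ll" and "multipathd show config". It is also worth double checking that PowerPath and dm-multipath are not both trying to manage the same LUNs - normally you run one or the other for a given set of devices.
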
On 11/15/16 10:59 AM, Wahl, Edward wrote: Hey Jason if you want to get me an lsscsi output I can probably whip up a multi-path.conf block for your customer or talk to them on the phone if you like. Ed ----- Reply message ----- From: "Jason Bennett" To: "gpfsug main discussion list" Subject: [gpfsug-discuss] multipath.conf for EMC V-max Date: Tue, Nov 15, 2016 11:37 AM Trying to help a customer resolve a multipath.conf issue.... I am at a major stopping point regarding my deployment on Linux. I?m working with our SAN Team to get a good /etc/multipath.conf configuration with a stanza for our EMC V-max SAN presented. I have PowerPath installed and the SAN disk can been seen from both linux nodes but I need to set three items in a custom devices stanza to ensure no disks locking. The three items required for concurrent disk without locks are: feature=0, failback=immediate & no_path_retry=fail. If you were to reply with an example of the multipath.conf where you?ve used EMC V-max I would just be tickled pink. Thanks. Jason Bennett IBM _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss ________________________________ The materials in this message are private and may contain Protected Healthcare Information or other information of a sensitive nature. If you are not the intended recipient, be advised that any unauthorized use, disclosure, copying or the taking of any action in reliance on the contents of this information is strictly prohibited. If you have received this email in error, please immediately notify the sender via telephone or return mail. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Schlipalius at pawsey.org.au Thu Nov 17 05:05:11 2016 From: Chris.Schlipalius at pawsey.org.au (Chris Schlipalius) Date: Thu, 17 Nov 2016 13:05:11 +0800 Subject: [gpfsug-discuss] Announcement of the next Australian SpectrumScale User Group - April 2017 (Sydney) Message-ID: Hello please see my announcement: http://www.spectrumscale.org/spectrum-scale-user-group-australia-meeting-syd ney-april-2017/ This is also a call for speakers submissions. Regards, Chris Schlipalius Senior Storage Infrastructure Specialist/Team Leader Pawsey Supercomputing Centre -------------- next part -------------- An HTML attachment was scrubbed... URL: From rkomandu at in.ibm.com Thu Nov 17 11:20:38 2016 From: rkomandu at in.ibm.com (Ravi K Komanduri) Date: Thu, 17 Nov 2016 16:50:38 +0530 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: References: Message-ID: Andy >> Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. S Scale CES side RHEL and SLES are supported as of date. Refer to the S Scale FAQ link ( http://www.ibm.com/support/knowledgecenter/STXKQY/ibmspectrumscale_welcome.html ) With Regards, Ravi K Komanduri From: Andy Parker1 To: gpfsug main discussion list Date: 11/15/2016 09:05 PM Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for the responses, using iptrace on AIX I was able to confirm that indeed the following is passed and cannot be matched by the AIX NFSV4 client. SPECTRUMSCALE\testuser1 at virtual1.com . This is in the response packet sent back from the CES server to the AIX NFSV4 client. 
Sent by Spectrum CES SPECTRUMSCALE\testuser1 at virtual1.com Expected by AIX NFSV4 testuser1 at virtual1.com !!!!!!!! NO MATCH !!!!!!! 00000200 00000180 00000001 00000024 53504543 |...........$SPEC| 00000210 5452554d 5343414c 455c7465 73747573 |TRUMSCALE\testus| 00000220 65723140 76697274 75616c31 2e636f6d |er1 at virtual1.com| 00000230 0000001f 53504543 5452554d 5343414c |....SPECTRUMSCAL| 00000240 455c7465 73744076 69727475 616c312e |E\test at virtual1.| 00000250 636f6d00 00000000 00000000 00000000 |com.............| Out of interest I setup an AIX 7.1 NFSV4 Server and AIX 7.1 NFSV4 client both authenticating against the AD LDAP. This worked fine. I suspect this is because the AIX LDAP (Posix) does attribute mapping so we only see the UID not DOMAIN\uid .. vi /etc/security/ldap/ldap.cfg # AIX-LDAP attribute map path. userattrmappath:/etc/security/ldap/sfur2user.map groupattrmappath:/etc/security/ldap/sfur2group.map # grep -i uid sfur2user.map username SEC_CHAR uid s na yes id SEC_INT uidNumber s na yes I wonder if Solaris 10/11 and HP-UX 11 are also not supported using NFSv4. Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_authenticationlimitations.htm # Going Forward Initially we want to provide only NFS and SMB CesNode services. So we based our decision to use AD + RFC2307 based on this table, believing that it would provide what we need today and future proof us a little by potentially allowing expansion to OBJ in the future. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_authconcept.htm NFSv4 is pretty mandatory in our design, we want to get rid of using Netgroup's and NFS V3 UID/GID mapping which as weak security. Ideally on day one we would want NFSV4 and Kerberos to provide better security for our clients. Its also likely that in the future corporate security policies may ban netgroup's for NFS authorization so using NFSv4 + kerberos would position my department well for future changes. Based on the table I guess I need to setup LDAP / TLS / Kerberos as the authentication service which will cover all bases expect OBJECT. Thanks again for everyone's comments, this was my first post and the responses were all very welcome. Rgds Andy Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: "Chetan R Kulkarni" To: gpfsug-discuss at spectrumscale.org Date: 15/11/2016 06:01 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org >> Summary / Question: >> Can anybody explain why I do not see userID / Group names when viewing >> via a NFS4 client and ideally how to fix this. This is not supported by Spectrum Scale (i.e. NFSv4 mount/access on AIX clients with AD+RFC2307 file authentication). Reason being AIX client integrates with AD like LDAP i.e. AIX client can't resolve the user in format "DOMAIN\user". NFSv4 server returns user in "DOMAIN\user" format and as AIX client doesn't understand "DOMAIN\user"; it translates to "nobody". Hence you see "nobody" under AIX NFSv4 mount. Please note that; with RHEL clients we see correct ownership under NFSv4 mounts. This is because RHEL clients integrate with AD as pure AD client (using winbind or SSSD) i.e. 
users resolve successfully in "DOMAIN\user" format on RHEL clients. Thanks, Chetan._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy_parker1 at uk.ibm.com Thu Nov 17 11:56:41 2016 From: andy_parker1 at uk.ibm.com (Andy Parker1) Date: Thu, 17 Nov 2016 11:56:41 +0000 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: References: Message-ID: >>S Scale CES side RHEL and SLES are supported as of date. Thanks for the update, the SLES & RHEL are the supported platforms for the SES Servers agreed. My question was possibly not fair / clear, I was trying to establish what NFS clients are supported to connect to the CES devices. I configured 'SS' with mmuserauth for AD and RFC2307 support, making a dangerous assumption that the RFC2307 would mean I would be able to use any RFC2307 compliant client NFS for NFS V3 & V4. This was true for NFS V3 and we connected AIX & Linux with no issues. However our aim is to remove NFSv3 and provide only NFSv4 + kerberos support. With NFS V4 only Linux clients worked OK due to the use of 'SSSD'. So we are broken for NFSV4 in our diverse environment ( AIX*, SOLARIS*, HPUX*) for the ID mapping at NFSV4 becomes broken. Currently I am looking to reconfigure and address the AD server's LDAP and Kerberos components natively and so hopefully remove the need for 'SSSD'. So we plan to configure using mmuserauth -type LDAP and provide all the required parameters in steady of -type AD. Not 100% sure this will work, but this is what we are about to try. Rgds Andy Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: Ravi K Komanduri/India/IBM To: gpfsug main discussion list , Andy Parker1 Date: 17/11/2016 11:20 Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Andy >> Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. S Scale CES side RHEL and SLES are supported as of date. Refer to the S Scale FAQ link ( http://www.ibm.com/support/knowledgecenter/STXKQY/ibmspectrumscale_welcome.html ) With Regards, Ravi K Komanduri From: Andy Parker1 To: gpfsug main discussion list Date: 11/15/2016 09:05 PM Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for the responses, using iptrace on AIX I was able to confirm that indeed the following is passed and cannot be matched by the AIX NFSV4 client. SPECTRUMSCALE\testuser1 at virtual1.com . This is in the response packet sent back from the CES server to the AIX NFSV4 client. Sent by Spectrum CES SPECTRUMSCALE\testuser1 at virtual1.com Expected by AIX NFSV4 testuser1 at virtual1.com !!!!!!!! NO MATCH !!!!!!! 
00000200 00000180 00000001 00000024 53504543 |...........$SPEC| 00000210 5452554d 5343414c 455c7465 73747573 |TRUMSCALE\testus| 00000220 65723140 76697274 75616c31 2e636f6d |er1 at virtual1.com| 00000230 0000001f 53504543 5452554d 5343414c |....SPECTRUMSCAL| 00000240 455c7465 73744076 69727475 616c312e |E\test at virtual1.| 00000250 636f6d00 00000000 00000000 00000000 |com.............| Out of interest I setup an AIX 7.1 NFSV4 Server and AIX 7.1 NFSV4 client both authenticating against the AD LDAP. This worked fine. I suspect this is because the AIX LDAP (Posix) does attribute mapping so we only see the UID not DOMAIN\uid .. vi /etc/security/ldap/ldap.cfg # AIX-LDAP attribute map path. userattrmappath:/etc/security/ldap/sfur2user.map groupattrmappath:/etc/security/ldap/sfur2group.map # grep -i uid sfur2user.map username SEC_CHAR uid s na yes id SEC_INT uidNumber s na yes I wonder if Solaris 10/11 and HP-UX 11 are also not supported using NFSv4. Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_authenticationlimitations.htm # Going Forward Initially we want to provide only NFS and SMB CesNode services. So we based our decision to use AD + RFC2307 based on this table, believing that it would provide what we need today and future proof us a little by potentially allowing expansion to OBJ in the future. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_authconcept.htm NFSv4 is pretty mandatory in our design, we want to get rid of using Netgroup's and NFS V3 UID/GID mapping which as weak security. Ideally on day one we would want NFSV4 and Kerberos to provide better security for our clients. Its also likely that in the future corporate security policies may ban netgroup's for NFS authorization so using NFSv4 + kerberos would position my department well for future changes. Based on the table I guess I need to setup LDAP / TLS / Kerberos as the authentication service which will cover all bases expect OBJECT. Thanks again for everyone's comments, this was my first post and the responses were all very welcome. Rgds Andy Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: "Chetan R Kulkarni" To: gpfsug-discuss at spectrumscale.org Date: 15/11/2016 06:01 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org >> Summary / Question: >> Can anybody explain why I do not see userID / Group names when viewing >> via a NFS4 client and ideally how to fix this. This is not supported by Spectrum Scale (i.e. NFSv4 mount/access on AIX clients with AD+RFC2307 file authentication). Reason being AIX client integrates with AD like LDAP i.e. AIX client can't resolve the user in format "DOMAIN\user". NFSv4 server returns user in "DOMAIN\user" format and as AIX client doesn't understand "DOMAIN\user"; it translates to "nobody". Hence you see "nobody" under AIX NFSv4 mount. Please note that; with RHEL clients we see correct ownership under NFSv4 mounts. This is because RHEL clients integrate with AD as pure AD client (using winbind or SSSD) i.e. users resolve successfully in "DOMAIN\user" format on RHEL clients. 
Thanks, Chetan._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From andy_parker1 at uk.ibm.com Thu Nov 17 15:17:48 2016 From: andy_parker1 at uk.ibm.com (Andy Parker1) Date: Thu, 17 Nov 2016 15:17:48 +0000 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB In-Reply-To: References: Message-ID: >>Currently I am looking to reconfigure and address the AD server's LDAP and Kerberos components natively and so hopefully remove the need >>for 'SSSD'. So we plan to configure using mmuserauth -type LDAP and provide all the required parameters in steady of -type AD. >>Not 100% sure this will work, but this is what we are about to try. Just to report back, you cannot just use --type ldap and point it at the AD ldap server (389 / 636). Its fails because mmuserauth expects the Samba schema and other pre-reqs to be in place. We do not wish to mess to much with our AD schema so we will drop this approach. Summary: Looks like we have the following options on our 'SS' CES nodes with AD RFC2307 in place: SMB to all windows clients NFS3 access to all RFC2307 clients NFS4 access to Linux clients only Using the OpenLDAP / MIT Kerberos Servers approach would create to much of an over head for our team to manage 1000's of users. Using AD pretty much looks after this for us today and we have tooling in place namely IBM's Identity Manager to automate the user management. Our only change needed on the AD was to enable UNIX Services RFC2307 to allow the ID-MAPPING. Rgds AndyP Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: Andy Parker1/UK/IBM at IBMGB To: gpfsug main discussion list Cc: Jo Woods/UK/IBM at IBMGB Date: 17/11/2016 11:57 Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org >>S Scale CES side RHEL and SLES are supported as of date. Thanks for the update, the SLES & RHEL are the supported platforms for the SES Servers agreed. My question was possibly not fair / clear, I was trying to establish what NFS clients are supported to connect to the CES devices. I configured 'SS' with mmuserauth for AD and RFC2307 support, making a dangerous assumption that the RFC2307 would mean I would be able to use any RFC2307 compliant client NFS for NFS V3 & V4. This was true for NFS V3 and we connected AIX & Linux with no issues. However our aim is to remove NFSv3 and provide only NFSv4 + kerberos support. With NFS V4 only Linux clients worked OK due to the use of 'SSSD'. So we are broken for NFSV4 in our diverse environment ( AIX*, SOLARIS*, HPUX*) for the ID mapping at NFSV4 becomes broken. 
Currently I am looking to reconfigure and address the AD server's LDAP and Kerberos components natively and so hopefully remove the need for 'SSSD'. So we plan to configure using mmuserauth -type LDAP and provide all the required parameters in steady of -type AD. Not 100% sure this will work, but this is what we are about to try. Rgds Andy Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: Ravi K Komanduri/India/IBM To: gpfsug main discussion list , Andy Parker1 Date: 17/11/2016 11:20 Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Andy >> Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. S Scale CES side RHEL and SLES are supported as of date. Refer to the S Scale FAQ link ( http://www.ibm.com/support/knowledgecenter/STXKQY/ibmspectrumscale_welcome.html ) With Regards, Ravi K Komanduri From: Andy Parker1 To: gpfsug main discussion list Date: 11/15/2016 09:05 PM Subject: Re: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org Thanks for the responses, using iptrace on AIX I was able to confirm that indeed the following is passed and cannot be matched by the AIX NFSV4 client. SPECTRUMSCALE\testuser1 at virtual1.com . This is in the response packet sent back from the CES server to the AIX NFSV4 client. Sent by Spectrum CES SPECTRUMSCALE\testuser1 at virtual1.com Expected by AIX NFSV4 testuser1 at virtual1.com !!!!!!!! NO MATCH !!!!!!! 00000200 00000180 00000001 00000024 53504543 |...........$SPEC| 00000210 5452554d 5343414c 455c7465 73747573 |TRUMSCALE\testus| 00000220 65723140 76697274 75616c31 2e636f6d |er1 at virtual1.com| 00000230 0000001f 53504543 5452554d 5343414c |....SPECTRUMSCAL| 00000240 455c7465 73744076 69727475 616c312e |E\test at virtual1.| 00000250 636f6d00 00000000 00000000 00000000 |com.............| Out of interest I setup an AIX 7.1 NFSV4 Server and AIX 7.1 NFSV4 client both authenticating against the AD LDAP. This worked fine. I suspect this is because the AIX LDAP (Posix) does attribute mapping so we only see the UID not DOMAIN\uid .. vi /etc/security/ldap/ldap.cfg # AIX-LDAP attribute map path. userattrmappath:/etc/security/ldap/sfur2user.map groupattrmappath:/etc/security/ldap/sfur2group.map # grep -i uid sfur2user.map username SEC_CHAR uid s na yes id SEC_INT uidNumber s na yes I wonder if Solaris 10/11 and HP-UX 11 are also not supported using NFSv4. Does anyone know if the SpectrumScale CES (NFS/SMB) has a supported operating systems list published. I checked here but nothing found. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1adm_authenticationlimitations.htm # Going Forward Initially we want to provide only NFS and SMB CesNode services. So we based our decision to use AD + RFC2307 based on this table, believing that it would provide what we need today and future proof us a little by potentially allowing expansion to OBJ in the future. http://www.ibm.com/support/knowledgecenter/STXKQY_4.2.1/com.ibm.spectrum.scale.v4r21.doc/bl1ins_authconcept.htm NFSv4 is pretty mandatory in our design, we want to get rid of using Netgroup's and NFS V3 UID/GID mapping which as weak security. Ideally on day one we would want NFSV4 and Kerberos to provide better security for our clients. 
Its also likely that in the future corporate security policies may ban netgroup's for NFS authorization so using NFSv4 + kerberos would position my department well for future changes. Based on the table I guess I need to setup LDAP / TLS / Kerberos as the authentication service which will cover all bases expect OBJECT. Thanks again for everyone's comments, this was my first post and the responses were all very welcome. Rgds Andy Andy Parker Cloud & Development Platforms (C&DP) Andy_Parker1 at uk.ibm.com Desk: DW1B14 Tel: 37-245326 (01962-815326) Post: MP100, IBM Hursley Park, Winchester, SO21 2JN From: "Chetan R Kulkarni" To: gpfsug-discuss at spectrumscale.org Date: 15/11/2016 06:01 Subject: [gpfsug-discuss] SS 4.2.1 + CES NFS / SMB Sent by: gpfsug-discuss-bounces at spectrumscale.org >> Summary / Question: >> Can anybody explain why I do not see userID / Group names when viewing >> via a NFS4 client and ideally how to fix this. This is not supported by Spectrum Scale (i.e. NFSv4 mount/access on AIX clients with AD+RFC2307 file authentication). Reason being AIX client integrates with AD like LDAP i.e. AIX client can't resolve the user in format "DOMAIN\user". NFSv4 server returns user in "DOMAIN\user" format and as AIX client doesn't understand "DOMAIN\user"; it translates to "nobody". Hence you see "nobody" under AIX NFSv4 mount. Please note that; with RHEL clients we see correct ownership under NFSv4 mounts. This is because RHEL clients integrate with AD as pure AD client (using winbind or SSSD) i.e. users resolve successfully in "DOMAIN\user" format on RHEL clients. Thanks, Chetan._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss Unless stated otherwise above: IBM United Kingdom Limited - Registered in England and Wales with number 741598. Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU -------------- next part -------------- An HTML attachment was scrubbed... URL: From abeattie at au1.ibm.com Thu Nov 17 22:55:41 2016 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Thu, 17 Nov 2016 22:55:41 +0000 Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data Message-ID: An HTML attachment was scrubbed... URL: From kevindjo at us.ibm.com Thu Nov 17 23:03:35 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Thu, 17 Nov 2016 23:03:35 +0000 Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data In-Reply-To: Message-ID: Take a look at IBM Spectrum LSF or Spectrum Analytics. The raw data can be provided by Scale's perfmon data but the above solutions can graph and report on it. Kevin D. 
Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage IBM Certified Solution Advisor - Spectrum Computing V1 720-349-6199 - kevindjo at us.ibm.com > On Nov 17, 2016, at 5:56 PM, Andrew Beattie wrote: > > Good Morning, > > > I have a large managed services provider in Australia who are looking at the benefits of deploying Scale for a combination of Object storage and basic SMB file access. This data is typically historical data rather than highly accessed production data and the proposed services is designed to be a low cost option - think a private version of Amazon's Glacier type offering. The proposed solution will have Platinum - Flash, Gold - SAS, Silver - NL-SAS and Bronze - Tape tiers with different cost's per Tier > > One of the questions that they have asked is, how can they on a regular basis (6-10min increments), poll the storage array to determine what capacity is stored in what tier of disk, by company / user, and export the results (ideally via an API) into their Reporting and Billing system. They do something similar today for their VMWare farm (6 minute increments), to provide accountability for the number of virtual machines they are providing, and would like to extend this capability to their file storage offering, which today is based on basic virtual windows file servers > > Is anyone doing something similar today? and if so at what granularity? > > Andrew Beattie > Software Defined Storage - IT Specialist > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From kevindjo at us.ibm.com Thu Nov 17 23:09:58 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Thu, 17 Nov 2016 23:09:58 +0000 Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data In-Reply-To: Message-ID: LSF RTM, that is. Sorry on the NY Subway. Kevin D. Johnson, MBA, MAFM Spectrum Computing, Senior Managing Consultant IBM Certified Deployment Professional - Spectrum Scale V4.1.1 IBM Certified Deployment Professional - Cloud Object Storage IBM Certified Solution Advisor - Spectrum Computing V1 720-349-6199 - kevindjo at us.ibm.com > On Nov 17, 2016, at 6:03 PM, Kevin D Johnson wrote: > > Take a look at IBM Spectrum LSF or Spectrum Analytics. The raw data can be provided by Scale's perfmon data but the above solutions can graph and report on it. > > Kevin D. Johnson, MBA, MAFM > Spectrum Computing, Senior Managing Consultant > > IBM Certified Deployment Professional - Spectrum Scale V4.1.1 > IBM Certified Deployment Professional - Cloud Object Storage > IBM Certified Solution Advisor - Spectrum Computing V1 > > 720-349-6199 - kevindjo at us.ibm.com > >> On Nov 17, 2016, at 5:56 PM, Andrew Beattie wrote: >> >> Good Morning, >> >> >> I have a large managed services provider in Australia who are looking at the benefits of deploying Scale for a combination of Object storage and basic SMB file access. This data is typically historical data rather than highly accessed production data and the proposed services is designed to be a low cost option - think a private version of Amazon's Glacier type offering. 
The proposed solution will have Platinum - Flash, Gold - SAS, Silver - NL-SAS and Bronze - Tape tiers with different cost's per Tier >> >> One of the questions that they have asked is, how can they on a regular basis (6-10min increments), poll the storage array to determine what capacity is stored in what tier of disk, by company / user, and export the results (ideally via an API) into their Reporting and Billing system. They do something similar today for their VMWare farm (6 minute increments), to provide accountability for the number of virtual machines they are providing, and would like to extend this capability to their file storage offering, which today is based on basic virtual windows file servers >> >> Is anyone doing something similar today? and if so at what granularity? >> >> Andrew Beattie >> Software Defined Storage - IT Specialist >> Phone: 614-2133-7927 >> E-mail: abeattie at au1.ibm.com >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From Valdis.Kletnieks at vt.edu Fri Nov 18 19:05:39 2016 From: Valdis.Kletnieks at vt.edu (Valdis Kletnieks) Date: Fri, 18 Nov 2016 14:05:39 -0500 Subject: [gpfsug-discuss] mmchdisk performance/behavior in a stretch cluster config? Message-ID: <121817.1479495939@turing-police.cc.vt.edu> So as a basis for our archive solution, we're using a GPFS cluster in a stretch configuration, with 2 sites separated by about 20ms worth of 10G link. Each end has 2 protocol servers doing NFS and 3 NSD servers. Identical disk arrays and LTFS/EE at both ends, and all metadata and userdata are replicated to both sites. We had a fiber issue for about 8 hours yesterday, and as expected (since there are only 5 quorum nodes, 3 local and 2 at the far end) the far end fell off the cluster and down'ed all the NSDs on the remote arrays. There's about 123T of data at each end, 6 million files in there so far. So after the fiber came back up after a several-hour downtime, I did the 'mmchdisk archive start -a'. That was at 17:45 yesterday. I'm now 20 hours in, at: 62.15 % complete on Fri Nov 18 13:52:59 2016 ( 4768429 inodes with total 173675926 MB data processed) 62.17 % complete on Fri Nov 18 13:53:20 2016 ( 4769416 inodes with total 173710731 MB data processed) 62.18 % complete on Fri Nov 18 13:53:40 2016 ( 4772481 inodes with total 173762456 MB data processed) network statistics indicate that the 3 local NSDs are all tossing out packets at about 400Mbytes/second, which means the 10G pipe is pretty damned close to totally packed full, and the 3 remotes are sending back ACKs of all the data. Rough back-of-envelop calculations indicate that (a) if I'm at 62% after 20 hours, it will take 30 hours to finish and (b) a 10G link takes about 29 hours at full blast to move 123T of data. So it certainly *looks* like it's resending everything. And that's even though at least 100T of that 123T is test data that was written by one of our users back on Nov 12/13, and thus theoretically *should* already have been at the remote site. Any ideas what's going on here? From aaron.s.knister at nasa.gov Fri Nov 18 19:21:54 2016 From: aaron.s.knister at nasa.gov (Knister, Aaron S. 
(GSFC-606.2)[COMPUTER SCIENCE CORP]) Date: Fri, 18 Nov 2016 19:21:54 +0000 Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data References: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data Message-ID: <5F910253243E6A47B81A9A2EB424BBA101DF8795@NDMSMBX404.ndc.nasa.gov> I believe ARCAStream has a product that could facilitate this also. I also believe their engineers are on the list. From: Andrew Beattie Sent: 11/17/16, 3:56 PM To: gpfsug main discussion list Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data Good Morning, I have a large managed services provider in Australia who are looking at the benefits of deploying Scale for a combination of Object storage and basic SMB file access. This data is typically historical data rather than highly accessed production data and the proposed services is designed to be a low cost option - think a private version of Amazon's Glacier type offering. The proposed solution will have Platinum - Flash, Gold - SAS, Silver - NL-SAS and Bronze - Tape tiers with different cost's per Tier One of the questions that they have asked is, how can they on a regular basis (6-10min increments), poll the storage array to determine what capacity is stored in what tier of disk, by company / user, and export the results (ideally via an API) into their Reporting and Billing system. They do something similar today for their VMWare farm (6 minute increments), to provide accountability for the number of virtual machines they are providing, and would like to extend this capability to their file storage offering, which today is based on basic virtual windows file servers Is anyone doing something similar today? and if so at what granularity? Andrew Beattie Software Defined Storage - IT Specialist Phone: 614-2133-7927 E-mail: abeattie at au1.ibm.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From bevans at pixitmedia.com Fri Nov 18 19:58:26 2016 From: bevans at pixitmedia.com (Barry Evans) Date: Fri, 18 Nov 2016 19:58:26 +0000 Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data In-Reply-To: <5F910253243E6A47B81A9A2EB424BBA101DF8795@NDMSMBX404.ndc.nasa.gov> References: <5F910253243E6A47B81A9A2EB424BBA101DF8795@NDMSMBX404.ndc.nasa.gov> Message-ID: Thanks Aaron, We do indeed have some analytics tools based on our python API that can extract much of this info in a nice, easy to work with format. 6-10 minute increments might be slightly aggressive depending on the metadata spec and the number of object on the file system, but it's certainly doable. Andrew, feel free to contact us at info at arcastream.com if we can help - Sounds like a great use case. -- Barry Evans CTO & Co-Founder Pixit Media/ArcaStream Mobile: +44 (0)7950 666 248 http://www.pixitmedia.com http://www.arcastream.com On 18/11/2016 19:21, Knister, Aaron S. (GSFC-606.2)[COMPUTER SCIENCE CORP] wrote: > I believe ARCAStream has a product that could facilitate this also. I > also believe their engineers are on the list. 
> > > > *From:*Andrew Beattie > *Sent:* 11/17/16, 3:56 PM > *To:* gpfsug main discussion list > *Subject:* [gpfsug-discuss] Is anyone performing any kind of Charge > back / Show back on Scale today and how do you collect the data > > Good Morning, > I have a large managed services provider in Australia who are looking > at the benefits of deploying Scale for a combination of Object storage > and basic SMB file access. This data is typically historical data > rather than highly accessed production data and the proposed services > is designed to be a low cost option - think a private version of > Amazon's Glacier type offering. The proposed solution will have > Platinum - Flash, Gold - SAS, Silver - NL-SAS and Bronze - Tape > tiers with different cost's per Tier > One of the questions that they have asked is, how can they on a > regular basis (6-10min increments), poll the storage array to > determine what capacity is stored in what tier of disk, by company / > user, and export the results (ideally via an API) into their > Reporting and Billing system. They do something similar today for > their VMWare farm (6 minute increments), to provide accountability for > the number of virtual machines they are providing, and would like to > extend this capability to their file storage offering, which today is > based on basic virtual windows file servers > Is anyone doing something similar today? and if so at what granularity? > Andrew Beattie > Software Defined Storage - IT Specialist > Phone: 614-2133-7927 > E-mail: abeattie at au1.ibm.com > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -- This email is confidential in that it is intended for the exclusive attention of the addressee(s) indicated. If you are not the intended recipient, this email should not be read or disclosed to any other person. Please notify the sender immediately and delete this email from your computer system. Any opinions expressed are not necessarily those of the company from which this email was sent and, whilst to the best of our knowledge no viruses or defects exist, no responsibility can be accepted for any loss or damage arising from its receipt or subsequent use of this email. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chekh at stanford.edu Sat Nov 19 00:00:04 2016 From: chekh at stanford.edu (Alex Chekholko) Date: Fri, 18 Nov 2016 16:00:04 -0800 Subject: [gpfsug-discuss] Is anyone performing any kind of Charge back / Show back on Scale today and how do you collect the data In-Reply-To: References: Message-ID: On 11/17/2016 02:55 PM, Andrew Beattie wrote: > Good Morning, > > > I have a large managed services provider in Australia who are looking at > the benefits of deploying Scale for a combination of Object storage and > basic SMB file access. This data is typically historical data rather > than highly accessed production data and the proposed services is > designed to be a low cost option - think a private version of Amazon's > Glacier type offering. The proposed solution will have Platinum - > Flash, Gold - SAS, Silver - NL-SAS and Bronze - Tape tiers with > different cost's per Tier > ... > > Is anyone doing something similar today? and if so at what granularity? We put different customers into their own filesets, and we put hard quotas on the filesets and we charge by the allocated quota and not by the current "usage". I.e. 
their quota is their "usage". IIRC billing is monthy and quota adjustments are manual and infrequent, and I'm guessing the adjustments are pro-rated. -- Alex Chekholko chekh at stanford.edu From olaf.weiser at de.ibm.com Sat Nov 19 07:39:56 2016 From: olaf.weiser at de.ibm.com (Olaf Weiser) Date: Sat, 19 Nov 2016 08:39:56 +0100 Subject: [gpfsug-discuss] mmchdisk performance/behavior in a stretch cluster config? In-Reply-To: <121817.1479495939@turing-police.cc.vt.edu> References: <121817.1479495939@turing-police.cc.vt.edu> Message-ID: An HTML attachment was scrubbed... URL: From SAnderson at convergeone.com Wed Nov 23 01:57:51 2016 From: SAnderson at convergeone.com (Shaun Anderson) Date: Wed, 23 Nov 2016 01:57:51 +0000 Subject: [gpfsug-discuss] SS and TCT setup Message-ID: <6cbcd68ae8f74a0980b0d3a1cb84699c@NACR502.nacr.com> I have a lab environment/sandbox I'm trying to setup TCT. I am getting the error: [root at gpfs42-2 gpfs_rpms]# mmchnode --cloud-gateway-enable -N gpfs42-2 mmchnode: [E] To enable Transparent Cloud Tiering nodes, you must first enable the Transparent Cloud Tiering feature. This feature provides a new level of storage tiering capability to the IBM Spectrum Scale customer. Please contact your IBM Client Technical Specialist (or send an email to scale at us.ibm.com) to review your use case of the Transparent Cloud Tiering feature and to obtain the instructions to enable the feature in your environment. mmchnode: Command failed. Examine previous error messages to determine cause. [root at gpfs42-2 gpfs_rpms]# Does anybody know what the magic is to get this enabled? I'm finding all references point to email scale at us.ibm.com and haven't received a reply. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From kenh at us.ibm.com Wed Nov 23 03:48:17 2016 From: kenh at us.ibm.com (Ken Hill) Date: Tue, 22 Nov 2016 22:48:17 -0500 Subject: [gpfsug-discuss] SS and TCT setup In-Reply-To: <6cbcd68ae8f74a0980b0d3a1cb84699c@NACR502.nacr.com> References: <6cbcd68ae8f74a0980b0d3a1cb84699c@NACR502.nacr.com> Message-ID: Shaun, mmchconfig tctEnable=yes Ken Hill Software Defined Solutions IBM Systems Phone: 1-540-207-7270 E-mail: kenh at us.ibm.com 2300 Dulles Station Blvd Herndon, VA 20171-6133 United States From: Shaun Anderson To: "gpfsug-discuss at spectrumscale.org" Date: 11/22/2016 08:58 PM Subject: [gpfsug-discuss] SS and TCT setup Sent by: gpfsug-discuss-bounces at spectrumscale.org I have a lab environment/sandbox I'm trying to setup TCT. I am getting the error: [root at gpfs42-2 gpfs_rpms]# mmchnode --cloud-gateway-enable -N gpfs42-2 mmchnode: [E] To enable Transparent Cloud Tiering nodes, you must first enable the Transparent Cloud Tiering feature. This feature provides a new level of storage tiering capability to the IBM Spectrum Scale customer. Please contact your IBM Client Technical Specialist (or send an email to scale at us.ibm.com) to review your use case of the Transparent Cloud Tiering feature and to obtain the instructions to enable the feature in your environment. mmchnode: Command failed. Examine previous error messages to determine cause. 
[root at gpfs42-2 gpfs_rpms]# Does anybody know what the magic is to get this enabled? I'm finding all references point to email scale at us.ibm.com and haven't received a reply. SHAUN ANDERSON STORAGE ARCHITECT O 208.577.2112 M 214.263.7014 NOTICE: This email message and any attachments here to may contain confidential information. Any unauthorized review, use, disclosure, or distribution of such information is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy the original message and all copies of it._______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1620 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1596 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1071 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 978 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1563 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1312 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1167 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1425 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1368 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 1243 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/png Size: 4453 bytes Desc: not available URL: From r.sobey at imperial.ac.uk Tue Nov 29 09:59:38 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 29 Nov 2016 09:59:38 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL Message-ID: All, As a general rule, when updating GPFS to a newer release, would you perform a full OS update at the same time, and/or update the kernel too? Just trying to gauge what other people do in this respect. Personally I've always upgraded everything at once - including kernel. Am I looking for trouble? Cheers Richard -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.bolinches at fi.ibm.com Tue Nov 29 10:19:59 2016 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Tue, 29 Nov 2016 10:19:59 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL: From janfrode at tanso.net Tue Nov 29 10:22:41 2016 From: janfrode at tanso.net (Jan-Frode Myklebust) Date: Tue, 29 Nov 2016 11:22:41 +0100 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: Message-ID: I think GPFS upgrades are a fine opportunity to check the FAQ and update to latest tested/supported OS versions. But please remember to check all components in the "Functional Support Matrices", and latest kernel tested. -jf On Tue, Nov 29, 2016 at 10:59 AM, Sobey, Richard A wrote: > All, > > > > As a general rule, when updating GPFS to a newer release, would you > perform a full OS update at the same time, and/or update the kernel too? > > > > Just trying to gauge what other people do in this respect. Personally I?ve > always upgraded everything at once ? including kernel. Am I looking for > trouble? > > > > Cheers > > Richard > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From r.sobey at imperial.ac.uk Tue Nov 29 14:25:03 2016 From: r.sobey at imperial.ac.uk (Sobey, Richard A) Date: Tue, 29 Nov 2016 14:25:03 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: Message-ID: Thank you both. The FAQ simply suggests ?keep your OS up to date? and the referenced minimum kernel version is the one we?re already running so I?ll stick with that for now. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jan-Frode Myklebust Sent: 29 November 2016 10:23 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL I think GPFS upgrades are a fine opportunity to check the FAQ and update to latest tested/supported OS versions. But please remember to check all components in the "Functional Support Matrices", and latest kernel tested. -jf On Tue, Nov 29, 2016 at 10:59 AM, Sobey, Richard A > wrote: All, As a general rule, when updating GPFS to a newer release, would you perform a full OS update at the same time, and/or update the kernel too? Just trying to gauge what other people do in this respect. Personally I?ve always upgraded everything at once ? including kernel. Am I looking for trouble? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From Kevin.Buterbaugh at Vanderbilt.Edu Tue Nov 29 14:33:14 2016 From: Kevin.Buterbaugh at Vanderbilt.Edu (Buterbaugh, Kevin L) Date: Tue, 29 Nov 2016 14:33:14 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: Message-ID: Hi Richard, I would echo the previous comment about having a test cluster where you at least do some basic functionality testing. Also, as I?m sure you?re well aware, a kernel upgrade - whether or not you?re upgrading GPFS versions - is an especially good idea on RHEL systems right now thanks to ?Dirty Cow?. We do not generally install every GPFS PTF as it comes out ? mainly we install PTF?s that fix problems we are encountering ? but when we are upgrading GPFS we also generally take the opportunity to do a full yum update as well. HTHAL? Kevin On Nov 29, 2016, at 8:25 AM, Sobey, Richard A > wrote: Thank you both. 
The FAQ simply suggests "keep your OS up to date" and the referenced minimum kernel version is the one we're already running so I'll stick with that for now. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jan-Frode Myklebust Sent: 29 November 2016 10:23 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL I think GPFS upgrades are a fine opportunity to check the FAQ and update to latest tested/supported OS versions. But please remember to check all components in the "Functional Support Matrices", and latest kernel tested. -jf On Tue, Nov 29, 2016 at 10:59 AM, Sobey, Richard A > wrote: All, As a general rule, when updating GPFS to a newer release, would you perform a full OS update at the same time, and/or update the kernel too? Just trying to gauge what other people do in this respect. Personally I've always upgraded everything at once - including kernel. Am I looking for trouble? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From S.J.Thompson at bham.ac.uk Tue Nov 29 18:27:49 2016 From: S.J.Thompson at bham.ac.uk (Simon Thompson (Research Computing - IT Services)) Date: Tue, 29 Nov 2016 18:27:49 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: , Message-ID: We typically have rolling updates to oses, with the exception of a couple of packages, like the kernel. Partly because we have to keep Scale in line and partly because we use Mellanox OFED and have had issues getting openibd working properly with updates, so we tend to push a kernel, ofed, gpfs update as an os reinstall. We test that on a subset of systems before rolling up. Where we push protocol updates, we have to time a service outage to ensure we can upgrade all smb packages at the same time. We do minor gpfs point releases in planned at risk windows. Simon ________________________________________ From: gpfsug-discuss-bounces at spectrumscale.org [gpfsug-discuss-bounces at spectrumscale.org] on behalf of Sobey, Richard A [r.sobey at imperial.ac.uk] Sent: 29 November 2016 14:25 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL Thank you both. The FAQ simply suggests "keep your OS up to date" and the referenced minimum kernel version is the one we're already running so I'll stick with that for now. Richard From: gpfsug-discuss-bounces at spectrumscale.org [mailto:gpfsug-discuss-bounces at spectrumscale.org] On Behalf Of Jan-Frode Myklebust Sent: 29 November 2016 10:23 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL I think GPFS upgrades are a fine opportunity to check the FAQ and update to latest tested/supported OS versions. But please remember to check all components in the "Functional Support Matrices", and latest kernel tested. -jf On Tue, Nov 29, 2016 at 10:59 AM, Sobey, Richard A > wrote: All, As a general rule, when updating GPFS to a newer release, would you perform a full OS update at the same time, and/or update the kernel too? Just trying to gauge what other people do in this respect. Personally I've always upgraded everything at once - including kernel. Am I looking for trouble? Cheers Richard _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss
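A practical note on the kernel piece of the upgrades discussed above: on RHEL the GPFS portability layer has to be rebuilt against the new kernel (or a matching prebuilt gplbin package installed) before GPFS will start again, which is why the kernel update and the Scale update usually land in the same maintenance window. A minimal per-node sketch, assuming a Scale level recent enough to include mmbuildgpl (on older levels the equivalent is the make Autoconfig / World / InstallImages steps under /usr/lpp/mmfs/src) and that the node can be taken out of service; everything else here is illustrative:

/usr/lpp/mmfs/bin/mmshutdown                    # stop GPFS on this node
yum update kernel kernel-devel kernel-headers   # new kernel plus matching devel/headers
reboot
# after the reboot, rebuild the portability layer against the running kernel
/usr/lpp/mmfs/bin/mmbuildgpl
/usr/lpp/mmfs/bin/mmstartup                     # bring GPFS back up
/usr/lpp/mmfs/bin/mmgetstate -a                 # confirm the node rejoins the cluster

Checking uname -r against the kernel levels listed in the Scale FAQ before the update, as suggested earlier in the thread, avoids landing on a kernel that has not been tested with the release being installed.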
From kevindjo at us.ibm.com Tue Nov 29 18:47:37 2016 From: kevindjo at us.ibm.com (Kevin D Johnson) Date: Tue, 29 Nov 2016 18:47:37 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: , Message-ID: An HTML attachment was scrubbed... URL: From luis.bolinches at fi.ibm.com Tue Nov 29 19:08:43 2016 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Tue, 29 Nov 2016 19:08:43 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: , , Message-ID: An HTML attachment was scrubbed... URL: From nathan.harper at cfms.org.uk Tue Nov 29 20:44:17 2016 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Tue, 29 Nov 2016 20:44:17 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: Message-ID: <904EEBB5-E1DD-4606-993F-7E91ADA1FC37@cfms.org.uk> This is the first I've heard of this max_sectors_kb issue, has it already been discussed on the list? Can you point me to any more info? > On 29 Nov 2016, at 19:08, Luis Bolinches wrote: > > Seen that one on 6.8 too > > the 4096 does NOT work; if storage is XIV then it is 1024 > > > -- > Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations > > Luis Bolinches > Lab Services > http://www-03.ibm.com/systems/services/labservices/ > > IBM Laajalahdentie 23 (main Entrance) Helsinki, 00330 Finland > Phone: +358 503112585 > > "If you continually give you will continually have." Anonymous > > > ----- Original message ----- > From: "Kevin D Johnson" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL > Date: Tue, Nov 29, 2016 8:48 PM > > I have run into the max_sectors_kb issue and creating a file system when moving beyond 3.10.0-327 on RH 7.2 as well. You either have to reinstall the OS or walk the kernel back to 327 via: > > https://access.redhat.com/solutions/186763 > > Kevin D. Johnson, MBA, MAFM > Spectrum Computing, Senior Managing Consultant > > IBM Certified Deployment Professional - Spectrum Scale V4.1.1 > IBM Certified Deployment Professional - Cloud Object Storage V3.8 > IBM Certified Solution Advisor - Spectrum Computing V1 > > 720.349.6199 - kevindjo at us.ibm.com > > > > ----- Original message ----- > From: "Luis Bolinches" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug-discuss at spectrumscale.org > Cc: gpfsug-discuss at spectrumscale.org > Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL > Date: Tue, Nov 29, 2016 5:20 AM > > My 2 cents > > And I am sure different people have different opinions. > > New kernels might be problematic. > > Now got my fun with RHEL 7.3 kernel and max_sectors_kb for new FS. It is something that will come to the FAQ soon. It is already in draft, not public. > > I guess whatever you do .... get a TEST cluster and do it there first, that is the best advice I could give. > > > -- > Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations > > Luis Bolinches > Lab Services > http://www-03.ibm.com/systems/services/labservices/ > > IBM Laajalahdentie 23 (main Entrance) Helsinki, 00330 Finland > Phone: +358 503112585 > > "If you continually give you will continually have."
Anonymous > > > ----- Original message ----- > From: "Sobey, Richard A" > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: "'gpfsug-discuss at spectrumscale.org'" > Cc: > Subject: [gpfsug-discuss] Upgrading kernel on RHEL > Date: Tue, Nov 29, 2016 11:59 AM > > All, > > > > As a general rule, when updating GPFS to a newer release, would you perform a full OS update at the same time, and/or update the kernel too? > > > > Just trying to gauge what other people do in this respect. Personally I've always upgraded everything at once - including kernel. Am I looking for trouble? > > > > Cheers > > Richard > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > Ellei edellä ole toisin mainittu: / Unless stated otherwise above: > Oy IBM Finland Ab > PL 265, 00101 Helsinki, Finland > Business ID, Y-tunnus: 0195876-3 > Registered in Finland > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.bolinches at fi.ibm.com Tue Nov 29 20:56:25 2016 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Tue, 29 Nov 2016 20:56:25 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: <904EEBB5-E1DD-4606-993F-7E91ADA1FC37@cfms.org.uk> References: <904EEBB5-E1DD-4606-993F-7E91ADA1FC37@cfms.org.uk>, Message-ID: An HTML attachment was scrubbed... URL: From nathan.harper at cfms.org.uk Tue Nov 29 20:59:45 2016 From: nathan.harper at cfms.org.uk (Nathan Harper) Date: Tue, 29 Nov 2016 20:59:45 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: <904EEBB5-E1DD-4606-993F-7E91ADA1FC37@cfms.org.uk> Message-ID: Ah, so an issue with the NSD nodes, as opposed to the clients? > On 29 Nov 2016, at 20:56, Luis Bolinches wrote: > > It's been around in certain cases, some kernel <-> storage combinations get hit, some not > > Scott referenced it here https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/General+Parallel+File+System+%28GPFS%29/page/Storage+with+GPFS+on+Linux > > https://access.redhat.com/solutions/2437991 > > It happens also on 7.2 and 7.3 ppc64 (not yet on the list of "supported"); it does not on 7.1. I can confirm this at least for XIV storage, that it can go up to 1024 only. > > I know the FAQ will get updated about this, at least there is a CMVC that states so. > > Long story short, you create a FS, and you see all your paths die and recover and die and recover and ..., one after another. And it never really gets done. Also if you boot from SAN ... well you can figure it out ;)
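For reference, the value being discussed is exposed per block device under sysfs, so an affected kernel/storage combination can be spotted before it starts taking paths down during file system creation. A rough illustration only; the device name, the 1024 cap and the rule file name are examples, not a general recommendation, so check what the storage vendor states for the array:

cat /sys/block/sdX/queue/max_sectors_kb       # current per-device I/O size limit
cat /sys/block/sdX/queue/max_hw_sectors_kb    # hardware ceiling reported for the device
# a udev rule such as /etc/udev/rules.d/99-max-sectors.rules can pin the value at boot,
# e.g. capping SCSI disks at 1024 KB as described above for XIV:
ACTION=="add|change", SUBSYSTEM=="block", KERNEL=="sd*", ATTR{queue/max_sectors_kb}="1024"
udevadm control --reload-rules && udevadm trigger --subsystem-match=block   # apply without a reboot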
> > > -- > Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations > > Luis Bolinches > Lab Services > http://www-03.ibm.com/systems/services/labservices/ > > IBM Laajalahdentie 23 (main Entrance) Helsinki, 00330 Finland > Phone: +358 503112585 > > "If you continually give you will continually have." Anonymous > > > ----- Original message ----- > From: Nathan Harper > Sent by: gpfsug-discuss-bounces at spectrumscale.org > To: gpfsug main discussion list > Cc: > Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL > Date: Tue, Nov 29, 2016 10:44 PM > > This is the first I've heard of this max_sectors_kb issue, has it already been discussed on the list? Can you point me to any more info? > > > >> On 29 Nov 2016, at 19:08, Luis Bolinches wrote: >> >> Seen that one on 6.8 too >> >> the 4096 does NOT work; if storage is XIV then it is 1024 >> >> >> -- >> Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations >> >> Luis Bolinches >> Lab Services >> http://www-03.ibm.com/systems/services/labservices/ >> >> IBM Laajalahdentie 23 (main Entrance) Helsinki, 00330 Finland >> Phone: +358 503112585 >> >> "If you continually give you will continually have." Anonymous >> >> >> ----- Original message ----- >> From: "Kevin D Johnson" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: gpfsug-discuss at spectrumscale.org >> Cc: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL >> Date: Tue, Nov 29, 2016 8:48 PM >> >> I have run into the max_sectors_kb issue and creating a file system when moving beyond 3.10.0-327 on RH 7.2 as well. You either have to reinstall the OS or walk the kernel back to 327 via: >> >> https://access.redhat.com/solutions/186763 >> >> Kevin D. Johnson, MBA, MAFM >> Spectrum Computing, Senior Managing Consultant >> >> IBM Certified Deployment Professional - Spectrum Scale V4.1.1 >> IBM Certified Deployment Professional - Cloud Object Storage V3.8 >> IBM Certified Solution Advisor - Spectrum Computing V1 >> >> 720.349.6199 - kevindjo at us.ibm.com >> >> >> >> ----- Original message ----- >> From: "Luis Bolinches" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: gpfsug-discuss at spectrumscale.org >> Cc: gpfsug-discuss at spectrumscale.org >> Subject: Re: [gpfsug-discuss] Upgrading kernel on RHEL >> Date: Tue, Nov 29, 2016 5:20 AM >> >> My 2 cents >> >> And I am sure different people have different opinions. >> >> New kernels might be problematic. >> >> Now got my fun with RHEL 7.3 kernel and max_sectors_kb for new FS. It is something that will come to the FAQ soon. It is already in draft, not public. >> >> I guess whatever you do .... get a TEST cluster and do it there first, that is the best advice I could give. >> >> >> -- >> Ystävällisin terveisin / Kind regards / Saludos cordiales / Salutations >> >> Luis Bolinches >> Lab Services >> http://www-03.ibm.com/systems/services/labservices/ >> >> IBM Laajalahdentie 23 (main Entrance) Helsinki, 00330 Finland >> Phone: +358 503112585 >> >> "If you continually give you will continually have."
Anonymous >> >> >> ----- Original message ----- >> From: "Sobey, Richard A" >> Sent by: gpfsug-discuss-bounces at spectrumscale.org >> To: "'gpfsug-discuss at spectrumscale.org'" >> Cc: >> Subject: [gpfsug-discuss] Upgrading kernel on RHEL >> Date: Tue, Nov 29, 2016 11:59 AM >> >> All, >> >> >> >> As a general rule, when updating GPFS to a newer release, would you perform a full OS update at the same time, and/or update the kernel too? >> >> >> >> Just trying to gauge what other people do in this respect. Personally I've always upgraded everything at once - including kernel. Am I looking for trouble? >> >> >> >> Cheers >> >> Richard >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> Ellei edellä ole toisin mainittu: / Unless stated otherwise above: >> Oy IBM Finland Ab >> PL 265, 00101 Helsinki, Finland >> Business ID, Y-tunnus: 0195876-3 >> Registered in Finland >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss >> >> >> Ellei edellä ole toisin mainittu: / Unless stated otherwise above: >> Oy IBM Finland Ab >> PL 265, 00101 Helsinki, Finland >> Business ID, Y-tunnus: 0195876-3 >> Registered in Finland >> >> _______________________________________________ >> gpfsug-discuss mailing list >> gpfsug-discuss at spectrumscale.org >> http://gpfsug.org/mailman/listinfo/gpfsug-discuss > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss > > > Ellei edellä ole toisin mainittu: / Unless stated otherwise above: > Oy IBM Finland Ab > PL 265, 00101 Helsinki, Finland > Business ID, Y-tunnus: 0195876-3 > Registered in Finland > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at spectrumscale.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss -------------- next part -------------- An HTML attachment was scrubbed... URL: From luis.bolinches at fi.ibm.com Tue Nov 29 21:01:39 2016 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Tue, 29 Nov 2016 21:01:39 +0000 Subject: [gpfsug-discuss] Upgrading kernel on RHEL In-Reply-To: References: , <904EEBB5-E1DD-4606-993F-7E91ADA1FC37@cfms.org.uk>, Message-ID: An HTML attachment was scrubbed... URL: From Robert.Oesterlin at nuance.com Wed Nov 30 14:34:07 2016 From: Robert.Oesterlin at nuance.com (Oesterlin, Robert) Date: Wed, 30 Nov 2016 14:34:07 +0000 Subject: [gpfsug-discuss] Strategies - servers with local SAS disks Message-ID: <528C481B-632B-4ED9-BA4A-8595FC069DAB@nuance.com> Looking for feedback/strategies in setting up several GPFS servers with local SAS. They would all be part of the same file system. The systems are all similar in configuration - 70 4TB drives. Options I'm considering: - Create RAID arrays of the disks on each server (worried about the RAID rebuild time when a drive fails with 4, 6, 8TB drives) - No RAID with 2 replicas, single drive per NSD. When a drive fails, recreate the NSD - but then I need to fix up the data replication via restripe - FPO -
with multiple failure groups - letting the system manage replica placement and then have GPFS do the restripe on disk failure automatically Comments or other ideas welcome. Bob Oesterlin Sr Principal Storage Engineer, Nuance 507-269-0413 -------------- next part -------------- An HTML attachment was scrubbed... URL: From UWEFALKE at de.ibm.com Wed Nov 30 16:28:24 2016 From: UWEFALKE at de.ibm.com (Uwe Falke) Date: Wed, 30 Nov 2016 17:28:24 +0100 Subject: [gpfsug-discuss] Strategies - servers with local SAS disks In-Reply-To: <528C481B-632B-4ED9-BA4A-8595FC069DAB@nuance.com> References: <528C481B-632B-4ED9-BA4A-8595FC069DAB@nuance.com> Message-ID: I have once set up a small system with just a few SSDs in two NSD servers, providing a scratch file system in a computing cluster. No RAID, two replicas. It works, as long as the admins do not do silly things (like rebooting servers in sequence without checking for disks being up in between). Going for RAIDs without GPFS replication protects you against single disk failures, but you're lost if just one of your NSD servers goes off. FPO only makes sense IMHO if your NSD servers are also processing the data (and then you need to control that somehow). Other ideas? What else can you do with GPFS and local disks than what you considered? I suppose nothing reasonable ... Mit freundlichen Grüßen / Kind regards Dr. Uwe Falke IT Specialist High Performance Computing Services / Integrated Technology Services / Data Center Services ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Rathausstr. 7 09111 Chemnitz Phone: +49 371 6978 2165 Mobile: +49 175 575 2877 E-Mail: uwefalke at de.ibm.com ------------------------------------------------------------------------------------------------------------------------------------------- IBM Deutschland Business & Technology Services GmbH / Geschäftsführung: Frank Hammer, Thorsten Moehring Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht Stuttgart, HRB 17122 From: "Oesterlin, Robert" To: gpfsug main discussion list Date: 11/30/2016 03:34 PM Subject: [gpfsug-discuss] Strategies - servers with local SAS disks Sent by: gpfsug-discuss-bounces at spectrumscale.org Looking for feedback/strategies in setting up several GPFS servers with local SAS. They would all be part of the same file system. The systems are all similar in configuration - 70 4TB drives. Options I'm considering: - Create RAID arrays of the disks on each server (worried about the RAID rebuild time when a drive fails with 4, 6, 8TB drives) - No RAID with 2 replicas, single drive per NSD. When a drive fails, recreate the NSD - but then I need to fix up the data replication via restripe - FPO - with multiple failure groups - letting the system manage replica placement and then have GPFS do the restripe on disk failure automatically Comments or other ideas welcome.
Bob Oesterlin Sr Principal Storage Engineer, Nuance 507-269-0413 _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at spectrumscale.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss From abeattie at au1.ibm.com Wed Nov 30 20:51:12 2016 From: abeattie at au1.ibm.com (Andrew Beattie) Date: Wed, 30 Nov 2016 20:51:12 +0000 Subject: [gpfsug-discuss] Strategies - servers with local SAS disks In-Reply-To: <528C481B-632B-4ED9-BA4A-8595FC069DAB@nuance.com> References: <528C481B-632B-4ED9-BA4A-8595FC069DAB@nuance.com> Message-ID: An HTML attachment was scrubbed... URL:
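Tying the options in this thread to concrete commands: the "no RAID, two replicas" and FPO variants both come down to how the NSD stanza file spreads failure groups across servers, so that GPFS never keeps both copies of a block on the same box. A hedged sketch only; the server names, device paths and file system name below are invented, and a real FPO layout would additionally use a %pool line with allowWriteAffinity=yes:

# disks.stanza - one NSD per local drive, one failure group per server
%nsd: device=/dev/sdb nsd=srv1_sdb servers=gpfs-srv1 usage=dataAndMetadata failureGroup=1 pool=system
%nsd: device=/dev/sdc nsd=srv1_sdc servers=gpfs-srv1 usage=dataAndMetadata failureGroup=1 pool=system
%nsd: device=/dev/sdb nsd=srv2_sdb servers=gpfs-srv2 usage=dataAndMetadata failureGroup=2 pool=system
mmcrnsd -F disks.stanza
# default and maximum replication of 2 for both data and metadata
mmcrfs gpfs01 -F disks.stanza -m 2 -M 2 -r 2 -R 2
# after a failed drive is replaced and its NSD recreated, repair the replication level
mmrestripefs gpfs01 -r

The mmrestripefs run is the cost mentioned for the single-drive-per-NSD option; FPO hands replica placement to GPFS, but the restripe after a disk failure still has to happen.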