From chair at gpfsug.org Wed Mar 1 19:08:19 2023
From: chair at gpfsug.org (chair at gpfsug.org)
Date: Wed, 01 Mar 2023 19:08:19 +0000
Subject: [gpfsug-discuss] GPFS UK Meeting Wednesday 21st June - Thursday 22nd June 2023
Message-ID:

Hi all,

Just a reminder that the next UK User Group meeting will be taking place in London (IBM York Road) on Wednesday 21st and Thursday 22nd June.

Now is your opportunity to help shape the agenda, so please feel free to send me ideas for talks we could ask IBM for (I don't promise we'll get them on the agenda, mind!). Also, if you would like to do a user talk, please get in touch with me and let me know. It could be a large-scale deployment or even just a couple of nodes; it's your opportunity to showcase how you use Spectrum Scale and what for. Every year people tell us how valuable they find the user talks, but this needs YOU, so please do think about whether you are able to offer a talk!

As in the past, we are looking for sponsorship to enable us to run an evening networking event. I've sent out details to those who have sponsored in the past and to those who have asked us directly about sponsorship opportunities. If you either haven't received this or are interested in becoming a sponsor, please email me directly.

Thanks

Paul

From knop at us.ibm.com Thu Mar 2 03:33:19 2023
From: knop at us.ibm.com (Felipe Knop)
Date: Thu, 2 Mar 2023 03:33:19 +0000
Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded
Message-ID: <9ED44F39-4C0B-4640-9F7F-9D6DDBC6DB49@us.ibm.com>

Walter,

Just following up. I just realized that the Salesforce case below has been closed. The support case owner was able to correctly identify the root cause as being the same problem as the one I mentioned below. The fix will be in the upcoming 5.1.7.0 release.

Thanks for opening the case and working with the support team on this one.

Felipe

----
Felipe Knop   knop at us.ibm.com
GPFS Development and Security
IBM Systems
IBM Building 008
2455 South Rd, Poughkeepsie, NY 12601

From: gpfsug-discuss on behalf of Walter Sklenka
Reply-To: gpfsug main discussion list
Date: Wednesday, February 22, 2023 at 5:23 AM
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Hi; sorry for the delay. Our case is TS012184140. They are still analyzing. As soon as I get feedback I will update you.

Kind regards
Walter Sklenka
Technical Consultant
EDV-Design Informationstechnologie GmbH
Giefinggasse 6/1/2, A-1210 Wien
Tel: +43 1 29 22 165-31
Fax: +43 1 29 22 165-90
E-Mail: sklenka at edv-design.at
Internet: www.edv-design.at

From: gpfsug-discuss On Behalf Of Ryan Novosielski
Sent: Friday, 17 February 2023 23:52
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

I talked about it a lot in TS011616986. Part of the problem is we're having a lot of strange problems at the same time, and so the different issues we're having often come together (like one cause shows two symptoms). I can't remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general.
--
#BlackLivesMatter
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'

On Feb 17, 2023, at 04:43, Walter Sklenka wrote:

Hi Ryan and Felipe!
Could you possibly tell me the case number if you remember it? I opened the case and would reference your case ID. Or shall I send you mine?

From: gpfsug-discuss On Behalf Of Ryan Novosielski
Sent: Friday, 17 February 2023 06:43
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Thanks for this, Felipe.

We've started seeing intermittent overdue leases in large numbers and don't otherwise have an explanation for it, other than "look at your network," which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we're not necessarily seeing those things happen at the same times as the lease issues. We've also seen "GPFS Critical Thread Watchdog" recently. We had a case open about it, but didn't draw any real conclusions. If any of our data might be helpful/if there's a case we could reference to see if we're also running into that, we could provide a gpfs.snap.

FWIW, we are running 5.1.3-1 on the storage side (except one system that's about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side.

--
#BlackLivesMatter
 ____
|| \\UTGERS,     |---------------------------*O*---------------------------
||_// the State  |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ  | Office of Advanced Research Computing - MSB C630, Newark
     `'

On Feb 16, 2023, at 12:02, Felipe Knop wrote:

Walter,

Thanks for the detailed description. I don't yet see anything glaringly incorrect in your configuration, but perhaps others might find something out of place.

I'd encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to "loop" for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to "check in".

Capturing gpfs.snap is important, since we'd be looking into all the

[W] ------------------[GPFS Critical Thread Watchdog]------------------

instances.

Thanks,

Felipe

----
Felipe Knop   knop at us.ibm.com
GPFS Development and Security
IBM Systems
IBM Building 008
2455 South Rd, Poughkeepsie, NY 12601

From: gpfsug-discuss on behalf of Walter Sklenka
Reply-To: gpfsug main discussion list
Date: Thursday, February 16, 2023 at 9:16 AM
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Hi Felipe!
Once again me. Thank you very much for the hint. I did not open a PMR yet because I fear they will ask me/us if we are crazy :-)
I did not tell the full story yet. We have a 3-node cluster: 2 NSD servers o1, o2 (same site) and g1 (different site), all running RHEL 8.7. All of them are VMware VMs. o1 and o2 each have 4 NVMe drives passed through; there is a software RAID 5 made over these NVMes, and from that a single NSD was made for a filesystem fs4vm (m,r=2).

[root at ogpfs1 ras]# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         edvdesign-cluster.local
  GPFS cluster id:           12147978822727803186
  GPFS UID domain:           edvdesign-cluster.local
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name     IP address      Admin node name      Designation
 ----------------------------------------------------------------------------
   1   ogpfs1-hs.local      10.20.30.1      ogpfs1-hs.local      quorum-manager-perfmon
   2   ogpfs2-hs.local      10.20.30.2      ogpfs2-hs.local      quorum-manager-perfmon
   3   ggpfsq.mgmt.cloudia  xxxx.other.net  ggpfsq.mgmt.cloudia  quorum-perfmon

[root at ogpfs1 ras]# mmlsconfig
Configuration data for cluster edvdesign-cluster.local:
-------------------------------------------------------
clusterName edvdesign-cluster.local
clusterId 12147978822727803186
autoload yes
profile gpfsProtocolRandomIO
dmapiFileHandleSize 32
minReleaseLevel 5.1.6.0
tscCmdAllowRemoteConnections no
ccrEnabled yes
cipherList AUTHONLY
sdrNotifyAuthEnabled yes
maxblocksize 16M
[cesNodes]
maxMBpS 5000
numaMemoryInterleave yes
enforceFilesetQuotaOnRoot yes
workerThreads 512
[common]
tscCmdPortRange 60000-61000
[srv]
verbsPorts mlx5_0/1 mlx5_1/1
[common]
cesSharedRoot /fs4vmware/cesSharedRoot
[srv]
maxFilesToCache 10000
maxStatCache 20000
[common]
verbsRdma enable
[ggpfsq]
verbsRdma disable
[common]
verbsRdmaSend yes
[ggpfsq]
verbsRdmaSend no
[common]
verbsRdmaCm enable
[ggpfsq]
verbsRdmaCm disable
[srv]
pagepool 32G
[common]
adminMode central

File systems in cluster edvdesign-cluster.local:
------------------------------------------------
/dev/fs4vm

[root at ogpfs1 ras]# mmlsdisk fs4vm -L
disk         driver   sector     failure holds    holds                                    storage
name         type       size       group metadata data  status        availability disk id pool         remarks
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ ---------
ogpfs1_1     nsd         512           1 yes      yes   ready         up                 1 system       desc
ogpfs2_1     nsd         512           2 yes      yes   ready         up                 2 system       desc
ggpfsq_qdisk nsd         512          -1 no       no    ready         up                 3 system       desc
Number of quorum disks: 3
Read quorum value:      2
Write quorum value:     2

And the two nodes o1 and o2 export the filesystem via CES NFS functions (for VMware). I think this isn't supported, that an NSD server is also a CES node?
And finally the RDMA network: both NSD servers also have a Mellanox ConnectX-6 Lx dual-port 25Gb adapter, also via passthrough, and these interfaces we configured for RDMA (RoCE). Last but not least: this network is not switched but direct attached (2x25Gb directly connected between the NSD nodes).

RDMA Connections between nodes:
  Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0
    hostname        idx CM state VS buff  RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR)  VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT
    ogpfs2-hs.local   0 Y  RTS   (Y)256   478202 (0 )  12728       67024       8864789(0 ) 22776     4643      0             0
  Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0
    hostname        idx CM state VS buff  RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR)  VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT
    ogpfs2-hs.local   1 Y  RTS   (Y)256   477659 (0 )  12489       67034       8864773(0 ) 22794     4639      0             0
[root at ogpfs1 ras]#

You mentioned that it might be CPU contention: maybe due to the VM layer (scheduling with other VMs)? And a wrong layout of the VMs (8 vCPUs and 64GB memory; the ESXi hosts are only single socket with 32/64 cores HT)? And is the direct-attached RDMA (+ daemon) network also not good?
Do you think IBM would say no to checking such a configuration?

Best regards
Walter

From: gpfsug-discuss On Behalf Of Felipe Knop
Sent: Wednesday, 15 February 2023 15:59
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Walter,

Thanks for the details. The stack trace below captures the lease thread in the middle of sending the "lease" RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace does not show whether there was anything blocking the thread prior to the point where the RPCs are sent.

At a first glance:

2023-02-14_19:44:07.430+0100: [W]   counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10)

I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on.

Could you open a case to get debug data collected? If the problem can be recreated, I think we'll need a recreate of the problem with traces enabled.

Thanks,

Felipe

----
Felipe Knop   knop at us.ibm.com
GPFS Development and Security
IBM Systems
IBM Building 008
2455 South Rd, Poughkeepsie, NY 12601
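As a general aside on spotting that kind of contention from inside a guest, a minimal sketch (assuming the sysstat package is installed; mmfsd is the GPFS daemon process) is:

  # voluntary vs. involuntary context switches for mmfsd, sampled every 5 seconds
  pidstat -w -p $(pgrep -x mmfsd) 5
  # the "st" column is CPU time stolen by the hypervisor from this guest
  vmstat 5

A steadily non-zero nvcswch/s or steal value would at least support the CPU-contention theory for a virtualized NSD server.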
From: gpfsug-discuss on behalf of Walter Sklenka
Reply-To: gpfsug main discussion list
Date: Wednesday, February 15, 2023 at 4:23 AM
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Hi!
This is a "full" sequence in mmfs.log.latest. Fortunately this was also the last event until now (yesterday evening). Maybe you can have a look?

2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------
2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds
2023-02-14_19:44:07.430+0100: [W]   counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10)
2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294):
2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsigned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0
2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, unsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0
2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0
2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0
2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0
2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0
2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0
2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0
2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0
2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0
2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0
2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster.

Thank you very much!
Best regards
Walter

From: gpfsug-discuss On Behalf Of Felipe Knop
Sent: Wednesday, 15 February 2023 00:06
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

All,

These messages like

[W] ------------------[GPFS Critical Thread Watchdog]------------------

indicate that a "critical thread", in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention.

Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one?

Felipe

----
Felipe Knop   knop at us.ibm.com
GPFS Development and Security
IBM Systems
IBM Building 008
2455 South Rd, Poughkeepsie, NY 12601
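For anyone wanting to gather such samples before opening a case, a minimal sketch, assuming the default /var/adm/ras log location, is:

  # collect every watchdog event plus the lines that follow it (the stack trace)
  grep -A 25 "GPFS Critical Thread Watchdog" /var/adm/ras/mmfs.log.* > /tmp/watchdog-instances.txt
  # gather the full logs and cluster state for the support case
  /usr/lpp/mmfs/bin/gpfs.snap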
From: gpfsug-discuss on behalf of Walter Sklenka
Reply-To: gpfsug main discussion list
Date: Tuesday, February 14, 2023 at 10:49 AM
To: "gpfsug-discuss at gpfsug.org"
Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Hi!
I started with 5.1.6.0 and now am at

[root at ogpfs1 ~]# mmfsadm dump version
Dump level: verbose
Build branch "5.1.6.1 ".

The messages started from the beginning.

From: gpfsug-discuss On Behalf Of Christian Vieser
Sent: Tuesday, 14 February 2023 15:34
To: gpfsug-discuss at gpfsug.org
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

What version of Spectrum Scale is running there? Do these errors appear since your last version update?

On 14.02.23 14:09, Walter Sklenka wrote:

Dear Colleagues!
May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for disk lease threads? Is this a "local node" problem or a network problem?
I see these messages sometimes when NSD servers, which also serve as NFS servers, come under heavy NFS load.
Following is an excerpt from mmfs.log.latest:

2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------
2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds
2023-02-14_12:06:53.600+0100: [W]   counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8)
2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294):
2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0
2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0
2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive
2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0
2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y

Kind regards
Walter Sklenka
Technical Consultant

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniel.kidger at hpe.com Thu Mar 2 08:31:33 2023
From: daniel.kidger at hpe.com (Kidger, Daniel)
Date: Thu, 2 Mar 2023 08:31:33 +0000
Subject: [gpfsug-discuss] GPFS UK Meeting Wednesday 21st June - Thursday 22nd June 2023
In-Reply-To:
References:
Message-ID:

"it's your opportunity to showcase how you use Spectrum Scale and what for."

Spectrum Scale? I haven't heard the software being called that in a long time... :-)
#StorageScale

Daniel

Daniel Kidger
HPC Storage Solutions Architect, EMEA
daniel.kidger at hpe.com
+44 (0)7818 522266
hpe.com

-----Original Message-----
From: gpfsug-discuss On Behalf Of chair at gpfsug.org
Sent: 01 March 2023 19:08
To: Gpfsug Discuss
Subject: [gpfsug-discuss] GPFS UK Meeting Wednesday 21st June - Thursday 22nd June 2023

Hi all,

Just a reminder that the next UK User Group meeting will be taking place in London (IBM York Road) on Wednesday 21st and Thursday 22nd June.

Now is your opportunity to help shape the agenda, so please feel free to send me ideas for talks we could ask IBM for (I don't promise we'll get them on the agenda, mind!). Also, if you would like to do a user talk, please get in touch with me and let me know. It could be a large-scale deployment or even just a couple of nodes; it's your opportunity to showcase how you use Spectrum Scale and what for. Every year people tell us how valuable they find the user talks, but this needs YOU, so please do think about whether you are able to offer a talk!

As in the past, we are looking for sponsorship to enable us to run an evening networking event. I've sent out details to those who have sponsored in the past and to those who have asked us directly about sponsorship opportunities. If you either haven't received this or are interested in becoming a sponsor, please email me directly.

Thanks

Paul
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From WPeters at ATPCO.NET Thu Mar 2 16:26:02 2023
From: WPeters at ATPCO.NET (Bill Peters)
Date: Thu, 2 Mar 2023 16:26:02 +0000
Subject: [gpfsug-discuss] spanning datacenter to AWS
In-Reply-To:
References:
Message-ID:

Thanks Fred,

It took a while, but it looks like you are correct. I will need to upgrade to 5.1.2.7 for a fix.

Thanks,

-Bill

From: gpfsug-discuss On Behalf Of Frederick Stock
Sent: Wednesday, February 22, 2023 2:01 PM
To: gpfsug main discussion list; gpfsug-discuss at spectrumscale.org
Subject: Re: [gpfsug-discuss] spanning datacenter to AWS

Bill, if my memory serves me correctly, there was a fix done in later versions of Scale (there may be an efix available) for the situation you described. Notably, Scale was not properly propagating information about files created through NFS. I suggest you contact Scale support to see if they can provide more details, as well as options for obtaining the fix, assuming my mind has not completely failed me on this issue.

Fred
Fred Stock, Spectrum Scale Development Advocacy
stockf at us.ibm.com | 720-430-8821

From: gpfsug-discuss on behalf of Bill Peters
Date: Wednesday, February 22, 2023 at 12:53 PM
To: gpfsug-discuss at spectrumscale.org
Subject: [EXTERNAL] [gpfsug-discuss] spanning datacenter to AWS
Hello all,

I've been on the mailing list for a few years but have not been active except my introduction email. We are having an issue I'd like to run past everyone and see if anyone has experience that may help.

Currently using Spectrum Scale Data Management Edition 5.1.1.0.

Our Spectrum Scale cluster is running on Linux VMs on IBM z/VM. We have one application that cannot support the z/VM architecture, so we used to have those servers running on VMware in our datacenter, and those servers were client nodes in the Spectrum Scale cluster. This configuration worked great.

We recently retired VMware and moved all that workload to AWS. Because this was no longer on our LAN, we thought it would be a good idea (IBM support also recommended it) to use CES NFS rather than adding the AWS instances to the cluster. Since doing this we have seen problems under high IO. Some NFS clients will try to access files that don't seem to be there, resulting in file-not-found errors. We know the files have been created but the NFS clients can't see them. The read process runs successfully shortly after. We are not saturating our AWS connection.

I haven't seen any NFS tuning that looks like it would help, but that is an option I would be willing to try. The other option I'm thinking about is just adding the NFS clients to the cluster. Has anyone spanned datacenters like this?

Thanks, any help is appreciated.

-Bill

Bill Peters
Senior Platform Engineer
703-475-3386
wpeters at atpco.net
atpco.net
45005 Aviation Drive
Dulles, VA 20166

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chair at gpfsug.org Fri Mar 3 11:43:49 2023
From: chair at gpfsug.org (chair at gpfsug.org)
Date: Fri, 03 Mar 2023 11:43:49 +0000
Subject: [gpfsug-discuss] GPFS UK Meeting Wednesday 21st June - Thursday 22nd June 2023 - Postponed
In-Reply-To:
References:
Message-ID: <852d6fde37de92b95917092171e1a008@gpfsug.org>

Dear All,

Due to an issue with the IBM Centre, we have had to postpone the planned meeting in June and will update you with a new date in the near future.

Sorry for this.

Regards

Paul

From dmagda+gpfs at ee.torontomu.ca Fri Mar 3 19:29:11 2023
From: dmagda+gpfs at ee.torontomu.ca (David Magda)
Date: Fri, 3 Mar 2023 14:29:11 -0500
Subject: [gpfsug-discuss] kernel updates and GPFS modules: manual, DKMS, cron, other?
Message-ID: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>

Hello,

I am a new user of GPFS (Spectrum Scale) and would like to know if there is a "best practice" on handling kernel updates on HPC clients.
We are running Ubuntu 18.04 and 20.04 clients with 5.1.x, talking to RHEL storage servers, and would like to know how to handle re-compiling the client-side kernel modules.

There is of course the "mmbuildgpl" utility:

https://www.ibm.com/docs/en/spectrum-scale/5.1.3?topic=reference-mmbuildgpl-command

but how do folks invoke it? Manually, via cron at night or on reboot, via some kind of apt (dpkg-trigger(1)) / RPM hook?

We have the "unattended-upgrades" package enabled, which only installs security-tagged updates by default, but sometimes this does include kernel updates, which may become active on the next reboot:

https://packages.ubuntu.com/search?keywords=unattended-upgrades

So is there a best practice? Has someone invented this wheel that I could leverage, or will I have to invent it myself?

Thanks for any info.

--
David Magda

From christof.schmitt at us.ibm.com Fri Mar 3 19:59:40 2023
From: christof.schmitt at us.ibm.com (Christof Schmitt)
Date: Fri, 3 Mar 2023 19:59:40 +0000
Subject: [gpfsug-discuss] kernel updates and GPFS modules: manual, DKMS, cron, other?
In-Reply-To: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>
References: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>
Message-ID: <1bff90bbe0bb5052192a5d8769fef2c31aec9c21.camel@us.ibm.com>

On Fri, 2023-03-03 at 14:29 -0500, David Magda wrote:
> Hello,
>
> I am a new user of GPFS (Spectrum Scale) and would like to know if
> there is a "best practice" on handling kernel updates on HPC clients.
> We are running Ubuntu 18.04 and 20.04 clients with 5.1.x, talking to
> RHEL storage servers, and would like to know how to handle re-
> compiling the client-side kernel modules.
>
> There is of course the "mmbuildgpl" utility:
>
> https://www.ibm.com/docs/en/spectrum-scale/5.1.3?topic=reference-mmbuildgpl-command
>
> but how do folks invoke it? Manually, via cron at night or on reboot,
> via some kind of apt (dpkg-trigger(1)) / RPM hook?

There is the Scale option:

mmchconfig autoBuildGPL=yes

When Scale starts, it checks whether the kernel modules are available for the current kernel; and if not, mmbuildgpl is run before actually starting the daemon. That should be one way to solve this.

See https://www.ibm.com/docs/en/spectrum-scale/5.1.7?topic=reference-mmchconfig-command

Regards,
Christof
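A minimal sketch of what that can look like on an Ubuntu client follows; the package list is illustrative, and autoBuildGPL assumes a compiler and matching kernel headers are already present on the node:

  # one-time: build prerequisites for mmbuildgpl
  apt-get install -y gcc g++ make linux-headers-$(uname -r)
  # rebuild the portability layer automatically when the daemon starts on a new kernel
  mmchconfig autoBuildGPL=yes
  # after the next kernel update and reboot, verify which build is loaded
  mmfsadm dump version

The obvious trade-off, raised later in the thread, is that this keeps a compiler toolchain on every client node.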
From jonathan.buzzard at strath.ac.uk Fri Mar 3 20:05:30 2023
From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard)
Date: Fri, 3 Mar 2023 20:05:30 +0000
Subject: [gpfsug-discuss] kernel updates and GPFS modules: manual, DKMS, cron, other?
In-Reply-To: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>
References: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>
Message-ID:

On 03/03/2023 19:29, David Magda wrote:
>
> Hello,
>
> I am a new user of GPFS (Spectrum Scale) and would like to know if
> there is a "best practice" on handling kernel updates on HPC clients.
> We are running Ubuntu 18.04 and 20.04 clients with 5.1.x, talking to
> RHEL storage servers, and would like to know how to handle
> re-compiling the client-side kernel modules.
>
> There is of course the "mmbuildgpl" utility:
>
> https://www.ibm.com/docs/en/spectrum-scale/5.1.3?topic=reference-mmbuildgpl-command
>
> but how do folks invoke it? Manually, via cron at night or on reboot,
> via some kind of apt (dpkg-trigger(1)) / RPM hook?
>
> We have the "unattended-upgrades" package enabled, which only
> installs security-tagged updates by default, but sometimes this does
> include kernel updates, which may become active on the next reboot:
>
> https://packages.ubuntu.com/search?keywords=unattended-upgrades

I would suggest that you disable any automatic upgrading of the kernel. Kernel upgrades should *only* be done *after* you have verified that it will work. If you don't, it is only a matter of time before a security update breaks GPFS. There was at least one instance of that happening in the last five years.

>
> So is there a best practice? Has someone invented this wheel that I
> could leverage, or will I have to invent it myself?
>

I use an RPM package called gpfs-helper that I created. It installs a local "helper" to the systemd gpfs unit file

/etc/systemd/system/gpfs.service.d/install-module.conf

[Service]
ExecStartPre=-/usr/bin/dnf --assumeyes --enablerepo dssg install gpfs.gplbin-%v

This causes the correct gpfs.gplbin RPM to be installed when starting GPFS. If the correct one is already installed it does nothing; otherwise it downloads the correct RPM and installs it before attempting to start GPFS. I am pretty sure you could do the same with apt. The %v is the magic bit which basically matches the running kernel.

So after testing that GPFS works on the new kernel, I build the RPM, put it in the local repo and winner winner chicken dinner.

JAB.

--
Jonathan A. Buzzard                         Tel: +44141-5483420
HPC System Administrator, ARCHIE-WeSt.
University of Strathclyde, John Anderson Building, Glasgow. G4 0NG

From Marcy.D.Cortes at wellsfargo.com Fri Mar 3 21:10:02 2023
From: Marcy.D.Cortes at wellsfargo.com (Marcy.D.Cortes at wellsfargo.com)
Date: Fri, 3 Mar 2023 21:10:02 +0000
Subject: [gpfsug-discuss] kernel updates and GPFS modules: manual, DKMS, cron, other?
In-Reply-To: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>
References: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca>
Message-ID: <411ca83df2a74e53a8ceef846e3dd04d@wellsfargo.com>

We build the rpm and it gets installed if needed by a systemd script that puts it on and then restarts GPFS. Apps are dependent on a health check startup script.

To use that auto build you have to have a compiler installed, and we avoid that on production servers.
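For the Ubuntu clients in the original question, an apt-based equivalent of this drop-in might look like the sketch below. It assumes gpfs.gplbin packages have been built with "mmbuildgpl --build-package" and published to a local apt repository; the package naming is an assumption, not something shipped this way by IBM.

  # /etc/systemd/system/gpfs.service.d/install-module.conf  (hypothetical Ubuntu variant)
  [Service]
  # %v expands to the running kernel release (uname -r); the leading "-" ignores failures
  ExecStartPre=-/usr/bin/apt-get install -y gpfs.gplbin-%v

After validating a new kernel, the matching package would be built once and pushed to the local repository before the kernel is rolled out to the clients; run systemctl daemon-reload after adding the drop-in.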
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Walter.Sklenka at EDV-Design.at Sun Mar 5 15:10:17 2023
From: Walter.Sklenka at EDV-Design.at (Walter Sklenka)
Date: Sun, 5 Mar 2023 15:10:17 +0000
Subject: [gpfsug-discuss] detectIpPairAggressiveness RE: Reasons for DiskLeaseThread Overloaded
In-Reply-To: <9ED44F39-4C0B-4640-9F7F-9D6DDBC6DB49@us.ibm.com>
References: <9ED44F39-4C0B-4640-9F7F-9D6DDBC6DB49@us.ibm.com>
Message-ID:

Hi Felipe!
Yes, I am very sorry that I answer with such a delay!! This was the response from the support:

"The IP pair connectivity detection is to address this issue: if there is more than one IP pair between a pair of nodes, and one of the IP pairs has some problems, and a disk lease request or reply happens to be sent on this IP pair, a disk lease overdue could happen, since TCP has a very long retransmit timeout. But since the other IP pairs are in good condition, we should avoid the node expel and send the disk lease via the other good IP pairs. So, when sending a disk lease or reply, we detect the connectivity of the IP pair; if it is in good condition, the disk lease or reply is sent, otherwise we try other IP pairs for sending.

detectIpPairAggressiveness (undocumented configuration parameter) controls whether we do the detection. Here there is only ONE IP pair, so actually we don't need to check the IP pair connectivity. But we still need ping to work, since we have other places that do a ping check, like when a disk lease overdue happens."

---------------------------

Until the 5.1.7 code becomes available in early March, detectIpPairAggressiveness can be disabled:

echo 999 | mmchconfig detectIpPairAggressiveness=0 -i

THANK YOU VERY MUCH, Felipe!!!

Best regards
Walter
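One way to confirm that the undocumented parameter actually took effect is to dump the live daemon configuration; this is only a sketch, and reverting it after the 5.1.7 upgrade (for example with detectIpPairAggressiveness=DEFAULT) is an assumption best confirmed with support:

  # show the value currently in effect in the running daemon
  mmfsadm dump config | grep -i detectIpPairAggressiveness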
I just realized that the SalesForce case below has been closed. The support case owner was correctly able to identify the root cause as being the same problem as the one I mentioned below. The fix will be in the upcoming 5.1.7.0 release. Thanks for opening the case and working with the support team on this one. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 22, 2023 at 5:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss > Im Auftrag von Ryan Novosielski Gesendet: Friday, February 17, 2023 11:52 PM An: gpfsug main discussion list > Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded I talked about it a lot in TS011616986. Part of the problem is we?re having a lot of strange problems at the same time, and so the different issues we?re having often come together (like one cause shows two symptoms). I can?t remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 17, 2023, at 04:43, Walter Sklenka > wrote: Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss > On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. 
FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). (rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. 
a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. 
At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) 
+ 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Sun Mar 5 15:11:38 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Sun, 5 Mar 2023 15:11:38 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded // detectIpPairAggressiveness In-Reply-To: References: <9ED44F39-4C0B-4640-9F7F-9D6DDBC6DB49@us.ibm.com> Message-ID: <2febe7ed49c240c0b458961e48b87904@Mail.EDVDesign.cloudia> Hi Felipe! Yes . I am very sorry that I answer with such a delay!! This was the response from the Support ? 
The IP pair connectivity detection is to address this issue: If there are more than one IP pair between a pair of nodes, if one of the IP pair has some problems, and disk lease request or reply happens be sent on this IP pair, disk lease overdue could happen since TCP has very long retransmit timeout, but since the other IP pairs are in good condition, we should avoid the node expel and send disk lease via other good IP pairs. So, when sending disk lease and reply, we will detect the connectivity of the IP pair, if it?s in good condition, disk lease and reply will be sent, otherwise, we will try other IP pairs for sending. detectIpPairAggressiveness (undocumented configuration parameter) control whether we do detection, here there is only ONE IP pair, actually, we don't need to check the IP pair connectivity. But we still need ping to work since we have other places to do ping check, like when disk lease overdue happens. --------------------------- until the 5.1.7 code will be available early march detectIpPairAggressiveness may can be disabled. echo 999 | mmchconfig detectIpPairAggressiveness=0 -i THANK YOU VERY MUCH, Felipe!!! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Donnerstag, 2. M?rz 2023 04:33 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Just following up. I just realized that the SalesForce case below has been closed. The support case owner was correctly able to identify the root cause as being the same problem as the one I mentioned below. The fix will be in the upcoming 5.1.7.0 release. Thanks for opening the case and working with the support team on this one. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 22, 2023 at 5:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss > Im Auftrag von Ryan Novosielski Gesendet: Friday, February 17, 2023 11:52 PM An: gpfsug main discussion list > Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded I talked about it a lot in TS011616986. Part of the problem is we?re having a lot of strange problems at the same time, and so the different issues we?re having often come together (like one cause shows two symptoms). I can?t remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general. 
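(As a practical aside for anyone hitting the same symptom before 5.1.7.0: a minimal sketch of applying and later reverting the workaround quoted in the support response above. The mmchconfig and mmdiag commands are standard; the "echo 999" prefix is copied verbatim from the support note, and whether this undocumented parameter is accepted at all may depend on the exact 5.1.6.x build.)

# Apply the support-suggested workaround; -i makes it take effect immediately
# and persist, without restarting the daemons.
echo 999 | mmchconfig detectIpPairAggressiveness=0 -i

# Check the value the daemon is actually using on this node
# (undocumented parameters may or may not be listed here).
mmdiag --config | grep -i detectIpPairAggressiveness

# After upgrading to 5.1.7.0, which carries the real fix, return to the default.
echo 999 | mmchconfig detectIpPairAggressiveness=DEFAULT -i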
-- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 17, 2023, at 04:43, Walter Sklenka > wrote: Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss > On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? 
I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). (rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? 
And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
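(For anyone chasing the same symptom, a rough sketch of how to gather the watchdog samples asked for above and to check the excessive-CPU-load angle on the affected node. The log path is the default /var/adm/ras location that appears elsewhere in this thread; adjust it if your logs live somewhere else.)

# Pull every Critical Thread Watchdog event plus the surrounding lease messages
# from the local GPFS daemon log.
grep -E 'GPFS Critical Thread Watchdog|DiskLeaseThread|Disk lease period expired|Disk lease reacquired' /var/adm/ras/mmfs.log.latest

# While the problem is live: long-running waiters and overall node health.
mmdiag --waiters
mmhealth node show

# CPU pressure on the node; on a VM, a high "st" (steal) column in vmstat
# points at the hypervisor descheduling the guest, one possible source of
# the CPU contention mentioned above.
vmstat 1 10

# Full support bundle for the case.
/usr/lpp/mmfs/bin/gpfs.snap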
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Mon Mar 6 11:04:21 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Mon, 6 Mar 2023 11:04:21 +0000 Subject: [gpfsug-discuss] flash NVME / pool system and data together or separated In-Reply-To: <2febe7ed49c240c0b458961e48b87904@Mail.EDVDesign.cloudia> References: <9ED44F39-4C0B-4640-9F7F-9D6DDBC6DB49@us.ibm.com> <2febe7ed49c240c0b458961e48b87904@Mail.EDVDesign.cloudia> Message-ID: <286790f1c9544bd6a27e9fdd61758f42@Mail.EDVDesign.cloudia> Hi! What would you suggest when using If we have a large flash storage: Does it make sense to split metdata from data? Are there any disadvantages when putty meta and data into the same disk pool (fastes technology) Best regards Walter Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss Im Auftrag von Walter Sklenka Gesendet: Sunday, March 5, 2023 4:12 PM An: gpfsug main discussion list Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded // detectIpPairAggressiveness Hi Felipe! Yes . I am very sorry that I answer with such a delay!! This was the response from the Support ? The IP pair connectivity detection is to address this issue: If there are more than one IP pair between a pair of nodes, if one of the IP pair has some problems, and disk lease request or reply happens be sent on this IP pair, disk lease overdue could happen since TCP has very long retransmit timeout, but since the other IP pairs are in good condition, we should avoid the node expel and send disk lease via other good IP pairs. So, when sending disk lease and reply, we will detect the connectivity of the IP pair, if it?s in good condition, disk lease and reply will be sent, otherwise, we will try other IP pairs for sending. detectIpPairAggressiveness (undocumented configuration parameter) control whether we do detection, here there is only ONE IP pair, actually, we don't need to check the IP pair connectivity. But we still need ping to work since we have other places to do ping check, like when disk lease overdue happens. --------------------------- until the 5.1.7 code will be available early march detectIpPairAggressiveness may can be disabled. 
echo 999 | mmchconfig detectIpPairAggressiveness=0 -i THANK YOU VERY MUCH, Felipe!!! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Donnerstag, 2. M?rz 2023 04:33 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Just following up. I just realized that the SalesForce case below has been closed. The support case owner was correctly able to identify the root cause as being the same problem as the one I mentioned below. The fix will be in the upcoming 5.1.7.0 release. Thanks for opening the case and working with the support team on this one. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 22, 2023 at 5:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss > Im Auftrag von Ryan Novosielski Gesendet: Friday, February 17, 2023 11:52 PM An: gpfsug main discussion list > Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded I talked about it a lot in TS011616986. Part of the problem is we?re having a lot of strange problems at the same time, and so the different issues we?re having often come together (like one cause shows two symptoms). I can?t remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 17, 2023, at 04:43, Walter Sklenka > wrote: Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss > On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. 
If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). (rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. 
a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. 
At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) 
+ 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Mon Mar 6 16:28:25 2023 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Mon, 6 Mar 2023 16:28:25 +0000 Subject: [gpfsug-discuss] kernel updates and GPFS modules: manual, DKMS, cron, other? In-Reply-To: References: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca> Message-ID: <6F466A3D-420B-4B66-9595-58B7846922D2@rutgers.edu> On Mar 3, 2023, at 15:05, Jonathan Buzzard wrote: I would suggest that you disable any automatic upgrading of the kernel. Kernel upgrades should *only* be done *after* you have verified that it will work. If you don't it is only a matter of time before a security update breaks GPFS. There was at least one instance of that happing in the last five years. 
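(One common way to implement the "verify first, then upgrade" approach described above is to pin the kernel until the new level has been checked against the Scale FAQ and the portability layer has been rebuilt and tested. A minimal sketch for RHEL 8 with dnf follows; the versionlock plugin package name is an assumption, so check what your repositories actually provide.)

# Pin the kernel at the currently verified level.
dnf install python3-dnf-plugin-versionlock    # plugin package name: assumption for RHEL 8
dnf versionlock add 'kernel*'

# Once a newer kernel has been verified as supported:
dnf versionlock delete 'kernel*'
dnf update 'kernel*'
reboot

# After booting the new kernel, rebuild the GPFS portability layer and,
# if GPFS is not set to autoload, start it again on the node.
/usr/lpp/mmfs/bin/mmbuildgpl
mmstartup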
Definitely more than once, and technically you should not upgrade at all unless the kernel is specifically listed as supported on this page, otherwise you may get to be the one that finds the non-obvious compatibility bug: https://www.ibm.com/docs/en/spectrum-scale?topic=STXKQY/gpfsclustersfaq.html As a result, we typically don?t bother with any automatic updating of GPFS, because if we?re going to upgrade either GPFS or the kernel, we already know about it and do it on purpose. We also build the RPMs on one system, since the vast majority of our equipment does not have the full compiler set installed. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' -------------- next part -------------- An HTML attachment was scrubbed... URL: From skylar2 at uw.edu Mon Mar 6 16:38:58 2023 From: skylar2 at uw.edu (Skylar Thompson) Date: Mon, 6 Mar 2023 08:38:58 -0800 Subject: [gpfsug-discuss] kernel updates and GPFS modules: manual, DKMS, cron, other? In-Reply-To: <6F466A3D-420B-4B66-9595-58B7846922D2@rutgers.edu> References: <0BB7CBDB-1DFF-4EE1-9306-2C6C3C64D66C@ee.torontomu.ca> <6F466A3D-420B-4B66-9595-58B7846922D2@rutgers.edu> Message-ID: <20230306163858.ubwp5lkwmqoqy6db@thargelion> Yep, this is our strategy too. We version-lock GPFS and the kernel in yum and define the target versions in our configuration management (Puppet), with the version-specific gplbin packages stored in our local yum repo. When we want to upgrade, we schedule an outage, build a new package, put in the repo, then bump the target versions in Puppet and reboot. On Mon, Mar 06, 2023 at 04:28:25PM +0000, Ryan Novosielski wrote: > On Mar 3, 2023, at 15:05, Jonathan Buzzard wrote: > > I would suggest that you disable any automatic upgrading of the kernel. Kernel upgrades should *only* be done *after* you have verified that it will work. > > If you don't it is only a matter of time before a security update breaks GPFS. There was at least one instance of that happing in the last five years. > > Definitely more than once, and technically you should not upgrade at all unless the kernel is specifically listed as supported on this page, otherwise you may get to be the one that finds the non-obvious compatibility bug: > > https://urldefense.com/v3/__https://www.ibm.com/docs/en/spectrum-scale?topic=STXKQY*gpfsclustersfaq.html__;Lw!!K-Hz7m0Vt54!m8GZ2APARwPJKv-TGg_fh2W4XPf_2ZHMVCXR97Wdf-lcpQ9X7zGpaUeKCI0KsEX_Vrt8__J-OROZ9CEPjiGG$ > > As a result, we typically don???t bother with any automatic updating of GPFS, because if we???re going to upgrade either GPFS or the kernel, we already know about it and do it on purpose. We also build the RPMs on one system, since the vast majority of our equipment does not have the full compiler set installed. > > -- > #BlackLivesMatter > ____ > || \\UTGERS, |---------------------------*O*--------------------------- > ||_// the State | Ryan Novosielski - novosirj at rutgers.edu > || \\ University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus > || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark > `' > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > https://urldefense.com/v3/__http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org__;!!K-Hz7m0Vt54!m8GZ2APARwPJKv-TGg_fh2W4XPf_2ZHMVCXR97Wdf-lcpQ9X7zGpaUeKCI0KsEX_Vrt8__J-OROZ9Ny2WL5J$ -- -- Skylar Thompson (skylar2 at u.washington.edu) -- Genome Sciences Department (UW Medicine), System Administrator -- Foege Building S046, (206)-685-7354 -- Pronouns: He/Him/His From dmagda+gpfs at ee.torontomu.ca Mon Mar 6 21:36:09 2023 From: dmagda+gpfs at ee.torontomu.ca (David Magda) Date: Mon, 6 Mar 2023 16:36:09 -0500 Subject: [gpfsug-discuss] Mailman interface HTTP 500 Message-ID: To whomever runs the web server: Apache (?) seems to be throwing errors when I go to Mailman web interface: https://gpfsug.org/mailman/listinfo/ https://www.gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org https://www.spectrumscaleug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org The archives are fine (since they?re probably just flat HTML files and not CGI): https://www.gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/ https://www.spectrumscaleug.org/pipermail/gpfsug-discuss_gpfsug.org/ -- David Magda From systems at gpfsug.org Tue Mar 7 10:22:11 2023 From: systems at gpfsug.org (systems at gpfsug.org) Date: Tue, 7 Mar 2023 10:22:11 -0000 Subject: [gpfsug-discuss] Mailman interface HTTP 500 In-Reply-To: References: Message-ID: <095301d950de$ae1e8da0$0a5ba8e0$@gpfsug.org> Hi David, Thanks for letting us know, this has been reported on slack too. There seems to be a bug in one of the scripts that is causing EGID and RGID errors; it's been reported to the software vendor and we're awaiting feedback. -- systems -----Original Message----- From: gpfsug-discuss On Behalf Of David Magda Sent: 06 March 2023 21:36 To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Mailman interface HTTP 500 To whomever runs the web server: Apache (?) seems to be throwing errors when I go to Mailman web interface: https://gpfsug.org/mailman/listinfo/ https://www.gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org https://www.spectrumscaleug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org The archives are fine (since they?re probably just flat HTML files and not CGI): https://www.gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/ https://www.spectrumscaleug.org/pipermail/gpfsug-discuss_gpfsug.org/ -- David Magda _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org From jpoling at us.ibm.com Wed Mar 8 17:14:32 2023 From: jpoling at us.ibm.com (Jenny Poling) Date: Wed, 8 Mar 2023 17:14:32 +0000 Subject: [gpfsug-discuss] mmchconfig ESS error Message-ID: We are trying to manually upgrade Spectrum Scale from 5.1.2.1 to 5.1.5.1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package that is required for ESS. Our box has no installation of ESS. Is there a way to bypass or has anyone else encountered a situation like this? Any suggestions will be greatly appreciated. We have combed google, look through logs, but there is not much information. 
mmchconfig release=LATEST mmchconfig: Verify you are executing the command on an ESS/GSS node and that the gpfs.gnr package is properly installed. mmchconfig: Command failed. Examine previous error messages to determine cause. Here are the rpm packages installed gpfs.license.adv-5.1.5-1.ppc64le gpfs.base-5.1.5-1.ppc64le gpfs.docs-5.1.5-1.noarch gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le gpfs.adv-5.1.5-1.ppc64le gpfs.gpl-5.1.5-1.noarch gpfs.compression-5.1.5-1.ppc64le gpfs.gskit-8.0.55-19.1.ppc64le gpfs.msg.en_US-5.1.5-1.noarch RH OS level is 8.7 Architect: ppc64le Jenny Poling IT Specialist / HPSS Test Team IGNITE/Test Innovation Global Business Services - Public Service 713.582.7690(c) jpoling at us.ibm.com; www.hpss-collaboration.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Wed Mar 8 18:15:29 2023 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 8 Mar 2023 18:15:29 +0000 Subject: [gpfsug-discuss] mmchconfig ESS error Message-ID: <8CA50659-9D82-4112-88EF-77555A372FA9@us.ibm.com> Jenny, Please open a customer case for this one, as you may need help cleaning up some ?sql? statements in the mmsdrfs file. It appears to me that gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le should not be installed if the node is not an ESS server. Regards, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss on behalf of Jenny Poling Reply-To: gpfsug main discussion list Date: Wednesday, March 8, 2023 at 12:19 PM To: "gpfsug-discuss at gpfsug.org" Cc: Harry Yuen Subject: [EXTERNAL] [gpfsug-discuss] mmchconfig ESS error We are trying to manually upgrade Spectrum Scale from 5.?1.?2.?1 to 5.?1.?5.?1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd We are trying to manually upgrade Spectrum Scale from 5.1.2.1 to 5.1.5.1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package that is required for ESS. Our box has no installation of ESS. Is there a way to bypass or has anyone else encountered a situation like this? Any suggestions will be greatly appreciated. We have combed google, look through logs, but there is not much information. mmchconfig release=LATEST mmchconfig: Verify you are executing the command on an ESS/GSS node and that the gpfs.gnr package is properly installed. mmchconfig: Command failed. Examine previous error messages to determine cause. Here are the rpm packages installed gpfs.license.adv-5.1.5-1.ppc64le gpfs.base-5.1.5-1.ppc64le gpfs.docs-5.1.5-1.noarch gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le gpfs.adv-5.1.5-1.ppc64le gpfs.gpl-5.1.5-1.noarch gpfs.compression-5.1.5-1.ppc64le gpfs.gskit-8.0.55-19.1.ppc64le gpfs.msg.en_US-5.1.5-1.noarch RH OS level is 8.7 Architect: ppc64le Jenny Poling IT Specialist / HPSS Test Team IGNITE/Test Innovation Global Business Services - Public Service 713.582.7690(c) jpoling at us.ibm.com; www.hpss-collaboration.org -------------- next part -------------- An HTML attachment was scrubbed... 
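(A few low-risk, read-only checks that line up with the advice above, before anyone touches mmsdrfs; as noted, cleaning that file up is something to do only with support on the case. The file path is the standard /var/mmfs/gen location, and the grep pattern is only a rough illustration of the kind of GNR references support would be looking for.)

# Is any GNR/ESS package actually installed on this node?
rpm -qa | grep -E '^gpfs\.gnr' || echo "no gpfs.gnr packages installed"

# Does the committed cluster configuration still reference GNR objects
# such as vdisks or recovery groups?  Read-only check; do not edit by hand.
grep -icE 'vdisk|recoverygroup' /var/mmfs/gen/mmsdrfs

# The release level the cluster currently believes it is at.
mmlsconfig minReleaseLevel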
URL: From knop at us.ibm.com Wed Mar 8 19:08:39 2023 From: knop at us.ibm.com (Felipe Knop) Date: Wed, 8 Mar 2023 19:08:39 +0000 Subject: [gpfsug-discuss] mmchconfig ESS error In-Reply-To: <8CA50659-9D82-4112-88EF-77555A372FA9@us.ibm.com> References: <8CA50659-9D82-4112-88EF-77555A372FA9@us.ibm.com> Message-ID: <29CA0112-D2BE-46FE-815B-0CBE3BF0609F@us.ibm.com> All, Apologies. My that gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le should not be installed if the node is not an ESS server. statement was incorrect, as that RPM is installed even in non-ESS-server nodes. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: Felipe Knop Date: Wednesday, March 8, 2023 at 1:15 PM To: gpfsug main discussion list Cc: Harry Yuen Subject: Re: [EXTERNAL] [gpfsug-discuss] mmchconfig ESS error Jenny, Please open a customer case for this one, as you may need help cleaning up some ?sql? statements in the mmsdrfs file. It appears to me that gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le should not be installed if the node is not an ESS server. Regards, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss on behalf of Jenny Poling Reply-To: gpfsug main discussion list Date: Wednesday, March 8, 2023 at 12:19 PM To: "gpfsug-discuss at gpfsug.org" Cc: Harry Yuen Subject: [EXTERNAL] [gpfsug-discuss] mmchconfig ESS error We are trying to manually upgrade Spectrum Scale from 5.?1.?2.?1 to 5.?1.?5.?1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd We are trying to manually upgrade Spectrum Scale from 5.1.2.1 to 5.1.5.1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package that is required for ESS. Our box has no installation of ESS. Is there a way to bypass or has anyone else encountered a situation like this? Any suggestions will be greatly appreciated. We have combed google, look through logs, but there is not much information. mmchconfig release=LATEST mmchconfig: Verify you are executing the command on an ESS/GSS node and that the gpfs.gnr package is properly installed. mmchconfig: Command failed. Examine previous error messages to determine cause. Here are the rpm packages installed gpfs.license.adv-5.1.5-1.ppc64le gpfs.base-5.1.5-1.ppc64le gpfs.docs-5.1.5-1.noarch gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le gpfs.adv-5.1.5-1.ppc64le gpfs.gpl-5.1.5-1.noarch gpfs.compression-5.1.5-1.ppc64le gpfs.gskit-8.0.55-19.1.ppc64le gpfs.msg.en_US-5.1.5-1.noarch RH OS level is 8.7 Architect: ppc64le Jenny Poling IT Specialist / HPSS Test Team IGNITE/Test Innovation Global Business Services - Public Service 713.582.7690(c) jpoling at us.ibm.com; www.hpss-collaboration.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From anobre at br.ibm.com Wed Mar 8 20:47:34 2023 From: anobre at br.ibm.com (Anderson Ferreira Nobre) Date: Wed, 8 Mar 2023 20:47:34 +0000 Subject: [gpfsug-discuss] mmchconfig ESS error In-Reply-To: References: Message-ID: Hi Jenny, Try to run the upgrade with spectrumscale command. 
Perhaps it gives you more insight about what?s missing. From: gpfsug-discuss On Behalf Of Jenny Poling Sent: Wednesday, March 8, 2023 2:15 PM To: gpfsug-discuss at gpfsug.org Cc: Harry Yuen Subject: [EXTERNAL] [gpfsug-discuss] mmchconfig ESS error We are trying to manually upgrade Spectrum Scale from 5.?1.?2.?1 to 5.?1.?5.?1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd We are trying to manually upgrade Spectrum Scale from 5.1.2.1 to 5.1.5.1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package that is required for ESS. Our box has no installation of ESS. Is there a way to bypass or has anyone else encountered a situation like this? Any suggestions will be greatly appreciated. We have combed google, look through logs, but there is not much information. mmchconfig release=LATEST mmchconfig: Verify you are executing the command on an ESS/GSS node and that the gpfs.gnr package is properly installed. mmchconfig: Command failed. Examine previous error messages to determine cause. Here are the rpm packages installed gpfs.license.adv-5.1.5-1.ppc64le gpfs.base-5.1.5-1.ppc64le gpfs.docs-5.1.5-1.noarch gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le gpfs.adv-5.1.5-1.ppc64le gpfs.gpl-5.1.5-1.noarch gpfs.compression-5.1.5-1.ppc64le gpfs.gskit-8.0.55-19.1.ppc64le gpfs.msg.en_US-5.1.5-1.noarch RH OS level is 8.7 Architect: ppc64le Jenny Poling IT Specialist / HPSS Test Team IGNITE/Test Innovation Global Business Services - Public Service 713.582.7690(c) jpoling at us.ibm.com; www.hpss-collaboration.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjdoherty at yahoo.com Wed Mar 8 20:59:24 2023 From: jjdoherty at yahoo.com (Jim Doherty) Date: Wed, 8 Mar 2023 20:59:24 +0000 (UTC) Subject: [gpfsug-discuss] mmchconfig ESS error In-Reply-To: References: Message-ID: <542905356.658906.1678309164263@mail.yahoo.com> Try running? the following:? # script mmchconfig.log# DEBUG=1? ; mmchconfig??release=LATEST# exit Then you can open a ticket to the Spectrum Scale queue? ?and upload the mmchconfig.log along with a gpfs.snap from the node. On Wednesday, March 8, 2023 at 03:52:43 PM EST, Anderson Ferreira Nobre wrote: Hi Jenny, ? Try to run the upgrade with spectrumscale command. Perhaps it gives you more insight about what?s missing. ? From: gpfsug-discuss On Behalf Of Jenny Poling Sent: Wednesday, March 8, 2023 2:15 PM To: gpfsug-discuss at gpfsug.org Cc: Harry Yuen Subject: [EXTERNAL] [gpfsug-discuss] mmchconfig ESS error ? We are trying to manually upgrade Spectrum Scale from 5.?1.?2.?1 to 5.?1.?5.?1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package ZjQcmQRYFpfptBannerStart | | | This Message Is From an External Sender | | This message came from outside your organization. | | | ZjQcmQRYFpfptBannerEnd We are trying to manually upgrade Spectrum Scale from 5.1.2.1 to 5.1.5.1 The mmbuildgpl ?completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package that is required for ESS.? 
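(For clarity, the capture sequence Jim describes above, written out one command per line. mmchconfig.log is simply the name chosen for the script(1) session log, and exporting DEBUG=1 makes sure the extra tracing is visible to the mm command so that the resulting log is useful to support.)

# Record the whole session, including the tracing output, into mmchconfig.log.
script mmchconfig.log

# Inside the recorded session: enable tracing and re-run the failing command.
export DEBUG=1
mmchconfig release=LATEST

# End the recorded session; attach mmchconfig.log and a gpfs.snap to the case.
exit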
Our box has no installation of ESS. Is there a way to bypass or has anyone else encountered a situation like this? Any suggestions will be greatly appreciated. We have combed google, look through logs, but there is not much information. ? mmchconfig release=LATEST mmchconfig: Verify you are executing the command on an ESS/GSS node and that the gpfs.gnr package is properly installed. mmchconfig: Command failed. Examine previous error messages to determine cause. ? Here are the rpm packages installed gpfs.license.adv-5.1.5-1.ppc64le gpfs.base-5.1.5-1.ppc64le gpfs.docs-5.1.5-1.noarch gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le gpfs.adv-5.1.5-1.ppc64le gpfs.gpl-5.1.5-1.noarch gpfs.compression-5.1.5-1.ppc64le gpfs.gskit-8.0.55-19.1.ppc64le gpfs.msg.en_US-5.1.5-1.noarch ? RH OS level is 8.7 Architect: ppc64le ? ? Jenny Poling IT Specialist? / HPSS Test Team IGNITE/Test Innovation Global Business Services - Public Service 713.582.7690(c) jpoling at us.ibm.com;www.hpss-collaboration.org ? _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Paul.Sanchez at deshaw.com Wed Mar 8 21:08:00 2023 From: Paul.Sanchez at deshaw.com (Sanchez, Paul) Date: Wed, 8 Mar 2023 21:08:00 +0000 Subject: [gpfsug-discuss] mmchconfig ESS error In-Reply-To: <542905356.658906.1678309164263@mail.yahoo.com> References: <542905356.658906.1678309164263@mail.yahoo.com> Message-ID: <0fbab32a4a0b491695671d2609488cbe@deshaw.com> You?re probably missing these? gpfs.gnr-5.1.5-1.ppc64le gpfs.gnr.base-1.0.0-0.ppc64le They?re hard to get from fixcentral unless you have an Erasure Code Edition license. IIRC, they?re in the fixcentral ESS upgrade bundle though which you can unpack and pluck the RPMs out of. The only catch is that while you can download any version of your entitled editions/architectures of GPFS from fixcentral? if the only way you?re getting these for ESS on ppc64le is via the IBM tested and released upgrade bundles then you don?t have access to arbitrary GPFS versions but rather just the ones which made it into a periodic ESS release. -Paul From: gpfsug-discuss On Behalf Of Jim Doherty Sent: Wednesday, March 8, 2023 15:59 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] mmchconfig ESS error This message was sent by an external party. Try running the following: # script mmchconfig.log # DEBUG=1 ; mmchconfig release=LATEST # exit Then you can open a ticket to the Spectrum Scale queue and upload the mmchconfig.log along with a gpfs.snap from the node. On Wednesday, March 8, 2023 at 03:52:43 PM EST, Anderson Ferreira Nobre > wrote: Hi Jenny, Try to run the upgrade with spectrumscale command. Perhaps it gives you more insight about what?s missing. From: gpfsug-discuss > On Behalf Of Jenny Poling Sent: Wednesday, March 8, 2023 2:15 PM To: gpfsug-discuss at gpfsug.org Cc: Harry Yuen > Subject: [EXTERNAL] [gpfsug-discuss] mmchconfig ESS error We are trying to manually upgrade Spectrum Scale from 5.?1.?2.?1 to 5.?1.?5.?1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. 
ZjQcmQRYFpfptBannerEnd We are trying to manually upgrade Spectrum Scale from 5.1.2.1 to 5.1.5.1 The mmbuildgpl completes successfully however when we run mmchconfig to bring the software up to the latest we get complaints about Elastic Storage Server and a package that is required for ESS. Our box has no installation of ESS. Is there a way to bypass or has anyone else encountered a situation like this? Any suggestions will be greatly appreciated. We have combed google, look through logs, but there is not much information. mmchconfig release=LATEST mmchconfig: Verify you are executing the command on an ESS/GSS node and that the gpfs.gnr package is properly installed. mmchconfig: Command failed. Examine previous error messages to determine cause. Here are the rpm packages installed gpfs.license.adv-5.1.5-1.ppc64le gpfs.base-5.1.5-1.ppc64le gpfs.docs-5.1.5-1.noarch gpfs.gss.pmsensors-5.1.5-1.el8.ppc64le gpfs.adv-5.1.5-1.ppc64le gpfs.gpl-5.1.5-1.noarch gpfs.compression-5.1.5-1.ppc64le gpfs.gskit-8.0.55-19.1.ppc64le gpfs.msg.en_US-5.1.5-1.noarch RH OS level is 8.7 Architect: ppc64le Jenny Poling IT Specialist / HPSS Test Team IGNITE/Test Innovation Global Business Services - Public Service 713.582.7690(c) jpoling at us.ibm.com; www.hpss-collaboration.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at hpe.com Thu Mar 9 15:24:22 2023 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Thu, 9 Mar 2023 15:24:22 +0000 Subject: [gpfsug-discuss] flash NVME / pool system and data together or separated In-Reply-To: <286790f1c9544bd6a27e9fdd61758f42@Mail.EDVDesign.cloudia> References: <9ED44F39-4C0B-4640-9F7F-9D6DDBC6DB49@us.ibm.com> <2febe7ed49c240c0b458961e48b87904@Mail.EDVDesign.cloudia> <286790f1c9544bd6a27e9fdd61758f42@Mail.EDVDesign.cloudia> Message-ID: Walter, In general it is fine, but you do not say for example what the underlying provisioning or RAID is that you are using? Nor if this is say ECE, ESS, or perhaps NVMeoF ? For performance reasons, you might want your metadata to be 3-way replicated, or RAID1, if you do a lot of small metadata updates. Daniel From: gpfsug-discuss on behalf of Walter Sklenka Date: Monday, 6 March 2023 at 12:10 To: gpfsug main discussion list Subject: [gpfsug-discuss] flash NVME / pool system and data together or separated Hi! What would you suggest when using If we have a large flash storage: Does it make sense to split metdata from data? Are there any disadvantages when putty meta and data into the same disk pool (fastes technology) Best regards Walter Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss Im Auftrag von Walter Sklenka Gesendet: Sunday, March 5, 2023 4:12 PM An: gpfsug main discussion list Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded // detectIpPairAggressiveness Hi Felipe! Yes . I am very sorry that I answer with such a delay!! This was the response from the Support ? 
The IP pair connectivity detection is to address this issue: If there are more than one IP pair between a pair of nodes, if one of the IP pair has some problems, and disk lease request or reply happens be sent on this IP pair, disk lease overdue could happen since TCP has very long retransmit timeout, but since the other IP pairs are in good condition, we should avoid the node expel and send disk lease via other good IP pairs. So, when sending disk lease and reply, we will detect the connectivity of the IP pair, if it?s in good condition, disk lease and reply will be sent, otherwise, we will try other IP pairs for sending. detectIpPairAggressiveness (undocumented configuration parameter) control whether we do detection, here there is only ONE IP pair, actually, we don't need to check the IP pair connectivity. But we still need ping to work since we have other places to do ping check, like when disk lease overdue happens. --------------------------- until the 5.1.7 code will be available early march detectIpPairAggressiveness may can be disabled. echo 999 | mmchconfig detectIpPairAggressiveness=0 -i THANK YOU VERY MUCH, Felipe!!! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Donnerstag, 2. M?rz 2023 04:33 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Just following up. I just realized that the SalesForce case below has been closed. The support case owner was correctly able to identify the root cause as being the same problem as the one I mentioned below. The fix will be in the upcoming 5.1.7.0 release. Thanks for opening the case and working with the support team on this one. Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 22, 2023 at 5:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss > Im Auftrag von Ryan Novosielski Gesendet: Friday, February 17, 2023 11:52 PM An: gpfsug main discussion list > Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded I talked about it a lot in TS011616986. Part of the problem is we?re having a lot of strange problems at the same time, and so the different issues we?re having often come together (like one cause shows two symptoms). I can?t remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 17, 2023, at 04:43, Walter Sklenka > wrote: Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss > On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). 
(rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? 
And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
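For gathering the additional samples Felipe asks about before opening a case, a minimal shell sketch; the default daemon log location under /var/adm/ras is assumed, and the 12-line context window is an illustrative choice, not something stated in the thread:

# Quick overview: every watchdog hit plus the surrounding disk lease messages
grep -nE 'GPFS Critical Thread Watchdog|Disk lease period expired|Disk lease reacquired' \
    /var/adm/ras/mmfs.log.latest /var/adm/ras/mmfs.log.previous 2>/dev/null

# Full call traces: print each watchdog block with roughly 12 lines of context
awk '/GPFS Critical Thread Watchdog/ { show = 12 } show-- > 0' /var/adm/ras/mmfs.log.latest

A gpfs.snap captures the same logs in the form support works with; the commands above are only a first look at how often the watchdog fires and which thread it flags.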
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.horton at icr.ac.uk Thu Mar 9 15:44:39 2023 From: robert.horton at icr.ac.uk (Robert Horton) Date: Thu, 9 Mar 2023 15:44:39 +0000 Subject: [gpfsug-discuss] mmbackup vs SOBAR Message-ID: Hi Folks, I'm setting up a filesystem for "archive" data which will be aggressively tiered to tape using the Spectrum Protect (or whatever it's called today) Space Management. I would like to have two copies on tape for a) reading back the data on demand b) recovering accidentally deleted files etc c) disaster recovery of the whole filesystem if necessary. My understanding is: 1. Backup and Migration are completely separate things to Spectrum Protect. You can't "restore" from a migrated file nor do a DMAPI read from a backup. 2. A SOBAR backup would enable the namespace to be restored if the filesystem were lost but needs all files to be (pre-)migrated and needs the filesystem blocksize etc to match. 3. A SOBAR backup isn't much help for restoring individual (deleted) files. There is a dsmmigundelete utility that restores individual stubs but doesn't restore directories etc so you really want a separate backup. My thinking is to do backups to one (non-replicated) tape pool and migrate to another and run mmimgbackup regularly. I'd then have a path to do a full restore if either set of tapes were lost although it seems rather messy and it's a bit of a pain that SP needs to read everything twice. So... have I understood that correctly and does anyone have any better / alternative suggestions? Thanks, Rob Robert Horton | Scientific Computing Infrastructure Lead The Institute of Cancer Research | 237 Fulham Road, London, SW3 6JB T +44 (0) 20 7153 5350 | E robert.horton at icr.ac.uk | W www.icr.ac.uk | Twitter @ICR_London Facebook www.facebook.com/theinstituteofcancerresearch Making the discoveries that defeat cancer [ICR Logo] The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image001.gif
Type: image/gif
Size: 3162 bytes
Desc: image001.gif
URL: 

From p.childs at qmul.ac.uk Thu Mar 9 16:07:43 2023
From: p.childs at qmul.ac.uk (Peter Childs)
Date: Thu, 9 Mar 2023 16:07:43 +0000
Subject: [gpfsug-discuss] [EXTERNAL] mmbackup vs SOBAR
In-Reply-To:
References:
Message-ID:

I've been told "you really should be using SOBAR" a few times, but never really understood how to do so and the steps involved. I feel sure it should have some kind of white paper. So far I've been thinking to set up some kind of test system, but I get a little lost on where to start (and lack of time).

We currently use mmbackup to two servers using `--tsm-servers TSMServer1,TSMServer2` to have two independent backups, and this works nicely until you lose a tape; restoring that tape is going to be a nightmare (read: rebuild the whole shadow database).

We started with a copy pool until we filled our tape library up, and then swapped to Protect replication until we found this really did not work very well (really slow and missing files), and IBM suggested we use mmbackup with two servers and have two independent backups, which is working very well for us now.

I think if I was going to implement SOBAR I'd want to run mmbackup as well, as SOBAR will not give you point-in-time recovery or partial recovery and is really only a disaster solution. I'd also probably want three copies on tape: one in SOBAR, and two via mmbackup (via two backups or via a copy pool).

I'm currently thinking to play with HSM and SOBAR on a test system, but have not started yet.

Maybe a talk at the next UG would be helpful on backups. I'm not sure if I want to do one, or if we can find an "expert".

Peter Childs

________________________________________
From: gpfsug-discuss on behalf of Robert Horton
Sent: Thursday, March 9, 2023 3:44 PM
To: gpfsug-discuss at gpfsug.org
Subject: [EXTERNAL] [gpfsug-discuss] mmbackup vs SOBAR

CAUTION: This email originated from outside of QMUL. Do not click links or open attachments unless you recognise the sender and know the content is safe.

Hi Folks,

I'm setting up a filesystem for "archive" data which will be aggressively tiered to tape using the Spectrum Protect (or whatever it's called today) Space Management. I would like to have two copies on tape for a) reading back the data on demand b) recovering accidentally deleted files etc c) disaster recovery of the whole filesystem if necessary.

My understanding is:

1. Backup and Migration are completely separate things to Spectrum Protect. You can't "restore" from a migrated file nor do a DMAPI read from a backup.
2. A SOBAR backup would enable the namespace to be restored if the filesystem were lost but needs all files to be (pre-)migrated and needs the filesystem blocksize etc to match.
3. A SOBAR backup isn't much help for restoring individual (deleted) files. There is a dsmmigundelete utility that restores individual stubs but doesn't restore directories etc so you really want a separate backup.

My thinking is to do backups to one (non-replicated) tape pool and migrate to another and run mmimgbackup regularly. I'd then have a path to do a full restore if either set of tapes were lost, although it seems rather messy and it's a bit of a pain that SP needs to read everything twice.

So... have I understood that correctly and does anyone have any better / alternative suggestions?
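For reference, a minimal sketch of the dual-server run Peter describes above; the filesystem path and server names are placeholders, and the exact options (node list, snapshot use, shadow database handling) will differ per site:

# One metadata scan, two independent Spectrum Protect backups
/usr/lpp/mmfs/bin/mmbackup /gpfs/archive -t incremental --tsm-servers TSMServer1,TSMServer2

The dsm.sys on the nodes doing the backup needs a server stanza for each of the two servers named here.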
Thanks, Rob Robert Horton | Scientific Computing Infrastructure Lead The Institute of Cancer Research | 237 Fulham Road, London, SW3 6JB T +44 (0) 20 7153 5350 | E robert.horton at icr.ac.uk | W www.icr.ac.uk | Twitter @ICR_London Facebook www.facebook.com/theinstituteofcancerresearch Making the discoveries that defeat cancer [ICR Logo] The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. From lgayne at us.ibm.com Thu Mar 9 16:17:34 2023 From: lgayne at us.ibm.com (Lyle Gayne) Date: Thu, 9 Mar 2023 16:17:34 +0000 Subject: [gpfsug-discuss] mmbackup vs SOBAR In-Reply-To: References: Message-ID: Here's a reference for SOBaR use: https://www.ibm.com/docs/en/spectrum-scale/4.2.1?topic=sobar-backup-procedure The advantage of SOBaR is that it can be used to backup the configuration and metadata, after which data restore can proceed selectively (based on need), and the data already restored can then be accessed while further data restore hasn't been completed. ________________________________ From: gpfsug-discuss on behalf of Peter Childs Sent: Thursday, March 9, 2023 11:07 AM To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] [EXTERNAL] mmbackup vs SOBAR I've been told "you really should be using SOBAR" a few times, but never really understood how to do so and the steps involved. I feel sure it should have some kind of white paper, so far I've been thinking to setup some kind of test system, but get a little lost on where to start (and lack of time). We currently use mmbackup to x2 servers using `--tsm-servers TSMServer1, TSMServer2` to have two independant backups and this works nicely until you lose a tape when restoring that tape is going to be a nightmare. (read rebuild the whole shadow database) We started with a copy pool until we filled our tape library up, and then swapped to Protect replication until we found this really did not work very well (really slow and missing files), and IBM surgested we use mmbackup with 2 servers and have two independ backups, which is working very well for us now. I think if I was going to implement SOBAR I'd want to run mmbackup as well as SOBAR will not give you point in time recovery or partial recovery and is really only a disarster solution. I'd also probably want 3 copies on tape, 1 in SOBAR, and 2x via mmbackup via two backups or via a copy pool I'm currently thinking to play with HSM and SOBAR on a test system, but have not started yet...... Maybe a talk at the next UG would be helpful on backups, I'm not sure if I want to do one, or if we can find an "expert" Peter Childs ________________________________________ From: gpfsug-discuss on behalf of Robert Horton Sent: Thursday, March 9, 2023 3:44 PM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] [gpfsug-discuss] mmbackup vs SOBAR CAUTION: This email originated from outside of QMUL. Do not click links or open attachments unless you recognise the sender and know the content is safe. Hi Folks, I?m setting up a filesystem for ?archive? data which will be aggressively tiered to tape using the Spectrum Protect (or whatever it?s called today) Space Management. 
I would like to have two copies on tape for a) reading back the data on demand b) recovering accidentally deleted files etc c) disaster recovery of the whole filesystem if necessary. My understanding is: 1. Backup and Migration are completely separate things to Spectrum Protect. You can?t ?restore? from a migrated file nor do a DMAPI read from a backup. 2. A SOBAR backup would enable the namespace to be restored if the filesystem were lost but needs all files to be (pre-)migrated and needs the filesystem blocksize etc to match. 3. A SOBAR backup isn?t much help for restoring individual (deleted) files. There is a dsmmigundelete utility that restores individual stubs but doesn?t restore directories etc so you really want a separate backup. My thinking is to do backups to one (non-replicated) tape pool and migrate to another and run mmimgbackup regularly. I?d then have a path to do a full restore if either set of tapes were lost although it seems rather messy and it?s a bit of a pain that SP needs to read everything twice. So? have I understood that correctly and does anyone have any better / alternative suggestions? Thanks, Rob Robert Horton | Scientific Computing Infrastructure Lead The Institute of Cancer Research | 237 Fulham Road, London, SW3 6JB T +44 (0) 20 7153 5350 | E robert.horton at icr.ac.uk | W www.icr.ac.uk | Twitter @ICR_London Facebook www.facebook.com/theinstituteofcancerresearch > Making the discoveries that defeat cancer [ICR Logo] The Institute of Cancer Research: Royal Cancer Hospital, a charitable Company Limited by Guarantee, Registered in England under Company No. 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. This e-mail message is confidential and for use by the addressee only. If the message is received by anyone other than the addressee, please return the message to the sender by replying to it and then delete the message from your computer and network. _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From st.graf at fz-juelich.de Thu Mar 9 18:22:35 2023 From: st.graf at fz-juelich.de (Stephan Graf) Date: Thu, 9 Mar 2023 19:22:35 +0100 Subject: [gpfsug-discuss] mmbackup vs SOBAR In-Reply-To: References: Message-ID: Hi Rob, we in J?lich have a long history and experience using GPFS & HSM. We don't use SOBAR (which is for disaster recovery) but mmbackup (single file restore). In principle it is working fine. But there are two problems. The first one is if the user rename the file/directory. If renamed, the file(s) must be backed up again and will trigger a recall(inline tape copy). The same happens if ACLs are modified on files. This is working, but if we are scaling up (1K ore more files are effected, we are storing >40PB in these file systems), the inline tape copy for backup is not tape optimized and will take a long time. If you want more details, you can contact me directly. Stephan On 3/9/23 16:44, Robert Horton wrote: > Hi Folks, > > I?m setting up a filesystem for ?archive? data which will be > aggressively tiered to tape using the Spectrum Protect (or whatever it?s > called today) Space Management. I would like to have two copies on tape > for a) reading back the data on demand b) recovering accidentally > deleted files etc c) disaster recovery of the whole filesystem if necessary. > > My understanding is: > > 1. 
Backup and Migration are completely separate things to Spectrum > Protect. You can?t ?restore? from a migrated file nor do a DMAPI > read from a backup. > 2. A SOBAR backup would enable the namespace to be restored if the > filesystem were lost but needs all files to be (pre-)migrated and > needs the filesystem blocksize etc to match. > 3. A SOBAR backup isn?t much help for restoring individual (deleted) > files. There is a dsmmigundelete utility that restores individual > stubs but doesn?t restore directories etc so you really want a > separate backup. > > My thinking is to do backups to one (non-replicated) tape pool and > migrate to another and run mmimgbackup regularly. I?d then have a path > to do a full restore if either set of tapes were lost although it seems > rather messy and it?s a bit of a pain that SP needs to read everything > twice. > > So? have I understood that correctly and does anyone have any better / > alternative suggestions? > > Thanks, > > Rob > > *Robert Horton*| Scientific Computing Infrastructure Lead > > The Institute of Cancer Research | 237 Fulham Road, London, SW3 6JB > > *T*+44 (0) 20 7153 5350 | *E* robert.horton at icr.ac.uk > | *W* www.icr.ac.uk > | *Twitter* @ICR_London > > > *Facebook*www.facebook.com/theinstituteofcancerresearch > > > *Making the discoveries that defeat cancer* > > ICR Logo > > The Institute of Cancer Research: Royal Cancer Hospital, a charitable > Company Limited by Guarantee, Registered in England under Company No. > 534147 with its Registered Office at 123 Old Brompton Road, London SW7 3RP. > > This e-mail message is confidential and for use by the addressee only. > If the message is received by anyone other than the addressee, please > return the message to the sender by replying to it and then delete the > message from your computer and network. > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -- Stephan Graf Juelich Supercomputing Centre Phone: +49-2461-61-6578 Fax: +49-2461-61-6656 E-mail: st.graf at fz-juelich.de WWW: http://www.fz-juelich.de/jsc/ --------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------- Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Volker Rieke Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior --------------------------------------------------------------------------------------------- --------------------------------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5938 bytes Desc: S/MIME Cryptographic Signature URL: From jonathan.buzzard at strath.ac.uk Mon Mar 13 08:13:33 2023 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Mon, 13 Mar 2023 08:13:33 +0000 Subject: [gpfsug-discuss] mmbackup vs SOBAR In-Reply-To: References: Message-ID: <37175a8e-444b-8b35-f187-3ec2b8eb0f1a@strath.ac.uk> On 09/03/2023 18:22, Stephan Graf wrote: > Hi Rob, > > we in J?lich have a long history and experience using GPFS & HSM. 
> We don't use SOBAR (which is for disaster recovery) but mmbackup (single > file restore). > In principle it is working fine. But there are two problems. > > The first one is if the user rename the file/directory. If renamed, the > file(s) must be backed up again and will trigger a recall(inline tape > copy). The same happens if ACLs are modified on files. > This is working, but if we are scaling up (1K ore more files are > effected, we are storing >40PB in these file systems), the inline tape > copy for backup is not tape optimized and will take a long time. > In the context of an "archive" file system as per Robert's original post I think that it reasonable and even a good idea to scan the file system and have all files over a couple of weeks old set immutable. That stops users renaming/moving files and triggering mass recalls. It's supposed to be an archive after all so you should not be doing that anyway. I also think that in context of an "archive" files system a reasonable approach is also to set very low quotas on the number of files a user is allowed. The idea is to "encourage" them to put the archived data in things like a zip or tar file. Again it's an "archive" not a general purpose file system. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG From dlmcnabb at gmail.com Thu Mar 23 04:52:28 2023 From: dlmcnabb at gmail.com (Daniel McNabb) Date: Wed, 22 Mar 2023 21:52:28 -0700 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: The earliest name was Shark because sharks cannot stop swimming. It developed wide striping across many disks and became TigerShark (hey, striped), . When the first SP systems came out it became MMFS (MultiMedia FS) and went into a few trials with Tokyo Island and one of the baby Bells to stream many videos. VideoCharger was a quick attempt to make some hardware for things like small hotels to stream digital content to rooms, but I think it got killed right before launch. After those attempts it moved from being just a research project to being a real product and got named GPFS (General Parallel FS). This launched us into the High Performance Computing sphere for some of the largest SP systems (on AIX) going into some national labs and other weather and research systems in 1998. This is the first time it has a full Posix interface instead of a specialized video streaming interface. It also became popular with things like banking systems for all its replication possibilities. It supported Oracle for many years until they worked out their own redundancy capabilities. The ESS hardware systems were sold as hardware embodiments using the GPFS software. The addition of a Linux implementation expanded the systems GPFS could run on. There were many innovations added to GPFS over the years for handling AFM replication, NFS exporting, and distributed striping for fast reconstruction of lost disks. (There are several other things I cannot remember at the moment.) After many years of having no marketing division (i.e only word of mouth), IBM storage decided to get into the game and marketed it as IBM Spectrum Scale which has just been changed to IBM Storage Scale. Cheers all, Daniel McNabb. 
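Returning to Jonathan's suggestion above of locking down aged files on an archive filesystem: a minimal sketch, where the path, the 14-day threshold and the use of find are all illustrative assumptions (a policy scan with mmapplypolicy would scale far better on a large tree):

# Make archive files older than two weeks immutable so renames or rewrites
# cannot force re-backup and mass recalls; /gpfs/archive and 14 days are examples only.
find /gpfs/archive -type f -mtime +14 -print0 | xargs -0 -r /usr/lpp/mmfs/bin/mmchattr -i yes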
On Wed, Mar 22, 2023 at 6:19?PM Glen Corneau wrote: > I think that product was Videocharger, but it used the multimedia file > system striped across multiple SCSI-1 disks for streaming video performance > IIRC! > > Sure we're not talking about Tigershark? ? > > (and can't forget the cousin, PIOFS or Parallel I/O File System) > --- > Glen Corneau > Senior, Power Partner Technical Specialist (PTS-P) > IBM Technology, North America > Email: gcorneau at us.ibm.com > Cell: 512-420-7988 > > ------------------------------ > *From:* gpfsug-discuss on behalf of > Ryan Novosielski > *Sent:* Wednesday, March 22, 2023 19:12 > *To:* gpfsug main discussion list > *Subject:* [EXTERNAL] Re: [gpfsug-discuss] How to: Show clients actively > connected to a given NFS export (CES) > > The product formerly known as MMFS? -- #BlackLivesMatter ____ || \\UTGERS, > |---------------------------*O*--------------------------- ||_// the State > | Ryan Novosielski - novosirj@ rutgers. edu || \\ University | Sr. > Technologist - 973/972. 0922 > ZjQcmQRYFpfptBannerStart > This Message Is From an External Sender > This message came from outside your organization. > > ZjQcmQRYFpfptBannerEnd > The product formerly known as MMFS? > > -- > #BlackLivesMatter > ____ > || \\UTGERS, |---------------------------*O*--------------------------- > ||_// the State | Ryan Novosielski - novosirj at rutgers.edu > || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus > || \\ of NJ | Office of Advanced Research Computing - MSB > A555B, Newark > `' > > On Mar 22, 2023, at 17:30, Alec wrote: > > Thanks for correction.. been using GPFS so long I forgot my basic NFS > command. > > Or that its now IBM Storage Scale and no longer Spectrum Scale or GPFS... > > As a note that info is a little unreliable, but if you take a daily > snapshots and throw it all together it should give you something. > > Alternatively you can have most nfsd daemons log mounts and then scan the > logs for a more reliable method. > > Alec > > On Wed, Mar 22, 2023, 2:00 PM Markus Stoeber < > M.Stoeber at rz.uni-frankfurt.de> wrote: > > Am 22.03.2023 um 21:45 schrieb Beckman, Daniel D: > > Hi, > > showmount -a should do the trick, however the manpage notes that: > > -a or --all > List both the client hostname or IP address and mounted > directory in host:dir format. This info should not be considered reliable. > See the notes on rmtab in rpc.mountd(8). > > showmount -d could also be an option: > > -d or --directories > List only the directories mounted by some client. > > Best regards, > > Markus > > Thanks, but that shows the export list: a list of shares and the hosts / > networks that have access. It does not show which of those clients are > currently connected to a given share, as in have it mounted. > > > *From:* gpfsug-discuss > *On Behalf Of *Alec > *Sent:* Wednesday, March 22, 2023 4:23 PM > *To:* gpfsug main discussion list > > *Subject:* Re: [gpfsug-discuss] How to: Show clients actively connected > to a given NFS export (CES) > > > *CAUTION:* This email message has been received from an external source. > Please use caution when opening attachments, or clicking on links. > > showmount -e nfsserver > > > Normal way to see that for an nfs server. > > > On Wed, Mar 22, 2023, 1:13 PM Beckman, Daniel D wrote: > > This is probably a dumb question, but I could not find it in the > documentation. 
We have certain NFS exports that we suspect are no longer > used or needed but before removing we?d like to first check if any clients > are currently mounting them. (Not just what has access to those exports.) > Are there commands that will show this? > > > Thanks, > > Daniel > > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.orghttp://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > -- > Markus Stoeber > Systemmanagement AIX, Linux / Storagemanagement / Plattformintegration > Hochschulrechenzentrum der Johann Wolfgang Goethe-Universitaet > Abt. Basisdienste > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > > > _______________________________________________ > gpfsug-discuss mailing list > gpfsug-discuss at gpfsug.org > http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lgayne at us.ibm.com Thu Mar 23 12:29:22 2023 From: lgayne at us.ibm.com (Lyle Gayne) Date: Thu, 23 Mar 2023 12:29:22 +0000 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: Yes, MMFS was the internal name for an offering that I believe was to be called VideoCharger/MediaStreamer, and was an ahead of its time/market offering. When MMFS was cancelled, all four of its developers of the time joined the GPFS effort, and we had collaborated with them to sequence changes in the same code base for a few years by that point and knew them all well. ________________________________ From: gpfsug-discuss on behalf of Ryan Novosielski Sent: Wednesday, March 22, 2023 8:12 PM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) The product formerly known as MMFS? -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj@?rutgers.?edu || \\ University | Sr. Technologist - 973/972.?0922 ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd The product formerly known as MMFS? -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB A555B, Newark `' On Mar 22, 2023, at 17:30, Alec wrote: Thanks for correction.. been using GPFS so long I forgot my basic NFS command. Or that its now IBM Storage Scale and no longer Spectrum Scale or GPFS... As a note that info is a little unreliable, but if you take a daily snapshots and throw it all together it should give you something. 
Alternatively you can have most nfsd daemons log mounts and then scan the logs for a more reliable method. Alec On Wed, Mar 22, 2023, 2:00 PM Markus Stoeber > wrote: Am 22.03.2023 um 21:45 schrieb Beckman, Daniel D: Hi, showmount -a should do the trick, however the manpage notes that: -a or --all List both the client hostname or IP address and mounted directory in host:dir format. This info should not be considered reliable. See the notes on rmtab in rpc.mountd(8). showmount -d could also be an option: -d or --directories List only the directories mounted by some client. Best regards, Markus Thanks, but that shows the export list: a list of shares and the hosts / networks that have access. It does not show which of those clients are currently connected to a given share, as in have it mounted. From: gpfsug-discuss On Behalf Of Alec Sent: Wednesday, March 22, 2023 4:23 PM To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) CAUTION: This email message has been received from an external source. Please use caution when opening attachments, or clicking on links. showmount -e nfsserver Normal way to see that for an nfs server. On Wed, Mar 22, 2023, 1:13 PM Beckman, Daniel D > wrote: This is probably a dumb question, but I could not find it in the documentation. We have certain NFS exports that we suspect are no longer used or needed but before removing we?d like to first check if any clients are currently mounting them. (Not just what has access to those exports.) Are there commands that will show this? Thanks, Daniel _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -- Markus Stoeber Systemmanagement AIX, Linux / Storagemanagement / Plattformintegration Hochschulrechenzentrum der Johann Wolfgang Goethe-Universitaet Abt. Basisdienste _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.kidger at hpe.com Thu Mar 23 13:20:06 2023 From: daniel.kidger at hpe.com (Kidger, Daniel) Date: Thu, 23 Mar 2023 13:20:06 +0000 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: It is simple really?. The product is called ?Storage Scale?. It used to be called ?Spectrum Scale? but that being abbreviated to ?SS? was a problem, hence the recent change. :-) As an appliance it is called ESS = ?Elastic Storage System? since before being called Spectrum Scale, it was briefly called ?Elastic Storage? All the software RPMs though still start with ?gpfs? and that is what every techie person still calls the software. The software unpacks to /usr/lpp/mmfs/bin since the product was once the ?MultiMedia FileSystem? The ?lpp? comes from AIX. It stands for ?Licensed Programming Product ?. 
This term is lost on everyone who comes from a Linux background. Likewise no one from the Linux world knows what "adm" is, as in /var/adm/ras (or ras?) There are c. 333 "mm" commands like say "mmdeldisk". None are actual binaries - all are Korn shell scripts (Korn was a shell language that Bash eventually replaced) *(actually a few mm commands are written in Python not Korn.) Most mm commands call underlying binaries which all start with "ts" eg "tsdeldisk". This is, as we all know, because the product was known as TigerShark but no one could be bothered renaming stuff when "TigerShark" as a name was dropped 25 years ago. Simple? Daniel Kidger HPC Storage Solutions Architect, EMEA daniel.kidger at hpe.com +44 (0)7818 522266 hpe.com From: gpfsug-discuss On Behalf Of Lyle Gayne Sent: 23 March 2023 12:29 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) [SNIP] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 2541 bytes Desc: image001.png URL:
From novosirj at rutgers.edu Thu Mar 23 13:30:27 2023 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Thu, 23 Mar 2023 13:30:27 +0000 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: <1A5AB110-3A98-4D9F-8168-29E979009842@rutgers.edu> I had basically /just/ started calling it Spectrum Scale. So I guess I'm at least partially responsible for the name change. Sent from my iPhone On Mar 23, 2023, at 09:25, Kidger, Daniel wrote: [SNIP] -------------- next part -------------- An HTML attachment was scrubbed... URL:
From jonathan.buzzard at strath.ac.uk Thu Mar 23 14:05:38 2023 From: jonathan.buzzard at strath.ac.uk (Jonathan Buzzard) Date: Thu, 23 Mar 2023 14:05:38 +0000 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: On 23/03/2023 13:20, Kidger, Daniel wrote: [SNIP] > The "lpp" comes from AIX. It stands for "Licensed Program Product". > This term is lost on everyone who comes from a Linux background. > > Likewise no one from the Linux world knows what "adm" is, as in > /var/adm/ras (or ras?) I knew what lpp is despite coming from a Linux background as I have AIX exposure :-) I believe that adm is just short for administration but don't quote me on that. However "ras" is I believe short for "reliability, availability, and serviceability", yeah whatever. JAB. -- Jonathan A. Buzzard Tel: +44141-5483420 HPC System Administrator, ARCHIE-WeSt. University of Strathclyde, John Anderson Building, Glasgow. G4 0NG
From luis.bolinches at fi.ibm.com Thu Mar 23 14:30:33 2023 From: luis.bolinches at fi.ibm.com (Luis Bolinches) Date: Thu, 23 Mar 2023 14:30:33 +0000 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: Well I guess I'll add to the retro fest going on, which has produced the best mails I will get this week, so thanks for that. There is a command 'spectrumscale', the toolkit, so we have that one covered too.
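A small sketch of the mm-wrapper-over-ts-binary layering described above, for a node with the software installed (the file names come from the messages above; the shebang and file types shown in the comments are what I would expect rather than something confirmed here):

head -n 1 /usr/lpp/mmfs/bin/mmdeldisk   # the mm* administration command is a shell script (ksh shebang expected)
file /usr/lpp/mmfs/bin/tsdeldisk        # the ts* helper it drives is a compiled binary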
Soon maybe a 'storagescale' one -- Ystävällisin terveisin/Regards/Saludos/Salutations/Salutacions Luis Bolinches Executive IT Specialist IBM Storage Scale development Phone: +358503112585 Ab IBM Finland Oy Laajalahdentie 23 00330 Helsinki Uusimaa - Finland "If you always give you will always have" -- Anonymous https://www.credly.com/users/luis-bolinches/badges -----Original Message----- From: gpfsug-discuss On Behalf Of Jonathan Buzzard Sent: Thursday, 23 March 2023 16.06 To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] Re: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) [SNIP] Unless otherwise stated above: Oy IBM Finland Ab PL 265, 00101 Helsinki, Finland Business ID, Y-tunnus: 0195876-3 Registered in Finland
From anacreo at gmail.com Thu Mar 23 16:47:48 2023 From: anacreo at gmail.com (Alec) Date: Thu, 23 Mar 2023 09:47:48 -0700 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: References: <1f66c46d44af44d68959646812179090@loc.gov> <0c8e6e3032334ef6891195c13d284b56@loc.gov> Message-ID: Whoa whoa whoa... Easy on the fighting words... Korn shell was a shell language eventually replaced by bash??? ksh is alive and well and I believe (ksh93) superior to bash in many ways... On Thu, Mar 23, 2023, 6:26 AM Kidger, Daniel wrote: [SNIP] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 2541 bytes Desc: not available URL:
From st.graf at fz-juelich.de Fri Mar 24 07:53:43 2023 From: st.graf at fz-juelich.de (Stephan Graf) Date: Fri, 24 Mar 2023 08:53:43 +0100 Subject: [gpfsug-discuss] How to: Show clients actively connected to a given NFS export (CES) In-Reply-To: <1f66c46d44af44d68959646812179090@loc.gov> References: <1f66c46d44af44d68959646812179090@loc.gov> Message-ID: <5e8d1c02-b349-2302-b810-b6441915923b@fz-juelich.de> Hi Daniel, I don't know if this will help you, but maybe it gives you some hints (and the tool has a couple of options...). Run on the CES NFS server: ganesha_stats list_clients This will list the clients which are connected or were connected in the past. But to be honest, I have not used it heavily yet. Stephan On 3/22/23 21:08, Beckman, Daniel D wrote: [SNIP]
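Expanding on that ganesha_stats tip: each protocol node's nfs-ganesha instance only reports the clients it has seen itself, so to get a cluster-wide picture one could collect the output from every CES node (the node names below are placeholders):

# gather the per-node client lists and de-duplicate them
for n in ces1 ces2 ces3; do ssh "$n" ganesha_stats list_clients; done | sort -u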
-- Stephan Graf Juelich Supercomputing Centre Phone: +49-2461-61-6578 Fax: +49-2461-61-6656 E-mail: st.graf at fz-juelich.de WWW: http://www.fz-juelich.de/jsc/ --------------------------------------------------------------------------------------------- Forschungszentrum Juelich GmbH 52425 Juelich Sitz der Gesellschaft: Juelich Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 Vorsitzender des Aufsichtsrats: MinDir Volker Rieke Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), Karsten Beneke (stellv. Vorsitzender), Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior --------------------------------------------------------------------------------------------- -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 5938 bytes Desc: S/MIME Cryptographic Signature URL:
From l.r.sudbery at bham.ac.uk Fri Mar 24 09:29:31 2023 From: l.r.sudbery at bham.ac.uk (Luke Sudbery) Date: Fri, 24 Mar 2023 09:29:31 +0000 Subject: [gpfsug-discuss] Mailman interface HTTP 500 In-Reply-To: References: <000b01d95b4c$4ac42170$e04c6450$@gpfsug.org> Message-ID: The archive links are working again now. But there seems to be no record of my thread "mmvdisk version/communication issues?" started Fri 17/03/2023 15:11. This is unfortunate as IBM are asking for links to the messages! I can send them a copy of the email thread for now. Many thanks, Luke -- Luke Sudbery Principal Engineer (HPC and Storage). Architecture, Infrastructure and Systems Advanced Research Computing, IT Services Room 132, Computer Centre G5, Elms Road Please note I don't work on Monday. -----Original Message----- From: gpfsug-discuss On Behalf Of Luke Sudbery Sent: 21 March 2023 09:40 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Mailman interface HTTP 500 Seems like one issue has been swapped for the other - I can access the Mailman pages but the archive links: https://www.gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/ https://www.spectrumscaleug.org/pipermail/gpfsug-discuss_gpfsug.org/ are redirecting to a WordPress 404 page. Cheers, Luke -- Luke Sudbery Principal Engineer (HPC and Storage). Architecture, Infrastructure and Systems Advanced Research Computing, IT Services Room 132, Computer Centre G5, Elms Road Please note I don't work on Monday. -----Original Message----- From: gpfsug-discuss On Behalf Of systems at gpfsug.org Sent: 20 March 2023 16:52 To: 'gpfsug main discussion list' Subject: Re: [gpfsug-discuss] Mailman interface HTTP 500 Hi David, GPFSUG Members, We believe this issue has now been resolved, please let us know if you experience any further issues. -- Systems -----Original Message----- From: systems at gpfsug.org Sent: 07 March 2023 10:22 To: 'gpfsug main discussion list' Subject: RE: [gpfsug-discuss] Mailman interface HTTP 500 Hi David, Thanks for letting us know, this has been reported on Slack too.
There seems to be a bug in one of the scripts that is causing EGID and RGID errors; it's been reported to the software vendor and we're awaiting feedback. -- systems -----Original Message----- From: gpfsug-discuss On Behalf Of David Magda Sent: 06 March 2023 21:36 To: gpfsug-discuss at gpfsug.org Subject: [gpfsug-discuss] Mailman interface HTTP 500 To whomever runs the web server: Apache (?) seems to be throwing errors when I go to the Mailman web interface: https://gpfsug.org/mailman/listinfo/ https://www.gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org https://www.spectrumscaleug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org The archives are fine (since they're probably just flat HTML files and not CGI): https://www.gpfsug.org/pipermail/gpfsug-discuss_gpfsug.org/ https://www.spectrumscaleug.org/pipermail/gpfsug-discuss_gpfsug.org/ -- David Magda _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
From martin at uni-mainz.de Thu Mar 30 17:44:18 2023 From: martin at uni-mainz.de (Christoph Martin) Date: Thu, 30 Mar 2023 18:44:18 +0200 Subject: [gpfsug-discuss] AFM convertToPrimary Message-ID: <436de125-ffed-3d10-bdea-1eb005923964@uni-mainz.de> Hi all, we want to convert two GPFS independent filesets to primaries for AFM-DR. Most of the data from fileset1 we have already mirrored via rsync to the secondary fileset. fileset2 on the secondary server is still empty. I now want to issue for both filesets something like: mmafmctl storage1 convertToPrimary -j afmdrtest2 --afmtarget gpfs:///gpfs/storage2/afmdrtest2/ --inband --check-metadata The help pages say that you have to stop all IO on the primary fileset for the conversion. I think this might mean read and write IO. The question is: when can we restart IO - after the mmafmctl command finishes, or only after the synchronisation of the filesets finishes? The second can take days, especially for the second fileset, which is empty on the secondary site. What happens if read (or write) IO happens during the execution time of mmafmctl or the synchronisation? Any ideas? Regards Christoph -------------- next part -------------- A non-text attachment was scrubbed... Name: OpenPGP_signature Type: application/pgp-signature Size: 833 bytes Desc: OpenPGP digital signature URL:
From committee at io500.org Fri Mar 31 00:48:11 2023 From: committee at io500.org (IO500 Committee) Date: Thu, 30 Mar 2023 17:48:11 -0600 Subject: [gpfsug-discuss] Call for Submissions IO500 ISC23 Message-ID: <1ceb433e1ae046c5dcd515a69a807e26@io500.org> Stabilization Period: Monday, April 3rd - Friday, April 14th, 2023 Submission Deadline: Tuesday, May 16th, 2023 AoE The IO500 is now accepting and encouraging submissions for the upcoming 12th semi-annual IO500 list, in conjunction with ISC23. Once again, we are also accepting submissions to the 10 Client Node Challenge to encourage the submission of small scale results. The new ranked lists will be announced at the ISC23 BoF [1]. We hope to see many new results. What's New 1.
Creation of Production and Research Lists - Starting with ISC'22, we proposed a separation of the list into separate Production and Research lists. This better reflects the important distinction between storage systems that run in production environments and those that may use more experimental hardware and software configurations. At ISC23, we will formally create these two lists and users will be able to submit to either of the two lists (and their 10 client-node counterparts). Please see the requirements for each list on the IO500 rules page [3]. 2. New Submission Tool - There is now a new IO500 submission tool that improves the overall submission experience. Users can create accounts and then update and manage all of their submissions through that account. As part of this new tool, we have improved the submission fields that describe the hardware and software of the system under test. For reproducibility and analysis reasons, we now made the easily obtainable fields mandatory - data from storage servers are for users often difficult to obtain, therefore, most remain optional. As a new system, there may be quirks, please reach out on Slack or the mailing list if you see any issues. Further details will be released on the submission page [2]. 3. Reproducibility - Every submission will now receive a reproducibility score based upon the provided system details and the reproducibility questionnaire. This score will inform the community on the amount of details provided in the submission and the obtainability of the storage system. Further, this score will be used to evaluate if a submission is eligible for the Production list. 4. New Phases - We are continuing to evaluate the inclusion of optional test phases for additional key workloads - split easy/hard find phases, 4KB and 1MB random read/write phases, and concurrent metadata operations. This is called an extended run. At the moment, we collect the information to verify that additional phases do not significantly impact the results of a standard run and an extended run to facilitate comparisons between the existing and new benchmark phases. In a future release, we may include some or all of these results as part of the standard benchmark. The extended results are not currently included in the scoring of any ranked list. Background The benchmark suite is designed to be easy to run and the community has multiple active support channels to help with any questions. Please note that submissions of all sizes are welcome; the site has customizable sorting, so it is possible to submit on a small system and still get a very good per-client score, for example. Additionally, the list is about much more than just the raw rank; all submissions help the community by collecting and publishing a wider corpus of data. More details below. Following the success of the Top500 in collecting and analyzing historical trends in supercomputer technology and evolution, the IO500 was created in 2017, published its first list at SC17, and has grown continually since then. The need for such an initiative has long been known within High-Performance Computing; however, defining appropriate benchmarks has long been challenging. Despite this challenge, the community, after long and spirited discussion, finally reached consensus on a suite of benchmarks and a metric for resolving the scores into a single ranking. The multi-fold goals of the benchmark suite are as follows: 1. Maximizing simplicity in running the benchmark suite 2. 
Encouraging optimization and documentation of tuning parameters for performance 3. Allowing submitters to highlight their "hero run" performance numbers 4. Forcing submitters to simultaneously report performance for challenging IO patterns. Specifically, the benchmark suite includes a hero-run of both IOR and mdtest configured however possible to maximize performance and establish an upper-bound for performance. It also includes an IOR and mdtest run with highly prescribed parameters in an attempt to determine a lower performance bound. Finally, it includes a namespace search as this has been determined to be a highly sought-after feature in HPC storage systems that has historically not been well-measured. Submitters are encouraged to share their tuning insights for publication. The goals of the community are also multi-fold: 1. Gather historical data for the sake of analysis and to aid predictions of storage futures 2. Collect tuning information to share valuable performance optimizations across the community 3. Encourage vendors and designers to optimize for workloads beyond "hero runs" 4. Establish bounded expectations for users, procurers, and administrators The IO500 follows a two-staged approach. First, there will be a two-week stabilization period during which we encourage the community to verify that the benchmark runs properly on a variety of storage systems. During this period the benchmark may be updated based upon feedback from the community. The final benchmark will then be released. We expect that runs compliant with the rules made during the stabilization period will be valid as a final submission unless a significant defect is found. 10 Client Node I/O Challenge The 10 Client Node Challenge is conducted using the regular IO500 benchmark, however, with the rule that exactly 10 client nodes must be used to run the benchmark. You may use any shared storage with any number of servers. We will announce the results in the Production and Research lists as well as in separate derived lists. Birds-of-a-Feather Once again, we encourage you to submit [2] to join our community, and to attend the ISC23 BoF [1], where we will announce the new IO500 Production and Research lists and their 10 client node counterparts. The current list includes results from twenty different storage system types and 70 institutions. We hope that the upcoming list grows even more. [1] https://io500.org/pages/bof-isc23 [2] https://io500.org/submission [3] https://io500.org/rules-submission -- The IO500 Committee From aspandem at in.ibm.com Fri Mar 31 14:54:23 2023 From: aspandem at in.ibm.com (Ashish Pandey5) Date: Fri, 31 Mar 2023 13:54:23 +0000 Subject: [gpfsug-discuss] AFM convertToPrimary In-Reply-To: References: Message-ID: Hi Christoph, We can start IO once first AFM snapshot (psnap0) is created at the primary after starting changeSecondary command. We might get psnap0 creation failure with heavy workload. This might not be needed if we have fileset level quiesce feature. 
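For what it's worth, a hedged sketch of how one might watch for that initial psnap0 before letting application IO loose again, using the device and fileset names from the question above (the exact fileset-state and snapshot names can differ by release, so treat those as assumptions):

# after: mmafmctl storage1 convertToPrimary -j afmdrtest2 --afmtarget ... --inband --check-metadata
mmafmctl storage1 getstate -j afmdrtest2    # fileset state while the initial sync runs (something like PrimInitProg), later Active
mmlssnapshot storage1 -j afmdrtest2         # the initial AFM DR snapshot (psnap0...) should show up here once it has been taken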
Thanks & Regards, Ashish ________________________________ From: gpfsug-discuss on behalf of gpfsug-discuss-request at gpfsug.org Sent: Friday, March 31, 2023 4:30 PM To: gpfsug-discuss at gpfsug.org Subject: [EXTERNAL] gpfsug-discuss Digest, Vol 132, Issue 25 [SNIP] -------------- next part -------------- An HTML attachment was scrubbed... URL: