From systems at gpfsug.org  Tue Feb  7 21:09:33 2023
From: systems at gpfsug.org (systems at gpfsug.org)
Date: Tue, 07 Feb 2023 21:09:33 +0000
Subject: [gpfsug-discuss] Save the date: Spectrum Scale German User Meeting 2023
Message-ID: <98e6fa2d20ee6d6d5d6370e67e35d67c@gpfsug.org>

Save the date - see the event page for updates:
https://www.spectrumscaleug.org/event/german-user-meeting-2023/
#IBM #SpectrumScale #GPFS

From TROPPENS at de.ibm.com  Thu Feb 9 10:26:46 2023
From: TROPPENS at de.ibm.com (Ulf Troppens)
Date: Thu, 9 Feb 2023 10:26:46 +0000
Subject: [gpfsug-discuss] Save the date: Spectrum Scale German User Meeting 2023
In-Reply-To: <98e6fa2d20ee6d6d5d6370e67e35d67c@gpfsug.org>
References: <98e6fa2d20ee6d6d5d6370e67e35d67c@gpfsug.org>
Message-ID:

Registration for the German Spectrum Scale User Group Meeting is now open:
https://www.spectrumscaleug.org/event/german-user-meeting-2023/

---
Ulf Troppens
Senior Technical Staff Member
Spectrum Scale Development

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Gregor Pillen / Geschäftsführung: David Faller
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

-----Original Message-----
From: gpfsug-discuss On Behalf Of systems at gpfsug.org
Sent: Tuesday, February 7, 2023 22:10
To: gpfsug-discuss at gpfsug.org
Subject: [EXTERNAL] [gpfsug-discuss] Save the date: Spectrum Scale German User Meeting 2023

Save the date - see the event page for updates:
https://www.spectrumscaleug.org/event/german-user-meeting-2023/
#IBM #SpectrumScale #GPFS

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at gpfsug.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org

From Walter.Sklenka at EDV-Design.at  Tue Feb 14 13:09:02 2023
From: Walter.Sklenka at EDV-Design.at (Walter Sklenka)
Date: Tue, 14 Feb 2023 13:09:02 +0000
Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded
Message-ID:

Dear Colleagues!
May I ask if anyone has a hint what could be the reason for "GPFS Critical Thread Watchdog" warnings for the disk lease thread?
Is this a "local node" problem or a network problem?
I see these messages sometimes arriving when NSD servers, which also serve as NFS servers, get under heavy NFS load.
Following is an excerpt from mmfs.log.latest:

2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------
2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds
2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8)
2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294):
2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0
2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0
2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive
2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0
2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y

Mit freundlichen Grüßen
Walter Sklenka
Technical Consultant

EDV-Design Informationstechnologie GmbH
Giefinggasse 6/1/2, A-1210 Wien
Tel: +43 1 29 22 165-31
Fax: +43 1 29 22 165-90
E-Mail: sklenka at edv-design.at
Internet: www.edv-design.at

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From christian.vieser at 1und1.de  Tue Feb 14 14:34:10 2023
From: christian.vieser at 1und1.de (Christian Vieser)
Date: Tue, 14 Feb 2023 15:34:10 +0100
Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded
In-Reply-To:
References:
Message-ID: <450e988b-8b11-8d6d-cf1b-3ef4b50dcc90@1und1.de>

What version of Spectrum Scale is running there?
Do these errors appear since your last version update?

Am 14.02.23 um 14:09 schrieb Walter Sklenka:
> Dear Colleagues!
> May I ask if anyone has a hint what could be the reason for "GPFS Critical Thread Watchdog" warnings for the disk lease thread?
> Is this a "local node" problem or a network problem?
> I see these messages sometimes arriving when NSD servers, which also serve as NFS servers, get under heavy NFS load.
> Following is an excerpt from mmfs.log.latest:
>
> 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
> 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------
> 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds
> 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8)
> 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294):
> 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0
> 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0
> 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster.
> 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
> 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster.
> > 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y > > 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool > /dev/fs4vm all -L -Y > > 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y > > 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool > /dev/fs4vm all -L -Y > > 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y > > 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool > /dev/fs4vm all -L -Y > > 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) > lease renewal is overdue. Pinging to check if it is alive > > 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address > 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 > ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 > backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 > rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 > > 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 > seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. > > 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster > xxx-cluster. > > 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y > > 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool > /dev/fs4vm all -L -Y > > Mit freundlichen Gr??en > */Walter Sklenka/* > */Technical Consultant/* > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Tue Feb 14 15:44:30 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Tue, 14 Feb 2023 15:44:30 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: <450e988b-8b11-8d6d-cf1b-3ef4b50dcc90@1und1.de> References: <450e988b-8b11-8d6d-cf1b-3ef4b50dcc90@1und1.de> Message-ID: Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 
2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Tue Feb 14 23:06:07 2023 From: knop at us.ibm.com (Felipe Knop) Date: Tue, 14 Feb 2023 23:06:07 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Message-ID: <6E2C0773-70AB-42CF-AE63-10BDE76EC2D8@us.ibm.com> All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss on behalf of Walter Sklenka Reply-To: gpfsug main discussion list Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? 
Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Wed Feb 15 09:18:41 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Wed, 15 Feb 2023 09:18:41 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: <6E2C0773-70AB-42CF-AE63-10BDE76EC2D8@us.ibm.com> References: <6E2C0773-70AB-42CF-AE63-10BDE76EC2D8@us.ibm.com> Message-ID: <80aadaa44c3a4201ad94c18e368b83fa@Mail.EDVDesign.cloudia> Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 
2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------
2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds
2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10)
2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294):
2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsigned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0
2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, unsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0
2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0
2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0
2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0
2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0
2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0
2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0
2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0
2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0
2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0
2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0
2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster.

Thank you very much!
Best regards
Walter

From: gpfsug-discuss On Behalf Of Felipe Knop
Sent: Mittwoch, 15. Februar 2023 00:06
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

All,

These messages like

[W] ------------------[GPFS Critical Thread Watchdog]------------------

indicate that a "critical thread", in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention.

Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one?
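One way to pull together every such sample before answering is to scan mmfs.log on both NSD servers. A rough sketch, assuming the default /var/adm/ras log location, ssh access between the nodes, and the node names used in this thread:

  for n in ogpfs1-hs.local ogpfs2-hs.local; do
      echo "=== $n ==="
      # watchdog events plus the stack frames that follow them
      ssh "$n" 'grep -A 15 "GPFS Critical Thread Watchdog" /var/adm/ras/mmfs.log.latest'
      # lease-related events around the same times
      ssh "$n" 'grep -E "Disk lease period expired|Disk lease reacquired|lease renewal is overdue" /var/adm/ras/mmfs.log.latest'
  done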
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive
2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0
2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y

Mit freundlichen Grüßen
Walter Sklenka
Technical Consultant

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From knop at us.ibm.com  Wed Feb 15 14:59:19 2023
From: knop at us.ibm.com (Felipe Knop)
Date: Wed, 15 Feb 2023 14:59:19 +0000
Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded
Message-ID: <78EB0454-5F02-4024-9975-96C7C8FC8AF1@us.ibm.com>

Walter,

Thanks for the details. The stack trace below captures the lease thread in the middle of sending the "lease" RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace does not show whether there was anything blocking the thread prior to the point where the RPCs are sent.

At a first glance:

2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10)

I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on.

Could you open a case to get debug data collected? If the problem can be recreated, I think we'll need a recreate of the problem with traces enabled.

Thanks,

Felipe

----
Felipe Knop   knop at us.ibm.com
GPFS Development and Security
IBM Systems
IBM Building 008
2455 South Rd, Poughkeepsie, NY 12601

From: gpfsug-discuss on behalf of Walter Sklenka
Reply-To: gpfsug main discussion list
Date: Wednesday, February 15, 2023 at 4:23 AM
To: gpfsug main discussion list
Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Hi!
This is a "full" sequence in mmfs.log.latest
Fortunately this was also the last event until now (yesterday evening)
Maybe you can have a look?

2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Wed Feb 15 16:01:00 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Wed, 15 Feb 2023 16:01:00 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: <78EB0454-5F02-4024-9975-96C7C8FC8AF1@us.ibm.com> References: <78EB0454-5F02-4024-9975-96C7C8FC8AF1@us.ibm.com> Message-ID: Hi Felipe! Thank you very much for the check and the explanation Yes, I will open a case and give you an update It is reproducable ( not periodically but over a day 2-3 times ) and interestingly only from one server, which is a filemanager but not clustermanager Have a nice day! Walter From: gpfsug-discuss On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive
2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0
2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster.
2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y
2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y

Mit freundlichen Grüßen
Walter Sklenka
Technical Consultant

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Walter.Sklenka at EDV-Design.at  Thu Feb 16 14:10:54 2023
From: Walter.Sklenka at EDV-Design.at (Walter Sklenka)
Date: Thu, 16 Feb 2023 14:10:54 +0000
Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded
In-Reply-To: <78EB0454-5F02-4024-9975-96C7C8FC8AF1@us.ibm.com>
References: <78EB0454-5F02-4024-9975-96C7C8FC8AF1@us.ibm.com>
Message-ID:

Hi Felipe!
Once again me. Thank you very much for the hint. I did not open a PMR yet because I fear they will ask me/us if we are crazy.
I did not tell the full story yet. We have a 3-node cluster: 2 NSD servers o1 and o2 (same site) and g1 (a different site), all RHEL 8.7, and all of them are VMware VMs.
o1 and o2 each have 4 NVMe drives passed through; a software RAID 5 is built over these NVMes, and from it a single NSD is made for a filesystem fs4vm (m,r=2).

[root at ogpfs1 ras]# mmlscluster

GPFS cluster information
========================
  GPFS cluster name:         edvdesign-cluster.local
  GPFS cluster id:           12147978822727803186
  GPFS UID domain:           edvdesign-cluster.local
  Remote shell command:      /usr/bin/ssh
  Remote file copy command:  /usr/bin/scp
  Repository type:           CCR

 Node  Daemon node name     IP address      Admin node name  Designation
----------------------------------------------------------------------------
   1   ogpfs1-hs.local      10.20.30.1      ogpfs1-hs.local  quorum-manager-perfmon
   2   ogpfs2-hs.local      10.20.30.2      ogpfs2-hs.local  quorum-manager-perfmon
   3   ggpfsq.mgmt.cloudia  xxxx.other.net  ggpfsq.mgmt.a    quorum-perfmon

[root at ogpfs1 ras]# mmlsconfig
Configuration data for cluster edvdesign-cluster.local:
-------------------------------------------------------
clusterName edvdesign-cluster.local
clusterId 12147978822727803186
autoload yes
profile gpfsProtocolRandomIO
dmapiFileHandleSize 32
minReleaseLevel 5.1.6.0
tscCmdAllowRemoteConnections no
ccrEnabled yes
cipherList AUTHONLY
sdrNotifyAuthEnabled yes
maxblocksize 16M
[cesNodes]
maxMBpS 5000
numaMemoryInterleave yes
enforceFilesetQuotaOnRoot yes
workerThreads 512
[common]
tscCmdPortRange 60000-61000
[srv]
verbsPorts mlx5_0/1 mlx5_1/1
[common]
cesSharedRoot /fs4vmware/cesSharedRoot
[srv]
maxFilesToCache 10000
maxStatCache 20000
[common]
verbsRdma enable
[ggpfsq]
verbsRdma disable
[common]
verbsRdmaSend yes
[ggpfsq]
verbsRdmaSend no
[common]
verbsRdmaCm enable
[ggpfsq]
verbsRdmaCm disable
[srv]
pagepool 32G
[common]
adminMode central

File systems in cluster edvdesign-cluster.local:
------------------------------------------------
/dev/fs4vm

[root at ogpfs1 ras]# mmlsdisk fs4vm -L
disk         driver   sector     failure holds    holds                                    storage
name         type       size       group metadata data  status        availability disk id pool         remarks
------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ ---------
ogpfs1_1     nsd         512           1 yes      yes   ready         up                 1 system       desc
ogpfs2_1     nsd         512           2 yes      yes   ready         up                 2 system       desc
ggpfsq_qdisk nsd         512          -1 no       no    ready         up                 3 system       desc
Number of quorum disks: 3
Read quorum value:      2
Write quorum value:     2

And the two nodes o1 and o2 export the filesystem via CES NFS (for VMware). I think this isn't supported, that an NSD server is also a CES node?

And finally the RDMA network: both NSD servers also have a Mellanox ConnectX-6 Lx dual-port 25Gb adapter, again via passthrough, and these interfaces are configured for RDMA (RoCE). Last but not least, this network is not switched but direct attached (2x25Gb, directly connected between the NSD nodes).

RDMA Connections between nodes:
  Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0
    hostname         idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR)  VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT
    ogpfs2-hs.local    0 Y  RTS   (Y)256  478202 (0 )        12728       67024 8864789(0 )     22776      4643             0              0
  Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0
    hostname         idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR)  VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT
    ogpfs2-hs.local    1 Y  RTS   (Y)256  477659 (0 )        12489       67034 8864773(0 )     22794      4639             0              0
[root at ogpfs1 ras]#

You mentioned that it might be CPU contention: maybe due to the VM layer (scheduling with other VMs)? Or a wrong layout of the VMs (8 vCPUs and 64 GB memory, while the ESXi hosts are only single-socket with 32/64 cores HT)? And is the direct-attached RDMA (+ daemon) network also not good?
Do you think IBM would say no to checking such a configuration?

Best regards
Walter

From: gpfsug-discuss On Behalf Of Felipe Knop
Sent: Mittwoch, 15. Februar 2023 15:59
To: gpfsug main discussion list
Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded

Walter,

Thanks for the details. The stack trace below captures the lease thread in the middle of sending the "lease" RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace does not show whether there was anything blocking the thread prior to the point where the RPCs are sent.
At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) 
+ 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: From knop at us.ibm.com Thu Feb 16 17:02:51 2023 From: knop at us.ibm.com (Felipe Knop) Date: Thu, 16 Feb 2023 17:02:51 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Message-ID: <2399C1F3-C7B4-4F9D-B51F-92F082529972@us.ibm.com> Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. 
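For the next occurrence, a minimal collection sketch could look like the following; the exact trace levels and node list are best taken from the support case, and mmtracectl/gpfs.snap are shown here with their defaults only:

  mmtracectl --start     # turn tracing on before the problem window
  # ... wait for the next "[GPFS Critical Thread Watchdog]" entry in mmfs.log.latest ...
  mmtracectl --stop      # stop tracing shortly after the event so the buffers keep it
  gpfs.snap              # then collect the snap (logs, traces, cluster configuration) for the case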
Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss on behalf of Walter Sklenka Reply-To: gpfsug main discussion list Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). (rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? 
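As a quick cross-check of that point (assuming the usual mmces/mmlscluster options, nothing cluster-specific), the CES role assignment can be listed on any node and compared with the NSD server list:

    mmces node list      # nodes that currently hold the CES (protocol) role
    mmlscluster --ces    # CES view of the cluster configuration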
And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
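One way to get an overview of further samples straight from the logs (a rough sketch, relying only on the message format quoted above) is to summarise how often and how late the lease renewals were on each node:

    awk '/Disk lease period expired/ { n++; s += $7; if ($7+0 > m) m = $7+0 }
         END { if (n) printf "expired %d times, max %.3f s late, avg %.3f s late\n", n, m, s/n }' \
        /var/adm/ras/mmfs.log.latest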
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Feb 17 05:43:22 2023 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Feb 2023 05:43:22 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: <2399C1F3-C7B4-4F9D-B51F-92F082529972@us.ibm.com> References: <2399C1F3-C7B4-4F9D-B51F-92F082529972@us.ibm.com> Message-ID: <58CE62DB-F9AD-4EA9-9F86-AD28C53FB14B@rutgers.edu> Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. 
Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). (rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? 
And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Fri Feb 17 09:43:53 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Fri, 17 Feb 2023 09:43:53 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: <58CE62DB-F9AD-4EA9-9F86-AD28C53FB14B@rutgers.edu> References: <2399C1F3-C7B4-4F9D-B51F-92F082529972@us.ibm.com> <58CE62DB-F9AD-4EA9-9F86-AD28C53FB14B@rutgers.edu> Message-ID: <1372b2d60ac5457188f149a85281b618@Mail.EDVDesign.cloudia> Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. 
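A hedged aside: while such a fix is pending, the lease thread itself can be watched from the shell, assuming mmfsd really names the thread DiskLeaseThread as the watchdog message suggests (kernel thread names are truncated to 15 characters, which matches exactly):

    pid=$(pgrep -x mmfsd)
    for t in /proc/$pid/task/*; do
        if grep -q '^DiskLeaseThread' "$t/comm" 2>/dev/null; then
            echo "TID ${t##*/}:"; grep -E 'State|voluntary_ctxt_switches' "$t/status"
        fi
    done

A steadily growing nonvoluntary_ctxt_switches count there would line up with the nivcsw counter printed in the watchdog output.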
Capturing gpfs.snap is important, since we'd be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint. I did not open a PMR yet because I fear they will ask me/us if we are crazy. I did not tell the full story yet. We have a 3 node cluster, 2 NSD servers o1,o2 (same site) and g1 (different site). (rhel 8.7) All of them are Vmware VMs. O1 and o2 have each 4 NVME drives passed through, there is a software raid 5 made over these NVMEs, and from them made a single NSD, for a filesystem fs4vm (m,r=2) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt.
a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. 
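Given the request just below for a recreate with traces enabled, a minimal capture around the next event might look like this (command names only, as a sketch; the exact trace classes and levels would need to be confirmed in the case):

    mmtracectl --start -N ogpfs1-hs.local,ogpfs2-hs.local   # start tracing on the two NSD servers
    #  ... wait for the next watchdog / overdue-lease event ...
    mmtracectl --stop  -N ogpfs1-hs.local,ogpfs2-hs.local   # stop tracing and keep the trace files
    gpfs.snap                                               # collect the debug package afterwards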
At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we'll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a "full" sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease.
+ 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From novosirj at rutgers.edu Fri Feb 17 22:51:31 2023 From: novosirj at rutgers.edu (Ryan Novosielski) Date: Fri, 17 Feb 2023 22:51:31 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: <1372b2d60ac5457188f149a85281b618@Mail.EDVDesign.cloudia> References: <2399C1F3-C7B4-4F9D-B51F-92F082529972@us.ibm.com> <58CE62DB-F9AD-4EA9-9F86-AD28C53FB14B@rutgers.edu> <1372b2d60ac5457188f149a85281b618@Mail.EDVDesign.cloudia> Message-ID: I talked about it a lot in TS011616986. Part of the problem is we?re having a lot of strange problems at the same time, and so the different issues we?re having often come together (like one cause shows two symptoms). I can?t remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. 
Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 17, 2023, at 04:43, Walter Sklenka wrote: Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). 
(rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? 
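That hypothesis is easy to sanity-check from inside the guests (assuming the sysstat tools are installed); hypervisor steal time and involuntary context switches of mmfsd point in the same direction as the nivcsw counter quoted earlier in the thread:

    vmstat 1 10                              # watch the 'st' (steal) column while the NFS load is on
    pidstat -w -t -p $(pgrep -x mmfsd) 5 6   # per-thread voluntary/involuntary context switches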
And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) + 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? 
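One quick way to answer that on both NSD servers is to pull every watchdog block, plus the surrounding lease messages, out of the local logs. A rough sketch, assuming the default /var/adm/ras location for the mmfs logs (the shell prompts earlier in the thread suggest that directory is in use):

  # count watchdog occurrences across rotated logs
  grep -c "GPFS Critical Thread Watchdog" /var/adm/ras/mmfs.log.*

  # each occurrence with its stack frames and the lease messages around it
  grep -B 2 -A 15 "GPFS Critical Thread Watchdog" /var/adm/ras/mmfs.log.latest

  # how often leases are expiring at all, watchdog or not
  grep -E "Disk lease (period expired|reacquired)" /var/adm/ras/mmfs.log.latest | tail -50

Comparing the timestamps of those hits against NFS load peaks on the CES nodes would show whether the two really coincide.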
Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. 
Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From Walter.Sklenka at EDV-Design.at Wed Feb 22 10:18:56 2023 From: Walter.Sklenka at EDV-Design.at (Walter Sklenka) Date: Wed, 22 Feb 2023 10:18:56 +0000 Subject: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded In-Reply-To: References: <2399C1F3-C7B4-4F9D-B51F-92F082529972@us.ibm.com> <58CE62DB-F9AD-4EA9-9F86-AD28C53FB14B@rutgers.edu> <1372b2d60ac5457188f149a85281b618@Mail.EDVDesign.cloudia> Message-ID: <05881da0324e4d7a9fc11673e44f635a@Mail.EDVDesign.cloudia> Hi ; sorry for the delay Our case is TS012184140 They are still analizing As soon as I get feedback I will update you Mit freundlichen Gr??en Walter Sklenka Technical Consultant EDV-Design Informationstechnologie GmbH Giefinggasse 6/1/2, A-1210 Wien Tel: +43 1 29 22 165-31 Fax: +43 1 29 22 165-90 E-Mail: sklenka at edv-design.at Internet: www.edv-design.at Von: gpfsug-discuss Im Auftrag von Ryan Novosielski Gesendet: Friday, February 17, 2023 11:52 PM An: gpfsug main discussion list Betreff: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded I talked about it a lot in TS011616986. Part of the problem is we?re having a lot of strange problems at the same time, and so the different issues we?re having often come together (like one cause shows two symptoms). I can?t remember if there was a case where I specifically mentioned the watchdog, or whether it was unexpectedly late lease times in general. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 17, 2023, at 04:43, Walter Sklenka > wrote: Hi Ryan and Felipe! Could you eventually tell me the case number if you remember it? I opened the case and would reference to your case ID Or shall I send you mine ? From: gpfsug-discuss > On Behalf Of Ryan Novosielski Sent: Freitag, 17. Februar 2023 06:43 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Thanks for this, Felipe. We?ve started seeing intermittent overdue leases in large numbers and don?t otherwise have an explanation for it, other than ?look at your network,? 
which actually does show occasional signs of strange behavior/higher-than-normal RTO values, but we?re not necessarily seeing those things happen at the same times as the lease issues. We?ve also seen ?GPFS Critical Thread Watchdog? recently. We had a case open about it, but didn?t draw any real conclusions. If any of our data might be helpful/if there?s a case we could reference to see if we?re also running into that, we could provide a gpfs.snap. FWIW, we are running 5.1.3-1 on the storage side (except one system that?s about to be upgraded that runs a combination of 5.0.3-2 and 5.0.5-1), and 5.1.6-0 (soon to be 5.1.6-1) on the remote/client cluster side. -- #BlackLivesMatter ____ || \\UTGERS, |---------------------------*O*--------------------------- ||_// the State | Ryan Novosielski - novosirj at rutgers.edu || \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus || \\ of NJ | Office of Advanced Research Computing - MSB C630, Newark `' On Feb 16, 2023, at 12:02, Felipe Knop > wrote: Walter, Thanks for the detailed description. I don?t yet see anything glaringly incorrect on your configuration, but perhaps others might find something out of place. I?d encourage you to open a case, since I spoke with a colleague yesterday, and he mentioned that he is working on a problem that may cause the lease thread to ?loop? for a while. That might cause the critical thread watchdog to flag the lease thread as taking too long to ?check in?. Capturing gpfs.snap is important, since we?d be looking into all the [W] ------------------[GPFS Critical Thread Watchdog]------------------ instances. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Thursday, February 16, 2023 at 9:16 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi Felipe! Once again me. Thank you very much for the hint I did not open a PMR yet because I fear they will ask me/us if we are cracy ? I did not tell the full story yet We have a 3 node cluster, 2 NSD servers o1,o2 (same site ) and g1 (different site). (rhel 8.7) All of them are Vmware VMs O1 and o2 have each 4 NVME drives passed through , there is a software raid 5 made over these NVMEs , and from them made a single NSD , for a filesystem fs4vm (m,r=2 ) [root at ogpfs1 ras]# mmlscluster GPFS cluster information ======================== GPFS cluster name: edvdesign-cluster.local GPFS cluster id: 12147978822727803186 GPFS UID domain: edvdesign-cluster.local Remote shell command: /usr/bin/ssh Remote file copy command: /usr/bin/scp Repository type: CCR Node Daemon node name IP address Admin node name Designation ---------------------------------------------------------------------------- 1 ogpfs1-hs.local 10.20.30.1 ogpfs1-hs.local quorum-manager-perfmon 2 ogpfs2-hs.local 10.20.30.2 ogpfs2-hs.local quorum-manager-perfmon 3 ggpfsq.mgmt.cloudia xxxx.other.net ggpfsq.mgmt. 
a quorum-perfmon [root at ogpfs1 ras]# mmlsconfig Configuration data for cluster edvdesign-cluster.local: ------------------------------------------------------- clusterName edvdesign-cluster.local clusterId 12147978822727803186 autoload yes profile gpfsProtocolRandomIO dmapiFileHandleSize 32 minReleaseLevel 5.1.6.0 tscCmdAllowRemoteConnections no ccrEnabled yes cipherList AUTHONLY sdrNotifyAuthEnabled yes maxblocksize 16M [cesNodes] maxMBpS 5000 numaMemoryInterleave yes enforceFilesetQuotaOnRoot yes workerThreads 512 [common] tscCmdPortRange 60000-61000 [srv] verbsPorts mlx5_0/1 mlx5_1/1 [common] cesSharedRoot /fs4vmware/cesSharedRoot [srv] maxFilesToCache 10000 maxStatCache 20000 [common] verbsRdma enable [ggpfsq] verbsRdma disable [common] verbsRdmaSend yes [ggpfsq] verbsRdmaSend no [common] verbsRdmaCm enable [ggpfsq] verbsRdmaCm disable [srv] pagepool 32G [common] adminMode central File systems in cluster edvdesign-cluster.local: ------------------------------------------------ /dev/fs4vm [root at ogpfs1 ras]# mmlsdisk fs4vm -L disk driver sector failure holds holds storage name type size group metadata data status availability disk id pool remarks ------------ -------- ------ ----------- -------- ----- ------------- ------------ ------- ------------ --------- ogpfs1_1 nsd 512 1 yes yes ready up 1 system desc ogpfs2_1 nsd 512 2 yes yes ready up 2 system desc ggpfsq_qdisk nsd 512 -1 no no ready up 3 system desc Number of quorum disks: 3 Read quorum value: 2 Write quorum value: 2 And the two nodes o1 and o2 export the filesystem via CES NFS functions ( for VMware) I think this isn?supported , that a NSD Server is also a CES Node? And finally the RDMA Network : The both NSD servers also have a Mellanox ConnectX-6 Lx dual port 25Gb adapter also via passthrough And these interfaces we configured for rdma (RoCE) , Last but not least: this network is not switched but direct attached ( 2x25Gb directly connected between the NSD nodes ) RDMA Connections between nodes: Fabric 0 - Device mlx5_0 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 0 Y RTS (Y)256 478202 (0 ) 12728 67024 8864789(0 ) 22776 4643 0 0 Fabric 0 - Device mlx5_1 Port 1 Width 1x Speed EDR lid 0 hostname idx CM state VS buff RDMA_CT(ERR) RDMA_RCV_MB RDMA_SND_MB VS_CT(ERR) VS_SND_MB VS_RCV_MB WAIT_CON_SLOT WAIT_NODE_SLOT ogpfs2-hs.local 1 Y RTS (Y)256 477659 (0 ) 12489 67034 8864773(0 ) 22794 4639 0 0 [root at ogpfs1 ras]# You mentioned that it might be a cpu contention : Maybe due to the VM layer (scheduling with other VMS) ? And wrong layout of VMs ( 8 vCPUs and 64GB Mem) [ esxis only single socket with 32/64 cores HT) And also the direct attached RDMA ( +DAEMON) network is also not good? Do you think IBM would say no to check such a configuration ? Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 15:59 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Walter, Thanks for the details. The stack trace below captures the lease thread in the middle of sending the ?lease? RPC. This operation normally is not blocking, and we do not often block while sending the RPC. But the stack trace ?does not show? whether there was anything blocking the thread prior to the point where the RPCs are sent. 
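Since the watchdog trace only shows where the thread was when it was flagged, not what delayed it beforehand, one cheap thing to capture while the problem is live (not a substitute for the gpfs.snap and trace data requested elsewhere in this thread) is a periodic snapshot of the daemon's waiters. A sketch, assuming the standard /usr/lpp/mmfs/bin install path:

  # long-running waiters and daemon network state around the time leases go overdue
  while true; do
      date
      /usr/lpp/mmfs/bin/mmdiag --waiters
      /usr/lpp/mmfs/bin/mmdiag --network | head -30
      sleep 5
  done >> /var/tmp/lease-waiters.log 2>&1

If the lease thread is genuinely starved for CPU, the waiters list will typically look unremarkable while the watchdog still fires; if it is stuck behind a mutex or local I/O, the blocking waiter should show up in these snapshots.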
At a first glance: 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) I believe nivcsw: 10 means that the thread was scheduled out of the CPU involuntarily, possibly indicating that there is some CPU contention going on. Could you open a case to get debug data collected? If the problem can be recreated, I think we?ll need a recreate of the problem with traces enabled. Thanks, Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Wednesday, February 15, 2023 at 4:23 AM To: gpfsug main discussion list > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! This is a ?full? sequence in mmfs.?log.?latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:?43:?51.?474+0100: [N] Disk lease period expired 0.?030 seconds ago in cluster ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! This is a ?full? sequence in mmfs.log.latest Fortunately this was also the last event until now (yesterday evening) Maybe you can have a look? 2023-02-14_19:43:51.474+0100: [N] Disk lease period expired 0.030 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:07.430+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_19:44:07.430+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_19:44:07.430+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 10) 2023-02-14_19:44:07.430+0100: [W] Call Trace(PID: 7294): 2023-02-14_19:44:07.431+0100: [W] #0: 0x000055CABE4A56AB NodeConn::sendMessage(TcpConn**, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, unsigned long long*, unsigned long long*, unsi gned int*, CondvarName, vsendCallback_t*) + 0x42B at ??:0 2023-02-14_19:44:07.432+0100: [W] #1: 0x000055CABE4A595F llc_send_msg(ClusterConfiguration*, NodeAddr, iovec*, int, unsigned char, int, int, int, unsigned int, DestTag*, int*, TcpConn**, unsigned long long*, u nsigned long long*, unsigned int*, CondvarName, vsendCallback_t*, int, unsigned int) + 0xDF at ??:0 2023-02-14_19:44:07.437+0100: [W] #2: 0x000055CABE479A55 MsgRecord::send() + 0x1345 at ??:0 2023-02-14_19:44:07.438+0100: [W] #3: 0x000055CABE47A169 tscSendInternal(ClusterConfiguration*, unsigned int, unsigned char, int, int, NodeAddr*, TscReply*, TscScatteredBuff*, int, int (*)(void*, ClusterConfig uration*, int, NodeAddr*, TscReply*), void*, ChainedCallback**, __va_list_tag*) + 0x339 at ??:0 2023-02-14_19:44:07.439+0100: [W] #4: 0x000055CABE47C39A tscSendWithCallback(ClusterConfiguration*, unsigned int, unsigned char, int, NodeAddr*, TscReply*, int (*)(void*, ClusterConfiguration*, int, NodeAddr*, TscReply*), void*, void**, int, ...) 
+ 0x1DA at ??:0 2023-02-14_19:44:07.440+0100: [W] #5: 0x000055CABE5F9853 MyLeaseState::renewLease(NodeAddr, TickTime) + 0x6E3 at ??:0 2023-02-14_19:44:07.440+0100: [W] #6: 0x000055CABE5FA682 ClusterConfiguration::checkAndRenewLease(TickTime) + 0x192 at ??:0 2023-02-14_19:44:07.441+0100: [W] #7: 0x000055CABE5FAAC6 ClusterConfiguration::RunLeaseChecks(void*) + 0x366 at ??:0 2023-02-14_19:44:07.441+0100: [W] #8: 0x000055CABDF2B662 Thread::callBody(Thread*) + 0x42 at ??:0 2023-02-14_19:44:07.441+0100: [W] #9: 0x000055CABDF18680 Thread::callBodyWrapper(Thread*) + 0xA0 at ??:0 2023-02-14_19:44:07.441+0100: [W] #10: 0x00007F3B7563D1CA start_thread + 0xEA at ??:0 2023-02-14_19:44:07.441+0100: [W] #11: 0x00007F3B7435BE73 __GI___clone + 0x43 at ??:0 2023-02-14_19:44:10.512+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_19:44:10.512+0100: [N] Disk lease period expired 7.970 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_19:44:12.563+0100: [N] Disk lease reacquired in cluster xxx-cluster. Thank you very much! Best regards Walter From: gpfsug-discuss > On Behalf Of Felipe Knop Sent: Mittwoch, 15. Februar 2023 00:06 To: gpfsug main discussion list > Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded All, These messages like [W] ------------------[GPFS Critical Thread Watchdog]------------------ indicate that a ?critical thread?, in this case the lease thread, was apparently blocked for longer than expected. This is usually not caused by delays in the network, but possibly by excessive CPU load, blockage while accessing the local file system, or possible mutex contention. Do you have other samples of the message, with a more complete stack trace? Or was the instance below the only one? Felipe ---- Felipe Knop knop at us.ibm.com GPFS Development and Security IBM Systems IBM Building 008 2455 South Rd, Poughkeepsie, NY 12601 From: gpfsug-discuss > on behalf of Walter Sklenka > Reply-To: gpfsug main discussion list > Date: Tuesday, February 14, 2023 at 10:49 AM To: "gpfsug-discuss at gpfsug.org" > Subject: [EXTERNAL] Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded Hi! I started with 5.?1.?6.?0 and now am at [root@?ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.?1.?6.?1 ". the messages started from the beginning From: gpfsug-discuss On ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hi! I started with 5.1.6.0 and now am at [root at ogpfs1 ~]# mmfsadm dump version Dump level: verbose Build branch "5.1.6.1 ". the messages started from the beginning From: gpfsug-discuss > On Behalf Of Christian Vieser Sent: Dienstag, 14. Februar 2023 15:34 To: gpfsug-discuss at gpfsug.org Subject: Re: [gpfsug-discuss] Reasons for DiskLeaseThread Overloaded What version of Spectrum Scale is running there? Do these errors appear since your last version update? Am 14.02.23 um 14:09 schrieb Walter Sklenka: Dear Collegues! May I ask if anyone has a hint what could be the reason for Critical Thread Watchdog warnings for Disk Leases Threads? Is this a ?local node? Problem or a network problem ? I see these messages sometimes arriving when NSD Servers which also serve as NFS servers when they get under heavy NFS load Following is an excerpt from mmfs.log.latest 2023-02-14_12:06:53.235+0100: [N] Disk lease period expired 0.040 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 
2023-02-14_12:06:53.600+0100: [W] ------------------[GPFS Critical Thread Watchdog]------------------ 2023-02-14_12:06:53.600+0100: [W] PID: 7294 State: R (DiskLeaseThread) is overloaded for more than 8 seconds 2023-02-14_12:06:53.600+0100: [W] counter: 0 (mark-idle: 0 mark-active: 0 pre-work: 0 post-work: 0) sched: (nvcsw: 0 nivcsw: 8) 2023-02-14_12:06:53.600+0100: [W] Call Trace(PID: 7294): 2023-02-14_12:06:53.600+0100: [W] #0: 0x000055CABDF49521 BaseMutexClass::release() + 0x12 at ??:0 2023-02-14_12:06:53.600+0100: [W] #1: 0xB1557721BBABD900 _etext + 0xB154F7E646041C0E at ??:0 2023-02-14_12:07:09.554+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:07:09.554+0100: [N] Disk lease period expired 5.680 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_12:07:11.605+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_12:10:55.990+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:10:55.990+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_12:30:58.756+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.988+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:10:55.989+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:21:40.892+0100: [N] Node 10.20.30.2 (ogpfs2-hs.local) lease renewal is overdue. Pinging to check if it is alive 2023-02-14_13:21:40.892+0100: [I] The TCP connection to IP address 10.20.30.2 ogpfs2-hs.local :[1] (socket 106) state: state=1 ca_state=0 snd_cwnd=10 snd_ssthresh=2147483647 unacked=0 probes=0 backoff=0 retransmits=0 rto=201000 rcv_ssthresh=1219344 rtt=121 rttvar=69 sacked=0 retrans=0 reordering=3 lost=0 2023-02-14_13:22:00.220+0100: [N] Disk lease period expired 0.010 seconds ago in cluster xxx-cluster. Attempting to reacquire the lease. 2023-02-14_13:22:08.298+0100: [N] Disk lease reacquired in cluster xxx-cluster. 2023-02-14_13:30:58.760+0100: [I] Command: mmlspool /dev/fs4vm all -L -Y 2023-02-14_13:30:58.760+0100: [I] Command: successful mmlspool /dev/fs4vm all -L -Y Mit freundlichen Gr??en Walter Sklenka Technical Consultant _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org _______________________________________________ gpfsug-discuss mailing list gpfsug-discuss at gpfsug.org http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From WPeters at ATPCO.NET Wed Feb 22 17:50:50 2023 From: WPeters at ATPCO.NET (Bill Peters) Date: Wed, 22 Feb 2023 17:50:50 +0000 Subject: [gpfsug-discuss] spanning datacenter to AWS Message-ID: Hello all, I've been on the mailing list for a few years but have not been active except my introduction email. We are having an issue I'd like to run past everyone and see if anyone has experience that may help. Currently using Spectrum Scale Data Management Edition 5.1.1.0 Our Spectrum Scale cluster is running on Linux VMs on IBM z/VM. We have one application that cannot support the z/VM architecture so we used to have those servers running on VMware in our datacenter and those servers were client nodes in the Spectrum Scale cluster. This configuration worked great. We recently retired VMWare and moved all that workload to AWS. 
Because this was no longer on our LAN we thought it would be a good idea (IBM support also recommended it) to use CES NFS rather than adding the AWS instances to the cluster. Since doing this we have seen problems under high IO. Some NFS clients will try to access files that don't seem to be there resulting in file not found errors. We know the files have been created but the NFS clients can't see them. The read process runs successfully shortly after. We are not saturating our AWS connection. I haven't seen any NFS tuning that looks like it would help, but that is an option I would be willing to try. The other option I'm thinking about is just adding the NFS clients to the cluster. Has anyone spanned datacenters like this? Thanks, any help is appreciated. -Bill Bill Peters Senior Platform Engineer 703-475-3386 wpeters at atpco.net atpco.net 45005 Aviation Drive Dulles, VA 20166 [A close up of a sign Description automatically generated] [Title: Facebook - Description: Facebook icon] [Title: Twitter - Description: Twitter icon] [Title: LinkedIn - Description: LinkedIn icon] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 17692 bytes Desc: image001.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image002.png Type: image/png Size: 1266 bytes Desc: image002.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image003.png Type: image/png Size: 1329 bytes Desc: image003.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image004.png Type: image/png Size: 1378 bytes Desc: image004.png URL: From stockf at us.ibm.com Wed Feb 22 19:00:40 2023 From: stockf at us.ibm.com (Frederick Stock) Date: Wed, 22 Feb 2023 19:00:40 +0000 Subject: [gpfsug-discuss] spanning datacenter to AWS In-Reply-To: References: Message-ID: Bill, if my memory serves me correctly, there was a fix done in later versions of Scale (there may be an efix available) for the situation you described. Notably, Scale was not properly propagating information about files created through NFS. I suggest you contact Scale support to see if they can provide more details, as well as options for obtaining the fix, assuming my mind has not completely failed me on this issue ? Fred Fred Stock, Spectrum Scale Development Advocacy stockf at us.ibm.com | 720-430-8821 From: gpfsug-discuss on behalf of Bill Peters Date: Wednesday, February 22, 2023 at 12:53 PM To: gpfsug-discuss at spectrumscale.org Subject: [EXTERNAL] [gpfsug-discuss] spanning datacenter to AWS Hello all, I?ve been on the mailing list for a few years but have not been active except my introduction email. We are having an issue I?d like to run past everyone and see if anyone has experience that may help. Currently using ZjQcmQRYFpfptBannerStart This Message Is From an External Sender This message came from outside your organization. ZjQcmQRYFpfptBannerEnd Hello all, I?ve been on the mailing list for a few years but have not been active except my introduction email. We are having an issue I?d like to run past everyone and see if anyone has experience that may help. Currently using Spectrum Scale Data Management Edition 5.1.1.0 Our Spectrum Scale cluster is running on Linux VMs on IBM z/VM. 
We have one application that cannot support the z/VM architecture so we used to have those servers running on VMware in our datacenter and those servers were client nodes in the Spectrum Scale cluster. This configuration worked great. We recently retired VMWare and moved all that workload to AWS. Because this was no longer on our LAN we thought it would be a good idea (IBM support also recommended it) to use CES NFS rather than adding the AWS instances to the cluster. Since doing this we have seen problems under high IO. Some NFS clients will try to access files that don't seem to be there resulting in file not found errors. We know the files have been created but the NFS clients can't see them. The read process runs successfully shortly after. We are not saturating our AWS connection. I haven't seen any NFS tuning that looks like it would help, but that is an option I would be willing to try. The other option I'm thinking about is just adding the NFS clients to the cluster. Has anyone spanned datacenters like this? Thanks, any help is appreciated. -Bill
Bill Peters
Senior Platform Engineer
703-475-3386
wpeters at atpco.net
atpco.net
45005 Aviation Drive Dulles, VA 20166
[A close up of a sign Description automatically generated] [Title: Facebook - Description: Facebook icon] [Title: Twitter - Description: Twitter icon] [Title: LinkedIn - Description: LinkedIn icon]
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 17692 bytes
Desc: image001.png
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 1266 bytes
Desc: image002.png
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.png
Type: image/png
Size: 1329 bytes
Desc: image003.png
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image004.png
Type: image/png
Size: 1378 bytes
Desc: image004.png
URL:
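Until that fix can be confirmed and applied, one low-risk experiment on the AWS side is to reduce client-side NFS caching, since the symptom described above (a freshly created file that a client briefly cannot see) is the classic signature of cached negative lookups and attribute caching. The sketch below uses standard Linux NFS client mount options; the server address, export path and mount point are placeholders, and whether this clears the symptom in this particular environment is an assumption, not a known fix:

  # on an affected AWS client: stop caching negative lookups, shorten attribute caching
  umount /mnt/scale
  mount -t nfs -o lookupcache=positive,actimeo=1 ces-vip.example.com:/scale/export /mnt/scale

  # stricter still: no lookup caching and no attribute caching at all
  # (expect noticeably more LOOKUP/GETATTR traffic across the WAN)
  # mount -t nfs -o lookupcache=none,noac ces-vip.example.com:/scale/export /mnt/scale

The trade-off is extra metadata round trips over the datacenter-to-AWS link, so it is worth measuring before rolling it out widely; if that overhead is unacceptable, the other route mentioned above (making the AWS instances Scale clients, for example via a remote cluster mount) keeps full GPFS consistency semantics at the cost of stretching cluster membership across the WAN.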