[gpfsug-discuss] gpfsug-discuss Digest, Vol 155, Issue 2

Truong Vu truongv at us.ibm.com
Tue Jun 24 19:00:14 BST 2025


There is an undocumented option for this purpose. You can issue mmdelnode -f on the bad node. This cleans up leftover configuration and stops/starts services if needed.
>>> Specifically dual EPYC 9555
If tsgskkm is hung, you may be hitting a known GSKit issue. Can you manually apply the workaround and see if it works?

Insert the following lines into the file /usr/lpp/mmfs/lib/gsk8/C/icc/icclib/ICCSIG.txt:
ICC_SHIFT=3
ICC_TRNG=TRNG_ALT4

Insert the following line into the file /usr/lpp/mmfs/lib/gsk8/N/icc/icclib/ICCSIG.txt:
ICC_TRNG=TRNG_ALT4
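
If you would rather script the edit than do it by hand, something like the following might do. It is only a sketch: the helper names (append_if_missing, apply_workaround) and the GSK_BASE variable are made up for illustration, and it assumes both ICCSIG.txt files already exist and end with a newline. It appends each line only when that exact line is not already present, so running it twice is harmless.

```shell
# Append a line to an existing file only if that exact line is not already there.
# Hypothetical helper, not part of GPFS or GSKit.
append_if_missing() {
    file=$1
    line=$2
    # Refuse to create the file: a missing file suggests a wrong path.
    [ -f "$file" ] || { echo "missing file: $file" >&2; return 1; }
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Base path taken from the message above; GSK_BASE is an illustrative override
# so the function can be tried against a scratch copy first.
GSK_BASE=${GSK_BASE:-/usr/lpp/mmfs/lib/gsk8}

apply_workaround() {
    append_if_missing "$GSK_BASE/C/icc/icclib/ICCSIG.txt" "ICC_SHIFT=3" &&
    append_if_missing "$GSK_BASE/C/icc/icclib/ICCSIG.txt" "ICC_TRNG=TRNG_ALT4" &&
    append_if_missing "$GSK_BASE/N/icc/icclib/ICCSIG.txt" "ICC_TRNG=TRNG_ALT4"
}
```

Nothing is changed until you call apply_workaround (as root on the affected node). Taking a copy of both files beforehand is probably wise.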

Can you post lscpu output?

Thanks,
Tru.

On 6/24/25, 12:32 PM, "gpfsug-discuss on behalf of gpfsug-discuss-request at gpfsug.org" <gpfsug-discuss-bounces at gpfsug.org on behalf of gpfsug-discuss-request at gpfsug.org> wrote:


Send gpfsug-discuss mailing list submissions to
gpfsug-discuss at gpfsug.org

To subscribe or unsubscribe via the World Wide Web, visit
http://gpfsug.org/mailman/listinfo/gpfsug-discuss_gpfsug.org
or, via email, send a message with subject or body 'help' to
gpfsug-discuss-request at gpfsug.org

You can reach the person managing the list at
gpfsug-discuss-owner at gpfsug.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of gpfsug-discuss digest..."




Today's Topics:


1. Node in cluster but not in cluster (Laurence Horrocks-Barlow)




----------------------------------------------------------------------


Message: 1
Date: Tue, 24 Jun 2025 16:29:03 +0000
From: Laurence Horrocks-Barlow <lhorrocks-barlow at ocf.co.uk>
To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
Subject: [gpfsug-discuss] Node in cluster but not in cluster
Message-ID:
<LO2P265MB2877A1116CDB7D6F72379F7BD578A at LO2P265MB2877.GBRP265.PROD.OUTLOOK.COM>


Content-Type: text/plain; charset="utf-8"


Hi JAB,


In line with Achim, I'd be tempted to do the following (based on what's in your email):


From the failed node:


ssh <new node>
rm -rfv /var/mmfs/gen
confirm ssh and/or API auth etc. - fix if required
confirm firewall ports - fix if required
halt -p


---------------------------------


From a node in the cluster:


mmdelnode <new node> -p


---------------------------------


Boot new node up again


Check pings etc, and try again.


If that doesn't work, let us know.






Laurence Horrocks-Barlow | Technical Director
Phone: 0114 257 2200
Address: OCF Limited, Unit 5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield S35 2PG
Website: www.ocf.co.uk
OCF Limited is a company registered in England and Wales. Registered number 4132533, VAT number GB 780 6803 14. Registered office address: OCF Limited, 5 Rotunda Business Centre, Thorncliffe Park, Chapeltown, Sheffield, S35 2PG.
This message is private and confidential. If you have received this message in error, please notify us immediately and remove it from your system.




"It is well known that a vital ingredient of success is not knowing that what you're attempting can't be done."
-- Sir Terry Pratchett


I think the node was not added to the cluster, but some changes on the node itself had already taken place (like creating /var/mmfs/gen ... and possibly others).
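
A quick way to check for that kind of leftover state on the new node might look like the sketch below. The function name has_leftover_state is made up for illustration, and it only looks at /var/mmfs/gen (the one path named here), so a clean result does not rule out leftovers elsewhere.

```shell
# Succeed if the given directory exists and is non-empty.
# Defaults to /var/mmfs/gen, the path mentioned in this thread.
# Hypothetical helper, not a GPFS command.
has_leftover_state() {
    dir=${1:-/var/mmfs/gen}
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

if has_leftover_state; then
    echo "leftover state found in /var/mmfs/gen"
else
    echo "no leftover state in /var/mmfs/gen"
fi
```

Run it on the node that failed to join. If it reports leftover state, that would be consistent with mmaddnode believing the node already belongs to a cluster.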


There is/was a chapter/appendix in the GPFS Administration Guide on how to permanently remove GPFS from a node; that might help stop the mmaddnode command from treating the node as already belonging to a cluster ...






--


Mit freundlichen Grüßen / Kind regards


Achim Rehor


Technical Support Specialist Spectrum Scale and ESS (SME)
Advisory Product Services Professional
IBM Systems Storage Support - EMEA


Achim.Rehor at de.ibm.com +49-170-4521194
IBM Deutschland GmbH
Vorsitzender des Aufsichtsrats: Sebastian Krause
Geschäftsführung: Gregor Pillen (Vorsitzender), Nicole Reimer,
Gabriele Schwarenthorer, Christine Rupp, Frank Theisen
Sitz der Gesellschaft: Ehningen / Registergericht: Amtsgericht
Stuttgart, HRB 14562 / WEEE-Reg.-Nr. DE 99369940




-----Original Message-----
From: Jonathan Buzzard <jonathan.buzzard at strath.ac.uk>
Reply-To: gpfsug main discussion list <gpfsug-discuss at gpfsug.org>
To: gpfsug-discuss at gpfsug.org <gpfsug-discuss at gpfsug.org>
Subject: [EXTERNAL] [gpfsug-discuss] Node in cluster but not in cluster
Date: Tue, 24 Jun 2025 16:13:48 +0100




I was trying to add a node to the cluster, but after an hour I hit Ctrl-C
as it was clearly not in a good place. It told me that it didn't make
changes:


^Cmmdsh: Caught SIG INT - terminating the child processes.
mmaddnode: Interrupt received: No changes made.


I checked to make sure all the networking is correct, GPFS packages are
installed on the node etc. and it is all good so I go to add it again and


mmaddnode: Node <##### censored> was not added to the cluster.
The node appears to already belong to a GPFS cluster.
mmaddnode: mmaddnode quitting. None of the specified nodes are valid.
mmaddnode: Command failed. Examine previous error messages to determine
cause.


OK, let's try to delete it:


mmdelnode: Incorrect node <##### censored> specified for command.
mmdelnode: No nodes were found that matched the input specification.
mmdelnode: Command failed. Examine previous error messages to determine
cause.




Trying to do mmsdrrestore on the node gives


mmsdrrestore: There is no record for this node in file
gpfs0:/var/mmfs/gen/mmsdrfs.
Either the node is not part of the cluster, or the file is for a
different cluster,
or not all of the node's adapter interfaces have been activated yet.
mmsdrrestore: Command failed. Examine previous error messages to
determine cause.


The node does not show up in mmlsnode or mmlscluster.


Anyone any idea what is going on?




JAB.

