[gpfsug-discuss] GPFS Independent Fileset Limit vs Quotas?
Marc A Kaplan
makaplan at us.ibm.com
Tue Aug 14 16:31:15 BST 2018
True, mmbackup is designed to work best backing up either a single
independent fileset or the entire file system. So if you know some
filesets do not need to be backed up, map them to one or more independent
filesets that will not be backed up.
mmapplypolicy is happy to scan a single dependent fileset: use the option
--scope fileset and make the primary argument the path to the root of the
fileset you wish to scan. The overhead is not simply described. The
directory scan phase will explore or walk the (sub)tree in parallel with
multiple threads on multiple nodes, reading just the directory blocks that
need to be read.
The inodescan phase will read blocks of inodes from the given inodespace
... since the inodes of dependent filesets may be "mixed" into the same
blocks as other dependent filesets that are in the same independent
fileset, mmapplypolicy will incur what you might consider "extra"
overhead.
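
As an illustration of the invocation described above, here is a minimal
sketch. The fileset path /gpfs/fs1/projects/projA, the file locations under
/tmp and the list name 'projfiles' are hypothetical placeholders, not taken
from this thread. It assumes -I defer together with -f, so that the scan only
writes candidate file lists and acts on nothing. The script just prints the
command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: scan one dependent fileset with mmapplypolicy.
# All paths and names below are hypothetical placeholders.
FILESET_ROOT="${FILESET_ROOT:-/gpfs/fs1/projects/projA}"  # root (junction) of the fileset
POLICY_FILE="${POLICY_FILE:-/tmp/projA-list.pol}"
LIST_PREFIX="${LIST_PREFIX:-/tmp/projA-scan}"

# A policy that lists every file the scan selects.
# EXEC '' means no external script is run for the list.
cat > "$POLICY_FILE" <<'EOF'
RULE EXTERNAL LIST 'projfiles' EXEC ''
RULE 'listall' LIST 'projfiles'
EOF

# --scope fileset restricts the scan to the fileset rooted at FILESET_ROOT.
# -I defer with -f writes the candidate lists to files instead of acting on them.
CMD="mmapplypolicy $FILESET_ROOT -P $POLICY_FILE --scope fileset -I defer -f $LIST_PREFIX"

if [ "${RUN:-0}" = "1" ]; then
    $CMD
else
    echo "$CMD"
fi
```

Timing such a run for a small dependent fileset inside a large independent
fileset is one way to see how much of the "extra" inode scan overhead applies
on a given system.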
From: "Peinkofer, Stephan" <Stephan.Peinkofer at lrz.de>
To: gpfsug main discussion list <gpfsug-discuss at spectrumscale.org>
Date: 08/14/2018 12:50 AM
Subject: Re: [gpfsug-discuss] GPFS Independent Fileset Limit vs
Quotas?
Sent by: gpfsug-discuss-bounces at spectrumscale.org
Dear Marc,
If you "must" exceed 1000 filesets because you are assigning each project
to its own fileset, my suggestion is this:
Yes, there are scaling/performance/manageability benefits to using
mmbackup over independent filesets.
But maybe you don't need 10,000 independent filesets --
maybe you can hash or otherwise randomly assign projects that each have
their own (dependent) fileset name to a lesser number of independent
filesets that will serve as management groups for (mm)backup...
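
One possible reading of the hashing suggestion above, as a minimal sketch: the
group count of 64, the "bkgrp" naming scheme and the project names are
hypothetical, and cksum is used only because it is a stable checksum available
everywhere. Any stable hash would do.

```shell
#!/bin/sh
# Sketch: map a project name to one of NUM_GROUPS independent filesets by hashing.
# NUM_GROUPS and the "bkgrp" naming scheme are hypothetical.
NUM_GROUPS="${NUM_GROUPS:-64}"

group_for_project() {
    # cksum prints "<crc> <byte count>"; keep the CRC and reduce it modulo NUM_GROUPS.
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    printf 'bkgrp%02d\n' $((crc % NUM_GROUPS))
}

# Example project names (hypothetical).
for project in projA projB projC; do
    echo "$project -> $(group_for_project "$project")"
done
```

The mapping is deterministic, so a project always lands in the same
management group. Each project's dependent fileset would then be created in
the inode space of its group's independent fileset (for example with the
--inode-space option of mmcrfileset).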
OK, if that might be doable, what's then the performance impact of having
to specify Include/Exclude lists for each independent fileset in order to
specify which dependent fileset should be backed up and which one not?
I don’t remember exactly, but I think I’ve heard at some time, that
Include/Exclude and mmbackup have to be used with caution. And the same
question holds true for running mmapplypolicy for a “job” on a single
dependent fileset: is the scan runtime linear in the size of the
underlying independent fileset or are there some optimisations when I just
want to scan a subfolder/dependent fileset of an independent one?
Like many things in life, sometimes compromises are necessary!
Hmm, can I reference this next time, when we negotiate Scale License
pricing with the ISS sales people? ;)
Best Regards,
Stephan Peinkofer