[gpfsug-discuss] gpfs performance monitoring
Salvatore Di Nardo
sdinardo at ebi.ac.uk
Wed Sep 3 18:27:44 BST 2014
Hello everybody,
here I am again, this time asking for some hints on how to monitor
GPFS.
I know about mmpmon, but the issue with its "fs_io_s" and "io_s" requests is
that they return numbers based only on the requests made on the current host,
so I would have to run them on all the clients (over 600 nodes), which is
quite impractical. Instead, I would like to know from the servers what is
going on. I came across the vio_s statistics, which are less documented, and
I don't know exactly what they mean. There is also the script
"/usr/lpp/mmfs/samples/vdisk/viostat", which runs vio_s.
My problem is with the output of this command:
echo "vio_s" | /usr/lpp/mmfs/bin/mmpmon -r 1
mmpmon> mmpmon node 10.7.28.2 name gss01a vio_s OK VIOPS per second
timestamp: 1409763206/477366
recovery group: *
declustered array: *
vdisk: *
client reads: 2584229
client short writes: 55299693
client medium writes: 190071
client promoted full track writes: 465145
client full track writes: 9249
flushed update writes: 4187708
flushed promoted full track writes: 123
migrate operations: 114
scrub operations: 450590
log writes: 28509602
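To make the numbers easier to work with, here is a minimal Python sketch that turns the output above into a dictionary. The function name parse_vio_s is my own (hypothetical), and I am assuming the timestamp field is "seconds/microseconds", which I have not confirmed anywhere.

```python
# Sample text copied from the vio_s output shown above.
SAMPLE = """\
mmpmon node 10.7.28.2 name gss01a vio_s OK VIOPS per second
timestamp: 1409763206/477366
recovery group: *
declustered array: *
vdisk: *
client reads: 2584229
client short writes: 55299693
client medium writes: 190071
client promoted full track writes: 465145
client full track writes: 9249
flushed update writes: 4187708
flushed promoted full track writes: 123
migrate operations: 114
scrub operations: 450590
log writes: 28509602
"""


def parse_vio_s(text):
    """Return (timestamp_seconds, {counter_name: value}) from vio_s text.

    Hypothetical helper, not part of GPFS. Lines without a colon (such as
    the header line) and non-numeric values (such as "*") are skipped.
    """
    timestamp = None
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "timestamp":
            # Assumption: the format is "<seconds>/<microseconds>".
            sec, _, usec = value.partition("/")
            timestamp = int(sec) + int(usec or 0) / 1e6
        elif value.isdigit():
            counters[key] = int(value)
    return timestamp, counters


ts, counters = parse_vio_s(SAMPLE)
print(counters["client reads"])
```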
It says "VIOPS per second", but the values look to me like plain counters:
every time I re-run the command, the numbers increase by a bit.
Can anyone confirm whether those numbers are counters or ops/sec?
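If they do turn out to be cumulative counters, a per-second rate could be derived by differencing two samples. A rough Python sketch of that idea follows. The function name rates is my own, and the second sample is made up purely for illustration; it is not real output.

```python
def rates(prev, curr):
    """Per-second rate for each counter present in both samples.

    prev and curr are (timestamp_seconds, {counter_name: value}) pairs.
    This assumes the values are cumulative counters, which is exactly
    the point I am unsure about.
    """
    t0, c0 = prev
    t1, c1 = curr
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("samples must be in increasing time order")
    return {name: (c1[name] - c0[name]) / dt for name in c1 if name in c0}


# First sample uses two values from the real output; the timestamp is rounded.
first = (1409763206.5, {"client reads": 2584229, "log writes": 28509602})
# Hypothetical second sample taken 10 seconds later (invented numbers).
second = (1409763216.5, {"client reads": 2584329, "log writes": 28509802})

for name, per_sec in sorted(rates(first, second).items()):
    print(f"{name}: {per_sec:.1f} ops/sec")
```

With counters, the interval between samples determines the resolution, so a short, fixed polling interval on each server would be needed to see bursts.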
Looking closer, I don't understand what most of those values mean.
For example, what exactly are "flushed promoted full track writes"?
I tried to find documentation for this output, but could not find
any. Can anyone point me to a link where the output of vio_s is explained?
Another thing I don't understand about those numbers is whether they are just
operations, or the number of blocks that were read/written/etc. I'm asking
because if they are just ops, I don't know how useful they would
be. For example, one write operation could mean writing 1 block or
writing a 100GB file. If those are operations, is there a way to get
the output in bytes or blocks?
Last but not least, and this is what I would really like to accomplish:
I would like to be able to monitor the latency of metadata operations.
In my environment there are users who literally overwhelm our storage
with metadata requests, so even if there is no massive throughput and no huge
waiters, any "ls" can take ages. I would like to be able to monitor
metadata behaviour. Is there a way to do that from the NSD servers?
Thanks in advance for any tip/help.
Regards,
Salvatore