17 Replies Latest reply on Dec 19, 2013 1:52 PM by GordonShewach

    Need help interpreting server statistics

    pthomas

      Hello everyone,

       

      I am a bit concerned with some of the stats that I am seeing on our server at the moment (specifically the Elapsed Time, I/O Time and Wait Time).

       

      I am also a bit confused by the numbers showing in the "Average" column and how they are able to change so drastically in such a short space of time. My understanding is the average numbers are calculated based on the server uptime, but if that is the case how are the numbers able to have such massive swings in such a short time?

       

      I don't have the server specs on hand, but if required I can get them from our system administrator. I do know that it is running on Mac OS X 10.8 and the database cache has been set to 3,072 MB as per the recommendation in the admin console.

       

      Here are some screenshots taken this morning from one of our servers, which has been running for about 6 days:

      1.png

      Stats as at 8:48am 06/08/2013:

      2.png

       

      Stats as at 9:09am 06/08/2013:

      3.png

       

      Stats as at 9:12am 06/08/2013:

      4.png

       

      Stats as at 9:24am 06/08/2013:

      5.png

       

      Is it normal for the averages to change that much? I would have expected the change in numbers to be much smaller, given how short the period I took those screenshots over is compared to the length of time the server has been running.

       

      Cheers,

       

      Paul.

        • 1. Re: Need help interpreting server statistics
          taylorsharpe

          The statistics will always be changing as people are using the database.  Yes, I see the average values change a lot on mine while it is being used.

           

          Note that the times are in millionths of a second (microseconds).  So in your last stat at 9:24am on 6/8/2013, your Peak (maximum) time/call was only about 1.5 seconds and your average call is 0.000342 seconds. 
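          As a quick sanity check on that conversion, here is a minimal Python sketch. The raw values (342 and roughly 1,500,000) are simply the figures quoted above expressed in microseconds, and the function name is made up for illustration:

```python
MICROSECONDS_PER_SECOND = 1_000_000


def microseconds_to_seconds(value_us: float) -> float:
    """Convert a statistics value given in microseconds to seconds."""
    return value_us / MICROSECONDS_PER_SECOND


average_us = 342        # average elapsed time per call quoted above
peak_us = 1_500_000     # roughly the peak quoted above (about 1.5 seconds)

print(f"average: {microseconds_to_seconds(average_us):.6f} s")
print(f"peak:    {microseconds_to_seconds(peak_us):.1f} s")
```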

           

          Pay attention to your Cache Hit %.  You want its average to be between 95 and 99%.  If you are at 100% all the time, then your cache is probably too large (which is OK if you have lots of spare RAM).  If it is below 95%, you need to increase your cache.  Remember you can only set the Admin console cache to half of the overall RAM.  So if you're maxed out on cache and your average is going below 95%, it is time to buy some more RAM for the server. 
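          Those rules of thumb can be written down as a small Python sketch. The thresholds and the "half of overall RAM" limit are taken from the paragraph above, not from any FileMaker API; the function names and the 6 GB figure are only examples:

```python
def max_cache_mb(total_ram_gb: float) -> int:
    """Largest cache setting under the 'half of overall RAM' rule of thumb."""
    return int(total_ram_gb * 1024 / 2)


def cache_advice(average_hit_pct: float) -> str:
    """Classify an average Cache Hit % using the 95 to 99% guideline."""
    if average_hit_pct < 95:
        return "increase cache (or add RAM if the cache is already maxed out)"
    if average_hit_pct >= 100:
        return "cache may be larger than needed (fine if RAM is spare)"
    return "cache size looks about right"


# Example: a server with 6 GB of RAM (an illustrative value).
print(max_cache_mb(6))      # 3072
print(cache_advice(100))
print(cache_advice(93.5))
```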

          • 2. Re: Need help interpreting server statistics
            pthomas

            Hi Taylor - thanks for the reply!

             

            I guess I was just concerned that the numbers changed so much in that space of time, I would imagine that over time the averages should get to a reasonably stable state, and even large spikes in the current numbers would only have a minimal impact on the average (unless those large spikes continued for a prolonged period of time).
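            To illustrate that expectation, here is a rough Python sketch comparing an average taken over every sample since startup with an average taken over only the most recent samples. I don't know which of these (if either) the admin console actually uses, and all of the sample values are invented:

```python
from collections import deque


def cumulative_and_windowed(samples, window):
    """Return (mean over all samples, mean over the last `window` samples)."""
    total = 0.0
    recent = deque(maxlen=window)  # keeps only the newest `window` values
    for value in samples:
        total += value
        recent.append(value)
    return total / len(samples), sum(recent) / len(recent)


# Hypothetical data: 10,000 quiet calls at 300 microseconds each,
# followed by a burst of 30 slow calls at 150,000 microseconds each.
quiet = [300] * 10_000
spike = [150_000] * 30

cumulative, windowed = cumulative_and_windowed(quiet + spike, window=30)
print(f"cumulative mean: {cumulative:.0f} us")
print(f"windowed mean:   {windowed:.0f} us")
```

            With these made-up numbers the windowed mean jumps all the way to the spike value, while the cumulative mean still more than doubles, because the slow calls are several hundred times larger than the quiet ones. So big swings do not by themselves rule out an average over the whole uptime.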

             

            I understand that the times are in millionths of a second, but that is per call, and when I have been watching the client stats some of the clients have had upwards of 200 calls, which can lead to a noticeable slowdown for that user!

             

            One thing I am concerned about is the disk speed in the server. According to the following URL (I understand that the article is reasonably old, but I would assume the underlying information is still valid; if anyone has a link to a more up-to-date article discussing stats I would love to see it!):

             

            http://www.briandunning.com/browse/browse0110.shtml  

             

            The I/O time being over 100 may indicate that the disks should be upgraded, however because the numbers are all over the place I can't be sure if the disks are really an issue or not.

             

            More specifically it would be good to get a rough idea on how much of an improvement faster disks would give us so I can justify the cost of upgrading, but I need to have a stable baseline of the current performance before I can do that!

             

            I will just have to keep an eye on it over the next few days and see if the averages settle down a bit I guess

             

            Cheers,

             

            Paul.

            • 3. Re: Need help interpreting server statistics
              taylorsharpe

              Out of curiosity, what type of hard drive do you have?  Is it a RAID and/or SSD?  That can make a lot of difference off of a stock system. 

              • 4. Re: Need help interpreting server statistics
                pthomas

                I don't have direct access to the server, but I asked our system administrator and he said that it has the stock standard hard drives that shipped with the Mac server. They are configured into a RAID, but are probably just 5,000 RPM SATA drives.

                • 5. Re: Need help interpreting server statistics
                  taylorsharpe

                  Then you probably have a Mac Mini OS X Server and those drives are typically 5400 rpm drives and very slow.  You could upgrade to SSD.  What I do to get good performance out of a Mac Mini Server is to pair it with a Pegasus Thunderbolt RAID (R4 or R6) and you'll get blazing fast speeds in the neighborhood of 600 MB/s, which is faster than most SSDs.  They come standard with 7200 rpm SATA drives, but as a RAID 10 system, they really can give you faster performance than many very expensive enterprise SAN systems. 

                   

                  Some people really frown on the Mac Mini Server because it is basically a consumer level machine that just has an i7 processor added to it for performance.  While the processor boost is nice, hard drives are what really matter to databases.  So if you match it with a fast RAID, then you're doing pretty well.  An enterprise person will comment on the Mac Mini not having redundant ethernet, backup power supply, etc., which is all true.  But usually for the price of an enterprise class server, I could have 2 or 3 Mac Mini servers all ready to swap out upon failure and I would just keep all the data and startup drive (create its own partition) on the more reliable RAID. 

                   

                  Then again, I can't wait to see what FileMaker will do on the announced, but not yet shipping, Mac Pro!  Hard drive read/writes are supposed to be over a Gigabyte per second!  And with up to 12 of the new xeon E processors, that will be quite an animal.  Then again, I'm sure it will cost many times what a Mac Mini Server does. 

                  • 6. Re: Need help interpreting server statistics
                    pthomas

                    Hi Taylor,

                     

                    I have obtained the following information from our system administrator:

                     

                    Hardware – Xserve (Early 2009), introduced 2009, discontinued 2011

                    Model – Xserve3,1

                    Hard drive – 60 GB (SAS drive, 7200 rpm)

                    RAM – 6 GB (1066 MHz DDR3)

                    Processor – 2.26 GHz Quad-Core Intel Xeon (8 MB shared L3 cache per processor)

                    OS – 10.8.4

                    RAID – 3 x 1 TB

                     

                    Users have been complaining about the performance since our upgrade to FM 12, and the stats look really bad (average elapsed time is currently sitting at 167,413 - average I/O time is 7,947 and average wait time is 15,564). However after a server reboot the stats seem to be ok for a day or so.

                     

                    Is our hardware just not up to running FM 12?

                     

                    I see that it only has 6GB of RAM and the recommended is 8GB, however the Cache stats look fine, so I would assume that RAM is not the issue?

                     

                    It wouldn't surprise me if there is some level of corruption in our database due to its age and how it has been worked on over the years. However, any corruption would also have been there when we were running FM 11, so unless FM 12 is just more sensitive to that sort of thing I wouldn't imagine that would be the reason for the slow performance?

                     

                    Thanks,

                     

                    Paul.

                    • 7. Re: Need help interpreting server statistics
                      wimdecorte

                      You need to look at the admin console stats panel, with the client stats tab selected, when users are complaining or when you see a long "elapsed time / call" in the overall stats.  Then you can check with the users what it is they are doing at that time.  That should give you some pointers as to which areas of the solution are not optimal.  Or use a tool like FMbench.

                       

                      You mention RAID but not what type.  If it is RAID 5, see if you can get it switched over to RAID1+0.  Faster hard disk will help but if the delay is in the "elapsed time per call" column generally then it comes down to pure processing power and inefficiencies in the solution.

                      • 8. Re: Need help interpreting server statistics
                        pthomas

                        Hi Wim - thanks for the feedback!

                         

                        I have just discovered that the server is configured with a RAID 5... however the FileMaker database is hosted on the 60GB hard drive that is not even part of the RAID

                         

                        The only data being stored on the RAID are the backups - scheduled and progressive.

                         

                        I've had a look at the client stats, but to be honest people are just doing basic finds and data entry and things like that when it is slow, the same sort of thing they always do.

                         

                        Here is a snapshot of the user stats when I first checked this morning:

                         

                        Capture.JPG

                         

                        More and more it is looking like some sort of corruption in our solution I think, however I am not sure why it has only started happening since the upgrade to FM12?

                         

                        Thanks for the tip about FMBench - I am going to check it out now, hopefully it can highlight where in the solution the problems may be!

                         

                        Cheers,

                         

                        Paul.

                        • 9. Re: Need help interpreting server statistics
                          wimdecorte

                          Looking at those stats, the i/o time is very high and clearly a big bottleneck.  I would focus on that first.   The numbers that I am seeing there are not typical.  What is your solution doing that would generate a lot of i/o?

                          Is the cache hit % consistently at 100%?

                           

                          How much free disk space?  How big is the total solution?

                          How many records in the various tables?

                           

                          You mention the users doing basic finds.  Those can be "expensive" depending on the architecture of the solution.  Say for instance that you have 1,000,000 records in a table and have an unstored calc field in that table, with users doing searches on that field that generate large found sets.  And the results of those are shown in a wide list view with many other unstored calc fields.  That sort of activity is going to be a real performance killer.
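                          As a loose analogy only (this is plain Python, not how FileMaker works internally), the sketch below contrasts a find on a stored, indexed value with a find on a value that has to be recalculated for every record. The table layout, field names and record count are all invented:

```python
def build_index(records, field):
    """Map each stored field value to the positions of the matching records."""
    index = {}
    for position, record in enumerate(records):
        index.setdefault(record[field], []).append(position)
    return index


def find_indexed(index, value):
    """A find on an indexed field is a single lookup."""
    return index.get(value, [])


def find_unstored(records, calc, value):
    """A find on an unstored calculation evaluates it for every record."""
    evaluations = 0
    matches = []
    for position, record in enumerate(records):
        evaluations += 1
        if calc(record) == value:
            matches.append(position)
    return matches, evaluations


# Hypothetical table of 100,000 records.
records = [{"amount": n % 100, "rate": 2} for n in range(100_000)]
index = build_index(records, "amount")

indexed_hits = find_indexed(index, 42)
calc_hits, evaluations = find_unstored(
    records, lambda r: r["amount"] * r["rate"], 84
)
print(len(indexed_hits), len(calc_hits), evaluations)
```

                          Both finds return the same 1,000 records, but the second one had to touch all 100,000 records to get there, and on a server every one of those records may also have to be read from disk or cache.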

                          • 10. Re: Need help interpreting server statistics
                            BowdenData

                            Paul,

                             

                            Check out this webinar from FileMaker Academy. It covers some of the items that your solution might be doing and what Wim alludes to.

                             

                            http://www.filemakeracademy.com/index.php/december-20-7-tips-to-improve-performance-of-your-filemaker-solution/

                             

                             

                            Doug

                            • 11. Re: Need help interpreting server statistics
                              pthomas

                              Hi Wim,

                               

                              The cache hit% is always at 100.

                               

                              I have spent the last couple of hours sitting with our assistant system administrator (our sys admin is away at the moment) and we have uncovered some more information.

                               

                              Turns out the sys admin hadn't set up the server correctly (Spotlight was still enabled, a bunch of Adobe software that wasn't required was installed and taking up space, etc.), so we have tidied some of that stuff up, which will no doubt help, although I don't believe it will make a massive difference.

                               

                              The server had 14GB of free disk space which we have managed to push out to 34GB.

                               

                              We have also discovered that the RAID does in fact cover the 60GB volume so the FileMaker databases are on a RAID 5.

                               

                              The RAID consists of 3 x Western Digital RE3 WD1002FBYS 1TB 7200 RPM 32MB Cache SATA 3.0Gb/s drives.

                               

                              Having a quick look we found the following drive that we could potentially use to upgrade our RAID drives for a minimal cost:

                               

                              Western Digital WD1002FAEX 1TB 7200 RPM 64MB Cache SATA 6.0Gb/s.

                               

                              Do you think that would be a big enough boost to be worth it?

                               

                              To answer your other questions:

                               

                              Our database file is 4.26GB.

                              The main tables in the solution are Clients - 19,262 records, Contracts - 61,311 records, Bookings - 267,975 records.

                               

                              All in all I would have thought this was towards the smaller end of the scale.

                               

                              We have 20-30 concurrent users.

                               

                              Cheers,

                               

                              Paul.

                              • 12. Re: Need help interpreting server statistics
                                pthomas

                                Thanks Doug - I will check out the link

                                • 13. Re: Need help interpreting server statistics
                                  wimdecorte

                                  Paul Thomas wrote:

                                   

                                   

                                  Western Digital WD1002FAEX 1TB 7200 RPM 64MB Cache SATA 6.0Gb/s.

                                   

                                  Do you think that would be a big enough boost to be worth it?

                                   

                                   

                                  I doubt it.  But it is not going to hurt.  Also look into switching to RAID 1+0 instead of 5.

                                  Remember that RAID is not about speed but data redundancy so any RAID is going to be slower than no RAID (typically).  But 1+0 is considered more efficient for database server operations.
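                                  For a rough feel of the difference, here is a Python sketch using the common textbook write penalties (4 back-end I/Os per write for RAID 5, 2 for RAID 1+0). The per-drive IOPS figure is an assumed placeholder for a 7200 rpm drive, not a measurement, and real controllers and caches will change the numbers:

```python
# Textbook rule-of-thumb write penalties (back-end I/Os per logical write).
WRITE_PENALTY = {"5": 4, "1+0": 2}


def usable_capacity_tb(level: str, drives: int, drive_tb: float) -> float:
    """Usable capacity for RAID 5 (one drive of parity) or RAID 1+0 (mirrored)."""
    if level == "5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drives - 1) * drive_tb
    if level == "1+0":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 1+0 needs an even number of drives, 4 or more")
        return drives / 2 * drive_tb
    raise ValueError(f"unsupported RAID level: {level}")


def effective_write_iops(level: str, drives: int, iops_per_drive: float) -> float:
    """Approximate random-write IOPS after applying the write penalty."""
    return drives * iops_per_drive / WRITE_PENALTY[level]


ASSUMED_IOPS_PER_DRIVE = 80  # placeholder value for a 7200 rpm drive

print(usable_capacity_tb("5", 3, 1.0), effective_write_iops("5", 3, ASSUMED_IOPS_PER_DRIVE))
print(usable_capacity_tb("1+0", 4, 1.0), effective_write_iops("1+0", 4, ASSUMED_IOPS_PER_DRIVE))
```

                                  Under these assumptions both layouts give 2 TB usable, but the 4-drive RAID 1+0 handles noticeably more random writes than the 3-drive RAID 5. Note that RAID 1+0 needs a fourth drive.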

                                   

                                   

                                  Paul Thomas wrote:

                                   

                                   

                                   

                                  Our database file is 4.26GB.

                                  The main tables in the solution are Clients - 19,262 records, Contracts - 61,311 records, Bookings - 267,975 records.

                                   

                                  All in all I would have thought this was towards the smaller end of the scale.

                                   

                                  We have 20-30 concurrent users.

                                   

                                   

                                  The size and # of records is not dramatic.  But those numbers don't tell us anything about the complexity of the design.

                                   

                                  A good analysis of the solution should reveal some quick wins.

                                  • 14. Re: Need help interpreting server statistics
                                    pthomas

                                    I will speak to our sys admin when he is back about swapping to RAID 1+0 although looking at the server I am not sure if there is a drive bay for a 4th drive so that may be a sticking point!

                                     

                                    I am sure there are a lot of improvements that I can make to the solution (it is a number of years old and has been worked on by a bunch of different people) however I am in the process of developing a replacement solution so don't want to spend too much time working on the existing one if I can avoid it.

                                     

                                    I guess my main concern is that these performance issues only started since the upgrade to FM 12, so any issues with the solution design would have been there in FM 11!

                                     

                                    Thanks again for your help - much appreciated!
