We recently upgraded from FileMaker Server 13 to FileMaker Server 16. Since the upgrade on March 31, FMS has stopped responding on two separate occasions and had to be rebooted. This is an unexpected amount of downtime, and I'd like some advice and assistance from FileMaker in troubleshooting the problem. I suspect that the cause is an Import Records script step, but I don't know how to confirm this.
Server Machine Information
macOS Sierra 10.12.6
Mac Pro (Mid 2010)
Processor 2 x 2.66 GHz 6-Core Intel Xeon
Memory 16 GB 1066 MHz DDR3
Worker Machine Information
macOS Sierra 10.12.6
Mac mini Server (Mid 2010)
Processor 2.66 GHz Intel Core 2 Duo
Memory 4 GB 1067 MHz DDR3
FileMaker Environment Overview
1 server and 1 worker machine (specs above)
~80 hosted databases
~50-60 connected FMPA 15 clients during business hours; most are on OS X; a small minority of users are on FMPA 16 or Windows
~7-10 connected FM Go clients; some are always-on kiosks
1 rarely used WebDirect solution
3 external services making intermittent XML connections (~1,000 requests per day)
On April 10, staff arrived at the office in the morning and were unable to open any databases. When they tried to open a database, FileMaker displayed a "Find in Progress" dialog. This dialog appeared even before the OnFirstWindowOpen script trigger fired. We rebooted the server machine, and after the reboot, behavior returned to normal.
On April 16, staff arrived at the office in the morning and discovered the same behavior. In addition, the FMS Admin site was not responding. We rebooted the server machine again and after the reboot, behavior returned to normal.
Looking at Stats.log, I see that our "Remote Calls In Process" is normally about 0-2. Starting at 1:10 AM on April 10, it climbed until it reached 36 by 1:19 AM. It continued to rise intermittently over the next seven hours, reaching 70 at 8:44 AM, when we rebooted the server.
In addition, the "Elapsed Time/call" is normally around 300-1000. Starting at 1:10 AM on April 10, this jumped up to about 25,000,000 and remained around there until the reboot at 8:44 AM.
Similarly, around 6:04 PM on April 13, "Remote Calls In Process" started climbing from its normal range of 0-2 up to about 33. It reached 40 on April 16 at 7:50 AM, when we rebooted the server.
The "Elapsed Time/call" also increased to around 21,000,000 between 6:04 PM April 13 and 7:50 AM April 16.
At 12:34 PM April 15, an error appeared in the log indicating "Admin Server process is not responding."
By comparing these logs to gaps in records that are normally created from time to time via XML connection, it looks like FileMaker was probably not responding to any requests (XML, FMP, or other) from the time that "Remote Calls In Process" and "Elapsed Time/call" started increasing until we rebooted (1:10 AM to 8:44 AM on April 10, and 6:04 PM April 13 to 7:50 AM April 16).
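For anyone who wants to locate the start of a stall like this without scrolling through the log by hand, here is a minimal Python sketch. It assumes Stats.log is tab-delimited with a header row, that the timestamp column is named "Timestamp", and that the counter column is named exactly as I quoted it above. The function name and the threshold of 10 are made up for illustration, so adjust them to match the actual file.

```python
import csv
import io


def find_stall_start(stats_text, column="Remote Calls In Process", threshold=10):
    """Return the timestamp of the first row whose `column` value exceeds
    `threshold`, or None if no row does.

    Assumptions (not confirmed against a real Stats.log): the text is
    tab-delimited, has a header row, and includes a "Timestamp" column.
    """
    reader = csv.DictReader(io.StringIO(stats_text), delimiter="\t")
    fieldnames = reader.fieldnames or []
    if column not in fieldnames:
        raise KeyError(f"column not found in header: {column!r}")
    for row in reader:
        try:
            value = float(row[column])
        except (TypeError, ValueError):
            # Skip short or non-numeric rows rather than aborting the scan.
            continue
        if value > threshold:
            return row.get("Timestamp")
    return None
```

To run it against the real file, read the log into a string first, for example `find_stall_start(open(path).read())`, where `path` is wherever Stats.log lives on your server.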
Scheduled Import Correlation
While looking at the logs, I noticed that we have a spike in "Remote Calls/sec" once an hour, just after the hour, that lasts 5-10 minutes. During normal activity, "Remote Calls/sec" is around 20-50. During these hourly spikes, it reaches 300-400.
Both of the downtimes started during one of these hourly spikes, which suggests that this activity is causing the issue.
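A rough way to check this kind of correlation is to test whether a stall's start time falls within the first few minutes of the hour. The sketch below is only an illustration: the function name, the 10-minute window, and the timestamp format string are my assumptions, and the format in particular needs to be changed to whatever the log actually uses.

```python
from datetime import datetime


def started_just_after_hour(timestamp, window_minutes=10, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if `timestamp` falls within `window_minutes` after the hour.

    `fmt` is an assumed timestamp layout, not the confirmed log format.
    """
    minute = datetime.strptime(timestamp, fmt).minute
    return minute <= window_minutes
```

With a 10-minute window, start times of 1:10 AM and 6:04 PM both count as inside the hourly spike, while something like 12:34 PM does not.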
We have a client that is always connected and periodically runs a variety of maintenance and reporting scripts. The timing of these hourly spikes corresponds precisely to one of those scripts, which imports data from another data source. A lot of records (tens of thousands) are potentially added or updated during this script, so the dramatic increase in remote calls while it's running seems reasonable to me.
Before upgrading from FMS13, we had frequent problems with the web publishing service crashing on our worker machine. It's possible that those crashes are related to our new problems on FMS16. We didn't try too hard to fix that issue for a few reasons: we hoped that an upgrade would solve the issue; the web publishing service was not a high priority service; and this crash was easily fixed with a reboot, which we eventually scripted to happen automatically.
Why is FileMaker Server 16 crashing where FileMaker Server 13 wasn't? What can we do to troubleshoot this further? How can we adjust our import scripts to avoid triggering this crash (if they are truly the cause) without disabling the imports entirely?