13 Replies Latest reply on Nov 24, 2011 12:37 PM by Mike_Mitchell

    Guidelines and pitfalls for fast Finds

    davehob

      In my application, I have a "People List" layout, which shows the people who match the search criteria entered by the user. It's a simple List view layout, using a scripted Find that runs when the user clicks the "Find Now" button. Although it's been quick enough when run on our own server, I'm now trying to run it on a remote host, and it's very slow.

      One of the problems seems to be a summary field on the People table (used to show the number of People listed) - it takes several seconds to "summarize" when the Find is run.

      The other culprit is related data used as a Find criterion - e.g. if the user enters a value in a date field to find people who have attended since that date (necessitating a Find on the date field in a related Attendances table). Again, this really slows things down.

      How should I be doing things differently? If these are "no-no's" to be avoided, is there a list of similar pitfalls to avoid in future, plus alternative techniques?

      Thanks in anticipation,

      Dave.

        • 1. Re: Guidelines and pitfalls for fast Finds
          Mike_Mitchell

          Hello, Dave.

          You've discovered that bandwidth matters.

          Allow me to explain a bit about how FileMaker's client / server model works "under the hood" as far as record fetching goes, so you'll have a better idea of what's happening. The model is very record-centric. If the client needs a record, it fetches the entire record - all the fields and the data associated with them (with a few minor exceptions, like ESS fields, unstored calculations not yet displayed, and container data not yet displayed). So, for example, if you have a table with 100 fields in it, even if your current layout only shows 5 of those fields in a list view, all 100 fields are fetched for every record - and if some of them are sizeable, that can mean shoving a lot of data over the wire.

          Now, what do I mean by "every" record? Turns out, this is how the client fetches records:

          • In Form view, in 25-record chunks. (The first 25 records as soon as you display the layout, additional 25-record chunks as they're needed.)
          • In List or Table view, as many records as will display. Additional records are fetched as they are needed for display.

          Sounds okay so far, right? Well ... there are some notable exceptions, and they create some of the problems you're seeing.

          When the client performs an aggregate operation, the entire found set has to be fetched. What counts as an "aggregate" operation?

          • Sorting
          • Summarizing
          • Creating a new record
          • Replace
          • Relookup

          Ew. So you can see how your Summary field might be a problem, right? The system has to fetch the entire found set in order to calculate it.

          The related search is a bit different. We'd probably need a bit more information to diagnose it more precisely. Are you doing the search through a portal? That would be one thing that could slow it down, especially if you're using portal filtering. Is the Date field in the Attendances table indexed? If not, it'll slow it down a lot. Also, the entire related record is fetched when it's needed, so watch out for the number of fields in that table, too.

          Things you can do to speed it up:

          • Strip out unnecessary fields from your tables (especially the table where you're doing your Find). Substitute one-to-one relationships or conditional formatting (as appropriate) to move some of the complexity out of the data layer and reduce the amount of data that has to be shoved down the wire.
          • Watch your data model. If you have more than, say, 25-35 fields in a table, you might have data modeling issues. Or you might be putting things in the data layer that ought to be someplace else (like in the presentation layer, with conditional formatting).
          • Get rid of summary fields on the display. Use the Get ( FoundCount ) function instead, or some other method that avoids embedding the summarizing in the data layer.
          • If your users need to perform a Find on related data, and you know that's slowing you down, script it for them. Let them put their date into a global via a custom dialog, perform the Find on the related table behind the scenes, and then show the matching records in the Person table. (There's a rough sketch of this after the list.)
          • Avoid portal filtering (it's expensive) if you're having performance problems with it.
          • Use lean graphics (small size, few colors).
          • Don't use different graphics on different layouts; go for a common look to take advantage of caching.
          • If creating a new record causes slowdowns, consider scripting it so you have an empty found set when the new record is created (in a "new record" window, for example).
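
          Here's a minimal sketch of that scripted related-data Find. Every name in it (the "Attendances" layout, the People::gSearchDate global, Date_Attended) is an assumption rather than your actual schema, and it uses Go to Related Record to pull up the matching People instead of a literal second Find - same end result, fewer steps:

          # Sketch: find Attendances on/after a date, then show the matching People.
          # gSearchDate must have global storage so its value survives Enter Find Mode.
          Show Custom Dialog [ "Find" ; "Show people who have attended since:" ; People::gSearchDate ]
          If [ Get ( LastMessageChoice ) = 1 ]
             Go to Layout [ "Attendances" (Attendances) ]
             Enter Find Mode [ Pause: Off ]
             Set Field [ Attendances::Date_Attended ; "≥" & People::gSearchDate ]
             Set Error Capture [ On ]
             Perform Find []
             If [ Get ( FoundCount ) > 0 ]
                # Show every Person related to the found Attendances records
                Go to Related Record [ Show only related records ; Match found set ; From table: "People" ; Using layout: "People List" (People) ]
             Else
                Show Custom Dialog [ "No matches" ; "No attendances found since that date." ]
                Go to Layout [ original layout ]
             End If
          End If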

           

          Those are just some ideas off the top of my head. Let me know if you need something different.

          HTH

          Mike

          • 2. Re: Guidelines and pitfalls for fast Finds
            davehob

            Mike,

            That is really helpful, and thanks so much for the comprehensive reply. It's going to take me a while to digest your advice, and I think there's a lot of work ahead to get this thing fit for putting on the remote host. (For a start, there are over 400 fields in the People table, and I can't readily see what I can take out of the data table itself.)

            I'll feed back how I get on, but meanwhile, thanks again.

            Dave.

            • 3. Re: Guidelines and pitfalls for fast Finds
              ptrc

              Thank you, Mike, for this answer.

              I have been looking at some of our solutions recently and was wondering why it was taking some time to display the data.

              Peter

              • 4. Re: Guidelines and pitfalls for fast Finds
                Mike_Mitchell

                You're welcome. FWIW, I got most of that information from this year's DevCon. If you can afford it, the conference is definitely worth your while.

                P.S. With 400 fields in the table, it's very likely some of them are used strictly for display purposes. For example, you might have First Name, Middle Name, and Last Name, and then you have a Full Name that's a concatenation - something like:

                Full Name = Last Name & ", " & First Name & " " & Left ( Middle Name ; 1 ) & "."

                You can probably get that out of the data layer and use merge fields on the layout where it appears. Just as an example. (We have lots of that sort of thing in the solutions we inherit from people.)
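
                To illustrate (a sketch, reusing the field names from the example above): with the calc field gone, the layout can carry the same text either as merge fields in a text object, or - where you want the formatted middle initial - as a merge variable set by an OnRecordLoad script trigger:

                <<Last Name>>, <<First Name>> <<Middle Name>>

                # Or, for the "Last, First M." format, set a merge variable and put
                # <<$$fullName>> in the text object instead:
                Set Variable [ $$fullName ; Value:
                   People::Last Name & ", " & People::First Name & " " & Left ( People::Middle Name ; 1 ) & "." ]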

                 

                 

                • 5. Re: Guidelines and pitfalls for fast Finds
                  davehob

                  Mike,

                  I would love to go to DevCon some time, but I work for a cash-strapped charity, with just about enough money to pay my salary, but certainly no extras. Although I’ve considered funding a trip to DevCon myself, once I’ve added in the transatlantic flight, it’s just too much. Maybe one day!

                  Taking the summary field off the layout has made an immediate, very noticeable difference – thanks. Re. the issue of display data, you’re absolutely right that there are many concatenated fields (names, addresses, etc. for display purposes), and also a lot of other summary fields (for report generation), so I obviously need to do something with these.

                  Given that I have recently split the solution into “Data” and “Interface” files, if I’ve understood you correctly, I think I need a table in Interface for “additional People fields”, and that this is related to the People table in the Data file by the one-to-one relationship?

                  There are also a lot of unstored calcs (Boolean 1/0 fields, to indicate gender, etc., which are then totalled in the summary fields). Although you’ve pointed out that, as unstored calcs, they probably don’t slow things down, presumably these should also go in the “additional” table, leaving just the bare-bones data in the table in the Data file? I see (I think) that this additional data is then only accessed when needed.

                  I think I’m beginning to get it – hope you don’t mind me running this past you, but it seems pretty crucial to get this right.

                  Regards,

                  Dave.

                  • 6. Re: Guidelines and pitfalls for fast Finds
                    Mike_Mitchell

                    Dave -

                    Yeah, I certainly understand the "me gots no cash" issue. No worries.

                    Your concatenated fields can be eliminated by using a combination of merge fields, regular text objects, and web viewers. Concatenated fields used to be needed a lot more in the "old days" of FileMaker, but the need for them has diminished greatly with the tools we have now.

                    I apparently gave you the wrong impression with the "one-to-one" relationship comment. Sorry. Let me explain further.

                    The one-to-one relationship is a technique you use when you have a table with a few fields that change on a frequent basis and other fields that rarely change. It was recommended by Jon Thatcher, and it attempts to leverage the user cache to reduce the amount of network traffic. In order to understand why this is useful, I'll need to explain the user cache a bit.

                    FileMaker maintains a cache on each user's machine that contains the records that user is browsing. (Kinda obvious, if you think about it.) However, since it's a multiuser environment, that cache has to be updated with any changes made by other users so each user sees current information. Of course, that increases network traffic and slows things down if those records are large.

                    Jon's technique involves separating the table into two tables: one containing the fields that change often and one containing the fields that change only rarely. Join them using the primary key. This way, when someone makes a change to one of the fields in the frequently-changed table, only the fields in that table have to be downloaded to the users' caches. This is especially useful if the fields that don't change often are large. (The example Jon used was a Products table, where the product description doesn't change very much, but the price might change more frequently.)
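
                    In your case the split might look something like this (just a sketch - the PeopleActivity table and all of its field names are made up):

                    People (rarely changes)                 PeopleActivity (changes often)
                    -----------------------                 ------------------------------
                    PPL_ID (primary key)     1 ----- 1      PAC_PPL_ID_fk (= People::PPL_ID)
                    name, DoB, address, ...                 last attendance date, "concern" flags, ...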

                     

                    So, this really has very little to do with the separation model - where you have a data file and an interface file. The separation model, in general, has no tables in the interface file - or maybe only a table to hold user session globals. The purpose of the interface file is to hold all your layouts and scripting, so that if you make a change to the code, you don't have to re-import the data; you just swap out the interface file.

                    Make sense?

                    Now, on the subject of Booleans for gender, et al. I'm a little confused as to why these are unstored. If they really are being used for summarizing a count of how many men / women (for example) are on a report, then leaving them unstored is REALLY going to slow things down. Presumably, these are being used for reporting / statistics? If so, then remember: unstored calculations don't load to the client - until they are needed. But when you summarize those Booleans to get a total count, guess what? Those unstored calcs will all load for the entire found set, and then they will slow things down!

                    If you are using these Booleans for statistics, as I suspect, then I suggest changing things around a bit. First, make them stored; they will summarize a LOT faster that way. Second, if you're doing reports on some frequency (like end of month or something), you can create a Reports table in which you store the results of the report, so that once the report is generated, it doesn't have to be re-generated on the fly - which means it will display much faster.
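
                    A rough sketch of that month-end "store the results" idea (every table and field name here is made up). Run it after the Find that builds the report's found set, while you're still on the main layout:

                    # A summary field referenced in a calculation returns its value across
                    # the current found set, so capture the totals into variables first.
                    Set Variable [ $totalFemale ; Value: People::Tot_Female ]
                    Set Variable [ $totalMale ; Value: People::Tot_Male ]
                    Go to Layout [ "Reports" (Reports) ]
                    New Record/Request
                    Set Field [ Reports::ReportDate ; Get ( CurrentDate ) ]
                    Set Field [ Reports::TotalFemale ; $totalFemale ]
                    Set Field [ Reports::TotalMale ; $totalMale ]
                    Commit Records/Requests [ No dialog ]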

                     

                    Another option is to separate out the fields that you use for the stats. For example, I assume you have a "gender" field, presumably with options of "M" and "F" (or the like). What about using two fields - one for "M" and one for "F"? Then, you can use the Count function to total the number of "M" entries and the number of "F" entries and get away from the Boolean altogether. (FWIW - Count is considerably faster than Sum. There's a small sketch below.)
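
                    For instance (a sketch; the field names are invented): instead of a single gender field holding "M" or "F", data entry goes into two fields, and the summaries count the non-empty entries.

                    Gender_M     text field - contains "M", or stays empty
                    Gender_F     text field - contains "F", or stays empty
                    Tot_Male     summary field, defined as Count of Gender_M
                    Tot_Female   summary field, defined as Count of Gender_F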

                     

                    Anyway, I hope that helps. Let me know if this isn't making sense. (I'm writing it and I'm not sure it makes sense.)

                    Mike

                    • 7. Re: Guidelines and pitfalls for fast Finds
                      shearn

                      I might not be understanding your situation completely, but it sounds like you've got a list of people records and are using a summary field to count them. If that's the case, you don't need a summary field; just use the found count, which you can get by creating an unstored calculation of Get ( FoundCount ).
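
                      For example (the field name here is invented): define a calculation field in the People table with its storage option set to "Do not store calculation results", and put it on the layout.

                      cFoundCount = Get ( FoundCount )   // result type: Number; storage: unstored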

                      • 8. Re: Guidelines and pitfalls for fast Finds
                        davehob

                        Mike,

                        Thanks again for your input, which is invaluable to my understanding. I’ve done an analysis of my People table’s fields, and the 400 (approx.) fields can be categorised as follows:

                        “Basic data”, which doesn’t change much and therefore probably belongs in the main table – i.e. key fields (primary and foreign), address, DoB, ethnicity, gender, home details (address, type of home, etc.), language, religion, “status” indicators (lone parent, refugee, etc.). There are about 50 of these altogether, including audit fields (record modified date, etc.), which is more than you originally recommended, but rather less than 400!

                        “Derived data”, such as concatenated address fields and other display stuff. You’ve pointed out how most of these can be replaced by merge variables on demand, and I’ll aim to do so.

                        “Calculated data”, which often changes. Examples include date of most recent attendance, “concern” indicators, dates of membership renewals, etc. These are the things that I’ve earmarked for putting in the new, one-to-one table.

                        The “Boolean indicator” fields that I referred to earlier – loads of these, which are already split out as you suggested in your previous post. So, for Gender, there are 3 calc fields, PPL_Gender_M, PPL_Gender_F, and PPL_Gender_Unknown, each set to 1 or 0 according to the contents of PPL_Gender. Each of these is then totalled in a corresponding summary field, PPL_Tot_Female, etc. There are, therefore, dozens of these fields, which are ultimately used for reporting, using subsummaries (hence the need for summary fields, rather than the counts you recommended). The structure was initially set up by another developer, but I guess these should also go in the new table. That said, I’m struggling to see how they would then correctly calculate the figures across the found set required for reporting. BTW, re. your suggestion of storing report data – unfortunately, these reports need to be available “on the fly”. The above mechanism does allow that, but pretty slowly.

                        “Search globals” – rightly or wrongly, there are also a lot of globals used for the “Find Now” function that I referred to in my original post. The user enters a value in the search global (e.g. “Female” in the Gender global, PPL_g_SearchGender), and this is used in the Find. My instinct tells me that these should be in the Interface file, rather than the People table. That would certainly reduce the field count, but would be quite a bit of work to change – but I can cope with that, IF it’s the right thing to be doing!

                        Thanks again, for your support and patience.

                        Dave.

                        • 9. Re: Guidelines and pitfalls for fast Finds
                          davehob

                          Hi. Yes, using a merge variable and Get ( FoundCount ) certainly speeds things up (i.e. not displaying the summary field), although I do need that summary elsewhere, in a subsummary report (see previous post). Maybe there's a neat way around that too.

                          Thanks for your input.

                          Dave.

                          • 10. Re: Guidelines and pitfalls for fast Finds
                            Mike_Mitchell

                            Dave -

                            Glad it's been helpful.

                            When you refer to "calculated data", I'm assuming you mean calculation fields that live in the main table but mirror data that lives in other tables? What makes me think so is your first example - date of most recent attendance. (I remember you saying Attendance is a separate table.) If that assumption is correct, you may be able to eliminate these calculations completely and just reference the related data by sorting the relationship. For example, if you sort the Attendance relationship on Date_Attended (descending), then when you call Attendance::Date_Attended in a calculation (or just put the field on a layout), it will automatically show the most recent date. (Mirroring related fields used to be necessary sometimes in the pre-version 7 days, because the model only supported looking at data one relationship away, but that's not true any more.)
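
                            As a concrete sketch (the relationship and field names are assumed): with the People-to-Attendance relationship sorted by Date_Attended, descending, the first related record is always the newest, so a plain reference to the related field returns the latest date with no mirrored calc needed:

                            # Relationship: People::PPL_ID = Attendance::PPL_ID_fk,
                            # sorted by Attendance::Date_Attended, descending.
                            # Then, on a People layout or in a calculation:
                            Attendance::Date_Attended   // evaluates to the most recent attendance date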

                             

                            On the Booleans - yes, I realize they're already split out. What I'm suggesting is that the calculated Booleans go away and the data entry fields themselves become separate fields. (Huh?) The point is that you can define a Summary field as "Count of" instead of "Total of". (It's in the Options dialog for Summary fields.) This mimics the same performance boost you get from using Count instead of Sum. (Gotta love FileMaker; there's always another way to skin the cat.)

                            (That's not the best way to do it from a data modeling standpoint, but if you need summaries on those fields, it might work out a little better.)

                            Or, you could keep the Booleans and still change the Summary fields from "Total of" to "Count of". That will still give you a performance boost; just make sure to change the calcs on the Booleans from "1 or 0" to "1 or null", since "Count of" counts only non-empty values.
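
                            Using the field names from your last post, that change is just this (a sketch; PPL_Tot_Male is my guess at the summary field's name, by analogy with PPL_Tot_Female):

                            // Before: 1 or 0 - a "Count of" summary would count every record
                            PPL_Gender_M = If ( PPL_Gender = "M" ; 1 ; 0 )

                            // After: 1 or empty - "Count of" counts only the men
                            PPL_Gender_M = If ( PPL_Gender = "M" ; 1 ; "" )

                            with PPL_Tot_Male redefined as "Count of PPL_Gender_M" rather than "Total of PPL_Gender_M".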

                             

                            I wouldn't be too concerned about the search globals. I wouldn't personally have done it that way, just for clutter reasons, but I don't believe globals are a heavy hit on the payload. Each client gets its own copy of the globals, and the only time they go back to the server is when they're needed for something at the server level (like a calculation or a relational join). I'd put them well down the list of things to worry about. Strictly speaking, they could go into a Globals table with only one record (there's really no need to tie them to the main table); you might want to look at that in the future.

                            Let me know if this needs clarification.

                            Mike

                            • 11. Re: Guidelines and pitfalls for fast Finds
                              davehob

                              Mike,

                              I have had many "ahhh, I see" moments in these exchanges, and your help has improved things greatly in the Find. But, more to the point, the rest of the solution will also be much improved in due course through your input.

                              Re. your last suggestion, I can indeed get much of the data previously held as calc fields when I need it, via relationships and merge variables. Altogether slicker - I guess it was done the old way because of the limitations of previous versions of FM.

                              Thanks again,

                              Dave.

                              • 12. Re: Guidelines and pitfalls for fast Finds
                                brsamuel

                                This might be worthy of a new discussion thread, but ...

                                Could you explain a bit further about this:

                                When the client performs an aggregate operation, the entire found set has to be fetched. What counts as an "aggregate" operation?

                                • Creating a new record

                                This bullet item caught me by surprise.

                                Marc

                                • 13. Re: Guidelines and pitfalls for fast Finds
                                  Mike_Mitchell

                                  Sure, Marc.

                                  I probably need to clarify. Creating a new record is not an "aggregate function" in the general sense of the term. (You won't, for example, find it defined that way in the Help.) It fits only in the context of "things that cause the client to fetch the entire found set".

                                  Whenever you create a new record, the FileMaker client fetches the entire found set. To be perfectly honest, I'm not entirely sure why this is - it's just something that came out of the sessions at DevCon, and I don't have my notes with me at the moment. Maybe someone else remembers the reason for this behavior.

                                  Sorry for the confusion.

                                  Mike