4 Replies Latest reply on Apr 11, 2011 2:15 PM by philmodjunk

    Speeding up a large sorting operation

    KevinPfeiffer

      Title

      Speeding up a large sorting operation

      Post

      A question mostly for Phil, I guess. It's spawned by the comment earlier in another thread about speeding up operations with large data sets. For a standard sort (say, the sorted condition in which you prefer to always find a particular set of records), what about storing a sort key that is then updated nightly (or whenever)? You sort the large data set once a day, for example, then re-assign a serial number to a sorting field based on the current sorted condition... ?

        • 1. Re: Speeding up a large sorting operation
          philmodjunk

          Hmmm, but just assigning a value to a field doesn't actually sort the records unless you are viewing them in a sorted portal or with Go To Related Records via a sorted relationship. You just end up sorting on this new field instead of the old. Unless one of the fields from the original fields specified in your sort order cannot be indexed, there's no saving here. If there is, this is a version of the first option that I specified in that other discussion.

          Another trick that works like gang busters in theory but proves very problematic in practice is to sort the data, then import the entire massive set of records into a clone of the file and then bring it up in place of the original file. Then the unsorted "sort order" becomes the sort you want and this order is extremely to quick to bring up for even very large data sets. The catch is that now you have this massive, risky import records operation running for hours just to get a quicker sort...

          For those interested, it may have been this thread to which Kevin Pfeiffer is referring: 

          400,000 Records. Need help with performence

          • 2. Re: Speeding up a large sorting operation
            KevinPfeiffer

            Oops, as usual I guess each situation has its own specific criteria. In my case (a taxonomic database), I have a species record set with about 6,000 species records. Species are occassionally being added, so it's convenient to have a taxonomically sorted list (rather than order of creation). This means sorting by: family, subfamily, genus, subgenus, species, and subspecies, which even with only a few thousand records does take more than a few seconds.

            I just tried this -- works fine. My main sort script, which runs whenever this table is first opened, for example, now simply sorts the "sortIndex" field, which contains serial numbers starting with 1. I can see that one could do this for several preferred sort states if desired. Of course, new records will not appear in sorted order until the update script runs, but for my purposes, for example, that's not such a serious problem.

            • 3. Re: Speeding up a large sorting operation
              KevinPfeiffer

              >  this is a version of the first option that I specified in that other discussion

              Ah, I see what you mean. In my case I could concatenate the necessary sort terms (e.g. "FamilySubfamilyGenusSubgenus..." etc.) whenever a new record is created or a name changed and store this in my indexed field. Then sorts would always work and be quite fast (one pass), even with newly created records.

              • 4. Re: Speeding up a large sorting operation
                philmodjunk

                Ok, I can see where sorting on a single field's index would be faster than sorting on 6 fields, but I'd definitely make sure that they are all fully indexed fields to get maximum performance out of any sorts that include this many different fields.

                Actually, this is one case where FileMaker 10 and 11's dreaded and much maligned "auto-sort" might come into useful play. Once you have sorted your records, any newly added records will be automatically sorted into the existing sort order. This order will persist until you perform a find, use Show All records, or Show Omitted only to bring up a different found set.