5 Replies Latest reply on Feb 5, 2015 12:35 PM by MorkAfur

    Is SORTING expensive computation-wise?

    MorkAfur

      Title

      Is SORTING expensive computation-wise?

      Post

      Does FileMaker keep sorted data structures between sorts?

      Thus if I sorted by last name, but then later, based on another Find, I needed to sort by first name, I'm wondering if FMP keeps the sorted information for a particular sort once you sort it.

      I'm guessing behind the scenes, once you do a sort, FMP will save an internal data structure for THAT sort, and keep it updated, so if you switch back to that sort later from another sort, it's not computationally expensive.

      Anybody know?

      Thanks,

      -m

        • 1. Re: Is SORTING expensive computation-wise?
          SteveMartino

          Is this standalone or on a server?

          Are you experiencing performance issues?

           

          • 2. Re: Is SORTING expensive computation-wise?
            MorkAfur

            Thanks for writing back.

            This is standalone for now. Due to the cost of "Server"'s concurrent licensing "model" ($$$), I'm staying away from it.

            -------

            The reason I'm asking about the computation cost of switching indexes is that during development, we may only have a few hundred records, but after the client loads 50,000 or more, I want to make sure switching sorts back and forth won't be an issue, performance wise.

            In the old Visual FoxPro (and FoxPro) days (well, I guess it's still a current product, but just barely), it would keep a "CDX" file with the index pointers computed once you created an index. And, VFP would also keep all indexes up to date. Then, you could just "SET INDEX TO ...." and it would switch to that index within the CDX file with no computational, or extremely minimal, overhead (once you created the index "tag" the first time, that is.).

            So, I'm guessing FMP does something similar, but chooses to hide the details since many of its users aren't developers. I'm all for the KISS principle as long as I actually know what's happening...

            Hope that clarifies the question.

            I could always generate 100,000 dummy data records with my data gen program and test, but was hoping the answer would be well known and I was interested in some of the internals if possible.

            Look forward to any update.

            Thanks again.

            - m

            • 3. Re: Is SORTING expensive computation-wise?
              philmodjunk

              As far as I know, each Sort operations starts the sort from scratch. These are O(N)LogN operations most likely so the time required not only is a function of the number of records to sort (N), but is something that grows geometrically with the number of records to sort. I'm not a programmer with inside info, that just seems to match what I observe when I work with the data.

              Things that can dramatically affect the time needed to sort records:

              The number of records (As noted above)
              Whether or not the sort field is indexed
              Whether it is or is not a stored calculation field
              Whether or not the sort field is from the layout's table occurrence or a related table occurrence.
              "Re-ordering" the sort results by summary field sub total...

              • 4. Re: Is SORTING expensive computation-wise?
                MorkAfur

                Yeah, I was afraid it would have to compute the sort each time.

                Though what actually happens is mostly opaque.

                I guess I'll have to generate100K records and experiment. I'll report back.

                Thanks Phil.

                • 5. Re: Is SORTING expensive computation-wise?
                  MorkAfur

                  I'm doing a "Find" then if the search string is not found (last name), then I do another find using what they entered to see if it's the firstname. If Get(FoundCount) > 1, I SHOW ALL, then SORT is so they can navigate records "near" to the first one "found" (Joe, Joes, etc.).  The idea is that I'm trying to search both first name and last name based on what they enter in the "Find" box (without them having to specify).

                  ------------------------

                  OK, here's what I found.

                  Here is one of the 500,00 sample data records: 313,Lane,Tullock,N,10/12/2013 2:41:41 PM 

                  ("memberPK","firstname","lastname","memberisactive","MEMBER_DATE_OF_VISIT")

                  Index was set to index fields automatically, the default.  The testing script sorts records on first name or last name upon entry to the respective field.

                  Number of records      Machine      Sort Time

                  500,000                        HDD               5 s

                  100,000                        HDD               < 1s

                  -------------------------------------------------------

                  500,000                         SDD               < 4s

                  100,000                         SDD               < 1s

                  ------------------------------------

                  Finding data in FMP is extremely fast so all good there.

                  I suppose after 100,000 records or so, perhaps they would need to only search by last name or expect a slight delay if data were needed to be sorted again.

                  If you would like a quick 500,000 records of test data (about 7 MB zipped) I used, just let me know.

                  Thanks,

                  - m