
    General advice required on large dataset calculations

    Stu412


      Hi all

      My main dataset keeps growing in record count and may keep growing in field count as well; it currently stands at around 70,000 records.

      I haven't been using FM long enough to have thought of this when I started the project, so I'm asking now.

      By calculating directly on the data fields and adding further summary fields there, I've noticed slowdowns.

      Is there any justification for using this table merely as a source and, for each layout/report, creating a separate related table (not a table instance) which is populated by a script?

      My second question, directly off the back of that: assuming any sort of yes to the above, and in an ever-changing data environment, how often should the script run?  You'd think after every change...

      I'm just looking for experience and opinion in this area, as scripting could do a lot of the heavy lifting; it's just a question of whether it's quicker at run time and easier to follow moving forward.

      Quite woolly, I know; I just feel I need a conversation about this!

      Thanks


        • 1. Re: General advice required on large dataset calculations
          philmodjunk

          "Is there any justification for using..."

          Can you describe in more detail how you plan to use that table to produce your report?

          One longstanding trick in FileMaker is to set up a "summary table" where much of the aggregate data is "pre-calculated" and stored as data in simple number fields. But this trick can only be put to good use for data that is updated infrequently, if at all, once a certain point in the workflow has been reached.
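
          As a minimal illustration (table and field names invented here just to show the idea), such a summary table might hold nothing more than plain stored fields:

               SalesSummary
                    SummaryDate   (Date)
                    Category      (Text)
                    TotalQty      (Number)   <- copied from a summary field when the posting script runs
                    TotalAmount   (Number)

          A report based on this table then reads a handful of pre-computed rows instead of re-aggregating the entire data table.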

          • 2. Re: General advice required on large dataset calculations
            Stu412

            I'm thinking of pre-calculated data on a table which is ready to go, via the calculations held in the script.  I'm also thinking, as you've rightly pointed out, about how that data gets refreshed.

            I'll be looking at data which is old and static (5 years' worth) and, on the same report, the current year's data, which is liable to change until a certain status is reached.  All of the data in my DB has a status of either 'complete' or 'in draft'.

            If it's complete, I wouldn't need to recalculate it; the script could simply step around that subroutine.

            I'm also using the same data for charting; it's nice, simple data, nothing that really needs further processing.  But again, unless it's marked as 'complete' in the 'ChartsData' table, it's liable to change.

            Potentially, one path would be for the relevant script, when it runs, to remove any records on the target table which are 'in draft', ensuring that part of the script's population routine is to always set a status field.
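
            In script terms, the 'step around it when complete' idea might look like this (the status field and subscript names are just placeholders):

                 If [ ChartsData::Status = "complete" ]
                      # Static, already-posted data; nothing to recalculate
                 Else
                      Perform Script [ "Rebuild draft summary records" ]
                 End If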

            Is this trickery quite common, or have I come up with something which other developers look at and say, 'Noooo, young Padawan!'?

            Thanks


            • 3. Re: General advice required on large dataset calculations
              philmodjunk

              It sounds like a method that's been in use for years. I worked out the following approach back in FileMaker 3 or 4 (I forget which) for a scrap metal business; that business is the source of the "Junk" in my forum name. It has from 300 to 1000 customers a day, with a purchase order for each customer and a minimum of 4 line items per PO.

              So generating 5-year comparison reports without an extended wait while millions of records are summarized required a summary table. The summary table has one record for each type of material purchased from customers that day, with the total qty purchased and the total funds expended to purchase that material. This reduces up to 1000 line items of data to about 20 records for the day. Over a 5-year span, that's a huge reduction in the number of calculations needed to compute totals and averages.

              A general outline of the script:

              Loop
                   Find all line item records for POs "printed" today and not yet marked as "summarized".
                   Exit the loop if no records are found.
                   Constrain the found set to all records with the same type of material as the first record in the found set.
                   Create a new record and copy the values of summary fields into number fields in the new record to save the aggregate values.
                   Use Replace Field Contents to "mark" the found set as "summarized".
              End Loop
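
              Written out as actual script steps, that outline might look something like the sketch below (the layout, field, and variable names are placeholders, not the actual solution):

                   Set Error Capture [ On ]
                   Go to Layout [ "LineItems" ]
                   Loop
                        # Find today's printed line items not yet marked as summarized
                        Enter Find Mode []
                        Set Field [ LineItems::PrintDate ; Get ( CurrentDate ) ]
                        Set Field [ LineItems::Summarized ; "=" ]
                        Perform Find []
                        Exit Loop If [ Get ( FoundCount ) = 0 ]
                        # Constrain to the first record's material type
                        Go to Record/Request/Page [ First ]
                        Set Variable [ $material ; Value: LineItems::Material ]
                        Enter Find Mode []
                        Set Field [ LineItems::Material ; $material ]
                        Constrain Found Set []
                        # Summary fields now aggregate over just this material
                        Set Variable [ $qty ; Value: LineItems::sTotalQty ]
                        Set Variable [ $cost ; Value: LineItems::sTotalCost ]
                        Replace Field Contents [ LineItems::Summarized ; "Yes" ]
                        # Post one pre-computed record to the summary table
                        Go to Layout [ "DailySummary" ]
                        New Record/Request
                        Set Field [ DailySummary::SummaryDate ; Get ( CurrentDate ) ]
                        Set Field [ DailySummary::Material ; $material ]
                        Set Field [ DailySummary::TotalQty ; $qty ]
                        Set Field [ DailySummary::TotalCost ; $cost ]
                        Go to Layout [ "LineItems" ]
                   End Loop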

              But keep in mind that this is data that has been printed on a hard copy purchase receipt and handed to the customer. Almost never does data from a printed PO get changed, other than to void it and create a new one, and then almost never on a day other than the day it was originally printed, so this data works very well for this summarizing technique. The original records are retained in an archive table, and we can pull the data back and "re-summarize" it if the need arises.

              • 4. Re: General advice required on large dataset calculations
                Stu412

                Phil, it's good to know two things: 1) the meaning of the name(!), and 2) that what I'm thinking of doing in the future isn't out of the park.

                The loop looks very similar to what I'd be doing:

                Loop
                     Find records marked as 'draft' for this customer.
                     Gather information into variable1, variable2, etc.
                     Go to the destination table and remove the existing draft records.
                     Create a new record.
                     Populate its fields with the variables (thus replacing the previous draft records).
                End Loop

                That puts all the customer summary data on one table, useful for both reports and charting.
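
                In script-step terms, the 'remove the drafts and repopulate' part might look something like this (the table, field, and variable names are placeholders):

                     Set Error Capture [ On ]
                     Go to Layout [ "CustomerSummary" ]
                     # Delete this customer's existing draft rows
                     Enter Find Mode []
                     Set Field [ CustomerSummary::CustomerID ; $customerID ]
                     Set Field [ CustomerSummary::Status ; "in draft" ]
                     Perform Find []
                     If [ Get ( FoundCount ) > 0 ]
                          Delete All Records [ No dialog ]
                     End If
                     # Recreate the summary row from the freshly gathered variables
                     New Record/Request
                     Set Field [ CustomerSummary::CustomerID ; $customerID ]
                     Set Field [ CustomerSummary::Total1 ; $variable1 ]
                     Set Field [ CustomerSummary::Total2 ; $variable2 ]
                     Set Field [ CustomerSummary::Status ; "in draft" ]   # flipped to "complete" once finalized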

                Thanks for the confirmation; I'm feeling somewhat better now.

                • 5. Re: General advice required on large dataset calculations
                  philmodjunk

                  It's generally possible to set up a relationship such that you can create the new summary record and update the values of its fields with Set Field steps, with no need for the whole "copy values to fields, change layouts, new record, set fields, rinse and repeat..." scenario.
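
                  As a minimal sketch of that idea, assume a relationship from Customers to the summary table with "Allow creation of records in this table via this relationship" enabled on the summary side (the names below are placeholders). The script can then stay on the Customers layout:

                       # Setting a related field creates the summary record if none
                       # exists yet, and the match field is auto-filled
                       Set Field [ CustomerSummary::TotalQty ; Customers::sTotalQty ]
                       Set Field [ CustomerSummary::TotalCost ; Customers::sTotalCost ]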