1 2 Previous Next 17 Replies Latest reply on Aug 17, 2017 9:47 AM by beverly

    The Cost of Calculated Fields?

    Vinny

      I have yet to role out a complex design that would entail a large amount of data and sophistication.  I'm working on something larger and I'm concerned about the "cost" of calculated fields.

       

      As an example, take an accounting system which has a chart of accounts, and a lot of transactions related to those accounts.

       

      If I were to run a report to show the account balance for each account (sum of credits less debits as an example equation) - what is the best design?

       

      Should account balance be a calculated field, or should I leave it as a normal field and update it as needed?

       

      This might be an oversimplified example but I'm curious about the trade offs.

       

      it could also apply to inventory: current balance, average cost, etc.

       

      I realize that calculated fields will create performance issues, but I don't know if it's worth designing a system that is constantly updating / managing fields rather than calculating on the fly.

       

      Thanks in advance for your feedback!

        • 1. Re: The Cost of Calculated Fields?
          schwjm

          If you are careful with record locking you can do what you are trying to do, and even have it perform all right, but it is not something to be taken lightly. Accounting systems rely on transactions to be solid, and FileMaker doesn't directly offer this concept, but you can more or less get this using tricks with the relationship graph and careful scripting.

           

          As far as a general approach, in any setting, the fewer related records being loaded on a layout, the better (whether work is being done server or client side), and calculated fields consulting potentially a lot of transactions at once will not scale well. Every additional record loaded can add up quickly against your performance. This means you should use cache fields and update them whenever necessary. In the past I've gone with a hybrid approach which allowed all caching to be done in the background nightly, and the only related records that had to be loaded were ones from the past 24 hours. This reduces how much caching work you have to do, and it shouldn't be any less accurate than fully calculating against all transactions every time. That worked for the scale of that project (it was a very light "accounting" system but more inventory focused) but your mileage may vary.

          • 2. Re: The Cost of Calculated Fields?
            philmodjunk

            I don't see a very important distinction made here that needs to be made: stored calculations vs. unstored. In most contexts, the "cost" of a stored calculation or an auto-entered calculation for that matter is no different than a data field.

             

            It's the unstored calculations and also summary fields that are most likely to "cost" too much if improperly used. In particular, calculations to compute an "aggregate value" (total, average, max, standard deviation), if used to compute a value from large groups of records that cause delays.

             

            There re are a number discussions here and elsewhere that you can access on the subject. I can't search them out for you right now but others may be posting links even as I type.

             

            What I do suggest is that if you employ an updated number field approach that you keep the unstored equivalent handy on a hidden layout as a way to cross check your updated fields for accuracy during development.

            • 3. Re: The Cost of Calculated Fields?
              sccardais

              Phil

               

              I”d be interested in learning more about this from your reply above.

               

              What I do suggest is that if you employ an updated number field approach that you keep the unstored equivalent handy on a hidden layout as a way to cross check your updated fields for accuracy during development.

               

              Can you elaborate or point to some other source that explains this idea in more detail?

              • 4. Re: The Cost of Calculated Fields?
                philmodjunk

                unstored calculation fields don't evaluate and thus don't affect performance unless you put it on a layout and bring up some records or reference the field in another calculation that is evaluating--such as in the step of a script. So if you park your unstored calcs on a utility layout that the user never brings up each time that you replace one with an alternative approach such as a script updated number field, you still have the field available to you as the developer without it slowing down your system. During development and testing, you can check that field against its replacement. If the fields don't match, you know that you missed a type of transaction that affects that total.

                 

                Everytime I've replaced unstored calcs/summary fields with data fields and a script, I've found the ability to pop the hood and compare the two values a very useful debugging tool and have often put a script in place for emergency use that uses the unstored calc to update the data field should I discover that I've goofed somewhere and the fields are out of synch for a large number of records.

                 

                Case in point. The original solution for the company that influenced my forum name that I still use has an "invoice and line items" type arrangement with an unstored calc computing the sum of the line items. When you get several million records in the main "invoice" table, finding all records with a certain total becomes very slow. So I added a number field that is updated by a script when the cashier "prints" the invoice to store the same totals. Searches on that field are much faster than searching on the original. But I kept that field in place and even added an error checking calculation that returns "error" if the two are not equal. I've been able to use those fields a few times to correct data (and then update my design to prevent a recurrence) problems when the two fields' values did not match.

                2 of 2 people found this helpful
                • 5. Re: The Cost of Calculated Fields?
                  sccardais

                  Thanks very much, Phil.

                   

                  This is very helpful and explains why several of my layouts are too slow.

                  • 6. Re: The Cost of Calculated Fields?
                    srzuch

                    In many accounting systems, the user may want to run reports as of different points of time, and/or for different time periods.  This is another reason to use calculation/summary fields in additional to storing totals.

                    • 7. Re: The Cost of Calculated Fields?
                      Vinny

                      Phil,

                       

                      I guess my next question would be: why "wouldn't" I make the account balance a "stored" calculation?  If it updates anytime I commit a related transaction, then we are only updating that one calculation and performance hit wouldn't be terrible.  Then, when a report is run, it will perform faster because the calculations are already done.

                       

                      Are there situations where I would want to avoid a stored calculation?

                       

                      If there are no major implications to a stored calculation I'm thinking it would be much less complex than scripting the updates to a regular non calculated field.

                      • 8. Re: The Cost of Calculated Fields?
                        philmodjunk

                        Yes, but can they be stored? Calculation fields that refer to a field from a global field, summary field, or a related field cannot be stored. You could set it up as an auto-entered calculation on a data field, but then it won't update automatically.

                        • 9. Re: The Cost of Calculated Fields?
                          srzuch

                          Vinny

                           

                          In accounting systems, you may want the account balances as of different dates, or for different periods of time, or for certain transaction types (e.g. cash basis vs accrual basis), or based on a different date (e.g. transaction date vs entry date), and so on.  One current balance per account may not meet your users needs. 

                           

                          Storing the current balance, and maintaining summary or calculation fields,  may be the most appropiate.

                           

                          Steve

                          2 of 2 people found this helpful
                          • 10. Re: The Cost of Calculated Fields?
                            taylorsharpe

                            Big accounting systems run evening reports summarizing large totals and saving in tables and these summary reports can be accessed for getting totals much faster than totaling the whole database.  So you can increase performance by calling stored data made each evening in batch files.  But if your search is outside the scope of the stored summary data, you have to just wait for the long calculation to perform.  It is a tricky balance knowing how to store summary data and make it available for reports verses having everything calculated on the fly.  But planning on managing stored results can improve performance a lot.  The caveat is planning on when you need say today's hourly data, then using last night's summary data isn't going to work, etc. 

                            • 11. Re: The Cost of Calculated Fields?
                              srzuch

                              Yes, as I stated in response to Vinny, using a mix of stored vs calculated fields will be appropriate in many accounting systems.

                               

                              But the design even gets more complex, especially when you need to have access to but lock down prior period amounts.  For example, in one of the portfolio accounting systems I use to consult on, you could run historical reports based on a "transactions as of date", such as running a June balance sheet and 2nd quarter income statement for all transactions entered on or before July 15th.

                              • 12. Re: The Cost of Calculated Fields?
                                FileKraft

                                i understand your scope here is mostly calculation fields - in regards of overall performance of your solution you need also to consider that indexed fields have a cost on record creation and security settings controlling access to your data are another dimension of the equation...

                                1 of 1 people found this helpful
                                • 13. Re: The Cost of Calculated Fields?
                                  beverly

                                  "historical" is the main reason I use a scripted Set Field over a calculation. IF something needs to recalculate it, there are strict rules for doing so.

                                  Beverly

                                  1 of 1 people found this helpful
                                  • 14. Re: The Cost of Calculated Fields?
                                    Vinny

                                    Phil,

                                     

                                    You are right, I wasn't thinking correctly.  I can't set it up as a stored calculation because it refers to a related table.

                                     

                                    If I use the scripted method and update the balance in a non-calculation field, I may run into record locking issues. I wonder if I can set up a related table for this data instead and store it there to prevent locking issues.

                                     

                                    I also wonder if I can use ExecuteSQL and a virtual table for reporting instead of basing the report on an unstored calculation.  I'd be interested to see the performance comparison and which one is faster.  I would think that they would be relatively the same?

                                    1 2 Previous Next