8 Replies Latest reply on Dec 29, 2010 4:45 PM by eibcga

    Aggregate functions



      Aggregate functions


      Hello all,

      I'm a beginner using FMPA11v2 on a Mac and created a solution to analyze financial transactions in various ways.

      Attached is my Relationships Graph, a "General Ledger" layout based on the LEDGER table, and a list of new Summary fields I added to that table so I can get some statistics from each account on the layout.  Also shown are other numerous LEDGER table layouts I use to analyze accounts in various ways (by the way, "FSA" stands for Financial Statement Area).  In addition to the existing aggregate functions I've added to the General Ledger layout using Summary fields (Total of, min, max, avg, stDev, count), which all work as intended, I would also like to add Variance.  Since there's no Summary field for Variance, I figured I would need to use the Variance function.

      As you can probably tell, the General Ledger layout is a cross-tab report, where the Debit and Credit columns are based on IF statements that put the value from the Amount field in the Debit column if the amount is positive, and in the Credit column if the amount is negative.  I then use Summary fields to aggregate the Debit and Credit fields in various ways, all of which work as intended.  I also use functions to compute the average of each month or quarter on numerous layouts by having to use the GetSummary function (see attached).  Again, everything works great so far.

      My problem is that I would now like to compute the variance of all the debit (and credit) values that appear in the Debit field (or Credit field), but there is no option for this in the Summary field type.  So, I created a new Calculation field in the LEDGER table called "var_dr" and use the formula, "VarianceP( debit )" with result as number, but all I get as a value in Browse mode for that field in the layout is a question mark? (and yes, the field is wide enough so show the full result). Aggregation works great using Summary fields, but can't get it to work using a function.  Any ideas?

      I would also like to compute the same statistics (average, min, max, etc.) in various other layouts as was done in the General Ledger layout, but not sure how to tackle that, since there problems that arise when trying to use related fields in some functions, or the results just show question marks.  I know there must be a way to do this, but I'm missing something.  Again, any ideas to guide my way or some examples would be very helpful.

      Thanks you.


        • 1. Re: Aggregate functions

          First, some basic info on aggregate functions. While, like summary fields, they can compute values that span multiple records, their syntax and the needed table structure is different. Using the aggregate function, Sum as an example, you have three basic ways to use an aggregate function:

          Sum (field1; field2; field3) where these fields are all in the same record. It's equivalent to field1 + field2 + field3

          Sum ( repeatingfield ) this sums up all the repetitions in a repeating field. Useful in FileMaker's flat file days, but not used nearly as often today.

          Sum ( relatedtable::Field ) this sums up all the values of field for all records related to the current record where the calculation using sum was defined. This is the key difference, you need a relationship in place that matches to the group of records you want to supply data to the aggregate function.

          You can roll your own calculation that uses summary fields to compute the VarianceP, using the actual formula for this function as documented in FileMaker Help.

          Define a field, DebitSquared, that squares your individual value (x in the help formula):  DebitField^2

          Define a summary field, sTotalSquaresDebit as the Total of DebitSquared. and another field, snCount as the count of debit. You should already have a field, sTotalDebit that is simply the total of Debit.

          Now this formula computes the variance of debit for you entire found set:

          sTotalSquaresDebit / snCount - ( sTotalDebit / snCount ) ^ 2

          And to compute this for groups within the current found set, you can use Get Summary functions instead of direct references to the summary fields.

          • 2. Re: Aggregate functions

            Thank you very much, PhilModJunk, I will give this a try.

            I'm puzzled why using the formula VarianceP(LEDGER::debit), for example, or some other calculation for that matter, that I may want to do in the future based on summarized data from local or related fields, would not work, when all transaction detail records are in the LEDGER join table already [you mentioned above the definition of, e.g., Sum(relatedtable::field)].  All my layouts are LEDGER table layouts.  All aggregations that I do are based on the LEDGER records (but many break-fields are from parent tables like Account, Name, etc.).   I've read somewhere that FM does not do some things in a way one might think due to some internal efficiency issues, etc., and consequently, work-arounds have to be made.  As you can see from the above screenshot, I've accumulated quite a lot of layouts already and they all work great.  But, I have hit a road block in being able to expand them with more calculations because they don't work as intended (the variance calc. is just one example).  Thanks for baring with me and I'm learning a lot.  Any further comments?  

            • 3. Re: Aggregate functions

              "I'm puzzled why using the formula VarianceP(LEDGER::debit), for example, or some other calculation for that matter, that I may want to do in the future based on summarized data from local or related fields, would not work, when all transaction detail records are in the LEDGER join table already"

              You need a relationship to control what subset of the records supply the list of values to the aggregate function. If you don't use the relatedtable::Field format, you are limited to only those fields you explicitly list inside the parenthesis and this makes it nearly impossible to refer to more than one record in a given table.

              The relationship referenced in the function can be a self join relationship where you have the same data source table on both sides of the relationship, but one way or the other, it's the relationship that controls what values are used to compute the result.

              • 4. Re: Aggregate functions

                Thanks again, I'll play around with this and see if I can figure it out from what you're telling me.

                At the moment, the subset of records that are supplying the list of debit values to the aggregate function are, for all related records that have an amount which is positive, put that amount in the 'debit' field, then add up those values in the 'total_debit' Summary field as total of 'debit'.

                I originally thought that, since function parameters can be constants, field references, other functions, or expressions, then using the 'debit' field as the parameter of the, for example, VarianceP function, would be good enough, as I want the variance of all the debit amounts.  As I'm beginning to realize, my current table structure is not enough for FM to "figure out" what I meant.

                My current relationships in the Graph already compute all the results I've needed so far.  It sounds to me a lot of work by having to set up even more relationships, but I'm beginning to see your point and understand how things work.  I'll carry on, and I'll post again if I'm still having trouble.

                Thank you for all your help.  I'm really enjoy FM and amazed at it's ease-of-use and powerful features.

                • 5. Re: Aggregate functions

                  I've added all the summary and calc fields you suggested in the LEDGER table, but I still get the result value as a question mark when I'm in Browse mode in the General Ledger layout.  All fields mentioned below (your naming convention added for your convenience) are in the LEDGER table:

                  cnDebit_squared = debit^2

                  cnCredit_squared = credit^2

                  sTotal_dr_squared = Total of debit_squared

                  sTotal_cr_squared = Total of credit_squared

                  cnVar_dr = total_dr_squared / count_dr - ( total_debit / count_dr ) ^ 2

                  cnVar_cr = total_cr_squared / count_cr - ( total_credit / count_cr ) ^ 2

                  I'm not getting the intended result probably because, as you mentioned, "You need a relationship to control what subset of the records supply the list of values to the aggregate function".  

                   While up to this point, the sub-summary parts in all the sub-summary layouts were good enough to supply the list of values in related records to provide various aggregations based on various summary fields, the GetSummary function, and the existing relationships, this aggregation method falls short when Functions (e.g., like VarianceP) are used.

                  I'm seeing now that, when the Summary field as a means of aggregation falls short, then I have to manually create, using relationships, what the Summary field does, by aggregating in another way (I'm guessing FM developers had to aggregate using relationships before FM had summary field types?)

                  Aggregating stuff using relationships rather than summary fields is new to me and I'll have to chew on that for a bit.  Thanks again for pointing me in the right direction.

                  • 6. Re: Aggregate functions

                    I have done some further reading and also tried this little example using a new file.

                    Both examples suggest that summary fields and sub-summary reports are not always necessary and can be replaced by using aggregate functions and a self-join relationship.  Is this the idea you're suggesting that would help to solve my VarianceP problem?
                    I'm completely open to re-designing my database file so that all my existing sub-summary reports and summary fields are replaced with their equivalents using just functions in calculation fields and various self-join relationships, with new table-occurance groups and modified self-join relationships for each type of "report".  As long as I get the same information as I'm getting now.
                    Am I on the right track?  Having to previously create all those sub-summary reports and summary fields, etc., is a great learning exercise for me and I'm glad I did them.  But, if there's a better way that will help me add more calculations to my current database to summarize records in more ways than I can now, then I'm up for the learning experience.  Comments please?  Thank you again.
                    • 7. Re: Aggregate functions

                      Going back two posts, if all you have are summary fields, you do not need any additional relationships. Summary fields work on the total records in your found set or, if place in a sub summary part or referenced in a GetSummary function, a sub group of the found set, provided the records are properly sorted.

                      If you are gettin a ? and the field is wide enough to display the result, the most likely cause is division by either zero or a null value, so you should check the results being returned by count_dr and count_cr.

                      Referring to you last post, I've been suggesting that you not use aggregate functions and to use summary fields instead so that you do not have to define the relationships needed to feed values into the aggregate varianceP function.

                      Using a related table and aggregate functions can produce the same or nearly the same results as a summary report with summary fields, the final choice is up to you. IMO, this is not a case where going to the trouble of using an aggregate function makes sense unless you find you need or already have the relationships needed to implement the aggregate functions.

                      • 8. Re: Aggregate functions

                        I missed two things… I solved my problem using the varianceP formula you provided from the FM Help, but (1) I forgot to use the GetSummary functions on the Summary fields, and (2) I didn't realize my calculation field showing the varianceP result was not wide enough, so I got a question mark as the result.  Now it works as intended.  Thanks!

                        GetSummary ( total_dr_squared ; _kf_acct_id ) / GetSummary ( count_dr ; _kf_acct_id ) - ( GetSummary ( total_dr ; _kf_acct_id ) / GetSummary ( count_dr ; _kf_acct_id ) ) ^ 2

                        UPDATE: Since the standard deviation is just the square root of the variance, I've deleted the above calculations and fields and replaced with:

                        GetSummary ( stdev_dr ; _kf_acct_id ) ^ 2