3 Replies Latest reply on Feb 24, 2016 8:06 PM by bigtom

    Finding the Sum of non-empty fields for Statistical Linear Regression

    jared944

      Hello everyone,

       

      I am developing a model to look at business income by month and want to attempt to apply a trend analysis to predict the next months income. I have a test database I will share.

       

      The table contains 2 important fields - the amount and the period. The amount contains the income during the specific period. I have other calculation fields that are used in the final linear regression formula.

       

      Y Intercept = b =  ( ( Sum X^2 ) ( Sum Y ) - (Sum X ) ( Sum XY ) )  /  n ( Sum X ^ 2 ) - ( Sum X ) ^ 2 )

       

      Slope = m = n ( ( Sum of X*Y ) - ( Sum of X ) * ( Sum of Y ) ) / n ( Sum of X^2 ) - ( Sum of X ) ^ 2

       

      X Intercept = Y = m * x + b

       

      Here the x intercept would be plotted on a graph as the trend line compared to the amount.

       

      Period (X)       1      2      3      4      5      6

      Amount (Y)     4      4      3      4      8      10

       

      X Intercept      2   3.3     4.7   6.1   7.4    8.8

       

      Screen Shot 2016-02-24 at 4.32.27 PM.png

       

      The problem I have is if I add a record with an empty amount :

       

      Period (X)       1      2      3      4      5      6      7

      Amount (Y)     4      4      3      4      8      10

       

      X Intercept      2   3.3     4.7   6.1   7.4    8.8   4

       

      Screen Shot 2016-02-24 at 12.37.20 PM.png

       

      What I want :

       

      Screen Shot 2016-02-24 at 1.16.50 PM.png

       

      If you see in the example file, the graph does not work with an additional blank field, which is what I need because I want to visualize the graph for the future. This is because I am making a summery of certain fields that now contain non-zero information, specifically :

       

      forecast_sum_x  =  Total of Period

      forecast_n           =  count of ID

       

      I can create a related table and use a calculation to find these values (which is how I came up with the 3rd graph above), but then I cannot apply the trend to a found set - it is using the entire set of field information from the self-join relationship.

       

      So, I need to be able to summarize the fields that are not empty for forecast_n and forecast_sum_x in order for it to work properly, and it doesn't appear I can use a related table for that information, unless I am missing something.

       

      Any ideas would greatly be appreciated.

       

      Thanks!

       

      Jared