jared944

Finding the Sum of non-empty fields for Statistical Linear Regression

Discussion created by jared944 on Feb 24, 2016
Latest reply on Feb 24, 2016 by bigtom

Hello everyone,

 

I am developing a model to look at business income by month and want to attempt to apply a trend analysis to predict the next months income. I have a test database I will share.

 

The table contains 2 important fields - the amount and the period. The amount contains the income during the specific period. I have other calculation fields that are used in the final linear regression formula.

 

Y Intercept = b =  ( ( Sum X^2 ) ( Sum Y ) - (Sum X ) ( Sum XY ) )  /  n ( Sum X ^ 2 ) - ( Sum X ) ^ 2 )

 

Slope = m = n ( ( Sum of X*Y ) - ( Sum of X ) * ( Sum of Y ) ) / n ( Sum of X^2 ) - ( Sum of X ) ^ 2

 

X Intercept = Y = m * x + b

 

Here the x intercept would be plotted on a graph as the trend line compared to the amount.

 

Period (X)       1      2      3      4      5      6

Amount (Y)     4      4      3      4      8      10

 

X Intercept      2   3.3     4.7   6.1   7.4    8.8

 

Screen Shot 2016-02-24 at 4.32.27 PM.png

 

The problem I have is if I add a record with an empty amount :

 

Period (X)       1      2      3      4      5      6      7

Amount (Y)     4      4      3      4      8      10

 

X Intercept      2   3.3     4.7   6.1   7.4    8.8   4

 

Screen Shot 2016-02-24 at 12.37.20 PM.png

 

What I want :

 

Screen Shot 2016-02-24 at 1.16.50 PM.png

 

If you see in the example file, the graph does not work with an additional blank field, which is what I need because I want to visualize the graph for the future. This is because I am making a summery of certain fields that now contain non-zero information, specifically :

 

forecast_sum_x  =  Total of Period

forecast_n           =  count of ID

 

I can create a related table and use a calculation to find these values (which is how I came up with the 3rd graph above), but then I cannot apply the trend to a found set - it is using the entire set of field information from the self-join relationship.

 

So, I need to be able to summarize the fields that are not empty for forecast_n and forecast_sum_x in order for it to work properly, and it doesn't appear I can use a related table for that information, unless I am missing something.

 

Any ideas would greatly be appreciated.

 

Thanks!

 

Jared

Attachments

Outcomes