Hi,

I'd like to predict the sales of each of my SKUs based on their sales history.

I'd like something statistically sound using just the sales history (no weather influence, etc.).

It's an online retail store.

So, have you implemented such a thing?

thanks

An existing web service that does the same thing is, in my opinion, a better option. The web service does the math, you just feed it the data from FileMaker and get back whatever data or chart that the web service provides.

I suspect you can find web services that can already do this, which you simply (or not so simply) need to integrate into your solution using Insert From URL to send data and get a response back.

The typical math involved in this is nothing FileMaker can't or shouldn't handle internally. Any production feature (as opposed to an analyst's research tool) using time series analysis (e.g., sales forecasting) tends to be a custom fit, so the work someone else might have done in FileMaker won't necessarily translate to what you're doing. We could stand to have better broadly available primitives supporting that, though.

Facebook's Prophet tool has a number of techniques that, while not super-refined, are both perfectly reasonable starting points from an analytical perspective, and look like they should be easy to adapt to work in FileMaker without needing to call an outside service. Check out the paper describing how it works.

You should probably also be interested in "online", "stream", or "incremental" forecasting algorithms. It sounds like your situation is a good fit for that type of model, and even without that, stream algorithms have attractive calculation speed guarantees.
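To give a concrete sense of what an "online" or "streaming" forecaster looks like, here is a minimal sketch of simple exponential smoothing in Python. The alpha value and the sales numbers are illustrative assumptions, not anything from this thread; the point is that each new observation updates the forecast in constant time, with no need to re-read historical data.

```python
# Simple exponential smoothing: a classic online forecaster.
# Each new observation updates the state in O(1); history is never re-read.

class ExponentialSmoother:
    def __init__(self, alpha=0.2):
        self.alpha = alpha   # weight given to the newest observation
        self.level = None    # current smoothed estimate

    def update(self, observation):
        if self.level is None:
            self.level = float(observation)
        else:
            self.level = self.alpha * observation + (1 - self.alpha) * self.level
        return self.level

    def forecast(self):
        # Flat forecast: the next period is expected to look like the current level.
        return self.level

smoother = ExponentialSmoother(alpha=0.3)
for daily_sales in [8, 10, 7, 9, 12, 8]:
    smoother.update(daily_sales)
print(smoother.forecast())
```

Because the update is just a couple of multiplications and an addition, the same logic would be easy to express in a FileMaker calculation or script step.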

I think philmodjunk is right. Know when to offload processing. FileMaker scripts with loops and lots of processing can be slow. And, IMHO, using the Script Workspace, with its dialog-based interface, to write statistical routines might be ineffective.

The way I did this was, in a loop in FMP, gather the error data for the given year. Then, in an Insert From URL script step, do a **POST** to the web service. The web service returns the expected error count in week 52, right into an FMP field. The web service is extremely fast. I've written services like this for other customers. It's not a big deal.

-----

Another approach is to use JDBC and FMP's free JDBC driver, externally. If you can write an SQL statement that describes which SKUs need to be considered (say, for a date range or other criteria), a Java program can extract that data directly from the FMP database, do the number crunching, and write it back, create an output file, send you an email, whatever. F-A-S-T.

If you could include some dummy (sanitized) sample data, with your expected value(s), I could tell you more.

A final benefit of the service approach is that it's not FMP-only. Any application that can do a GET (for small data sets) or a POST (for larger data sets) can send data to the service. JDBC, the second approach above, works with databases only.

Thanks V.

**Very important:** Please post some representative data and your expected values before deciding on an approach. If you post a 3,000-row sample data set, I'll be glad to run them all as a test. I'll even post benchmark data. I'm assuming you're looking for a future value of "y" given a value of "x", as in simple regression.

3,000 records is not that much, and if it's a daily procedure rather than something interactive that changes on each sale, it should be easily doable with built-in reporting techniques natively in FM.

But it requires some math to translate your stats into a report. If you can define how you want to predict as a function of past data, it might be worth a try.

Repeating fields and summary reports can express a lot of the "Sigma Algebra" used in statistical formulas - it all depends on how you want to predict...

The nature of forecasting is that the source data tells you both what your inputs are and roughly what your outputs should be. (Or the desired forecast is based on some function of the same data, such as if we want to bias a forecast to over-estimate demand more often than under-estimating to minimize how often we run out of stock.) The "expected values" aren't as precisely definable as, for example, a table showing the inputs and outputs for a sine function.

Even basic forecasting is a bit more complicated than univariate regression, but not much. Reasonable models for some cases can be built from a series of univariate regressions via boosting, for example. An ARIMA model might be a more conventional starting point.
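A full ARIMA model is more than a forum post can implement inline, but the autoregressive core behind it is easy to sketch: regress each observation on the previous one. The sketch below is pure Python, covers only the "AR" part (no differencing, no moving-average terms), and uses made-up sales numbers.

```python
# Minimal AR(1) sketch: fit y[t] = a + b * y[t-1] by least squares.
# This is only the autoregressive piece of ARIMA, shown for shape, not rigor.

def fit_ar1(series):
    """Least-squares fit of each value against its predecessor."""
    x = series[:-1]          # yesterday's values
    y = series[1:]           # today's values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    b = cov / var
    a = mean_y - b * mean_x
    return a, b

def forecast_next(series, a, b):
    # One-step-ahead forecast from the last observed value.
    return a + b * series[-1]

sales = [8, 9, 7, 10, 8, 9, 11, 10]   # illustrative daily sales
a, b = fit_ar1(sales)
print(forecast_next(sales, a, b))
```

The sums and means here are exactly the kind of aggregation FileMaker summary fields can compute, which is why the earlier posts argue this is feasible natively.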

I don't think this is an unreasonable computation for FileMaker to handle even if it does need to update the forecast in response to every individual sale. That's what streaming and online algorithms are for. The idea is just to update the forecast statistics without needing to re-read the historical data. This often gets done even when we don't need updates that frequently due to the guarantee that the execution time will only be linear with respect to the amount of data there is to process.

"Sigma Algebra" has a specific meaning that isn't what I think you have in mind. You are right that most of the computational work in most of the statistics most people are mostly interested in boils down to a lot of summing and counting, which summary fields and FileMaker's built-in aggregate functions do.

Hi all,

I whipped up a quick sample file with actual data for a bit more than 3 months of sales for 2 representative SKUs.

The goal is to get, for each SKU, the number of days the current inventory could sustain, or in other words, to predict the date when inventory will reach 0.

Our current method, used for years, is based on a 3-month average, so anything will be better.

The end result, the number of days, is meant to warn our buying staff about products that may become unavailable.

It's meant to highlight products that need attention (so among our 3700 SKUs, maybe 300 will have to be considered by staff).

A human will decide if there's a need to buy and how much. So don't fret, the company won't go bankrupt if the results are wrong.

It's guidance, not an auto-order process.

There are 2 tables:

sales: contains the sales per day. Please note that it doesn't include all dates (**dates are discontinuous, as only the days on which there were sales are recorded**).

current_inventory: contains

the unique SKUs,

their current inventory quantity,

the start date for the sales to consider,

a calc that's our current lame algorithm computing the number of days left of inventory,

and a placeholder calc field for a smarter way to compute the days left.

I used calc fields as an example, but you can use whatever method.

The data is anonymized, so you can do whatever you want with it.

File authentication is the usual default: account "Admin", blank password.
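For reference, the baseline "3-month average" method described above can be sketched in a few lines of Python. The function and variable names are placeholders, not the sample file's actual field names; the illustrative numbers echo the roughly 8.5 units/day and 56-unit figures mentioned elsewhere in the thread.

```python
# Sketch of the current "lame" method: average daily sales over the window,
# then divide current inventory by that rate.
from datetime import date

def days_of_inventory(sales, inventory_qty, start_date, today):
    """sales: list of (sale_date, qty) tuples for one SKU.
    Per the thread's caveat, dates with zero sales are simply absent,
    so we divide by calendar days elapsed, not by len(sales)."""
    total_sold = sum(qty for sale_date, qty in sales if sale_date >= start_date)
    calendar_days = (today - start_date).days or 1
    avg_per_day = total_sold / calendar_days
    if avg_per_day == 0:
        return float("inf")   # no sales: inventory lasts indefinitely
    return inventory_qty / avg_per_day

# Illustrative data: 34 units sold over 4 calendar days = 8.5/day on average.
sales_115 = [(date(2017, 6, 1), 9), (date(2017, 6, 3), 8), (date(2017, 6, 4), 17)]
print(days_of_inventory(sales_115, 56, date(2017, 6, 1), date(2017, 6, 5)))
```

This is essentially the `days_left_current_method_prediction` calc in spirit: with 56 units on hand and 8.5 average daily sales, it yields about 6.59 days.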

Forecasting isn't just an algorithm: you need not just the math but also the interpretation and human intuition to create the model behind it. Although FM should be fine doing the computations, building the model in FM would be very time consuming. I would also advise using a BI tool instead, Tableau for example.

Although you mention 3000 SKUs, you would need a strong N (sample size) to support forecasting for each and every one of them. If, for example, you have a SKU that only sells about 5 pieces per year, then why bother, unless it is a high-value item? And if it is a high-value item, why not forecast "by hand"? You need some volume to make forecasting worthwhile, and a sufficient N is usually dictated by the model you use and the forecast interval.

Also, if you just need a forecast for total sales, then there is no need to do it for each and every SKU. You can forecast total sales alone, which is much easier, faster, and more reliable anyway.

I look forward to seeing what jbante comes up with using the interesting online sites he refers to.

What I did, just to get a handle on the data a bit, was to graph it -- always a good idea.

I created a new column called "DayOfYear", which is a simple calculated field from the sales date (day of year). Then, for SKU "115" only, I imported the data into R and did a few graphs.

Below are graphs of the data, by day of year, for SKU "115":

(1) Line chart

(2) density chart

(3) step chart

(4) barplot

(5) boxplot

**I also used the "ggplot" package in R to do a smoothing graph of the data, and a point graph of the same data.** As you can see, the data are all over the place, making a straightforward regression line doubtful.

**In any case, running the linear regression function (to get the x-intercept and related data), I got:**

Residuals:

        Min      1Q  Median      3Q     Max
    -2.9211 -2.3153 -1.2463  0.9888 16.3256

Coefficients:

                         Estimate Std. Error t value Pr(>|t|)
    (Intercept)          4.806148   1.035666   4.641 5.27e-06 ***
    dataFrame$DayOfYear -0.007255   0.005673  -1.279    0.202
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.313 on 289 degrees of freedom

Multiple R-squared: 0.005627, Adjusted R-squared: 0.002187

F-statistic: 1.636 on 1 and 289 DF, p-value: 0.202

---

Thus, I agree with others who posted that a model is necessary first. The data by themselves don't give you enough to go on or I don't understand well enough what you actually need to do.

Hope this is useful info...

Thanks, great.

> Thus, I agree with others who posted that a model is necessary first. The data by themselves don't give you enough to go on or I don't understand well enough what you actually need to do.
>
> Hope this is useful info...

Those data are all I have. And as I said, I just want something better than the 3-month averaging to get avg. sales per day, and then divide my inventory by it.

Maybe rather than days we should look at weeks; it would smooth things out.

I just want to know how to apply a statistical formula to that data set in FileMaker.

And the problem is also that I can't understand the output you posted.

Again, I'm not sure I totally understand what you're trying to do, but the challenge from my perspective is in the prediction part, not the graphing part -- though graphics are always a good thing!

To answer your previous question:

When R returned this:

Coefficients:

                          Estimate Std. Error t value Pr(>|t|)
    (Intercept)         **4.806148**  1.035666   4.641 5.27e-06 ***
    dataFrame$DayOfYear **-0.007255** 0.005673  -1.279    0.202

this answer assumed I had fit the model Y ~ X, so that the theoretical "Model" of your regression line would be of the form: Y = B(sub0) + B(sub1) * x. The "B" is actually Beta, but this is just a stand-in for a simple slope-intercept model: y = mx + b (or in this case, y = b + mx).

So, using the data I got from R, the theoretical matching regression line for the data I tried would be:

**y = 4.806148 + -0.007255 * x**

**Graphing the regression line gives you this (from the charted data):**

My requirements are very simple:

SKU0115 has 56 units in the warehouse.

When (in number of days from today), according to the past sales, will there be 0 units in the warehouse?

In other words, with the current stock, how many days can I run my business?

See my inventory table / layout in my sample file; that's the days_left_current_method_prediction calc.

The crappy days_left_current_method_prediction method tells me I've got 6.58 days of inventory left.

So what would cool_regression_calc give?

I think my analysis above may be on the right track then.

jbante alluded to a website, technical documentation, and other techniques for you to consider, and he seemed to think what you want to do is possible in FileMaker. However, since I would *not* use FileMaker for a problem like this, other than to extract the data, I'll let him take over from here.

> So, if we solve the equation for "y", that is, when will the line hit the x-axis, we get that x = 658.

In 658 days? I think that's when nobody on earth will want to buy that product; the slope measures customer desire, I guess.

But the number of days to sell all 56 items in the warehouse is much, much sooner.

It's very strange you get 658, and my crappy method returns 6.58 days (which seems OK according to our business). Coincidence?

What's the formula you used, and how to implement it in FileMaker? That's the question.

OK, so we have the slope. The question now is when my inventory will be depleted,

so I need a formula, given the ax+b line, that will compute tomorrow's sales, then the day after tomorrow's sales, and sum those future days until we hit my inventory level of 56.

That would be:

SUM (

( 4.806148 + -0.007255 * (x+1)) +

( 4.806148 + -0.007255 * (x+2)) +

( 4.806148 + -0.007255 * (x+3)) +

....

( 4.806148 + -0.007255 * (x+n))

) = 56

How to compute n?

x would be a given; that's the current day.

I think it would be the integral of the slope. Problem: I don't understand zilch about integrals anymore :-)
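No integral is actually needed here: the SUM above is a finite arithmetic series, so n has a closed form via the quadratic formula, and an even simpler option is to accumulate predicted daily sales until the running total reaches the inventory level. A Python sketch using the regression coefficients from earlier in the thread (the value of x, the current day-of-year, is an assumed example):

```python
# Accumulate forecast daily sales (a + b*(x+n)) day by day until the
# running total reaches the inventory level; n is the answer.

def days_until_depleted(a, b, x, inventory):
    total = 0.0
    n = 0
    while total < inventory:
        n += 1
        daily = a + b * (x + n)     # predicted sales on day x + n
        if daily <= 0:
            return None             # trend says sales stop before depletion
        total += daily
    return n

a, b = 4.806148, -0.007255          # coefficients from the R regression above
print(days_until_depleted(a, b, x=200, inventory=56))
```

A loop like this translates directly into a FileMaker Loop / Exit Loop If script, or a recursive custom function. The `daily <= 0` guard matters: with a negative slope, the forecast eventually predicts zero sales, at which point the inventory would (per the model) never deplete.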

The formula (solving for y=0) is the regression line calculated from your data as I interpreted it (using the least-squares technique).

The SkeletonKey link above appears to use the same basic least-squares technique.

I would always use R for this type of analysis due to its rich and mature graphics, add-on packages, and complete statistical suite. (All free.) Best yet, you can use FileMaker data directly in R. (Or, from FileMaker, you can integrate with R, as I've documented here on the forum.)

The link you're using has the same least-squares technique I was using in R though you may be using different data.

If we agree on a particular dataset you're interested in, as in (x,y) values, I'll re-run my analysis.

FWIW, for least-squares, no integration is required.

If you need to calculate probabilities (geometrically, the area under the probability distribution curve, where integration would apply), you should be using a stats program at that point. Or, from FileMaker, possibly integrate with a micro-service that can make those calculations easily using a fast, free, standard add-on library (hint: never write statistical code yourself in a custom function when time-tested free libraries exist).

Looking at your sample file, it appears you are selling 8.5 per day of SKU01 and 8.0 per day of SKU02. Of course, you aren't selling exactly that amount each day (especially the 0.5), but these are your average daily sales over the 3-month or longer period of your data (Sample Size/#Days in Sample).

If you are forecasting sales for the purpose of minimizing stock runouts, I would be looking at the "stock reorder point." You set this value for each SKU and generate a daily report of SKUs for which quantity available is less than or equal to the stock reorder point (this assumes you run a stock report each day).

The stock reorder point is usually calculated "by feel." The people closest to the items have the best "feel" for it -- you can look at printouts all day but the stock pickers in the warehouse probably have the better data. Whatever, the stock reorder point is the quantity necessary to keep you in business until you can replenish the item. This is based on days remaining of sales, which includes supplier lead time and whatever time you need for ordering, receipt and restocking.

Many ERP inventory systems also include a field for "safety stock quantity." As the name implies, this is a quantity that takes into account what you want to have on hand "to feel safe." For most items, this can be the same value as stock reorder. But for high-turnover and/or seasonal items or when you are running promotions, you can set it higher. Your daily stock reorder report can also evaluate against this field. The main advantage to having this additional field is so you don't have to keep adjusting the default reorder point.

This should give you some ideas. Good luck!

Another rough order-of-magnitude (ROM) estimate to help set that value would be a rolling sales average (time series). Based on all the data in your DB, by SKU, you could compute the average sales over all data or over a date range, and then divide the quantity on hand by that average (for each SKU) to get a rough idea, based on average sales, of how long inventory would last.

Whether you used all the data, the most recent data, or a rolling time-series average (or other) is more of an art, with trial and error likely, than science.

Right: the graph I showed and the regression line (using least-squares, no integration necessary!) show you the trend of sales based on the data. From the very slight negative slope of the regression line, given the data for SKU 115, we see that sales are trending downward, but just a very little.

Perhaps if you had daily inventory, not just "on hand", that would be better data with which to make predictions.

---

If your data follow a Poisson distribution (you'd need a goodness-of-fit test to see if they're Poisson), you could ask interesting questions like: "What's the probability that I'll have exactly (or at least) 10 sales next week?"

fmpdude wrote:

Why don't you whip up a quick example with the OP's posted sample data, using one of the online services you mentioned for the OP?

I *was* looking for something fun to do this weekend. But I didn't mention an "online service". "Online" as applied to an algorithm doesn't mean a service hosted by someone on the internet, but a technique for solving a problem without access to the full data. The term "streaming algorithm" has associations closer to the intent (looking at data points one at a time as they come in, and not reviewing old data points when updating a summary) now that "online" means something different to most people than it used to. The literature is very muddy on the distinction between them, so telling folks to only look for one term leads them to overlook lots of good stuff indexed under the other. Basic least-squares linear regression is one of the happy examples where the online/stream calculation is exact.

DIYDashboard / OneBusinessApp is Luke's thing, too.

I attended his presentation at DevCon 2016 and found it very interesting. (Missed it this year, though.)

If I understand what you're asking... the way I did that (in my case it was error prediction) was to assume linearity and then use linear statistical regression: compute the regression line equation given my existing data (in FMP), then extrapolate using some data point in the future.

I did all this in a REST service running in the Tomcat app server, then called that service from FMP.

Works great.
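As an aside, the regression-line computation described above is one of the cases jbante mentioned where an exact online/streaming version exists: keep a handful of running sums, update them once per observation, and recover the slope and intercept at any time without re-reading old data. A minimal Python sketch (the data points are made up):

```python
# Online (streaming) least-squares regression: five running sums are enough
# to recover the exact slope and intercept at any point, with no history kept.

class OnlineRegression:
    def __init__(self):
        self.n = 0
        self.sum_x = self.sum_y = self.sum_xx = self.sum_xy = 0.0

    def update(self, x, y):
        # O(1) per observation; old data points are never revisited.
        self.n += 1
        self.sum_x += x
        self.sum_y += y
        self.sum_xx += x * x
        self.sum_xy += x * y

    def coefficients(self):
        denom = self.n * self.sum_xx - self.sum_x ** 2
        slope = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        intercept = (self.sum_y - slope * self.sum_x) / self.n
        return intercept, slope

reg = OnlineRegression()
for day, sold in [(1, 10.0), (2, 9.0), (3, 9.5), (4, 8.0)]:   # illustrative data
    reg.update(day, sold)
print(reg.coefficients())
```

Because the state is just five numbers, this maps naturally onto FileMaker fields updated by a script step on each new sale record, which is the appeal of the streaming approach discussed earlier in the thread.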