11 Replies Latest reply on Oct 30, 2014 7:50 PM by jormond

    250,000 + records




      I have a database with over 250,000 records containing information on people who have taken out magazine subscriptions spanning decades. These are currently all in one table, but there are only around 5,500 active records - the others relate to people who are no longer alive, have ceased their subscriptions, are no longer contactable or are otherwise 'inactive'.


      My question: is there a best practice for handling this scenario? I'm wondering about moving the delinquent records to a separate table which is identical to the source table, but maybe there's a better way?!


      Many thanks,



        • 1. Re: 250,000 + records

          Just leave them in there. They're not hurting anything. If you move the dead records into another table, then all you have is two tables: one with a bunch of dead records, and the other with some live records AND, eventually, some dead records too.

          Create a status field and tag them for what they are.
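
          A sketch of what that might look like (field name and value list are illustrative, not from the original post):

          ```
          // Stored text field on the subscriber table, e.g.:
          //   Status  (value list: "Active", "Cancelled", "Deceased", "Uncontactable")

          // Finds, relationships and reports can then key off a stored flag:
          //   ActiveFlag = If ( Status = "Active" ; 1 ; "" )
          // Because the flag is stored (not unstored), it can be indexed and is fast to search.
          ```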



          • 2. Re: 250,000 + records

            How is the performance? If only 3% of your data set is active, you might consider setting up a very narrow table which carries the IDs of the active records. You can create a one-to-one relationship between this new table and the original. That will allow you to handle the small, active data set without having to modify the existing system very much.
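
            Sketched out (table and field names are assumptions for illustration), the narrow table carries nothing but the key:

            ```
            // Hypothetical "ActiveIDs" table: one record per active subscriber,
            // holding only the foreign key.
            //
            // Relationship (one-to-one):
            //   ActiveIDs::subscriber_id  =  Subscribers::id
            //
            // Dashboard aggregates then walk ~5,500 rows instead of 250,000+, e.g.:
            //   Count ( ActiveIDs::subscriber_id )
            ```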



            • 3. Re: 250,000 + records

              Hi Kyle,


              Whilst they're not hurting anything, it seems obvious (and I should have stated this) that there is going to be a hit on performance when running any kind of analysis on the 5,500 records, as the entire data set must be processed. I figure that it just makes sense to have those records somehow separate from the 250,000 which are of no real value.



              • 4. Re: 250,000 + records

                Hi Malcolm,


                Performance in general use is fine. I do have an issue with some dashboard widgets, but I'm fixing those, as I was referencing unstored calcs across the relationship(!) I'm seeing an improvement as I set about fixing that issue. However, I like the sound of your suggestion and will give it a go. Thanks.



                • 5. Re: 250,000 + records

                  250K records is very small, so your "analysis" may not be efficient.


                  Normally, a script would "find" the active records first, but if an active record must be compared to all records, it will slow down as the file grows.


                  e.g. if all records are self-related, add an "active" key to the relationship, so non-active records are excluded.
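
                  One common way to sketch such a filtered self-join (field names are made up here) is to match a constant against the active flag, so inactive records never come through the relationship:

                  ```
                  // Hypothetical multi-predicate self-relationship:
                  //   Subscribers::gConstantOne  =  Subscribers_Active::ActiveFlag
                  //   AND Subscribers::matchKey  =  Subscribers_Active::matchKey
                  //
                  // gConstantOne is a field that always equals 1, so only records
                  // whose ActiveFlag = 1 can match on the other side.
                  ```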





                  > that there is going to be a hit on performance when running any kind of analysis on the 5,500 records as the entire data set must be processed

                  • 6. Re: 250,000 + records

                    Thanks Greg.


                    Even uncomplicated statements like 'Count ( SubsCancelledThisMonth::id )' take 20+ seconds to complete, and I have a number of similar calculations on the dashboard screen. Collectively they take nearly two minutes to complete. Is it better to create a layout with just the fields I need to look at and script it? I'd assumed a simple calculation would be more efficient, as a script would effectively have many more lines of code.


                    Thanks for your input.



                    • 7. Re: 250,000 + records

                      Yes, unstored calcs across a relationship are very slow,


                      and if any relationship uses < or >, it's even worse.


                      Dashboards are tricky in general, since FileMaker is slow at math.


                      Perhaps run a script each night to store the values you need,


                      or, if perhaps CancelledThisMonth = 1, then "find" the active records and show a plain summary, rather than an aggregate Count.
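
                      A nightly routine along those lines might look like this (script steps sketched from memory; table, field and date-format details are assumptions):

                      ```
                      # Find this month's cancellations and store the count in a plain
                      # number field, so the dashboard reads a stored value instead of
                      # evaluating an aggregate across the relationship.
                      Enter Find Mode [ ]
                      Set Field [ Subscriptions::Status ; "Cancelled" ]
                      Set Field [ Subscriptions::CancelDate ; Month ( Get ( CurrentDate ) ) & "/" & Year ( Get ( CurrentDate ) ) ]
                      Perform Find [ ]
                      Set Field [ Dashboard::CancelledThisMonth ; Get ( FoundCount ) ]
                      ```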




                      > I was referencing unstored calcs across the relationship(!)

                      • 8. Re: 250,000 + records

                        Thanks Greg. I didn't know about the < or > being an issue.


                        I like the idea of an 'end of day' routine to fix the values. I'm not sure they require real-time values so that should work quite well. I can provide a 'refresh' mechanism if they need it.


                        For some of the values a 'find' will work just fine as often I just need the 'Found Count' value - no real math involved - so that's also a good option.


                        Thanks for your help. Much appreciated.



                        • 9. Re: 250,000 + records

                          +1 to Malcolm's suggestion.


                          When I built an inventory solution where 99% of the stock was active (but knowing things would be archived sooner or later) I built a "thin" Active table and an "ALL" table.

                          All of the data is on the "ALL" table and the thin table just has UniqueID and FoundCount calcs.


                          Now (5 years later), ~ 3% of the records are active, all the rest are archived...and it runs just as fast as when new.

                          The record count in the "ALL" table is a few hundred thousand; the record count in the "Active" table is a few hundred. Speed is not compromised in day-to-day use.


                          Good luck!

                          • 10. Re: 250,000 + records

                            The Count () function will need to evaluate every record in the found set. Sloooow. Instead, add an unstored calc to the table. The calculation should be "Get ( FoundCount )". When you perform a find of the active accounts, this new field will provide you with the same results, but do so instantaneously. This field is so useful, I make it a standard field in virtually every table I create - just in case I need it.
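
                            For contrast (field names here are assumed), the two approaches look like this:

                            ```
                            // Unstored calculation field, result type Number - evaluates once
                            // for the current found set, so it displays effectively instantly:
                            //   FoundCountDisplay = Get ( FoundCount )

                            // versus the aggregate, which must touch every related record:
                            //   Count ( SubsCancelledThisMonth::id )
                            ```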



                            A most excellent blog article by Daniel Wood can be found here:




                            • 11. Re: 250,000 + records

                              You could also avoid the field itself if you don't need it to refresh on its own. Just set a variable that displays the found count, depending on what you need.