10 Replies · Latest reply on Nov 27, 2016 9:34 PM by philmodjunk

    SQL statement advice

    kiwikaty

      Hi there

       

      Each night I need to reconcile around 70,000 records, and once a week 315,000 records. At the moment I have an unstored calc which compares about 10 fields between the two tables to see if there are any differences, e.g.

      If ( file1::FieldA ≠ file2::FieldA ; 1 ; 0 ) +

      If ( file1::FieldB ≠ file2::FieldB ; 1 ; 0 ) +

      If ( file1::FieldC ≠ file2::FieldC ; 1 ; 0 ) + ...

       

      Across the massive record set I then do a Replace Field Contents of the unstored calc value into an indexed local number field. I then do a find on this indexed field for > 0, and with that found set (which is every record where the data in file1 does not match file2) I do an import into file2 using "update matching records".
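
      In script terms, the nightly routine is roughly this (a simplified sketch; the real field and layout names differ):

      # rough outline of the current nightly script (names are placeholders)
      Go to Layout [ "File1" ]
      Show All Records
      # slow part 1: push the unstored comparison calc into an indexed number field
      Replace Field Contents [ File1::DiffFlag ; File1::cUnstoredDiff ]
      # find everything that differs
      Enter Find Mode [ Pause: Off ]
      Set Field [ File1::DiffFlag ; ">0" ]
      Perform Find []
      # slow part 2: from a File2 layout, import this found set from File1
      # using "Update matching records in found set" on the key field
      Go to Layout [ "File2" ]
      Import Records [ Source: File1 ; Update matching records in found set ]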

       

      The thing is, the replace and then the import are taking hours each night. Initially I was importing with matching on all records, but it did not make sense to import ones that did not need updating, and I did not like how many records were being touched.

       

      I need to claw back some of the time this process is taking, so I am wondering if I could somehow use SQL inside FM instead to find all the records that have differences. The import will still be slow, but marking the ones that need importing may be quicker.

       

      I was just wondering if someone who knows FM and SQL could suggest what sort of statement I could use to locate the set that needs updating, maybe without having to store the unstored calc first? Or maybe there is just a whole better way of approaching this nightly task?
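
      Something along these lines is what I imagine (just a rough sketch with made-up table-occurrence and field names, and I realise unindexed fields may still be slow):

      ExecuteSQL (
        "SELECT a.id
           FROM file1 a JOIN file2 b ON a.id = b.id
          WHERE a.FieldA <> b.FieldA
             OR a.FieldB <> b.FieldB
             OR a.FieldC <> b.FieldC" ;
        "" ; ""   /* default separators: comma between fields, return between rows */
      )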

       

      Any advice appreciated.

       

      Katy B

        • 1. Re: SQL statement advice
          erolst

          Assuming that all 10 fields are stored, the first thing that comes to mind is creating a relationship between the tables with the field pairs as predicates, i.e.

           

          table1::primaryKey = table2::primaryKey

          table1::A = table2::A

          table1::B = table2::B

          etc.

           

          Then:

          GtRR (Go to Related Records) from table2 to table1, showing only the related (i.e. fully matching) records

          If the matching set = the total set, there is nothing to update; otherwise:

          Show Omitted Only

          GtRR back to table2 via a relationship that matches only the primary key

          import from table1 with "update matching records"
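
          In script-step form, roughly (a sketch only; the table-occurrence names are placeholders):

          # start on a table2 layout with all records shown
          Show All Records
          # "table1_allFieldsMatch" = the multi-predicate relationship above
          Go to Related Record [ Show only related records ; Match all records in current found set ;
                                 From table: "table1_allFieldsMatch" ; Using layout: "table1" ]
          # the table1 found set now holds only rows whose table2 counterpart matches on every field
          If [ Get ( FoundCount ) = Get ( TotalRecordCount ) ]
            # everything matches, so nothing to update
            Exit Script [ ]
          End If
          Show Omitted Only
          # the found set is now the table1 rows that differ; jump back to their table2 counterparts
          Go to Related Record [ Show only related records ; Match all records in current found set ;
                                 From table: "table2_byPrimaryKey" ; Using layout: "table2" ]
          Import Records [ Source: table1 ; Update matching records in found set ]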

          • 2. Re: SQL statement advice
            kiwikaty

            Thanks erolst

             

            This will in fact work for six of the difference checks, and I will use it. A couple of the checks involve a custom function converting dates (there is in fact a file0 that fully re-populates file1 each night), but I have logged a change for those dates to be converted before we get the data, and that change should happen next week.

             

            The other couple check for the existence of other related records and a status on the record, but I can move these out into constrained finds once I have reduced the sets down based on the method you have outlined above.

             

            I should have posted this years ago!

             

            From your experience, is it quicker to do a "replace" or a looping update using Set Field?

             

            Thank you for your help.

             

            Katy B

            • 3. Re: SQL statement advice
              kiwikaty

              Foiled - it seems something is not indexed, as it cannot find on related values or go to related values without hanging. I will keep working on it. Thank you again.

              • 4. Re: SQL statement advice
                erolst

                Bummer. Well, the best of luck ...

                 

                By the way, RFC (Replace Field Contents) is faster than a script loop.

                • 5. Re: SQL statement advice
                  David Moyer

                  Hi,

                  The replace/find seems to be the sticky bit here. You might try testing this option for speed: script this process, looping (as you just mentioned) through your found set of records, evaluating the unstored field and skipping or omitting each record accordingly:

                  If [ <unstored diff calc> ≠ 0 ]

                    Go to Record/Request/Page [ Next ; Exit after last: On ]

                  Else

                    Omit Record

                  End If

                  (See the "Skip or Omit ... potential pitfall" thread for the pitfalls.)

                  • 6. Re: SQL statement advice
                    taylorsharpe

                    Going through all the records is what takes a long time, so the fewer records you have to compare, the faster it goes. One way to do this is to look at creation dates and modification dates, assuming you keep those in your tables. That way you only compare the ones where the modification date or creation date is later than the last sync. Then go back and check that anything deleted at the source is removed. It is all just part of doing a good synchronization quickly.
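
                    For example, something like this before the comparison step (a sketch; it assumes ModificationTimestamp / CreationTimestamp fields and a $lastSync timestamp you have stored somewhere):

                    # only compare records touched since the last sync
                    Enter Find Mode [ Pause: Off ]
                    Set Field [ File1::ModificationTimestamp ; ">" & $lastSync ]
                    New Record/Request
                    Set Field [ File1::CreationTimestamp ; ">" & $lastSync ]
                    Perform Find []
                    # ...run the field-by-field comparison on this much smaller found set,
                    # then separately check for records deleted at the source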

                     

                    If you really want this to be done optimally and continuously, you might check out 360Works MirrorSync.

                    • 7. Re: SQL statement advice
                      David Moyer

                      Dude, evaluating the modification date is a great point. Limit the record set.

                      • 8. Re: SQL statement advice
                        kiwikaty

                        Thanks again for the replies. No, I am afraid that checking mod dates does not help. File1 is imported from another system each night and it does not contain the modification dates from that system. File1 is the master data for a number of fields in File2, but not all, as that table contains a lot of locally collected data as well. It appears to be the volume of the data that is causing the headaches. I had read that performing a find with SQL was a lot quicker, but I guess if fields are un-indexed then they are slow (sometimes impossible) to find on, whether you use a SQL statement or a native find. File1 contains 356,000 records (which are imported from File0 each night); File2 contains 807,000 records.

                        Even if I use indexed fields in a relate and find on an indexed related field from File1 to File2, it basically spends a very long time "processing query". Things would not be so bad if I were not using a multi-predicate join, but there is no single-predicate join that will give 1-to-1. I could make a relate using combined keys, but then I am adding another field that needs to be "calculated" each night as the file0 comes in, and it would also need a matching indexed field in file2.
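
                        (By a combined key I mean something like a stored, indexed calculation in both files, built from made-up field names here:

                        id & "|" & FieldA & "|" & FieldB & "|" & FieldC

                        which would have to be re-calculated in file1 every night as file0 comes in, and matched by an equivalent indexed field in file2.)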

                        • 9. Re: SQL statement advice
                          taylorsharpe

                          The simple answer is that you need to work with the people sending you the data. Either have them cull the list down based on modification timestamps, or have them send you the data with creation and modification timestamps. If they send a culled-down list, make sure they include the ones that need to be deleted.

                           

                          SQL in FileMaker rarely performs any faster than a regular FileMaker find. Sometimes it does on things like counts, but don't assume SQL in FM makes things go faster; sometimes it even goes slower. It is a tool that expands your options for gathering data, but it is not a performance tool. Yes, we all thought it would be a performance tool when it came out, but we discovered otherwise.

                           

                          You could import into a duplicate table with no indexes (that import will be quick), then loop through comparing those records to the main database, updating only the fields that have changed and, most importantly, not setting fields that have not changed, because that slows things down.
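
                          A sketch of that per-field update (assuming a Staging table occurrence related to the main table by id; names are placeholders):

                          # only touch a field when it actually changed
                          If [ Main::FieldA ≠ Staging::FieldA ]
                            Set Field [ Main::FieldA ; Staging::FieldA ]
                          End If
                          If [ Main::FieldB ≠ Staging::FieldB ]
                            Set Field [ Main::FieldB ; Staging::FieldB ]
                          End If
                          # ...and so on for the remaining fields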

                           

                          I still recommend you get with the owner of the external data and request they include creation and modification timestamps.  That would be the best solution. 

                          • 10. Re: SQL statement advice
                            philmodjunk

                            Another way to compare records (or multiple fields):

                             

                            GetContainerAttribute ( List ( field1 ; field2 ; field3 /* etc. */ ) ; "MD5" )

                             

                            Do this for both records and compare the two values to see whether or not they are the same.
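
                            For instance (a sketch; the field names are placeholders), a stored calculation field with a text result could be defined the same way in both file1 and file2:

                            // cRowHash, calculation, text result
                            GetContainerAttribute ( List ( FieldA ; FieldB ; FieldC /* ...the 10 compared fields */ ) ; "MD5" )

                            Then each record pair needs only one test instead of ten:

                            file1::cRowHash ≠ file2::cRowHash

                            If all of the compared fields are local and stored, the hash itself should be storable and indexable, which would also give you a single-predicate match field.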