1 2 3 Previous Next 35 Replies Latest reply on Jul 12, 2017 10:36 AM by beverly

    JOINS in ExecuteSQL

    taylorsharpe

      FileMaker's SQL process can be very fast for straight forward SELECTS, but when you add JOINS, it can become very slow.  And how you do things can make a big difference.

       

      For example, doing a SELECT based on the larger table in a JOIN is better than basing it on the smaller table:

       

       

      Based on the Smaller Table:

       

      Let ( [

       

      F1 = Get ( CurrentTimeUTCMilliseconds ) ;

      F2 = "SELECT

             TSUD.\"Taxpayer Name\",

             TSUD.\"NAICS Code\"

        FROM

             NAICS

        LEFT JOIN

             TxSalesUseDatabase AS TSUD

        ON

             NAICS.NAICS = TSUD.\"NAICS Code\"

        WHERE

             TSUD.\"Location Zip Code\" = ? AND

             NAICS.NAICS LIKE ?" ;

      F3 = ExecuteSQL ( F2 ; " | " ; ¶ ; 75209 ; "4%" ) ;

      F4 = Get ( CurrentTimeUTCMilliseconds ) ;

      F5 = ( F4 - F1 ) / 1000

       

      ] ;

       

      F5

       

      )

       

      RESULT:

      11.866  <====== Seconds

       

       

      Based on Larger Table

       

      Let ( [

       

      F1 = Get ( CurrentTimeUTCMilliseconds ) ;

      F2 = "SELECT

             TSUD.\"Taxpayer Name\",

             TSUD.\"NAICS Code\"

        FROM

             TxSalesUseDatabase AS TSUD

        LEFT JOIN

             NAICS

        ON

             NAICS.NAICS = TSUD.\"NAICS Code\"

        WHERE

             TSUD.\"Location Zip Code\" = ? AND

             NAICS.NAICS LIKE ?" ;

      F3 = ExecuteSQL ( F2 ; " | " ; ¶ ; 75209 ; "4%" ) ;

      F4 = Get ( CurrentTimeUTCMilliseconds ) ;

      F5 = ( F4 - F1 ) / 1000

       

      ] ;

       

      F5

       

      )

       

      RESULT:

      .540  <====== ONLY 5 Tenths of a Second

       

       

       

      NOW FOR THE SURPRISE!

       

      If you don't use a JOIN and you instead do TWO SQL's and an "IN" Statement, you get fast performance:

       

      Let ( [

       

      F1 = Get ( CurrentTimeUTCMilliseconds ) ;

      F2 = "SELECT

             \"NAICS\"

        FROM

             NAICS

        WHERE

             \"NAICS\" LIKE ?" ;

      F3 = ExecuteSQL ( F2 ; ¶ ; "', '" ; "4%" ) ;

       

      F4 = "SELECT

             \"Taxpayer Name\",

             \"NAICS Code\"

        FROM

             TxSalesUseDatabase

        WHERE

             \"Location Zip Code\" = ? and

             \"NAICS Code\" IN ( '" & F3 & "' )" ;

      F5 = ExecuteSQL ( F4 ; " | " ; ¶ ; 75209 ) ;

       

      F6 = Get ( CurrentTimeUTCMilliseconds ) ;

      F7 = ( F6 - F1 ) / 1000

       

      ] ;

       

      F7

       

      )

       

      RESULT:

      .634   <===== 6 Tenths of a Second!!!

       

       

      CONCLUSION:

       

      Really try to avoid JOINs and if you can do it with multiple SQL's and an IN statement, it probably is faster than a Join, especially if it is based on the Smaller of the two tables in a JOIN!

       

       

      FYI, the Large table has 739364 records and the small table has 3195 records.

        • 1. Re: JOINS in ExecuteSQL
          fmpdude

          That's amazing!

           

          That's good to know. What you're describing seems like a SQL engine query optimization / indexing problem.

           

          -----

           

          With MySQL, for example, queries using joins are extremely fast especially if you always remember to create an index on the joined field. I've had similar issues where a query would take 8 seconds in FMP but only 0.15 seconds in MySQL. Same query, not "that" much data (500,000 rows).

           

          If you have data you can post, I'll try to do your query in MySQL for an objective comparison.

          • 2. Re: JOINS in ExecuteSQL
            monkeybreadsoftware

            We also run into JOIN speed issues.

             

            I just made an idea to improve:

            Improve SQL speed, especially for joins

            1 of 1 people found this helpful
            • 3. Re: JOINS in ExecuteSQL
              taylorsharpe

              Thanks Christian and fmpdude!  Yes, it is amazing how much faster these SQL calls with joins can be in MS SQL or MySQL.  So obviously there is something under the hood that can be done to make it go faster.  I went and voted for Christian's Improve SQL idea and sure hope something can be done.  Some things in ExecuteSQL can be real fast such as simple SELECT statements.  But as soon as you get complicated, it really slows down exponentially and in comparison to other SQL engines. 

              1 of 1 people found this helpful
              • 4. Re: JOINS in ExecuteSQL
                philipHPG

                I'm curious whether the bigger factor in the performance difference is the ordering of the tables when using the LIKE operator in the WHERE clause. LIKE is known to be a heavy operator. So, if you weren't using a LIKE operator in the WHERE statement, would you still see the same difference when re-ordering the tables (or using an IN statement)?

                 

                Or, another way of looking at it, is the more significant issue not whether the bigger or smaller table is listed first, but whether the table that is involved in the LIKE operation is listed first?

                 

                Of course, FileMaker doesn't give much indication on how they optimize so sometimes it is only by testing that such things are discovered.

                • 5. Re: JOINS in ExecuteSQL
                  philmodjunk

                  For example, doing a SELECT based on the larger table in a JOIN is better than basing it on the smaller table:

                  Note that this is exactly the same advice that we give for performing a regular "find" in FileMaker--go to the layout on the many side of the relationship and perform the find there, then, if necessary GTRR back to a layout based on the "one" side.

                   

                  As I learned at DEVCON last year, FileMaker's DB engine "translates" the SQL into regular FileMaker database operations--and thus is not likely to be any faster than an equivalent process created via other FileMaker DB actions.

                  1 of 1 people found this helpful
                  • 6. Re: JOINS in ExecuteSQL
                    monkeybreadsoftware

                    See also the thread here:

                    Re: Querying fields on server slow via SQL

                     

                    FileMaker_Tables and FileMaker_Fields can be queried in a test each with < 1 second time.

                    But the same database with a JOIN query to match table names takes 20 seconds.

                    1 of 1 people found this helpful
                    • 7. Re: JOINS in ExecuteSQL
                      CICT

                      I'm curious whether the bigger factor in the performance difference is the ordering of the tables when using the LIKE operator in the WHERE clause

                      Not a straight answer to your question, but certainly specifying the starting SELECT table in relation to the JOIN tables does affect performance. Sometimes the starting table chosen can be very unFileMaker like in terms of the search logic and doesn't seem to make sense, but you can increase the speed at times taking this approach.

                       

                      Andy

                      • 8. Re: JOINS in ExecuteSQL
                        beverly

                        This is awesome, Taylor!

                        Did you try the "implicit join"?

                         

                        " SELECT TSUD.\"Taxpayer Name\", TSUD.\"NAICS Code\"

                        FROM NAICS, TxSalesUseDatabase AS TSUD

                        WHERE NAICS.NAICS = TSUD.\"NAICS Code\"

                            AND TSUD.\"Location Zip Code\" = ?

                            AND NAICS.NAICS LIKE ? "

                         

                        I'd be curious to see if you get different result (better? worse?) than with the 'JOIN...ON' or the 'IN()'

                         

                        Since this might be more like the 'joins' on the RG (relationship graph), perhaps there is not as much translation into "FMP-speak"?

                        Beverly

                        • 9. Re: JOINS in ExecuteSQL
                          taylorsharpe

                          beverly wrote:

                           

                          This is awesome, Taylor!

                          Did you try the "implicit join"?

                           

                          " SELECT TSUD.\"Taxpayer Name\", TSUD.\"NAICS Code\"

                          FROM NAICS, TxSalesUseDatabase AS TSUD

                          WHERE NAICS.NAICS = TSUD.\"NAICS Code\"

                          AND TSUD.\"Location Zip Code\" = ?

                          AND NAICS.NAICS LIKE ? "

                           

                          I'd be curious to see if you get different result (better? worse?) than with the 'JOIN...ON' or the 'IN()'

                           

                          Since this might be more like the 'joins' on the RG (relationship graph), perhaps there is not as much translation into "FMP-speak"?

                          Beverly

                           

                          12.575 seconds

                          • 10. Re: JOINS in ExecuteSQL
                            fmpdude

                            12.6 seconds? Bug. Period.

                             

                            If you have this data in a non-proprietary format, I'd love to do the exact same query in MySQL or Oracle. Based on my own results testing, these other environments' results would be surprisingly faster.

                             

                            My own SQL results with a single table and a simple LIKE query are 8 seconds in FMP and 0.15 seconds in MySQL. The percent difference means FMP can't be used for those queries in any production application.

                             

                            Bug. (Or toy.)

                             

                            Very sad that things are this slow in FMP and we need to "search" for ways around performance problems that simply do not exist in other environments for the queries I've tested.

                            • 11. Re: JOINS in ExecuteSQL
                              taylorsharpe

                              Yes, fmpdude, it is unfortunately slow particularly when you get to joins.  Obviously it can be done much faster in non-proprietary systems.  If you want, this is a file of Sales and Use Tax permit holders from the State of Texas that I get through an Open Records request.  So the data is free.  I would be glad to give you the file to play with it and compare if you want to load it on to other databases for comparison. 

                               

                              There are some wonderful things about the Draco engine, but there seems to be some limitations that I certainly do not understand such as these. 

                               

                              Send me an email and I'll get you the file. Or if someone else wants to make comparisons, feel free to contact me.  It's a good file to play with since it is all public data and it has a lot of info. 

                               

                              Taylor@TaylorMadeServices.com

                              • 12. Re: JOINS in ExecuteSQL
                                fmpdude

                                Will send you an email shortly.

                                 

                                Thanks Taylor.

                                • 13. Re: JOINS in ExecuteSQL
                                  okramis

                                  This matches exactly my findings.

                                  How does it perform, if you put the first query as a sub-query inside the IN-clause like this?

                                   

                                  Let ( [

                                   

                                  F1 = Get ( CurrentTimeUTCMilliseconds ) ;

                                  F2 = "SELECT

                                         \"Taxpayer Name\",

                                         \"NAICS Code\"

                                    FROM

                                         TxSalesUseDatabase

                                    WHERE

                                         \"Location Zip Code\" = ? and

                                         \"NAICS Code\" IN (SELECT

                                                                              \"NAICS\"

                                                                        FROM

                                                                              NAICS

                                                                        WHERE

                                                                              \"NAICS\" LIKE ?)" ;

                                  F3 = ExecuteSQL ( F2 ; " | " ; ¶ ; 75209 ; 4% ) ;

                                   

                                  F4 = Get ( CurrentTimeUTCMilliseconds ) ;

                                  F5 = ( F4 - F1 ) / 1000

                                   

                                  ] ;

                                   

                                  F5

                                   

                                  )

                                  1 of 1 people found this helpful
                                  • 14. Re: JOINS in ExecuteSQL
                                    fmpdude

                                    Tried to send you an email from my ProtonMail account, but got a rejection (first ever from ProtonMail):

                                     

                                    Final-Recipient: rfc822; taylor@taylormadeservices.com Original-Recipient: rfc822;taylor@taylormadeservices.com

                                     

                                    Action: failed Status: 5.7.1 Remote-MTA: dns; mail

                                     

                                    Diagnostic-Code: smtp; 554 5.7.1 : Relay access denied

                                    1 2 3 Previous Next