10 Replies Latest reply on May 7, 2013 11:07 AM by kaostika

    SQL Join Performance questions

    kaostika

      Hi,

       

      Lets say I have 5 tables. I want to recreate the Data Model in SQL with JOINs.

       

      I then want to get the result in a text field. So rather than re write the JOIN portion of the sql each time I make a new query can I write it once and use it for all my queries. Even if I dont use the fields from the tables referenced in the JOIN.

       

      For Example

       

      I JOIN invoices, Line items, Contacts, Address, and Prodcuts. I run SQL query but I only SELECT the fields from Products and Contacts.

       

      I want to keep all the JOIN text. Will this slow things down? Or is this just a bad practice. I am trying to write the JOIN once to recreate my data model and leave it. Maybe put the text into a CF

       

      Oreste

        • 1. Re: SQL Join Performance questions
          taylorsharpe

          Your question is not simple.  How a database performs depends on many things.  You can store your results in an indexed field and those results will be searchable very fast.  On the other side of things, the storage and indexing of the additional field takes up space and any time you import or make changes to the field, the process of updating the index slows things down a lot.  Note that the join is not a live continuous result unless it is an unstored calculation, which will make things go a lot slower and not be searchable very fast.  The question will be whether using SQL is faster or regular relationship if faster.  Many people seem to be under the impression that a SQL calculation is faster than a normal FileMaker calculation and that rarely is true.  When it is a benefit is when you can avoid making additional relationships that may only be needed for one purpose.  I love Execute SQL, but it has not been the performance booster that many people have incorrectly assumed. 

          • 2. Re: SQL Join Performance questions
            mikebeargie

            I love Execute SQL, but it has not been the performance booster that many people have incorrectly assumed. 

            YMMV (your mileage may vary)!

             

            There are numerous uses for ExecuteSQL() that "boost performance". Using it as a substitute for some calculations may not be one of them. If you assume that every time you use it, your performance will magically increase, then yes, you're wrong. The power in ExecuteSQL() lies in the ability to gather data in a way that was faster than the tools we had previously available to us.

             

            Say for instance I was calculating a virtual list report of two tables, where in 11 I had to loop through each record in the first table, going to the second table, performing a find, setting my value, returning, going to the next record and repeating. Now I never need to leave the first table, saving me exponential amounts of script steps (~4 per record). Better yet, why even bother looping at all? With ExecuteSQL() I can just do a join and return the data in an array that I can specify and parse the format for. If I ran a report in the above instance on 1,000 records, I'd save ~4,000 script steps. What happens when I have 10,000 records? 50,000? I would call that a heck of a performance boost.

             

            It's gotta be boosting performance somewhere from all the hype I've heard from the upper eschelons of developers!

            • 3. Re: SQL Join Performance questions
              kaostika

              I guess I didnt explain my self properly.  I am not looking at SQl as compared to other methods. 

               

              Lets say I have 10 reports and I use some sort of Virual List.

               

              I in some reports I need some fields from table A and in others Table B.  My JOIN that I write has the JOINs from Table A, B, C, D, E .../   The question I have is if I should keep the Extra JOIN in the SELECT even if I am not using them at that time.

              • 4. Re: SQL Join Performance questions
                taylorsharpe

                The boost in performance is found in finding new ways to use SQL to avoid things like looping scripts.  But it isn't natively doing a find or sort any faster in ExecuteSQL than in a regular FileMaker find or sort.  It is a tool that lets us skip schema and approach problems differently, which is great.  But some people think that just because you use SQL, it is faster.  If you do the same thing in SQL that you do in FileMaker normally, such as a simple find and sort in the same table, then you will not see a performance difference.  I have to admit when I first heard about ExecuteSQL, I thought we would be able to use it to have much faster sorts like I can get in Oracle or MS SQL Server, but that is not true.  Still, I find ExecuteSQL very beneficial and use it frequently in my current solutions to avoid loops and address tables that are not related.  In this manner, it makes the soution faster. 

                • 5. Re: SQL Join Performance questions
                  Mike_Mitchell

                  Oreste -

                   

                  Your question is basically, "Can I save myself some development time by using the same query in different places, even if it might cost me some performance lag in the solution?"

                   

                  The answer is, "Depends."    

                   

                  Essentially, your extra JOIN clauses are extra processing the database engine has to perform. It's also extra data that have to be cached. That will hurt performance. How much? That will be highly variable, depending on how many records are in each table, bandwidth, etc.

                   

                  The flip side is, how much time (and client money) do you spend on optimizing your SQL queries? What's the ROI? We all try to standardize code to a certain degree because customizing every single operation every time is just cost-prohibitive. So there's always a balance.

                   

                  My suggestion would be to construct a calculation for your query in such a way that you can comment out parts of it that you don't need for the particular operation. That way, you can retain the bulk of the calculation and turn off the parts that aren't needed. I don't know if that's feasible in your particular case, but it might give you the best of both worlds.

                   

                  The other bit of advice would be ... test it and see. Load up your development instance with lots of data, try it both ways (optimized and standardized) and see how much of a performance hit you suffer. The view might not be worth the climb (considering, of course, the final installation conditions, factoring in the number of users, network, disk speed, etc.).

                   

                  HTH

                   

                  Mike

                  • 6. Re: SQL Join Performance questions
                    kaostika

                    Mike, you got the question partially right.  Yes I am saving my self time by putting in the entire join.  But lets look at it from the client perspective.  The concept of a JOIN is foreign to them.  So lets say I make a layout so they can generate their own reports.  They only understand fields and maybe tables.  In this case I need to make the JOIN "permanant" 

                     

                    They just pick the fields they want to see and Viola!

                    • 7. Re: SQL Join Performance questions
                      Mike_Mitchell

                      In that case, I'm not entirely sure what your question is. You're building a dynamic report tool. Any field the user wants has to be available at all times. So your question would be what - do I attempt to build a querying tool that can generate any query on the fly based on a list of fields (including JOIN clauses), or do I flatten the database in advance so the user can pick whatever columns he wants from a flattened data model?

                       

                      I know what my answer would be, because there's no way I'm smart enough do to the former.       

                       

                      Mike

                      • 8. Re: SQL Join Performance questions
                        kaostika

                        Let me rephrase the question.After years of being a 'prisoner' of the graph and of context I am now sudddenly free. While I know the language of this new found freedom, I dont understand the consequences. 

                         

                        Well there are a few issues.

                        First, the SQL is new, and I need to bullet proof my work.  So making the JOINs once helps with that. 

                         

                        Second.  I Dont know in advance what they may need, So if the JOIN is already mapped out I wont have to dig around to find out what I did.  I may have to come back after several months.  I can Represent the data model in SQL, Or in other words replicate the relationship graph. 

                         

                        Third, Lets say I have a SUM (invoicelineitems::Price )   calculation in an invoice layout.  I know that there is a relationship.  I know that the SUM if it is on a layout will cause Filemaker to render that calc.  There for it will slow things down.  However in the Filemaker way of doing things I dont have to create a seperate graph for each report. For Instance I dont want to show the price for what ever reason the Layout will not slow down to calculate all the SUM's.   If the SUM calc is not being used on a report I dont need to modify the graph.

                         

                        Now with SQL I am using the same Logic.  The Relationship Graph = a JOIN in SQL.  But if I have the SUM in the SELECT statement (or the appropriate SQL syntax ), will the same slow down happen  So In filemaker the having the relationship only afftect things that are SHOWN.  Does the field in the SELECT statement behave like it is being SHOWN

                         

                        To summarize.  I have one Graph that represents the Date model in Filemaker.  The performance is affect by things being SHOWN.

                         

                        I have one set of JOIN statements in a SQL query, Does the perfomance change with the field being in the SELECT statement

                         

                        Oreste

                        • 9. Re: SQL Join Performance questions
                          Mike_Mitchell

                          I'm not an expert on this, but as I understand it, the ExecuteSQL function executes when it's required and, when it evaluates, it evaluates. Its result will be returned to whatever to return it to. So it really doesn't matter if some part of it is "shown" or not; in order for it to execute, it has to execute.

                           

                          I don't think you can "partially execute" a SQL query. It's not analogous to the Relationships Graph.

                           

                          If someone knows different, please explain.

                           

                          Mike

                          • 10. Re: SQL Join Performance questions
                            kaostika

                            Yes, we cannot  partially execute the SQL, the question I guess we should ask is what ever is going on behind the hood, doing the calculations if there are not being returned.

                             

                            I will do some tests

                             

                            Oreste