11 Replies Latest reply on Nov 23, 2013 11:24 AM by philmodjunk

    Graphing the 5 most frequent items

    NaturSalus

      Title

      Graphing the 5 most frequent items

      Post

           Hello,

           What I am trying to achieve is to set up a horinzontal bar chart dynamically, so that the graph gets its values from the 5 most frequent items.

           So the first question is whether this is feasable using FM.

           If the answer is yes, then I have to figure out how to identify the 5 most frequent items.

           So let's imagine that I have the Question table that keeps track of questions asked.

      Question table fields:

             
      •           __kp_Question (number, auto-enter)
      •      
      •           QuestionSubject (text)
      •      
      •           TotalCountSubject (summary = count of QuestionSubject)

            

           The Question::QuestionSubject field gets its values from the Subject value list with the following options:

             
      •           Product
      •      
      •           Service
      •      
      •           Cost
      •      
      •           Payment
      •      
      •           Delivery
      •      
      •           Refunds
      •      
      •           Support
      •      
      •           etc

            

            

           Since I can't use a summary field to sort finds, I wonder what is the way to have listed the 5 most popular questions.

            

           Thanks

            

        • 1. Re: Graphing the 5 most frequent items
          philmodjunk
               

                    Since I can't use a summary field to sort finds,

               But you can. Check out the re-order by summary field option at the bottom of the Sort Records dialog.

               But that might not be the most efficient method.

               Let's see if I can rephrase the question to make sure that I understand all the details:

               You want to chart the number of records in Question for the 5 most commonly selected values in QuestionSubject such that each bar charts the count for one of these "top 5" values.

               There are a number of ways to get these totals as a delimited list of values. ExecuteSQL is pretty much made to order to produce such a list but I seem to recall that you are still using FileMaker 11. This is still possible in v11 but the implementation details are quite different and more complex.

               So do you need a method that works in FileMaker 11?

          • 2. Re: Graphing the 5 most frequent items
            NaturSalus

                 Hello Phil,

                 Thanks for looking into my question.

                  

            You want to chart the number of records in Question for the 5 most commonly selected values in QuestionSubject such that each bar charts the count for one of these "top 5" values.

                 Right.

                 For the following values:

                 Product has 25 records
                 Service has 15 records
                 Cost has 10 records
                 Payment has 60 records
                 Delivery has 70 records
                 Refunds has 95 records
                 Support has 110 records
                  
                 The horizontal Bar graph would look like the following:
                  
                 Support -------------------------------------------------------------
                 Refunds -----------------------------------------------------------
                 Delivery -----------------------------------------------------
                 Payment ------------------------------------------------
                 Product -----------------------------------
                  
                     # of questions
                  
            but I seem to recall that you are still using FileMaker 11.
                  
                 Good memory, but I made the transition FM 11 to FM 12 I guess 6 months ago. The reason being that all my projects made under FM 11 turned out corrupted and that prevented me from importing directly. I discovered that when you reassured me that it was possible to import directly from version B the data of version A of the same project. So I remade all my projects from scratch using FM 12 and I must say that it is not so bad and that has some good things, although there is a long list of things that needs improving.
                 Hoping that FM 13 fixes most issues and doesn't create more annoying ones.
                  
            There are a number of ways to get these totals as a delimited list of values. ExecuteSQL is pretty much made to order to produce such a list
                  
            I am willing to learn new tricks...
                  
            Thanks,
                  
                  
                  
            • 3. Re: Graphing the 5 most frequent items
              philmodjunk

                   Take a look at this "summary recap" example for using ExecuteSQL. It can be modified to produce such a list of all 7 subtotals. If you put that executeSQL function call inside a LeftValues function, you can limit the resulting list to just the first 5 values.

              FMP 12 Tip: Summary Recaps (Portal Subtotals)

              • 4. Re: Graphing the 5 most frequent items
                NaturSalus

                     Phil,

                      

                     Thanks for the tip and the demo.

                      

                • 5. Re: Graphing the 5 most frequent items
                  NaturSalus

                       It seems I don't quite understand SQL logic.

                       The first problem I have run to understand your demo is that I wasn't able to see the raw data since there is just one record.

                        

                       Okay in your demo there are two ExecuteSQL calculation fields. I would say that the one that is related to my case is the following:

                  cSummaryRecap (calculation unstored =

                       ExecuteSQL ( "SELECT txtCategory, Sum ( numbValue ) FROM Data GROUP BY txtCategory ORDER BY 2 DESC" ; Char (9) ; "" )

                        

                       In your example the Data table contains 3 fields whose values are entered by the user:

                         
                  •           txCategory for the food category. Values entered in the demo: apple, kiwi, orange, strawberry, 
                  •      
                  •           numValue field I gues fo sales in units
                  •      
                  •           dteDate field to record the sale date

                        

                       so if I understood it right:

                       the ExecuteSQL function selects from the txtCategory field the value of the numbValue field and adds it up grouping it by category and then displaying the values in descending order using as criteria  Sum ( numbValue )

                       Although I have read both the "The Missing FM 12 ExecuteSQL Reference" and  "FileMaker 12 ODBC and JDBC Guide" I haven't been able to understand the meaning of the Char (9) part of the expression.

                        

                       Now in my case,

                       the criteria I am using to get summaries is QuestionSubject

                       the field that is beeing upded up is __kp_Question

                        

                       So I guess the ExecuteSQL function in my case would be something like:

                       ExecuteSQL ( "SELECT QuestionSubject, Sum (  Found ) FROM Question GROUP BY QuestionSubject ORDER BY 2 DESC" ; Char (9) ; "" )

                       I had to create the Found field (calculation unstored = Get (Found Count); number)

                        Alternatively, it could be: 

                       ExecuteSQL ( "SELECT QuestionSubject, Count__kp_Question ) FROM Question GROUP BY QuestionSubject ORDER BY 2 DESC" ; Char (9) ; "" )

                        

                       Is that correct?

                  • 6. Re: Graphing the 5 most frequent items
                    philmodjunk
                         

                              The first problem I have run to understand your demo is that I wasn't able to see the raw data since there is just one record.

                         There are 8 records in the table, just enough to demonstrate the results produced by the ExecuteSQL query.

                         

                              I haven't been able to understand the meaning of the Char (9) part of the expression.

                         Char ( 9 ) returns the tab character. The function puts this character between each field value returned by the function. Your query will not need that character, but in the demo it's what allows me to use appearance settings to set a tab stop on the "recap" field at the bottom of the report such that the number subtotals are right justified to the right hand edge of the field.

                         The example that I posted is for a slightly different purpose. It needs both the category name and the subtotal in each row of data shown in the field. For you Y-series of delimited data, you only need the subtotals so you can use:

                         LeftValues ( ExecuteSQL ( "SELECT Count__kp_Question ) FROM Data GROUP BY QuestionSubject ORDER BY 1 DESC" ; "" ; ) ; 5 )

                         You'll need a different list of values for the x-series labels and that will require a different query.

                    • 7. Re: Graphing the 5 most frequent items
                      NaturSalus

                           Hello Phil,

                           Thanks for correcting the ExecuteSQL function, now I am begining to get familiarized with it.

                            

                           I have noticed that in you answer you have remove the QuestionSubject field after the SELECT command :

                           LeftValues ( ExecuteSQL ( "SELECT Count ( __kp_Question ) FROM Question GROUP BY QuestionSubject ORDER BY 1 DESC" ; "" ; ¶ ) ; 5 )

                           instead of :

                            LeftValues (ExecuteSQL ( "SELECT QuestionSubject, Count ( __kp_Question ) FROM Question GROUP BY QuestionSubject ORDER BY 1 DESC" ; ""; ); 5)

                           is this so or is a misspelling error?

                            

                      Line Graph instead of Horizontal bar Graph

                           Since my goal is  to graph the number of questions of the 5 most popular subjects per quarter I think that a line graph is more suited than a horizontal bar graph as I originally suggested. I have uploaded an image to make it clear.

                           Quarters (Q1, Q2, Q3, Q4) in the X-axis vs Number of questions per subject  in the Y-axis

                            Therefore, on:

                             
                      •           the X axis I should have the totals per quarter (Q1, Q2, Q3, Q4).
                      •      
                      •           the Y axis I should have five lines, one for each of the top questions. Each line would have 5 points, one per quarter.
                            
                            
                      Data series for the y-axis
                            
                           Correct me if I am wrong, could we generate individually one by one each of the 5 series of data for the Y-axis by modifying in some way  the LeftValues(ExecuteSQL) function ?
                            
                           Could the following be used to extract each one of the 5 series of data:
                            
                           GetValue (LeftValues(ExecuteSQL); 1) to get the y- axis data series for the subject / question most frequently asked
                           GetValue (LeftValues(ExecuteSQL); 2) to get the y- axis data series for the second most frequently asked subject / question
                           GetValue (LeftValues(ExecuteSQL); 3) to get the y- axis data series for the third most frequently asked subject / question
                           GetValue (LeftValues(ExecuteSQL); 4) to get the y- axis data series for the fourth most frequently asked subject / question
                           GetValue (LeftValues(ExecuteSQL); 5) to get the y- axis data series for the fifth most frequently asked subject / question
                            
                            
                      Data series for the x-axis
                            
                            
                                Since the  data for the X-series need to be calculated I have created some new fields in the Question table hoping that they would be useful. So the fields of the Question table are the following:
                                 
                                __kp_Question (number, auto-enter)
                                QuestionSubject (text)
                                QuestionReceptionDate 
                                Found field (calculation unstored = Get (Found Count); number)
                                cQuarterNumber (calculation =Ceiling ( Month ( QuestionReceptionDate ) / 3 )
                                TotalCountQuarter (summary = count of cQuarterNumber)
                                TotalCountSubject (summary = count of QuestionSubject)
                            
                            
                           I wonder whether the following SQLExecute function would provide the values for the x-axis.
                            
                           ExecuteSQL ( "SELECT cQuarterNumber, Count ( cQuarterNumber) FROM Question GROUP BY cQuarterNumber ORDER BY 1 DESC" ; ""; )
                            
                            
                           and to this I would apply the LeftValues function, so that in the end the calculation for the x-axis would be:
                            
                           LeftValues ( ExecuteSQL ( "SELECT cQuarterNumber, Count (cQuarterNumber, ) FROM Question GROUP BY cQuarterNumber, ORDER BY 1 DESC" ; ¶ ); 5 )
                            
                           If this is incorrect what would be the correct setting?
                                 
                            
                           Thanks
                      • 8. Re: Graphing the 5 most frequent items
                        philmodjunk
                             

                                  is this so or is a misspelling error?

                             It is not an error. You only want the group counts separated by return in the results. You'd use a second query for the Y-axis that uses the other field to produce the labels for the x-axis.

                             The need to show totals for each quarter throws a really wicked curve at this that shipwrecks the current approach. Here's why:

                             The groups need to be calculated from groups produced by grouping on two fields: Quarter and question. But the Order of these groups has to be determined based on counts produced by grouping only by the question. Otherwise, the order of the groups will be different for each quarter and you won't be able to align the data points to the same series of labels.

                             I'm still mulling that issue over....

                        • 9. Re: Graphing the 5 most frequent items
                          NaturSalus

                               Hello Phil,

                          The need to show totals for each quarter throws a really wicked curve at this that shipwrecks the current approach. 

                          Okay I understand what you say from "the point of view" of the ExecuteSQL function.

                          I guess that the problem is related to the need to match data series from the y-axis with quarterly totals from data series for the x-axis, and this has nothing to do with the type of the type of graph chosen.

                          Am I right on this point?

                                

                                

                                

                          • 10. Re: Graphing the 5 most frequent items
                            philmodjunk

                                 We are on the same page.

                                 I have some very fuzzy ideas as to a solution for putting the data on one graph, but they are pretty hairy and I'll need to sit down and do some major cogitating to see if they look possible enough to make happen. I think we'll need to come up with some kind of "two stage" approach where stage one determines both the subjects that make up the "top 5" and the order in which they should be sorted. Stage two would be to produce the needed data series in an order matching that determined by stage one.

                                 I may end up using a script to load values into text fields before that all works like we need it.

                            • 11. Re: Graphing the 5 most frequent items
                              philmodjunk

                                   I've figured out a nonSQL way to get these charts.

                                   Define 20 calculation fields as:

                                   First field: GetValue ( gSubjectList ; 1 ) = QuestionSubject And cQuarterNumber = 1
                                   Second Field: GetValue ( gSubjectList ; 2 ) = QuestionSubject And cQuarterNumber = 1

                                   and so forth, for all combinations of quarters 1-4 and values 1-5

                                   gSubjectList is a global field to be populated via a script.

                                   Define 20 Summary fields to compute totals of each calculation field.

                                   Define one more summary field, sOverallCounts that is just the "count of" a never empty field.

                                   Use a script to perform a find to find all records for the specified year and to sort them by QuestionSubject with the re-order option used to re-order them by sOverallCounts. Then have the script loop through the records starting from the top, building the return separated list of values in gSubjectList. Have the loop set to exit on the first record that is not part of the top 5 groups. Then use Omit Multiple Records to omit the records not part of the top 5 from your found set.

                                   In your chart, base it on a layout for this table and specify Y-series using a LIst function to List a different group of 5 summary fields. Use gSubJect List for your X axis labels. Use a delimited data data source option for the chart.

                                   Complex, and there are probably other ways to get this chart, but the above should work.