1 2 Previous Next 24 Replies Latest reply on Feb 26, 2015 1:50 AM by nicolai

    SQL: How to count distinct values with field containing multiple values

    openspace

      Hello is there a SQL calc that can count a list of distinct values where each field can have one or more values separated by a ¶?

       

      Right now I'm using: ExecuteSQL ("SELECT COUNT (DISTINCT Field) FROM Table" ; "" ; "")

      but this only counts the first value of a field that contains two values giving me an incorrect result (example below would incorrectly result: 4)

       

      Example:

      a¶b

      a¶b

      c

      d

      d

      e¶f

      e

       

      I would like the result to display a count of =6

       

        • 1. Re: SQL: How to count distinct values with field containing multiple values
          mardikennedy

          Does it have to be SQL?  Wouldn't ValueCount (ValueListItems (Get ( FileName ); "Values" )) achieve the same thing? (with a Refresh)

           

          Rgds, Mardi

          • 2. Re: SQL: How to count distinct values with field containing multiple values
            openspace

            the main reason I was using sql was because I was pulling information from tables that are not always directly related with fields that are not global. I'm pulling these stats to create a dashboard layout.

            • 3. Re: SQL: How to count distinct values with field containing multiple values
              mardikennedy

              Fair enough.  Note that the VL here is based on 'Values from field'.

              • 4. Re: SQL: How to count distinct values with field containing multiple values
                erolst

                openspace wrote:

                Right now I'm using: ExecuteSQL ("SELECT COUNT (DISTINCT Field) FROM Table" ; "" ; "")

                 

                Example:

                a¶b

                a¶b

                c

                d

                d

                e¶f

                e

                 

                I would like the result to display a count of =6

                The result you're after would actually be 5 (a¶b, c, d, e¶f, e).

                 

                You could either use a calculated field that substitutes ¶ with some other character and count that – or, preferably, use a better data structure to begin with.

                • 5. Re: SQL: How to count distinct values with field containing multiple values
                  nicolai

                  erolstis correct, for the ExecuteSQL DISTINCT and COUNT the following values

                   

                  a¶b

                  c


                  are the two distinct values and not three. And so is the next one:


                  b¶b

                  b


                  will give you COUNT result of two and DISTINCT will return both


                  openspace  is trying to get a count of (a,b, c, d, e,f) from her example. There was a recent discussion on this topic, sorry, I can't find it. No one came up with the full solution in FQL as it has some limitations.  The solution was to get a list as the result of the ExecuteSQL, use  a custom function to de-duplicate it and use Value count on the resulting list.


                  There are a number of custom functions, I will use http://www.briandunning.com/cf/596 for example solution

                   

                  ValueCount (

                       UniqueValues(

                                 ExecuteSQL ( "SELECT Field FROM Table" ; "" ; "" )

                       )

                  )

                   

                  I still second erolist with advise that you need to look into your data structure.

                  • 6. Re: SQL: How to count distinct values with field containing multiple values
                    rrrichie

                    For this reason I create a separate "Dashboard" TO and create new TO's to the existing tables just for the dashboard use.

                     

                    The idea is the the Dasboard TO has it's own group of TO's to deal with and doesn't interfere with the rest of the database.

                    • 7. Re: SQL: How to count distinct values with field containing multiple values
                      rrrichie

                      You can construct a CASE WHEN field LIKE '%a%' construct but that would slow things down quite a bit and is fixed to the knowing the values.

                       

                      For some of these "issues" I run a nightly script that collects data for dashboard.  It creates new tables with tallied results.  This makes the graphs really fast. But it isn't real time.  The script on the server takes about 10 minutes to complete :-) So it could be run during the day if needed.

                       

                      Collecting data in real time for dashboard does get slower and slower over time.  A dashboard I had took about 20 minutes to display (using standard Filemaker sorts/summaries).  It tallied about 40k records, with details per account, per month, per model, per type of repair and invoice payment information.

                      • 8. Re: SQL: How to count distinct values with field containing multiple values
                        openspace

                        Hmm, my structure is a bi tricky to explain...it appears as follows.

                        Screen Shot 2015-02-25 at 8.20.13 AM.png

                        In this scenario I am trying to calculate the number of members (new, lapsed, renewed, not renewed, etc). The key word is members. Right now, my memberships table can have more than one member or "customer" from the customers table tied to one membership.

                         

                        When a membership is created the user creates an invoice to pay the membership fee. All of my stats from the membership funnel to the invoice data table, where a line by line record of each membership fee is kept. This allows me to treat memberships like a product and take similair stats. Honestly it's working as far as I can tell and the experience for the user is fairly intuitive.

                         

                        Previously, it was suggested that I make a third table between the customers table and the memberships table. Which I did, and then realized it did not give me the desired result, plus I realized that it would make more sense to spill renewal data into the invoice data table, because each membership is required to be related to a customer and it is also require that it be paired to an invoice, and again this type of workflow is easily managed by the user. I hope that makes sense. I also enjoy this structure because a membership can be created from within an invoice by simply checking variables and calling custom dialogues.

                         

                        The SQL I have now, counts the number of memberships successfully for the stats I am producing, but the client does not care about membership numbers they care about "members" numbers.

                         

                        nicolai I tried this method, but it seems to produce the same number as COUNT DISTINCT. I'm using the custom function remove duplicates, which I believe produces a similair result to the unique values function. I'm just looking at ID values consisting of numbers.

                         

                        Let (

                        [ SQL = ExecuteSQL ("SELECT \"MCustomerIds\" FROM \"Invoice Data\" WHERE MExpiryFYear >= SelectedYear" ; "" ; "")] ;

                        If ( not IsEmpty(SQL) ; ValueCount ( RemoveDuplicates (SQL) ) ; 0 ) )

                        • 9. Re: SQL: How to count distinct values with field containing multiple values
                          nicolai

                          I am not looking at your data structure, just the SQL

                           

                          I tried it and it works fine for me. I think the "RemoveDuplicates" is not removing duplicates properly. Try to use the custom function I used in my post. I tested it and it seems to work fine.


                          Let (

                          [ SQL = ExecuteSQL ("SELECT \"MCustomerIds\" FROM \"Invoice Data\"" ; "" ; "")] ;

                          If ( not IsEmpty(SQL) ; ValueCount ( UniqueValues (SQL) ) ; 0 ) )

                          • 10. Re: SQL: How to count distinct values with field containing multiple values
                            erolst

                            What is that field you're interested in, and what do its multiple values represent?


                            EDIT: OK, I can deduce that from your calculation; if you're interested in Members, you could do this:

                             

                            Let ( [

                              memberIDs =

                                ExecuteSQL ( "

                                  SELECT \"MCustomerIds\"

                                  FROM \"Invoice Data\"

                                  WHERE MExpiryFYear >= SelectedYear

                                  " ; "" ; ""

                                ) ;

                              Case (

                                not IsEmpty ( memberIDs ) ;

                                ExecuteSQL ( "

                                  SELECT COUNT (*)

                                  FROM Customers

                                  WHERE id /* customer table primary key */ IN (" & Substitute ( memberIDs ; ¶ ; "," ) & ")

                                   " ; "" ; ""

                                ) ;

                                0

                              )

                            )

                             

                            But again, I think you're trying to make up for deficiencies in your data structure by employing elaborate calculations (which, IIRC, has also been suggested previously …)

                            • 11. Re: SQL: How to count distinct values with field containing multiple values
                              openspace

                              erolst

                              Blast, I can't seem to get this code to work.

                               

                              Thanks for having so much patience, I can sense a bit of frustration in the air. I realize I have not been able to digest what you're trying to communicate.

                               

                              I would like to understand what you mean by improving the data structure rather than dismissing the topic. In the past I made a post to try and record payments to pull stats, which I was hoping I solved with the structure above, but from the sounds of everyones replies it sounds like I did not. If you can bare with me, how would you solve my structural faults so that I can calculate member/customer totals rather than membership totals?

                               

                              To put it in layman's terms one last time. Each customer can have ONE membership ONLY, and each membership can have MANY customers(or "members"). Also, each membership can have MANY renewals, but each renewal can only have ONE membership. I would like to pull stats for member total not membership totals.

                              • 12. Re: SQL: How to count distinct values with field containing multiple values
                                nicolai

                                Blast, I realised I gave you a wrong answer. Sorry for nonsense. Obviously the problem is more likely with the result of your SQL rather than CF.

                                 

                                I can see erolst  providing another answer and this seems to be a better approach.

                                 

                                Generally, I would recommend breaking  the statement down into components and test one at a time.

                                1. Look at the result of ExecuteSQL ("SELECT \"MCustomerIds\" FROM \"Invoice Data\" WHERE MExpiryFYear >= SelectedYear" ; "" ; ""). You should have result as a list, e.g. "aa¶b¶cb"
                                2. Pass result to RemoveDuplicates and check the output - should have no duplicates
                                3. pass to ValueCount


                                Check your data, you filtered result might not have duplicates.



                                • 13. Re: SQL: How to count distinct values with field containing multiple values
                                  erolst

                                  openspace wrote:

                                  Blast, I can't seem to get this code to work.

                                  Are your keys numerical or text?

                                   

                                  As to the rest: typing on a iPad isn't fun; I get back to you on that.

                                  • 14. Re: SQL: How to count distinct values with field containing multiple values
                                    nicolai

                                    I think you have unnecessary open square bracket in your formula ->  Let (

                                     

                                    I really  like this solution for two reasons:

                                     

                                    1. No need for Custom Function
                                    2. It technically uses subquery, which FQL is a bit awkward with. You can use them, but they has to be table based. This is really a nice way around it.

                                     

                                    My only concern would be using IN on thousands of values in the list. I suspect it could be slow.

                                     

                                    Can I just add, that I do not see anything wrong with my suggested approach. The fact that openspace can't get both of our queries working probably highlights the fact that there is something else in the data structure we are not aware of.

                                    1 2 Previous Next