5 Replies Latest reply on Feb 3, 2014 3:54 PM by aeu477

    Calc to find duplicates in key field containing carriage returns

    aeu477

      Given a text field where the user will often be entering carriage returns (multiple paragraphs of text); the database needs to be able to highlight the field in red if its total content is identical to that of the same field any other record in the table. Normally I would just create a self-join relationship using the text field in question as the key, and then have the calculation simply count the records in that relationship to determine if there are duplicates.

       

      The problem I'm facing is that the carriage returns in the key field are treated as an "or" condition in the relationship. Thus, for example: if the user types a non-unique first paragaph into the text field, yet types a unique second paragraph; the ideal outcome would be that the system would determine this to be unique. But right now, it determines it to be a duplicate. I need to be able to compare the entire text block, including carriage returns, but have FMP not treat them as the "or" condition.

       

      Any ideas or insights would be much appreciated!

        • 1. Re: Calc to find duplicates in key field containing carriage returns
          Malcolm

          As you know, the carriage return is the delimiter for each key in the field. Therefore your key may not contain carriage returns. You will need to have a field which is analogous. This example replaces the carriage return with a backspace.

           

          substitute(multi_key_field; ¶ ; char(8) )

           

          There was a restriction on the effective length of the key because FileMaker would only index the first x characters. It was once a bit more than 100 characters. I forget what the current length is because it’s rarely necessary to concatenate keys now.

           

          Malcolm

          • 2. Re: Calc to find duplicates in key field containing carriage returns
            aeu477

            Thanks for the response, Malcolm. I have tried creating a calc key that substitutes out the carriage returns, but the problem is that when I do that I can no longer use it as a self-matching key. Two cases:

             

            For this, let's say the two fields (in the same table), are "OriginalText" which is the field in quesiton, and then "CalcTextKey" which will be the unstored calculation you are suggesting, that removes the carriage returns.

             

            1) The relationship is from CalcTextKey >> to OriginalText. When I use this relationship in a a calculation like "Count( ThisSelfRelationship::id )", it returns 0. Without the carriage returns, it no longer matches the fields, so this fails.

             

            2) The relationship is from CalcTextKey >> to CalcTextKey (same field used as a foreign key in the self relationship). Since it's an unstored calc field, it's unindexable and therefore will not work for this relationship.

             

            I'm not sure what other options there are here... am I missing something obvious?

             

            Also, regarding the character limit you mention; how would you recommend I go about determining if a rather large text block is a duplicate? Some sort of mathematical hash? and if so, then how would I implement that as an unstored calculation that I can use to conditionally format the interface to tell the user that it's a dupe?

            • 3. Re: Calc to find duplicates in key field containing carriage returns
              Malcolm

              2) The relationship is from CalcTextKey >> to CalcTextKey (same field used as a foreign key in the self relationship). Since it's an unstored calc field, it's unindexable and therefore will not work for this relationship.

               

              Make it a stored calculation.

               

              Also, regarding the character limit you mention; how would you recommend I go about determining if a rather large text block is a duplicate? Some sort of mathematical hash?

               

              That’s a good idea. You can build MD5 hash using custom functions, using CLI, or if you are happy with plug-ins, the Goya plugin does MD5. There’s a small possibility of a hash collision but as you have access to the original data, you can easily test for authenticity.

               

               

              Malcolm

              • 4. Re: Calc to find duplicates in key field containing carriage returns
                PeterWindle

                Why not use a global to enter the text, once the text is entered in the global, it can check against any data in the same field via relatioships, then script a setfield that updates the text field upon record commit or something like that....?

                • 5. Re: Calc to find duplicates in key field containing carriage returns
                  aeu477

                  Great! Thanks. One last question involving the stored calculation... one of the reasons I use unstored for keys is because I'm not sure if/when FileMaker will update an unstored calculation when the associated field value is changed. So in this case, is the user changes the field, how will I know when FileMaker will update the key such that the interface can warn the user of a possible duplicate value?

                   

                  My only thought here is to have a field script trigger to refresh the window and clear the cache, which I would assume forces FMP to do said updating. But can you speak to the aforementioned question at all?