3 Replies Latest reply on Nov 6, 2013 2:20 PM by philmodjunk

    Compute the similarity between two entries in a field?

    ShayH

      Title

      Compute the similarity between two entries in a field?

      Post

           I need a way to compute the similarity between two entries in a field (different records, same table, same field.) I have a field called org_name, and I want to find all of the records in the same table that have a similar entry (to the current record) for this field. Eventually, I think I would want to set up another table occurrence so I can display them in a portal... but that's later.

           I'm having this problem because this is a user-inputted field, so there are a lot of standardization problems. For example,"Jefferson Elementary" "West Jefferson Elementary" "Jefferson Elementary School" "Jefferson County Elementary" won't show up as a match, even if they refer to the same school. Even more problematic are "-", "&", "and", "#", "No.", "5", "Five"... etc.

           I am somewhat acquainted with the algorithms for edit distance, affine gap distance, Smith-Waterman distance... etc., but I can't see how I would implement something like that using only scripts. Does FileMaker Pro support recursion or dynamic programming techniques?

           Can any of these things be done in Filemaker Pro 12? I haven't been able to find examples of other people trying to do anything similar, so I would greatly appreciate any insight at all.

           Much thanks!

        • 1. Re: Compute the similarity between two entries in a field?
          philmodjunk

               Does FileMaker Pro support recursion or dynamic programming techniques?

               Yes. Scripts and calculations can be recursive. Using FileMaker Advanced, you can also set up recursive custom functions.

               But in most circumstances, such a complex text matching algorithm is not needed. Instead, the user interface is designed so that a user selects the name instead of typing it in and then the name is entered with 100% consistency. Ideally, this is done by linking records by an ID number instead of the name.

               In addition to setting up a simple drop down list of schools to select from, you may find the more sophisticated methods used in these demo files to be of interest:

          https://www.dropbox.com/s/0pm1gdqcfi2ndpv/EnhancedValueSelection.fp7 (open this file from the File menu in FileMaker 12 to get a copy converted to the .fmp12 format.)

               FileMaker 12 users: https://dl.dropbox.com/u/78737945/SimpleNameLookupDemo.fmp12
               Pre FileMaker 12 Users: https://dl.dropbox.com/u/78737945/SimpleNameLookupDemo.fp7

          • 2. Re: Compute the similarity between two entries in a field?
            ShayH

                 Hi PhilModJunk, thanks for the quick response!

                 I had a look at your demos, but I don't think this is what I'm looking for right now. Such techniques may prove to be very useful in the future, so thank you for sharing them. These are excellent ways to prevent non-standardized data, but I'm trying to correct the old problem of data that has already been entered into our database non-standard.

                 We have already switched to similar input select methods, however, there are still millions of records in our database that were created before we switched to this new method. What I'm trying to do is clean up those old records - data scrubbing, in effect - so that they are standardized throughout the entire database. The org_name field is not the only field we need to standardize, but it is the most urgent.

            • 3. Re: Compute the similarity between two entries in a field?
              philmodjunk

                   Then perhaps you missed the rest of my answer:

                   Yes. Scripts and calculations can be recursive. Using FileMaker Advanced, you can also set up recursive custom functions.

                   Thus the algorithms for complex partial text matching should be something that can be implemented in FileMaker.

                   And if you have FileMaker Advanced do a web search for custom functions. There's at least one web site--by Brian Dunning, that serves as a resource page for sharing custom functions. You may find that what you want has already been designed and uploaded there.