7 Replies Latest reply on Oct 20, 2011 12:46 PM by JohnWolcott

    Deleting duplicate records from found set

    JohnWolcott

      Title

      Deleting duplicate records from found set

      Post

         I suspect this is a fairly common problem, but I couldln't find a script to do it in either the knowledge base or on this forum so I created two different scripts to do the same thing.  I'd like advice on which is better, which hopefully will include some tips on building better scripts.  Version 1 uses a global field and Version 2 uses a variable for the field that contains duplicate values, in this case a field called "Names".  Is there a preference?  Version 1 is 12 lines long which is one line shorter, but Version 2 seems a bit cleaner or more elegant.  I tested both scripts on a found set of 7,500 records, and both took 1:32 so there doesn't seem to be any difference in execution times.
        The database consists of grades for students for each term for the last 5 years (54,001 records).  I need to know how many students were enrolled each term. Since a student may take more than one class in a term, when I seacrh for all students in a given term, I have many duplicates.  The scripts get rid of the duplicates so I can print out a list of names of students enrolled in each term.

      Version 1

      Set Error Capture [On]
      Sort Records [Restore; No dialog]  (this sorts on the field "Name")
      Go to Record/Request/Page [First]
      Set Field [Grades::g_name; Grades::Name]
      Loop
        Go to Record/Request/Page [Next; Exit after last]
        Loop
          Exit Loop If [Grades::g_name=/ Grades::Name]  (not sure how to show "Not Equal" other than =/)
          Omit Record
        End Loop
        Set Field [Grades::g_name; Grades::Name]
      End Loop

      Version 2

      Set Error Capture [On]
      Sort Records [Restore; No dialog]  (this sorts on the field "Name")
      Go to Record/Request/Page [First]
      Set Variable [$_name; Grades::Name]
      Go to Record/Request/Page [Next; Exit after last]  (Just in case there is only one record found)]
      Loop
        If [$_name = Grades::Name]
          Omit Record
        Else
          Set Variable [$_name; Grades::Name]
          Go to Record/Request/Page [Next; Exit after last]
        End If
      End Loop

        • 1. Re: Deleting duplicate records from found set
          philmodjunk

          Suprised you couldn't find the KnowlegeBase article on this. It's a bit dated but still works. http://help.filemaker.com/app/answers/detail/a_id/3441/kw/duplicate

          Both scripts, as you have written them, may have an issue if the last record in your found set is a duplicate value. In those cases, Omit Record moves the current record to the previous record and that can change how the exit loop If step evaluates and can then omit the first record with that value.

          As you have found, there's no obvious advantage between using a global field or a variable here. Both give identical results. You could, in fact, use a variable with the first script and a global field with the second. With a variable, you don't have to define that extra field so that's small possible advantage for using a variable.

          And if you used the GetNthRecord to refer to the field in the previous record, you don't need to use either option for this type of script.

          You are aware that these scripts do not actually delete any record correct? They just omit them from the current found set.

          • 2. Re: Deleting duplicate records from found set
            JohnWolcott

             Interesting.  You are correct, there is a bug in the two scripts above if the last record in the found set is a duplicate.  I'm in awe of people who can look at a script and recognize if it will fail and where.  I'm concrete and have to try it to see where the problems are.  BUT, the script in the knowledge base doesn't work in this case either!  It will mark all the duplicates, but if I replace the "Mark duplicates" with "Omit record", it leaves two duplicates for every set of duplicate records.  That isn't a problem if you are just marking duplicates for a later find, but in my case it is a problem.  I don't want any duplicates.  Consequently, I seem to be stuck.  Either I have two copies of duplicates, or I lose the last record if it is a duplicate.

              BTW, yes, I want to omit and not delete these records.  A large number of people would be upset if I deleted those records of student enrollments and grades!

            • 3. Re: Deleting duplicate records from found set
              philmodjunk

              You can't just substitute an Omit for the mark records step as omit changes the current record.

              Note, though that thd find at the end will produce a found set of just the selected records.

              The main flaw to this approach is that you can't have two users run this script at the same time as the find will find records marked by both users and this could produce confusion.

              What's the end result you are trying to produce here?

              I'm thinking you might be able to set this up by creating a summary report with a sub summary part and the body layout part removed to "condense" all the rows of records down to one for each student.

              • 4. Re: Deleting duplicate records from found set
                JohnWolcott

                 My goal is to produce an excel spreadsheet with a list of all the students registered for one or more classes for a given term.  I have a database that contains every student's course registration.  TA student record is in the database for every course taken.  When I do a search for a term, a student may have one or several records, one for each course.  I would like to delete the duplicates based on student name so there is only one record for every student and then export the found set to excel.  In other words, I want just the opposite of the script in ID 3441. I don't want to see the duplicates, I want to get rid of them.

                • 5. Re: Deleting duplicate records from found set
                  JohnWolcott

                   OK, I used the script from the article in the knowledgebase ID=3441 and just added some steps to modify the last find, add a new request, and omit all the records with "X" in the "Mark" field, which I called "duplicates" in my script.  Here is the script, which seems to work.  Again, any suggestions for improvement are appreciated.  Perhaps I could have used Constrain Find instead of the new request, but I couldn't get it to work.

                  Show All Records
                  Sort Records [Restore, No Dialog] Sort by Name in Ascending order
                  Go to Record/Request/Page [First]
                  Replace Field Contents [No dialog, 'duplicates', " "]
                  Loop
                  Set Field['g_namel','Name']
                  Go to Record/Request/Page [Next, Exit after last]
                  If ['g_name = Name']
                  Set Field ['duplicates', '"X"']
                  Else
                  Set Field ['g_name','Name']
                  End If
                  End Loop
                  Modify Last Find [Restore] Use the Find that found the current records
                  New Record/Request
                  Set Field [Grades::duplicates; "X"]
                  Omit Record
                  Perform Find [] Find records when "duplicates"= X and omit them
                  Sort Records [Restore; No dialog] Sort by Name in accending order

                  • 6. Re: Deleting duplicate records from found set
                    philmodjunk

                    Looks good, but remember that this will fail if two users try to do this at the same time.

                    • 7. Re: Deleting duplicate records from found set
                      JohnWolcott

                       Thanks for your help.  Couldn't have done it without you and script debugger.