7 Replies Latest reply on Jun 11, 2014 4:35 PM by ChristineGarcia

    Merge records based on duplicate ID number

    ChristineGarcia

      Title

      Merge records based on duplicate ID number

      Post

           Hello,

           I am attempting to merge data from duplicate records in FileMaker Pro (11). I have identified the duplicate records based on a unique ID number but am struggling with merging data from one record to the other where fields are blank. This is a very large database which makes manual copy and paste quite inefficient. I am new to FileMaker Pro and any assistance is very much appreciated!

        • 1. Re: Merge records based on duplicate ID number
          philmodjunk

               Have you ever created a script in FileMaker before this?

               In the cases where you find duplicate records, is there never more than two records with the same duplicate ID or might there be 3, 4 or more records with the same ID?

          • 2. Re: Merge records based on duplicate ID number
            ChristineGarcia

                 There is never more than two records with the same ID. Unfortunately, I have very very little experience in writing scripts but I can usually catch on quickly

            • 3. Re: Merge records based on duplicate ID number
              philmodjunk

                   No problem, but knowing your inexperience tells me to make extra detailed posts.

                   So your script can:

                   Perform a find using the ! operator to find all records that have a duplicate ID
                   Sorts on the ID field to put the duplicate records in adjacent pairs of records
                   Then goes to the first record and loops through them all moving data from one record to another to merge the data.

                   There are some details to that which are still a bit fuzzy.

                   Once you have merged the data, do you then want to delete a record to remove the duplicate?

                   And is this a case where you have several fields and when this script is done any fields that are empty in record A but not Record B of the duplicate pair should get a copy of this data from Record B? And (if you want to keep the duplicate, also copy missing data from A back to B?

                   My answer is going to assume that:

                     
              1.           You never have more than two duplicate records
              2.      
              3.           You plan to keep only one record if the pair
              4.      
              5.           And that you are only updating the value in a field if the field is empty in the record to be kept.


                   Before you create this script, add a field to this named "Mark".
                   Make a copy of your database. That way you can toss this copy if the script doesn't work right and you end up with a lot of messed up records.
                   Open Manage | Scripts
                   Click New to start the creation of a new script.
                   Enter a name for the script at the top of the script editor that is now open
                   Add the script steps shown below by finding the script on the left and double clicking it.
                   After adding a script, you may need to interact with controls that then appear at the bottom of the editor for specifying specific options for that script step.
                   Comments explaining details to you will appear to the right of the script step like this: -----> Clear the pause check box

                   The script:

                   Perform Find [Restore]
                        ----> click specify button, then click New, then click the name of the ID field in the lower left window pane, enter an ! and then click "Add"
                   Sort Records [Restore ; no dialog ] --> click Specify and select the ID field as a field on which to sort. Click the Perform without dialog check box.
                   Go To record/Request/Page [First]
                   Loop
                      If [ Isempty ( YourTable::FieldThatMightBeEmptyHere ) ]
                         Set Field [ YourTable::FieldThatMightBeEmptyHere ; GetNthRecord ( YourTable::FieldThatMightBeEmptyHere ; Get ( RecordNumber ) + 1) ]
                                ----> see detailed instructions below
                      End IF ---> every time you add an If [] step this step is also added at the same time.
                     Go To record/Request/Page [First]
                     Set Field [YourTable::Mark ; 1 ]
                     Go to Record/Request/Page [Next ; Exit After Last ] ---> This step exits the loop after processing the last record pair
                   End Loop

                   When Setting up Set Field, there are two Specify buttons that must be clicked. To get Set Field [Table::Field ; Expression], add set field to your script and click the first button (specify target field). Select Table::Field from the list of fields. Do not click the specify button next to the repetition box. Click OK to close this dialog box. Now click the lower specify button (calculated result) and create the expression to the right of the semicolon (;). Do not try to type in the semicolon.

                   In the steps that refer to actual fields, I don't know the name of your table so have called it "yourTable" in this example. I don't know the name of the field that might need data from Record B so I called it "FieldThatMightBeEmptyHere". If you have several such fields, set up one such "IF Block" for each.

                   GetNthRecord is a function that can refer to a different record in the same found set of records by referring to it's place in the current found set. Get ( RecordNumber ) returns the position of the current record in that same found set. Thus, Get (recordNumber) +1 refers to the position of the second record in the pair of duplicate records.

                   The above script does not actually delete the duplicates. It marks them by putting a 1 in the Mark field. After this script runs, you can perform a find for all records that have a 1 in the Mark field and use Delete All Records to delete them. You could add steps for this to your script, but I advise that you not do this at first so that you can inspect your records and make sure that the script did everything correctly before you delete anything.
                    

              • 4. Re: Merge records based on duplicate ID number
                ChristineGarcia

                     Wow! Thank you for the detailed response. This was great and all of your assumptions were spot on!

                     I have written the script exactly as it appears above however, the script seems to be taking a long time to run (45 mins and counting). The database I am testing this script on is a subset of the full database (700 records v. 220,000). As a newbie, I'm not sure if this is normal...

                     Attached is a screen shot of the script. Table is ContraCosta, "FieldThatMightBeEmpty" is Lat. I will be adding other potentially empty fields to be merged under the IF block so I expect the script to take even longer once those fields are accounted for. 

                     Thanks again and let me know what your thoughts are on the processing time.

                • 5. Re: Merge records based on duplicate ID number
                  philmodjunk

                       If you are using a mac, press command period to halt the script. If windows, press Escape to halt it. I see a flaw in the script.

                  • 6. Re: Merge records based on duplicate ID number
                    philmodjunk

                         Once you have the script stopped, change:

                         Go To record/Request/Page [First]

                         to

                         Go To record/Request/Page [Next]

                         My brain must have been disconnected at that point...blush

                    • 7. Re: Merge records based on duplicate ID number
                      ChristineGarcia

                           Beautiful! Worked like a charm. You are a life saver! Thanks for all of your assistance!