4 Replies Latest reply on Jan 19, 2010 12:53 PM by RSchaub

    Eliminating duplicates

    hwdavy


      My brother-in-law sells small military equipment (resistors, capacitors) and has a database of 65,000 records. He doesn't know a lot about database capabilities. He has entered the original part no. in the part no. field and, if the product has one, entered an alternative part no. in the alternative part no. field. Then, to be sure to find it, he entered the alternative part no. in the part no. field and the part no. in the alternative part no. field. And, of course, added cost and retail price to each entry.

       

      Is there a way to eliminate all of the alternative part no. in the part no. field so that there is only one entry for that part and its cost and resale value? He has 17,000 duplicate entries in his database.

       

      Hwdavy

        • 1. Re: Eliminating duplicates
          RSchaub
            

          From FileMaker Help Section

           

          Identifying duplicate values using a self-join relationship
          This procedure identifies "extra" instances of duplicated records. You specify the criteria that determine which is the primary record.

          This procedure uses a self-join relationship and a calculation field referencing the relationship to determine which records are duplicates.

          To find duplicate records except the first instance:

          1. If you plan to delete the duplicate records that you find, make a backup copy of the file. For more information, see Saving and copying files.

          2. Identify a field that determines a unique entity in your file. For example, in a Contacts database, the Last Name field is probably not a good choice, because you might have several people with the same last name. Social Security Number is a better choice. You can also create a calculation field (returning a text result) that combines data in several fields to make a unique identifier. An example formula is First Name & Last Name & Phone Number.

          3. Define a self-join relationship. Use your chosen identifying field as the match field in both tables in the relationship. For more information, see About self-joining relationships. The primary record is the first matching record according to the sort order defined in the relationship.

          4. Define two fields:

             Counter, a text field with an auto-entered serial number (select Serial number and accept the default values for Next and Increment by).

             Check Duplicates, a calculation field with a text result, with the formula:

             If(Counter = table1::Counter, "Unique", "Duplicate")

          5. Choose Records menu > Show All Records.

          6. Click the new Counter field, choose Records menu > Replace Field Contents, and Replace with serial numbers. Again, accept the default values. Select Update serial number in Entry Options, and click Replace. This will assign a serial number to all existing records in your database. Serial numbers will automatically be entered in new records.

          7. Perform a find for Duplicate in the Check Duplicates field. The first record in any series of duplicates now holds the value "Unique" in the Check Duplicates field, and all duplicate records within the same series are marked "Duplicate".

          Important: Records with no value in the match field will be flagged as duplicates. Once set up as above, this system will identify duplicate records automatically as they are created.
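
          For anyone who wants to see the logic outside FileMaker, here is a rough Python sketch of what the Counter / Check Duplicates pair is doing. It is not FileMaker code, only an illustration. The function names (pair_key, flag_duplicates), the "Match Key" field, and the sample part numbers are all made up for the example. The pair_key helper is also an assumption on my part: it is one possible match field for the original question, built so that a part no. / alternative part no. pair gives the same key whichever way round it was entered.

```python
def pair_key(part_no, alt_no):
    """Hypothetical match key: the same text for (A, B) and (B, A)."""
    return "|".join(sorted([part_no.strip(), alt_no.strip()]))


def flag_duplicates(records, key_field):
    """Imitate the self-join check: a record is "Unique" when its Counter
    equals the Counter of the first record sharing its key."""
    first_counter = {}  # key value -> Counter of the first record with that key
    for counter, record in enumerate(records, start=1):
        record["Counter"] = counter  # stands in for the auto-entered serial number
        key = record.get(key_field)
        if key:
            first_counter.setdefault(key, counter)
    for record in records:
        key = record.get(key_field)
        # An empty key matches nothing, so the record ends up "Duplicate",
        # which is the behaviour the Important note warns about.
        related = first_counter.get(key) if key else None
        if record["Counter"] == related:
            record["Check Duplicates"] = "Unique"
        else:
            record["Check Duplicates"] = "Duplicate"
    return records


# Made-up sample data: the second record is the first one entered "backwards".
parts = [
    {"Part No": "R-100", "Alt Part No": "X-9"},
    {"Part No": "X-9", "Alt Part No": "R-100"},
    {"Part No": "C-220", "Alt Part No": ""},
]
for p in parts:
    p["Match Key"] = pair_key(p["Part No"], p["Alt Part No"])

flag_duplicates(parts, "Match Key")
flags = [p["Check Duplicates"] for p in parts]
print(flags)
```

          In FileMaker terms, the dictionary plays the role of the self-join relationship and the first stored Counter plays the role of table1::Counter. Whether an order-independent key like this suits the real parts data is something to check on a backup copy first.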
           
          • 2. Re: Eliminating duplicates
            MikeyG79
               Just a note: all databases we create have a row_index field, an auto-incrementing integer. This way, no matter what the data, there is always a known unique key.
            • 3. Re: Eliminating duplicates
              RSchaub
                 Well then, do as above, only use your serial number field in place of Counter.
              • 4. Re: Eliminating duplicates
                RSchaub
                  

                In Step 4, omit this: "Counter, a text field with an auto-entered serial number (select Serial number and accept the default values for Next and Increment by)."

                Use your field instead.