3 Replies Latest reply on Feb 6, 2013 6:51 PM by JG

    Avoiding duplication of imported data



      Avoiding duplication of imported data


           I'm sure I'm not the first person to have this issue, but I can't seem to find the right search terms to find a solution! Any help would be appreciated.

           I have three (relevant) tables: Members, Events, and Attendance. Attendance is a join table between the other two, showing which members attended which events. This is similar to the Students/Classes examples I've seen about the place.

           Data is imported to Attendance, as this is the closest match to our existing setup: we receive a list of people who attended an event, details about their profession, and the relevant event_id (the Event record will already have been created prior to receiving this info).

           The list of people attending may be a mix of people who are brand new to our client list, or who have attended previous events and will therefore already have a record in the Members table. I want to create Members records for the new people and prevent any changes from happening to the details of existing Members. 


           - list of people added to Attendance join table
           - see if they are in Members table:
           -- if so, do nothing
           -- else, add new record in Members using data from Attendance

           How do I accomplish this?

        • 1. Re: Avoiding duplication of imported data

               Am I correct that a name is what you have to determine if the data from an imported record is for an existing or new record in the members table?

               If so, that can get messy if you don't have very good data to import and even then, names are not unique and this can cause issues.

               If you can resolve such issues prior to import, this method may serve:

               Go to a layout based on Members.

               Show All Records

               Select Import records from the File Menu

               Specify an Import matching values import. Select the name field or fields as match fields.

               Select the 'Add new if they don't match' option.

               But this may overwrite data in existing records that you don't want and relies on unique, correct names.

               Thus you may find it safer to import your data into a different table and use the name fields as match fields in a relationship. After inspecting your data and correcting any errors, resolving duplicate names etc, you can perform a found set for all records that don't match an existing record and then use import records to pull that data into your members table.

          • 2. Re: Avoiding duplication of imported data

                 Yes, names are already a tricky item for us - as you point out, they're not always unique. We've found that emails work for us (so far, fingers crossed) as unique identifiers, since we don't get duplicate names within a single domain name. 

                 Thanks for your suggestion - I'll test it first on my dummy database using the email field in place of name fields.

            • 3. Re: Avoiding duplication of imported data

                   That worked really well! I used email as the match field and was more selective about which fields I was using so I didn't overwrite anything.

                   So really, it's two imports from the same data: one to Attendees with all of the fields, and then one to Members being a bit more careful about which fields are used. I think if I set this up as a script it'll help our admin staff get through it speedily, as it will walk them through the process without skipping anything, and give them some sensible defaults they won't have to mess around with too often.

                   Thanks so much for this, much appreciated!