6 Replies Latest reply on Mar 30, 2012 2:39 PM by ClaudiuNemes

    Find duplicates in external database

    ClaudiuNemes

      Hi there,

      I have two databases: one for admittance (selection) and one for monitoring. The first one selects at most 100 subjects from thousands, and the second one monitors the selected (admitted) subjects. Each year we will have one or two admittance processes.

      Is there a way to check for duplicates (by birth registration number) via relationships? The same subject cannot attend more than once.

      I know that I could import the selected records into the monitoring database, but if I find duplicates I have to delete them and redo the selection without the duplicates found.

      Many thanks for your time,
      Looking forward to hearing from you,
      Claudiu

        • 1. Re: Find duplicates in external database
          philmodjunk

          It's possible to omit duplicates during the import process. Would that work for you?

          It's also possible to perform finds with ! or use a self join on one or multiple fields to find duplicates, but if the only reason is so you can keep duplicates out of the records you import...
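          For readers who want to see the idea outside FileMaker, here is a minimal Python sketch of what a duplicate check on one key field does. It is not FileMaker script code; the record layout and sample values are made up for illustration, and only the idea of matching on the birth registration number comes from this thread.

```python
from collections import Counter

def find_duplicate_keys(records, key="BirthRegNo"):
    """Return the set of key values that occur in more than one record."""
    counts = Counter(record[key] for record in records)
    return {value for value, count in counts.items() if count > 1}

# Hypothetical sample data: one list per database.
selection = [
    {"BirthRegNo": "B001", "Name": "Ana"},
    {"BirthRegNo": "B002", "Name": "Ion"},
]
monitoring = [
    {"BirthRegNo": "B002", "Name": "Ion"},
    {"BirthRegNo": "B003", "Name": "Maria"},
]

# Checking both lists together flags subjects already being monitored.
duplicates = find_duplicate_keys(selection + monitoring)
print(duplicates)
```

          Grouping on the key and counting is roughly what a self join on that field gives you: any record whose key matches more than one record is a duplicate.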

          • 2. Re: Find duplicates in external database
            ClaudiuNemes

            Hi Phil,

            Many thanks for your reply. I know that I could find duplicates in a db file, but the problem is that the records are already imported, and if duplicates are found they must be deleted from the first database (selection), the selection process must be redone without the duplicated values, and the results reimported into the monitoring db.

            It could be useful to omit duplicates upon import if, for instance, each omitted duplicate were replaced by the next record (in some years the cutoff for the last imported record will be a count, let's say the first 50 will be selected, and in others a minimum GPA, let's say all above 8.5).

            Again Phil, I have no idea how to do it, so if you could give me some guidance I'd appreciate it. I have already tried some imports and could not find any such option. Do you suggest adding an import button and attaching a script to it (import, omit duplicates ...)?

            For the future we intend to extend this database, and I wonder if you would be interested in a collaboration. I could do the work and use your expertise for fine-tuning. If you're interested, please let me know about your rates: info[at]claudiopolis[dot]org.

            Many thanks for your time,

            Yours,

            Claudiu

            • 3. Re: Find duplicates in external database
              philmodjunk

              What values in the imported data need to be checked to identify duplicates? Is it a value in a single field, or must the combined value of several fields be checked for duplicate values? (Do two records with the same "birth registration" number make a duplicate, or is it two records with the same birth registration number and same subject that make a duplicate?)

              Leaving out duplicate records, assuming that the records are duplicates in all fields where the data you need for your report is stored, should not affect any counts you make to determine ranking or "top fifty" type determinations.
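              To make the "top fifty" point concrete, here is a rough Python sketch, not FileMaker code, of a selection that skips already admitted subjects and lets the next ranked candidate take the place. The field names, the sample values, and the reading of "minimum GPA" as "at least that value" are assumptions for illustration.

```python
def select_candidates(candidates, already_admitted, limit=None, min_gpa=None):
    """Pick candidates by descending GPA, skipping known BirthRegNo values.

    limit   -- stop after this many selected candidates (count cutoff)
    min_gpa -- stop once GPA falls below this value (GPA cutoff)
    """
    seen = set(already_admitted)
    selected = []
    for cand in sorted(candidates, key=lambda c: c["GPA"], reverse=True):
        if min_gpa is not None and cand["GPA"] < min_gpa:
            break  # everything after this ranks lower still
        if cand["BirthRegNo"] in seen:
            continue  # duplicate: skip it, the next candidate moves up
        seen.add(cand["BirthRegNo"])
        selected.append(cand)
        if limit is not None and len(selected) == limit:
            break
    return selected

# Hypothetical sample data.
candidates = [
    {"BirthRegNo": "B001", "GPA": 9.5},
    {"BirthRegNo": "B002", "GPA": 9.1},
    {"BirthRegNo": "B003", "GPA": 8.9},
    {"BirthRegNo": "B004", "GPA": 8.0},
]
admitted_before = {"B002"}

top_two = select_candidates(candidates, admitted_before, limit=2)
above_cutoff = select_candidates(candidates, admitted_before, min_gpa=8.5)
print([c["BirthRegNo"] for c in top_two])
print([c["BirthRegNo"] for c in above_cutoff])
```

              Because the duplicate is dropped before the count is taken, the selection still fills up to the limit and does not have to be redone afterwards.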

              • 4. Re: Find duplicates in external database
                ClaudiuNemes

                Hi Phil

                One person can take part in this project only once, so BirthRegNo is the only criterion for duplicates. In our area BirthRegNo is a unique number tied to one person only.

                Yours,

                Claudiu

                • 5. Re: Find duplicates in external database
                  philmodjunk

                  When you import records, the table that receives the imported data is called the target table and the table from which the data is imported is called the source table. In the target table, open up field options for BirthRegNo, click the validation tab and specify "unique values" and "validate always". Now when you import data from your source table, only the first record with a given value in BirthRegNo will be imported; succeeding records that match existing records in the table will not be imported.
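                  The behavior described above can be imitated in a few lines of Python. This is only a sketch of the logic, not how FileMaker implements it, and the function name and sample records are hypothetical.

```python
def import_records(target, source, unique_field="BirthRegNo"):
    """Append source records to target, skipping values already present.

    Returns the list of skipped (duplicate) records.
    """
    existing = {record[unique_field] for record in target}
    skipped = []
    for record in source:
        if record[unique_field] in existing:
            skipped.append(record)  # fails the uniqueness check
        else:
            target.append(record)
            existing.add(record[unique_field])
    return skipped

# Hypothetical sample data: the target table already holds B001.
target_table = [{"BirthRegNo": "B001"}]
source_table = [
    {"BirthRegNo": "B001"},  # already in the target
    {"BirthRegNo": "B002"},  # new, gets imported
    {"BirthRegNo": "B002"},  # repeats within the source
]

skipped = import_records(target_table, source_table)
print([r["BirthRegNo"] for r in target_table])
print(len(skipped))
```

                  Note that the check covers both records already in the target and repeats inside the source itself, which matches "only the first record with a given value" being imported.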

                  • 6. Re: Find duplicates in external database
                    ClaudiuNemes

                    Hi Phil,

                    Do you think it's that simple? :) BirthRegNo is already set in validation to be unique and to validate always. But I never thought that on import it would automatically omit duplicates.

                    Many thanks, I'll test this and let you know.

                    Yours,

                    Claudiu