1 2 Previous Next 15 Replies Latest reply on Feb 17, 2016 7:46 PM by jrockstrom

    prevent importing duplicate records?

    jrockstrom

      I have multiple users who want to load training records for our sports group. The fields are 'M/ship ID', 'Month', 'Training Days', 'Type of training', 'Location of training'. My concern is when they import the records, they could accidentally do it twice and double the number of training days. I would like to run a check and if the data in all 5 fields is exactly the same then the import for that record is ignored. If it is not exactly the same across the 5 fields the record is added to the db.

       

      I'm a moderate developer, at best, working with FMP14 Advanced on Mac.

       

      If anyone can help I'd be very grateful!

       

      John R

      PS The coaches are all honorary and I can't impose stringent rules on them, which is why I have to build in the security at my end.

        • 1. Re: prevent importing duplicate records?
          Mike_Mitchell

          John -

           

          A few different ways to accomplish this:

           

          1) Simplest way is to import based on matching fields ("Update matching records in found set") and just select all the fields in the import as matching fields.

           

          2) Create a concatenated field that merges all the records, like this:

           

               M/ship ID & Month & Training Days & Type of training & Location of training

           

          Use that as a match field instead. (Note: This will work more tightly than the first option, since the order of the data matters.)

           

          3) Import the data into a utility table, basically a copy of the eventual destination table. Then, use scripting and relationships to check for duplicates before creating the records in the final database, either via Import or Set Field across the relationship.

           

          I will point out, though, that it looks like some of your fields might be free text ("Type of training", for example). Can you reasonably ensure that "duplicates" are properly filtered out? Can users, for example, accidentally add a carriage return or space to a field, or are the original data constrained so they can't do this?

           

          It's questions like this that make option 3 attractive, as it allows you to perform cleanup scripts on the incoming data before committing it to the final database. I suggest you do it that way, just for the sake of better error trapping to protect the data integrity (which you obviously care about, else you wouldn't have asked).

           

          HTH

           

          Mike

          • 2. Re: prevent importing duplicate records?
            user19752

            "select all the fields in the import as matching fields" is not an option, FM says "No fields were selected for import."

            I want the feature for years, but FM14 not have yet...

             

            Make a concatenated field (with some delimiter for better), then validate it as "always" "unique", then when import duplicated records are ignored.

            • 3. Re: prevent importing duplicate records?
              Mike_Mitchell

              All the fields that he wants to match, not all the fields in the table. Obviously, if you set every field in the table to be a match, then nothing would be imported (unless there’s not an existing record and you choose “Add remaining records”).

               

              The validation idea is a good one.

              • 4. Re: prevent importing duplicate records?

                As others have written, there are LOTS of options.

                 

                Regardless of the approach, for max control, I would create a script or external process to make sure I was in complete control over what happens...

                 

                Option 1: (From FMP - FAST)

                 

                After importing data into temp table and use a matching field in the Relationship graph and join the tables.

                 

                Something like a "select <Field_List> from TEMP_IMPORT_TABLE, actual_Table where <fields_match>"

                 

                You could then set up a relationship in the Relationship pane with the matching fields joining the two tables.

                 

                Something like ...

                Untitled.png

                Script pseudo-code:

                 

                Set Error Capture [ on ]

                 

                Enter Find Mode

                 

                Go to Layout [ Table1 ( Table1 ) ]

                 

                Set Field [ Table2::matchField ; "*" ]

                 

                Omit Record

                 

                Perform Find

                 

                <add/update/whatever>

                 

                -----

                 

                Option 2:  (From FMP- SLOWER)

                Or, you could just do a table scan from the data imported into the temp table (all the imported data) and then, in the script, only add (or update?) the actual rows to your "real" data table based on your rules.

                 

                (This may be the easiest option, but would also be the slowest due to the full table scan.)

                 

                ----

                 

                Option 3: (FMP External for Unique New Data Import into temp table, then FMP Script for Import -- FAST)

                A couple ways to do this (remotely/) externally via JDBC.

                (FileMaker includes an excellent JDBC driver with its installation files.)

                 

                Since FileMaker supports SQL "DELETE" (using JDBC), I would import all the data into the temp table, then in one SQL statement, I could remove all the rows in the temp table that are already in the "real" data table. Once that's done, a script could just add those non-duplicated rows to the real data table.

                 

                ---

                 

                Option 4: (External to FMP  - FAST) -- (How I would do this)

                 

                Optimally, I would do all this in Java/JDBC FMP connection as the way to import data via a Web site form.

                 

                I would write a small Java program that would open the text file (CSV?) and do a SQL against the actual "real" data table for matching records. If no match is found, then just issue a SQL INSERT INTO for that CSV record into the actual data table in FMP right from values read from the text file.

                 

                No FM script, no table scans, no FM SQL-SELECT-only limitations, etc.

                 

                You could even set this up on a web server to have users enter that data and then connect to the FM database, via JDBC, ....wherever.

                 

                Java is extremely powerful working with text files making this option simpler than it might sound.

                 

                -----

                 

                There may be some third party FMP solutions for handling these items, but I find that most FMP third party utilities are very expensive and at least some of these are easily reproducible/free if you don't need them to be "native" in FMP.

                 

                Please let me know if you have any questions, OK?

                 

                Enjoy!

                 

                - m

                 

                P.S. One of the most common data queries is: "Find the data in one table that is/isn't in the other."

                 

                Something like this query is often used:

                 

                select clientID FROM tempImportTable where clientID not in (Select clientID in RealDataTable)

                • 5. Re: prevent importing duplicate records?
                  jrockstrom

                  Thanks for the suggestions, Mike. I think you and everyone else who has tried to help, failed to read the small print that said I was 'a moderate developer, at best'! I shall have to downgrade myself from that position to rank beginner.

                   

                  I follow the suggestions but don't have the technical skills to follow most of them. Good on you, Morkus, you might very well do a Java/JDBC FMP connection, I on the other hand couldn't possibly manage it! :-)))

                   

                  Mike, I can control the terminology mostly, as we have set training categories and by suppling an Excel file with the 5 fields setup plus instructions to fill them with the right details. I currently have a simple script to allow them to import their data, what I'm concerned about is someone goes "i'm not sure if that imported correctly?' (even if I have supplied a splash window that says it did). And imports the same data all over again.

                   

                  The idea of a script to compare the importing data against existing data sounds like my best bet, but I don't how to write it. I imagine it is something like:

                  If

                  All 5 import fields exactly match the five fields in the existing record - don't import; otherwise import.

                   

                  I also liked your concatenated idea, but need more direction of how to implement it?

                   

                  Much appreciate the support from everyone!

                   

                  John

                  • 6. Re: prevent importing duplicate records?
                    Mike_Mitchell

                    In your particular situation, I would recommend an intermediate processing table that contains all the fields you want to import. Your script should import to that, and then process out all the records that have a match in the final, destination table. You can do this via a relationship (which will be much faster than trying to script it line by line). The relationship should be between the temporary table and the destination table, and should include all five fields with "=" comparators.

                     

                    Once the data are imported, you can perform a Find for anything that's an exact match and omit them from the found set (use Omit Records and Constrain Found Set). That's your "new" data, which can then be imported straight to the destination table without worry. If you don't find anything new, then the data set is a complete duplicate (like the aforementioned user who believes it "broke").

                     

                    In cases like this, it's generally a good idea to tag each temporary record with the account name of the person executing the script so multiple users don't step on each other. Just add the Get ( AccountName ) function to the Find to constrain the set to the records that apply only to the current user. You can then delete that user's records after import is complete.

                     

                    HTH

                     

                    Mike

                    • 7. Re: prevent importing duplicate records?

                      I did give you two possible FMP (no JDBC) ways of doing this too.

                       

                      The basic idea was that if FileMaker's built-in import doesn't give you enough control, then you have to manage that import by some process you create.

                       

                      I suggested four possible ways to do that.

                       

                      Good luck.

                       

                      - m

                      • 8. Re: prevent importing duplicate records?
                        siplus

                        You can't control the import. let them import whatever, then after the import delete "duplicate" records.

                         

                        You need a special field "IamOriginal" based upon a calc and a self-relationship. Examples abound.

                        • 9. Re: prevent importing duplicate records?
                          jrockstrom

                          Thanks, Mike, I'll give the intermediate processing table a go! May take me a week or two, I'll report back when it's done in case you're interested if I can resolve the issue this way.

                           

                          Many thanks,

                           

                          John

                          • 10. Re: prevent importing duplicate records?

                            Yup, this table import is what I suggested in my second of four possible approaches in this thread.

                             

                            This approach is the slowest - requiring a table scan after the import (OK if not too much imported data each time), but does offer good control and it's all in FM.

                             

                            As Mike indicated, consideration should be paid about two users trying to do an import at the same time.

                             

                            Use-Case: everyone uses the same login:

                            You may need to set a semaphore ($$globalVariable, like $$importing=true) at the start of the import script so only one import can happen at a time. (Remember to set that global variable to False at the end of the script.)


                            Use-Case: everyone has their own login:

                            Assuming users have their own user log-ins, Mike's Get(AccountName) idea above is an excellent, and elegant, way to handle this multi-user import possibility also!

                             

                            Good luck.

                             

                            - m

                            • 11. Re: prevent importing duplicate records?
                              jrockstrom

                              Hi Mike,

                               

                              I've setup a Temporary table to hold the imported records.

                              There is a concatenated field in both tables of all the imported fields plus the user ID. I have established the relationship using '=' for the concatenated field only, as I thought this was the criteria now (rather than the individual matching fields?).

                              What I can't work out is how to get the Temp Table to recognise any duplicated records that already reside in the Permanent table and load across only new records.

                               

                              I've prepared a script which does the importing of data to the Temp table, then hopefully will move records as mentioned above, my thought was to finish the script with deleting all records in the Temp table for the user ID who just imported the data. That way it clears only records for the user the script is dealing with and not anyone else who may possibly be importing other data at the same time. Not sure how to do that part either?


                              I can send the relationship map or the script edit page, if it helps?


                              Best regards,


                              John


                              • 12. Re: prevent importing duplicate records?
                                Mike_Mitchell

                                Do you have a relationship between the temp table and the permanent table? Based on something unique?

                                • 13. Re: prevent importing duplicate records?
                                  jrockstrom

                                  I was using the concatenated field as my unique value to establish the relationship, though as I think about there is a remote chance that the same student could have exactly the same set of data entered by the same instructor on another occasion. The unique value I've used throughout the solution is the person's membership ID, and this field is in both the Temp and Perm tables, perhaps a better choice?

                                   

                                  John

                                  • 14. Re: prevent importing duplicate records?
                                    Mike_Mitchell

                                    Whatever is unique in each record. You can check it with a Count ( ) function in the temp table:

                                     

                                         Count ( parentTable::key )

                                     

                                    If that's > 1, it's a duplicate.

                                     

                                    (This would be an unstored aggregate, so it might be slow for large data sets. Use with caution.)

                                    1 2 Previous Next