4 Replies Latest reply on Feb 12, 2013 9:29 AM by alas

    Two Scripting Questions

    alas

      1) I have a script that is designed to find duplicate entries. I follow the "typlcal" approaches outlined out there. Sets a global field to the data from the "name" and compares it to the next record, marks the dupes and then finds all of the dupes. My "names" are unstored calculated fields because it involves data points from related tables and thus can't be stored. Within the LAN, it takes about 10 seconds to run the script. On a WAN, it is taking 10 minutes (with around 1500 records). Is there a way to improve this? Is there a way to maybe have the server (hosted on FMS Advanced) run the sorts/loop/find so that it is not passing all of that data over the WAN? I've even tried having the calculated name field being "set" to another field as a step in the script, with the field being set being stored and indexed, but that first step is even taking a long time as it runs through 1500 records.

       

      2) Is there a way to prevent a find dupes script like this from being reflected as a "modification" of the underlying record. Since there are these writes to temporary fields to do the dupes comparison, it is reflecting as thought the records themselves are being modified. Is there a better option here?

        • 1. Re: Two Scripting Questions
          Malcolm

          1) I have a script that is designed to find duplicate entries.  I follow the "typlcal" approaches outlined out there.  Sets a global field to the data from the "name" and compares it to the next record, marks the dupes and then finds all of the dupes. 

           

          option 1. Set a variable to the value you are testing. Omit the records as soon as you discover that they are duplicates. After completing the loop use the show omitted script step.

           

          option 2. As above, except that you omit the original, leaving only duplicates. This method is necessary if you are de-duping a subset of the full record set.

           

          option 3. Put the record IDs of duplicates into a global variable. After completing the loop use those IDs to locate the dupes ( by find or by relationship )

           

          2) Is there a way to prevent a find dupes script like this from being reflected as a "modification" of the underlying record. 

           

          The options above, none modify the record.

           

          malcolm

          1 of 1 people found this helpful
          • 2. Re: Two Scripting Questions
            Mike_Mitchell

            Another option: Use a self-joining relationship and the Count ( ) function. Won't work in your specific case (because your duplicate key is an unstored calc), but you can avoid the script entirely in this way; just search for a calculation:

             

                 Count ( self::key ) > 1

             

            For future reference.    .

             

            Mike

            1 of 1 people found this helpful
            • 3. Re: Two Scripting Questions
              alas

              Malcom,

               

              Thank you for your responses.  I am finding your idea around using variables interesting.  I have to think about how I may implement that in my solution.  If you have an example of where you have done that and can share the script, I'd love to see how you did this.  I like the variable idea b/c it would adress my 2nd point as well. 

               

              I am also thinking about, and perhaps you have some thoughts around this, of trying to find a way to solve my "unstored calculation" issue since this seems to be a recurring issue for me.

               

              i have separated my data into various related tables for efficiency (and security purposes (different people work on different parts of the data).  But the result of this is the "primary" display/table, which needs to use the data from the other related tables, ends up with unstored calculations/results.  So, for example, take a name of a wine.  It involves the Brand (one table), sometimes a "name" assigned by the brand (from the primary table), the "type"/grape from another table, sometimes a geography reference (from another table) and the vintage (from the primary table).  So the tables are related by unique keys and the references for the brand, grape, geography, etc. are unique IDs with their names in the other tables.  So when I have a calculated name in the primary table, it combines the components of the names from the various tables and adds them to the primary table fields.   This creates the "name" as used for the database.  But it cannot store this information, and thus my slow scripting/finds/sorts/etc. when on a WAN.

               

              Do you have any suggestions to help address this?  I have thought about "scripting" the selection of the component items, and adding to this script a step that does a "set field" to set a field in the primary table with the component item name, and then using this in the calculation of the full name (instead of the name in the related table).

               

              My concern/issue, is "what happens if the component item's name changes in the related table?"  My other script with the "set field" will not pick up this change. 

               

              Thoughts?

              • 4. Re: Two Scripting Questions
                alas

                Mike, very interesting.  I would have to think about if there is a way to use this in my solution.  Take a look at my comment above to Malcolm, in case you have any thoughts about that or how it may help work with your idea.  Thanks.

                 

                Andrew.