1 of 1 people found this helpful
1) I have a script that is designed to find duplicate entries. I follow the "typlcal" approaches outlined out there. Sets a global field to the data from the "name" and compares it to the next record, marks the dupes and then finds all of the dupes.
option 1. Set a variable to the value you are testing. Omit the records as soon as you discover that they are duplicates. After completing the loop use the show omitted script step.
option 2. As above, except that you omit the original, leaving only duplicates. This method is necessary if you are de-duping a subset of the full record set.
option 3. Put the record IDs of duplicates into a global variable. After completing the loop use those IDs to locate the dupes ( by find or by relationship )
2) Is there a way to prevent a find dupes script like this from being reflected as a "modification" of the underlying record.
The options above, none modify the record.
1 of 1 people found this helpful
Another option: Use a self-joining relationship and the Count ( ) function. Won't work in your specific case (because your duplicate key is an unstored calc), but you can avoid the script entirely in this way; just search for a calculation:
Count ( self::key ) > 1
For future reference. .
Thank you for your responses. I am finding your idea around using variables interesting. I have to think about how I may implement that in my solution. If you have an example of where you have done that and can share the script, I'd love to see how you did this. I like the variable idea b/c it would adress my 2nd point as well.
I am also thinking about, and perhaps you have some thoughts around this, of trying to find a way to solve my "unstored calculation" issue since this seems to be a recurring issue for me.
i have separated my data into various related tables for efficiency (and security purposes (different people work on different parts of the data). But the result of this is the "primary" display/table, which needs to use the data from the other related tables, ends up with unstored calculations/results. So, for example, take a name of a wine. It involves the Brand (one table), sometimes a "name" assigned by the brand (from the primary table), the "type"/grape from another table, sometimes a geography reference (from another table) and the vintage (from the primary table). So the tables are related by unique keys and the references for the brand, grape, geography, etc. are unique IDs with their names in the other tables. So when I have a calculated name in the primary table, it combines the components of the names from the various tables and adds them to the primary table fields. This creates the "name" as used for the database. But it cannot store this information, and thus my slow scripting/finds/sorts/etc. when on a WAN.
Do you have any suggestions to help address this? I have thought about "scripting" the selection of the component items, and adding to this script a step that does a "set field" to set a field in the primary table with the component item name, and then using this in the calculation of the full name (instead of the name in the related table).
My concern/issue, is "what happens if the component item's name changes in the related table?" My other script with the "set field" will not pick up this change.
Mike, very interesting. I would have to think about if there is a way to use this in my solution. Take a look at my comment above to Malcolm, in case you have any thoughts about that or how it may help work with your idea. Thanks.