Any batch update, via replace field contents or a looping script, might be doing a corresponding batch update of a massive index on a field in a table with that many records. That indexing could conceivably be the source of the reduced performance you're seeing...
so, in a sense it would cause the users to be getting their individual indexes tweaked, causing their machines to crawl. Is that what you are saying?
Is there a solution???
Replace is not safe for an industrial environment and yes it does slow everyone down.
Why is it not safe?
Try this. Using a small table for testing purposes, have 12 users edit 12 different records.
Replace will skip these 12 records since they are locked and you will not know what those 12 records are unless you have a method of discovering this.
A better method, both for accuracy and happy users, is something like this:
This script loops until every record is updated, whether or not it was temporarily locked. You may have to go around and close people's records.
Add an "Updated" timestamp field to your table. Then, in outline (step names approximate):

Set Variable [ $$timestamp ; Get ( CurrentTimeStamp ) ]
Loop
  Perform Find [ Updated ≠ $$timestamp ]
  Exit Loop If [ Get ( FoundCount ) = 0 ]
  Go to Record/Request/Page [ First ]
  Loop
    Open Record/Request
    If [ Get ( LastError ) = 0 ]   # record was not locked
      Set Field [ Updated ; $$timestamp ]
      Commit Records/Requests
    End If
    Go to Record/Request/Page [ Next ; Exit after last ]
    # every 100 records or so, pause for 1 second to allow other users some time
  End Loop
End Loop

The outer loop repeats the find to pick up records that were locked on an earlier pass, until no records are found. Just keep $$timestamp the same for the entire run.
Is there no way to keep the records updated without running a big script?
Change your design so that this massive update is not needed?
Schedule the process to take place late at night?
Make the changes in smaller increments for shorter periods of time?
Note the question marks. Much depends on why you need to do a batch update of such massive proportions in the first place.
"A better method, both for accuracy and happy users, is something like this: This script loops until every record is updated, whether or not it was temporarily locked. You may have to go around and close people's records."
I used to make the same comment until someone else schooled me on how to do this with replace field contents. The fact is that both methods require adding code to check for and handle record-locking issues. A looping script can manage the issue one record at a time; with replace field contents, you can manage the locked records as a group after the replace has taken place. Here's how, and I freely note that this is not always the best alternative, but much of the time it works just as well as, if not better than, the looping script when it comes to handling locks.
Use set error capture to keep a record lock error from interrupting the script.
Do the replace
Check immediately for an error code reporting that record locks kept one or more records from being updated.
Do a find for all records that didn't get updated. This can usually be done either by searching the updated field for data that didn't get changed, or by searching the modification fields for records that don't show the expected account name and a modification timestamp within the expected range.
Once your script has that found set, you have the same options as a looping script:
Report this list of un-updated records back to the user for them to decide what to do.
Repeat the "replace", but now just on this found set.
Set up a "delayed update" for this set some set period of time in the future.
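A minimal sketch of that sequence in FileMaker script steps (the Sales table, Updated field, and $$timestamp variable here are hypothetical, and the exact error code a partially blocked Replace returns should be verified on your version):

```
Set Error Capture [ On ]
# Attempt the batch update; locked records are silently skipped
Replace Field Contents [ Sales::Updated ; $$timestamp ] [ No dialog ]
If [ Get ( LastError ) ≠ 0 ]
  # Some records were locked. Find the ones that were not stamped.
  Enter Find Mode [ Pause: Off ]
  Set Field [ Sales::Updated ; "≠" & $$timestamp ]
  Perform Find [ ]
  # This found set can now be reported to the user, re-run through
  # Replace Field Contents, or queued for a delayed update.
End If
Set Error Capture [ Off ]
```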
we receive a data dump once a week from an outside vendor. We need to ingest this data, but it is a bit more complex than a straight import, although that's part of the process. In a sense it is sales data showing where our products were sold (along with a lot of other metrics). So we ingest the sales data, but then have to match it to our own products and do some data massaging to get it into our formats and not our vendor's format. The entire script takes about 20 hours to complete, so we start it in the evening, but inevitably it runs during the next business day. The timing is set in stone, but we want to minimize the drain on the users if possible.
I have put this data into a table no users directly access hoping that would help, but it seems to affect all users even ones that never reference this table via virtual lists, etc. and we are not worried about record locking in this case, since no users ever alter the data in this table, they only reference it.
I thought about taking 50,000-record chunks to break up updating 1 million records all at once, but I'm not sure whether that would help. If the users are being affected by reindexing, then I assume the smaller chunks wouldn't help the situation.
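For what it's worth, a chunked pass might be sketched like this (the 50,000-record chunk size and 5-second pause are arbitrary placeholders, and Sales::Updated is a made-up field):

```
Go to Record/Request/Page [ First ]
Loop
  Set Field [ Sales::Updated ; $$timestamp ]
  # After each chunk, commit and pause so the server can catch up
  # on other users' requests between bursts of index updates
  If [ Mod ( Get ( RecordNumber ) ; 50000 ) = 0 ]
    Commit Records/Requests [ With dialog: Off ]
    Pause/Resume Script [ Duration (seconds): 5 ]
  End If
  Go to Record/Request/Page [ Next ; Exit after last ]
End Loop
Commit Records/Requests [ With dialog: Off ]
```

Whether the pauses actually reduce the impact on other users is worth measuring; they trade total run time for breathing room on the server.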
Perhaps you could set up a different server using backups of your current data that are sufficiently up to date for the needed "data massage". Then you do all the processing that you can on this other server and only import over the finished product.
By the way, I just tried to get really crafty and loop through one record at a time, the newly imported records and eliminate replace steps, being very careful to only have one commit step after multiple set fields ... along with never leaving the layout. I thought it would substantially cut down time, but it actually did the opposite.
The script run time went from approximately 20 hours to 120 hours ... ugh. It was looping through approximately 800,000 records, taking less than 1 second per record ... but with so many records it really added up.
I did this thinking a loop->set field (instead of import->replace) would minimize the impact on the users. I actually don't know if this did so, since 120 hours was too long to consider this as an option...
a separate server ... this is an interesting idea. Although much of the updating is basically mixing our data back into theirs. But this could offload a lot of the work for sure.
I wonder if just a separate database on the same server would solve the issue? In other words, is this a server performance issue OR an issue within the database file itself? Obviously an indexing issue within the database is different from a cache/memory (hardware) issue.
My gut is that it isn't a hardware issue, but rather a limitation within the server software and how FM handles multiple users when one user is altering one table heavily. but I have no evidence to back that up.
Are the script(s) being run ON the server (with schedule or PSoS)? Or is this client-triggered?
I guess the question comes down to what can I do to minimize the impact on the users. And of course, we could stop the process during business hours and set it to start up where it left off again after hours, but I still ponder my original question as to when needing to update large chunks of data what method has the least impact on the users.
If I need to run a replace on a single field for 1 million records, speed-wise a replace is generally going to be the fastest method (generalizing here, of course). But is it normal that while that replace is occurring, all other users are affected even though they aren't accessing the table being altered?
and if it is normal that users are impacted, does breaking it into smaller chunks help minimize their impact?
The work has to get done and we want to do it as quickly as possible, but if there is some smarter method that doesn't give up too much speed while helping the other users not get bogged down ... then I'm all ears...
it's a PSoS. I'm actually logging in as a user and running a script that literally asks me to locate the txt file that contains all the new data. Once I select the txt file, it imports the data into a temporary table, then activates a PSoS call and the script exits.
So the user's machine is stuck while it imports the data (about 20 minutes), then the rest of the work is done server side via a psos.
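In outline, that handoff might look something like this (the script and table names are invented):

```
# Client-side wrapper script
Import Records [ With dialog: On ; Target: TempImport ]   # user picks the txt file; ~20 minutes
# Hand the heavy processing to the server; client is then free
Perform Script on Server [ "Process Weekly Data" ; Wait for completion: Off ]
Exit Script [ ]
```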
"the table that this script is altering is not being used by any users."
I don't believe this should result in any significant slowdown for users. We perform millions-of-row imports into a large table that others are viewing but not modifying, and none of them have noticed a speed difference. You are performing a replace on a table (so this is different), but in my case and yours the users are not using that table.
Seems to me that FMS should be using a separate processor like it does with imports, and leave other users alone.
The reason we import without replace is that we process the data ahead of time using SQL, then import the resulting rows. Much faster, and without bothering the users.
Of course, replace operates on only one field, while you might want to update numerous fields ... hence, back to my loop...