11 Replies Latest reply on Oct 18, 2016 7:29 AM by lmagnan

    PSOS Complex Search Script generating a dmp file

    lmagnan

      Hello,

       

      I have a fairly complex database project in which I need to run a similarity search between an imported string and the entries in my database, so that each imported document can be matched to an existing entry. I am using the Levenshtein custom function written by Steven Allen, which does exactly what I need.

       

      The solution is hosted on a Windows 2012 Server dedicated to FileMaker Server 14, with indexing, backups and anti-virus software pointed away from the database to make sure they don't interfere, as the documentation recommends. The server has 8 GB of RAM, and the issue happens every time, regardless of the number of users connected or the load on the solution, which by the way is always very low. The RAM available for cache is 1500 MB, and I never really see anything other than a 100% cache hit rate on the statistics page.

       

      The way I have set up my calculation is with an unindexed calculation field in which the Levenshtein function counts the differences between a local variable and a text field in the record. I search for records with fewer than a certain number of differences, and I also check another field in a related table the same way, both at the same time in a single Perform Find. I then write the possible matches to a common field so clients can look at them while the function works through the other entries.

       

      Since the Levenshtein function takes quite a while to resolve, I thought it would be a good idea to split the list of entries to match into pieces, so I could run the Levenshtein comparison separately in several server-side scripts. Each server-side script then writes the UUIDs of its matches to a common field. To account for record locking, I wrote a loop that keeps trying to write the matches to the field until it gets no error, meaning the Set Field and Commit steps went through.
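The split-and-retry approach described above could be sketched roughly as follows. This is Python rather than FileMaker script steps, and only an illustration under assumptions: `split_into_chunks`, `write_with_retry`, and the `write` callback are hypothetical names, and the callback is assumed to return 0 on success and a non-zero error code when the record is locked.

```python
import time


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def write_with_retry(write, max_attempts=50, delay_s=0.1):
    """Call write() until it returns 0 (no error) or the attempts run out.

    `write` is a hypothetical stand-in for the Set Field + Commit pair.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        if write() == 0:
            return attempt
        time.sleep(delay_s)  # back off briefly before trying again
    raise TimeoutError("record still locked after %d attempts" % max_attempts)
```

Capping the number of attempts is a design choice in this sketch, so that a lock that never clears cannot keep the loop spinning indefinitely.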

       

      I have tried many different combinations to find the most efficient way to get the imported entries analysed. However, whenever more than one server-side script is running, FMServer generates a .dmp file in the log folder for each entry it tries to analyse. The database stays up, though, and I don't get any error message whatsoever. I only noticed the dump files a few days later.

       

      When I first tried it I had something like 5 or 6 scripts running in parallel, each doing its part of the task. However, some of the scripts would never finish and the admin console would lock up. I thought I was maybe asking a little too much of the server, so I tried fewer and fewer parallel instances of the script until I got all the way down to two, where the server still generates the dump files...

       

      When I open a DMP file and look at the content, I see an exception (C0000005) listed against Support.DLL, which seems to be a module from the FMServer folder. I never get any other error or DMP file for anything else I do with the database, even for other operations on the same set of data that produces the error.

       

      What I was wondering is whether the error could come from the Levenshtein function building arrays so big to analyse similarities that the database runs out of room for them. From what I understand, the function is complex and has to build very large arrays to find out what the differences are. Since it is so complex, could I be running out of resources, or could the problem be something else?
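For a sense of how much working memory a Levenshtein comparison needs in principle, here is a minimal Python sketch of the standard two-row dynamic-programming version. It is not Steven Allen's custom function, and it says nothing about how FileMaker Server evaluates a recursive custom function internally; it only shows that the textbook algorithm can run with two rows sized to the shorter string rather than a full matrix.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows in memory."""
    if len(a) < len(b):
        a, b = b, a  # make b the shorter string so the rows stay small
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]
```

The run time still grows with the product of the two string lengths, so long strings remain slow even when the memory footprint is small.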

       

      Thanks for your help!

       

      LM

        • 1. Re: PSOS Complex Search Script generating a dmp file
          mikebeargie

          It sounds like you're running out of memory and crashing. If you disable the script step that actually does that calculation, does it work as expected?

           

          If it was a problem with disk space, you would probably get an error back from FileMaker. Memory crashes tend to stop processes completely, though.

          • 2. Re: PSOS Complex Search Script generating a dmp file
            lmagnan

            Hey Mike,

             

            The script runs through if I only run one instance of it.

             

            I was doing the multiple PSOS to get the analysis done quicker, as doing it with only one PSOS can take up to 15-20 minutes.

            • 3. Re: PSOS Complex Search Script generating a dmp file
              lmagnan

              Just to add on to my last answer:

               

              The weirdest part of all this is even though I get a crash message, I still get the expected results from the script.

               

              So I get an exception error, but the calculation still completes and gives me what I need. Only problem, apart from the 100s of DMP files generated if you run the script multiple times, is the admin console locking issue for some scripts where I can't force stop them anymore...

              • 4. Re: PSOS Complex Search Script generating a dmp file
                mikebeargie

                I am guessing it’s resource demand and/or timeout then.

                 

                You may be able to architect your solution and process in a way that performs better, or hand the processing off to a plugin (e.g. BaseElements, ScriptMaster) that works better for your use case.

                 

                Actually, looking at that CF, another possibility is you may be hitting the recursion stack limit.

                 

                Are you looping through portions of the document (e.g. chunking by 100 words at a time and averaging the score), or passing the entire document?

                 

                360Works ScriptMaster is free and runs Java functions; something like this may be adaptable for you:

                https://commons.apache.org/sandbox/commons-text/jacoco/org.apache.commons.text.similarity/LevenshteinDistance.java.html

                • 5. Re: PSOS Complex Search Script generating a dmp file
                  mikebeargie

                  That specific note makes me think that it’s the recursion limit.

                  • 6. Re: PSOS Complex Search Script generating a dmp file
                    lmagnan

                    Mike,

                     

                    Thanks for your input on that one. I usually find what's wrong by searching through these forums, but I have to say I was at a dead end on that one.

                     

                    I'm looking through a document list and associated modification names.

                     

                    Basically, for each imported entry, I start by running a search for what I call approval documents, which have a specific field set to 1. That filters out the found set to a more manageable collection of records.

                     

                    I have two tables on which I run my search. The first one is the document table, in which I look for similarity between the document ID number and a local variable. The function is in an unstored field, where it counts the differences between a local variable and one of the fields from that record. I basically want my doc ID to be within 20% of the imported doc ID. If that's the case, I write down the UUID of the matching entry or entries and carry on with the other imported entries.

                     

                    I also run the search against a mod name table, which is related to the document table. I run it the same way as the document ID search; the only difference is that I compare against a different variable.

                     

                    Both my searches are run "at the same time" using a constrain found set script step to constrain the previously built found set of approval documents. The constrain found set is built of two separate find requests.

                     

                    Since the calculation is evaluated from each record being searched, how can I reach the recursion limit? There's also the fact that the script goes all the way to the end with no issues if I run only one instance at a time... I also thought that separate FMS sessions were just like separate clients connecting to the server, so I didn't think they would share recursion limits?

                     

                    Thanks for your help,

                    • 7. Re: PSOS Complex Search Script generating a dmp file
                      mikebeargie

                      Two processes would not share recursion limits; however, two processes do share RAM. All of FileMaker’s calculation engine is processed in the RAM cache, so you’re essentially doubling the demand by running both at once.

                       

                      Like I said, it mostly depends on what you’re comparing. If it’s just a short string of text it should be fine; if it’s an entire document’s worth of contents, it may be way too much load.

                      • 8. Re: PSOS Complex Search Script generating a dmp file
                        lmagnan

                        Hey Mike,

                         

                        Thanks for your answer. Still trying to wrap my head around the inner workings of this, but I think your suggestion to go ahead with a plugin might be the simplest way to make it work. I was already using BE for something else, and I looked into ScriptMaster, which is pretty impressive as well.

                         

                        How does it work when you use a plugin? Where does the processing get done? I thought plugins were some sort of way to interpret non-FM Code by the FMaker calculation engine. As you see, I am not that familiar with plugins and only use them once in a while for specific tasks.

                         

                        One thing I did notice on the server when I ran the script again was that the CPU Frequency goes up over 100% (sometimes all the way up to 120 or 130%), while I'm getting 12 or 18% CPU loads when I run two or three scripts together... Could the high frequency be causing the crash and the sort of admin console lock-up, along with the C0000005 exception in the DMP file?

                        • 9. Re: PSOS Complex Search Script generating a dmp file
                          mikebeargie

                          The demo file that accompanies ScriptMaster has working examples.

                          You have a chunk of Java code that you turn into a calculation engine function in FileMaker.

                           

                          Like:

                           

                          SM_MyFunction( p1 ; p2 ; p3 )

                           

                          Would pass three parameters to a ScriptMaster Java script, and return a result in the calculation engine. Basically it’s for writing custom functions out of Java.

                          • 10. Re: PSOS Complex Search Script generating a dmp file
                            wimdecorte

                            Open a support ticket with FMI and have them take a look at those DMP files; they are an indication of some internal exception and it may give them a clue.

                             

                            Instead of searching on the unstored calc, how about a scripted loop through the records that resolves the distance calc in the script, instead of forcing FMS to resolve the calc for all records before the search can be done?
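A record-by-record loop along those lines might look like the sketch below. It is Python, not FileMaker script steps, and it makes several assumptions: records are plain dictionaries with hypothetical `uuid` and `doc_id` keys, `distance` is whatever Levenshtein-style function is available, and "within 20%" is read as "at most 20% of the longer string's length in edits", which is only one possible interpretation of the threshold mentioned earlier in the thread.

```python
def find_matches(records, target, distance, max_ratio=0.2):
    """Return the UUIDs of records whose doc_id is close enough to target.

    Only one pair of strings is compared at a time, so the working set
    stays small no matter how many records are scanned.
    """
    matches = []
    for record in records:
        doc_id = record["doc_id"]
        longest = max(len(doc_id), len(target))
        allowed = int(max_ratio * longest)  # assumed meaning of "within 20%"
        # An edit distance can never be smaller than the length difference,
        # so a large gap lets us skip the expensive call entirely.
        if abs(len(doc_id) - len(target)) > allowed:
            continue
        if distance(doc_id, target) <= allowed:
            matches.append(record["uuid"])
    return matches
```

The length check is a cheap pre-filter that may be worth having when the distance function is the slow part of the job.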

                            • 11. Re: PSOS Complex Search Script generating a dmp file
                              lmagnan

                              Wim,

                               

                              Yeah, I thought about it. This way the memory used would be really small, since it would only be evaluating two strings at a time, one pair after the other, instead of looking at all my approval documents at once and trying to figure out which ones are similar. Since I would be evaluating multiple calcs separately, it would probably end up being a little slower for each script, but I would be able to run multiple instances like I was trying to do without crashing the server. I was trying to make it quicker, but as you've seen it's not really working out for me. I'll still open a support ticket just to see what FMSupport comes up with...

                               

                              Thanks for the idea Mike, I'll look into this today for sure. I was just wondering how it works "behind the scenes", like where the resources come from when you use a plugin. Is it using the FMaker calculation engine to interpret the code passed to the plugin, or does the plugin generate its own thread and use separate resources from FMaker?

                               

                              I tried a few examples this morning and there's some pretty cool stuff to do with ScriptMaster. It's a little easier to understand than BE when you first start using it, too!